Hal here, your friendly Lorax and developer evangelist! I wanted to share with everyone a guest post from a Splunker whom I met and see regularly at the Metro Atlanta Splunk User Group, Robert Labrie. Robert is a DevOps Engineer at The Network Inc, a company which builds solutions that prevent, detect and remediate misconduct to help companies maintain ethical cultures.
This post is about how Robert approached building out a new architecture, and of course, how to index the data generated by all of the components. Without further ado, take it away, Robert!
The team at TNWDevLabs started a new effort to develop an internal SaaS product. It’s a greenfield project, and since everything is new, it let us pick up some new technology and workflows, including neo4j and nodejs. In my role as DevOps Engineer, the big change was running all the application components in Docker containers hosted on CoreOS.
CoreOS is a minimalist version of Linux, basically life support for Docker containers. Installing applications directly on the host is discouraged, which raises a question: "How am I going to get my application logs into Splunk?"
With a more traditional Linux system, you would just install a Splunk forwarder on the host and pick up files from the file system.
With CoreOS, you can't really install applications on the host, so one option is to put a forwarder in every container.
That would work, but running multiple instances of Splunk on the same host is wasteful, and it still doesn't give you any information about the host itself.
CoreOS leverages SystemD, which has an improved logging facility called JournalD. All system events (updates, etcd, fleet, etc.) from CoreOS are written to JournalD, and since apps running inside Docker containers generally log to stdout, all of those events are sent to JournalD as well. With everything already funneling into one place, getting JournalD into Splunk is the obvious solution.
The first step was to get the Splunk Universal Forwarder into a docker container. There were already some around, but I wanted to take a different approach. The idea is instead of trying to manage .conf files and getting them into the container, I leverage the Deployment Server feature already built into Splunk. The result is a public image called tnwinc/splunkforwarder. It takes two parameters passed in as environment variables: DEPLOYMENTSERVER and CLIENTNAME. These two parameters are fed into $SPLUNK_HOME/etc/system/local/deploymentclient.conf when the container is started. This is the bare minimum to get the container up and talking to the deployment server.
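For reference, the two environment variables map onto the standard deployment-client settings. Assuming the values shown later in this post, the generated deploymentclient.conf would look something like this (a sketch, not the exact output of the image's startup script):

```ini
# $SPLUNK_HOME/etc/system/local/deploymentclient.conf
[deployment-client]
# CLIENTNAME environment variable
clientName = core-01

[target-broker:deploymentServer]
# DEPLOYMENTSERVER environment variable
targetUri = splunk.example.com:8089
```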
Setting up CoreOS
Running CoreOS, the container is started as a service. The service definition might look like:
ExecStart=/usr/bin/docker run -h %H -e DEPLOYMENTSERVER=splunk.example.com:8089 -e CLIENTNAME=%H -v /var/splunk:/opt/splunk --name splunk tnwinc/splunkforwarder
| Option | Description |
| --- | --- |
| `-h %H` | Sets the hostname inside the container to match the hostname of the CoreOS host. `%H` gets expanded to the CoreOS hostname when the .service file is processed. |
| `-e DEPLOYMENTSERVER` | The host and port of your deployment server. |
| `-e CLIENTNAME=%H` | This is the friendly client name. |
| `-v /var/splunk:/opt/splunk` | Exposes the real directory /var/splunk as /opt/splunk inside the container. This directory persists on disk when the container is restarted. |
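Put together, a minimal unit file might look like the sketch below. The ExecStart line is the one from above; the unit scaffolding (names, Requires/After, the cleanup in ExecStartPre) is illustrative and should be adapted to your setup:

```ini
# splunk.service (illustrative)
[Unit]
Description=Splunk Universal Forwarder container
After=docker.service
Requires=docker.service

[Service]
# Remove any stale container from a previous run (ignore failure with "-")
ExecStartPre=-/usr/bin/docker rm -f splunk
ExecStart=/usr/bin/docker run -h %H -e DEPLOYMENTSERVER=splunk.example.com:8089 -e CLIENTNAME=%H -v /var/splunk:/opt/splunk --name splunk tnwinc/splunkforwarder
ExecStop=/usr/bin/docker stop splunk
Restart=always

[Install]
WantedBy=multi-user.target
```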
Now that Splunk is running and has a place to live, we need to feed data to it. To do this, I set up another service, which uses the journalctl tool to export the journal to a text file. This is required because Splunk can't read JournalD's native binary format:
ExecStart=/bin/bash -c '/usr/bin/journalctl --no-tail -f -o json > /var/splunk/journald'
This command dumps everything already in the journal in JSON format (more on that later), then follows the journal. The process doesn't exit and continues writing to /var/splunk/journald, which appears as /opt/splunk/journald inside the container.
Also note that since the journald export file will continue to grow, I added an ExecStartPre directive that trims the journal before the export happens:
ExecStartPre=/usr/bin/journalctl --vacuum-size=10M
Since I’m not appending, every time the service starts, the file is replaced. You may want to consider a timer to restart the service on a regular interval, based on usage.
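Combining the two directives, the export service might look like the following sketch (the journalctl lines are the ones from above; the rest of the unit is illustrative):

```ini
# journald-export.service (illustrative)
[Unit]
Description=Export JournalD to JSON for the Splunk forwarder
After=systemd-journald.service

[Service]
# Trim the journal so each fresh export stays small
ExecStartPre=/usr/bin/journalctl --vacuum-size=10M
# Dump existing entries as JSON, then follow; ">" truncates the file on each start
ExecStart=/bin/bash -c '/usr/bin/journalctl --no-tail -f -o json > /var/splunk/journald'
Restart=always

[Install]
WantedBy=multi-user.target
```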
Get the data into Splunk
This was my first experience with Deployment Server; it's pretty slick. The clients came up and reached out to the deployment server on their own. My app lives in $SPLUNK_HOME/etc/deployment-apps/journald/local. I'm not going to re-hash the process of setting up a deployment server; there is great documentation at Splunk.com on how to do it.
My inputs.conf simply monitors the file inside the container:
[monitor:///opt/splunk/journald]
sourcetype = journald
The outputs.conf then feeds it back to the appropriate indexer:
[tcpout]
defaultGroup = default-autolb-group

[tcpout:default-autolb-group]
server = indexer.example.com:9997

[tcpout-server://indexer.example.com:9997]
Do something useful with it
Nothing about getting the data into Splunk is special, but the props.conf is fun, so I'm covering it separately. Writing the journal out as JSON structures the data in a very nice format, and the following props.conf helps Splunk understand it:
[journald]
KV_MODE = json
MAX_TIMESTAMP_LOOKAHEAD = 10
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %s
TIME_PREFIX = \"__REALTIME_TIMESTAMP\" : \"
pulldown_type = 1
TZ = UTC
| Setting | Description |
| --- | --- |
| `KV_MODE = json` | Magically parse JSON data. Thanks, Splunk! |
| `TIME_PREFIX` | This ugly bit of regex pulls the timestamp out of a field called __REALTIME_TIMESTAMP. |
| `TIME_FORMAT` | Standard strptime format; %s is seconds since the epoch. |
| `MAX_TIMESTAMP_LOOKAHEAD` | JournalD timestamps are in microseconds (16 digits). This setting tells Splunk to use only the first 10, i.e. the seconds. |
Once that app was published to all the agents, I could query the data out of Splunk. It looks great!
This is where the JSON output format from journald really shines. I get PID, UID, the command line, executable, message, and more. For me, coming from a Windows background, this is the kind of logging Windows has had for 20 years, now finally on Linux and easily analyzed with Splunk.
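To make that concrete, here is a trimmed, illustrative journal event as it might appear in the export file (field names are standard journald JSON fields; the values are made up):

```json
{ "__REALTIME_TIMESTAMP" : "1430496000000000", "_HOSTNAME" : "core-01", "_PID" : "612", "_UID" : "0", "_COMM" : "docker", "_CMDLINE" : "/usr/bin/docker run tnwinc/someapp", "MESSAGE" : "app listening on port 3000" }
```

With KV_MODE = json, each of these keys becomes a search-time field, so a search along the lines of `sourcetype=journald _COMM=docker` just works. Note how the props.conf above lines up with this event: TIME_PREFIX matches `"__REALTIME_TIMESTAMP" : "` and MAX_TIMESTAMP_LOOKAHEAD keeps the first 10 of the 16 digits, which is the epoch time in seconds.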
The journal provides a standard transport, so I don’t have to think about application logging on Linux ever again. The contract with the developers is: You get it into the journal, I’ll get it into Splunk.
Edit @ 5/1/15: added detail about managing size of journald export file.