
Achieving scale with the Kafka Modular Input


A hot topic in my inbox over recent months has been how to achieve scalability with the Kafka Modular Input, primarily in terms of message throughput. I get a lot of emails about this from users and from our own internal Splunk teams, so rather than continuing to dish out the same replies, I thought I’d pen a short blog to share some tips and tricks.

So let’s start with this simple scenario:

  • a single instance of Splunk 6.3
  • the freely available Kafka Modular Input, downloaded and installed from Splunkbase

These are the scaling steps that I would try in order.

Enable HTTP Event Collector output

With the recent release of Splunk 6.3, I also updated the Kafka Modular Input to be able to output its received messages to Splunk via the new HTTP Event Collector (HEC). You can read more about this here.

  • Select the HEC output option
  • Use HTTP (not HTTPS)
  • Enable Batch Mode. This buffers events in memory until the batch buffer is flushed, depending on how you tune the flush settings. Size the batch buffer to match the throughput of your Kafka environment, i.e., higher throughput => a larger batch buffer will be more optimal. A configuration sketch follows this list.
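As a rough sketch of what such a stanza might look like — the parameter names below are illustrative assumptions, not the app’s confirmed spec, so check the Kafka Modular Input’s inputs.conf.spec for the exact names your version supports:

    [kafka://my_topic_consumer]
    zookeeper_connect_host = localhost
    zookeeper_connect_port = 2181
    topic_name = my_topic
    group_id = my_test_group
    # hypothetical names below: route output via HEC over plain HTTP
    output_type = hec
    hec_port = 8088
    hec_token = 11111111-2222-3333-4444-555555555555
    hec_https = 0
    # hypothetical batching knobs: events buffer in memory and flush when a
    # threshold is hit; raise these limits for higher-throughput topics
    hec_batch_mode = 1
    hec_max_batch_size_events = 1000
    hec_max_batch_size_bytes = 1048576
    hec_flush_interval_secs = 2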

Set up multiple consumer connections in the same consumer group

This is one area that many users are likely not aware of. When you set up multiple Kafka stanzas in Splunk, these actually run as multiple consumer threads inside the same single JVM instance. You can aggregate them into the same consumer group by giving them all the same Kafka Group ID.

Below are 3 Kafka consumer connection threads in the “my_test_group” consumer group running in the same JVM.

[Screenshot: three Kafka consumer connections sharing the “my_test_group” Group ID, running in the same JVM]
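For illustration, those three threads could be declared as three stanzas that differ only in name but share the same Group ID (parameter names are assumptions, as above). Note that a Kafka topic’s partition count caps the useful number of consumers in a group, so make sure the topic has at least as many partitions as you declare stanzas:

    # three stanzas => three consumer threads in one JVM;
    # the shared group_id aggregates them into one consumer group,
    # so Kafka balances the topic's partitions across them
    [kafka://consumer_1]
    zookeeper_connect_host = localhost
    topic_name = my_topic
    group_id = my_test_group

    [kafka://consumer_2]
    zookeeper_connect_host = localhost
    topic_name = my_topic
    group_id = my_test_group

    [kafka://consumer_3]
    zookeeper_connect_host = localhost
    topic_name = my_topic
    group_id = my_test_group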

 

Boost JVM Heap

If you are running many threads (stanzas) inside a single JVM, then you may need to boost the JVM heap settings. This is easy to do.

From the documentation:

[Screenshot: JVM heap settings section of the Kafka Modular Input documentation]
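The flags that ultimately matter are standard JVM heap options; where you declare them depends on the app version, and the documentation excerpt above shows the exact spot. A hypothetical illustration:

    # standard JVM heap flags -- the declaration point varies by app version,
    # so treat the setting name below as a placeholder for your app's real one
    # -Xms64m  : initial heap size
    # -Xmx512m : maximum heap size; raise it as you add stanzas (threads)
    java_heap_options = -Xms64m -Xmx512m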

 

Additional Kafka Consumer settings

For more advanced users, you can also set any of the full palette of Kafka consumer configuration parameters. You just declare them as comma-delimited key=value pairs.

[Screenshot: additional consumer settings declared as comma-delimited key=value pairs]
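For example, assuming a stanza parameter along these lines (the parameter name is illustrative; the values themselves are genuine Kafka 0.8 high-level consumer properties):

    [kafka://consumer_1]
    topic_name = my_topic
    group_id = my_test_group
    # hypothetical parameter name; the values are real Kafka 0.8 consumer
    # properties: larger fetch and socket buffers help high-volume topics,
    # and auto.offset.reset controls where a new consumer group starts
    additional_consumer_properties = fetch.message.max.bytes=2097152,socket.receive.buffer.bytes=1048576,auto.offset.reset=largest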

 

Going beyond a single instance of Splunk

If the steps above don’t deliver enough scale for you, then you can start to think about horizontal scalability. This is basically the same set of steps I mentioned above, but replicated horizontally across (n) Splunk instances.

[Diagram: the single-instance setup replicated across multiple Splunk instances, all consuming from the same Kafka cluster]
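Conceptually, you deploy the same configuration to each Splunk instance and keep the shared Group ID, so Kafka spreads the topic’s partitions across every consumer thread on every instance. A sketch, with the same illustrative parameter names as before:

    # Splunk instance A -- local inputs.conf
    [kafka://consumer_a1]
    zookeeper_connect_host = kafka-zk.example.com
    topic_name = my_topic
    group_id = my_test_group

    # Splunk instance B -- identical stanza apart from the name;
    # the shared group_id makes Kafka balance partitions across instances
    [kafka://consumer_b1]
    zookeeper_connect_host = kafka-zk.example.com
    topic_name = my_topic
    group_id = my_test_group

Each instance can still apply everything above — multiple stanzas per JVM, HEC batching, and a boosted heap.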

