Support for modular inputs in Splunk 5.0 and later enables you to add new types of inputs to Splunk that are treated as native Splunk inputs.
Last week Jon announced updates to the Splunk SDKs for Java, Python, and JavaScript. Now we’ll take a deep dive into modular input support for the Splunk SDK for Python.
The latest release of the Splunk SDK for Python brings modular input support. The Splunk SDKs for C# (see Developing Modular Inputs in C#) and Java also have this functionality as of versions 1.0.0.0 and 1.2, respectively. The Splunk SDK for Python enables you to use Python to create new modular inputs for Splunk.
Getting started
The Splunk SDK for Python comes with two example modular input apps: random numbers and GitHub forks. You can get the Splunk SDK for Python on dev.splunk.com. Once you have the Splunk SDK for Python, you can build the .spl files for these examples and install them via the app manager in Splunkweb. Do this by running
python setup.py dist
at the root level of the SDK. The .spl files will be in the build directory.
Now I’ll walk you through the random numbers example.
Random numbers example
The random numbers example app will generate Splunk events containing a random number between the two specified values. Let’s get into the steps for creating this modular input.
Inherit from the Script class
As with all modular inputs, our script should inherit from the abstract base class Script, found in splunklib.modularinput.script in the Splunk SDK for Python. Subclasses must override the get_scheme and stream_events functions and, if the scheme returned by get_scheme has Scheme.use_external_validation set to True, the validate_input function as well.
Below, I’ve created a MyScript class in a new file called random_numbers.py which inherits from Script, and added the imports that will be used by the functions we will override.
import random, sys

from splunklib.modularinput import *

try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

class MyScript(Script):
    # TODO: fill in this class
    pass
Override get_scheme
Now that we have a class set up, let’s override the get_scheme function from the Script class. We need to create a Scheme object, add some arguments, and return the Scheme object.
    def get_scheme(self):
        scheme = Scheme("Random Numbers")
        scheme.description = "Streams events containing a random number."

        # If you set external validation to True, without overriding
        # validate_input, the script will accept anything as valid.
        # Generally you only need external validation if there are
        # relationships you must maintain among the parameters,
        # such as requiring min to be less than max in this
        # example, or you need to check that some resource is
        # reachable or valid. Otherwise, Splunk lets you
        # specify a validation string for each argument
        # and will run validation internally using that string.
        scheme.use_external_validation = True
        scheme.use_single_instance = True

        min_argument = Argument("min")
        min_argument.data_type = Argument.data_type_number
        min_argument.description = "Minimum random number to be produced by this input."
        min_argument.required_on_create = True
        # If you are not using external validation, add something like:
        #
        # min_argument.validation = "min > 0"
        scheme.add_argument(min_argument)

        max_argument = Argument("max")
        max_argument.data_type = Argument.data_type_number
        max_argument.description = "Maximum random number to be produced by this input."
        max_argument.required_on_create = True
        scheme.add_argument(max_argument)

        return scheme
Optional: Override validate_input
Since we set scheme.use_external_validation to True in our get_scheme function, we need to specify some validation for our modular input in the validate_input function.
This is one of the great features of modular inputs: you can validate an input’s configuration before Splunk accepts it.
In this example, we are using external validation to verify that min is less than max. If validate_input does not raise an exception, the input is assumed to be valid. Otherwise, the exception text is reported as the error message when the script tells splunkd that the configuration is invalid.
    def validate_input(self, validation_definition):
        # Get the parameters from the ValidationDefinition object,
        # then typecast the values as floats
        minimum = float(validation_definition.parameters["min"])
        maximum = float(validation_definition.parameters["max"])

        if minimum >= maximum:
            # The format arguments must be passed as a tuple
            raise ValueError("min must be less than max; found min=%f, max=%f" % (minimum, maximum))
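To try the check on its own, here is a minimal sketch that applies the same rule to a plain dictionary. The helper name check_range and the sample values are made up for illustration; in the real input, the parameters come from the ValidationDefinition object that the SDK passes in.

```python
def check_range(parameters):
    """Raise ValueError unless parameters["min"] is less than parameters["max"].

    `parameters` is a plain dict standing in for
    validation_definition.parameters (an assumption made for this sketch).
    """
    # The values are treated as text here, so cast them to floats first.
    minimum = float(parameters["min"])
    maximum = float(parameters["max"])

    if minimum >= maximum:
        raise ValueError(
            "min must be less than max; found min=%f, max=%f" % (minimum, maximum)
        )
    return minimum, maximum


if __name__ == "__main__":
    # A valid pair passes through unchanged (as floats).
    print(check_range({"min": "1", "max": "10"}))

    # An invalid pair raises, which is what marks the input as invalid.
    try:
        check_range({"min": "5", "max": "2"})
    except ValueError as error:
        print("rejected: %s" % error)
```

Keeping the rule in a small function like this also makes it easy to unit test without a running Splunk instance.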
Override stream_events
The stream_events function handles all the action: Splunk calls this modular input without arguments, streams XML describing the inputs to stdin, and waits for XML on stdout describing events.
    def stream_events(self, inputs, ew):
        # Go through each input for this modular input
        # (items() works on both Python 2 and Python 3)
        for input_name, input_item in inputs.inputs.items():
            # Get the values, cast them as floats
            minimum = float(input_item["min"])
            maximum = float(input_item["max"])

            # Create an Event object, and set its data fields
            event = Event()
            event.stanza = input_name
            event.data = "number=\"%s\"" % str(random.uniform(minimum, maximum))

            # Tell the EventWriter to write this event
            ew.write_event(event)
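The SDK hides the XML exchange from you, but it can help to see roughly what travels over stdin and stdout. The sketch below uses only the standard library. The sample XML is a trimmed-down illustration of the input definition (real input carries more elements, such as server details), the stanza name random_numbers://demo is invented, and the helper names parse_inputs and build_event_xml are my own, not SDK functions.

```python
import random
import xml.etree.ElementTree as ET

# Illustrative input definition, trimmed to the part this sketch uses.
SAMPLE_INPUT_XML = """
<input>
  <configuration>
    <stanza name="random_numbers://demo">
      <param name="min">0</param>
      <param name="max">10</param>
    </stanza>
  </configuration>
</input>
"""


def parse_inputs(xml_text):
    """Return {stanza name: {parameter name: value}} from the input XML."""
    root = ET.fromstring(xml_text)
    inputs = {}
    for stanza in root.iter("stanza"):
        params = {param.get("name"): param.text for param in stanza.iter("param")}
        inputs[stanza.get("name")] = params
    return inputs


def build_event_xml(stanza_name, data):
    """Build a <stream> element containing a single <event>."""
    stream = ET.Element("stream")
    event = ET.SubElement(stream, "event", stanza=stanza_name)
    ET.SubElement(event, "data").text = data
    return ET.tostring(stream, encoding="unicode")


if __name__ == "__main__":
    for name, params in parse_inputs(SAMPLE_INPUT_XML).items():
        value = random.uniform(float(params["min"]), float(params["max"]))
        print(build_event_xml(name, 'number="%s"' % value))
```

In the actual modular input you never build this XML by hand; the Event and EventWriter classes take care of it.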
Bringing it all together
Let’s bring all the functions together for our complete MyScript class. In addition, we need to add these 2 lines at the end of random_numbers.py to actually run the modular input script:
if __name__ == "__main__":
    sys.exit(MyScript().run(sys.argv))
Here is the complete random_numbers.py:
import random, sys

from splunklib.modularinput import *

try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

class MyScript(Script):

    def get_scheme(self):
        scheme = Scheme("Random Numbers")
        scheme.description = "Streams events containing a random number."

        # If you set external validation to True, without overriding
        # validate_input, the script will accept anything as valid.
        # Generally you only need external validation if there are
        # relationships you must maintain among the parameters,
        # such as requiring min to be less than max in this
        # example, or you need to check that some resource is
        # reachable or valid. Otherwise, Splunk lets you
        # specify a validation string for each argument
        # and will run validation internally using that string.
        scheme.use_external_validation = True
        scheme.use_single_instance = True

        min_argument = Argument("min")
        min_argument.data_type = Argument.data_type_number
        min_argument.description = "Minimum random number to be produced by this input."
        min_argument.required_on_create = True
        # If you are not using external validation, add something like:
        #
        # min_argument.validation = "min > 0"
        scheme.add_argument(min_argument)

        max_argument = Argument("max")
        max_argument.data_type = Argument.data_type_number
        max_argument.description = "Maximum random number to be produced by this input."
        max_argument.required_on_create = True
        scheme.add_argument(max_argument)

        return scheme

    def validate_input(self, validation_definition):
        # Get the parameters from the ValidationDefinition object,
        # then typecast the values as floats
        minimum = float(validation_definition.parameters["min"])
        maximum = float(validation_definition.parameters["max"])

        if minimum >= maximum:
            # The format arguments must be passed as a tuple
            raise ValueError("min must be less than max; found min=%f, max=%f" % (minimum, maximum))

    def stream_events(self, inputs, ew):
        # Go through each input for this modular input
        # (items() works on both Python 2 and Python 3)
        for input_name, input_item in inputs.inputs.items():
            # Get the values, cast them as floats
            minimum = float(input_item["min"])
            maximum = float(input_item["max"])

            # Create an Event object, and set its data fields
            event = Event()
            event.stanza = input_name
            event.data = "number=\"%s\"" % str(random.uniform(minimum, maximum))

            # Tell the EventWriter to write this event
            ew.write_event(event)

if __name__ == "__main__":
    sys.exit(MyScript().run(sys.argv))
Optional: set up logging
It’s best practice for your modular input script to log diagnostic data to splunkd.log. Use an EventWriter’s log method to write log messages, each of which includes both a standard splunkd.log level (such as DEBUG or ERROR) and a descriptive message.
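As a rough illustration of what such a log line looks like, the sketch below writes a level followed by a message to stderr, which is how I understand the SDK’s log method to behave. The log_message helper is a stand-in written for this example and is not part of the SDK; inside stream_events you would call the EventWriter instead, as noted in the comments.

```python
import sys


def log_message(severity, message, stream=None):
    """Write "<LEVEL> <message>" on one line, flushing immediately.

    Stand-in for illustration only. In a real modular input the
    equivalent call would be something like:
        ew.log(EventWriter.INFO, "Fetched input configuration")
    """
    stream = sys.stderr if stream is None else stream
    stream.write("%s %s\n" % (severity, message))
    stream.flush()


if __name__ == "__main__":
    log_message("INFO", "Starting random_numbers input")
    log_message("ERROR", "min must be less than max")
```

Logging to stderr rather than stdout matters here, because stdout is reserved for the event stream.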
Add the modular input to Splunk
Our script is ready, so now let’s prepare to add this modular input to Splunk.
Package the script and the SDK library
To add a modular input that you’ve created in Python to Splunk, you’ll need to first add the script as a Splunk app.
- Create a directory that corresponds to the name of your modular input script—for instance, random_numbers—in a location such as your Documents directory. (You’ll copy the directory over to your Splunk directory at the end of this process.)
- In the directory you just created, create the following three empty directories:
- bin
- default
- README
- From the root level of the Splunk SDK for Python, copy the splunklib directory into the bin directory you just created.
- Copy the modular input Python script (for instance, random_numbers.py) into the bin directory. Your app directory structure should now look like the following:
.../
    bin/
        app_name.py
        splunklib/
            __init__.py
            ...
    default/
    README/
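If you would rather script the directory setup, the steps above can be sketched as follows. The function name create_app_skeleton is made up for this example, and the demo builds the layout in a temporary directory so it does not touch your Documents or Splunk directories. Copying splunklib and your script into bin is still left to you.

```python
import tempfile
from pathlib import Path


def create_app_skeleton(parent_dir, app_name):
    """Create <parent_dir>/<app_name> with empty bin, default, and README dirs."""
    app_dir = Path(parent_dir) / app_name
    for subdirectory in ("bin", "default", "README"):
        # parents=True also creates the app directory itself on first use.
        (app_dir / subdirectory).mkdir(parents=True, exist_ok=True)
    return app_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        app_dir = create_app_skeleton(scratch, "random_numbers")
        for path in sorted(app_dir.iterdir()):
            print(path.name)
```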
Create an app.conf file
Within the default directory, create a file called app.conf. This file is used to maintain the state of an app or customize certain aspects of it in Splunk. The contents of the app.conf file can be very simple:
[install]
is_configured = 0

[ui]
is_visible = 1
label = My App

[launcher]
author = Splunk Inc
description = My app is awesome.
version = 1.0
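A quick way to catch typos in a file this simple is to read it back with Python’s configparser. This is only a sanity check under an assumption: Splunk .conf files are not strictly INI files, so this approach is reasonable for a small app.conf like the one above but may not parse every Splunk configuration file. The read_app_conf helper is hypothetical.

```python
import configparser

# Same contents as the app.conf shown above.
APP_CONF_TEXT = """\
[install]
is_configured = 0

[ui]
is_visible = 1
label = My App

[launcher]
author = Splunk Inc
description = My app is awesome.
version = 1.0
"""


def read_app_conf(text):
    """Parse app.conf text into {stanza: {key: value}} for inspection."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


if __name__ == "__main__":
    for stanza, settings in read_app_conf(APP_CONF_TEXT).items():
        print(stanza, settings)
```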
For more examples of what to put in the app.conf file, see the corresponding files in the modular inputs examples.
Create an inputs.conf.spec file
You need to define the configuration for your modular input by creating an inputs.conf.spec file manually. See Create a modular input spec file in the main Splunk documentation for instructions, or take a look at the SDK samples’ inputs.conf.spec file, which is in the application’s README directory. For instance, the following is the contents of the random numbers example’s inputs.conf.spec file:
[random_numbers://<name>]
*Generates events containing a random floating point number.

min = <value>
max = <value>
Move the modular input script into your Splunk install
Your directory structure should look something like this:
.../
    bin/
        app_name.py
        splunklib/
            __init__.py
            ...
    default/
        app.conf
    README/
        inputs.conf.spec
The final step to install the modular input is to copy the app directory to the following path: $SPLUNK_HOME/etc/apps/
Restart Splunk, and on the App menu, click Manage apps. If you wrote your modular input script correctly, the name of the modular input (for instance, Random Numbers) will appear here. If not, go back and double-check your script. You can do this by running
python random_numbers.py --scheme
and
python random_numbers.py --validate-arguments
from the bin directory of your modular input. These commands verify that your scheme and arguments are configured correctly, and they also catch any indentation issues that could cause errors.
If your modular input appears in the list of apps, go to Splunk Manager (or, in Splunk 6.0 or later, the Settings menu) and, under Data, click Data inputs. Your modular input will also be listed here. Click Add new, fill in any settings your modular input requires, and click Save.
Congratulations, you’ve now configured an instance of your modular input as a Splunk input!