Preparing an analytic for Lumen
Before uploading an analytic to the Lumen platform, a few simple steps need to be completed to ensure a smooth experience. The sections below describe in detail how to prepare an analytic for Lumen.
Preparing a Python analytic for Lumen
This section describes the actions required to prepare a Python analytic for onboarding onto Lumen.
The analytic should be separated from the data it runs on. This means the analytic code should contain no calls to databases, APIs or other external systems, and should not load input data directly from files. This separation ensures that both the analytic and its data sources can be used by other components in the system if required, and that a data source can be updated without affecting the analytic.
Input should be passed into the analytic as command-line parameters. The first parameter should always be a Universally Unique Identifier (UUID), which allows Lumen to match executions to results. All other required inputs should follow it.
Tip: Python receives all command-line arguments as strings. These can be cast to the correct type on assignment, for example with int() or float().
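As a minimal sketch of the argument handling described above, the function below reads the UUID from the first parameter and casts the remaining parameters to their intended types. The inputs threshold and window, and the function name parse_args, are hypothetical and only illustrate the pattern; they are not part of Lumen.

```python
import uuid


def parse_args(argv):
    """Return (execution_id, threshold, window) from a sys.argv-style list.

    threshold and window are hypothetical inputs used only for illustration.
    """
    if len(argv) != 4:
        raise SystemExit("usage: python analytic.py <uuid> <threshold> <window>")
    execution_id = argv[1]
    uuid.UUID(execution_id)  # raises ValueError if the string is not a valid UUID
    threshold = float(argv[2])  # arguments arrive as strings, so cast on assignment
    window = int(argv[3])
    return execution_id, threshold, window
```

In the analytic itself this would typically be called as parse_args(sys.argv) after importing sys. The UUID is kept as the original string so that it is handed back unchanged when results are saved.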
Dependent files, such as model weights or static data, can still be used within the analytic. These can be packaged with the code and loaded by the analytic at run time from the current working directory.
Tip: Files in the working directory can be accessed by file name alone, without a directory path, e.g. read_csv('filename.csv') in pandas. Ensure your code contains no directory paths to these files, because files are stored differently on Lumen than on your local machine.
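The following sketch shows one way to load a packaged file by bare file name, using only the standard library. The file name lookup.csv and the function name load_lookup are assumptions made for the example.

```python
import csv


def load_lookup(filename="lookup.csv"):
    """Read a packaged CSV file into a list of row dictionaries.

    lookup.csv is a hypothetical file name. Because no directory is given,
    the name is resolved against the current working directory.
    """
    with open(filename, newline="") as handle:
        return list(csv.DictReader(handle))
```

Because the name carries no directory component, the same call should work unchanged wherever the packaged files are unpacked, provided the analytic is started from that directory.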
Results from the analytic should be captured in a dictionary as key-value pairs. Each key is the field name, a string that identifies the result, and each value is the result itself. Lumen places no limit on the number of outputs.
Tip: Each output key-value becomes a separate piece of data that can be used by other components within Lumen. It is therefore good practice to split up output into individual values where appropriate.
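To illustrate splitting output into individual values, the sketch below returns three separate results rather than one combined structure. The field names and the summarise function are hypothetical.

```python
import statistics


def summarise(values):
    """Return each result under its own string key (field names are illustrative)."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "maximum": max(values),
    }
```

Each of the three keys would then be available as a separate piece of data, instead of a single nested value that downstream components would have to unpack.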
Lumen uses Redis as a common mechanism across languages for capturing results. This requires two functions to be added to the analytic: save_to_redis and save_exeption_to_redis.
At the end of the analytic's execution, the dictionary of results should be saved to Redis using the provided save_to_redis function and the UUID.
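Lumen supplies the real save functions, and those should be used as provided. Purely to show the shape of the final save step, here is a hypothetical stand-in: the key layout (the UUID as the key), the JSON encoding and the optional client parameter are all assumptions, not Lumen's actual implementation.

```python
import json


def save_to_redis(execution_id, results, client=None):
    """Hypothetical stand-in for the function Lumen provides.

    Assumes results are stored as a JSON string under the execution UUID.
    """
    if client is None:
        import redis  # redis-py, imported lazily so the sketch loads without it

        client = redis.Redis()  # connects to localhost:6379 by default
    client.set(execution_id, json.dumps(results))
```

With this stand-in, the last line of the analytic would be a call such as save_to_redis(execution_id, results). The optional client argument exists only so the sketch can be exercised without a running Redis server.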
The core functionality of the analytic should be wrapped in a try/except statement (Python's equivalent of try/catch). This allows any errors within the analytic to be captured at run time and displayed back to the user.
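A minimal sketch of this wrapping is shown below. The save functions are passed in as parameters so the example stays self-contained; in a real analytic they would be the save_to_redis and save_exeption_to_redis functions Lumen provides. Their exact signatures are not given here, so the (uuid, payload) form used below is an assumption, as is the name run_analytic.

```python
import traceback


def run_analytic(execution_id, compute, save_results, save_exception):
    """Run compute() and report either its results or the error it raised.

    save_results and save_exception are assumed to accept (uuid, payload).
    """
    try:
        results = compute()
        save_results(execution_id, results)
    except Exception:
        # Capture the full traceback text so it can be shown to the user
        save_exception(execution_id, traceback.format_exc())
```

Keeping the save call inside the try block means a failure while saving is reported in the same way as a failure in the computation.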
The analytic should be able to run from the command line, and it is recommended to test it this way before onboarding to Lumen. This may require installing Redis locally to capture outputs, or temporarily commenting out the save-to-Redis call and printing the results to the command line instead.
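For local testing without Redis, one option is to swap the save call for a function that prints the results. The sketch below is a hypothetical helper, not part of Lumen, and it assumes the result values can be serialised as JSON.

```python
import json


def print_results(execution_id, results):
    """Local stand-in for the save step: print instead of writing to Redis."""
    print(json.dumps({"uuid": execution_id, "results": results}, indent=2))
```

A test UUID for the first command-line parameter can be generated with uuid.uuid4() from Python's standard uuid module.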
Any external libraries used in the analytic, such as NumPy, SciPy, pandas or scikit-learn, need to be listed in a requirements.txt file. This allows Lumen to configure the correct environment when running the analytic. Version numbers can be pinned for these libraries in the requirements file to ensure compatibility.
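One way to obtain pinned version lines for requirements.txt is to ask the local environment which versions are installed. The sketch below uses the standard importlib.metadata module (Python 3.8 or later); the function name pinned_requirements is hypothetical, and the packages passed to it are up to the analytic's author.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages):
    """Return 'name==version' lines for the packages installed locally."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed in this environment, so there is no version to pin
            continue
    return lines
```

Writing the returned lines to requirements.txt, one per line, pins the analytic to the versions it was tested with locally.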
All files required to run the analytic, including the code, the requirements.txt and any other dependent files, should be placed in the root directory of a folder, and then added to a zip file. This file is then ready to be uploaded to Lumen.
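The packaging step can also be scripted. The sketch below uses the standard zipfile module and assumes that Lumen expects the files at the top level of the archive, with no enclosing folder; if the folder itself should be included, the arcname argument would need adjusting. The function name build_package is hypothetical.

```python
import zipfile
from pathlib import Path


def build_package(source_dir, zip_path):
    """Zip every file directly inside source_dir at the root of the archive.

    zip_path should point outside source_dir so the archive does not
    try to include itself.
    """
    source = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.iterdir()):
            if path.is_file():
                # arcname drops the directory so the file sits at the zip root
                archive.write(path, arcname=path.name)
    return zip_path
```

Subdirectories are skipped on purpose, which matches the advice to keep all files in the root of the folder.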