RapidMiner Integration
Use the vovrtsad daemon for integration with the RapidMiner Runtime Scoring Agent.
RapidMiner Overview
- Create models: Use automated, visual, and code-based approaches to streamline model creation
- Train and evaluate models: Use the latest data analytics techniques to train, evaluate, explain, and deploy models
- Build decision trees: Use a wizard-based tool to build explainable decision trees that visualize complex interactions within data
- Design and prototype AI models: Use a visual drag-and-drop workflow designer to design and prototype AI and machine learning models
- Prepare data: Use interactive data prep capabilities to import, clean up, and prepare data
- Deploy models: Deploy models directly on new data
Integrate vovrtsad with RapidMiner Runtime Scoring Agent
- Obtain a copy of RapidMiner's Runtime Scoring Agent.
For the remainder of this documentation, assume that the RapidMiner Scoring Agent has been installed in the directory /opt/rapidminer-scoring-agent-embedded. Note where you installed this executable and its related files on your system, because your vovrtsad daemon configuration file must point to this location in order to retrain models.
The current Accelerator configuration setup expects a minimum of three RapidMiner AI Studio processes in a deployment: one process for duration predictions, one process (preferably empty) used as a quick ping to check REST API connectivity, and one process for retraining the duration prediction model. If you are not going to retrain models, you can work with a containerized version of the Scoring Agent directly, but you will lose the ability to update models directly with the Scoring Agent.
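The path convention used throughout this guide can be captured in a couple of shell variables. The install prefix below is only the example location assumed in this documentation; adjust it to wherever you actually unpacked the Scoring Agent. Only the home/deployments subdirectory layout comes from the Scoring Agent itself:

```shell
# Example install prefix assumed throughout this guide; adjust RTSA_HOME
# to wherever you actually unpacked the Runtime Scoring Agent.
RTSA_HOME=/opt/rapidminer-scoring-agent-embedded
# Deployment archives (*.zip) exported from AI Hub are copied under
# home/deployments in a later step.
DEPLOY_DIR="$RTSA_HOME/home/deployments"
echo "$DEPLOY_DIR"
```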
- Create a new Accelerator project.
Alternatively, you can add the required config files and daemon to an existing Accelerator project. For the rest of this document, assume the active Accelerator project directory is ~/vov/vnc/ncTest.swd.
SWD=$(nc cmd vovserverdir -p .)
ncmgr start -force -q ncTest
vovproject enable ncTest
Tip: A new silent option is available under Accelerator's nc run command for integrating with RapidMiner's Runtime Scoring Agent via the -predict option flag. For example:
nc run -predict -- vovmemtime 5 0
- Set up a new vovrtsad daemon and start it. Template files have been provided that you can copy to get started.
cp $VOVDIR/../common/etc/autostart/start_vovrtsad.tcl ~/vov/vnc/ncTest.swd/autostart/start_vovrtsad.tcl
mkdir -p ~/vov/vnc/ncTest.swd/vovrtsad
cp $VOVDIR/../common/etc/config/vovrtsad/config.tcl ~/vov/vnc/ncTest.swd/vovrtsad/config.tcl
cp $VOVDIR/../common/etc/config/vovrtsad/vovrtsad.cfg ~/vov/vnc/ncTest.swd/vovrtsad/vovrtsad.cfg
vovdaemon start vovrtsad
- Configure the vovrtsad daemon by editing the ~/vov/vnc/ncTest.swd/vovrtsad/vovrtsad.cfg file to match the expected data output from the RapidMiner Embedded Runtime Scoring Agent.
This is covered in a later step; for now, prepare to work with other RapidMiner tools such as RapidMiner AI Studio and RapidMiner AI Hub to create the initial model and a deployment file that can later be used with the Embedded Scoring Agent to predict duration or other values of interest from Accelerator.

- Once the data is prepared and ready for training, select one of the pre-existing model operators in AI Studio and train the model that you will use with Accelerator for predictions. For illustration, here is an example of an AI Studio process that uses a Decision Tree model operator. The actual process that you would use for predictions will be a series of operations that combine data prep, like the one shown above, followed by loading and applying the model created below.
Figure 2.
The prediction process should end up looking something like this:
Figure 3.
- Once you have a trained model, and all of your processes (predict, retrain, ping) are ready to your satisfaction, create a snapshot and add it to AI Hub. From that published snapshot in AI Hub you will create a deployment file (*.zip) that is used inside of Accelerator for the predictions. The following images show the key steps for creating endpoints for other devices and deployments for a project in AI Hub. Consult the Altair RapidMiner AI Hub documentation for more details.
Figure 4.
Figure 5.
- Use the newly created deployment *.zip file to integrate Accelerator with the RapidMiner Runtime Scoring Agent. Copy it to the proper location inside the rapidminer-scoring-agent installation from Step 1, which for this example is the folder /opt/rapidminer-scoring-agent-embedded/home/deployments.
- For the REST API portion of the config file, make sure that the URL, PORT, and URN portions of the file match what was configured for your RapidMiner AI Hub server; these are the same values used by the runtime-scoring-agent (RTSA). The following values should follow their respective rules:
| Value | Rules |
|---|---|
| Project name and deployment | Must match the values used in AI Hub |
| Predict, retrain, and ping | Must match the processes created in AI Studio, and published and deployed via AI Hub to your *.zip deployment file |
| User and host | Must be supplied if the ssh value is enabled and non-zero |
| Refresh | How long, in seconds, between attempts to update the vovrtsad.cfg config file |
| Start | The interval from the current time used for selecting retraining data, in typical timespec values such as 1y, 1m, 1w, 1d. If this value is not set, all files found in *.swd/data/jobs are used by default. |
| Debug | Toggles verbose debugging information |
| csvPath, csvDir, and csvName | Specify external locations outside of the default *.swd/data/jobs/rtsa/*.csv locations |
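Putting these rules together, the REST portion of vovrtsad.cfg might look like the sketch below. Every value here is a placeholder, and the exact key names may differ in your copy of the template, so always start from the vovrtsad.cfg template copied in the earlier step rather than from this fragment:

```
"url"     = "http://aihub.example.com",
"port"    = "8090",
"urn"     = "ncTest-deployment",
"predict" = "predict_duration",
"retrain" = "retrain_duration",
"ping"    = "ping",
"refresh" = "300",
"start"   = "1m",
"debug"   = "0",
```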
- To map data between the RTSA embedded REST API endpoints and Accelerator job submission data, specify the same discrete values defined by the Discretize operator in your Altair RapidMiner AI Studio processes in your vovrtsad.cfg list of values, as shown below. This is how Accelerator matches predicted values and replaces expected values with the prediction results. Both the class names and the upper limits should match the values used in the config file.
Figure 6. 
Figure 7. 
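To illustrate the mapping described above: discretization assigns a value to a class by comparing it against an ordered list of upper limits. The class names and limits in this sketch are purely hypothetical placeholders; the real ones must match your Discretize operator settings and your vovrtsad.cfg values.

```shell
# Hypothetical discretization of a duration (in seconds) into classes.
# The names "short"/"medium"/"long" and the limits 60/3600 are examples
# only; use the class names and upper limits from your own process.
classify() {
  local secs=$1
  if   [ "$secs" -le 60 ];   then echo "short"
  elif [ "$secs" -le 3600 ]; then echo "medium"
  else                            echo "long"
  fi
}

classify 45     # -> short
classify 7200   # -> long
```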
The value used to map to the returned prediction output is prediction(DURATIONML). If you call your prediction attribute something other than DURATIONML, you must rename the corresponding attribute in the vovrtsad.cfg file. The scoring process name is only used for reference. The remaining values are used by the vovrtsad daemon or the nc_rtsa wrapper for various purposes. Use the confidence_minimum value as the threshold that must be met to apply a prediction value, such as 0.6250 in this example. The quality_minimum value is used for retraining; in this example it is set to 0.4000. The frequency at which the daemon checks for potential retraining is set via check_quality_secs, in seconds, so once an hour in this example. Finally, retrain_job_minimum sets the minimum number of completed jobs that must be monitored by the vovrtsad daemon before it will permit model retraining.
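The tuning values described above could then appear in vovrtsad.cfg roughly as follows. The first four values come from this example; the retrain_job_minimum figure is an arbitrary placeholder, and all key names should be verified against your template file:

```
"prediction"          = "prediction(DURATIONML)",
"confidence_minimum"  = "0.6250",
"quality_minimum"     = "0.4000",
"check_quality_secs"  = "3600",
"retrain_job_minimum" = "100",
```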
- The vovrtsa_util tool also offers options for obscuring the key used for launching the scoring agent in the vovrtsad.cfg config file (-e, -u, -g). Here is an example usage:
$ vovrtsa_util -g
aVV1AGhVO63rDAwGWNU3Vt+9T/FDZzFEa/zvk3YKmek=
$ vovrtsa_util -e "My string" -u aVV1AGhVO63rDAwGWNU3Vt+9T/FDZzFEa/zvk3YKmek=
33:wiATyTB7ATpsWGYcu5HkARNq7yyVFhpDD7r8m/3RndBo
As used from within the vovrtsad.cfg config file:
"key" = "33:wiATyTB7ATpsWGYcu5HkARNq7yyVFhpDD7r8m/3RndBo",
"pkey" = "aVV1AGhVO63rDAwGWNU3Vt+9T/FDZzFEa/zvk3YKmek=",