Kinetica Blackbox Software Development Kit (SDK) Guide

The Kinetica Blackbox SDK assists users in creating blackbox models to wrap existing code/functionality and make it deployable within the Kinetica system. The Active Analytics Workbench (AAW) can currently import only blackbox models that have been containerized and that implement the Blackbox SDK. Users provide the Python module scripts and modify some SDK files; the SDK then builds a Docker container from the files and publishes it to a given Docker Registry (private or public).

For help with containerizing models, the Kinetica Blackbox Wizard is available via the Model + Analytics portion of the AAW User Interface (UI).

Prerequisites

Download and Configuration

Download the Blackbox SDK from GitHub and select a version that is compatible with the current database version. The SDK version should be less than or equal to the current version of the database that the blackbox model will be running against. For example, if Kinetica is at version 7.1.0.0, the SDK tag version should be less than or equal to 7.1.0.0.

Important

Models published using the 7.1.x.y Blackbox SDK are compatible with AAW 7.0 and AAW 7.1. If upgrading the database to 7.1.x.y, models published using the 7.0.15.x or earlier Blackbox SDK must be upgraded to the 7.1.x.y Blackbox SDK and then republished before they can be used again.

  1. Clone the project and change directory into the folder:

    git clone https://github.com/kineticadb/container-kml-blackbox-sdk.git
    cd container-kml-blackbox-sdk
    
  2. Get a list of tags, which correspond to Blackbox SDK versions, for the repository:

    git tag -l
    
  3. Checkout the desired tagged version of the repository:

    git checkout tags/<tag_name>
    

    Note

    The latest compatible version is preferred.
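
The compatibility rule described above (an SDK tag no newer than the database version, with the newest such tag preferred) can be sketched in Python. This is a minimal illustration and not part of the SDK: the helper names are hypothetical, and it assumes the tags are plain dotted version numbers such as 7.1.0.0 (optionally prefixed with v), which may not hold for every tag in the repository.

```python
def parse_version(tag):
    """Turn a dotted tag such as '7.1.0.0' into a tuple of ints.

    Returns None for tags that are not purely numeric (assumption:
    such tags are not SDK release versions and can be ignored).
    """
    try:
        return tuple(int(part) for part in tag.strip().lstrip("v").split("."))
    except ValueError:
        return None


def latest_compatible_tag(tags, db_version):
    """Pick the newest tag whose version is <= the database version."""
    db = parse_version(db_version)
    compatible = []
    for tag in tags:
        version = parse_version(tag)
        if version is not None and version <= db:
            compatible.append((version, tag))
    # max() compares the version tuples first, so the newest version wins
    return max(compatible)[1] if compatible else None
```

The tag list would typically come from the output of git tag -l, one tag per line.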

Setup

The repository contains all the files needed to build and publish a blackbox model Docker container compatible with AAW. The important files and their function:

Warning

It is highly recommended that the sdk/* and bb_runner.sh files not be modified!

Filename            Description
sdk/bb_runner.py    Python script called from the Docker container entrypoint script (bb_runner.sh). Contains the code necessary for the Blackbox models to interface with the database.
sdk/validate.py     Python script used to validate the Blackbox model's respective spec.json file for container inspection.
bb_module_*.py      Python script containing sample model code. The default code is a template for you to reuse and/or replace.
bb_runner.sh        Entrypoint for the Docker container; this script will be run initially when AAW pulls the container for execution.
docker_release.log  Log file containing the published releases.
Dockerfile          File containing all the instructions for Docker to build the model image properly.
featmem_zoo.json    Sample featureset based on zoo animals.
release.sh          Utility script for building and publishing the model to a Docker Hub or Docker Registry.
repo_uri.info       File that contains the Docker repository, image, and tag for the Blackbox model that will be published by release.sh.
requirements.txt    Text file that stores the required Python libraries for the model. Default libraries (gpudb, pyzmq, requests, pandas) must be left intact.
spec.json           Template Blackbox model specification file used for container inspection.
VERSION             Matches the latest database version.

To set up the repository for publishing your model:

  1. Update bb_module_*.py with the desired model code. The model can contain as many methods as desired or call as many other modules as desired, but the default method must take a dictionary in (inMap) and return a dictionary (outMap):

    import math

    def predict_taxi_fare(inMap=None):

        # method code ...
        # (the elided code is expected to derive the trip distance, dist, from inMap)

        # Calculate fare amount from trip distance
        fare_amount = (dist * 3.9)

        outMap = {'fare_amount': fare_amount}

        return outMap
    
  2. Optionally, rename the bb_module_*.py scripts. If a module is renamed, the new name must be referenced appropriately when deploying the model via the AAW UI or the AAW REST API. See Usage for more information.

  3. Open the Dockerfile in an editor and include any required installations that are not easily installable with pip:

    RUN apt-get update && apt-get install -y git wget
    
  4. Add all module files:

    ADD <module file.py> ./
    

    Important

    By default, the Dockerfile includes a reference to bb_module_*.py. This reference must be updated if the file name was changed earlier.

  5. Open requirements.txt in an editor and include any additional required Python libraries, e.g.,

    numpy==1.16.3
    tensorflow
    

    Important

    The default gpudb, pyzmq, requests, and pandas packages inside requirements.txt must be left in the file.

  6. Update repo_uri.info with the desired Docker repository, image, and tag.

    Tip

    The Docker repository will be created if it doesn't exist.

  7. Update spec.json as necessary with the desired specification information for each module and function.
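
The default-method contract from step 1 (a dictionary in, a dictionary out) can be illustrated with a complete, runnable version of the taxi fare function. This is only a sketch under stated assumptions: the sample module elides how the trip distance is obtained, so the haversine helper, the use of miles, and the four input keys (borrowed from the example model definition later in this guide) are illustrative choices rather than the SDK's actual sample code.

```python
import math

FARE_PER_UNIT = 3.9  # rate used by the sample module
EARTH_RADIUS_MILES = 3958.8  # assumption: distance is measured in miles


def haversine_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two lon/lat points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def predict_taxi_fare(inMap=None):
    """Take a dictionary of input columns and return a dictionary of outputs."""
    inMap = inMap or {}

    # Hypothetical distance step: the original sample does not show this part
    dist = haversine_miles(
        float(inMap['pickup_longitude']), float(inMap['pickup_latitude']),
        float(inMap['dropoff_longitude']), float(inMap['dropoff_latitude']),
    )

    # Calculate fare amount from trip distance
    fare_amount = dist * FARE_PER_UNIT

    outMap = {'fare_amount': fare_amount}
    return outMap
```

Each key read from inMap corresponds to one input column of the model, and each key written to outMap corresponds to one output column.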

Usage

Publishing the Model

  1. Log in to your Docker Hub or Docker Registry:

    # Docker Hub
    docker login
    
    # Docker Registry
    docker login <hostname>:<port>
    
  2. Run the release.sh script to build a Docker image of the model and publish it to the provided Docker Hub or Docker Registry:

    ./release.sh
    

Importing the Model

After the model is published, it can be imported into AAW using either of two methods:

REST API

If using the REST API, a model is defined using JSON. The cURL command line tool can be used to send a JSON string or file to AAW. To import a blackbox model into AAW using cURL and the REST API:

  1. Define the model. Kinetica recommends placing the model definition inside a local JSON file.

  2. Post the JSON to the /model/blackbox/instance/create endpoint of the AAW REST API:

    # Using a JSON file
    curl -X POST -H "Content-Type: application/json" -d @<model_file>.json http://<kinetica-host>:9187/kml/model/blackbox/instance/create
    
    # Using a JSON string
    curl -X POST -H "Content-Type: application/json" -d '{"model_inst_name": "<model_name>", ... }' http://<kinetica-host>:9187/kml/model/blackbox/instance/create
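
The same POST can also be prepared from Python with only the standard library. The sketch below is an illustration, not an official client: build_create_request is a hypothetical helper, the host is a placeholder, and the port and path are simply copied from the cURL commands above. The format of the response is not described here, so the example only builds the request and leaves sending it to the caller.

```python
import json
import urllib.request


def build_create_request(definition, host, port=9187):
    """Build (but do not send) the POST request that creates a blackbox model."""
    url = "http://{}:{}/kml/model/blackbox/instance/create".format(host, port)
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send the request against a running AAW instance:
#     with urllib.request.urlopen(request) as response:
#         print(response.status, response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the URL and JSON body before anything reaches the server.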
    

To aid in creating the necessary JSON, use the following endpoint and schema:

Endpoint name: /model/blackbox/instance/create

Input parameters:

Name Type Description
model_inst_name string Name of the model.
model_inst_desc string Optional description of the model.
problem_type string Problem type for the model. Always BLACKBOX.
model_type string Type for the model. Always BLACKBOX.
input_record_type array of map(s) of strings to strings

An array containing a map for each input column. Requires two keys:

Name Type Description
col_name string Name for the input column.
col_type string Type for the input column.

Important

There will need to be as many maps (containing both name and type) as there are columns in the inMap variable inside the default blackbox module.

model_config map of strings to various

A map containing model configuration information.

Name Type Description
db_user string Username for database authentication.
db_pass string Password for database authentication.
blackbox_module string Module name for the blackbox model.
blackbox_function string Function name inside the blackbox module.
container string Docker URI for the container, e.g., <repo_name>/<image_name>:<tag_name>
output_record_type array of map(s) of strings to strings

An array containing a map for each output column. Similar to input_record_type, requires two keys:

  • col_name -- a string value representing the name of the output column
  • col_type -- a string value representing the type of the output column

Important

There will need to be as many maps in output_record_type as there are columns in the outMap variable inside the default blackbox module.
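
The two column-count requirements above can be checked before a model definition is posted. The following sketch assumes the definition has already been loaded into a Python dictionary and that sample inMap and outMap dictionaries are available; column_mismatches is a hypothetical helper, not something the SDK or AAW provides.

```python
def declared_columns(record_type):
    """Return the col_name of every map in a record type array."""
    return [column["col_name"] for column in record_type]


def column_mismatches(definition, in_map, out_map):
    """List columns that are declared but not used by the module, or vice versa."""
    problems = []
    declared_in = set(declared_columns(definition["input_record_type"]))
    declared_out = set(
        declared_columns(definition["model_config"]["output_record_type"]))

    # Symmetric difference: names present on one side only
    for name in sorted(declared_in ^ set(in_map)):
        problems.append("input column mismatch: " + name)
    for name in sorted(declared_out ^ set(out_map)):
        problems.append("output column mismatch: " + name)
    return problems
```

An empty list means every declared column has a matching key in inMap or outMap and no key is left undeclared.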

Example JSON:

The final JSON string should look similar to this:

{
  "model_inst_name": "Taxi Fare Predictor",
  "model_inst_desc": "Blackbox model for on-demand deployments",
  "problem_type": "BLACKBOX",
  "model_type": "BLACKBOX",
  "input_record_type": [
    {
      "col_name": "pickup_longitude",
      "col_type": "float"
    },
    {
      "col_name": "pickup_latitude",
      "col_type": "float"
    },
    {
      "col_name": "dropoff_longitude",
      "col_type": "float"
    },
    {
      "col_name": "dropoff_latitude",
      "col_type": "float"
    }
  ],
  "model_config": {
    "db_user": "",
    "db_pass": "",
    "blackbox_module": "bb_module_default",
    "blackbox_function": "predict_taxi_fare",
    "container": "kinetica/kinetica-blackbox-quickstart:latest",
    "output_record_type": [
      {
        "col_name": "fare_amount",
        "col_type": "double"
      }
    ]
  }
}
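
A definition with this shape can also be assembled programmatically, which helps when the column lists are long. The sketch below is one possible approach rather than an SDK feature: build_definition and columns are hypothetical helpers, and the key names are taken directly from the schema described above.

```python
import json


def columns(pairs):
    """Convert (name, type) pairs into the col_name/col_type maps AAW expects."""
    return [{"col_name": name, "col_type": col_type} for name, col_type in pairs]


def build_definition(name, module, function, container, inputs, outputs,
                     description="", db_user="", db_pass=""):
    """Assemble a blackbox model definition as a Python dictionary."""
    return {
        "model_inst_name": name,
        "model_inst_desc": description,
        "problem_type": "BLACKBOX",  # always BLACKBOX
        "model_type": "BLACKBOX",    # always BLACKBOX
        "input_record_type": columns(inputs),
        "model_config": {
            "db_user": db_user,
            "db_pass": db_pass,
            "blackbox_module": module,
            "blackbox_function": function,
            "container": container,
            "output_record_type": columns(outputs),
        },
    }


definition = build_definition(
    name="Taxi Fare Predictor",
    module="bb_module_default",
    function="predict_taxi_fare",
    container="kinetica/kinetica-blackbox-quickstart:latest",
    inputs=[("pickup_longitude", "float"), ("pickup_latitude", "float"),
            ("dropoff_longitude", "float"), ("dropoff_latitude", "float")],
    outputs=[("fare_amount", "double")],
    description="Blackbox model for on-demand deployments",
)
json_text = json.dumps(definition, indent=2)  # ready to save to a file or post
```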

AAW User Interface (UI)

The AAW UI offers a simpler WYSIWYG-style approach to importing a blackbox model. There are two ways you can import a Blackbox model: via automatic inspection or manually.

To import a blackbox model using inspection:

Important

Inspection requires that the model has a spec.json file included in the image; otherwise, AAW will not be able to infer the required information.

  1. Navigate to the AAW UI (http://<aaw-host>:8070)
  2. Click Models + Analytics.
  3. Click + Add Model ‣ New Blackbox.
  4. Optionally, provide Docker Registry Credentials.
  5. Input the Docker URI for the container, e.g., <repo_name>/<image_name>:<tag_name>
  6. Click Inspect. The first time you inspect a container, it may take a few minutes. If the inspection was a success, the container specification information (name, description, module, inputs, etc.) will appear.
  7. Review the container specifications and click Import.
  8. Edit and/or confirm the inferred fields, then click Create.

To import a blackbox model manually:

  1. Navigate to the AAW UI (http://<aaw-host>:8070)
  2. Click Models + Analytics.
  3. Click + Add Model ‣ New Blackbox.
  4. Under Create a Blackbox Model Manually, click Create.
  5. Provide a Model Name and optional Model Description.
  6. Input the Docker URI for the container, e.g., <repo_name>/<image_name>:<tag_name>
  7. Input the Module Name and Module Function.
  8. For Input Columns:
    1. Click Add Input Column to create input columns.
    2. Provide a Column name and Type.
  9. For Output Columns:
    1. Click Add Output Column to create output columns.
    2. Provide a Column name and Type.
  10. Click Create.

Example UI:

The final UI inputs should look similar to this:

../img/aaw_ui_new_bb_model_filled.png

Upgrading

Upgrading the SDK consists of pulling the base Blackbox SDK repository into a local fork and pushing the upgraded files to a remote repository.

  1. Change into the directory containing the Blackbox SDK fork and check out the master branch:

    cd ~/<repo-name> && git checkout master
    
  2. Ensure all changes are checked in and the master branch is clean:

    git status
    
  3. Back up the entire local repository:

    git archive --format=tar -o ../<repo-name>.master_$(date +"%Y-%m-%d_%T").tar HEAD
    
  4. Pull the desired branch from the base Blackbox SDK repository into your local fork:

    git pull https://github.com/kineticadb/container-kml-blackbox-sdk release/<version>
    
  5. Resolve conflicts and stage the appropriate changes.

    Important

    Make special note of any files to be added because they may contain release artifacts and repositories from the base Blackbox SDK repository.

  6. Publish the upgraded model and verify the release log looks correct:

    ./release.sh && cat docker_release.log
    
  7. Commit the merge and post-publishing artifacts:

    git commit -m "Upgrading models to <version> SDK and publishing."
    
  8. Review the changes and push them up:

    git push
    
  9. Optionally, remove the backup:

    rm -rf ../<repo-name>.master*.tar