Which Database for your AI Platform?

The database is often an underappreciated part of an effective AI platform. Kinetica brings high performance, lower costs, and ease of use, enabling the development of more robust features that lead to greater model efficacy.

Kinetica Aids Model Development and Model Deployment

Kinetica supports high-speed model inference and feature generation in an agile manner.

Feature Generation

Features are often derived from a variety of data sources, such as sensor data, machine data, video, and traditional transaction and log data. All of this should be stored in a data platform that can effectively support feature generation.

With its rich time-series and spatial capabilities, Kinetica can power more sophisticated features and hence increase the predictive power of ML models. Kinetica can also store the results of feature extraction algorithms, such as the movement of objects derived from drone or closed-circuit TV cameras.

Inference & Operationalization

Inference is the process by which a trained machine learning model makes predictions on new data. This typically involves serving the model through an API or other service that can accept requests and return predictions.

For more advanced ML models, Kinetica can store intermediate results or caches to improve the performance of the inference service. For example, Kinetica can continuously update and store the results of expensive computations every 10 seconds so that they do not need to be recomputed when predictions are requested 100 times every second.
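
The arithmetic behind that example can be sketched in a few lines of Python. This is a minimal in-process stand-in, not Kinetica's mechanism: the `RefreshingCache` class, the `simulate` helper, and the fake clock are all hypothetical names introduced for illustration, and the cache refreshes lazily on read rather than continuously in the background.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class RefreshingCache(Generic[T]):
    """Recompute an expensive value at most once per refresh interval."""

    def __init__(
        self,
        compute: Callable[[], T],
        interval_s: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._compute = compute
        self._interval_s = interval_s
        self._clock = clock  # injectable so the demo can use a fake clock
        self._value: Optional[T] = None
        self._computed_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        stale = (
            self._computed_at is None
            or now - self._computed_at >= self._interval_s
        )
        if stale:
            # Only the first request after the interval pays the full cost.
            self._value = self._compute()
            self._computed_at = now
        return self._value


def simulate(requests_per_s: int = 100, seconds: int = 30) -> int:
    """Serve requests on a simulated clock; return how often we recomputed."""
    calls = 0
    fake_now = [0.0]

    def expensive() -> int:
        nonlocal calls
        calls += 1
        return sum(i * i for i in range(1000))  # placeholder for real work

    cache = RefreshingCache(expensive, interval_s=10.0, clock=lambda: fake_now[0])
    for tick in range(requests_per_s * seconds):
        fake_now[0] = tick / requests_per_s
        cache.get()
    return calls
```

Under these assumptions, 30 simulated seconds at 100 requests per second means 3,000 reads but only a handful of recomputations, which is the saving the paragraph above describes.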

Try Kinetica Now: Kinetica Cloud is free for projects up to 10 GB. Get Started »

Features You'll Need

Performance

Kinetica is a distributed, in-memory, fully vectorized database. These three properties act as performance force multipliers for queries. Distributed headless ingest, native Kafka integration, and lockless tables ensure data is available as soon as it lands.
Kinetica Architecture »

Data Processing Capabilities

Kinetica offers window functions for time-series calculations, plus geo-joins and geometry functions for processing location-aware sensor data. These capabilities enable better feature engineering, which in turn improves ML efficacy.
Time-Series & Spatial Analytics »
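
As an illustration of the kind of time-series feature a window calculation produces, here is a minimal Python sketch of a trailing-window mean per sensor. It is not Kinetica code: the `Reading` layout, the `rolling_mean_features` function, and the window boundary convention are assumptions made for this example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple

# Hypothetical record layout: (sensor_id, timestamp in seconds, value)
Reading = Tuple[str, float, float]


def rolling_mean_features(
    readings: Iterable[Reading], window_s: float
) -> List[Tuple[str, float, float]]:
    """For each reading, emit the mean of that sensor's values in the
    trailing window (ts - window_s, ts].

    Readings must be ordered by timestamp within each sensor.
    """
    windows: Dict[str, Deque[Tuple[float, float]]] = defaultdict(deque)
    features: List[Tuple[str, float, float]] = []
    for sensor_id, ts, value in readings:
        win = windows[sensor_id]
        win.append((ts, value))
        # Evict readings that have fallen out of the trailing window.
        while win[0][0] <= ts - window_s:
            win.popleft()
        mean = sum(v for _, v in win) / len(win)
        features.append((sensor_id, ts, mean))
    return features


if __name__ == "__main__":
    sample = [
        ("a", 0, 10.0),
        ("a", 30, 20.0),
        ("b", 30, 100.0),
        ("a", 60, 30.0),
        ("a", 90, 40.0),
    ]
    for row in rolling_mean_features(sample, window_s=60):
        print(row)
```

Each sensor keeps its own window, much as a partitioned window calculation would, so readings from sensor "b" never affect the feature values for sensor "a".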

High Cardinality Joins

Joining large tables across many unique values can be particularly challenging for feature generation and inference. Kinetica is specifically designed for high-cardinality data and can handle large-scale joins more efficiently.
The Power of Vectorization »

High Speed Read/Writes

Kinetica's converged columnar and key-value data structures enable higher read and write speeds than would be expected from an analytical database. This reduces complexity when there’s a need to enable downstream applications infused with ML predictions.

Lower Costs

Your mileage will vary, so it’s always good to benchmark your unique workloads. A top Wall Street bank found that a workload running on 700 nodes of Spark ran faster on 16 nodes of Kinetica.
Do More with Less »

Model Management in SQL

Kinetica supports creating and managing ML container registries and models with SQL. A Docker registry serves as a container for ML models, while an ML model represents a function from which an inference can be computed.
Model Management with SQL »

Machine Learning Functions in Kinetica

Kinetica also gives you native prediction and outlier detection functions.

Prediction

The PREDICT table function predicts the values of the dependent variable that correspond to a given column of independent-variable values, using a base table containing "historical" values of each. That history table is the basis for calculating the prediction.
 

SELECT * FROM TABLE
(
    PREDICT
    (
        HISTORY_TABLE => INPUT_TABLE(example.ticket_prices),
        X_COLUMN => 'year',
        Y_COLUMN => 'cost',
        PREDICT_ON_TABLE => INPUT_TABLE(example.future_years),
        PREDICT_ON_COLUMN => 'year'
    )
)
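
To make the idea of predicting from a history table concrete, the sketch below fits an ordinary least-squares line to historical (x, y) pairs and evaluates it at new x values. The page does not say which model PREDICT fits, so the straight-line fit is an assumption, and `fit_line` and `predict` are hypothetical helper names rather than part of Kinetica.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(
    history: Sequence[Tuple[float, float]], predict_on: Sequence[float]
) -> List[Tuple[float, float]]:
    """Fit on historical (x, y) rows, then return (x, predicted_y) rows."""
    slope, intercept = fit_line(
        [x for x, _ in history], [y for _, y in history]
    )
    return [(x, slope * x + intercept) for x in predict_on]


if __name__ == "__main__":
    # Made-up stand-ins for the history and predict-on tables.
    ticket_prices = [(2020, 100.0), (2021, 110.0), (2022, 120.0), (2023, 130.0)]
    future_years = [2024, 2025]
    for year, cost in predict(ticket_prices, future_years):
        print(year, round(cost, 2))
```

The two arguments mirror the roles of the history table and the predict-on table in the query above; the sample rows are invented for illustration only.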

Outlier Detection

The OUTLIERS table function calculates the outliers in a given data set, based on a specified calculation type, threshold, and partition column. The partition column segments the data into subsets, one per unique partition column value, so that the outliers for each subset are determined independently of the other subsets.

SELECT * FROM TABLE
(
    OUTLIERS
    (
        DATA_TABLE     => INPUT_TABLE(example.employee),
        DATA_COLUMN    => 'salary',
        THRESHOLD_LOW  => -1,
        THRESHOLD_HIGH => 1
    )
)
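
One way to read low and high thresholds of -1 and 1 is as z-score bounds, applied separately within each partition. The sketch below takes that reading as an assumption; it is not a description of how OUTLIERS is implemented. The `zscore_outliers` function and the use of the population standard deviation are choices made for this example.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Hashable, Iterable, List, Tuple


def zscore_outliers(
    rows: Iterable[Tuple[Hashable, float]],
    threshold_low: float = -1.0,
    threshold_high: float = 1.0,
) -> List[Tuple[Hashable, float, float]]:
    """Return (partition, value, z_score) for values whose z-score falls
    outside [threshold_low, threshold_high] within their own partition."""
    groups: Dict[Hashable, List[float]] = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)

    outliers: List[Tuple[Hashable, float, float]] = []
    for key, values in groups.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation (assumed)
        if sigma == 0:
            continue  # every value is identical, so nothing stands out
        for value in values:
            z = (value - mu) / sigma
            if z < threshold_low or z > threshold_high:
                outliers.append((key, value, z))
    return outliers


if __name__ == "__main__":
    # Made-up salaries keyed by department, standing in for a partition column.
    salaries = [
        ("eng", 100), ("eng", 100), ("eng", 100), ("eng", 100), ("eng", 200),
        ("ops", 50), ("ops", 50),
    ]
    for dept, salary, z in zscore_outliers(salaries):
        print(dept, salary, round(z, 2))
```

Because each partition gets its own mean and standard deviation, a salary that is ordinary in one department can still be flagged in another.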

Book a Demo!

The best way to appreciate the possibilities that Kinetica brings to high-performance real-time analytics is to see it in action.

Contact us, and we'll give you a tour of Kinetica. We can also help you get started using it with your own data, your own schemas and your own queries.