StreamSets + Kinetica

A lightweight and powerful way to route and process real-time data streams into Kinetica.

Enterprises now face massive volumes of complex, streaming data from a variety of sources, a new paradigm known as extreme data. However, the traditional data integration model, which is based on structured batch data and stable data movement patterns, makes it difficult to analyze extreme data in real time. Together, StreamSets and Kinetica create a modern data architecture to solve these challenges. StreamSets enables continuous data movement and transformation across the enterprise, while Kinetica harnesses the power of GPUs to accelerate analytics on streaming data.

Introduction to StreamSets

StreamSets is a platform for developing and operating continuous data pipelines. It allows you to productionize your real-time data streams and move data efficiently into and across your infrastructure. StreamSets Data Collector is available for download. With StreamSets, you can connect traditional and big data sources to analytics platforms and applications quickly, with a minimum of hand-coding. To become a data-powered organization, you need to be able to wrangle data from many kinds of sources and platforms and analyze it in real time. StreamSets is a key part of that equation.

Kinetica Destination in StreamSets

Kinetica’s certified StreamSets destination makes it easy to stream data into Kinetica. The Kinetica destination writes data to a table in a Kinetica cluster using the Kinetica bulk inserter.

Because Kinetica is designed to work with big data sets in a distributed, clustered environment, it provides a parallel, high-performance mechanism (multi-head ingest) for fast data ingestion. Most interactions go through the single head node and are parceled out from there to the rest of the cluster. Multi-head operations, however, allow insertion (and retrieval) of data through transactions with cluster nodes directly, bypassing the head node and improving performance. This allows multi-head ingest to use the disk I/O and network bandwidth of every node in the cluster simultaneously, loading vast amounts of data quickly.
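The idea behind multi-head ingest can be illustrated with a small conceptual sketch. This is not the Kinetica bulk inserter or its API: the worker names, the `device_id` shard key, and the CRC32-based routing are all assumptions chosen for illustration, and the real client determines routing internally. The sketch only shows the general pattern of grouping records by the node that owns them so that each batch can be sent to that node directly instead of through a single head node.

```python
import zlib
from collections import defaultdict


def pick_worker(key, workers):
    """Map a shard-key value to one worker using a stable hash (illustrative only)."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return workers[digest % len(workers)]


def route_records(records, shard_key, workers):
    """Group records into one batch per destination worker."""
    batches = defaultdict(list)
    for record in records:
        batches[pick_worker(record[shard_key], workers)].append(record)
    return dict(batches)


# Hypothetical worker names and sample records, not real cluster addresses.
workers = ["worker-0", "worker-1", "worker-2"]
records = [{"device_id": i % 5, "reading": i * 0.5} for i in range(20)]

batches = route_records(records, "device_id", workers)
for worker, batch in sorted(batches.items()):
    # A real client would send each batch straight to its worker at this point.
    print(worker, len(batch))
```

Because every record with the same shard-key value maps to the same worker, the batches can be sent in parallel, which is what lets each node's disk and network capacity be used at the same time.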

For more information on the Kinetica destination in StreamSets, visit the StreamSets website.

StreamSets Reference Architecture

The Kinetica destination has been certified by StreamSets and meets their criteria for code development, security, and documentation.

Together, Kinetica and StreamSets can help you to manage and analyze your streaming data and become a real-time business.
