Release Notes


Version 7.0

Publish Date: 01/31/2019

New Features

Active Analytics Workbench

  • The Active Analytics Workbench (AAW) platform is now available. A new API and UI are available to improve existing machine learning workflows. TensorFlow and Blackbox (using Docker containers) models are supported. AAW also supports continuous deployments, on-demand deployments (via an inferencing REST endpoint), and audits of both TensorFlow models trained internally and imported models. AAW has a new ingestion interface that allows ingesting data from a variety of sources and methods, including Kinetica, PostgreSQL, and Kafka.

Core

  • Kinetica now supports Resource Management. Resource management involves the following three resources:
    • Storage Tiers: Data containment layers within the database (VRAM, RAM, Disk Cache, Cold Storage)
    • Tier Strategies: Data object eviction priorities within each storage tier to help define memory usage and data priorities
    • Resource Groups: Resource fencing -- process scheduling priorities and other limits imposed on specific groups of users
  • Partitioning is now available. Table data that is sharded or replicated can be partitioned to aid storage tiering and data skipping. The two types of supported partitioning schemes are:
    • Range
    • Interval
  • Multiple columns can now be altered in a single call via the new /alter/table/columns endpoint.
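
As a rough illustration of how the two partitioning schemes can support data skipping, the sketch below assigns values to range and interval partitions in plain Python. The function names, the partition bounds, and the "default" catch-all are hypothetical choices made for this example; it is not Kinetica's implementation or API.

```python
# Illustrative sketch only: names and semantics are assumptions,
# not Kinetica's partitioning implementation.

def range_partition(value, partitions):
    """Range scheme: each partition has explicit [low, high) bounds."""
    for name, low, high in partitions:
        if low <= value < high:
            return name
    return "default"  # hypothetical catch-all for values outside all ranges


def interval_partition(value, start, interval):
    """Interval scheme: fixed-width buckets counted from `start`."""
    return (value - start) // interval


def partitions_to_scan(query_low, query_high, start, interval):
    """Interval partitions overlapping the half-open range [query_low, query_high).

    Partitions outside the returned list can be skipped entirely.
    """
    first = interval_partition(query_low, start, interval)
    last = interval_partition(query_high - 1, start, interval)
    return list(range(first, last + 1))


yearly = [("y2017", 2017, 2018), ("y2018", 2018, 2019)]
in_2018 = range_partition(2018, yearly)
outside = range_partition(1999, yearly)
bucket = interval_partition(2018, start=2000, interval=5)
needed = partitions_to_scan(2003, 2011, start=2000, interval=5)
```

A filter on the partitioning column only has to touch the partitions in `needed`, which is the idea behind data skipping.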

Geospatial

  • Kinetica now includes a network graph solver server. The graph server provides a generic and extensible design of networks with the aim of being tailored or used for various real-life applications, including transportation, utility, social, and geospatial.
  • Kinetica now offers a Vector Tile Service (VTS) to generate Vector Tiles and support client-side visualization of geospatial data contained within the tiles. Generating Vector Tiles using Kinetica involves passing in the VTS URL to the client-side renderer.
  • The /wms endpoint now supports contour plot visualization functionality.

KAgent

  • The Kinetica Agent (KAgent) UI is now available to automate Kinetica installation and configuration. KAgent can automatically install NVIDIA drivers for CUDA installations, the Active Analytics Workbench (AAW), and Kubernetes (required for AAW). It can also configure the cluster for SSL and/or external authentication, as well as for high availability. KAgent is also cloud ready and able to provision and/or deploy to instances in Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure. KAgent also allows one to configure the location of the head node, AAW, and the graph server within the cluster. Upgrades will now be managed using KAgent.

SQL

  • Kinetica is now packaged with a new and improved SQL interpreter.
  • A new /execute/sql endpoint is available. You can now send SQL commands to the native API without an ODBC driver.
  • The SQL engine uses a query planner that analyzes an entire query for the many different ways it can be solved (e.g., performing a filter before a join, performing a filter after a join, etc.) and selects the most efficient plan. Once a plan has been used, it’s cached so that it can be reused without being recomputed when the same query is received again.
  • Complex SQL queries that involve multiple operations are now analyzed for interdependency. Any operations that have no dependencies on others are executed in parallel.
  • Distributed operations, such as UNIONs between sharded and replicated tables or distributed JOINs, are now possible using SQL syntax or /execute/sql. The database will automatically re-shard or replicate tables temporarily as necessary to help process the query. Note that the distributed query can be slower and use more memory than a traditional non-distributed operation. The sql.distributed_joins setting in /opt/gpudb/core/etc/gpudb.conf controls the ability to use distributed operations.
  • Correlated sub-queries are now supported.
  • New SQL support for:
    • EXPLAIN
    • Partitions
    • Tier Strategy Definitions
    • Logging Levels
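
The plan selection and caching behavior described above can be sketched in a few lines of Python. Everything here is a toy: the cost model, the two candidate plan names, the `CachingPlanner` class, and the whitespace normalization used for the cache key are assumptions made for illustration, not a description of how Kinetica's planner works internally.

```python
# Toy sketch of "pick the cheapest plan, then cache it".
# The cost model and all names are hypothetical.

def estimate_costs(left_rows, right_rows, selectivity):
    """Very rough row-count costs for two candidate plans."""
    return {
        # Scan the left table to filter it, then join only the survivors.
        "filter_then_join": left_rows + left_rows * selectivity * right_rows,
        # Join everything, then filter the full join output.
        "join_then_filter": 2 * left_rows * right_rows,
    }


class CachingPlanner:
    def __init__(self):
        self._cache = {}
        self.plans_computed = 0  # how many times planning actually ran

    def plan(self, sql, left_rows, right_rows, selectivity):
        # Assumption: queries differing only in whitespace count as "the same".
        key = " ".join(sql.split())
        if key not in self._cache:
            costs = estimate_costs(left_rows, right_rows, selectivity)
            self._cache[key] = min(costs, key=costs.get)
            self.plans_computed += 1
        return self._cache[key]


planner = CachingPlanner()
first = planner.plan("SELECT * FROM a JOIN b ON a.x = b.x WHERE a.y > 5",
                     left_rows=1000, right_rows=1000, selectivity=0.1)
second = planner.plan("SELECT *  FROM a JOIN b ON a.x = b.x  WHERE a.y > 5",
                      left_rows=1000, right_rows=1000, selectivity=0.1)
```

The second call returns the cached choice without re-running the cost comparison, so `plans_computed` stays at 1.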

Enhancements

UI

  • The Kinetica Administration Application (GAdmin) has undergone a visual refresh and now supports functionality for managing resource groups, interfacing with tier strategies and partitions, reviewing graph node and edge counts, and deleting graphs.
  • The Kinetica Reveal service has undergone a visual refresh and received usability improvements.

Version 6.2

Core

  • Dictionary encoding performance enhancements
  • Jobs can now be performed asynchronously
  • The Kinetica File System (KiFS) has been introduced to greatly ease ML workflows centered around file processing and file generation

Features

  • Non-HA multi-head primary key lookup is now possible using the RecordRetriever object available in the Java and Python APIs

  • HA support for multi-head primary key lookup and multi-head ingest

  • /alter/table jobs are now cancellable

  • A Host Manager-controlled alerting system for significant application-level events and hardware resource usage. Alerts can be managed via gpudb.conf, and you can view the most recent alerts in the Kinetica Administration Application (GAdmin)

  • Improvements to aggregation:

    • Aggregates listed in a HAVING expression no longer have to exist in the column name list
    • Aggregates can now be listed before grouping attributes in the column name list
    • Additional grouping functions (for both SQL and the native APIs):
  • Full Materialized View support via SQL or the native APIs--any number of source tables and intermediary tables & operations can be involved in creating a materialized view

  • Various improvements to the GAdmin user interface

  • Support for rank, partition, and window functionality in both SQL and the native APIs

  • The Python API has been updated to include an extension that enables increased speed when inserting and retrieving records from a GPUdbTable object

  • Joins can now be created using derived columns, e.g.,

    h_db.create_join_table(
      join_table_name="my_join",
      table_names=[table.alias("a"),table.alias("b")],
      column_names=["a.x as ax","b.y as by","a.x+b.y as c"],
      expressions=["a.x = b.x"]
    )
    
  • The /aggregate/statistics endpoint can now tune the behavior of the percentile() function using a second, comma-separated resolution value. The higher the resolution, the more accurate the estimation is but the longer the calculation takes, e.g., a 50th percentile resolution of 200:

    h_db.aggregate_statistics(
        table_name="my_table",
        column_name="col1",
        stats="count,min,percentile(50,200)"
    )
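
To make the accuracy versus cost trade-off of the resolution value more concrete, here is a minimal sketch of a binned percentile estimate in which `resolution` is the number of equal-width bins. This is an assumption chosen for illustration only; the release notes do not say how percentile() is computed internally, and `approx_percentile` is a hypothetical helper, not part of any Kinetica API.

```python
# Illustrative only: a histogram-style estimator, NOT Kinetica's algorithm.

def approx_percentile(values, pct, resolution):
    """Estimate the pct-th percentile using `resolution` equal-width bins."""
    low, high = min(values), max(values)
    if low == high:
        return float(low)
    width = (high - low) / resolution
    counts = [0] * resolution
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - low) / width), resolution - 1)
        counts[idx] += 1
    target = pct / 100 * len(values)
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= target:
            # Report the midpoint of the bin containing the target rank.
            return low + (i + 0.5) * width
    return float(high)


data = list(range(1001))           # true median is 500
coarse = approx_percentile(data, 50, 4)
fine = approx_percentile(data, 50, 200)
```

With more bins, the estimate lands closer to the true median, at the price of maintaining and scanning more counters.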
    

Geospatial

  • Improved symbology scaling

Security

  • Passwords now have a character limit of 1024, and user names and role names now have a character limit of 64

SQL

User-Defined Functions (UDFs)

  • Improved performance of non-distributed UDF read and write
  • A per-node concurrency limit can be set in GAdmin or via the max_concurrency_per_node option of the /create/proc endpoint
  • Each UDF API has access to a status field under ProcData that helps convey status information during UDF execution

Version 6.1.0

Features

  • The Kinetica deployment process has been simplified. The Kinetica Administration Application (GAdmin) interface handles all deployment administration, even across multiple nodes
  • The new Host Manager service is an independent supervisor process that manages several processes that are part of Kinetica, restarting them automatically should they fail unexpectedly. Host Manager also helps investigate problems and monitor cluster-wide performance
  • A new Job Manager feature monitors all incoming jobs for Kinetica and maintains request ordering across the nodes in your Kinetica cluster. Job Manager can assist in cancelling extremely long or stalled jobs
  • GAdmin's user interface has been overhauled to provide a more user-friendly experience, including inline documentation, APIs, and driver links/downloads; access to SQL Query Tool; and simplification of cluster administration. Also, GAdmin is now compatible with Internet Explorer

Security

  • Reveal and ODBC are now integrated with the Kinetica authorization system
  • Kinetica now supports mapping database roles to LDAP/Active Directory groups
  • Database auditing capability per user and per endpoint has been significantly improved
  • Additional settings have been added to the ODBC client and server configuration files to allow for a more comprehensive setup

Core

Geospatial

  • By moving to a new OpenGL Framework architecture, geospatial visualizations--most notably polygon and line geometries with high vertex counts, scatter plots, and complex graphs or charts--have been greatly improved
  • 2D Chart rendering has also been accelerated
  • Added the highest priority filters and geospatial functions as defined by ST_Geometry to enhance geospatial processing
  • Geospatial columns are now fully supported
  • Spatial joins via Expressions

SQL

  • All new geospatial functions have been exposed for SQL usage

  • Operations: INSERT, CREATE TEMP TABLE AS, CREATE VRAM TABLE AS

  • Functions: KI_SHARD_KEY

  • Data type: DATETIME

  • New KI hint that enables specifying a Job ID tag to assist in cancelling the job if necessary: KI_HINT_JOBID_PREFIX(x)

  • Additional alterable properties available:

    • ALTER TABLE ... SET ... COMPRESSION [TO compression_type]
    • ALTER TABLE ... SET [ACCESS MODE|PROTECTED]
    • ALTER TABLE ... [SET SCHEMA | MOVE TO] ...
  • Schema/collection alteration from SQL is now supported; operations available:

    • ALTER [SCHEMA | COLLECTION] ... ALLOW HOMOGENOUS [TABLES] <TRUE | FALSE>
    • ALTER [SCHEMA | COLLECTION] ... SET PROTECTED <TRUE | FALSE>
    • ALTER [SCHEMA | COLLECTION] ... RENAME TO ...
    • ALTER [SCHEMA | COLLECTION] ... SET TTL ...
  • Support for equality of lists, e.g.,

    SELECT *
    FROM calcs
    WHERE (str, num) = ('FURNITURE', 12.3);
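
The list-equality form is shorthand for comparing each element pairwise. The sketch below demonstrates this using Python's built-in sqlite3 module as a stand-in engine (SQLite 3.15 or later supports the same row-value syntax); it does not connect to Kinetica, and the sample rows are made up for the example.

```python
import sqlite3

# SQLite is used here only as a convenient stand-in, not as Kinetica.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calcs (str TEXT, num REAL)")
conn.executemany(
    "INSERT INTO calcs VALUES (?, ?)",
    [("FURNITURE", 12.3), ("FURNITURE", 7.0), ("TECHNOLOGY", 12.3)],
)

# List (row-value) equality, as in the release note example.
list_form = conn.execute(
    "SELECT * FROM calcs WHERE (str, num) = ('FURNITURE', 12.3)"
).fetchall()

# Equivalent element-by-element comparison.
expanded = conn.execute(
    "SELECT * FROM calcs WHERE str = 'FURNITURE' AND num = 12.3"
).fetchall()

conn.close()
```

Both queries select the same single row, since the list comparison holds only when every position matches.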
    

APIs

  • The Python API can be installed from PyPI
  • Database backups possible while the system is still online
  • The /wms endpoint has had numerous enhancements:
    • Support for tables with multiple geospatial columns
    • Additional RASTER and CB_RASTER mode options
    • The ability to use a default class for the CB_RASTER mode
  • The Java API BulkInserter will automatically retry any failed inserts to attempt to handle errors automatically
  • The PIVOT operation is enabled via the /aggregate/groupby endpoint options pivot and pivot_values. See Pivot for more information
  • The UNPIVOT operation is enabled via the new /aggregate/unpivot endpoint
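
For readers unfamiliar with the two operations, the following plain-Python sketch shows what pivoting and unpivoting do to a small result set. The `pivot` and `unpivot` helpers, their parameters, and the sample data are hypothetical; they only mimic the shape of the transformation and do not call the /aggregate/groupby or /aggregate/unpivot endpoints.

```python
# Conceptual sketch of PIVOT / UNPIVOT; not the Kinetica endpoints themselves.

def pivot(rows, key, pivot_col, value_col, pivot_values):
    """Turn one row per (key, pivot value) into one wide row per key, summing values."""
    wide = {}
    for row in rows:
        record = wide.setdefault(
            row[key], {key: row[key], **{v: None for v in pivot_values}}
        )
        name = row[pivot_col]
        if name in pivot_values:
            record[name] = (record[name] or 0) + row[value_col]
    return list(wide.values())


def unpivot(rows, key, columns, name_col, value_col):
    """Turn each listed column of a wide row back into its own row."""
    return [
        {key: row[key], name_col: col, value_col: row[col]}
        for row in rows
        for col in columns
    ]


long_rows = [
    {"name": "a", "quarter": "Q1", "sales": 10},
    {"name": "a", "quarter": "Q2", "sales": 20},
    {"name": "b", "quarter": "Q1", "sales": 5},
]
wide_rows = pivot(long_rows, "name", "quarter", "sales", ["Q1", "Q2"])
back_to_long = unpivot(wide_rows, "name", ["Q1", "Q2"], "quarter", "sales")
```

Here `wide_rows` has one row per name with a column per quarter, and `back_to_long` restores the row-per-quarter shape (including an empty Q2 entry for "b").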

Connectors

  • Spark Connector redesigned to focus on massive parallel ingestion performance
  • FME Connector now supports all native data types and null values

Reveal

  • Time series reports are now filterable
  • The roles and permissions model for non-admin users has been improved to provide a better out-of-the-box experience

User-Defined Functions (UDFs)

  • Performance enhancements to accelerate output table creation
  • Improved GPU multi-tenancy by including GPU usage metrics data structure
  • DeepChem package added to the User-Defined Function runtime environment
  • A UDF simulator script is packaged with the Python API that allows you to simulate running any type of UDF (Java, C++, or Python) without the UDF having to be created in the database

Cloud Deployment

  • Marketplace instances with proper licensing and monitoring hooks are now available for AWS and Azure
  • An auto-launcher for AWS and Azure has been developed to allow users to "bring your own license" and set cluster size parameters for easy button deployment

Version 6.0.1

Features

  • TensorFlow added to the User-Defined Function runtime environment

API

  • New C# language bindings for the Kinetica API
  • Improved default Java API ingest performance
  • Enhanced /alter/table support
  • New ttl (Time to Live) option for /create/union, /create/projection, /create/jointable, /aggregate/groupby, /aggregate/unique, /filter
  • Various helper classes and constants added to the APIs
  • New failover URL support for the Java API
  • Secure communications (HTTPS) enabled for the C++ API

Core

  • Improved performance of JOIN operations and expanded the types of JOINs supported
  • Various bug fixes and performance & stability improvements
  • Heatmaps are available in non-GPU mode

SQL

  • Operations: WITH (Common Table Expressions), INTERSECT, EXCEPT, UPDATE, DELETE, CREATE REPLICATED TABLE, CREATE [OR REPLACE] TABLE ... AS, ALTER TABLE, and DROP TABLE IF EXISTS
  • Functions: POSITION, CAST, DECODE, ZEROIFNULL, and DATEDIFF
  • Types: Fixed-limit strings to optimize memory usage & performance
  • New 32-bit Windows client for ODBC
  • Various enhancements for general SQL support



Version 6.0

Features

  • Reveal data exploration tool and analytics GUI
  • User-Defined Function Framework
    • User-defined binaries can receive table data, perform computations, and persist results in a distributed manner
    • Orchestration API provided in C++ and Java bindings; integrates with high-speed IPC layer of every data container
  • Visual Installer--allows for easy click-button installation across hundreds of nodes

Types

  • Full NULL support
  • Decimal
  • Date
  • Time

Data Operations

  • Union, intersect & minus set operations
  • Multi-column order-by support
  • Added projections--filters that allow column expressions in the result set column list
  • Moving average capability

Security

  • Integration with LDAP & Kerberos
  • Table-level data access control per user

Core

  • Increased node capacity & utilization, with active memory management; operations are host memory aware and write to disk, as needed
  • GPU-only column designation for accelerated query performance

API

  • Support for renaming tables and adding/modifying/deleting columns in place

SQL

  • Operations: CROSS JOIN, FULL OUTER JOIN, UNION DISTINCT, UPDATE, DELETE, CREATE TABLE, DROP TABLE
  • Functions: LIKE
  • Types: DECIMAL(P,S), TYPE_DATE, & TYPE_TIME



Version 5.2

Types

  • 32, 64, 128, & 256 byte character strings
  • Initial NULL support

Functions

  • Character-based: lower, upper, min, max, ltrim, rtrim & trim
  • Geographic: wkt_dist & wkt_is_within_dist
  • Null: is_null, nvl, nvl2 & nullif

Data Operations

  • Added support for inner, left & right join set operations
  • Added filter planner for displaying join path
  • Added set concatenation operation
  • Aggregation result sets can be made into views

Security

  • Added role-based security and supporting endpoints
  • Removed passphrase-based authorization checks; no longer needed in new security model

GAdmin

  • Added to HA & security infrastructures
  • Sortable data grid columns
  • Added to table view: shard & foreign keys, replicated & protected statuses, TTL

API

  • Collapsed /alter/table/* and /show/table/* endpoints

Connectors

  • Increased stability & performance across connectors

SQL

  • Operations: JOIN, LEFT/RIGHT JOIN, EXISTS, SELECT DISTINCT, TOP, UNION ALL & IN
  • Functions: CASE, CONVERT, IFNULL, NOT, TIMESTAMPADD, TRUNCATE