Skip to content

Querying with SQL

Data in Kinetica can be queried with SQL. One of the unique capabilities of Kinetica is the speed at which Kinetica can query and perform filtering & aggregation operations on the data.

In this tutorial, we are going to walk you through some examples of querying the taxi dataset using SQL.

Though this tutorial focuses on using the SQL tool in GAdmin, you can query using the KiSQL Command-line editor (KiSQL is available in /opt/gpudb/kitools) or you could connect your own favorite SQL editors, such as SQuirrelSQL, DBeaver or DbVisualizer, with the Kinetica database using Kinetica’s JDBC or ODBC drivers.

Query

Kinetica is a SQL-92-compliant database. The query below shows how Kinetica can maintain materialized views on streaming data, giving you the ability to compare the current status of your data, and the historical context, all in a single query.

To use the SQL tool in GAdmin:

  1. Navigate to GAdmin (http://<kinetica-host>:8080/)
  2. Click Query > SQL.

Comparing the current and historical taxi data, what are the shortest, average, and longest trip lengths for each and every vendor?

Using a materialized view, we calculate the statistics for the streaming taxi fares using an aggregate group by and a roll-up, then we join them the historical table using the same statistics to compare the results.

CREATE OR REPLACE MATERIALIZED VIEW M1
REFRESH EVERY 10 SECONDS AS
SELECT
    Today_Vendor_Name as Vendor_Name,
    Today_Shortest_Trip,
    Today_Average_Trip,
    Today_Longest_Trip,
    Today_Total_Trips,
    Hist_Shortest_Trip,
    Hist_Average_Trip,
    Hist_Longest_Trip,
    Hist_Total_Trips
FROM 
(
    SELECT
        NVL(vendor_id, '<ALL VENDORS>') AS Today_Vendor_Name,
        DECIMAL(MIN(trip_distance)) AS Today_Shortest_Trip,
        DECIMAL(AVG(trip_distance)) AS Today_Average_Trip,
        DECIMAL(MAX(trip_distance)) AS Today_Longest_Trip,
        COUNT(*) AS Today_Total_Trips
    FROM taxi_data_streaming
    GROUP BY vendor_id
) t1
INNER JOIN
(
    SELECT
        NVL(vendor_id, '<ALL VENDORS>') AS Hist_Vendor_Name,
        DECIMAL(MIN(trip_distance)) AS Hist_Shortest_Trip,
        DECIMAL(AVG(trip_distance)) AS Hist_Average_Trip,
        DECIMAL(MAX(trip_distance)) AS Hist_Longest_Trip,
        COUNT(*) AS Hist_Total_Trips
    FROM taxi_data_historical
    GROUP BY vendor_id
) t2
ON t2.Hist_Vendor_Name = t1.Today_Vendor_Name

Kinetica Trial Feedback