Data in Kinetica can be queried with SQL. One of the unique capabilities of Kinetica is the speed at which Kinetica can query and perform filtering & aggregation operations on the data.
In this tutorial, we are going to walk you through some examples of querying the taxi dataset using SQL.
Though this tutorial focuses on using the SQL tool in GAdmin, you can query using the KiSQL Command-line editor (KiSQL is available in
/opt/gpudb/kitools) or you could connect your own favorite SQL editors, such as SQuirrelSQL, DBeaver or DbVisualizer, with the Kinetica database using Kinetica’s JDBC or ODBC drivers.
Kinetica is a SQL-92-compliant database. The query below shows how Kinetica can maintain materialized views on streaming data, giving you the ability to compare the current status of your data, and the historical context, all in a single query.
To use the SQL tool in GAdmin:
- Navigate to GAdmin (
- Click Query > SQL.
Comparing the current and historical taxi data, what are the shortest, average, and longest trip lengths for each and every vendor?
Using a materialized view, we calculate the statistics for the streaming taxi fares using an aggregate group by and a roll-up, then we join them the historical table using the same statistics to compare the results.
CREATE OR REPLACE MATERIALIZED VIEW M1 REFRESH 10 SECONDS AS SELECT NVL(vendor_id, '<ALL VENDORS>') AS Today_Vendor_Name, DECIMAL(MIN(trip_distance)) AS Today_Shortest_Trip, DECIMAL(AVG(trip_distance)) AS Today_Average_Trip, DECIMAL(MAX(trip_distance)) AS Today_Longest_Trip, COUNT(*) AS Today_Total_Trips FROM taxi_data_streaming Inner join ( select NVL(vendor_id, '<ALL VENDORS>') AS Hist_Vendor_Name, DECIMAL(MIN(trip_distance)) AS Hist_Shortest_Trip, DECIMAL(AVG(trip_distance)) AS Hist_Average_Trip, DECIMAL(MAX(trip_distance)) AS Hist_Longest_Trip, COUNT(*) AS Hist_Total_Trips FROM taxi_data_historical ) t2 On t2.Historical_Vendor_Name = t1.Today_Vendor_Name GROUP BY ROLLUP(vendor_id) ORDER BY 1;