Querying with SQL
Data in Kinetica can be queried with SQL. One of the unique capabilities of Kinetica is the speed at which Kinetica can query and perform filtering & aggregation operations on the data.
In this tutorial, we are going to walk you through some examples of querying the taxi dataset using SQL.
Though this tutorial focuses on using the SQL tool in GAdmin, you can query using the KiSQL Command-line editor (KiSQL is available in /opt/gpudb/kitools
) or you could connect your own favorite SQL editors, such as SQuirrelSQL, DBeaver or DbVisualizer, with the Kinetica database using Kinetica’s JDBC or ODBC drivers.
Query
Kinetica is a SQL-92-compliant database. The query below shows how Kinetica can maintain materialized views on streaming data, giving you the ability to compare the current status of your data, and the historical context, all in a single query.
To use the SQL tool in GAdmin:
- Navigate to GAdmin (
http://<kinetica-host>:8080/
) - Click Query > SQL.
Comparing the current and historical taxi data, what are the shortest, average, and longest trip lengths for each and every vendor?
Using a materialized view, we calculate the statistics for the streaming taxi fares using an aggregate group by and a roll-up, then we join them the historical table using the same statistics to compare the results.
CREATE OR REPLACE MATERIALIZED VIEW M1 REFRESH EVERY 10 SECONDS AS SELECT Today_Vendor_Name as Vendor_Name, Today_Shortest_Trip, Today_Average_Trip, Today_Longest_Trip, Today_Total_Trips, Hist_Shortest_Trip, Hist_Average_Trip, Hist_Longest_Trip, Hist_Total_Trips FROM ( SELECT NVL(vendor_id, '<ALL VENDORS>') AS Today_Vendor_Name, DECIMAL(MIN(trip_distance)) AS Today_Shortest_Trip, DECIMAL(AVG(trip_distance)) AS Today_Average_Trip, DECIMAL(MAX(trip_distance)) AS Today_Longest_Trip, COUNT(*) AS Today_Total_Trips FROM taxi_data_streaming GROUP BY vendor_id ) t1 INNER JOIN ( SELECT NVL(vendor_id, '<ALL VENDORS>') AS Hist_Vendor_Name, DECIMAL(MIN(trip_distance)) AS Hist_Shortest_Trip, DECIMAL(AVG(trip_distance)) AS Hist_Average_Trip, DECIMAL(MAX(trip_distance)) AS Hist_Longest_Trip, COUNT(*) AS Hist_Total_Trips FROM taxi_data_historical GROUP BY vendor_id ) t2 ON t2.Hist_Vendor_Name = t1.Today_Vendor_Name