Accelerating Geospatial Queries for Supply Chain… from Days with Alteryx, to Seconds on Kinetica
Recently, a top-3 global retailer reduced run times for a strategic analytic application from days on Alteryx to just 22 seconds using Kinetica.
The retailer sought to optimize supply chain logistics and operations by identifying faster, more profitable delivery routes and optimal store locations. Kinetica dramatically reduced the time taken to calculate most probable paths (MPP) and point-to-point (P2P) drive times at scale.
This example is a good case study of where Kinetica excels in comparison with other analytics applications, such as Alteryx: workloads that require performance at scale with multi-step advanced analytics.
The Business Challenge
The real-estate division of this major US retailer used a geospatial platform to support that optimization work, covering both delivery routing and store location planning.
The problem they faced was how to calculate most probable path solves and drive times at scale, and how to iterate fast enough to give the executive team the insights needed for tactical and strategic planning.
The customer was using an 80+ node Alteryx cluster to solve their most probable path calculations, but they were facing problems because the calculations were slow, and they were limited in the scale at which they could run the calculations. Some runs might take days, but there was an increasing expectation from the business to be able to run the calculations multiple times per day.
At this point Kinetica was invited to demonstrate its capabilities, and the customer set us the challenge of solving the most probable path by drive time from one million possible start points to a specific endpoint within the Dallas-Fort Worth (DFW) metropolitan area.
The target time for this exercise was the 4-hour SLA that had previously been set for the Alteryx system. Based on their experience with Alteryx, the customer stated that completing the task in under 8 hours would be viewed as a good result.
Proof of Technology (PoT) Architecture
The infrastructure used for the PoT was a relatively modest single node in GCP, with:
- 120 GB memory
- 32-core single-socket CPU
- 4 NVIDIA T4 GPUs
The Kinetica installation process was very straightforward, requiring only the installation of the NVIDIA GPU drivers and the Kinetica software. No additional software or configuration was required.
The plan for the PoT involved loading the data for the DFW road network. This data was supplied by our technology partner Here.com, the same company that provides road and traffic data for companies such as TomTom.
We created a second dataset containing a million start points in the DFW area with a common endpoint.
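As a rough illustration of what such a start-point dataset might look like, the sketch below generates random origins inside a bounding box and pairs each one with a single shared destination, expressed as WKT points. It is not the code used in the PoT: the bounding box, the endpoint coordinates, and the `make_od_pairs` helper are all hypothetical, and uniform random sampling is simply an assumption made for the example.

```python
import random

# Hypothetical bounding box roughly covering the DFW metro area (assumed values).
DFW_BBOX = {"min_lon": -97.6, "max_lon": -96.4, "min_lat": 32.5, "max_lat": 33.2}

# Hypothetical common endpoint as (lon, lat); not the customer's actual destination.
ENDPOINT = (-96.8, 32.78)


def make_od_pairs(n, bbox=DFW_BBOX, endpoint=ENDPOINT, seed=42):
    """Return n (origin_wkt, destination_wkt) pairs that share one destination."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    dest_wkt = f"POINT({endpoint[0]} {endpoint[1]})"
    pairs = []
    for _ in range(n):
        lon = rng.uniform(bbox["min_lon"], bbox["max_lon"])
        lat = rng.uniform(bbox["min_lat"], bbox["max_lat"])
        pairs.append((f"POINT({lon:.6f} {lat:.6f})", dest_wkt))
    return pairs


# Small sample for inspection; a PoT-scale dataset would use n=1_000_000.
sample = make_od_pairs(5)
for origin, destination in sample:
    print(origin, "->", destination)
```

In practice the start points would more likely come from real locations, such as customer addresses or candidate store sites, than from a random generator.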
Kinetica Network Graph & Graph Solvers
Kinetica provides a generic and extensible design of networks that can be tailored or used for various real-life applications, such as transportation, utility, social, and geospatial.
Key features of Kinetica graph capabilities include:
- Zero config — all the integration is already built into the platform
- Distributed processing for more scalability and resilience
- Highly performant graph technology with a comprehensive and growing list of graph solvers that you can use out-of-the-box with no need for a host of data scientists to make it work.
Kinetica currently provides the following list of solvers, and this list continues to grow as customers find more and more geospatial and graph applications for Kinetica:
| SHORTEST_PATH | PAGE_RANK | PROBABILITY_RANK | CENTRALITY | CLOSENESS | MULTIPLE_ROUTING | INVERSE_SHORTEST_PATH | BACKHAUL_ROUTING | ALLPATHS | STATS_ALL | MARKOV_CHAIN | MATCH_SUPPLY_DEMAND | MATCH_LOOPS | MATCH_OD_PAIRS | MATCH_BATCH_SOLVES | ADJACENCY_SOLVER |
Proof of Technology Approach
The diagram below illustrates how we create and solve a geospatial graph in Kinetica.
- In Step 1 we load the GeoJSON data provided by Here.com using Kinetica’s GeoJSON loader.
- Once that data is in a Kinetica table we can enrich and augment the dataset, for example by setting the traffic speeds of each road segment in Step 2. In a production scenario, traffic volumes and changes to the network itself would usually be updated from a real-time feed.
- In Step 3 we build a native graph in Kinetica based on the underlying database table that we created in Step 2.
- In Step 4 we apply Kinetica’s powerful graph solving capabilities to identify the most probable path and P2P drive times at scale, optionally directing the output to a database table which can be used for further analysis and visualisation.
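To make the four steps more concrete, here is a minimal pure-Python sketch of the same idea on a toy network. It does not use Kinetica's API and says nothing about how Kinetica's solvers are implemented. The segment data, node names, and function names are hypothetical. Each segment's speed is turned into a drive-time weight (the Step 2 analogue), a graph is built from the segments (Step 3), and drive times from every node to one shared endpoint are solved (Step 4). The solve uses a single Dijkstra run from the endpoint over reversed edges, one common technique when many origins share a destination.

```python
import heapq
from collections import defaultdict

# Hypothetical directed road segments: (from_node, to_node, length_km, speed_kmh).
SEGMENTS = [
    ("A", "B", 10.0, 60.0),
    ("B", "D", 10.0, 60.0),
    ("A", "C", 5.0, 60.0),
    ("C", "D", 5.0, 30.0),
    ("B", "C", 2.0, 40.0),
]


def build_reverse_graph(segments):
    """Weight each segment by drive time in minutes and index edges by their end node."""
    reverse = defaultdict(list)
    for src, dst, length_km, speed_kmh in segments:
        minutes = length_km * 60.0 / speed_kmh
        reverse[dst].append((src, minutes))
    return reverse


def drive_times_to(endpoint, reverse):
    """Dijkstra from the endpoint over reversed edges.

    Returns the minimum drive time (minutes) from every node that can reach
    the endpoint, so one run covers all origins at once.
    """
    best = {endpoint: 0.0}
    heap = [(0.0, endpoint)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue  # stale queue entry; a shorter time was already found
        for prev, minutes in reverse[node]:
            candidate = t + minutes
            if candidate < best.get(prev, float("inf")):
                best[prev] = candidate
                heapq.heappush(heap, (candidate, prev))
    return best


times = drive_times_to("D", build_reverse_graph(SEGMENTS))
for node in sorted(times):
    print(f"{node} -> D: {times[node]:.1f} min")
```

Changing a segment's speed and re-running the solve mirrors the enrichment in Step 2: the graph topology stays the same, and only the drive-time weights change.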
A key differentiator for Kinetica that separates us from traditional analytics is that Kinetica is vectorised from the ground up, and built to leverage both GPUs and AVX CPUs for extreme high-performance distributed graph, SQL and analytic operations. Kinetica uses GPUs to visualize data at scale with GPU-accelerated rendering of maps and accompanying dashboards at interactive speed inside Reveal, Kinetica’s powerful but easy-to-use geospatial and data exploration and insight discovery tool.
Proof of Technology Results
Kinetica created the graph of the Dallas-Fort Worth road network from the underlying database table in approximately 9 seconds.
- Creation of the Kinetica native graph using the Here.com table data
| Create network graph of the DFW metro area (~420K unique road segments) | < 10 secs |
- Solve times for the 1 million point-to-point drive-time exercise
| Activity | Solve | 1MM WKT Updates |
| Solve 1MM MPP P2P | < 30 secs | < 1,100 secs |
Kinetica solved the 1 million P2P drive times in approximately 22 seconds, beating the 4-hour SLA target by a significant margin. Generating the complex geospatial routing records for all 1 million origin/destination pairs took a little longer, at approximately 18 minutes. This test was also conducted on a single-node cloud instance; Kinetica’s distributed, linearly scalable architecture means that you can increase performance and throughput predictably by scaling horizontally.
This is a typical demonstration of Kinetica’s ability to run location analytics on large datasets and blend the results with other data at speed and at scale. It also shows how Kinetica simplifies the effort needed to combine these analytic processes.
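For a sense of scale, the figures quoted above can be turned into simple ratios. The short calculation below uses only the numbers reported in this post (22 seconds, the 4-hour SLA, the 8-hour "good result" threshold, 18 minutes, and 1 million pairs). The variable names are ours, and the ratios are back-of-the-envelope arithmetic, not additional benchmark results.

```python
SOLVE_SECS = 22                 # reported P2P solve time
SLA_SECS = 4 * 3600             # 4-hour SLA target
GOOD_RESULT_SECS = 8 * 3600     # 8-hour "good result" threshold
ROUTING_SECS = 18 * 60          # reported time to generate routing records
N_PAIRS = 1_000_000             # origin/destination pairs solved

speedup_vs_sla = SLA_SECS / SOLVE_SECS
speedup_vs_good = GOOD_RESULT_SECS / SOLVE_SECS
solves_per_sec = N_PAIRS / SOLVE_SECS

print(f"Solve time vs 4-hour SLA:   ~{speedup_vs_sla:.0f}x faster")
print(f"Solve time vs 8-hour mark:  ~{speedup_vs_good:.0f}x faster")
print(f"Solve throughput:           ~{solves_per_sec:,.0f} pairs/sec")
print(f"Routing record generation:  {ROUTING_SECS} secs")
```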
We’ve seen that organizations which have invested in their supply network management systems have benefitted from being able to react and re-plan in real-time, dynamically optimizing delivery routes to maintain their operational efficiency and customer satisfaction.
If you’d like to know more about our best-in-class geospatial and graph analytics database capabilities, please try our developer edition for free and contact me directly at email@example.com.
Simon Ambridge is Sr. Solution Engineer at Kinetica.