Version:

Network Graphs & Solvers Concepts

Kinetica provides a generic and extensible design of networks with the aim of being tailored or used for various real-life applications, such as transportation, utility, social, and geospatial. Networks comprise a graph and a solver. A graph represents topological relationships via nodes that are connected by edges; a solver represents the type of solution appropriate for the issue the graph was made to illustrate. Kinetica includes a graph server in default setups. Using several API endpoints, a user can create a graph from several components, solve the graph using one of several solver types, and query the solved graph. The graph server is enabled by default; it can be managed via standard Kinetica Service Management. Graph server logs are available in /opt/gpudb/graph/logs. See the Configuration Reference for more information on configuring the graph server.

Tip

Graphs can be created, queried, solved, and matched via the GAdmin graph interface. See Graphs for details.

../_images/graph_solver_definitions.png

Solvers

Kinetica has several different solvers available for presenting solutions to various types of network graph problems. Note that some solvers are only available to /solve/graph and likewise for /match/graph. Consult Solving a Graph or Matching a Graph for more information on the desired operation.

Solve Graph Solvers

Solver Description CPU Parallel GPU Parallel
SHORTEST_PATH Determines the shortest path upstream using single-source routing. X X
PAGE_RANK Calculates how connected the nodes are and determines which nodes are the most important. Weights are not required. X  
PROBABILITY_RANK Calculates the probability of a node being connected to another node using hidden Markov chains. X  
CENTRALITY Calculates the betweenness of the nodes   X
MULTIPLE_ROUTING Calculates the shortest possible route between the nodes and returns to the origin node -- also known as the travelling salesman X X
INVERSE_SHORTEST_PATH Determines the shortest path downstream using multiple technician routing X X
BACKHAUL_ROUTING Determines the optimal routes between remote asset nodes and fixed asset nodes X X
ALLPATHS Determines all reasonable paths between a source and destination pair. X X

Match Graph Solvers

Solver Description
INCREMENTAL_WEIGHTED Matches sample points to the graph using time and/or distance between points to influence one or more shortest paths across the sample points.
MARKOV_CHAIN Matches sample points to the graph using the Hidden Markov Model (HMM)-based method, which conducts a range-tree closest-edge search to find the best combinations of possible road segments for each sample point to create the best route. The route is secured one point at a time, so the prediction is corrected after each point. This solution type is the most accurate but also the most computationally intensive.
MATCH_OD_PAIRS Matches sample points to find the most probable path between origin and destination (OD) pairs with given cost constraints.
MATCH_SUPPLY_DEMAND Matches sample generic supply depots to generic demand points using abstract transportation (referred to as trucks). Each route is determined by a truck's ability (size) to service demand at each demand point.

Components and Identifiers

Graph components (e.g., nodes, edges, weights, and/or restrictions) must be defined with varying combinations of the identifiers listed in the tables below.

Components for create, solve, query, and match operations can be identified in three ways:

  • Aliasing existing table columns, e.g., table_name.column_name AS NODE_WKTPOINT
  • Using expressions, e.g., ST_LENGTH(wkt) AS WEIGHTS_VALUESPECIFIED
  • Using raw values, e.g., {0} AS RESTRICTIONS_ONOFFCOMPARED

Note

There are separate identifiers and combinations for /query/graph and /match/graph. See the associated Querying a Graph and Matching a Graph sections for more information.

Identifiers are flagged aliases that the database knows to look for; existing source columns can be used as identifiers. Note that it will often be the case that source columns are reused for different identifiers because the components must naturally be linked together to create a network graph. For example, source table seattle_road_network has columns WKTLINE (a wkt column), TwoWay (an integer column), and time (also an integer column); these columns could be identified via /create/graph like so:

create_s_graph_response = kinetica.create_graph(
    graph_name=GRAPH_S,
    directed_graph=True,
    nodes=[],
    edges=[
        TABLE_SRN + ".WKTLINE AS EDGE_WKTLINE",
        TABLE_SRN + ".TwoWay AS EDGE_DIRECTION"
    ],
    weights=[
        TABLE_SRN + ".WKTLINE AS WEIGHTS_EDGE_WKTLINE",
        TABLE_SRN + ".TwoWay AS WEIGHTS_EDGE_DIRECTION",
        TABLE_SRN + ".time AS WEIGHTS_VALUESPECIFIED"
    ],
    restrictions=[],
    options={
        "recreate": "true"
    }
)

Important

Identifiers with string as a supported type can be any string column type, e.g., char1 - char256 or unrestricted string.

Nodes

Nodes represent fundamental topological units of a graph. Nodes can be defined with an integer ID, a string name, or geospatial information, e.g., WKT point ("POINT(X Y)") or XY pair. Nodes are optional, as the start and end points of an edge can implicitly be used as nodes.

Identifier Supported Types Description
NODE_ID int, long A number representing a node identifier.
NODE_X float, double, int, long A number representing a node's X or longitude value.
NODE_Y float, double, int, long A number representing a node's Y or latitude value.
NODE_NAME string A string value representing a node's name.
NODE_WKTPOINT wkt A WKT string representing a node's geospatial point, e.g., "POINT(X Y)". See Geospatial Objects for more information on WKTs.
NODE_LABEL string

A string value representing a node's label. Exclusively used with /query/graph restrictions.

Important

Even though this identifier can only be used when querying a graph, this identifier must be specified during the /create/graph stage.

Edges

Edges represent the required fundamental topological units of a graph that typically connect nodes. Edges can be defined with an integer ID, string name, or geospatial information, e.g., WKT point ("POINT(X Y)"), line ("LINESTRING(X1 Y1, X2 Y2)"), or XY pairs. An edge can be implicitly drawn between two nodes. If an edge is defined using WKT LINESTRINGs, the graph server is capable of splitting many LINESTRING segments into multiple, separate LINESTRINGs, thus creating one edge per LINESTRING segment.

Identifier Supported Types Description
EDGE_ID int, long A number representing an edge identifier.
EDGE_NODE1_ID int, long A number representing the first of the edge's nodes.
EDGE_NODE2_ID int, long A number representing the second of the edge's nodes.
EDGE_WKTLINE wkt A WKT string representing an edge's geospatial line, e.g., "LINESTRING(X1 Y1, X2 Y2)". See Geospatial Objects for more information on WKTs.
EDGE_NODE1_X float, double, int, long A number representing the first of the edge's nodes' X or longitude value.
EDGE_NODE1_Y float, double, int, long A number representing the first of the edge's nodes' Y or latitude value.
EDGE_NODE2_X float, double, int, long A number representing the second of the edge's nodes' X or longitude value.
EDGE_NODE2_Y float, double, int, long A number representing the second of the edge's nodes' Y or latitude value.
EDGE_NODE1_WKTPOINT wkt A WKT string representing the first of the edge's nodes' geospatial point, e.g., "POINT(X1, Y1)". See Geospatial Objects for more information on WKTs.
EDGE_NODE2_WKTPOINT wkt A WKT string representing the second of the edge's nodes' geospatial point, e.g., "POINT(X2, Y2)". See Geospatial Objects for more information on WKTs.
EDGE_NODE1_NAME string A string value representing the first of the edge's nodes' name.
EDGE_NODE2_NAME string A string value representing the second of the edge's nodes' name.
EDGE_DIRECTION int A number representing what direction the edge can be traveled; 0 being a forward one-way edge (node 1 to node 2), 1 being a two-way edge (node 1 to node 2 or node 2 to node 1), and 2 being a backward one-way edge (node 2 to node 1).
EDGE_LABEL string

A string value representing an edge's label. Exclusively used with /query/graph restrictions.

Important

Even though this identifier can only be used when querying a graph, this identifier must be specified during the /create/graph stage.

Weights

Weights represent a method of informing the graph solver of the cost of including a given edge in a solution. Weights can be defined using an integer ID, string node names, or spatial information ("LINESTRING(X Y, X Y)") and a static cost value or a cost multiplier. Each edge is associated with one weight, but there can be many edges between two nodes in a graph with directionality (EDGE_DIRECTION), allowing for many different weights along the same edge, which can have useful applications in real-world examples, e.g., different lanes between two junctions may have different speeds of travel.

Note

If EDGE_DIRECTION is specified for an edge in a directed graph, the weight will be the same going in each direction.

Identifier Supported Types Description
WEIGHTS_EDGE_ID int, long A number representing a weight's associated edge identifier.
WEIGHTS_EDGE_DIRECTION int A number representing what direction the weight's associated edge can be traveled; 0 being a forward one-way edge (node 1 to node 2), 1 being a two-way edge (node 1 to node 2 or node 2 to node 1), and 2 being a backward one-way edge (node 2 to node 1).
WEIGHTS_EDGE_WKTLINE wkt A WKT string representing a weight's associated edge geospatial line, e.g., "LINESTRING(X1 Y1, X2 Y2)". See Geospatial Objects for more information on WKTs.
WEIGHTS_EDGE_NODE1_ID int, long A number representing the first of a weight's associated edge's nodes.
WEIGHTS_EDGE_NODE2_ID int, long A number representing the second of a weight's associated edge's nodes.
WEIGHTS_EDGE_NODE1_NAME string A string value representing the first of a weight's associated edge's nodes' name.
WEIGHTS_EDGE_NODE2_NAME string A string value representing the second of a weight's associated edge's nodes' name.
WEIGHTS_VALUESPECIFIED float, double, int, long A number representing the weight's value.
WEIGHTS_FACTORSPECIFIED float, double, int, long A number representing how much incoming cost values will be multiplied.

Important

Currently, WEIGHTS_FACTORSPECIFIED will only affect the cost if the edge has a WEIGHTS_VALUESPECIFIED already established. This means that WEIGHTS_FACTORSPECIFIED should only be used in /solve/graph or in conjunction with a WEIGHTS_VALUESPECIFIED during /create/graph

Restrictions

Restrictions represent a method of informing the graph solver which edges and/or nodes should be ignored for the solution. Restrictions can be defined using an integer ID and a value or as a switch (on or off).

Identifier Supported Types Description
RESTRICTIONS_NODE_ID int, long A number representing the restriction's associated node identifier.
RESTRICTIONS_EDGE_ID int, long A number representing the restriction's associated edge identifier.
RESTRICTIONS_NODE_WKTPOINT wkt A WKT string representing the restriction's associated node's geospatial point, e.g., "POINT(X, Y)". See Geospatial Objects for more information on WKTs.
RESTRICTIONS_NODE_NAME string A string value representing the restriction's associated node.
RESTRICTIONS_EDGE_WKTLINE wkt A WKT string representing a weight's associated edge geospatial line, e.g., "LINESTRING(X1 Y1, X2 Y2)". See Geospatial Objects for more information on WKTs.
RESTRICTIONS_VALUECOMPARED float, double, int, long A number representing the value incoming costs will be compared against.
RESTRICTIONS_ONOFFCOMPARED int A number representing if the associated node or edge cannot be traversed, with 1 meaning it can be traversed and 0 meaning it cannot.
RESTRICTIONS_NODE_LABEL string

A string value referring to a node label for restrictive purposes. Exclusively used with /query/graph restrictions.

Important

Even though this identifier can only be used when querying a graph, this identifier must be specified during the /create/graph stage.

RESTRICTIONS_EDGE_LABEL string

A string value referring an edge label for restrictive purposes. Exclusively used with /query/graph restrictions.

Important

Even though this identifier can only be used when querying a graph, this identifier must be specified during the /create/graph stage.

Note

When using RESTRICTIONS_VALUECOMPARED, solvers will not use the given node or edge if the current cost is less than the restriction value. When using RESTRICTIONS_ONOFFCOMPARED, solvers will not use the given node or edge if the RESTRICTIONS_ONOFFCOMPARED value is set to 0 (off).

Identifier Combinations

For each component, there's a minimum set of identifiers that must be used to properly create a graph. Each component's identifier combinations must reference columns from the same table, e.g., the combination NODE_ID and NODE_NAME must both use the same table. The columns must also not be nullable. Identifier types across components should match where possible.

Important

WKT identifiers can be matched to X/Y identifiers (and vice versa) within a user-specified tolerance (merge_tolerance under the /create/graph endpoint's options map). Using ID or NAME identifiers relies on exact matching. The WKTLINE identifiers will use the line's start and end points to map to an XY pair or WKTPOINT.

For example, if the identifier combination used for nodes is:

nodes=[
    TABLE_TAXI_N + ".id AS NODE_ID"
],

The edges identifier combination should include a match for the NODE_ID. The following edge combination would match correctly with the node combination; note that matching node point(s) to edge endpoint(s) requires two edge endpoints to make an implicit edge between the points:

edges=[
    TABLE_TAXI_EW + ".pickup_id AS EDGE_NODE1_ID",
    TABLE_TAXI_EW + ".dropoff_id AS EDGE_NODE2_ID"
],

Note

The above example is not the only edge combinations available for the NODE_ID identifier combination. See the Edges section below for other combinations.

If using multiple groups of combinations while creating, solving, querying, or matching a graph, the combinations must be separated by empty quotes ("") in the respective array, e.g.:

  "{'Jane'} AS QUERY_NODE_NAME",
  "",
  "{'chess'} AS QUERY_TARGET_NODE_LABEL"

If specifying identifier combinations as raw values, the number of values within each identifier must match across the combination group, e.g.:

"{'Bill', 'Alex'} AS RESTRICTIONS_NODE_NAME",
"{0, 0} AS RESTRICTIONS_ONOFFCOMPARED"

Nodes

  • NODE_ID
  • NODE_ID, NODE_NAME
  • NODE_ID, NODE_WKTPOINT
  • NODE_ID, NODE_X, NODE_Y
  • NODE_NAME
  • NODE_NAME, NODE_LABEL
  • NODE_NAME, NODE_WKTPOINT
  • NODE_NAME, NODE_X, NODE_Y
  • NODE_WKTPOINT
  • NODE_X, NODE_Y

Edges

  • EDGE_ID, EDGE_NODE1_ID, EDGE_NODE2_ID
  • EDGE_ID, EDGE_NODE1_ID, EDGE_NODE2_ID, EDGE_DIRECTION
  • EDGE_ID, EDGE_NODE1_NAME, EDGE_NODE2_NAME
  • EDGE_ID, EDGE_NODE1_WKTPOINT, EDGE_NODE2_WKTPOINT
  • EDGE_ID, EDGE_NODE1_X, EDGE_NODE1_Y, EDGE_NODE2_X, EDGE_NODE2_Y
  • EDGE_ID, EDGE_WKTLINE
  • EDGE_ID, EDGE_WKTLINE, EDGE_DIRECTION
  • EDGE_NODE1_ID, EDGE_NODE2_ID
  • EDGE_NODE1_NAME, EDGE_NODE2_NAME
  • EDGE_NODE1_NAME, EDGE_NODE2_NAME, EDGE_LABEL
  • EDGE_NODE1_WKTPOINT, EDGE_NODE2_WKTPOINT
  • EDGE_NODE1_X, EDGE_NODE1_Y, EDGE_NODE2_X, EDGE_NODE2_Y
  • EDGE_WKTLINE
  • EDGE_WKTLINE, EDGE_DIRECTION

Weights

  • WEIGHTS_EDGE_ID, WEIGHTS_VALUESPECIFIED
  • WEIGHTS_EDGE_ID, WEIGHTS_FACTORSPECIFIED
  • WEIGHTS_EDGE_WKTLINE, WEIGHTS_EDGE_DIRECTION, WEIGHTS_VALUESPECIFIED
  • WEIGHTS_EDGE_WKTLINE, WEIGHTS_VALUESPECIFIED
  • WEIGHTS_EDGE_WKTLINE, WEIGHTS_FACTORSPECIFIED
  • WEIGHTS_EDGE_NODE1_NAME, WEIGHTS_EDGE_NODE2_NAME, WEIGHTS_VALUESPECIFIED
  • WEIGHTS_EDGE_NODE1_NAME, WEIGHTS_EDGE_NODE2_NAME, WEIGHTS_FACTORSPECIFIED
  • WEIGHTS_EDGE_NODE1_ID, WEIGHTS_EDGE_NODE2_ID, WEIGHTS_VALUESPECIFIED
  • WEIGHTS_EDGE_NODE1_ID, WEIGHTS_EDGE_NODE2_ID, WEIGHTS_FACTORSPECIFIED

Restrictions

  • RESTRICTIONS_EDGE_ID, RESTRICTIONS_ONOFFCOMPARED
  • RESTRICTIONS_EDGE_ID, RESTRICTIONS_VALUECOMPARED
  • RESTRICTIONS_EDGE_LABEL, RESTRICTIONS_ONOFFCOMPARED
  • RESTRICTIONS_EDGE_WKTLINE, RESTRICTIONS_VALUECOMPARED
  • RESTRICTIONS_EDGE_WKTLINE, RESTRICTIONS_ONOFFCOMPARED
  • RESTRICTIONS_NODE_ID, RESTRICTIONS_ONOFFCOMPARED
  • RESTRICTIONS_NODE_ID, RESTRICTIONS_VALUECOMPARED
  • RESTRICTIONS_NODE_LABEL, RESTRICTIONS_ONOFFCOMPARED
  • RESTRICTIONS_NODE_NAME, RESTRICTIONS_VALUECOMPARED
  • RESTRICTIONS_NODE_NAME, RESTRICTIONS_ONOFFCOMPARED
  • RESTRICTIONS_NODE_WKTPOINT, RESTRICTIONS_VALUECOMPARED
  • RESTRICTIONS_NODE_WKTPOINT, RESTRICTIONS_ONOFFCOMPARED

Creating a Graph

Creating a graph is serviced by the /create/graph endpoint; this involves reading from tables with annotated component identifiers and drawing relationships between given nodes and/or edges on a graph, taking into account nodes or edges between nodes that should be favored or ignored.

Note

Though it's recommended edges and weights are kept in the same table, it's not required.

Once the components are setup, the graph can be created. Requirements for creating a graph include:

  • name for the graph
  • if the graph is directed or not
  • edges
  • weights (for most solver types)

Important

Nodes and restrictions are not required to create a graph. If nodes are included, however, they should be kept in a separate table from edges and weights. If restrictions are included, they can exist in either the nodes table and/or edges/weights table(s) or in an entirely separate table.

Directed Graphs

Whether a graph is directed or not can determine how a graph is solved or queried. Using Components and Identifiers and Identifier Combinations for context, edges connect two nodes together ("node 1" and "node 2"). When a graph is directed, these nodes that comprose a given edge have an implicit direction: "node 1" to "node 2". Regardless if these nodes have a spatial context, Kinetica will treat "node 2" as if it must be travelled to directly from "node 1" (whatever their underlying values may be: Jim and George , POINT(0 0) and POINT(9 10), 32 and 45, etc.).

For example, given the below non-directed graph:

../_images/non_directed_example.png

Attempting to solve for the shortest path from node F to node A would result in a SOLVERS_EDGE_PATH of F, E, D, C, B, A. Querying for other nodes attached to node E would result in two adjacencies: F and D.

On the other hand, given the below directed graph:

../_images/directed_example.png

Attempting to solve for the shortest path from node F to node A would be unsuccessful because there are only edges going toward F (and none leading away from F and subsequently toward A). Querying for other nodes attached to node E would result in a single adjacency: F.

Solving a Graph

Solving a graph is serviced by the /solve/graph endpoint; this involves using given source node(s), destination node(s), and any weights or restrictions from an existing graph to calculate a given solution type. Off-graph spatial locations are accepted in all solvers, with the results being corrected to snap locations. The calculated solution is then placed in a table in Kinetica; note that many concurrent solves over the same graph are possible. The source node determines from which node the graph solution routing is started; the destination node(s) determines at which node the graph solution will complete routes. Source/destination node(s) can be an ID, name, or WKT point.

Requirements for solving a graph include:

  • name of the graph to solve
  • solution type

Important

Additional weights and restrictions beyond those defined in the graph creation stage can also be provided. Any provided weights will be added (in the case of WEIGHTS_VALUESPECIFIED) to or multiplied with (in the case of WEIGHTS_FACTORSPECIFIED) the existing weight(s). Consult Components and Identifiers for formatting and specifications.

There are several solution types to choose from:

Solver Description CPU Parallel GPU Parallel
SHORTEST_PATH Determines the shortest path upstream using single-source routing. X X
PAGE_RANK Calculates how connected the nodes are and determines which nodes are the most important. Weights are not required. X  
PROBABILITY_RANK Calculates the probability of a node being connected to another node using hidden Markov chains. X  
CENTRALITY Calculates the betweenness of the nodes   X
MULTIPLE_ROUTING Calculates the shortest possible route between the nodes and returns to the origin node -- also known as the travelling salesman X X
INVERSE_SHORTEST_PATH Determines the shortest path downstream using multiple technician routing X X
BACKHAUL_ROUTING Determines the optimal routes between remote asset nodes and fixed asset nodes X X
ALLPATHS Determines all reasonable paths between a source and destination pair. X X

Querying a Graph

Querying a graph is serviced by the /query/graph endpoint; this involves querying a graph for adjacent nodes (if provided edges) or adjacent edges (if provided nodes) using integer IDs, names, or WKT information. Additional adjacent rings around the specified nodes can also be queried. Results can be exported to a table in Kinetica.

Requirements for querying a graph include:

  • name of the graph to query
  • a list of edge or node IDs, names, or WKTs to query

Important

Additional restrictions beyond those defined in the graph creation stage can also be provided. Query restrictions can utilize any of the restriction identifiers, including the unique LABEL identifiers.

Query Identifiers

Nodes or edges to be queried can be identified using any of the query-specific identifiers below.

Important

Consult Components and Identifiers and Identifier Combinations for general information on identifiers and combinations. Note that the same limitations that apply to /create/graph and /solve/graph identifiers also apply to /query/graph identifiers

Identifier Supported Types Description
QUERY_NODE_ID int, long Defines the NODE_ID value that will be matched against in the source graph to determine the source for the query.
QUERY_NODE_LABEL string Defines the NODE_LABEL value that will be matched against in the source graph to determine the source(s) for the query. See Using Labels for more information.
QUERY_NODE_WKTPOINT wkt Defines the NODE_WKTPOINT value that will be matched against in the source graph to determine the source for the query.
QUERY_NODE_NAME string Defines the NODE_NAME value that will be matched against in the the source graph to determine the source for the query.
QUERY_EDGE_ID int, long Defines the EDGE_ID value that will be matched against in the source graph to determine the source for the query.
QUERY_EDGE_LABEL string Defines the EDGE_LABEL value that will be matched against in the source graph to determine the source(s) for the query. See Using Labels for more information.
QUERY_EDGE_WKTLINE wkt Defines the EDGE_WKTLINE value that will be matched against in the source graph to determine the source for the query.
QUERY_NODE1_ID int, long Defines the NODE_ID value that will be matched against in the source graph (in conjunction with QUERY_NODE2_ID) to determine the source for the query.
QUERY_NODE2_ID int, long Defines the NODE_ID value that will be matched against in the source graph (in conjunction with QUERY_NODE1_ID) to determine the source for the query.
QUERY_NODE1_WKTPOINT wkt Defines the NODE_WKTPOINT value that will be matched against in the source graph (in conjunction with QUERY_NODE2_WKTPOINT) to determine the source for the query.
QUERY_NODE2_WKTPOINT wkt Defines the NODE_WKTPOINT value that will be matched against in the source graph (in conjunction with QUERY_NODE1_WKTPOINT) to determine the source for the query.
QUERY_NODE1_NAME string Defines the NODE_NAME value that will be matched against in the source graph (in conjunction with QUERY_NODE2_NAME) to determine the source for the query.
QUERY_NODE2_NAME string Defines the NODE_NAME value that will be matched against in the source graph (in conjunction with QUERY_NODE1_NAME) to determine the source for the query.
QUERY_TARGET_NODE_LABEL string Defines the NODE_LABEL value that will be matched against in the source graph to determine the destination for the query as well as the path taken through the graph to arrive at the destination.

Using Query Identifiers

Pair one value with an appropriate identifier to query for features connected to the given value within the provided number of rings:

queries=[
  "{'male'} AS QUERY_NODE_LABEL"
],

Pair multiple values with a single appropriate identifier to query for features connected to each value independently (effectively an OR) within the provided number of rings:

queries=[
  "{'female', 'chess'} AS QUERY_NODE_LABEL",
],

Important

A rings value of 0 will return features that match the provided query exactly; in this case, the query would return nodes that have a label of female or chess

Providing any combination with the QUERY_TARGET_NODE_LABEL combination (separated by an empty string) will effectively produce a source-destination query where the first combination is the source and the QUERY_TARGET_NODE_LABEL is the destination (assuming the destination is within the provided number of rings):

queries = [
  "{'Jane'} AS QUERY_NODE_NAME",
  "",
  "{'chess'} AS QUERY_TARGET_NODE_LABEL"
],

Query Identifier Combinations

To properly query a graph using identifiers, there's a minimum set of identifiers that must be used. Each identifier combination must reference columns from the same table, e.g., the combination QUERY_NODE1_ID and QUERY_NODE2_ID must both use columns from the same table. The columns must also not be nullable.

Nodes

  • QUERY_NODE_ID
  • QUERY_NODE_LABEL
  • QUERY_NODE_WKTPOINT
  • QUERY_NODE_NAME
  • QUERY_TARGET_NODE_LABEL

Edges

  • QUERY_EDGE_ID
  • QUERY_EDGE_LABEL
  • QUERY_EDGE_WKTLINE
  • QUERY_NODE1_ID, QUERY_NODE2_ID
  • QUERY_NODE1_WKTPOINT, QUERY_NODE2_WKTPOINT
  • QUERY_NODE1_NAME, QUERY_NODE2_NAME

Using Labels

Labels are a type of identifier that provide additional context to a node or edge and can act as a target for a query. Labels are paired with another identifier in several valid combinations listed above, but each unique label must be part of its own combination; i.e., there cannot be two labels in the same identifier combination. For example, this is a valid multi-label configuration:

nodes = [
  TABLE_P + ".name AS NODE_NAME",
  TABLE_P + ".interest AS NODE_LABEL",
  "",
  TABLE_P + ".name AS NODE_NAME",
  TABLE_P + ".gender AS NODE_LABEL"
],

But this example is invalid:

nodes = [
  TABLE_P + ".name AS NODE_NAME",
  TABLE_P + ".interest AS NODE_LABEL",
  TABLE_P + ".gender AS NODE_LABEL",
],

Labels can only be referenced in the context of queries using the /query/graph endpoint. There are three types of labels: node / edge labels, restriction labels, and target labels.

Node / Edge Labels

The QUERY_NODE_LABEL and QUERY_EDGE_LABEL query identifiers are unique to /query/graph. The QUERY_NODE_LABEL / QUERY_EDGE_LABEL identifiers are not paired with other query combinations; they are used to retrieve only the nodes and/or edges associated with a given label. For example:

queries=[
  "{'male'} AS QUERY_NODE_LABEL"
],

Restriction Labels

The RESTRICTIONS_NODE_LABEL and RESTRICTIONS_EDGE_LABEL identifiers are used to restrict the value set defined as NODE_LABEL and EDGE_LABEL respectively. These restrictive identifiers must follow an already-defined NODE_LABEL or EDGE_LABEL, i.e. they cannot be used on their own at graph creation time. They are used in restriction combinations when querying a graph just like other restriction identifiers.

For example, if the relation between two people is used as the EDGE_LABEL when creating the graph:

edges = [
  TABLE_K + ".name1 AS EDGE_NODE1_NAME",
  TABLE_K + ".name2 AS EDGE_NODE2_NAME",
  TABLE_K + ".relation AS EDGE_LABEL"
],

Then the family relation label can be restricted during the subsequent query using the RESTRICTIONS_EDGE_LABEL identifier:

restrictions = [
  "{'family'} AS RESTRICTIONS_EDGE_LABEL",
  "{0} AS RESTRICTIONS_ONOFFCOMPARED"
],

Target Labels

The QUERY_TARGET_NODE_LABEL query identifier is unique to /query/graph. It's always paired with another query combination to define a source-destination relationship and pathing for a query. This identifier also must follow an already-defined NODE_LABEL. Note that an adjacency output table will have two additional columns if the QUERY_TARGET_NODE_LABEL identifier is used:

  • PATH_ID - an ID that helps identify the different paths taken to arrive at the query target if there is more than one adjacency for a given query.
  • RING_ID - an ID that helps identify the steps taken to arrive at a query target. The RING_ID can also be referred to as the hop ID or number of hops it took to arrive at the query target

For example, a social graph can be created, representing a set of people, their primary interests, and their relationship to each other. Each node represents a person with their first name as the NODE_NAME and the primary interest as the NODE_LABEL, while each edge represents the relationship between each pair of people. The nodes would be defined as follows, using the name & interest columns from a table containing a given set of people as the NODE_NAME & NODE_LABEL of each node, respectively:

nodes = [
  TABLE_P + ".name AS NODE_NAME",
  TABLE_P + ".interest AS NODE_LABEL",
],

This graph can be used to query for people with an interest in chess who are related either directly, or indirectly via one or more other people, to the person named Jane. This can be accomplished by querying the graph, using Jane as the QUERY_NODE_NAME (which instructs the graph engine to begin the search at the node with a NODE_NAME of Jane) and using chess as the QUERY_TARGET_NODE_LABEL (which directs the graph engine to only return directly or indirectly connected nodes with a NODE_LABEL of chess):

queries = [
  "{'Jane'} AS QUERY_NODE_NAME",
  "",
  "{'chess'} AS QUERY_TARGET_NODE_LABEL"
],

The query results in two targets for the output adjacency table (Alex and Tom):

Edge ID Name 1   Name 2    Path ID Ring ID
======= ======== ========= ======= =======
1       Jane     Bill      0       1
2       Bill     Susan     0       2
5       Susan    Alex      0       3
1       Jane     Bill      1       1
2       Bill     Susan     1       2
5       Susan    Alex      1       3
4       Alex     Tom       1       4

Matching a Graph

Matching a graph is serviced by the /match/graph endpoint; this involves matching a directed route implied by a given set of latitude/longitude points to an existing underlying road network graph using a given solution type. The solution is then calculated and output to two tables (consult /match/graph for more information):

  1. A table containing track information and the mean square error score -- the lower the number, the more accurate a match.
  2. A table containing the coordinate information and how it relates to the track information and segment from the underlying graph.

Requirements for matching a graph include:

  • name of the graph to match
  • sample points
  • solution type

Solution types available:

Solver Description
INCREMENTAL_WEIGHTED Matches sample points to the graph using time and/or distance between points to influence one or more shortest paths across the sample points.
MARKOV_CHAIN Matches sample points to the graph using the Hidden Markov Model (HMM)-based method, which conducts a range-tree closest-edge search to find the best combinations of possible road segments for each sample point to create the best route. The route is secured one point at a time, so the prediction is corrected after each point. This solution type is the most accurate but also the most computationally intensive.
MATCH_OD_PAIRS Matches sample points to find the most probable path between origin and destination (OD) pairs with given cost constraints.
MATCH_SUPPLY_DEMAND Matches sample generic supply depots to generic demand points using abstract transportation (referred to as trucks). Each route is determined by a truck's ability (size) to service demand at each demand point.

Match Identifiers

Mapping a graph to a table requires a set of sample points. Sample points must be provided to the /match/graph endpoint using the unique identifiers below.

Important

Consult Components and Identifiers and Identifier Combinations for general information on identifiers and combinations. Note that the same limitations that apply to /create/graph and /solve/graph identifiers also apply to /match/graph identifiers

Identifier Supported Types Description
SAMPLE_X float, double A number representing a sample's X or longitude value.
SAMPLE_Y float, double A number representing a sample's Y or latitude value.
SAMPLE_WKTPOINT float, double A WKT string representing a sample's geospatial point, e.g., "POINT(X Y)". See Geospatial Objects for more information on WKTs.
SAMPLE_TIME long, double

A number representing a sample's time value.

Note

SAMPLE_TIME could theoretically represent any time unit (seconds, minutes, epoch, etc.) as long as the representations are consistent for the column in the source table.

SAMPLE_TRIPID int, long A number representing a unique trip identifier.
SAMPLE_ORIGIN_WKTPOINT wkt A WKT string representing a sample's geospatial origin point, e.g., "POINT(X Y)". See Geospatial Objects for more information on WKTs.
SAMPLE_DESTINATION_WKTPOINT wkt A WKT string representing a sample's geospatial destination point, e.g., "POINT(X Y)". See Geospatial Objects for more information on WKTs.
SAMPLE_OD_TIME float, double

A number representing an origin-destination (OD) pair-related sample's cost.

Note

SAMPLE_OD_TIME may not necessarily depict time, e.g., if the graph's weights were distance-based, SAMPLE_OD_TIME could theoretically reflect distance as well, assuming the values are consistent with the values used to create the original weights.

SAMPLE_OD_ID int, long A number representing an OD-pair-related sample's unique identifier.
SAMPLE_DEMAND_ID int, long A number representing a demand's unique identifier.
SAMPLE_DEMAND_WKTPOINT wkt A WKT string representing a demand's geospatial source point.
SAMPLE_DEMAND_SIZE int, long A number representing the size of the demand.
SAMPLE_DEMAND_DEPOT_ID int, long A number representing a demand source's unique identifier.
SAMPLE_SUPPLY_DEPOT_ID int, long A number representing a supplier source's unique identifier.
SAMPLE_SUPPLY_WKTPOINT wkt A WKT string representing a supplier's geospatial source point.
SAMPLE_SUPPLY_TRUCK_ID int, long A number representing a transport's unique identifier.
SAMPLE_SUPPLY_TRUCK_SIZE int, long A number representing a transport's capacity.

Note

Records with a timestamp of 0 for the SAMPLE_TIME column will be ignored when calculating the solution.

Match Identifier Combinations

To properly match a graph using identifiers, there's a minimum set of identifiers that must be used. Each identifier combination must reference columns from the same table, e.g., the combination SAMPLE_WKTPOINT and SAMPLE_TIME must both use columns from the same table. The columns must also not be nullable. The valid match identifier combinations are as follows:

Note

If using the SAMPLE_TRIPID identifier to match the graph, all trip IDs will be used in the solution.

  • SAMPLE_X, SAMPLE_Y, SAMPLE_TIME

  • SAMPLE_WKTPOINT, SAMPLE_TIME

  • SAMPLE_X, SAMPLE_Y, SAMPLE_TIME, SAMPLE_TRIPID

  • SAMPLE_WKTPOINT, SAMPLE_TIME, SAMPLE_TRIPID

  • SAMPLE_ORIGIN_WKTPOINT, SAMPLE_DESTINATION_WKTPOINT, SAMPLE_OD_TIME

    Important

    This combination can only be used with the MATCH_OD_PAIRS solver.

  • SAMPLE_ORIGIN_WKTPOINT, SAMPLE_DESTINATION_WKTPOINT, SAMPLE_OD_TIME, SAMPLE_OD_ID

    Important

    This combination can only be used with the MATCH_OD_PAIRS solver.

  • SAMPLE_DEMAND_ID, SAMPLE_DEMAND_WKTPOINT, SAMPLE_DEMAND_SIZE, SAMPLE_DEMAND_DEPOT_ID

    Important

    This combination can only be used with the MATCH_SUPPLY_DEMAND solver. It is used in conjunction with the below combination to match suppliers to demand. Reference Identifier Combinations for using multiple combinations syntax.

  • SAMPLE_SUPPLY_DEPOT_ID, SUPPLY_WKTPOINT, SAMPLE_SUPPLY_TRUCK_ID, SAMPLE_SUPPLY_TRUCK_SIZE

    Important

    This combination can only be used with the MATCH_SUPPLY_DEMAND solver. It is used in conjunction with the above combination to match suppliers to demand. Reference Identifier Combinations for using multiple combinations syntax.

Showing a Graph

Using /show/graph will provide detailed information about a graph, including number of nodes and edges in the graph, whether the graph is directed or persisted, and more. This endpoint is used by GAdmin to populate the Graphs page.

Listing a Graph

Listing a graph is serviced by the /list/graph endpoint; this involves providing a graph name to view information about the provided graph on the graph server. Providing no graph name will return information on all graphs.

Deleting a Graph

Deleting a graph is serviced by the /delete/graph endpoint; this involves providing a graph name to delete the graph from the graph server (memory) and persist (if applicable).

Tip

If a graph was saved to persist upon creation and then was deleted from the server (but NOT persist), it can be reloaded from persist by recreating the graph using the same graph_name.

Examples

Consult Network Graphs & Solvers Examples for detailed usage of the graph server, including creating, querying, and solving graphs.

Limitations and Cautions

  • Groups of valid identifier combinations must be from the same table, e.g., NODE_ID and NODE_NAME must reference columns from the same table
  • Node, edge, weight, and optional restriction identifiers should be matched to yield a useful graph (NODE_ID --> EDGE_NODE1_ID and EDGE_NODE2_ID / EDGE_ID --> WEIGHTS_EDGE_ID, etc.)
  • Groups of valid numerical identifier combinations must be the same type, e.g., if EDGE_ID is identified from an int column, both EDGE_NODE1_ID and EDGE_NODE2_ID must also be int
  • Graphs cannot be created using columns with the nullable property
  • If no ID identifier is provided, weights will be matched with edges by table row, e.g., the first record in the weight table will be used for the first record in the edge table (should the weights and edges be separate). If two weights are specified for the same edge, the weights are added (if both are using the WEIGHTS_VALUESPECIFIED identifier) or multiplied (if one or both are using the WEIGHTS_FACTORSPECIFIED identifier) together.
  • A node or edge can have up to 64 unique labels.