A collection is a container for tables and/or views, somewhat akin to a schema in other databases. Other types of views, like joins, can also be created within collections. The contained tables and/or views can be of uniform or heterogeneous type schemas.

A view collection is what results from performing a filter operation against another collection.

Besides providing a means for logical grouping of tables, a collection provides the ability to query the contained tables, regardless of the mixture of type schemas, for columns shared between all tables. Tables within the collection that lack the queried-for columns will simply not contribute to the resulting data set.

For example, a transportation department may have two teams gathering ongoing spatial information in their respective jurisdictions. Team A ingests geo-referenced objects from Twitter into a table named TWITTER inside the collection MASTER. Their objects have X and Y columns corresponding to longitude and latitude, as well as TIMESTAMP and other columns. Meanwhile, Team B collects vehicle movement information, adding their objects with X, Y, TIMESTAMP, TRACK_ID and other columns to a table named VEHICLE_TRACKS, also inside the MASTER collection. Now, queries for X, Y, and TIMESTAMP on the collection MASTER will be applied to both tables.
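The shared-column behavior in this example can be illustrated with a small, self-contained Python model. This is only a sketch and does not use the database API: the MASTER dictionary layout, the query_collection helper, and the sample rows are all hypothetical stand-ins, chosen to show how a table that lacks a requested column drops out of the result.

```python
# Hypothetical in-memory model of a collection (not the database API):
# each table has a list of column names and a list of rows.
MASTER = {
    "TWITTER": {
        "columns": ["X", "Y", "TIMESTAMP", "TEXT"],
        "rows": [{"X": -77.03, "Y": 38.90, "TIMESTAMP": 1000, "TEXT": "traffic jam"}],
    },
    "VEHICLE_TRACKS": {
        "columns": ["X", "Y", "TIMESTAMP", "TRACK_ID"],
        "rows": [{"X": -77.01, "Y": 38.88, "TIMESTAMP": 1005, "TRACK_ID": "bus-7"}],
    },
}


def query_collection(collection, columns):
    """Return (table_name, projected_row) pairs from every table
    that has all of the requested columns."""
    results = []
    for table_name, table in collection.items():
        # A table lacking any requested column contributes nothing
        if not all(col in table["columns"] for col in columns):
            continue
        for row in table["rows"]:
            results.append((table_name, {col: row[col] for col in columns}))
    return results


# Both tables share X, Y, and TIMESTAMP, so both contribute
shared = query_collection(MASTER, ["X", "Y", "TIMESTAMP"])

# Only VEHICLE_TRACKS has TRACK_ID, so TWITTER is silently skipped
track_only = query_collection(MASTER, ["TRACK_ID"])
```

In this model, shared holds one row from each table, while track_only holds only the VEHICLE_TRACKS row.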

Collections do not themselves have a time-to-live in the way that views do. However, an empty collection will be removed from memory automatically when the database on which it resides is restarted. In that case, it will need to be recreated for applications that depend on its existence, like ODBC or Reveal.

Setting a time-to-live on a collection will set the time-to-live of every table & view contained within it. This creates an effective time-to-live for the collection, as each access of a member of the collection will extend its life.

Collections have the same naming criteria as tables.

A collection can be created using the /create/table endpoint with the is_collection option set to true.


Since collections can contain tables of different types, you don't need to specify a type ID.
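As a minimal sketch of calling the endpoint directly from Python, the helper below wraps the create_table method of a database handle. It assumes h_db is an already connected gpudb.GPUdb object and that create_table accepts table_name, type_id, and options keyword arguments; the create_collection helper name is hypothetical, and the empty type ID reflects the point above that none is required.

```python
def create_collection(h_db, name):
    """Create a collection via the /create/table endpoint.

    Assumes h_db is a connected gpudb.GPUdb handle whose create_table
    method takes table_name, type_id, and options.
    """
    return h_db.create_table(
        table_name=name,
        type_id="",  # a collection needs no type ID
        options={"is_collection": "true"},  # option values are passed as strings
    )
```

With a live connection, this could be called as create_collection(h_db, "my_collection").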

In Python, the GPUdbTable class serves as a convenient wrapper for many table-related endpoints and can be used to create a collection implicitly, if it doesn't already exist; for example:

gpudb.GPUdbTable(
    None,  # a collection has no type of its own
    name = "my_collection",
    db = h_db,
    options = gpudb.GPUdbTableOptions.default().is_collection(True)
)

After a collection is created, tables and views can be added to it at their creation time by calling the compatible endpoints with the collection_name option set to the collection's name.

For example, in Python:

# Create a column list
columns = [
    [ "id", gpudb.GPUdbRecordColumn._ColumnType.INT ],
    [ "name", gpudb.GPUdbRecordColumn._ColumnType.STRING, gpudb.GPUdbColumnProperty.CHAR64 ]
]

# Create a simple table using the column list
gpudb.GPUdbTable(
    columns,
    name = "table_in_collection",
    db = h_db,
    options = gpudb.GPUdbTableOptions.default().collection_name("my_collection")
)

Once records/tables are available in the collection, you can use the /get/records/fromcollection endpoint to retrieve data from the collection.

For example, in Python:

result = h_db.get_records_from_collection_and_decode("my_collection", 0, -9999)

for collection_record in result["records"]:
    print(collection_record.values())