Tables Example

The tables client exposes a highly scalable backend for ingesting data and performing various types of queries, such as geospatial and temporal, over that data. This example creates a new table, ingests some rows, performs some basic queries, deletes some rows, shares the table with others, and then deletes the table.

The API reference for these methods and classes is at descarteslabs.tables.

from descarteslabs.tables import Tables

t = Tables()

First let’s download a sample dataset in the form of a shape file so that we don’t have to create our own. We will use active fire data from NASA FIRMS (https://firms.modaps.eosdis.nasa.gov/usfs/active_fire) and assume that it was saved to a file called /tmp/modis_fire.zip. Example command to download: wget -O /tmp/modis_fire.zip https://firms.modaps.eosdis.nasa.gov/data/active_fire/modis-c6.1/shapes/zips/MODIS_C6_1_USA_contiguous_and_Hawaii_7d.zip

import requests

url = "https://firms.modaps.eosdis.nasa.gov/data/active_fire/modis-c6.1/shapes/zips/MODIS_C6_1_USA_contiguous_and_Hawaii_7d.zip"
r = requests.get(url)

with open("/tmp/modis_fire.zip", "wb") as f:
    f.write(r.content)

Once you have data, the next step is to create a table that supports the features in this shape file. We need a schema and an SRID, and a quick way to get that is by using Tables.inspect_dataset Note that this method of deriving the schema and srid should not be relied upon without manually inspecting the output due to the sheer amount of conversions possible. For example, converting a date type may not translate properly so it is encumbent on the user to ensure everything looks correct before creating a table.

schema, srid = t.inspect_dataset("/tmp/modis_fire.zip")
print(schema, srid)

Out:

{'properties': {'LATITUDE': 'float', 'LONGITUDE': 'float', 'BRIGHTNESS': 'float', 'SCAN': 'float', 'TRACK': 'float', 'ACQ_DATE': 'date', 'ACQ_TIME': 'str', 'SATELLITE': 'str', 'CONFIDENCE': 'int', 'VERSION': 'str', 'BRIGHT_T31': 'float', 'FRP': 'float', 'DAYNIGHT': 'str'}, 'geometry': 'Point'} 4326

Now let’s create our table with Tables.create_table. The table must have a primary key, which is either provided on creation or created by the service. We’ll create one on our own and make it auto-increment.

from uuid import uuid4

table_name = f"fires_{uuid4().hex}"
schema["properties"]["ID"] = "auto"
t.create_table(table_name, srid=srid, schema=schema, primary_key="ID")

Now that we’ve created the table, we can start inserting rows into it. We have the dataset from earlier, so let’s just insert that using Tables.insert_rows. Like many other asynchronous client calls in the platform, a job ID is returned that can be used to poll the status of the job. Tables.wait_until_completion can be used to halt execution of the program until the ingest job succeeds.

jobid = t.upload_file("/tmp/modis_fire.zip", table_name)
t.wait_until_completion(jobid)

Once the data has been ingested, the ibis library provides an interface for us to lazily evalute queries and retrieve them as Geo/DataFrames.

fires = t.table(table_name)
print(fires.execute())

Out:

       ACQ_DATE VERSION  ...  LATITUDE                         geom
0    2022-01-13  6.1NRT  ...  21.25716   POINT (-77.88353 21.25716)
1    2022-01-13  6.1NRT  ...  22.59575   POINT (-79.84987 22.59575)
2    2022-01-13  6.1NRT  ...  22.61468   POINT (-83.06532 22.61468)
3    2022-01-13  6.1NRT  ...  33.01699   POINT (-79.88882 33.01699)
4    2022-01-13  6.1NRT  ...  44.68003  POINT (-109.64005 44.68003)
...         ...     ...  ...       ...                          ...
2728 2022-01-20  6.1NRT  ...  39.48658  POINT (-122.02285 39.48658)
2729 2022-01-20  6.1NRT  ...  39.48912  POINT (-122.02832 39.48912)
2730 2022-01-20  6.1NRT  ...  39.69320  POINT (-122.15646 39.69320)
2731 2022-01-20  6.1NRT  ...  39.74248  POINT (-122.07507 39.74248)
2732 2022-01-20  6.1NRT  ...  39.74580  POINT (-122.08365 39.74580)

[2733 rows x 15 columns]

There are many different types of queries that can be performed. For example, to select specific columns:

print(fires["ID", "LONGITUDE", "LATITUDE", "ACQ_TIME"].execute())

Out:

        ID  LONGITUDE  LATITUDE ACQ_TIME
0        1  -77.88353  21.25716     0321
1        2  -79.84987  22.59575     0321
2        3  -83.06532  22.61468     0321
3        4  -79.88882  33.01699     0323
4        5 -109.64005  44.68003     0508
...    ...        ...       ...      ...
2728  2729 -122.02285  39.48658     2038
2729  2730 -122.02832  39.48912     2038
2730  2731 -122.15646  39.69320     2038
2731  2732 -122.07507  39.74248     2038
2732  2733 -122.08365  39.74580     2038

[2733 rows x 4 columns]

A more in depth example: let’s find the first row with SATELLITE == ‘T’ whose geometry intersects with the highest confidence field. Note that you will need the ibis library to do this intersection, which is installed as a dependency of the tables client

import ibis

confidence = fires["CONFIDENCE", "geom"].execute()
confidence = confidence[confidence["CONFIDENCE"] > 90]
df_geom = fires[
    (fires.SATELLITE == "T")
    & fires.geom.intersects(ibis.literal(confidence.iloc[0].geom, type="point;4326"))
]
print(df_geom.execute())

Out:

    ACQ_DATE VERSION  ...  LATITUDE                         geom
0 2022-01-13  6.1NRT  ...  46.51318  POINT (-112.31757 46.51318)

[1 rows x 15 columns]

If we want to remove some rows from the table, we need to get a list of the primary keys associated with those rows. We can use the Tables.delete_rows method to achieve this. Let’s delete every row with < 20 confidence.

to_delete = fires[fires.CONFIDENCE < 20].ID.execute()
jobid = t.delete_rows(to_delete.to_list(), table_name)
t.wait_until_completion(jobid)

Sharing is handled by the Discover service. Therefore, we will need the Discover client to do this.

from descarteslabs.discover import Discover, Organization

d = Discover()

To share a table you own with someone as a “viewer” (i.e. read-only), use Discover.discover.table.share

d.table(table_name).share(with_=Organization("descarteslabs"), as_="viewer")

To change access to a table from viewer to editor (i.e. read/write), use Discover.discover.table.replace_shares

d.table(table_name).replace_shares(
    user=Organization("descarteslabs"), from_role="viewer", to_role="editor"
)

Finally, to revoke access to a table use Discover.discover.table.revoke

d.table(table_name).revoke(from_=Organization("descarteslabs"), as_="editor")

Once you are finished with a table, you can delete the entire thing with Tables.delete_table

t.delete_table(table_name)

Total running time of the script: ( 0 minutes 12.596 seconds)

Gallery generated by Sphinx-Gallery