Note
Click here to download the full example code
Tables Example¶
The tables client exposes a highly scalable backend for ingesting data and performing various types of queries, such as geospatial and temporal, over that data. This example creates a new table, ingests some rows, performs some basic queries, deletes some rows, shares the table with others, and then deletes the table.
The API reference for these methods and classes is at descarteslabs.tables
.
import os
from tempfile import mkstemp
from descarteslabs.tables import Tables
t = Tables()
First let’s download a sample dataset in the form of a shape file so that we don’t have to create our own. We will use active fire data from NASA FIRMS (https://firms.modaps.eosdis.nasa.gov/usfs/active_fire) and assume that it was saved to a file called /tmp/modis_fire.zip. Example command to download: wget -O /tmp/modis_fire.zip https://firms.modaps.eosdis.nasa.gov/data/active_fire/modis-c6.1/shapes/zips/MODIS_C6_1_USA_contiguous_and_Hawaii_7d.zip
import requests
url = "https://cdn.descarteslabs.com/docs/MODIS_C6_1_USA_contiguous_and_Hawaii_7d.zip"
r = requests.get(url)
fd, tmpf = mkstemp(suffix=".zip")
try:
with os.fdopen(fd, mode="w+b") as f:
f.write(r.content)
################################################
# Once you have data, the next step is to create a table that supports the features in
# this shape file. We need a schema and an SRID, and a quick way to get that is by
# using :meth:`Tables.inspect_dataset <descarteslabs.tables.inspect_dataset>` Note
# that this method of deriving the schema and srid should not be relied upon without
# manually inspecting the output due to the sheer amount of conversions possible. For
# example, converting a ``date`` type may not translate properly so it is encumbent on
# the user to ensure everything looks correct before creating a table.
schema, srid = t.inspect_dataset(tmpf)
print(schema, srid)
################################################
# Now let's create our table with :meth:`Tables.create_table
# <descarteslabs.tables.create_table>`. The table must have a primary key, which is
# either provided on creation or created by the service. We'll create one on our own
# and make it auto-increment.
from uuid import uuid4
table_name = f"fires_{uuid4().hex}"
schema["properties"]["ID"] = "auto"
t.create_table(table_name, srid=srid, schema=schema, primary_key="ID")
################################################
# Now that we've created the table, we can start inserting rows into it. We have the
# dataset from earlier, so let's just insert that using :meth:`Tables.insert_rows
# <descarteslabs.tables.insert_rows>`. Like many other asynchronous client calls in
# the platform, a job ID is returned that can be used to poll the status of the job.
# :meth:`Tables.wait_until_completion <descarteslabs.tables.wait_until_completion>`
# can be used to halt execution of the program until the ingest job succeeds.
jobid = t.upload_file(tmpf, table_name)
t.wait_until_completion(jobid)
finally:
try:
os.remove(tmpf)
except Exception:
pass
Out:
{'properties': {'LATITUDE': 'float', 'LONGITUDE': 'float', 'BRIGHTNESS': 'float', 'SCAN': 'float', 'TRACK': 'float', 'ACQ_DATE': 'date', 'ACQ_TIME': 'str', 'SATELLITE': 'str', 'CONFIDENCE': 'int', 'VERSION': 'str', 'BRIGHT_T31': 'float', 'FRP': 'float', 'DAYNIGHT': 'str'}, 'geometry': 'Point'} 4326
Once the data has been ingested, the ibis library provides an interface for us to lazily evalute queries and retrieve them as Geo/DataFrames.
fires = t.table(table_name)
print(fires.execute())
Out:
SATELLITE FRP ... LATITUDE geom
0 T 12.95 ... 18.14990 POINT (-71.42392 18.14990)
1 T 20.32 ... 18.14881 POINT (-71.41471 18.14881)
2 T 20.44 ... 18.92936 POINT (-70.34927 18.92936)
3 T 13.43 ... 18.92716 POINT (-70.36263 18.92716)
4 T 11.44 ... 34.66922 POINT (-98.32987 34.66922)
... ... ... ... ... ...
2526 A 102.23 ... 17.84056 POINT (-93.11199 17.84056)
2527 A 65.58 ... 17.85654 POINT (-93.11552 17.85654)
2528 A 11.37 ... 26.88698 POINT (-101.44954 26.88698)
2529 A 16.13 ... 38.13739 POINT (-121.65136 38.13739)
2530 A 16.53 ... 38.13445 POINT (-121.63574 38.13445)
[2531 rows x 15 columns]
There are many different types of queries that can be performed. For example, to select specific columns:
print(fires["ID", "LONGITUDE", "LATITUDE", "ACQ_TIME"].execute())
Out:
ID LONGITUDE LATITUDE ACQ_TIME
0 1 -71.42392 18.14990 0236
1 2 -71.41471 18.14881 0236
2 3 -70.34927 18.92936 0236
3 4 -70.36263 18.92716 0236
4 5 -98.32987 34.66922 0418
... ... ... ... ...
2526 2527 -93.11199 17.84056 0718
2527 2528 -93.11552 17.85654 0718
2528 2529 -101.44954 26.88698 0852
2529 2530 -121.65136 38.13739 1028
2530 2531 -121.63574 38.13445 1028
[2531 rows x 4 columns]
A more in depth example: let’s find the first row with SATELLITE == ‘T’ whose geometry intersects with the highest confidence field. Note that you will need the ibis library to do this intersection, which is installed as a dependency of the tables client
import ibis
confidence = fires["CONFIDENCE", "geom"].execute()
confidence = confidence[confidence["CONFIDENCE"] > 90]
df_geom = fires[
(fires.SATELLITE == "T")
& fires.geom.intersects(ibis.literal(confidence.iloc[0].geom, type="point;4326"))
]
print(df_geom.execute())
Out:
SATELLITE FRP BRIGHTNESS ... ACQ_TIME LATITUDE geom
0 T 35.3 332.13 ... 0414 17.89659 POINT (-92.49234 17.89659)
[1 rows x 15 columns]
If we want to remove some rows from the table, we need to get a list of the primary
keys associated with those rows. We can use the Tables.delete_rows
method to achieve this. Let’s delete every row
with < 20 confidence.
to_delete = fires[fires.CONFIDENCE < 20].ID.execute()
jobid = t.delete_rows(to_delete.to_list(), table_name)
t.wait_until_completion(jobid)
Sharing is handled by the Discover service. Therefore, we will need the
Discover
client to do this.
from descarteslabs.discover import Discover, Organization
d = Discover()
To share a table you own with someone as a “viewer” (i.e. read-only), use
Discover.discover.table.share
d.table(table_name).share(with_=Organization("descarteslabs"), as_="viewer")
To change access to a table from viewer to editor (i.e. read/write), use
Discover.discover.table.replace_shares
d.table(table_name).replace_shares(
user=Organization("descarteslabs"), from_role="viewer", to_role="editor"
)
Finally, to revoke access to a table use Discover.discover.table.revoke
d.table(table_name).revoke(from_=Organization("descarteslabs"), as_="editor")
Once you are finished with a table, you can delete the entire thing with
Tables.delete_table
t.delete_table(table_name)
Total running time of the script: ( 2 minutes 4.565 seconds)