Vectors

The Vector service lets you store vector geometries (points, polygons, etc.) along with key-value properties, and query that data spatially and/or by properties.

It’s meant for data at the scale of millions to billions of features. If your data can fit in memory, work with it there—the Vector service is not meant for small datasets, and will be far less performant than working locally.

Storing the output from Tasks is a typical use of the Vector service, and one that can easily produce millions of features. For example, a computer vision detector might be run in thousands of tasks over many years of data across a continent; the objects it detects could be saved as features for later querying and analysis.

Data Types

The Vector service mirrors GeoJSON by offering two types: Feature and FeatureCollection.

A Feature is a single geometry and key-value properties; a FeatureCollection holds many Features, with a name and access controls.

The key-value properties of Features are schemaless: Features in the same FeatureCollection do not all have to have the same keys present, or the same types for their values. Also, FeatureCollections are append-only: once a Feature is added, it can’t be removed or modified.

Feature

A Feature is a single GeoJSON Geometry, a dict of properties, and a unique id:

In [1]: import descarteslabs as dl

In [2]: import shapely.geometry

In [3]: feature = dl.vectors.Feature(
   ...:   geometry={
   ...:     'type': 'Polygon',
   ...:     'coordinates': [[[-95, 42], [-93, 42], [-93, 40], [-95, 41], [-95, 42]]]
   ...:   }, properties={"temperature": 70.13, "size": "large", "tags": None}
   ...: )
   ...: 

In [4]: feature
Out[4]: 
Feature({
  'geometry': {
    'coordinates': (((-95.0, 42.0), (-93.0, 42.0), (-93.0, 40.0), (-95.0, 41.0), (-95.0, 42.0)),),
    'type': 'Polygon'
  },
  'id': None,
  'properties': {
    'size': 'large',
    'tags': None,
    'temperature': 70.13
  },
  'type': 'Feature'
})

Unlike GeoJSON, the values in properties can only be strings (up to 256 characters), integers, floats, or the value None. Therefore, properties doesn’t support nesting (containing more dictionaries or lists).
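
These rules can be checked locally before creating Features. The following is a minimal sketch, not part of the descarteslabs client: validate_properties is a hypothetical helper, and rejecting booleans is an assumption, since the text above lists only strings, integers, floats, and None.

```python
MAX_STRING_LENGTH = 256  # limit on string values stated above


def validate_properties(properties):
    """Return a list of problems found in a flat properties dict.

    Hypothetical helper for illustration; not part of the descarteslabs client.
    """
    problems = []
    for key, value in properties.items():
        if value is None:
            continue
        # bool is a subclass of int in Python; this sketch rejects it (an assumption).
        if isinstance(value, bool):
            problems.append("%s: bool is not a listed type" % key)
        elif isinstance(value, str):
            if len(value) > MAX_STRING_LENGTH:
                problems.append("%s: string longer than %d characters" % (key, MAX_STRING_LENGTH))
        elif isinstance(value, (int, float)):
            continue
        else:
            # dicts, lists, and other nested values are not supported
            problems.append("%s: unsupported type %s" % (key, type(value).__name__))
    return problems


ok = validate_properties({"temperature": 70.13, "size": "large", "tags": None})
bad = validate_properties({"tags": ["a", "b"], "meta": {"k": 1}})
```

Here ok is an empty list, while bad contains one entry for the list value and one for the nested dict.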

geometry must be a primitive GeoJSON geometry (Point, MultiPoint, Polygon, MultiPolygon, LineString, MultiLineString, GeometryCollection). Using a Feature or FeatureCollection will raise an error. As a GeoJSON geometry, the coordinates are assumed to be (longitude, latitude) in WGS84 decimal degrees (EPSG:4326), with planar edges.
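
If your data starts out as GeoJSON Features, one option is to unpack each into a bare geometry and a properties dict first. A minimal sketch over plain dicts; split_geojson_feature is a hypothetical helper, not a library function:

```python
PRIMITIVE_TYPES = {
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon", "GeometryCollection",
}


def split_geojson_feature(obj):
    """Return (geometry, properties) from a GeoJSON Feature or bare geometry dict."""
    if obj.get("type") == "Feature":
        geometry = obj["geometry"]
        properties = obj.get("properties") or {}
    else:
        geometry, properties = obj, {}
    # Anything that is still not a primitive geometry (e.g. a FeatureCollection) is rejected
    if geometry.get("type") not in PRIMITIVE_TYPES:
        raise ValueError("not a primitive GeoJSON geometry: %r" % geometry.get("type"))
    return geometry, properties


geojson_feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-106.646, 35.938]},  # (longitude, latitude)
    "properties": {"name": "San Antonio"},
}
geometry, properties = split_geojson_feature(geojson_feature)
```

The resulting pair matches the geometry and properties arguments used in the Feature example above.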

You don’t need to set id yourself, and shouldn’t: it is assigned when the Feature is added to a FeatureCollection.

The geometry you pass in is converted to a Shapely shape:

In [5]: feature.geometry
Out[5]: <shapely.geometry.polygon.Polygon at 0x7f7db9d2cad0>

The properties are stored in a DotDict, making syntax for getting and setting properties more convenient:

In [6]: feature.properties['temperature']
Out[6]: 70.13

In [7]: feature.properties.temperature
Out[7]: 70.13

(The DotDict class allows you to refer to values by key or as an attribute.)
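
The idea behind attribute-style access can be sketched with a small dict subclass. This is an illustration only, not the library's DotDict implementation; AttrDict is a made-up name:

```python
class AttrDict(dict):
    """A dict whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


props = AttrDict({"temperature": 70.13, "size": "large"})
props.size = "small"           # same as props["size"] = "small"
by_key = props["temperature"]
by_attr = props.temperature    # same value as by_key
```

In this sketch, key access is still needed for names that collide with dict methods, such as items or keys.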

FeatureCollection

In the Vector service, you create products with a name, description, and access controls to hold a collection of Features.

A FeatureCollection represents one of those vector products with some filters applied to it.

A FeatureCollection doesn’t actually contain data. Instead, FeatureCollection.filter sets up the filters to be used, and FeatureCollection.features returns an iterator over the matching Features. The query is performed when the first value is retrieved from that iterator.

FeatureCollection.filter returns a new FeatureCollection instance, allowing you to partially apply and chain filters.

Creating FeatureCollections

To see existing FeatureCollections that you have access to, use FeatureCollection.list:

In [8]: fcs = dl.vectors.FeatureCollection.list()

In [9]: fcs[:3]
Out[9]: 
[FeatureCollection({
   u'description': u'',
   'id': u'noaa_tornado_reports',
   u'name': u'noaa_tornado_reports',
   u'title': u'NOAA Tornado Reports'
 }), FeatureCollection({
   u'description': u'',
   'id': u'06d1f4694ead46a49f6b32194dfadac',
   u'name': u'us_congressional_districts_area',
   u'title': u'Congressional Districts of the USA'
 }), FeatureCollection({
   u'description': u"Data from 1980 NOAA 'Thermal Springs List for the United States'",
   'id': u'05420e67c6bc4bc7980cfe795eed361',
   u'name': u'nm_hotsprings',
   u'owners': [u'user:d4ef22d5a6969cb61147ec8ea3e060cdf33e1a49', u'org:descarteslabs'],
   u'readers': [],
   u'title': u'Geothermal Springs in New Mexico',
   u'writers': []
 })]

You can instantiate a FeatureCollection object from an existing Vector product using its ID:

In [10]: us_cities_fc = dl.vectors.FeatureCollection("d1349cc2d8854d998aa6da92dc2bd24")

In [11]: us_cities_fc
Out[11]: 
FeatureCollection({
  u'description': u'',
  'id': 'd1349cc2d8854d998aa6da92dc2bd24',
  u'name': u'us_cities_area',
  u'title': u'Cities of the USA'
})

To create a new Vector product, use FeatureCollection.create. You must supply a name, a human-readable title, and a description. You can also supply optional owners, readers, and writers lists.

Names must be fewer than 256 characters long and may only contain alphanumeric characters, dashes (-), and underscores (_).
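
A name can be checked against these rules locally before calling create. A minimal sketch, assuming "alphanumeric" means ASCII letters and digits and that empty names are not allowed (neither is stated above); is_valid_name is a hypothetical helper:

```python
import re

# ASCII letters, digits, dashes, and underscores; 1 to 255 characters
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,255}")


def is_valid_name(name):
    """True if the whole name matches the naming rules sketched above."""
    return NAME_PATTERN.fullmatch(name) is not None


good = is_valid_name("mountains_of_middle_earth")
spaces = is_valid_name("Mountains of Middle Earth")  # spaces are not allowed
```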

In [12]: mountains_of_middle_earth = dl.vectors.FeatureCollection.create(
   ....:   name="mountains_of_middle_earth",
   ....:   title="Mountains of Middle Earth",
   ....:   description="Nice spots to climb around the Shire"
   ....: )
   ....: 

The created FeatureCollection has a unique ID:

In [13]: mountains_of_middle_earth.id
Out[13]: u'990dbf494c644ea195c19fadaa42d02'

Modifying FeatureCollections

To modify the metadata of a FeatureCollection (its title, description, access control lists, etc.), use FeatureCollection.update:

In [14]: mountains_of_middle_earth.update(
   ....:   description="Mt. Doom is on private land; landowner not climber-friendly",
   ....:   writers=mountains_of_middle_earth.writers + ['org:descarteslabs']
   ....: )
   ....: 

In [15]: mountains_of_middle_earth
Out[15]: 
FeatureCollection({
  u'description': u'Mt. Doom is on private land; landowner not climber-friendly',
  'id': u'990dbf494c644ea195c19fadaa42d02',
  u'name': u'mountains_of_middle_earth',
  u'owners': [u'user:d4ef22d5a6969cb61147ec8ea3e060cdf33e1a49', u'org:descarteslabs'],
  u'readers': [],
  u'title': u'Mountains of Middle Earth',
  u'writers': [u'org:descarteslabs']
})

Delete a Vector product that you own using FeatureCollection.delete:

In [16]: mountains_of_middle_earth.delete()

In [17]: "mountains_of_middle_earth" in [fc.name for fc in dl.vectors.FeatureCollection.list()]
Out[17]: False

Adding Features

To add Features to a FeatureCollection, use FeatureCollection.add and pass in a Feature instance, or a list of them. The method returns a copy of the Features, with the id attribute now set.

# first, we need a FeatureCollection to add to
In [18]: nm_hotsprings = dl.vectors.FeatureCollection.create(
   ....:   name="nm_hotsprings",
   ....:   title="Geothermal Springs in New Mexico",
   ....:   description="Data from 1980 NOAA 'Thermal Springs List for the United States'"
   ....: )
   ....: 

Make some Features:

In [19]: hotsprings_features = [
   ....:   dl.vectors.Feature(
   ....:     shapely.geometry.Point(-106.646, 35.938),
   ....:     {'name': 'San Antonio', 'temp_c': 54, 'category': 'hot'}
   ....:   ),
   ....:   dl.vectors.Feature(
   ....:     shapely.geometry.Point(-106.827, 35.548),
   ....:     {'name': 'San Ysidro', 'temp_c': 20, 'category': 'warm', 'fun': 'no'}
   ....:   ),
   ....:   dl.vectors.Feature(
   ....:     shapely.geometry.Point(-108.209, 33.199),
   ....:     {'name': 'Gila', 'temp_c': 66, 'category': 'hot'}
   ....:   ),
   ....: ]
   ....: 
In [20]: nm_hotsprings.add(hotsprings_features)
Out[20]: 
[Feature({
   'geometry': {
     'coordinates': (-106.646, 35.938),
     'type': 'Point'
   },
   'id': u'1e9ee63ed19e49e498acf28fbb472d8_82089b46cf35407d',
   'properties': {
     'category': 'hot',
     'name': 'San Antonio',
     'temp_c': 54
   },
   'type': 'Feature'
 }), Feature({
   'geometry': {
     'coordinates': (-106.827, 35.548),
     'type': 'Point'
   },
   'id': u'1e9ee63ed19e49e498acf28fbb472d8_30086dbfe8924392',
   'properties': {
     'category': 'warm',
     'fun': 'no',
     'name': 'San Ysidro',
     'temp_c': 20
   },
   'type': 'Feature'
 }), Feature({
   'geometry': {
     'coordinates': (-108.209, 33.199),
     'type': 'Point'
   },
   'id': u'1e9ee63ed19e49e498acf28fbb472d8_1c215051502c4173',
   'properties': {
     'category': 'hot',
     'name': 'Gila',
     'temp_c': 66
   },
   'type': 'Feature'
 })]

Notice that the returned Features have an ID set.

Features do not need to follow a fixed schema: notice how the Feature for San Ysidro hot springs has {'fun': 'no'} set, whereas the other hot springs do not have a fun property (since obviously they both are). Features can have different properties, or different types of values for properties of the same name, so be prepared for this in your code.
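
One defensive pattern is to read properties with a default and check the type before using the value. A minimal sketch over plain dicts shaped like the properties above; temp_in_fahrenheit is a hypothetical helper, and the third record is invented to show a mismatched type:

```python
def temp_in_fahrenheit(properties):
    """Return temp_c converted to Fahrenheit, or None if missing or not numeric."""
    temp_c = properties.get("temp_c")  # a missing key yields None instead of KeyError
    if isinstance(temp_c, bool) or not isinstance(temp_c, (int, float)):
        return None
    return temp_c * 9 / 5 + 32


all_properties = [
    {"name": "San Antonio", "temp_c": 54, "category": "hot"},
    {"name": "San Ysidro", "temp_c": 20, "category": "warm", "fun": "no"},
    {"name": "Unmeasured", "temp_c": "unknown"},  # hypothetical record with a string value
]
converted = [temp_in_fahrenheit(p) for p in all_properties]
fun_values = [p.get("fun", "unspecified") for p in all_properties]
```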

Querying FeatureCollections

FeatureCollections can be queried spatially, as well as by their key-value properties, with the FeatureCollection.filter method.

Remember that a FeatureCollection represents a Vector product with filters applied to it. That means that each call to FeatureCollection.filter returns a new FeatureCollection instance, still referring to the same underlying product, but with more filters applied. This lets you start with one query, and chain more onto it.

Spatial Filtering

To add a spatial query to a FeatureCollection, pass a GeoJSON geometry dict, or object with __geo_interface__, to the geometry keyword argument of FeatureCollection.filter.

Only Features that intersect that geometry will be selected. Any geometry type can be used (though Point doesn’t make a whole lot of sense).

In [21]: northern_nm_polygon = {
   ....:   "type": "Polygon",
   ....:   "coordinates": [[[-107, 35], [-105, 35], [-105, 37], [-107, 37], [-107, 35]]]
   ....: }
   ....: 

In [22]: northern_nm_springs = nm_hotsprings.filter(geometry=northern_nm_polygon)

In [23]: southern_nm_polygon = {
   ....:   "type": "Polygon",
   ....:   "coordinates": [[[-109, 32], [-106, 32], [-106, 35], [-109, 35], [-109, 32]]]
   ....: }
   ....: 

In [24]: southern_nm_springs = nm_hotsprings.filter(geometry=southern_nm_polygon)

Notice that calling filter returns a copy of the FeatureCollection, not the Features themselves:

In [25]: northern_nm_springs
Out[25]: 
FeatureCollection({
  'description': u"Data from 1980 NOAA 'Thermal Springs List for the United States'",
  'id': u'1e9ee63ed19e49e498acf28fbb472d8',
  'name': u'nm_hotsprings',
  'owners': [u'user:d4ef22d5a6969cb61147ec8ea3e060cdf33e1a49', u'org:descarteslabs'],
  'readers': [],
  'title': u'Geothermal Springs in New Mexico',
  'writers': []
})

The two FeatureCollections (northern_nm_springs and southern_nm_springs) refer to the same Vector product, but will return different data when iterating through .features():

In [26]: list(northern_nm_springs.features())
Out[26]: 
[Feature({
   'geometry': {
     'coordinates': (-106.827, 35.548),
     'type': 'Point'
   },
   'id': u'1e9ee63ed19e49e498acf28fbb472d8_30086dbfe8924392',
   'properties': {
     u'category': u'warm',
     u'fun': u'no',
     u'name': u'San Ysidro',
     u'temp_c': 20
   },
   'type': 'Feature'
 }), Feature({
   'geometry': {
     'coordinates': (-106.646, 35.938),
     'type': 'Point'
   },
   'id': u'1e9ee63ed19e49e498acf28fbb472d8_82089b46cf35407d',
   'properties': {
     u'category': u'hot',
     u'name': u'San Antonio',
     u'temp_c': 54
   },
   'type': 'Feature'
 })]

In [27]: list(southern_nm_springs.features())
Out[27]: 
[Feature({
   'geometry': {
     'coordinates': (-108.209, 33.199),
     'type': 'Point'
   },
   'id': u'1e9ee63ed19e49e498acf28fbb472d8_1c215051502c4173',
   'properties': {
     u'category': u'hot',
     u'name': u'Gila',
     u'temp_c': 66
   },
   'type': 'Feature'
 })]
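
To see why the results split this way, the same test can be reproduced locally. Because both query polygons are axis-aligned rectangles, a simple bounds check is enough here; this is a sketch for these specific shapes, not how the service evaluates intersections, and in_bounds is a hypothetical helper:

```python
def in_bounds(point, min_lon, min_lat, max_lon, max_lat):
    """True if a (longitude, latitude) point lies inside or on a rectangle."""
    lon, lat = point
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


springs = {
    "San Antonio": (-106.646, 35.938),
    "San Ysidro": (-106.827, 35.548),
    "Gila": (-108.209, 33.199),
}
northern_bounds = (-107, 35, -105, 37)  # bounds of northern_nm_polygon
southern_bounds = (-109, 32, -106, 35)  # bounds of southern_nm_polygon

northern = sorted(n for n, pt in springs.items() if in_bounds(pt, *northern_bounds))
southern = sorted(n for n, pt in springs.items() if in_bounds(pt, *southern_bounds))
```

For arbitrary polygons, a geometry library such as Shapely would be the more appropriate tool for a local check.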

Property Filtering

To add a properties filter to a FeatureCollection, the descarteslabs.vectors.properties helper lets you use normal Python operators to specify comparisons that you can pass to the properties keyword argument of FeatureCollection.filter. For example:

In [28]: from descarteslabs.vectors import properties as p

In [29]: very_hot_hotsprings = nm_hotsprings.filter(
   ....:   properties=(p.category == "hot") & (p.temp_c > 60)
   ....: )
   ....: 

In [30]: list(very_hot_hotsprings.features())
Out[30]: 
[Feature({
   'geometry': {
     'coordinates': (-108.209, 33.199),
     'type': 'Point'
   },
   'id': u'1e9ee63ed19e49e498acf28fbb472d8_1c215051502c4173',
   'properties': {
     u'category': u'hot',
     u'name': u'Gila',
     u'temp_c': 66
   },
   'type': 'Feature'
 })]

To refer to a field in your data, just access that attribute by name from descarteslabs.vectors.properties. Then you can use Python binary comparison operators (>, <, >=, <=, ==, !=). like is also supported for pattern-matching in strings; see the like example for more.

To combine these expressions, use & (logical AND), | (logical OR), and parentheses. Using Python and and or will not work as expected:

In [31]: type(p.a > 1 and p.b == 1)  # just returns the `p.b == 1` part
Out[31]: descarteslabs.common.property_filtering.filtering.EqExpression

In [32]: type((p.a > 1) & (p.b == 1)) # AndExpression as intended
Out[32]: descarteslabs.common.property_filtering.filtering.AndExpression
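
The reason is that Python's and cannot be overloaded: it tests the truthiness of the left operand and, if truthy, returns the right operand unchanged. The & operator calls __and__, which a class can define. A toy sketch of the mechanism, using made-up class names that are not the library's:

```python
class Expr:
    def __and__(self, other):
        return And(self, other)

    def __or__(self, other):
        return Or(self, other)


class Gt(Expr):
    def __init__(self, name, value):
        self.name, self.value = name, value


class Eq(Expr):
    def __init__(self, name, value):
        self.name, self.value = name, value


class And(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right


class Or(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right


class Field:
    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Gt(self.name, value)

    def __eq__(self, value):
        return Eq(self.name, value)


a, b = Field("a"), Field("b")
with_and = (a > 1) and (b == 1)   # the Gt object is truthy, so `and` returns the Eq
with_amp = (a > 1) & (b == 1)     # __and__ builds an And holding both sides
```

The parentheses matter as well: & binds more tightly than the comparison operators, so without them the comparison operands would be grouped differently than intended.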

When filtering, the value of a field that doesn’t exist is considered None. Additionally, if the types of a field’s value and the value it’s compared to are incompatible, the comparison evaluates to False.
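
These two rules can be imitated locally to predict which records a comparison would match. A minimal sketch, assuming Python 3 (where comparing incompatible types raises TypeError); matches_greater_than is a hypothetical helper and says nothing about how the service is implemented:

```python
def matches_greater_than(properties, field, threshold):
    """Local imitation of a `field > threshold` filter under the rules above."""
    value = properties.get(field)  # a field that doesn't exist is treated as None
    try:
        return bool(value > threshold)
    except TypeError:
        # Incompatible types (None or a string against a number) evaluate to False
        return False


records = [
    {"name": "San Antonio", "temp_c": 54},
    {"name": "Gila", "temp_c": 66},
    {"name": "No reading"},                   # hypothetical: temp_c missing
    {"name": "Text reading", "temp_c": "66"}, # hypothetical: string instead of number
]
results = [matches_greater_than(r, "temp_c", 60) for r in records]
```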

Retrieving Data

The filters you set aren’t actually applied until you iterate through FeatureCollection.features.

This means you can start with one filtered FeatureCollection and chain other filters onto it:

In [33]: hot_hotsprings = nm_hotsprings.filter(properties=(p.category == "hot"))

In [34]: northern_hot_hotsprings = hot_hotsprings.filter(geometry=northern_nm_polygon)

In [35]: list(northern_hot_hotsprings.features())
Out[35]: 
[Feature({
   'geometry': {
     'coordinates': (-106.646, 35.938),
     'type': 'Point'
   },
   'id': u'1e9ee63ed19e49e498acf28fbb472d8_82089b46cf35407d',
   'properties': {
     u'category': u'hot',
     u'name': u'San Antonio',
     u'temp_c': 54
   },
   'type': 'Feature'
 })]

In [36]: southern_hot_hotsprings = hot_hotsprings.filter(geometry=southern_nm_polygon)

In [37]: list(southern_hot_hotsprings.features())
Out[37]: 
[Feature({
   'geometry': {
     'coordinates': (-108.209, 33.199),
     'type': 'Point'
   },
   'id': u'1e9ee63ed19e49e498acf28fbb472d8_1c215051502c4173',
   'properties': {
     u'category': u'hot',
     u'name': u'Gila',
     u'temp_c': 66
   },
   'type': 'Feature'
 })]

The chained filters are logically ANDed together:

In [38]: southern_notfun_hotsprings = southern_hot_hotsprings.filter(
   ....:   properties=(p.fun == 'no')
   ....: )
   ....: 

In [39]: list(southern_notfun_hotsprings.features())
Out[39]: []

You can’t chain multiple geometries, however—filtering with a new geometry simply replaces the old one.
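
The chaining rules described in this section can be modeled with a toy class over an in-memory list. This is a sketch of the semantics only, not the client's implementation; LazyCollection and its region and predicate parameters are made-up stand-ins for the geometry and properties filters:

```python
class LazyCollection:
    """Toy stand-in for a filtered collection over an in-memory list of dicts."""

    def __init__(self, rows, predicates=(), region=None):
        self._rows = rows
        self._predicates = tuple(predicates)
        self._region = region  # at most one region is kept

    def filter(self, region=None, predicate=None):
        # Return a new object; the original keeps its own filters.
        predicates = self._predicates + ((predicate,) if predicate else ())
        # A new region replaces the old one; predicates accumulate (AND).
        return LazyCollection(self._rows, predicates, region or self._region)

    def features(self):
        # Generator: nothing is evaluated until the caller starts iterating.
        for row in self._rows:
            in_region = self._region is None or self._region(row)
            if in_region and all(pred(row) for pred in self._predicates):
                yield row


rows = [
    {"name": "San Antonio", "lat": 35.938, "category": "hot"},
    {"name": "San Ysidro", "lat": 35.548, "category": "warm"},
    {"name": "Gila", "lat": 33.199, "category": "hot"},
]
north = lambda row: row["lat"] >= 35
south = lambda row: row["lat"] < 35

hot = LazyCollection(rows).filter(predicate=lambda row: row["category"] == "hot")
hot_north = hot.filter(region=north)
hot_south = hot_north.filter(region=south)  # replaces `north` rather than intersecting

names_north = [row["name"] for row in hot_north.features()]
names_south = [row["name"] for row in hot_south.features()]
```

If the two regions were intersected instead of replaced, names_south would be empty; with replacement it contains Gila.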

Note

Because Vector products can potentially contain millions or billions of Features, you must specify some filter in order to iterate through FeatureCollection.features. Not doing so will raise an error:

In [40]: list(nm_hotsprings.features())

BadRequestError                           Traceback (most recent call last)
<ipython-input-40-e1551fc8af13> in <module>()
----> 1 list(nm_hotsprings.features())

/root/.cache/bazel/_bazel_drone-agent-43mw/3517bd091dde6188868082e15543f179/sandbox/processwrapper-sandbox/50/execroot/__main__/bazel-out/k8-fastbuild/genfiles/docs/build_tools/public/descarteslabs/vectors/featurecollection.py in features(self)
    268         )
    269 
--> 270         for response in self.vector_client.search_features(**params):
    271             yield Feature._create_from_jsonapi(response)
    272 

/root/.cache/bazel/_bazel_drone-agent-43mw/3517bd091dde6188868082e15543f179/sandbox/processwrapper-sandbox/50/execroot/__main__/bazel-out/k8-fastbuild/genfiles/docs/build_tools/public/descarteslabs/client/services/vector/vector.py in search_features(self, product_id, geometry, query_expr, query_limit, **kwargs)
    439                 query_limit=query_limit,
    440                 continuation_token=continuation_token,
--> 441                 **kwargs
    442             )
    443 

/root/.cache/bazel/_bazel_drone-agent-43mw/3517bd091dde6188868082e15543f179/sandbox/processwrapper-sandbox/50/execroot/__main__/bazel-out/k8-fastbuild/genfiles/docs/build_tools/public/descarteslabs/client/services/vector/vector.py in _fetch_feature_page(self, product_id, geometry, query_expr, query_limit, continuation_token, **kwargs)
    381         ).items() if v is not None}
    382 
--> 383         r = self.session.post('/products/{}/search'.format(product_id), json=params)
    384         return DotDict(r.json())
    385 

/root/.cache/bazel/_bazel_drone-agent-43mw/3517bd091dde6188868082e15543f179/sandbox/processwrapper-sandbox/50/execroot/__main__/bazel-out/host/bin/docs/build_tools/public_sphinx-build.runfiles/pypi__requests_2_20_1/requests/sessions.py in post(self, url, data, json, **kwargs)
    579         """
    580 
--> 581         return self.request('POST', url, data=data, json=json, **kwargs)
    582 
    583     def put(self, url, data=None, **kwargs):

/root/.cache/bazel/_bazel_drone-agent-43mw/3517bd091dde6188868082e15543f179/sandbox/processwrapper-sandbox/50/execroot/__main__/bazel-out/k8-fastbuild/genfiles/docs/build_tools/public/descarteslabs/client/services/service/service.py in request(self, method, url, **kwargs)
     49             return resp
     50         elif resp.status_code == 400:
---> 51             raise BadRequestError(resp.text)
     52         elif resp.status_code == 404:
     53             raise NotFoundError(resp.text if 'text' in resp else '404 {} {}'.format(method, url))

BadRequestError: {"error":400,"message":"No query given and no limit set: one of geometry, query_expr must be set or a limit given"}

Modifying and deleting Features

Actually, you can’t modify or delete individual Features in a FeatureCollection: Vector products are append-only. Keep this in mind when adding to FeatureCollections; you could use a "version" property or something similar if you expect to create multiple versions of the same Feature.

Currently, the only option for modifying or deleting Features is to create a new Vector product and copy over all the Features from the old one, making changes in the process.
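
If you do adopt a "version" property, the newest version of each Feature can be selected after retrieval. A minimal sketch over plain properties dicts; latest_versions is a hypothetical helper, the name and version keys are assumptions about how you might tag your own data, and the records are invented for illustration:

```python
def latest_versions(all_properties, key="name", version="version"):
    """Keep only the highest-versioned properties dict for each key value."""
    latest = {}
    for props in all_properties:
        current = latest.get(props[key])
        if current is None or props[version] > current[version]:
            latest[props[key]] = props
    return latest


records = [
    {"name": "Gila", "temp_c": 66, "version": 1},
    {"name": "Gila", "temp_c": 67, "version": 2},  # hypothetical correction, appended later
    {"name": "San Antonio", "temp_c": 54, "version": 1},
]
current = latest_versions(records)
```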