Vectors

The Vector service lets you store vector geometries (points, polygons, etc.) along with key-value properties, and query that data spatially and/or by properties.

It’s meant for data at the scale of millions to billions of features. If your data can fit in memory, work with it there — the Vector service is not meant for small datasets, and will be far less performant than working locally.

A typical use for the Vector service that might produce such millions of features is storing the output from Tasks. For example, a computer vision detector might be run in thousands of tasks over many years of data across a continent; the objects it detects could be saved as features for later querying and analysis.

Note

For information about API quotas and limits, see our Quotas & Limits page.

Data Types

The Vector service mirrors GeoJSON by offering two types: Feature and FeatureCollection.

A Feature is a single geometry and key-value properties; a FeatureCollection holds many Features, with an id and access controls.

The key-value properties of Features are schemaless: Features in the same FeatureCollection do not all have to have the same keys present, or the same types for their values. Features cannot be modified after they have been added, but they can be removed.

Feature

A Feature is a single GeoJSON Geometry, a dict of properties, and a unique id:

>>> import descarteslabs as dl
>>> import shapely.geometry
>>>
>>> feature = dl.vectors.Feature(
...     geometry={
...         'type': 'Polygon',
...         'coordinates': [[[-95, 42], [-93, 42], [-93, 40], [-95, 41], [-95, 42]]]
...     },
...     properties={
...         "temperature": 70.13,
...         "size": "large",
...         "tags": None
...     }
... )
...
>>> feature
Feature({
  'geometry': {
    'coordinates': (((-95.0, 42.0), (-93.0, 42.0), (-93.0, 40.0), (-95.0, 41.0), (-95.0, 42.0)),),
    'type': 'Polygon'
  },
  'id': None,
  'properties': {
    'size': 'large',
    'tags': None,
    'temperature': 70.13
  },
  'type': 'Feature'
})

Unlike GeoJSON, the values in properties can only be strings (up to 256 characters), integers, floats, or the value None. Therefore, properties doesn’t support nesting (containing more dictionaries or lists).
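
The constraints above can be checked locally before uploading. The following is a minimal sketch: validate_properties is a hypothetical helper, not part of the descarteslabs client, and the service's own validation may differ in details (for example, in how it treats booleans, which Python counts as integers).

```python
MAX_STRING_LENGTH = 256  # string limit stated in this guide


def validate_properties(properties):
    """Return a list of problems found in a Feature properties dict.

    Hypothetical helper: it only mirrors the rules described in this guide.
    """
    problems = []
    for key, value in properties.items():
        if value is None or isinstance(value, (int, float)):
            continue  # numbers and None are allowed (bool counts as int here)
        if isinstance(value, str):
            if len(value) > MAX_STRING_LENGTH:
                problems.append(
                    "%s: string longer than %d characters" % (key, MAX_STRING_LENGTH)
                )
            continue
        # dicts, lists, and any other type are not supported
        problems.append("%s: unsupported type %s" % (key, type(value).__name__))
    return problems


ok = validate_properties({"temperature": 70.13, "size": "large", "tags": None})
bad = validate_properties({"tags": ["a", "b"], "note": "x" * 300})
```

Here ok is an empty list, while bad reports the nested list and the over-long string.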

geometry must be a primitive GeoJSON geometry (Point, MultiPoint, Polygon, MultiPolygon, LineString, MultiLineString, GeometryCollection). Using a Feature or FeatureCollection will raise an error. As a GeoJSON geometry, the coordinates are assumed to be (longitude, latitude) in WGS84 decimal degrees (EPSG:4326), with planar edges.
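
A common mistake is passing a whole GeoJSON Feature, or swapping latitude and longitude. The sketch below shows one way to catch both before constructing a Feature. check_geometry and looks_like_lon_lat are hypothetical helpers written for this guide, and the range check is only a heuristic: a swapped pair whose values both fall within ±90 will not be detected.

```python
PRIMITIVE_TYPES = {
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon", "GeometryCollection",
}


def check_geometry(geojson):
    """Raise ValueError if `geojson` is not a primitive GeoJSON geometry."""
    geom_type = geojson.get("type")
    if geom_type not in PRIMITIVE_TYPES:
        raise ValueError("expected a GeoJSON geometry, got %r" % geom_type)


def looks_like_lon_lat(point):
    """Cheap sanity check that a coordinate pair is (longitude, latitude)."""
    lon, lat = point[0], point[1]
    return -180 <= lon <= 180 and -90 <= lat <= 90


check_geometry({"type": "Point", "coordinates": [-106.646, 35.938]})  # passes
right_order = looks_like_lon_lat((-106.646, 35.938))
wrong_order = looks_like_lon_lat((35.938, -106.646))  # latitude out of range
```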

You don’t need to—and shouldn’t—set id yourself.

The geometry you pass in is converted to a Shapely shape:

>>> feature.geometry
<shapely.geometry.polygon.Polygon object at 0x7fe97eae7cd0>

The properties are stored in a DotDict, which allows you to refer to values by key or by attribute access, making the syntax for getting and setting properties more convenient:

>>> feature.properties['temperature']
70.13
>>>
>>> feature.properties.temperature
70.13
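
To illustrate how a dictionary can support both access styles, here is a minimal sketch. AttrDict is an illustrative class written for this guide; it is not the DotDict shipped with the client, which may behave differently (for example, with nested values).

```python
class AttrDict(dict):
    """A dict whose keys can also be read and written as attributes.

    Illustrative only; not the DotDict class shipped with the client.
    """

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


props = AttrDict({"temperature": 70.13, "size": "large"})
props.size = "small"  # same as props["size"] = "small"
same = props.temperature == props["temperature"]
```

One consequence of this design is that a key sharing its name with a dict method (such as "keys") is reachable only with bracket syntax.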

FeatureCollection

In the Vector service, you create products with an id, description, and access controls to hold a collection of Features.

A FeatureCollection represents one of those vector products with some filters applied to it.

A FeatureCollection doesn’t actually contain data. Instead, filter() sets up the filters to be used, and features() returns an iterator over the matching Features; the query is performed when the first value is retrieved from that iterator.

filter() returns a new FeatureCollection instance, allowing you to partially apply and chain filters.

Creating FeatureCollections

To see existing FeatureCollections that you have access to, use list():

>>> fcs = dl.vectors.FeatureCollection.list()
>>> fcs[:3]
[FeatureCollection({
  u'description': u'',
  'id': u'noaa_tornado_reports',
  u'name': u'noaa_tornado_reports',
  u'title': u'NOAA Tornado Reports'
}), FeatureCollection({
  u'description': u'',
  'id': u'06d1f4694ead46a49f6b32194dfadac',
  u'name': u'us_congressional_districts_area',
  u'title': u'Congressional Districts of the USA'
}), FeatureCollection({
  u'description': u'This product was created using an example file.',
  'id': u'6c7945a01f1842d983417223b226673',
  u'name': u'my_test_product',
  u'owners': [u'user:d4ef22d5a6969cb61147ec8ea3e060cdf33e1a49', u'org:descarteslabs'],
  u'readers': [],
  u'title': u'My Test Product',
  u'writers': []
})]

You can instantiate a FeatureCollection object from an existing Vector product using its ID:

>>> us_cities_fc = dl.vectors.FeatureCollection("d1349cc2d8854d998aa6da92dc2bd24")
>>> us_cities_fc
FeatureCollection({
  u'description': u'',
  'id': 'd1349cc2d8854d998aa6da92dc2bd24',
  u'name': u'us_cities_area',
  u'title': u'Cities of the USA'
})

To create a new Vector product, use create(). You must supply an id, a human-readable title, and a description. You can also supply optional owners, readers, and writers lists.

Ids must be fewer than 204 characters long and may only contain alphanumeric characters, dashes (-), and underscores (_).

>>> import uuid
>>>
>>> new_id = str(uuid.uuid4())
>>> mountains_of_middle_earth = dl.vectors.FeatureCollection.create(
...     product_id="mountains_of_middle_earth" + new_id,
...     title="Mountains of Middle Earth",
...     description="Nice spots to climb around the Shire"
... )
...

Your user id is prepended to the id in the created FeatureCollection:

>>> mountains_of_middle_earth.id
u'd4ef22d5a6969cb61147ec8ea3e060cdf33e1a49:mountains_of_middle_earth1d37c3f5-4502-4b31-beb4-6a876badad88'
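
The id rules stated earlier (fewer than 204 characters; alphanumerics, dashes, and underscores only) can be checked before calling create(). This is a minimal sketch: is_valid_product_id is a hypothetical helper, it assumes "alphanumeric" means ASCII letters and digits, and it assumes the length limit applies to the id you pass in rather than to the id with your user id prepended.

```python
import re

# Fewer than 204 characters; ASCII letters, digits, dashes, and underscores only.
_PRODUCT_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,203}")


def is_valid_product_id(product_id):
    """Return True if `product_id` satisfies the rules described in this guide."""
    return _PRODUCT_ID_RE.fullmatch(product_id) is not None


good = is_valid_product_id("mountains_of_middle_earth")
has_spaces = is_valid_product_id("mountains of middle earth")
too_long = is_valid_product_id("a" * 204)
```

The check applies to the id you supply; the prefixed id returned by the service contains a colon and would not pass it.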

Modifying FeatureCollections

To modify the metadata of a FeatureCollection (its title, description, access control lists, etc.), use update():

>>> mountains_of_middle_earth.update(
...     description="Mt. Doom is on private land; landowner not climber-friendly",
...     writers=mountains_of_middle_earth.writers + ['org:descarteslabs']
... )
...
>>> mountains_of_middle_earth
FeatureCollection({
  u'description': u'Mt. Doom is on private land; landowner not climber-friendly',
  'id': u'd4ef22d5a6969cb61147ec8ea3e060cdf33e1a49:...e_earth1d37c3f5-4502-4b31-beb4-6a876badad88',
  u'name': None,
  u'owners': [u'org:descarteslabs', u'user:d4ef22d5a6969cb61147ec8ea3e060cdf33e1a49'],
  u'readers': [],
  u'title': u'Mountains of Middle Earth',
  u'writers': [u'org:descarteslabs']
})

Delete a Vector product that you own using delete():

>>> product_id = mountains_of_middle_earth.id
>>>
>>> before_delete = product_id in [fc.id for fc in dl.vectors.FeatureCollection.list()]
>>> before_delete
True
>>>
>>> mountains_of_middle_earth.delete()
>>>
>>> after_delete = product_id in [fc.id for fc in dl.vectors.FeatureCollection.list()]
>>> after_delete
False

Adding Features

To add Features to a FeatureCollection, use add() and pass in a Feature instance, or a list of them. The method returns a copy of the Features, with the id property now set.

>>> # first, we need a FeatureCollection to add to
>>> nm_hotsprings = dl.vectors.FeatureCollection.create(
...     product_id="nm_hotsprings" + new_id,
...     title="Geothermal Springs in New Mexico",
...     description="Data from 1980 NOAA 'Thermal Springs List for the United States'"
... )
...

Make some Features:

>>> hotsprings_features = [
...     dl.vectors.Feature(
...         shapely.geometry.Point(-106.646, 35.938),
...         {'name': 'San Antonio', 'temp_c': 54, 'category': 'hot'}
...     ),
...     dl.vectors.Feature(
...         shapely.geometry.Point(-106.827, 35.548),
...         {'name': 'San Ysidro', 'temp_c': 20, 'category': 'warm', 'fun': 'no'}
...     ),
...     dl.vectors.Feature(
...         shapely.geometry.Point(-108.209, 33.199),
...         {'name': 'Gila', 'temp_c': 66, 'category': 'hot'}
...     ),
... ]
...
>>> added_features = nm_hotsprings.add(hotsprings_features)
>>> added_features
[Feature({
  'geometry': {
    'coordinates': (-106.646, 35.938),
    'type': 'Point'
  },
  'id': u'ace82e07219ee25a398d2ca351a1346c0e6a9b83d...90b0e342b6960919bbb9d20ec9_2b49ef73fd6548a1',
  'properties': {
    'category': 'hot',
    'name': 'San Antonio',
    'temp_c': 54
  },
  'type': 'Feature'
}), Feature({
  'geometry': {
    'coordinates': (-106.827, 35.548),
    'type': 'Point'
  },
  'id': u'ace82e07219ee25a398d2ca351a1346c0e6a9b83d...90b0e342b6960919bbb9d20ec9_d9a699fbe6704ac3',
  'properties': {
    'category': 'warm',
    'fun': 'no',
    'name': 'San Ysidro',
    'temp_c': 20
  },
  'type': 'Feature'
}), Feature({
  'geometry': {
    'coordinates': (-108.209, 33.199),
    'type': 'Point'
  },
  'id': u'ace82e07219ee25a398d2ca351a1346c0e6a9b83d...90b0e342b6960919bbb9d20ec9_db16cf76c7864862',
  'properties': {
    'category': 'hot',
    'name': 'Gila',
    'temp_c': 66
  },
  'type': 'Feature'
})]

Notice that the returned Features have an ID set.

Features do not need to follow a fixed schema: notice how the Feature for San Ysidro hot springs has {'fun': 'no'} set, whereas the other hot springs do not have a fun property (since obviously they both are). Features can have different properties, or different types of values for properties of the same name, so be prepared for this in your code.
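
One way to be prepared is to read properties defensively. The sketch below works on plain dicts standing in for Feature properties; the third record and the numeric_values helper are invented for illustration.

```python
features = [
    {"name": "San Antonio", "temp_c": 54, "category": "hot"},
    {"name": "San Ysidro", "temp_c": 20, "category": "warm", "fun": "no"},
    {"name": "Unknown", "temp_c": "n/a"},  # hypothetical record with a string temperature
]


def numeric_values(records, key):
    """Yield values of `key` that are real numbers, skipping missing or mistyped ones."""
    for record in records:
        value = record.get(key)  # None when the key is absent
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            yield value


temps = list(numeric_values(features, "temp_c"))
fun_flags = [record.get("fun", "unspecified") for record in features]
```

Using get() with a default avoids a KeyError for absent keys, and the type check keeps a stray string from breaking later arithmetic.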

Querying FeatureCollections

FeatureCollections can be queried spatially, as well as by their key-value properties, with the filter() method.

Remember that a FeatureCollection represents a Vector product with filters applied to it. That means that each call to filter() returns a new FeatureCollection instance, still referring to the same underlying product, but with more filters applied. This lets you start with one query, and chain more onto it.

Spatial Filtering

To add a spatial query to a FeatureCollection, pass a GeoJSON geometry dict, or object with __geo_interface__, to the geometry keyword argument of filter().

Only Features that intersect that geometry will be selected. Any geometry type can be used (though Point doesn’t make a whole lot of sense).

>>> northern_nm_polygon = {
...     "type": "Polygon",
...     "coordinates": [[[-107, 35], [-105, 35], [-105, 37], [-107, 37], [-107, 35]]]
... }
...
>>> northern_nm_springs = nm_hotsprings.filter(geometry=northern_nm_polygon)
>>>
>>> southern_nm_polygon = {
...     "type": "Polygon",
...     "coordinates": [[[-109, 32], [-106, 32], [-106, 35], [-109, 35], [-109, 32]]]
... }
...
>>> southern_nm_springs = nm_hotsprings.filter(geometry=southern_nm_polygon)

Notice that calling filter() returns a copy of the FeatureCollection, not the Features themselves:

>>> northern_nm_springs
FeatureCollection({
  'description': u"Data from 1980 NOAA 'Thermal Springs List for the United States'",
  'id': u'd4ef22d5a6969cb61147ec8ea3e060cdf33e1a49:...springs1d37c3f5-4502-4b31-beb4-6a876badad88',
  'name': None,
  'owners': [u'user:d4ef22d5a6969cb61147ec8ea3e060cdf33e1a49', u'org:descarteslabs'],
  'readers': [],
  'title': u'Geothermal Springs in New Mexico',
  'writers': []
})

The two FeatureCollections (northern_nm_springs and southern_nm_springs) refer to the same Vector product, but will return different data when iterating through features():

>>> nnm_features = list(northern_nm_springs.features())
>>> nnm_features
[Feature({
  'geometry': {
    'coordinates': (-106.827, 35.548),
    'type': 'Point'
  },
  'id': u'ace82e07219ee25a398d2ca351a1346c0e6a9b83d...90b0e342b6960919bbb9d20ec9_d9a699fbe6704ac3',
  'properties': {
    u'category': u'warm',
    u'fun': u'no',
    u'name': u'San Ysidro',
    u'temp_c': 20
  },
  'type': 'Feature'
}), Feature({
  'geometry': {
    'coordinates': (-106.646, 35.938),
    'type': 'Point'
  },
  'id': u'ace82e07219ee25a398d2ca351a1346c0e6a9b83d...90b0e342b6960919bbb9d20ec9_2b49ef73fd6548a1',
  'properties': {
    u'category': u'hot',
    u'name': u'San Antonio',
    u'temp_c': 54
  },
  'type': 'Feature'
})]
>>>
>>> snm_features = list(southern_nm_springs.features())
>>> snm_features
[Feature({
  'geometry': {
    'coordinates': (-108.209, 33.199),
    'type': 'Point'
  },
  'id': u'ace82e07219ee25a398d2ca351a1346c0e6a9b83d...90b0e342b6960919bbb9d20ec9_db16cf76c7864862',
  'properties': {
    u'category': u'hot',
    u'name': u'Gila',
    u'temp_c': 66
  },
  'type': 'Feature'
})]
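
To see why each query returns the Features it does, the intersection test for points can be reproduced locally. The following sketch uses a planar ray-casting check against a single exterior ring, consistent with the planar edges mentioned earlier. It illustrates the geometry only and is not how the service evaluates queries; point_in_polygon is a helper written for this guide, and it ignores holes and points lying exactly on an edge.

```python
def point_in_polygon(point, ring):
    """Ray-casting test for a point against one closed exterior ring (planar)."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


northern_ring = [(-107, 35), (-105, 35), (-105, 37), (-107, 37), (-107, 35)]
southern_ring = [(-109, 32), (-106, 32), (-106, 35), (-109, 35), (-109, 32)]

springs = {
    "San Antonio": (-106.646, 35.938),
    "San Ysidro": (-106.827, 35.548),
    "Gila": (-108.209, 33.199),
}

in_north = sorted(n for n, pt in springs.items() if point_in_polygon(pt, northern_ring))
in_south = sorted(n for n, pt in springs.items() if point_in_polygon(pt, southern_ring))
# in_north == ['San Antonio', 'San Ysidro'], in_south == ['Gila']
```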

Property Filtering

To add a properties filter to a FeatureCollection, use the descarteslabs.vectors.properties helper. It lets you write comparisons with normal Python operators and pass the result to the properties keyword argument of filter(). For example:

>>> from descarteslabs.vectors import properties as p
>>>
>>> very_hot_hotsprings = nm_hotsprings.filter(
...     properties=(p.category == "hot") & (p.temp_c > 60)
... )
...
>>> vh_features = list(very_hot_hotsprings.features())
>>> vh_features
[Feature({
  'geometry': {
    'coordinates': (-108.209, 33.199),
    'type': 'Point'
  },
  'id': u'ace82e07219ee25a398d2ca351a1346c0e6a9b83d...90b0e342b6960919bbb9d20ec9_db16cf76c7864862',
  'properties': {
    u'category': u'hot',
    u'name': u'Gila',
    u'temp_c': 66
  },
  'type': 'Feature'
})]

To refer to a field in your data, just access that attribute by name from descarteslabs.vectors.properties. Then you can use Python binary comparison operators (>, <, >=, <=, ==, !=). like is also supported for pattern-matching in strings; see the like example for more.

To combine these expressions, use & (logical AND), | (logical OR), and parentheses. Using Python and and or will not work as expected:

>>> type(p.a > 1 and p.b == 1)  # just returns the `p.b == 1` part
<class 'descarteslabs.common.property_filtering.filtering.EqExpression'>
>>> type((p.a > 1) & (p.b == 1)) # AndExpression as intended
<class 'descarteslabs.common.property_filtering.filtering.AndExpression'>
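
The reason is a general Python rule: and and or cannot be overloaded, and they simply return one of their operands, whereas & and | call methods that a class can define. The sketch below demonstrates this with small stand-in classes; Expr and Field are written for this guide and are not the library's classes.

```python
class Expr:
    """Tiny stand-in for a filter expression; not the library's classes."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):
        return Expr("(%s AND %s)" % (self.text, other.text))

    def __or__(self, other):
        return Expr("(%s OR %s)" % (self.text, other.text))


class Field:
    """Stand-in for a property reference such as p.a."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Expr("%s > %r" % (self.name, value))

    def __eq__(self, value):
        return Expr("%s == %r" % (self.name, value))


a, b = Field("a"), Field("b")

# `and` sees a truthy first operand and returns the second one unchanged.
with_keyword = (a > 1 and b == 1).text     # 'b == 1'
# `&` calls Expr.__and__, which can build a combined expression.
with_operator = ((a > 1) & (b == 1)).text  # '(a > 1 AND b == 1)'
```

The parentheses around each comparison matter too: & binds more tightly than > and ==, so a > 1 & b == 1 would be parsed differently.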

When filtering, the value of a field that doesn’t exist is considered None. Additionally, if the types of a field’s value and the value it’s compared to are incompatible, the comparison evaluates to False.
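
These two rules can be emulated locally to predict which Features a filter will match. The sketch below is an approximation written for this guide, not the service's implementation: compare is a hypothetical helper, and it makes no claim about how the service treats != between incompatible types.

```python
import operator


def compare(properties, key, op, value):
    """Evaluate `properties[key] <op> value` with the rules described above."""
    field_value = properties.get(key)  # a missing field is treated as None
    try:
        return bool(op(field_value, value))
    except TypeError:
        return False  # incompatible types: the comparison is False


san_antonio = {"name": "San Antonio", "temp_c": 54, "category": "hot"}

hot_enough = compare(san_antonio, "temp_c", operator.gt, 50)    # number vs number
missing_field = compare(san_antonio, "fun", operator.eq, "no")  # None vs string
mismatched = compare(san_antonio, "name", operator.gt, 60)      # string vs number
```

Only hot_enough is True; the missing field and the mismatched types both yield False.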

Retrieving Data

The filters you set aren’t actually applied until you iterate through features().

This means you can start with one filtered FeatureCollection and chain other filters onto it:

>>> hot_hotsprings = nm_hotsprings.filter(properties=(p.category == "hot"))
>>> northern_hot_hotsprings = hot_hotsprings.filter(geometry=northern_nm_polygon)
>>> southern_hot_hotsprings = hot_hotsprings.filter(geometry=southern_nm_polygon)
>>>
>>> nnm_features = list(northern_hot_hotsprings.features())
>>> nnm_features
[Feature({
  'geometry': {
    'coordinates': (-106.646, 35.938),
    'type': 'Point'
  },
  'id': u'ace82e07219ee25a398d2ca351a1346c0e6a9b83d...90b0e342b6960919bbb9d20ec9_2b49ef73fd6548a1',
  'properties': {
    u'category': u'hot',
    u'name': u'San Antonio',
    u'temp_c': 54
  },
  'type': 'Feature'
})]
>>>
>>> snm_features = list(southern_hot_hotsprings.features())
>>> snm_features
[Feature({
  'geometry': {
    'coordinates': (-108.209, 33.199),
    'type': 'Point'
  },
  'id': u'ace82e07219ee25a398d2ca351a1346c0e6a9b83d...90b0e342b6960919bbb9d20ec9_db16cf76c7864862',
  'properties': {
    u'category': u'hot',
    u'name': u'Gila',
    u'temp_c': 66
  },
  'type': 'Feature'
})]

The chained filters are logically ANDed together:

>>> southern_notfun_hotsprings = southern_hot_hotsprings.filter(
...     properties=(p.fun == 'no')
... )
...
>>> notfun_features = list(southern_notfun_hotsprings.features())
>>> notfun_features
[]

You can’t chain multiple geometries, however—filtering with a new geometry simply replaces the old one.
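
The chaining behavior described in this section (each filter() call returns a new object, property filters accumulate, and a new geometry replaces the old one) can be modeled with a small immutable class. FilteredView is an illustrative stand-in written for this guide, with strings in place of real geometries and expressions; it is not the FeatureCollection implementation.

```python
class FilteredView:
    """Immutable description of a query; illustrative stand-in only."""

    def __init__(self, product_id, geometry=None, property_filters=()):
        self.product_id = product_id
        self.geometry = geometry
        self.property_filters = tuple(property_filters)

    def filter(self, geometry=None, properties=None):
        # A new geometry replaces the old one; otherwise keep the existing one.
        new_geometry = geometry if geometry is not None else self.geometry
        # Property filters accumulate and are ANDed when the query runs.
        new_filters = self.property_filters
        if properties is not None:
            new_filters = new_filters + (properties,)
        return FilteredView(self.product_id, new_geometry, new_filters)


base = FilteredView("nm_hotsprings")
hot = base.filter(properties="category == 'hot'")
north_hot = hot.filter(geometry="northern polygon")
south_hot_notfun = north_hot.filter(geometry="southern polygon", properties="fun == 'no'")
```

Because filter() never mutates its receiver, base and hot stay usable as starting points for other queries after north_hot and south_hot_notfun are derived from them.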

Note

Because Vector products can potentially contain millions or billions of Features, you must specify some filter in order to iterate through features(). Not doing so will raise an error:

>>> list(nm_hotsprings.features())
Traceback (most recent call last):
  ...
BadRequestError: {"errors": [{"status": "400", "detail": "No query given and no limit set: one of geometry, query_expr must be set or a limit given"}]}

Modifying and deleting Features

It isn’t possible to modify individual Features in a FeatureCollection. Keep this in mind when adding data that might change to a FeatureCollection. If you want to retain multiple versions of Features, you could use a "version" property on your Features to make it possible to query for different versions of your data.
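
As a sketch of the versioning idea, the snippet below picks the newest record per site from plain dicts standing in for Feature properties. The "site" key, the sample values, and the latest_versions helper are all invented for illustration; with real data you would choose whatever property identifies the same real-world object across versions.

```python
records = [
    {"site": "Gila", "temp_c": 66, "version": 1},
    {"site": "Gila", "temp_c": 67, "version": 2},  # hypothetical re-measurement
    {"site": "San Antonio", "temp_c": 54, "version": 1},
]


def latest_versions(items, key="site"):
    """Keep only the highest-version record for each value of `key`."""
    latest = {}
    for item in items:
        current = latest.get(item[key])
        if current is None or item["version"] > current["version"]:
            latest[item[key]] = item
    return list(latest.values())


current_records = latest_versions(records)
```

Alternatively, when only one specific version is needed, a property filter on the version value at query time avoids fetching older records at all.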

If you need to make changes to features from a Vector product and you don’t want multiple versions of Features, you can create a new Vector product, copy over all the Features from the old one that you want to keep, and make any changes to them in the process of copying.

It is possible to delete Features using a filter. You apply a filter much the same way you do to retrieve Features from a FeatureCollection; however, you cannot use limit() when deleting Features. Once you have created your filters, call delete_features() to start removing Features from the collection.

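>>> from descarteslabs.vectors import FailedJobError
>>>
>>> hot_hotsprings = nm_hotsprings.filter(properties=(p.category == "hot"))
>>>
>>> # delete_features() starts the job and returns a DeleteJob
>>> delete_job = hot_hotsprings.delete_features()
>>>
>>> try:
...     # block until the job is done; a failed job is reported as FailedJobError
...     delete_job.wait_for_completion()
... except FailedJobError:
...     print(delete_job.state, delete_job.errors)
...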

Because deleting Features can take a long time, delete_features() returns a DeleteJob. If you need to ensure the DeleteJob ran successfully, you can use wait_for_completion() to block until the job reports that it is done. You can then check the state of the job, how long it took to complete, and what errors, if any, occurred.

It’s only possible to run a single DeleteJob at a time per Vector product. If you have multiple filters you need to apply, you’ll need to wait for each job to complete before running the next delete filter.
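
A minimal sketch of that sequential pattern follows. delete_sequentially is a hypothetical helper that relies only on the delete_features(), wait_for_completion(), and state names mentioned in this guide. The stub classes and the "PENDING" and "DONE" state values are stand-ins so the example runs without the service; they are not the real classes or state names.

```python
def delete_sequentially(filtered_collections):
    """Run one delete per filtered collection, waiting for each to finish.

    Assumes each object offers delete_features(), and that the returned job
    offers wait_for_completion() and a `state` attribute.
    """
    states = []
    for collection in filtered_collections:
        job = collection.delete_features()
        job.wait_for_completion()  # block so the next delete does not overlap
        states.append(job.state)
    return states


# Stand-ins so the sketch runs without the service; not the real classes.
class _StubJob:
    state = "PENDING"  # hypothetical state value

    def wait_for_completion(self):
        self.state = "DONE"  # hypothetical state value


class _StubCollection:
    def delete_features(self):
        return _StubJob()


states = delete_sequentially([_StubCollection(), _StubCollection()])
```

With real FeatureCollections you would pass the filtered instances in place of the stubs and decide how to handle a job that fails partway through the list.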