SceneCollection

Create a SceneCollection by searching:

In [1]: import descarteslabs as dl

In [2]: import numpy as np

In [3]: aoi_geometry = {'type': 'Polygon',
   ...:  'coordinates': (((-95.27841503861751, 42.76556057019057),
   ...:    (-93.15675252485482, 42.36289849433184),
   ...:    (-93.73350276458868, 40.73810018004927),
   ...:    (-95.79766011799035, 41.13809376845988),
   ...:    (-95.27841503861751, 42.76556057019057)),)}
   ...: 

In [4]: scenes, ctx = dl.scenes.search(aoi_geometry, products=["landsat:LC08:PRE:TOAR"], limit=10)

In [5]: scenes
Out[5]: 
SceneCollection of 10 scenes
  * Dates: Apr 18, 2013 to Sep 09, 2013
  * Products: landsat:LC08:PRE:TOAR: 10

Use SceneCollection.each and SceneCollection.filter to subselect Scenes you want:

In [6]: # which month is each scene from?

In [7]: scenes.each.properties.date.month.combine()
Out[7]: [4, 5, 5, 6, 6, 7, 7, 8, 8, 9]

In [8]: spring_scenes = scenes.filter(lambda s: s.properties.date.month <= 6)

In [9]: spring_scenes
Out[9]: 
SceneCollection of 5 scenes
  * Dates: Apr 18, 2013 to Jun 21, 2013
  * Products: landsat:LC08:PRE:TOAR: 5

Operate on related Scenes with SceneCollection.groupby:

In [10]: for month, month_scenes in spring_scenes.groupby("properties.date.month"):
   ....:    print("Month {}: {} scenes".format(month, len(month_scenes)))
   ....: 
Month 4: 1 scenes
Month 5: 2 scenes
Month 6: 2 scenes

Load data with SceneCollection.stack or SceneCollection.mosaic:

In [1]: ctx_lowres = ctx.assign(resolution=120)

In [2]: stack = spring_scenes.stack("red green blue", ctx_lowres)

In [3]: stack.shape
Out[3]: (5, 3, 1845, 1862)

class SceneCollection(iterable=None, raster_client=None)

Holds Scenes, with methods for loading their data.

As a subclass of Collection, the filter, map, and groupby methods and the each property simplify inspection and subselection of contained Scenes.

stack and mosaic rasterize all contained Scenes into an ndarray using a GeoContext.

append(x)

Append x to end of self

extend(x)

Extend self by appending elements from the iterable

filter(predicate)

Returns Collection of items in self for which predicate(item) is True

groupby(*predicates)

Groups items by predicates and yields tuple of (group, items) for each group, where items is a Collection.

Each predicate can be a key function, or a string of dot-chained attributes to use as sort keys.

Examples

>>> import collections
>>> FooBar = collections.namedtuple("FooBar", ["foo", "bar"])
>>> c = Collection([FooBar("a", True), FooBar("b", False), FooBar("a", False)])
>>> for group, items in c.groupby("foo"):
...     print(group, items)
a Collection([FooBar(foo='a', bar=True), FooBar(foo='a', bar=False)])
b Collection([FooBar(foo='b', bar=False)])
>>> for group, items in c.groupby("bar"):
...     print(group, items)
False Collection([FooBar(foo='b', bar=False), FooBar(foo='a', bar=False)])
True Collection([FooBar(foo='a', bar=True)])
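
As a rough illustration of how a string of dot-chained attributes can serve as a grouping key, the sketch below uses only the standard library. It is not the library's implementation: the Props and Item stand-ins and the group_by_attr helper are hypothetical, and it assumes groups come back in sorted key order, as the examples above suggest.

```python
import collections
import itertools
import operator

# Hypothetical stand-ins for objects with nested attributes.
Props = collections.namedtuple("Props", ["month"])
Item = collections.namedtuple("Item", ["name", "properties"])


def group_by_attr(items, dotted_name):
    """Group items by a dot-chained attribute path such as 'properties.month'."""
    key = operator.attrgetter(dotted_name)  # attrgetter resolves dotted paths
    # itertools.groupby only merges adjacent items, so sort by the same key first.
    ordered = sorted(items, key=key)
    return [(group, list(members))
            for group, members in itertools.groupby(ordered, key=key)]


items = [Item("a", Props(5)), Item("b", Props(4)), Item("c", Props(5))]
groups = group_by_attr(items, "properties.month")
for month, members in groups:
    print(month, [m.name for m in members])
```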

map(f)

Returns list of f applied to each item in self, or SceneCollection if f returns Scenes

mosaic(bands, ctx, mask_nodata=True, mask_alpha=True, bands_axis=0, raster_info=False)

Load bands from all scenes, combining them into a single 3D ndarray and optionally masking invalid data.

Where multiple scenes overlap, only data from the scene that comes last in the SceneCollection is used.
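
The "last scene wins" rule can be pictured with a small NumPy sketch. It does not call the Descartes Labs platform: the two rasters are made up, 0 is assumed to be the nodata value, and the sketch assumes that masked pixels in a later raster leave earlier data visible, which the text above does not state explicitly.

```python
import numpy as np

# Two hypothetical single-band rasters on the same 2x3 grid; 0 is the nodata value.
first = np.ma.masked_equal(np.array([[1, 1, 1], [1, 1, 0]]), 0)
last = np.ma.masked_equal(np.array([[0, 2, 2], [0, 0, 0]]), 0)

# Start fully masked, then paint each raster in order so that
# later rasters overwrite earlier ones wherever they have valid data.
mosaic = np.ma.masked_all(first.shape, dtype=first.dtype)
for raster in (first, last):
    valid = ~np.ma.getmaskarray(raster)
    mosaic[valid] = raster[valid]

print(mosaic)
```

Pixels covered by both rasters take their value from `last`; the pixel that neither raster covers stays masked.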

Parameters:
  • bands (str or Sequence[str]) – Band names to load. Can be a single string of band names separated by spaces ("red green blue"), or a sequence of band names (["red", "green", "blue"]). If the alpha band is requested, it must be last in the list to reduce rasterization errors.
  • ctx (GeoContext) – A GeoContext to use when loading each Scene
  • mask_nodata (bool, default True) – Whether to mask out values in each band that equal that band’s nodata sentinel value.
  • mask_alpha (bool, default True) – Whether to mask pixels in all bands where the alpha band of all scenes is 0.
  • bands_axis (int, default 0) –

    Axis along which bands should be located in the returned array. If 0, the array will have shape (band, y, x), if -1, it will have shape (y, x, band).

    It’s usually easier to work with bands as the outermost axis, but when working with large arrays, or with many arrays concatenated together, NumPy operations aggregating each xy point across bands can be slightly faster with bands as the innermost axis.

  • raster_info (bool, default False) – Whether to also return a dict of information about the rasterization of the scenes, including the coordinate system WKT and geotransform matrix. Generally only useful if you plan to upload data derived from this scene back to the Descartes catalog, or use it with GDAL.
Returns:

  • arr (ndarray) – Returned array’s shape will be (band, y, x) if bands_axis is 0, and (y, x, band) if bands_axis is -1. If mask_nodata or mask_alpha is True, arr will be a masked array.
  • raster_info (dict) – If raster_info=True, a raster information dict is also returned.

Raises:
  • ValueError – If requested bands are unavailable, or band names are not given or are invalid. If not all required parameters are specified in the GeoContext. If the SceneCollection is empty.
  • NotFoundError – If a Scene’s ID cannot be found in the Descartes Labs catalog
  • BadRequestError – If the Descartes Labs platform is given unrecognized parameters
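
To make the two bands_axis layouts concrete, here is a minimal NumPy sketch on a made-up array. It only demonstrates the axis order described above and does not load any imagery; the array contents are arbitrary.

```python
import numpy as np

# A hypothetical 3-band, 4x5 raster with bands as the outermost axis: (band, y, x).
bands_first = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# Moving the band axis to the end gives the (y, x, band) layout
# described for bands_axis=-1.
bands_last = np.moveaxis(bands_first, 0, -1)

print(bands_first.shape)  # (3, 4, 5)
print(bands_last.shape)   # (4, 5, 3)

# A per-pixel aggregate across bands gives the same values in either layout;
# only the axis argument changes.
mean_first = bands_first.mean(axis=0)
mean_last = bands_last.mean(axis=-1)
```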

sorted(*predicates, **reverse)

Returns a copy of self, sorted by predicates in ascending order.

Each predicate can be a key function, or a string of dot-chained attributes to use as sort keys. The reverse flag returns results in descending order.

Examples

>>> import collections
>>> FooBar = collections.namedtuple("FooBar", ["foo", "bar"])
>>> X = collections.namedtuple("X", "x")
>>> c = Collection([FooBar(1, X("one")), FooBar(2, X("two")), FooBar(3, X("three"))])
>>> c.sorted("foo")
Collection([FooBar(foo=1, bar=X(x='one')), FooBar(foo=2, bar=X(x='two')), FooBar(foo=3, bar=X(x='three'))])
>>> c.sorted("bar.x")
Collection([FooBar(foo=1, bar=X(x='one')), FooBar(foo=3, bar=X(x='three')), FooBar(foo=2, bar=X(x='two'))])
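
For comparison, the same orderings can be reproduced with the built-in sorted and operator.attrgetter. This is a plain-Python sketch of the idea rather than the Collection implementation, and it uses the FooBar and X tuples from the examples above.

```python
import collections
import operator

FooBar = collections.namedtuple("FooBar", ["foo", "bar"])
X = collections.namedtuple("X", "x")
items = [FooBar(1, X("one")), FooBar(2, X("two")), FooBar(3, X("three"))]

# attrgetter accepts dotted paths, so "bar.x" reaches the nested attribute.
by_nested = sorted(items, key=operator.attrgetter("bar.x"))
print([i.foo for i in by_nested])  # [1, 3, 2]

# reverse=True flips the order to descending.
descending = sorted(items, key=operator.attrgetter("foo"), reverse=True)
print([i.foo for i in descending])  # [3, 2, 1]
```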

stack(bands, ctx, mask_nodata=True, mask_alpha=True, bands_axis=1, raster_info=False, max_workers=None)

Load bands from all scenes and stack them into a 4D ndarray, optionally masking invalid data.

Parameters:
  • bands (str or Sequence[str]) – Band names to load. Can be a single string of band names separated by spaces ("red green blue"), or a sequence of band names (["red", "green", "blue"]). If the alpha band is requested, it must be last in the list to reduce rasterization errors.
  • ctx (GeoContext) – A GeoContext to use when loading each Scene
  • mask_nodata (bool, default True) – Whether to mask out values in each band of each scene that equal that band’s nodata sentinel value.
  • mask_alpha (bool, default True) – Whether to mask pixels in all bands of each scene where the alpha band is 0.
  • bands_axis (int, default 1) – Axis along which bands should be located. If 1, the array will have shape (scene, band, y, x), if -1, it will have shape (scene, y, x, band), etc. A bands_axis of 0 is currently unsupported.
  • raster_info (bool, default False) – Whether to also return a list of dicts about the rasterization of each scene, including the coordinate system WKT and geotransform matrix. Generally only useful if you plan to upload data derived from this scene back to the Descartes catalog, or use it with GDAL.
  • max_workers (int, default None) – Maximum number of threads to use to parallelize individual ndarray calls to each Scene. If None, it defaults to the number of processors on the machine, multiplied by 5. Note that unnecessary threads won’t be created if max_workers is greater than the number of Scenes in the SceneCollection.
Returns:

  • arr (ndarray) – Returned array’s shape is (scene, band, y, x) if bands_axis is 1, or (scene, y, x, band) if bands_axis is -1. If mask_nodata or mask_alpha is True, arr will be a masked array.
  • raster_info (List[dict]) – If raster_info=True, a list of raster information dicts for each scene is also returned

Raises:
  • ValueError – If requested bands are unavailable, or band names are not given or are invalid. If not all required parameters are specified in the GeoContext. If the SceneCollection is empty.
  • NotFoundError – If a Scene’s ID cannot be found in the Descartes Labs catalog
  • BadRequestError – If the Descartes Labs platform is given unrecognized parameters
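
The threading and stacking behavior described above can be approximated with the sketch below. It is an illustration under stated assumptions, not the method's source: load_scene is a hypothetical stand-in that fabricates a small masked (band, y, x) array instead of calling the platform, and 0 is assumed to be the nodata value.

```python
import os
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def load_scene(scene_id):
    """Hypothetical loader: returns a masked (band, y, x) array for one scene."""
    data = np.full((3, 4, 5), scene_id, dtype="uint16")
    data[:, 0, 0] = 0  # pretend 0 is the nodata value for every band
    return np.ma.masked_equal(data, 0)


scene_ids = [1, 2, 3, 4, 5]

# Mirror the documented default (processors x 5), but never more threads than scenes.
max_workers = min((os.cpu_count() or 1) * 5, len(scene_ids))

with ThreadPoolExecutor(max_workers=max_workers) as executor:
    # executor.map preserves input order, so axis 0 follows the order of scene_ids.
    arrays = list(executor.map(load_scene, scene_ids))

stack = np.ma.stack(arrays)  # shape: (scene, band, y, x)
print(stack.shape)
```

Because each per-scene array is already masked, the stacked result is a masked array as well, matching the default bands_axis=1 layout.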

each

Any operations chained onto .each (attribute access, item access, and calls) are applied to each item in the Collection.

Notes

  • Add .combine() at the end of the operations chain to combine the results into a list by default, or any container type passed into .combine()
  • Use .pipe(f, *args, **kwargs) to yield f(x, *args, **kwargs) for each item x yielded by the preceding operations chain

Examples

>>> c = Collection(["one", "two", "three", "four"])
>>> for x in c.each.capitalize():
...     print(x)
One
Two
Three
Four
>>> c.each.capitalize()[:2]
'On'
'Tw'
'Th'
'Fo'
>>> c.each.capitalize().pipe(len)
3
3
5
4
>>> c.each.capitalize().pipe(len).combine(set)
{3, 4, 5}

Yields: item with all operations following .each applied to it
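
One way to read these chains is as shorthand for comprehensions over the items. The plain-Python sketch below spells out rough equivalents for the examples above; it assumes the same four strings and does not use Collection itself.

```python
words = ["one", "two", "three", "four"]

# c.each.capitalize() -> one capitalized string per item
capitalized = [w.capitalize() for w in words]

# c.each.capitalize()[:2] -> item access applied to every result
prefixes = [w.capitalize()[:2] for w in words]

# c.each.capitalize().pipe(len).combine(set) -> results gathered into a set
lengths = {len(w.capitalize()) for w in words}

print(capitalized)  # ['One', 'Two', 'Three', 'Four']
print(prefixes)     # ['On', 'Tw', 'Th', 'Fo']
print(lengths == {3, 4, 5})  # True
```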