SceneCollection

class SceneCollection(iterable=None, raster_client=None)

Holds Scenes, with methods for loading their data.

As a subclass of Collection, the filter, map, and groupby methods and the each property simplify inspection and subselection of contained Scenes.

stack and mosaic rasterize all contained Scenes into an ndarray using a GeoContext.

append(x)

Append x to the end of this Collection

download(bands, ctx, dest, format='tif', resampler='near', processing_level=None, scaling=None, data_type=None, max_workers=None)

Download scenes as image files in parallel.

Parameters
  • bands (str or Sequence[str]) – Band names to load. Can be a single string of band names separated by spaces ("red green blue"), or a sequence of band names (["red", "green", "blue"]).

  • ctx (GeoContext) – A GeoContext to use when loading each Scene

  • dest (str, path-like, or sequence of str or path-like) –

    Directory or sequence of paths to which to write the image files.

    If a directory, files within it will be named by their scene IDs and the bands requested, like "sentinel-2:L1C:2018-08-10_10TGK_68_S2A_v1-red-green-blue.tif".

    If a sequence of paths of the same length as the SceneCollection is given, each Scene will be written to the corresponding path. This lets you use your own naming scheme, or even write images to multiple directories.

    Any intermediate paths are created if they do not exist, for both a single directory and a sequence of paths.

  • format (str, default "tif") –

    Only if a single directory is given as dest: what image format to use. One of “tif”, “png”, or “jpg”.

    If dest is a sequence of paths, format is ignored and determined by the extension on each path.

  • resampler (str, default "near") – Algorithm used to interpolate pixel values when scaling and transforming the image to its new resolution or SRS. Possible values are near (nearest-neighbor), bilinear, cubic, cubicspline, lanczos, average, mode, max, min, med, q1, q3.

  • processing_level (str, optional) – How the processing level of the underlying data should be adjusted. Possible values are toa (top of atmosphere) and surface. For products that support it, surface applies Descartes Labs’ general surface reflectance algorithm to the output.

  • scaling (None, str, list, dict) – Band scaling specification. Please see scaling_parameters() for a full description of this parameter.

  • data_type (None, str) – Output data type. Please see scaling_parameters() for a full description of this parameter.

  • max_workers (int, default None) – Maximum number of threads to use to parallelize individual download calls to each Scene. If None, it defaults to the number of processors on the machine, multiplied by 5. Note that unnecessary threads won’t be created if max_workers is greater than the number of Scenes in the SceneCollection.

Returns

paths – A list of all the paths where files were written.

Return type

Sequence[str]

Example

>>> import descarteslabs as dl
>>> tile = dl.scenes.DLTile.from_key("256:0:75.0:15:-5:230")  
>>> scenes, _ = dl.scenes.search(tile, products=["landsat:LC08:PRE:TOAR"], limit=5)  
>>> scenes.download("red green blue", tile, "rasters")  
["rasters/landsat:LC08:PRE:TOAR:meta_LC80260322013108_v1-red-green-blue.tif",
 "rasters/landsat:LC08:PRE:TOAR:meta_LC80260322013124_v1-red-green-blue.tif",
 "rasters/landsat:LC08:PRE:TOAR:meta_LC80260322013140_v1-red-green-blue.tif",
 "rasters/landsat:LC08:PRE:TOAR:meta_LC80260322013156_v1-red-green-blue.tif",
 "rasters/landsat:LC08:PRE:TOAR:meta_LC80260322013172_v1-red-green-blue.tif"]
>>> # use explicit paths for a custom naming scheme:
>>> paths = [
...     "{tile.key}/l8-{scene.properties.date:%Y-%m-%d-%H:%m}.jpg".format(tile=tile, scene=scene)
...     for scene in scenes
... ]  
>>> scenes.download("nir red", tile, paths)  
["256:0:75.0:15:-5:230/l8-2013-04-18-16:04.jpg",
 "256:0:75.0:15:-5:230/l8-2013-05-04-16:05.jpg",
 "256:0:75.0:15:-5:230/l8-2013-05-20-16:05.jpg",
 "256:0:75.0:15:-5:230/l8-2013-06-05-16:06.jpg",
 "256:0:75.0:15:-5:230/l8-2013-06-21-16:06.jpg"]

Raises
  • RuntimeError – If the paths given are not all unique. If there is an error generating default filenames.

  • ValueError – If requested bands are unavailable, or band names are not given or are invalid. If not all required parameters are specified in the GeoContext. If the SceneCollection is empty. If dest is a sequence not equal in length to the SceneCollection. If format is invalid, or a path has an invalid extension.

  • TypeError – If dest is not a string or a sequence type.

  • NotFoundError – If a Scene’s ID cannot be found in the Descartes Labs catalog

  • BadRequestError – If the Descartes Labs platform is given unrecognized parameters
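
The dest and max_workers rules described above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: resolve_dest and effective_workers are hypothetical helper names, scene IDs are passed as plain strings instead of Scene objects, and extension checking is left out.

```python
import os


def resolve_dest(scene_ids, bands, dest, format="tif"):
    """Return one output path per scene, following the documented `dest` rules."""
    if isinstance(bands, str):
        bands = bands.split()
    if isinstance(dest, (str, os.PathLike)):
        # Single directory: name each file by scene ID and requested bands.
        suffix = "-".join(bands)
        paths = [
            os.path.join(os.fspath(dest), "{}-{}.{}".format(scene_id, suffix, format))
            for scene_id in scene_ids
        ]
    else:
        # Sequence of paths: one explicit path per scene.
        paths = [os.fspath(path) for path in dest]
        if len(paths) != len(scene_ids):
            raise ValueError("dest must contain exactly one path per scene")
    if len(set(paths)) != len(paths):
        raise RuntimeError("paths are not all unique")
    return paths


def effective_workers(n_scenes, max_workers=None):
    """Default to CPU count times 5, never exceeding the number of scenes."""
    if max_workers is None:
        max_workers = (os.cpu_count() or 1) * 5
    return min(max_workers, n_scenes)
```

Passing a directory yields names built from scene ID and bands, while passing a list keeps whatever names the caller chose, which is what allows writing to several directories in one call.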

download_mosaic(bands, ctx, dest=None, format='tif', resampler='near', processing_level=None, scaling=None, data_type=None)

Download all scenes as a single image file. Where multiple scenes overlap, only data from the scene that comes last in the SceneCollection is used.

Parameters
  • bands (str or Sequence[str]) – Band names to load. Can be a single string of band names separated by spaces ("red green blue"), or a sequence of band names (["red", "green", "blue"]).

  • ctx (GeoContext) – A GeoContext to use when loading the Scenes

  • dest (str, path-like object, or file-like object, default None) –

    Where to write the image file.

    • If None (default), it’s written to an image file of the given format in the current directory, named by the requested bands, like "mosaic-red-green-blue.tif"

    • If a string or path-like object, it’s written to that path.

      Any file already existing at that path will be overwritten.

      Any intermediate directories will be created if they don’t exist.

      Note that path-like objects (such as pathlib.Path) are only supported in Python 3.6 or later.

    • If a file-like object, it’s written into that file.

  • format (str, default "tif") –

    If a file-like object or None is given as dest: one of “tif”, “png”, or “jpg”.

    If a str or path-like object is given as dest, format is ignored and determined from the extension on the path (one of “.tif”, “.png”, or “.jpg”).

  • resampler (str, default "near") – Algorithm used to interpolate pixel values when scaling and transforming the image to its new resolution or SRS. Possible values are near (nearest-neighbor), bilinear, cubic, cubicspline, lanczos, average, mode, max, min, med, q1, q3.

  • processing_level (str, optional) – How the processing level of the underlying data should be adjusted. Possible values are toa (top of atmosphere) and surface. For products that support it, surface applies Descartes Labs’ general surface reflectance algorithm to the output.

  • scaling (None, str, list, dict) – Band scaling specification. Please see scaling_parameters() for a full description of this parameter.

  • data_type (None, str) – Output data type. Please see scaling_parameters() for a full description of this parameter.

Returns

path – If dest is a path or None, the path where the image file was written is returned. If dest is file-like, nothing is returned.

Return type

str or None

Example

>>> import descarteslabs as dl
>>> tile = dl.scenes.DLTile.from_key("256:0:75.0:15:-5:230")  
>>> scenes, _ = dl.scenes.search(tile, products=["landsat:LC08:PRE:TOAR"], limit=5)  
>>> scenes.download_mosaic("nir red", tile)  
'mosaic-nir-red.tif'
>>> scenes.download_mosaic("nir red", tile, dest="mosaics/{}.png".format(tile.key))  
'mosaics/256:0:75.0:15:-5:230.png'
>>> with open("another_mosaic.jpg", "wb") as f:
...     scenes.download_mosaic("swir2", tile, dest=f, format="jpg")  

Raises
  • ValueError – If requested bands are unavailable, or band names are not given or are invalid. If not all required parameters are specified in the GeoContext. If the SceneCollection is empty. If format is invalid, or the path has an invalid extension.

  • NotFoundError – If a Scene’s ID cannot be found in the Descartes Labs catalog

  • BadRequestError – If the Descartes Labs platform is given unrecognized parameters
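
How dest and format interact can be sketched as follows. This is a rough illustration under stated assumptions, not the library's code: resolve_mosaic_dest is a hypothetical helper, and anything that is neither None nor a path is simply treated as file-like.

```python
import io
import os

VALID_FORMATS = ("tif", "png", "jpg")


def resolve_mosaic_dest(bands, dest=None, format="tif"):
    """Return (path_or_None, image_format) following the documented rules."""
    if isinstance(bands, str):
        bands = bands.split()
    if isinstance(dest, (str, os.PathLike)):
        # Path given: `format` is ignored, the extension decides.
        path = os.fspath(dest)
        extension = os.path.splitext(path)[1].lstrip(".")
        if extension not in VALID_FORMATS:
            raise ValueError("invalid extension: {!r}".format(path))
        return path, extension
    if format not in VALID_FORMATS:
        raise ValueError("invalid format: {!r}".format(format))
    if dest is None:
        # Default: a file in the current directory, named by the bands.
        return "mosaic-{}.{}".format("-".join(bands), format), format
    # Assumed file-like: nothing to name, only the format matters.
    return None, format


# Usage: an in-memory buffer has no extension, so `format` is required to choose.
in_memory = resolve_mosaic_dest("swir2", io.BytesIO(), format="jpg")
```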

extend(x)

Extend this Collection by appending elements from the iterable

filter(predicate)

Returns a Collection of items for which predicate(item) is True

filter_coverage(geom, minimum_coverage=1)

Include only Scenes overlapping with geom by some fraction.

See Scene.coverage for getting coverage information for a scene.

Parameters
  • geom (GeoJSON-like dict, GeoContext, or object with __geo_interface__) – Geometry to which to compare each Scene’s geometry.

  • minimum_coverage (float) – Only include Scenes that cover geom by at least this fraction.

Returns

scenes

Return type

SceneCollection

Example

>>> import descarteslabs as dl
>>> aoi_geometry = {
...    'type': 'Polygon',
...    'coordinates': [[[-95, 42],[-93, 42],[-93, 40],[-95, 41],[-95, 42]]]}
>>> scenes, ctx = dl.scenes.search(aoi_geometry, products=["landsat:LC08:PRE:TOAR"], limit=20,
...    sort_field='processed')  
>>> filtered_scenes = scenes.filter_coverage(ctx, 0.50)  
>>> assert len(filtered_scenes) < len(scenes)  
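
The coverage test can be pictured as the overlapping area divided by the area of geom. Below is a minimal sketch of that idea, assuming axis-aligned boxes given as (xmin, ymin, xmax, ymax) in place of real geometries; coverage_fraction and filter_by_coverage are hypothetical names, and the library itself compares actual footprints.

```python
def coverage_fraction(scene_box, geom_box):
    """Fraction of geom_box's area that scene_box overlaps (0.0 to 1.0)."""
    sx0, sy0, sx1, sy1 = scene_box
    gx0, gy0, gx1, gy1 = geom_box
    width = min(sx1, gx1) - max(sx0, gx0)
    height = min(sy1, gy1) - max(sy0, gy0)
    if width <= 0 or height <= 0:
        return 0.0  # the boxes do not overlap
    geom_area = (gx1 - gx0) * (gy1 - gy0)
    return (width * height) / geom_area


def filter_by_coverage(scene_boxes, geom_box, minimum_coverage=1):
    """Keep only the boxes covering geom_box by at least minimum_coverage."""
    return [
        box for box in scene_boxes
        if coverage_fraction(box, geom_box) >= minimum_coverage
    ]


area_of_interest = (0, 0, 2, 2)
footprints = [(0, 0, 2, 2), (1, 0, 3, 2), (5, 5, 6, 6)]
half_or_more = filter_by_coverage(footprints, area_of_interest, 0.5)
```

With the default minimum_coverage of 1, only footprints that contain geom entirely would pass.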

groupby(*predicates)

Groups items by predicates and yields tuple of (group, items) for each group, where items is a Collection.

Each predicate can be a key function, or a string of dot-chained attributes to use as sort keys.

Examples

>>> import collections
>>> FooBar = collections.namedtuple("FooBar", ["foo", "bar"])
>>> c = Collection([FooBar("a", True), FooBar("b", False), FooBar("a", False)])
>>> for group, items in c.groupby("foo"):
...     print(group)
...     print(items)
a
Collection([FooBar(foo='a', bar=True), FooBar(foo='a', bar=False)])
b
Collection([FooBar(foo='b', bar=False)])
>>> for group, items in c.groupby("bar"):
...     print(group)
...     print(items)
False
Collection([FooBar(foo='b', bar=False), FooBar(foo='a', bar=False)])
True
Collection([FooBar(foo='a', bar=True)])
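
One way to picture how string and callable predicates are handled is with operator.attrgetter and itertools.groupby from the standard library. This is a sketch under assumptions, not the actual Collection code: group_items and as_key_function are hypothetical names, and plain lists stand in for Collections.

```python
import collections
import itertools
import operator


def as_key_function(predicate):
    """Turn a dot-chained attribute string such as "bar.x" into a key function."""
    if callable(predicate):
        return predicate
    return operator.attrgetter(predicate)


def group_items(items, *predicates):
    """Yield (group, members) pairs, with groups in ascending key order."""
    key_functions = [as_key_function(p) for p in predicates]

    def key(item):
        values = tuple(f(item) for f in key_functions)
        # A single predicate gives a bare value, several give a tuple.
        return values[0] if len(values) == 1 else values

    # itertools.groupby only merges adjacent items, so sort by the same key first.
    ordered = sorted(items, key=key)
    for group, members in itertools.groupby(ordered, key=key):
        yield group, list(members)


FooBar = collections.namedtuple("FooBar", ["foo", "bar"])
items = [FooBar("a", True), FooBar("b", False), FooBar("a", False)]
by_foo = list(group_items(items, "foo"))
```

Because the sort is stable, members of a group keep their original relative order, which matches the output shown in the examples above.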

map(f)

Returns list of f applied to each item in self, or SceneCollection if f returns Scenes

mosaic(bands, ctx, mask_nodata=True, mask_alpha=None, bands_axis=0, resampler='near', processing_level=None, scaling=None, data_type=None, raster_info=False)

Load bands from all scenes, combining them into a single 3D ndarray and optionally masking invalid data.

Where multiple scenes overlap, only data from the scene that comes last in the SceneCollection is used.

If the selected bands and scenes have different data types the resulting ndarray has the most general of those data types. See Scene.ndarray() for details on data type conversions.

Parameters
  • bands (str or Sequence[str]) – Band names to load. Can be a single string of band names separated by spaces ("red green blue"), or a sequence of band names (["red", "green", "blue"]). If the alpha band is requested, it must be last in the list to reduce rasterization errors.

  • ctx (GeoContext) – A GeoContext to use when loading each Scene

  • mask_nodata (bool, default True) – Whether to mask out values in each band that equal that band’s nodata sentinel value.

  • mask_alpha (bool or str or None, default None) – Whether to mask pixels in all bands where the alpha band of all scenes is 0. Provide a string to use an alternate band name for masking. If the alpha band is available for all scenes in the collection and mask_alpha is None, mask_alpha is set to True. If not, mask_alpha is set to False.

  • bands_axis (int, default 0) –

    Axis along which bands should be located in the returned array. If 0, the array will have shape (band, y, x); if -1, it will have shape (y, x, band).

    It’s usually easier to work with bands as the outermost axis, but when working with large arrays, or with many arrays concatenated together, NumPy operations aggregating each xy point across bands can be slightly faster with bands as the innermost axis.

  • raster_info (bool, default False) – Whether to also return a dict of information about the rasterization of the scenes, including the coordinate system WKT and geotransform matrix. Generally only useful if you plan to upload data derived from this scene back to the Descartes catalog, or use it with GDAL.

  • resampler (str, default "near") – Algorithm used to interpolate pixel values when scaling and transforming the image to its new resolution or SRS. Possible values are near (nearest-neighbor), bilinear, cubic, cubicspline, lanczos, average, mode, max, min, med, q1, q3.

  • processing_level (str, optional) – How the processing level of the underlying data should be adjusted. Possible values are toa (top of atmosphere) and surface. For products that support it, surface applies Descartes Labs’ general surface reflectance algorithm to the output.

  • scaling (None, str, list, dict) – Band scaling specification. Please see scaling_parameters() for a full description of this parameter.

  • data_type (None, str) – Output data type. Please see scaling_parameters() for a full description of this parameter.

Returns

  • arr (ndarray) – Returned array’s shape will be (band, y, x) if bands_axis is 0, and (y, x, band) if bands_axis is -1. If mask_nodata or mask_alpha is True, arr will be a masked array. The data type (“dtype”) of the array is the most general of the data types among the scenes being rasterized.

  • raster_info (dict) – If raster_info=True, a raster information dict is also returned.

Raises
  • ValueError – If requested bands are unavailable, or band names are not given or are invalid. If not all required parameters are specified in the GeoContext. If the SceneCollection is empty.

  • NotFoundError – If a Scene’s ID cannot be found in the Descartes Labs catalog

  • BadRequestError – If the Descartes Labs platform is given unrecognized parameters
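
The "last scene wins" rule can be illustrated on small grids. This sketch makes several assumptions: each layer is a single band held as nested lists, None stands for a pixel with no data, and a missing pixel in a later layer leaves earlier data visible (the text above does not state that last point explicitly). mosaic_layers is a hypothetical name; the real method returns a NumPy array.

```python
def mosaic_layers(layers):
    """Combine equally sized 2D layers; later layers overwrite earlier ones."""
    height = len(layers[0])
    width = len(layers[0][0])
    result = [[None] * width for _ in range(height)]
    for layer in layers:  # iterate in collection order, so the last one wins
        for y in range(height):
            for x in range(width):
                if layer[y][x] is not None:
                    result[y][x] = layer[y][x]
    return result


first = [[1, 1], [1, None]]
second = [[None, 2], [None, None]]
combined = mosaic_layers([first, second])
```

Reordering the collection before calling mosaic (for example with sorted) therefore changes which scene's pixels end up on top.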

scaling_parameters(bands, scaling=None, data_type=None)

Computes fully defaulted scaling parameters and output data_type from provided specifications.

This method is provided as a convenience to the user to help with understanding how scaling and data_type parameters passed to other methods on this class (e.g. stack() or mosaic()) will be interpreted. It would not usually be used in a normal workflow.

A scene collection may contain scenes from more than one product, introducing the possibility that the band properties for a band of a given name may differ from product to product. This method works in a similar fashion to the Scene.scaling_parameters method, but it additionally ensures that the resulting scale elements are compatible across the multiple products. If there is an incompatibility, an appropriate ValueError will be raised.

Parameters
  • bands (list(str)) – List of bands to be scaled.

  • scaling (None or str or list or dict) – Band scaling specification. See Scene.scaling_parameters for a full description of this parameter.

  • data_type (None or str) – Result data type desired, as a standard data type string (e.g. "Byte", "Uint16", or "Float64"). If not specified, will be deduced from the scaling specification. See Scene.scaling_parameters for a full description of this parameter.

Returns

  • scales (list(tuple)) – The fully specified scaling parameter, compatible with the Raster API and the output data type.

  • data_type (str) – The result data type as a standard GDAL type string.

Raises

ValueError – If any invalid or incompatible value is passed to any of the three parameters.

See also

Scenes Guide: contains many examples of the use of the scaling and data_type parameters.

sorted(*predicates, **reverse)

Returns a Collection, sorted by predicates in ascending order.

Each predicate can be a key function, or a string of dot-chained attributes to use as sort keys. The reverse flag returns results in descending order.

Examples

>>> import collections
>>> FooBar = collections.namedtuple("FooBar", ["foo", "bar"])
>>> X = collections.namedtuple("X", "x")
>>> c = Collection([FooBar(1, X("one")), FooBar(2, X("two")), FooBar(3, X("three"))])
>>> c.sorted("foo")
Collection([FooBar(foo=1, bar=X(x='one')), FooBar(foo=2, bar=X(x='two')), FooBar(foo=3, bar=X(x='three'))])
>>> c.sorted("bar.x")
Collection([FooBar(foo=1, bar=X(x='one')), FooBar(foo=3, bar=X(x='three')), FooBar(foo=2, bar=X(x='two'))])

stack(bands, ctx, flatten=None, mask_nodata=True, mask_alpha=None, bands_axis=1, raster_info=False, resampler='near', processing_level=None, scaling=None, data_type=None, max_workers=None)

Load bands from all scenes and stack them into a 4D ndarray, optionally masking invalid data.

If the selected bands and scenes have different data types the resulting ndarray has the most general of those data types. See Scene.ndarray() for details on data type conversions.

Parameters
  • bands (str or Sequence[str]) – Band names to load. Can be a single string of band names separated by spaces ("red green blue"), or a sequence of band names (["red", "green", "blue"]). If the alpha band is requested, it must be last in the list to reduce rasterization errors.

  • ctx (GeoContext) – A GeoContext to use when loading each Scene

  • flatten (str, Sequence[str], callable, or Sequence[callable], default None) –

    “Flatten” groups of Scenes in the stack into a single layer by mosaicking each group (such as Scenes from the same day), then stacking the mosaics.

    flatten takes the same predicates as Collection.groupby, such as "properties.date" to mosaic Scenes acquired at the exact same timestamp, or ["properties.date.year", "properties.date.month", "properties.date.day"] to combine Scenes captured on the same day (but not necessarily the same time).

    This is especially useful when ctx straddles a scene boundary and contains one image captured right after another. Instead of having each as a separate layer in the stack, you might want them combined.

    Note that indices in the returned ndarray will no longer correspond to indices in this SceneCollection, since multiple Scenes may be combined into one layer in the stack. You can call groupby on this SceneCollection with the same parameters to iterate through groups of Scenes in equivalent order to the returned ndarray.

    Additionally, the order of scenes in the ndarray will change: they’ll be sorted by the parameters to flatten.

  • mask_nodata (bool, default True) – Whether to mask out values in each band of each scene that equal that band’s nodata sentinel value.

  • mask_alpha (bool or str or None, default None) – Whether to mask pixels in all bands where the alpha band of all scenes is 0. Provide a string to use an alternate band name for masking. If the alpha band is available for all scenes in the collection and mask_alpha is None, mask_alpha is set to True. If not, mask_alpha is set to False.

  • bands_axis (int, default 1) – Axis along which bands should be located. If 1, the array will have shape (scene, band, y, x); if -1, it will have shape (scene, y, x, band), etc. A bands_axis of 0 is currently unsupported.

  • raster_info (bool, default False) – Whether to also return a list of dicts about the rasterization of each scene, including the coordinate system WKT and geotransform matrix. Generally only useful if you plan to upload data derived from this scene back to the Descartes catalog, or use it with GDAL.

  • resampler (str, default "near") – Algorithm used to interpolate pixel values when scaling and transforming each image to its new resolution or SRS. Possible values are near (nearest-neighbor), bilinear, cubic, cubicspline, lanczos, average, mode, max, min, med, q1, q3.

  • processing_level (str, optional) – How the processing level of the underlying data should be adjusted. Possible values are toa (top of atmosphere) and surface. For products that support it, surface applies Descartes Labs’ general surface reflectance algorithm to the output.

  • scaling (None, str, list, dict) – Band scaling specification. Please see scaling_parameters() for a full description of this parameter.

  • data_type (None, str) – Output data type. Please see scaling_parameters() for a full description of this parameter.

  • max_workers (int, default None) – Maximum number of threads to use to parallelize individual ndarray calls to each Scene. If None, it defaults to the number of processors on the machine, multiplied by 5. Note that unnecessary threads won’t be created if max_workers is greater than the number of Scenes in the SceneCollection.

Returns

  • arr (ndarray) – Returned array’s shape is (scene, band, y, x) if bands_axis is 1, or (scene, y, x, band) if bands_axis is -1. If mask_nodata or mask_alpha is True, arr will be a masked array. The data type (“dtype”) of the array is the most general of the data types among the scenes being rasterized.

  • raster_info (List[dict]) – If raster_info=True, a list of raster information dicts for each scene is also returned

Raises
  • ValueError – If requested bands are unavailable, or band names are not given or are invalid. If not all required parameters are specified in the GeoContext. If the SceneCollection is empty.

  • NotFoundError – If a Scene’s ID cannot be found in the Descartes Labs catalog

  • BadRequestError – If the Descartes Labs platform is given unrecognized parameters
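
The effect of flatten on layer count and ordering can be sketched with plain records. The example assumes scenes are (timestamp, name) tuples and groups them by calendar day; flatten_by_day is a hypothetical helper, and the real method mosaics each group into one layer of the returned array instead of returning names.

```python
import datetime
import itertools


def flatten_by_day(scenes):
    """Group (timestamp, name) records by day; each group would become one layer."""

    def day(scene):
        timestamp = scene[0]
        return (timestamp.year, timestamp.month, timestamp.day)

    # Sorting by the grouping key is what reorders the layers.
    ordered = sorted(scenes, key=day)
    return [
        (key, [name for _, name in group])
        for key, group in itertools.groupby(ordered, key=day)
    ]


scenes = [
    (datetime.datetime(2013, 5, 4, 16, 5), "later"),
    (datetime.datetime(2013, 4, 18, 16, 4, 10), "north"),
    (datetime.datetime(2013, 4, 18, 16, 4, 34), "south"),
]
layers = flatten_by_day(scenes)
```

Three scenes collapse into two layers here, and the first layer no longer corresponds to the first scene in the original sequence, which is why indices into the stack and into the SceneCollection stop matching.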

property each

Any operations chained onto each (attribute access, item access, and calls) are applied to each item in the Collection.

Notes

  • Add combine() at the end of the operations chain to combine the results into a list by default, or any container type passed into combine()

  • Use pipe(f, *args, **kwargs) to yield f(x, *args, **kwargs) for each item x yielded by the preceding operations chain

Examples

>>> c = Collection(["one", "two", "three", "four"])
>>> for x in c.each.capitalize():
...     print(x)
One
Two
Three
Four
>>> c.each.capitalize()[:2]
'On'
'Tw'
'Th'
'Fo'
>>> c.each.capitalize().pipe(len)
3
3
5
4
>>> list(c.each.capitalize().pipe(len).combine(set))
[3, 4, 5]

Yields

item with all operations following each applied to it
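
For readers more used to comprehensions, the chains in the examples above correspond roughly to the following plain-Python expressions. This is an equivalence sketch over an ordinary list, not a description of how each is implemented.

```python
words = ["one", "two", "three", "four"]

# c.each.capitalize(): call the method on every item
capitalized = [word.capitalize() for word in words]

# c.each.capitalize()[:2]: item access is applied per item as well
prefixes = [word.capitalize()[:2] for word in words]

# c.each.capitalize().pipe(len): pass each intermediate result to a function
lengths = [len(word.capitalize()) for word in words]

# c.each.capitalize().pipe(len).combine(set): collect results into a container
unique_lengths = set(lengths)
```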