# Catalog

Use the Descartes Labs Catalog to discover existing raster products, search the images contained in them and manage your own products and images.

Note

The Catalog Python object-oriented client provides the functionality previously covered by the more low-level, now deprecated Metadata and Catalog Python clients. There are a few compatibility warnings you can find here.

Note

The Catalog Python client is mainly for discovering data and for managing data. For data analysis and rastering use Scenes.

## Concepts

The Descartes Labs Catalog is a repository for georeferenced images. Commonly these images are either acquired by Earth observation platforms like a satellite or they are derived from other georeferenced images. The catalog is modeled on the following core concepts, each of which is represented by its own class in the API.

### Images

An image (represented by class Image in the API) contains data for a shape on earth, as specified by its georeferencing. An image references one or more files (commonly TIFF or JPEG files) that contain the binary data conforming to the band declaration of its product.

### Bands

A band (represented by class Band) is a 2-dimensional slice of raster data in an image. A product must have at least one band and all images in the product must conform to the declared band structure. For example, an optical sensor will commonly have bands that correspond to the red, blue and green visible light spectrum, which you could raster together to create an RGB image.

### Products

A product (represented by class Product) is a collection of images that share the same band structure. Images in a product can generally be used jointly in a data analysis, as they are expected to have been uniformly processed with respect to data correction, georegistration and so on. For example, you can composite multiple images from a product to run an algorithm over a large geographic region.

Some products correspond directly to image datasets provided by a platform. See for example the Landsat 8 Collection 1 product. This product contains all images taken by the Landsat 8 satellite, is updated continuously as it takes more images, and is processed to NASA’s Collection 1 specification.

A product may also represent data derived from multiple other products or data sources - some may not even derive from Earth observation data. A raster product can contain any sort of image data as long as it’s georeferenced.

## Searching the catalog

All objects support the same search interface. Let’s look at two of the most commonly searched for types of objects: products and images.

### Finding products

#### Filtering and sorting

Product.search() is the entry point for searching products. It returns a query builder that you can use to refine your search and can iterate over to retrieve search results.

Count all products with some data before 2016 using filter():

>>> from descarteslabs.catalog import Product, properties as p
>>> search = Product.search().filter(p.start_datetime < "2016-01-01")
>>> search.count()
72


You can apply multiple filters. To restrict this search to products with data after 2000:

>>> search = search.filter(p.end_datetime > "2000-01-01")
>>> search.count()
40


Of these, get the 3 products with the oldest data, using sort() and limit(). The search is not executed until you start retrieving results by iterating over it:

>>> oldest_search = search.sort("start_datetime").limit(3)
>>> for result in oldest_search:
...     print(result.id)
...
landsat:LT05:PRE:TOAR
dmsp:nightlights
daily-weather:gsod-interpolated:v0


All attributes are documented in the Product API reference, which also spells out which ones can be used to filter or sort.
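To make the deferred execution described above concrete, here is a toy query builder in plain Python. `LazySearch` and `fetch` are hypothetical names invented for this sketch; they are not part of the Catalog client, and the choice to return a new object from every call is an assumption about the design, not something this guide states.

```python
class LazySearch:
    """Toy query builder: collects criteria and evaluates only on iteration."""

    def __init__(self, fetch, predicates=(), sort_key=None, limit=None):
        self._fetch = fetch              # callable standing in for the backend
        self._predicates = tuple(predicates)
        self._sort_key = sort_key
        self._limit = limit

    def filter(self, predicate):
        # Return a new object so that earlier searches stay reusable
        return LazySearch(self._fetch, self._predicates + (predicate,),
                          self._sort_key, self._limit)

    def sort(self, key):
        return LazySearch(self._fetch, self._predicates, key, self._limit)

    def limit(self, n):
        return LazySearch(self._fetch, self._predicates, self._sort_key, n)

    def __iter__(self):
        # The backend is only contacted here
        rows = [r for r in self._fetch() if all(p(r) for p in self._predicates)]
        if self._sort_key is not None:
            rows.sort(key=lambda r: r[self._sort_key])
        if self._limit is not None:
            rows = rows[: self._limit]
        return iter(rows)


calls = []  # records every time the fake backend is queried


def fetch():
    calls.append("fetch")
    return [
        {"id": "a", "start_year": 1984},
        {"id": "b", "start_year": 1992},
        {"id": "c", "start_year": 2013},
    ]


base = LazySearch(fetch)
oldest = base.filter(lambda r: r["start_year"] < 2000).sort("start_year").limit(1)
calls_before_iteration = list(calls)   # still empty: nothing was fetched yet
ids = [r["id"] for r in oldest]        # iterating triggers the single fetch
```

Building `oldest` does not touch the fake backend; only the list comprehension at the end does.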

#### Lookup by id and object relationships

If you know a product’s id, look it up directly with Product.get():

>>> landsat8_collection1 = Product.get("landsat:LC08:01:RT:TOAR")
>>> landsat8_collection1
Product: Landsat 8 Collection 1 Real-Time
id: landsat:LC08:01:RT:TOAR


Wherever there are relationships between objects expect methods such as Product.bands() to find related objects. This shows the first four bands of the Landsat 8 product we looked up:

>>> for band in landsat8_collection1.bands().limit(4):
...     print(band)
...
SpectralBand: coastal-aerosol
id: landsat:LC08:01:RT:TOAR:coastal-aerosol
product: landsat:LC08:01:RT:TOAR
SpectralBand: blue
id: landsat:LC08:01:RT:TOAR:blue
product: landsat:LC08:01:RT:TOAR
SpectralBand: green
id: landsat:LC08:01:RT:TOAR:green
product: landsat:LC08:01:RT:TOAR
SpectralBand: red
id: landsat:LC08:01:RT:TOAR:red
product: landsat:LC08:01:RT:TOAR


Product.bands() returns a search object that can be further refined. This shows all class bands of this Landsat 8 product, sorted by name:

>>> from descarteslabs.catalog import BandType
>>> for band in landsat8_collection1.bands().filter(p.type == BandType.CLASS).sort("name"):
...    print(band)
...
ClassBand: qa_cirrus
id: landsat:LC08:01:RT:TOAR:qa_cirrus
product: landsat:LC08:01:RT:TOAR
ClassBand: qa_cloud
id: landsat:LC08:01:RT:TOAR:qa_cloud
product: landsat:LC08:01:RT:TOAR
ClassBand: qa_saturated
id: landsat:LC08:01:RT:TOAR:qa_saturated
product: landsat:LC08:01:RT:TOAR
ClassBand: qa_snow
id: landsat:LC08:01:RT:TOAR:qa_snow
product: landsat:LC08:01:RT:TOAR
ClassBand: valid-cloudfree
id: landsat:LC08:01:RT:TOAR:valid-cloudfree
product: landsat:LC08:01:RT:TOAR


### Finding images

#### Image filters

Search images by the most common attributes - by product, intersecting with a geometry and by a date range:

>>> from descarteslabs.catalog import Image, Product, properties as p
>>> geometry = {
...     "type": "Polygon",
...     "coordinates": [[
...         [2.915496826171875, 42.044193618165224],
...         [2.838592529296875, 41.92475971933975],
...         [3.043212890625, 41.929868314485795],
...         [2.915496826171875, 42.044193618165224]
...     ]]
... }
...
>>> search = Product.get("landsat:LC08:01:RT:TOAR").images()
>>> search = search.intersects(geometry)
>>> search = search.filter((p.acquired > "2017-01-01") & (p.acquired < "2018-01-01"))
>>> search.count()
14


There are other attributes useful to filter by, documented in the API reference for Image. For example, exclude images with too much cloud cover:

>>> search = search.filter(p.cloud_fraction < 0.2)
>>> search.count()
7


Filtering by cloud_fraction is only reasonable when the product sets this attribute on images. Images that don’t set the attribute are excluded from the filter.
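The effect of a missing attribute can be illustrated with plain dictionaries. This is only a local sketch of the behavior described above; the records and the `cloud_fraction_below` helper are hypothetical and do not call the Catalog.

```python
# Hypothetical image records; img-3 comes from a product that never sets cloud_fraction
images = [
    {"id": "img-1", "cloud_fraction": 0.05},
    {"id": "img-2", "cloud_fraction": 0.6},
    {"id": "img-3"},
]


def cloud_fraction_below(records, threshold):
    """Keep ids of records whose cloud_fraction is set and below the threshold."""
    kept = []
    for record in records:
        value = record.get("cloud_fraction")
        # A record without the attribute can never satisfy the comparison
        if value is not None and value < threshold:
            kept.append(record["id"])
    return kept


clear_ids = cloud_fraction_below(images, 0.2)
print(clear_ids)
```

Note that `img-3` is dropped even though nothing is known about its cloud cover, which is why this filter is only reasonable for products that set the attribute.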

The created timestamp is added to all objects in the catalog when they are created and is immutable. Restrict the search to results created before some time in the past, to make sure that the image results are stable:

>>> from datetime import datetime
>>> search = search.filter(p.created < datetime(2019, 1, 1))
>>> search.count()
7


Note that for all timestamps we can use datetime instances or strings that can reasonably be parsed as a timestamp. If a timestamp has no explicit timezone, it’s assumed to be in UTC.
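As a rough illustration of the timezone rule, the following standard-library sketch treats naive timestamps as UTC. The `as_utc` helper is hypothetical, and it only understands ISO 8601 strings; the client may accept a wider range of string formats.

```python
from datetime import datetime, timezone


def as_utc(value):
    """Parse an ISO 8601 string if needed and treat naive timestamps as UTC."""
    if isinstance(value, str):
        value = datetime.fromisoformat(value)
    if value.tzinfo is None:
        # No explicit timezone: assume UTC
        return value.replace(tzinfo=timezone.utc)
    # Explicit timezone: convert to UTC
    return value.astimezone(timezone.utc)


naive = as_utc("2019-01-01")
aware = as_utc("2019-01-01T02:00:00+02:00")
same_instant = naive == aware
```

Both inputs describe the same instant, so filtering with either would be expected to give the same results.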

#### Image summaries

Any queries for images support a summary via the summary() method, returning a SummaryResult with aggregate statistics beyond just the number of results:

>>> from descarteslabs.catalog import Image, properties as p
>>> search = Image.search().filter(p.product_id == "landsat:LC08:01:T1:TOAR")
>>> search.summary()

Summary for 633708 images:
- Total bytes: 73,899,661,048,665
- Products: landsat:LC08:01:T1:TOAR


These summaries can also be bucketed by time intervals with summary_interval() to create a time series:

>>> search.summary_interval(interval="month", start_datetime="2017-01-01", end_datetime="2017-06-01")
[
Summary for 9872 images:
- Total bytes: 1,230,379,744,242
- Interval start: 2017-01-01 00:00:00+00:00,
Summary for 10185 images:
- Total bytes: 1,288,400,404,886
- Interval start: 2017-02-01 00:00:00+00:00,
Summary for 12426 images:
- Total bytes: 1,556,107,514,684
- Interval start: 2017-03-01 00:00:00+00:00,
Summary for 12492 images:
- Total bytes: 1,476,030,969,986
- Interval start: 2017-04-01 00:00:00+00:00,
Summary for 13768 images:
- Total bytes: 1,571,780,442,608
- Interval start: 2017-05-01 00:00:00+00:00]


## Managing products

### Creating and updating a product

Before uploading images to the catalog, you need to create a product and declare its bands. The only required attributes are a unique id, passed in the constructor, and a name:

>>> from descarteslabs.catalog import Product
>>> product = Product(id="guide-example-product")
>>> product.name = "Example product"
>>> product.save()
>>> product.id
u'descarteslabs:guide-example-product'
>>> product.created
datetime.datetime(2019, 8, 19, 18, 53, 26, 250005, tzinfo=<UTC>)


save() saves the product to the catalog in the cloud. You choose the id for your product, but it must be unique within your organization (you get an exception if it’s not). This code example assumes the user is in the “descarteslabs” organization. On save, the id is prefixed with the organization id, which makes it globally unique while you only need to keep it unique within your organization. If you are not part of an organization, the prefix will be your unique user id.

Every object has a read-only created attribute with the timestamp from when it was first saved.

There are a few more attributes that you can set (see the Product API reference). You can update the product to define the timespan that it covers. This is as simple as assigning attributes and then saving again:

>>> product.start_datetime = "2012-01-01"
>>> product.end_datetime = "2015-01-01"
>>> product.save()
>>> product.start_datetime
datetime.datetime(2012, 1, 1, 0, 0, tzinfo=<UTC>)
>>> product.modified
datetime.datetime(2019, 8, 19, 18, 53, 27, 114274, tzinfo=<UTC>)


A read-only modified attribute exists on all objects and is updated on every save.

Note that all timestamp attributes are represented as datetime instances in UTC. You may assign strings to timestamp attributes if they can be reasonably parsed as timestamps. Once the object is saved the attributes will appear as parsed datetime instances. If a timestamp has no explicit timezone, it’s assumed to be in UTC.

#### Get existing product or create new one

If you rerun the same code many times and you only want to create the product once, you can use the Product.get_or_create() method. This method will do a lookup, and if not found, will create a new product instance (you can do the same for bands or images):

>>> product = Product.get_or_create("guide-example-product")
>>> product.name = "Example product"
>>> product.save()


This is equivalent to:

>>> product = Product.get("guide-example-product")
>>> if product is None:
...     product = Product(id="guide-example-product")
...
>>> product.name = "Example product"
>>> product.save()


If the product doesn’t exist yet, a new instance is created, the name is assigned, and the product is created in the catalog by the save. If the product already exists, it is retrieved; if the assigned name differs, the save updates the product, and if everything is identical, the save is a no-op.

You can also pass the attributes directly to get_or_create():

>>> product = Product.get_or_create("guide-example-product", name="Example product")
>>> product.save()


### Creating bands

Before adding any images to a product you should create bands that declare the structure of the data shared among all images in a product.

>>> from descarteslabs.catalog import SpectralBand, DataType, Resolution, ResolutionUnit
>>> band = SpectralBand(name="blue", product=product)
>>> band.data_type = DataType.UINT16
>>> band.data_range = (0, 10000)
>>> band.display_range = (0, 4000)
>>> band.resolution = Resolution(unit=ResolutionUnit.METERS, value=60)
>>> band.band_index = 0
>>> band.save()
>>> band.id
u'descarteslabs:guide-example-product:blue'


A band is uniquely identified by its name and product. The full id of the band is composed of the product id and the name.
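The ids shown in this guide suggest that the band id is the product id and the band name joined by a colon. The helpers below are a hypothetical sketch of that composition, assuming band names themselves contain no colon; they are not functions of the Catalog client.

```python
def band_id(product_id, band_name):
    """Join a product id and a band name the way the ids in this guide appear."""
    return "{}:{}".format(product_id, band_name)


def split_band_id(full_id):
    """Split on the last colon, since product ids themselves contain colons."""
    product_id, _, band_name = full_id.rpartition(":")
    return product_id, band_name


full = band_id("descarteslabs:guide-example-product", "blue")
parts = split_band_id("landsat:LC08:01:RT:TOAR:coastal-aerosol")
```

Splitting from the right matters because a product id such as `landsat:LC08:01:RT:TOAR` already contains several colons.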

The band defines where its data is found in the files attached to images in the product: In this example, band_index = 0 indicates that blue is the first band in the image file, and that first band is expected to be represented by unsigned 16-bit integers (DataType.UINT16).

This band is specifically a SpectralBand, with pixel values representing measurements somewhere in the visible/NIR/SWIR electro-optical wavelength spectrum, so you can also set additional attributes to locate it on the spectrum:

>>> # These values are in nanometers (nm)
>>> band.wavelength_nm_min = 452
>>> band.wavelength_nm_max = 512
>>> band.save()


Bands are created and updated in the same way as products and all other Catalog objects.

#### Band types

It’s common for many products to have an alpha band, which masks pixels in the image that don’t have valid data:

>>> from descarteslabs.catalog import MaskBand
>>> alpha = MaskBand(name="alpha", product=product)
>>> alpha.is_alpha = True
>>> alpha.data_type = DataType.UINT16
>>> alpha.resolution = band.resolution
>>> alpha.band_index = 1
>>> alpha.save()


Here the “alpha” band is created as a MaskBand, which is by definition a binary band with a data range from 0 to 1, so there is no need to set the data_range and display_range attributes.

Setting is_alpha to True enables special behavior for this band during rastering. If this band appears as the last band in a raster operation (such as SceneCollection.mosaic() or SceneCollection.stack() in the scenes client) pixels with a value of 0 in this band will be treated as transparent.
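The transparency rule can be pictured with a small pure-Python composite of two single-band images. This is a simplified sketch of the idea, not the platform's rastering code; the `composite` function and the pixel values are made up for illustration.

```python
def composite(top, top_alpha, bottom):
    """Overlay `top` on `bottom`; where top_alpha is 0 the bottom pixel shows through."""
    result = []
    for top_row, alpha_row, bottom_row in zip(top, top_alpha, bottom):
        result.append([
            t if a != 0 else b
            for t, a, b in zip(top_row, alpha_row, bottom_row)
        ])
    return result


top = [[10, 11], [12, 13]]        # pixel values of the image on top
top_alpha = [[1, 0], [0, 1]]      # 0 marks pixels without valid data
bottom = [[20, 21], [22, 23]]     # image underneath

mosaic = composite(top, top_alpha, bottom)
```

Pixels of the top image whose alpha value is 0 are replaced by the image underneath, which is how gaps in one image get filled by another during a mosaic.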

There are five band types which may have some attributes specific to them. The type of a band does not necessarily affect how it is rastered, it mainly conveys useful information about the data it contains.

All bands have the following attributes in common: id, name, product_id, description, type, sort_order, data_type, no_data, data_range, display_range, resolution, band_index, file_index, jpx_layer_index.

Note that when retrieving bands using a band-specific class, for example SpectralBand.get(), SpectralBand.get_many() or SpectralBand.search(), you will only retrieve that type of band; any other types will be silently dropped. Using Band.get(), Band.get_many() or Band.search() will return all of the types.

### Access control

By default only the creator of a product can read and modify it as well as read and modify the images in it. To share access to a product with others you can modify its access control lists (ACLs):

>>> product.readers = ["org:descarteslabs"]
>>> product.writers = ["email:jane.doe@descarteslabs.com", "email:john.daly@gmail.com"]
>>> product.save()


This gives read access to the whole “descarteslabs” organization. All users in that organization can now find the product. This also gives write access to two specific users identified by email. These two users can now update the product and add new images to it.

For more details on access control lists, see the Sharing Resources guide.

New bands and images created in a product inherit the product’s ACLs by default, but the ACLs for existing images are not automatically updated when they change on the product.

You can change the ACLs for all bands and images associated with a given product using update_related_objects_permissions(). This method kicks off an asynchronous task that performs the updates. If the product has more than 10,000 associated images, this might take several minutes to finish running. You can get the current status of the job using get_update_permissions_status() or wait for the task to complete using wait_for_completion().

This sets the ACLs for all bands and images in product to those of the product and waits for the update to complete:

>>> status = product.update_related_objects_permissions(readers=product.readers, writers=product.writers)
>>> if status:
...     status.wait_for_completion()


### Derived bands

A derived band is the result of a pixel function applied to one or more existing bands of a product. Derived bands become available on a product automatically when canonically named bands it relies on are present in the product. For example, the derived:ndvi band provides the normalized difference vegetation index (NDVI) if a product has bands named red and nir:

>>> from descarteslabs.catalog import DerivedBand
>>>
>>> ndvi = DerivedBand.get("derived:ndvi")
>>> ndvi.description
'Normalized Difference Vegetation Index'
>>> ndvi.bands
['nir', 'red']


The id and name of a derived band always have a derived: prefix to distinguish them clearly from bands declared in a product. The catalog provides a standard set of derived bands - you can’t create your own.
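For reference, the textbook NDVI pixel function is (nir - red) / (nir + red). The sketch below applies it to a few made-up reflectance pairs; how the platform scales or encodes the output of derived:ndvi is not covered by this guide, so treat the function as an illustration of the formula only.

```python
def ndvi(nir, red):
    """Textbook NDVI: (nir - red) / (nir + red), or None when both inputs are 0."""
    total = nir + red
    if total == 0:
        return None  # undefined when there is no signal in either band
    return (nir - red) / total


# Hypothetical (nir, red) reflectance pairs
pixels = [(0.5, 0.1), (0.3, 0.3), (0.0, 0.0)]
values = [ndvi(nir, red) for nir, red in pixels]
```

The first pair yields a clearly positive value, the second yields zero, and the third is undefined.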

The bands attribute defines the band names that must be present in a product for this derived band. Find all derived bands available for a product with Product.derived_bands():

>>> landsat8_collection1 = Product.get("landsat:LC08:01:RT:TOAR")
>>> for band in landsat8_collection1.derived_bands():
...     print(band)
...
DerivedBand: derived:bai
id: derived:bai
DerivedBand: derived:evi
id: derived:evi
DerivedBand: derived:ndvi
id: derived:ndvi
DerivedBand: derived:ndwi
id: derived:ndwi
DerivedBand: derived:ndwi1
id: derived:ndwi1
DerivedBand: derived:ndwi2
id: derived:ndwi2
DerivedBand: derived:rsqrt
id: derived:rsqrt


### Deleting bands and products

All objects can be deleted using delete(). For example, delete the previously created alpha band:

>>> alpha.delete()
True


A product can only be deleted if it doesn’t have any bands or images. Because the product we created still has one band this fails:

>>> product.delete()
Traceback (most recent call last):
File "< chunk 24 named None >", line 1, in <module>
File "descarteslabs/catalog/catalog_base.py", line 450, in delete
r = self._client.session.delete(self._url + "/" + self.id)
File "requests/sessions.py", line 615, in delete
return self.request('DELETE', url, **kwargs)
File "descarteslabs/client/services/service/service.py", line 74, in request
raise ConflictError(resp.text)
ConflictError: {"errors":[{"detail":"One or more related objects exist","status":"409","title":"Related objects exist"}],"jsonapi":{"version":"1.0"}}


There is a convenience method to delete all bands and images in a product. Be careful as this may delete a lot of data and can’t be undone!

>>> status = product.delete_related_objects()


This kicks off a job that deletes bands and images in the background. You can wait for this to complete and then delete the product:

>>> if status:
...     status.wait_for_completion()
...
>>> product.delete()


### Finding Products by id

You may have noticed that when creating products, the id you provide isn’t the id that is assigned to the object.

>>> product = Product(id="guide-example-product")
>>> product.name = "Example product"
>>> product.save()
>>> product.id
"descarteslabs:guide-example-product"


The id has a prefix added to ensure uniqueness without requiring you to come up with a globally unique name. The downside of this is you need to remember that prefix when looking up your products later:

# this will return False because the id has a prefix!
>>> Product.exists("guide-example-product")
False


You can use namespace_id() to generate a fully namespaced product id if you know the unprefixed part.

# this returns the id with the prefix added
>>> product_id = Product.namespace_id("guide-example-product")
>>> product_id
"descarteslabs:guide-example-product"


## Managing images

Apart from searching and discovering data available to you, the main use case of the catalog is to let you upload new images.

If your data already exists on disk as an image file, usually a GeoTIFF or JPEG file, you can upload it directly.

In the following examples we will upload data with a single band representing the blue light spectrum. First let’s create a product and band corresponding to that:

>>> # Create a product
>>> from descarteslabs.catalog import Band, DataType, Image, Product, Resolution, ResolutionUnit, SpectralBand
>>> product = Product(id="guide-example-product", name="Example product")
>>> product.save()
>>>
>>> # Create a band
>>> band = SpectralBand(name="blue", product=product)
>>> band.data_type = DataType.UINT16
>>> band.data_range = (0, 10000)
>>> band.display_range = (0, 4000)
>>> band.resolution = Resolution(unit=ResolutionUnit.METERS, value=60)
>>> band.band_index = 0
>>> band.save()


Now image.upload() uploads the image to the new product and returns an ImageUpload. Images are uploaded and processed asynchronously, so they are not available in the catalog immediately. With upload.wait_for_completion() we wait until the upload is completely finished.

>>> # Set any attributes that should be set on the uploaded images
>>> image = Image(product=product, name="scene1")
>>> image.acquired = "2012-01-02"
>>> image.cloud_fraction = 0.1
>>>
>>> image_path = "docs/guides/blue.tif"
>>> upload = image.upload(image_path)
>>> upload.wait_for_completion()
>>> upload.status
u'success'


Attributes that can be derived from the image file, such as the georeferencing, will be assigned to the image during the upload process. But you can set any additional Image attributes such as acquired and cloud_fraction here.

Note that this code makes a number of assumptions:

• A GeoTIFF exists locally on disk at the path docs/guides/blue.tif from the current directory.

• The GeoTIFF’s only band matches the blue band we created (for example, it has an unsigned 16-bit integer data type).

• The GeoTIFF is correctly georeferenced.

Image uploads use Descartes Labs Storage behind the scenes. You can find the uploaded file using the product id as a prefix in the products storage type:

>>> import descarteslabs as dl
>>> storage_client = dl.Storage()
>>> storage_client.list(prefix=product.id, storage_type="products")
['guide-example-product/ebe3cdeb709ac362b3d908e3802f8e0f']


Note that the actual name of the file will depend on several specifics including the file contents and hence will not necessarily be equal to that in the example.

Often, when creating a derived product - for example, running a classification model on existing data - you’ll have a NumPy array (often referred to as an “ndarray”) in memory instead of a file written to disk. In that case, you can use upload_ndarray(). This method behaves like upload(), with one key difference: you must provide georeferencing attributes for the ndarray.

Georeferencing attributes are used to map between geospatial coordinates (such as latitude and longitude) and their corresponding pixel coordinates in the array. The required attributes are:

If the ndarray you’re uploading was rastered through the platform, this information is easy to get. When rastering you also receive a dictionary of metadata that includes both of these parameters. With Scene.ndarray(), you have to set raster_info=True; with Raster.ndarray(), it’s always returned.
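One common way to express the mapping between pixel and geospatial coordinates is a six-element affine geotransform in the GDAL convention. Whether this is exactly the form the client expects is an assumption here; the sketch only shows how such a transform turns a pixel position into coordinates, and the `pixel_to_geo` helper and the numbers are hypothetical.

```python
def pixel_to_geo(geotrans, col, row):
    """Map a pixel (col, row) to the coordinates of its upper-left corner.

    Assumes the GDAL ordering:
    (origin_x, pixel_width, row_rotation, origin_y, col_rotation, pixel_height)
    """
    origin_x, pixel_width, row_rotation, origin_y, col_rotation, pixel_height = geotrans
    x = origin_x + col * pixel_width + row * row_rotation
    y = origin_y + col * col_rotation + row * pixel_height
    return x, y


# Hypothetical north-up raster: 60 m pixels, upper-left corner at (500000, 4100000)
geotrans = (500000.0, 60.0, 0.0, 4100000.0, 0.0, -60.0)

corner = pixel_to_geo(geotrans, 0, 0)
inside = pixel_to_geo(geotrans, 10, 5)
```

The pixel height is negative for north-up rasters because row numbers increase downwards while northing decreases. The coordinates are only meaningful together with a coordinate reference system.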

The following example puts these pieces together. This extracts the blue band from a Landsat 8 scene at a lower resolution and uploads it to our product:

>>> from descarteslabs.catalog import OverviewResampler
>>>
>>> scene, geoctx = dl.scenes.Scene.from_id("landsat:LC08:01:T1:TOAR:meta_LC08_L1TP_163068_20181025_20181025_01_T1_v1")
>>> ndarray, raster_meta = scene.ndarray(
...     "blue",
...     geoctx.assign(resolution=60),
...     # return georeferencing info we need to re-upload
...     raster_info=True
... )
...
>>> image2 = Image(product=product, name="scene2")
>>> image2.acquired = "2012-01-02"
>>> upload2 = image2.upload_ndarray(
...     ndarray,
...     raster_meta=raster_meta,
...     # create overviews for 120m and 240m resolution
...     overviews=[2, 4],
...     overview_resampler=OverviewResampler.AVERAGE,
... )
...
>>> upload2.wait_for_completion()
>>> upload2.status
u'success'


The rastered ndarray here is a three-dimensional array in the shape (band, x, y) - the first axis corresponds to the band number. upload_ndarray() expects an array in that shape and will raise a warning if it thinks the shape of the array is wrong. If the given array is two-dimensional, it will assume you’re uploading a single-band image.
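The shape handling described above can be sketched on plain shape tuples. `normalized_shape` is a hypothetical helper that mirrors the stated rule (two dimensions mean a single band); it is not the check that upload_ndarray() performs internally.

```python
def normalized_shape(shape):
    """Return a three-axis shape with the band axis first, treating 2-D input as one band."""
    if len(shape) == 2:
        # Two-dimensional input is interpreted as a single-band image
        return (1,) + tuple(shape)
    if len(shape) == 3:
        return tuple(shape)
    raise ValueError("expected a 2-D or 3-D array shape, got {!r}".format(shape))


single_band = normalized_shape((512, 512))
three_band = normalized_shape((3, 512, 512))
```

With a NumPy array you would pass `array.shape` to a helper like this before deciding whether to add a leading band axis yourself.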

The example above also specifies typically useful values for overviews and overview_resampler. Overviews allow the platform to raster your image faster at non-native resolutions, at the cost of more storage and a longer initial upload processing time to calculate the overviews.

The overviews argument specifies a list of up to 16 different resolution magnification factors to calculate overviews for. For example, overviews=[2,4] calculates two overviews at 2x and 4x the native resolution. The overview_resampler argument specifies the algorithm to use when calculating overviews; see upload_ndarray() for which algorithms can be used.
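The relation between magnification factors and overview resolutions is simple arithmetic, sketched below for the 60 m band used in this guide. The `overview_resolutions` helper is hypothetical; the limit of 16 factors is the one stated above.

```python
def overview_resolutions(native_resolution, overviews):
    """Resolution of each overview for the given magnification factors."""
    if len(overviews) > 16:
        raise ValueError("at most 16 overview factors are supported")
    return [native_resolution * factor for factor in overviews]


# 60 m native resolution with factors 2 and 4, as in the upload example
resolutions = overview_resolutions(60, [2, 4])
```

For a 60 m image, factors 2 and 4 correspond to overviews at 120 m and 240 m.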

### Updating images

The image created in the previous example is now available in the Catalog. We can look it up and update any of its attributes like any other catalog object:

>>> image2 = Image.get(image2.id)
>>> image2.cloud_fraction = 0.2
>>> image2.save()


To update the underlying file data, you will need to upload a new file or ndarray. However, you must use a new, unsaved Image instance (with the original product id and image name) along with the overwrite=True parameter. The reason is that the original image, now saved in the catalog, contains many computed values that may differ from those computed from the new upload. There is no way for the catalog to know whether you intend to reuse the original values or compute new values for these properties.

If you are going to upload a large number of images, especially from inside a set of tasks running in parallel, it is better to avoid calling the wait_for_completion() method immediately after initiating each upload. You can instead query the uploads later on to determine what has succeeded, failed, or is still running. This has advantages both within a loop, where you don’t have to waste time waiting for each upload, and in the tasks framework, where waiting inside of many tasks wastes resources and slows down the entire job.

As an example, if you have used either a loop or a task group to upload a bunch of images to a single product, you can use a pattern like the following to gather up the results.

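>>> from descarteslabs.catalog import ImageUploadStatus
>>> for upload in product.image_uploads():
...     if upload.status not in (
...         ImageUploadStatus.SUCCESS,
...         ImageUploadStatus.FAILURE,
...     ):
...         upload.wait_for_completion()
...     # do whatever you want here ...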


Note that the above will return all uploads that you initiated on the product that are still being tracked; you may wish to do additional filtering on the created timestamp or other property to narrow the search.
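Once the statuses have been gathered, sorting them into finished and unfinished uploads is plain bookkeeping. The sketch below works on hypothetical (image name, status) pairs using the status strings that appear in this guide; other status values may exist and would land in the in_progress bucket.

```python
from collections import defaultdict

# Status strings seen in this guide that indicate a finished upload
FINISHED = {"success", "failure"}


def group_uploads(records):
    """Group (image_name, status) pairs into success, failure and in_progress."""
    groups = defaultdict(list)
    for image_name, status in records:
        bucket = status if status in FINISHED else "in_progress"
        groups[bucket].append(image_name)
    return dict(groups)


# Hypothetical results collected after a batch of uploads
records = [("scene1", "success"), ("scene2", "pending"), ("scene3", "failure")]
grouped = group_uploads(records)
```

Anything left in the in_progress bucket can be checked again later instead of blocking on each upload as it is started.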

The ImageUpload returned from upload() and upload_ndarray() provides status information on the image upload.

In the following example we upload an invalid file (it’s empty), so we expect the upload to fail. Additional information about the failure should be available in the errors attribute, which will contain a list of error records:

>>> import tempfile
>>> invalid_image_path = tempfile.mkstemp()[1]
>>> with open(invalid_image_path, "w"): pass
>>>
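>>> image3 = Image(product=product, name="scene3", acquired="2012-03-01")
>>> upload3 = image3.upload(invalid_image_path)
>>> upload3.status
u'pending'
>>> upload3.wait_for_completion()
>>> upload3.status
u'failure'
>>> upload3.events
[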
component: yaas
component_id: yaas-release-cc95fb75-gwxvr
event_datetime: 2020-01-09 14:12:35.2387465+00:00
event_type: queue
id: 13
message: message-id=XXXXXXX
severity: INFO
component: yaas_worker
event_datetime: 2020-01-09 14:12:35.756811+00:00
event_type: run
id: 14
message: Running
severity: INFO
component: IngestV2Worker
event_datetime: 2020-01-09 14:12:35.756811+00:00
event_type: complete
id: 15
message: InvalidFileError: Cannot determine file information, missing the following properties for storage-XXXX-products/guide-example-product/uploads/5d6f4154-7e9e-43a9-aed3-7f19f66cebe1/1578579154865887: ['size']
severity: ERROR
]


Uploads also contain a list of events pertaining to the upload. These can be useful for understanding or diagnosing problems.

You can also list any past upload results with Product.image_uploads() and Image.image_uploads(). Note that upload results are currently not stored indefinitely, so you may not have access to the full history of uploads for a product or image.

>>> for upload in product.image_uploads():
...     print(upload.id, upload.image_id, upload.status)
...
10635 descarteslabs:guide-example-product:scene1 success
10702 descarteslabs:guide-example-product:scene2 success
10767 descarteslabs:guide-example-product:scene3 failure


Alternatively you can filter the list by properties such as the status.

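>>> from descarteslabs.catalog import ImageUploadStatus
>>> for upload in product.image_uploads().filter(p.status == ImageUploadStatus.FAILURE):
...     print(upload.id, upload.image_id, upload.status)
...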
10767 descarteslabs:guide-example-product:scene3 failure


In the event that you experience an upload failure, and the error(s) don’t make it clear what you need to do to fix it, you should include the upload object id and any events and errors associated with it when you communicate with the Descartes Labs support team.

#### Tags & extra properties

The image attributes you can set, filter by and sort on are documented on the Image class. If you have other structured metadata to attach to your images, you can use extra_properties:

>>> image2.extra_properties = {
...     "processing_time": 120,
...     "quality": 0.5,
...     "reviewer": "joe@acme.com",
... }
...
>>> image2.save()


extra_properties is a dictionary with string keys and values of any type that can be JSON-serialized (booleans, numbers, strings, lists, dictionaries).
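If you want to check a dictionary locally before assigning it, a standard-library sketch along the following lines may help. `valid_extra_properties` is a hypothetical helper based only on the rule stated above (string keys, JSON-serializable values); the catalog may apply additional limits that are not checked here.

```python
import json


def valid_extra_properties(props):
    """Check for a dict with string keys whose content can be JSON-serialized."""
    if not isinstance(props, dict):
        return False
    if not all(isinstance(key, str) for key in props):
        return False
    try:
        json.dumps(props)
    except (TypeError, ValueError):
        return False
    return True


ok = valid_extra_properties(
    {"processing_time": 120, "quality": 0.5, "reviewer": "joe@acme.com"}
)
bad_value = valid_extra_properties({"footprint": {1, 2, 3}})  # sets are not JSON types
bad_key = valid_extra_properties({1: "numeric key"})
```

The explicit key check is needed because json.dumps() silently converts some non-string keys, such as integers, instead of rejecting them.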

Note that you cannot filter or sort images by extra_properties. Use tags if you have a finite discrete number of custom values you’d like to filter by:

>>> image2.tags = ["temporary", "guide"]
>>> image2.save()
>>>
>>> # Find all images in the product tagged "temporary"
>>> search = product.images().filter(p.tags == "temporary")
>>> for image in search:
...     print(image)
Image:
id: descarteslabs:guide-example-product:scene2
product: descarteslabs:guide-example-product
created: Mon Aug 19 18:53:43 2019


### Remote images

In addition to hosting rasterable images with file data attached, the catalog also supports images where the underlying raster data is not directly available. These remote images cannot be rastered but can be searched for using the catalog. This is useful for a couple of scenarios:

• A product of images that have not been consistently processed, optimized or georegistered in a way that prevents them from being rastered by the platform, for example raw imagery taken in unprocessed form from a sensor. Such a product can serve as the basis for higher-level products that have been processed consistently from the raw imagery.

• A product of images for which file data exist somewhere outside the platform but has not been uploaded or only partly uploaded into the platform. This gives users the chance to browse the full metadata of images and then make decisions about what file data should be uploaded on demand.

To create a remote image set storage_state to "remote". The only required attributes for remote images are acquired and geometry to anchor them in time and space. No bands are required for a product holding only remote images.

>>> from descarteslabs.catalog import Product, Image, StorageState
>>> product = Product(id="guide-example-raw", name="Raw product")
>>> product.save()
>>>
>>> geometry = {
...     "type": "Polygon",
...     "coordinates": [[
...         [7.488099932670593, 46.95386728954941],
...         [7.488352060317992, 46.953656742419255],
...         [7.488429844379425, 46.953916722233814],
...         [7.488099932670593, 46.95386728954941]
...     ]]
... }
...
>>> image = Image(product=product, name="raw-image")
>>> image.storage_state = StorageState.REMOTE
>>> image.acquired = "2018-04-12"
>>> image.geometry = geometry
>>> image.save()


If some form of URL referencing the remote image is available, attach it through the files attribute using a File:

>>> from descarteslabs.catalog import File
>>> image.files = [File(href="http://remote.server.com/path/image.tiff")]
>>> image.save()