Workflows

Workflows is a computation engine for inventing and deploying geospatial analyses, fast.

With Workflows, you can develop your algorithm interactively and see changes recomputed on the fly, then run it at scale without changing your code.

Workflows helps you express the “what” of your model, rather than the “how”, letting you think more about the science, and less about the software engineering work of implementing it.


Four years of vessel traffic through the Strait of Gibraltar from Sentinel-1 radar, rendered on the fly with Workflows.

In Workflows, you write code with high-level objects, like Image and ImageCollection, rather than thinking about array indices. Instead of sending data back and forth, Workflows sends the sequence of operations you define to the backend, which pulls all the data, processes it, and sends you back just the final result—whether that’s a composited image, a timeseries of statistics derived from raster and vector data, or a single number.

With this design, you get:

  • Live visualization on an interactive map

  • Easy exploration using parameters and widgets

  • Caching, so only parts of your code that change are recomputed

  • Automatic parallelization

  • Batch jobs without the overhead of setting up Tasks

Workflows integrates with the Python data science ecosystem. You can quickly build custom interactive tools on top of Workflows using ipywidgets and ipyleaflet in Jupyter notebooks. And you can retrieve any data as a NumPy array or native Python type to use with other libraries. It also has an Array type that’s intercompatible with NumPy, so you can express complex numerical operations without learning new syntax.

Workflows is meant to do the heavy lifting of filtering, merging, and transforming raw data down into a refined result. But don’t expect it to do everything. Instead, use Workflows to get your data into the right format, then call compute on any object to retrieve the data and continue on with your analysis locally.

This guide describes some high-level Workflows concepts and implementation specifics (in no particular order), but is not exhaustive. For a complete reference, see the Workflows API Docs.

Request Access

Workflows is currently in limited release. To request access please contact support@descarteslabs.com.

Note

Old versions of the Workflows client will eventually stop working as new versions are released. Workflows doesn’t currently guarantee backwards API compatibility between versions.

The client library depends on a backend that it knows how to communicate with. As the API changes and improves, old clients will no longer be compatible with the new backend. To ensure consistency, each version gets its own backend, or channel, which doesn’t change.

Even when new versions are released, old versions will still work as before. This gives you time to upgrade at your own pace, lets you use standard Python package-management tools like pip or poetry to control the version, and ensures APIs don’t mysteriously change in the middle of your work.

However, when a version gets old enough, we will deactivate its backend. You should only expect the 2-3 most recent versions to have active backends.

If you see an error like NotFound: 404 Channel "v0-1" does not exist. If it used to, please upgrade your client., it’s time to upgrade your descarteslabs client version and get access to new features!

Upgrade early, upgrade often.

Example

The following example loads a single Image with red, green, and blue bands and computes with a GeoContext argument.

>>> import descarteslabs.workflows as wf
>>> img = wf.Image.from_id("landsat:LC08:PRE:TOAR:meta_LC80270312016188_v1")
>>> rgb = img.pick_bands(["red", "green", "blue"])
>>> geocontext = wf.GeoContext(
...     resolution=200.0,
...     crs='EPSG:32615',
...     align_pixels=False,
...     bounds=(258292.5, 4503907.5, 493732.5, 4743307.5),
...     bounds_crs='EPSG:32615'
... )
>>> result = rgb.compute(geocontext)
>>> type(result.ndarray)
<class 'numpy.ma.core.MaskedArray'>
>>> result.ndarray.shape
(3, 1197, 1177)
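
The shape of the result follows from the GeoContext: the first axis holds the three picked bands, and the pixel grid comes from the bounds and resolution. The sketch below reproduces that arithmetic in plain Python. It assumes the pixel count is the extent divided by the resolution, rounded to the nearest integer; the backend's exact rounding rule is not stated in this guide.

```python
# Rough arithmetic only. The rounding rule is an assumption, not documented backend behavior.
bounds = (258292.5, 4503907.5, 493732.5, 4743307.5)  # (min_x, min_y, max_x, max_y), meters in EPSG:32615
resolution = 200.0  # meters per pixel
n_bands = 3  # red, green, blue

min_x, min_y, max_x, max_y = bounds
width = round((max_x - min_x) / resolution)
height = round((max_y - min_y) / resolution)

expected_shape = (n_bands, height, width)
print(expected_shape)
```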

For more examples, check out the example_notebooks folder on Workbench (read more here).

Proxy and Result Objects

In the Workflows client, there are two types of objects that you will interact with: lazy proxy objects and result objects. Every time you call a function or access an attribute on a Workflows object, it returns a proxy object representing what the result would be, and keeps track of that operation for later. When you call compute() on a proxy object, a graph of all those operations is sent to the backend, which executes it and sends the result back to you. That result is returned as a result object, which is just a simple container for holding the result data.

For example, in normal Python, adding two numbers happens right away:

>>> 1 + 1
2

But if we use a Workflows Int, we just get a proxy object:

>>> from descarteslabs import workflows as wf
>>> wf.Int(1) + 1
<descarteslabs.workflows.types.primitives.number.Int at 0x...>

This proxy object is actually just storing a dependency graph representing the operation 1 + 1, in a syntax called “graft”:

>>> result = wf.Int(1) + 1
>>> result.graft
{'1': 1, '2': 1, '3': ['add', '1', '2'], 'returns': '3'}

You don’t ever need to worry about the graft or understand the syntax, but knowing that it’s happening might make the system a little less mysterious.

So when you call compute() on a proxy object, that dependency graph gets sent to the backend and executed there, and the result object is sent back to you. For Workflows types with Python equivalents like Int, Dict, or List, as is the case in this example, there is no special result object—we just use the native Python type.

>>> result.compute()
Job ID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[######] | Steps: 0/0 | Stage: SAVING | Status: SUCCESS
2
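
To make the proxy idea concrete, here is a toy sketch in plain Python. The LazyInt class, its compute method, and the module-level helpers are hypothetical names invented for this illustration; they are not part of the Workflows client, and the real graft format and backend are more involved. The sketch only shows the pattern: operators record nodes in a dictionary, and nothing is evaluated until compute is called.

```python
import itertools
import operator

_keys = itertools.count(1)  # hands out unique node names: "1", "2", "3", ...
_OPS = {"add": operator.add, "mul": operator.mul}


class LazyInt:
    """Toy proxy (hypothetical): records operations instead of running them."""

    def __init__(self, value=None, graft=None):
        if graft is None:
            key = str(next(_keys))
            graft = {key: value, "returns": key}
        self.graft = graft

    def _binary(self, op_name, other):
        if not isinstance(other, LazyInt):
            other = LazyInt(other)
        # Merge both operand graphs, then add one node for this operation.
        merged = {
            k: v
            for g in (self.graft, other.graft)
            for k, v in g.items()
            if k != "returns"
        }
        key = str(next(_keys))
        merged[key] = [op_name, self.graft["returns"], other.graft["returns"]]
        merged["returns"] = key
        return LazyInt(graft=merged)

    def __add__(self, other):
        return self._binary("add", other)

    def __mul__(self, other):
        return self._binary("mul", other)

    def compute(self):
        """Stand-in for the backend: walk the graph and evaluate it."""

        def evaluate(key):
            node = self.graft[key]
            if isinstance(node, list):
                op_name, *arg_keys = node
                return _OPS[op_name](*(evaluate(k) for k in arg_keys))
            return node

        return evaluate(self.graft["returns"])


result = LazyInt(1) + 1
print(result.graft)      # only a description of the work exists so far
print(result.compute())  # evaluation happens here
```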

To further illustrate the difference between proxy and result objects we’ll access the properties attribute of an Image proxy object:

>>> img = wf.Image.from_id("landsat:LC08:PRE:TOAR:meta_LC80330352016022_v1")
>>> type(img)
<class 'descarteslabs.workflows.types.geospatial.image.Image'>
>>> img.properties
<descarteslabs.workflows.types.containers.known_dict.KnownDict[{'crs': Str, 'date': Datetime, 'geotrans': Tuple[Float, Float, Float, Float, Float, Float], 'id': Str, 'product': Str}, Str, Any] at 0x...>

We haven’t yet called compute() on img, so we only get the proxy object representing img.properties. Next we’ll call img.compute to get the actual data in an ImageResult, then look at the properties field on that:

>>> img_result = img.compute(ctx) # ctx is the geocontext used to compute img_result
Job ID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[######] | Steps: 17/17 | Stage: SAVING | Status: SUCCESS
>>> img_result
ImageResult:
  * ndarray: MaskedArray<shape=(3, 137, 398), dtype=float64>
  * properties: 'acquired', 'bucket', 'cloud_fraction_0', 'confidence_dlsr', ...
  * bandinfo: 'red', 'green', 'blue'
  * geocontext: 'geometry', 'resolution', 'crs', 'bounds', ...
>>> type(img_result)
<class 'descarteslabs.workflows.results.results.ImageResult'>
>>> img_result.properties
{'acquired': '2016-01-22T17:38:34.696493+00:00',
...
}

We now have our result object img_result (whose type is ImageResult) and can now access all of its attributes (not just properties). In the event that you only need a certain attribute of a result object, it is more efficient to call compute() directly on that attribute, instead of computing every attribute as we did in the example above:

>>> img_bandinfo_result = img.bandinfo.compute(ctx)
>>> img_bandinfo_result
{'red': {'color': 'Red',
...
}

The following diagram is a simplified view of the architecture of Workflows.


Workflows Architecture Overview

Empty Rasters

Sometimes, no data exist for the dates or places on Earth where you’re looking. Rather than raising an error, this missing data is represented by an empty Image or ImageCollection, which has no data or metadata. All operations on these “empties” succeed, but propagate the “emptiness” (except for some operations like concatenation, which drop empties). In general, the goal is that missing data, and anything it touches, will simply be ignored in your computations. If you need it to be handled differently, you can replace it with default values.

For example, say we want to make a composite of the ascending and descending orbital passes of Sentinel-1. In some places, Sentinel-1 captures data in either an ascending or a descending orbit; in other places, both. So if we filter by both pass directions, we’ll sometimes get missing data:

>>> col = wf.ImageCollection.from_id("sentinel-1:GRD", start_datetime="2019-04-01", end_datetime="2019-07-01")
>>> a = col.filter(lambda img: img.properties["pass"]=="ASCENDING") # empty ImageCollection (in eastern North America)
>>> d = col.filter(lambda img: img.properties["pass"]=="DESCENDING") # non-empty ImageCollection (in eastern North America)
>>>
>>> # Pick bands and min will succeed in both the empty and non-empty case
>>> a_min = a.pick_bands("vv").min(axis="images") # empty Image
>>> d_min = d.pick_bands("vv").min(axis="images") # non-empty Image
>>>
>>> mins_ic = wf.ImageCollection([a_min, d_min])
>>> # creating an ImageCollection drops any empty inputs,
>>> # so this will only contain `d_min` (`a_min` is an empty Image)
>>>
>>> max_composite = mins_ic.max(axis="images")
>>> # a maximum composite of the non-empty ascending/descending min composites.
>>> # since `a_min` was empty and `mins_ic` only contains `d_min`,
>>> # this is equivalent to `d_min`.
>>>
>>> max_composite.compute(ctx)
ImageResult:
  * ndarray: MaskedArray<shape=(1, 512, 512), dtype=float64>
...

What if we did not want to ignore empty values, but actually fill them in with something else? Let’s treat empty data explicitly as 0:

>>> a_min_filled = a_min.replace_empty_with(0, mask=False, bandinfo={"vv": {}})
>>> # when `a_min` is empty, replaces it with a new Image with one "vv" band of all 0s.
>>> # if `a_min` wasn't empty, it would return it unchanged.
>>>
>>> mins_ic = wf.ImageCollection([a_min_filled, d_min])
>>> # now `mins_ic` contains both `a_min_filled` (all 0s) and `d_min` (actual data)
>>>
>>> max_composite = mins_ic.max(axis="images")
>>> max_composite.compute(ctx)
ImageResult:
  * ndarray: MaskedArray<shape=(1, 512, 512), dtype=float64>
...

Bear in mind that this is a contrived example, and is simply meant to demonstrate how empty rasters can be handled. There are many other ways to do something like this (ImageCollection.groupby() would be the simplest). More specific details about how empties are represented, and how they are handled in specific functions can be found below.

Both Image and ImageCollection objects can be empty, meaning they lack an ndarray, properties, and bandinfo. The .ndarray will be None, .properties will be an empty dictionary (Image) or empty list (ImageCollection), and .bandinfo will be an empty dictionary. Empty imagery objects have a .geocontext determined by the GeoContext passed to .compute(). An ImageCollection cannot have some empty and some non-empty images. Every image in a non-empty ImageCollection is non-empty. This is why concat drops empties. In a similar way, you cannot have some empty and some non-empty bands. Every band in a non-empty Image or ImageCollection is non-empty. This is why concat_bands and map_bands return an empty if any of the input bands or mapped bands are empty.

An empty Image cannot be explicitly constructed, and will only result from other operations. An empty ImageCollection can be explicitly constructed by calling the constructor with an empty list/list of empty images, as well as calling .from_id where no imagery matches the given constraints:

# Construct from empty list
>>> empty_col = wf.ImageCollection([])
>>> empty_col.length().compute(ctx) # ctx is the GeoContext the computation will be performed over
[###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
0

# Construct from list of empty images
>>> empty_col = wf.ImageCollection([empty_img1, empty_img2, empty_img3])
>>> empty_col.length().compute(ctx)
[###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
0

# No imagery matching given constraints
>>> empty_col = wf.ImageCollection.from_id("landsat:LC08:01:RT:TOAR", start_datetime="2017-01-01", end_datetime="2017-02-01") # no imagery exists between these start and end datetimes
>>> empty_col.length().compute(ctx)
0

Empties can also arise without being constructed explicitly as shown above. Some operations return an empty even when they are passed a non-empty input.

Image Operations:

  • map_bands: If the mapper function returns an empty Image, map_bands will also return an empty Image.

ImageCollection Operations:

  • filter: If the filter function is not true for any images in the collection, filter will return an empty ImageCollection

    >>> non_empty_col = wf.ImageCollection.from_id("landsat:LC08:01:RT:TOAR", start_datetime="2017-01-01", end_datetime="2017-12-01")
    >>> filtd = non_empty_col.filter(lambda img: img.properties["date"].year == 2018) # none of the images are from 2018
    >>> filtd.length().compute(ctx)
    [###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
    0
    
  • map: If the mapper function returns an empty Image, map will return an empty ImageCollection.

  • map_bands: If the mapper function returns an empty ImageCollection, map_bands will return an empty ImageCollection. If the mapper function returns an empty Image, map_bands will return an empty Image.

  • map_window: If the mapper function returns empty imagery, map_window will return an empty ImageCollection.

Empties are aggressively propagated. This means that when an empty is involved in an operation, the result is an empty, regardless of the emptiness of the other arguments:

>>> empty_col = wf.ImageCollection([])
>>> non_empty_col = wf.ImageCollection.from_id("landsat:LC08:01:RT:TOAR", start_datetime="2017-01-01", end_datetime="2017-07-01")
>>> added = non_empty_col + empty_col
>>> added.length().compute(ctx)
[###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
0

Some operations are an exception to this:

  • concat: All non-empty imagery is preserved

    >>> empty_col = wf.ImageCollection([])
    >>> non_empty_col = wf.ImageCollection.from_id("landsat:LC08:01:RT:TOAR", start_datetime="2017-01-01", end_datetime="2017-07-01")
    >>> concat = empty_col.concat(non_empty_col) # non_empty_col.concat(empty_col) will yield the same result
    >>> concat.length().compute(ctx)
    [###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
    4
    >>> concat = non_empty_col.concat(empty_col, non_empty_col)
    >>> concat.length().compute(ctx)
    [###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
    8
    
  • ImageCollection constructor: All non-empty imagery is preserved

    >>> empty_col = wf.ImageCollection([])
    >>> empty_img = empty_col[0] # slicing into an empty ImageCollection results in an empty Image (all indices are valid)
    >>> non_empty_img = wf.Image.from_id("landsat:LC08:PRE:TOAR:meta_LC80270312016188_v1")
    >>> col = wf.ImageCollection([non_empty_img, empty_img, empty_img])
    >>> col.length().compute(ctx)
    [###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
    1
    
  • mask: If the mask is empty, mask is a no-op. If the object being masked is empty, mask is an empty (type determined by broadcasting rules)

    >>> empty_col = wf.ImageCollection([])
    >>> non_empty_col = wf.ImageCollection.from_id("landsat:LC08:01:RT:TOAR", start_datetime="2017-01-01", end_datetime="2017-07-01")
    >>> no_op = non_empty_col.mask(empty_col)
    >>> no_op.length().compute(ctx)
    [###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
    4
    >>> op = empty_col.mask(non_empty_col)
    >>> op.length().compute(ctx)
    [###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
    0
    

Aggressive propagation of empties may not always be desirable. If you have just done an operation that may result in an empty, but do not want that emptiness to propagate down through the rest of the operations, you can replace it with a non-empty:

>>> empty_col = wf.ImageCollection([])
>>> non_empty_col = wf.ImageCollection.from_id("landsat:LC08:01:RT:TOAR", start_datetime="2017-01-01", end_datetime="2017-07-01")
>>> new_col = empty_col.replace_empty_with(non_empty_col) # replacing with an ImageCollection
>>> new_col.length().compute(ctx)
4
>>> new_col = empty_col.replace_empty_with(1.1, bandinfo={"red": {}, "green": {}, "blue": {}}) # replacing with a scalar (3 bands per Image)
>>> # new_col will be fully masked as the default behavior of replace_empty_with (specifying mask=False would make it unmasked)
>>> new_col.length().compute(ctx)
1

See Image.replace_empty_with() and ImageCollection.replace_empty_with() for details on how to replace empty Image and ImageCollection objects. More information about empty handling for specific functions can be found in their documentation.
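
The rules in this section (propagation, the concatenation exception, and replacement) can be summarized with a toy model in plain Python. Here an "image" is just a list of pixel values and None stands in for an empty. The functions add, concat, and fill_empty are hypothetical helpers written for this illustration, not Workflows calls, and they ignore metadata, masks, and bands entirely.

```python
EMPTY = None  # toy stand-in for an empty Image


def add(a, b):
    """Add two toy images pixel by pixel; an empty operand makes the result empty."""
    if a is EMPTY or b is EMPTY:
        return EMPTY
    return [x + y for x, y in zip(a, b)]


def concat(*images):
    """Build a toy collection; empties are dropped, everything else is kept."""
    return [img for img in images if img is not EMPTY]


def fill_empty(img, fill, n_pixels):
    """Return `img` unchanged unless it is empty; then build a constant image."""
    return [fill] * n_pixels if img is EMPTY else img


ascending = EMPTY        # no data for this pass direction
descending = [3, 5, 2]   # real data

print(add(ascending, descending))     # None
print(concat(ascending, descending))  # [[3, 5, 2]]

filled = fill_empty(ascending, 0, n_pixels=3)
print(add(filled, descending))        # [3, 5, 2]
```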

Broadcasting Imagery

In Workflows, numbers, Images, and ImageCollections are all interoperable, even though they contain different amounts of data. For example, adding a number to an Image adds it to every pixel in the Image. Like NumPy and other array computing libraries, Workflows imagery objects support vectorized operations and broadcasting: when using an operator like addition, lower-dimensional data is first “stretched” (broadcast) to match up with higher-dimensional data, then the operator is applied element-wise (vectorized) between all pixels in the images.

Broadly, the following statements apply when combining imagery:

  • Scalars broadcast to everything

  • Images broadcast to ImageCollections

  • 1-band imagery broadcasts to N-band imagery

  • N-band imagery must have the same band names to interoperate

  • ImageCollections must be the same length to interoperate

  • Imagery operations are aligned by band name

Some common examples of combining imagery:

>>> imagecollection = wf.ImageCollection.from_id("sentinel-2:L1C")
>>> img = imagecollection[0]
>>>
>>> img + 1  # adds 1 to every pixel
>>> img + img  # adds corresponding pixels (equivalent to `img * 2`)
>>> img + img.pick_bands("red")  # adds corresponding pixel from the red band to every band
>>> img.pick_bands("red green") + img.pick_bands("red green")  # adds red band to red band, and green band to green band
>>> img.pick_bands("red green") + img.pick_bands("green red")  # same as above---bands are matched up by name
>>>
>>> imagecollection + 1  # adds 1 to every pixel in every Image
>>> imagecollection + imagecollection  # adds corresp. pixels in corresp. Images (equivalent to `imagecollection * 2`)
>>> imagecollection + imagecollection.pick_bands("red")  # adds corresp. pixel from the red band to every band, Image by Image
>>> imagecollection + img  # adds `img` to every Image
>>> imagecollection + img.pick_bands("red")  # adds `img`'s red band to every band in every Image

Image and ImageCollection objects have three attributes that may be broadcast: .ndarray, .properties and .bandinfo (for more information about these attributes and how to access them, see the Proxy and Result Objects section). The following sections cover in detail how Workflows handles combining and broadcasting these attributes.

Broadcasting ndarrays

For an Image the .ndarray property is a 3-dimensional NumPy array (with axes corresponding to: bands, y pixels, and x pixels). For an ImageCollection it is 4-dimensional (with axes corresponding to: images, bands, y pixels, and x pixels). When combining ndarrays, if broadcasting is necessary, it is done according to the NumPy broadcasting guide. (Note that the .ndarray property is not directly accessible on Image or ImageCollection proxy objects. Conceptually it does exist on the backend but only appears on result objects after calling compute(). For more information, see the Proxy and Result Objects section.)

Incorrect ndarray broadcasting

Combining imagery objects with incompatible shapes is an invalid operation. For example, you cannot combine two ImageCollections with different numbers of bands (e.g., the .ndarray shapes (3, 4, 512, 512) and (3, 2, 512, 512) cannot be broadcast together).

You do not need to worry about having different shapes in the x pixels and y pixels dimensions. Since every raster is loaded with the same geocontext, those dimensions will always be the same.
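
The shape rules can be tried out locally with plain NumPy arrays. This is only a sketch of the array-shape part of broadcasting: the arrays below are made-up stand-ins for imagery, and NumPy knows nothing about band names, so the name-based alignment that Workflows performs is not modeled here.

```python
import numpy as np

# Toy shapes: (images, bands, y, x) for a collection, (bands, y, x) for an image.
collection = np.ones((3, 4, 8, 8))
image = np.full((4, 8, 8), 2.0)
one_band = np.full((1, 8, 8), 10.0)

print((collection + 1).shape)         # a scalar broadcasts to everything
print((collection + image).shape)     # an image broadcasts across the images axis
print((collection + one_band).shape)  # one band broadcasts across the bands axis

# 4 bands vs. 2 bands: the shapes cannot be broadcast together.
two_band_collection = np.ones((3, 2, 8, 8))
mismatch_raised = False
try:
    collection + two_band_collection
except ValueError as err:
    mismatch_raised = True
    print("cannot broadcast:", err)
```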

Broadcasting properties

The .properties attribute of an Image is a dictionary; for an ImageCollection it is a list of dictionaries. Combining properties is done by taking the intersection of dictionaries. In the following example, the .properties dictionary of my_image is intersected with every property dictionary of my_imagecollection (i.e., my_image.properties is broadcast to the length of my_imagecollection.properties). The resulting list of properties contains only keys where every dictionary combination had the same value.

>>> my_image.properties.compute(ctx)
{"color": "red", "size": "large"}
>>> my_imagecollection.properties.compute(ctx)
[{"color": "red", "size":"small"}, {"color":"blue", "size": "large"}]
>>> result = my_image + my_imagecollection
>>> result.properties.compute(ctx)
[{"color": "red"}, {"size": "large"}]
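
The intersection rule is easy to reproduce with ordinary dictionaries. The helper below is a hypothetical illustration of the rule as described above, not the backend's code: a key survives only if both dictionaries hold the same value for it.

```python
def intersect(a, b):
    """Keep only the keys whose values are equal in both dictionaries."""
    return {k: v for k, v in a.items() if k in b and b[k] == v}


image_props = {"color": "red", "size": "large"}
collection_props = [
    {"color": "red", "size": "small"},
    {"color": "blue", "size": "large"},
]

# Broadcast the single dictionary against each dictionary in the list.
combined = [intersect(image_props, props) for props in collection_props]
print(combined)  # [{'color': 'red'}, {'size': 'large'}]
```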

Broadcasting bandinfo

For both Image and ImageCollection, the .bandinfo attribute is a dictionary. Bandinfos can only be combined if the imagery objects have the same number of bands with the same names (with one exception, see below). If two imagery objects have the same bands but in different orders, when they are combined, the order of the first operand is preserved. The actual combining of bandinfo is performed in the same way as when combining properties: the intersection is taken of the two bands’ dictionaries. In the following example the .bandinfo of my_image is broadcast to the .bandinfo of my_imagecollection and combined accordingly.

>>> my_image.bandinfo.compute(ctx)
{'red': {'color':'Red', 'product': 'sentinel-2:L1C'}}
>>> my_imagecollection.bandinfo.compute(ctx)
{'red': {'color':'Red', 'product': 'sentinel-2:L1C'}, 'blue': {'color':'Blue', 'product': 'sentinel-2:L1C'}}
>>> result = my_image + my_imagecollection
>>> result.bandinfo.compute(ctx)
{'red': {'product': 'sentinel-2:L1C'}, 'blue': {'product': 'sentinel-2:L1C'}}

The one exception to the above combination rules happens when two objects each have a single band. In this case, the objects can be combined regardless of band name. The resulting imagery object’s band name will be of the format <first_bandname>_<op_name>_<second_bandname> where op_name is the name of the operation performed to combine the two bands.

>>> first = img.pick_bands("red")
>>> second = img.pick_bands("blue")
>>> result = first + second
>>> result.bandinfo.compute(ctx)
{'red_add_blue': {...}}
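
The band-name rules from this section can be collected into one small function. This is a toy model with a hypothetical name, combine_band_names, written only to restate the rules above; it is not how the backend is implemented. It also assumes that two single bands with the same name fall under the ordinary same-names rule.

```python
def combine_band_names(first, second, op_name):
    """Toy model: band names produced by combining two imagery objects."""
    if sorted(first) == sorted(second):
        return list(first)  # same bands in any order: the first operand's order wins
    if len(first) == 1 and len(second) == 1:
        return [f"{first[0]}_{op_name}_{second[0]}"]  # single-band exception
    if len(first) == 1:
        return list(second)  # one band broadcasts to every band of the other operand
    if len(second) == 1:
        return list(first)
    raise ValueError(f"band names do not match: {first} vs. {second}")


print(combine_band_names(["red"], ["blue"], "add"))
print(combine_band_names(["red", "green"], ["green", "red"], "add"))
print(combine_band_names(["red"], ["red", "blue"], "add"))
```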

Interactive Maps in Jupyter

Some Workflows objects can be viewed on an interactive map in a Jupyter Notebook. Rather than calling .compute() with an explicit GeoContext, the workflow is computed on-the-fly for the area you’re viewing in the map.

Workflows Interactive Map

Using Workbench

The fastest and easiest way to get started with interactive maps is to use the Descartes Labs Workbench. There is no installation required; simply log in and create a new Jupyter Notebook. See Usage for instructions on how to use wf.map.

Local Installation

To use the interactive map locally, you will need an installation of either JupyterLab (recommended) or Jupyter Notebook. Installation instructions for both can be found here. Additionally, the map requires ipyleaflet, which is installed when you run pip install --upgrade "descarteslabs[complete]", or which you can install manually with pip install ipyleaflet.

Currently, ipyleaflet requires some additional installation steps to make widgets show up in Jupyter. For JupyterLab:

jupyter labextension install jupyter-leaflet @jupyter-widgets/jupyterlab-manager

If you’re using plain Jupyter Notebook and maps don’t show up, try:

jupyter nbextension enable --py --sys-prefix ipyleaflet

See Usage for instructions on how to use wf.map.

Usage

wf.map is a single MapApp instance that all layers are added to by default. To use the map, one of your first cells should be:

import descarteslabs.workflows as wf
wf.map

This will display the map below that cell. Right-clicking on the map and selecting ‘New View for Output’ will let you move the map into its own tab.

Then, calling visualize will add a new layer to wf.map. (Note that visualize just adds the layer; nothing will show up directly underneath that cell.) Using the provided layer controls, you can adjust scaling, set colormaps for single-band images, perform autoscaling to the current viewport with the “Autoscale” button, and rearrange layers.

Currently, only Image objects can be displayed. To visualize an ImageCollection, first composite it (using mean, min, etc.) into an Image. To visualize vector data, you can rasterize it into an Image.

Troubleshooting

  • Running wf.map shows “A Jupyter widget could not be displayed because the widget state could not be found.”:

    The ipyleaflet Jupyter plugins aren’t installed correctly. Make sure the ipyleaflet Python package is installed in your environment, and follow the installation steps above.

  • You call Image.visualize(), but nothing shows up on the map:

    Sometimes it just takes a while. However, you can try the following steps to verify your environment is correct.

    1. Make sure you’re logged in to your Descartes Labs account. Visit iam.descarteslabs.com, log in, then refresh the Jupyter page.

    2. Verify that you have the latest version of the descarteslabs client installed. You can print descarteslabs.__version__, or open a terminal and run pip freeze | grep descarteslabs. To upgrade your client to the latest version run pip install --upgrade "descarteslabs[complete]". The most recent version of the client is listed on PyPI.

    3. Make sure that the layer visibility is on. This is indicated with a small checkbox at the far left of the layer row under the map. Additionally, the slider to the right of the checkbox controls opacity, with the right-most position indicating 100% opacity.

    4. If the layer visibility and opacity are turned on and you still see nothing, check that the checkerboard button is turned on for the layer, or pass checkerboard=True to visualize. If you see a checkerboard pattern on the map, this means that your job returned empty imagery for that area, which may mean you are viewing the wrong area or could indicate an issue with your code.

    5. Check status.descarteslabs.com for updates about ongoing outages.

If you’re still having trouble, please contact support@descarteslabs.com.

Workflows

A Workflow is a persisted proxy object, plus metadata like name, description, and eventually access controls.

When a workflow is saved on the backend, you or others can link to it in other workflows—much like importing a package in other programming languages:

>>> # continuing from previous example, where `result` is an `Int` proxy object
>>> workflow = result.publish(name="one-plus-one", description="The result of 1 plus 1")
>>> workflow.id
'f8be90ba80990f081cc8460d984ffcbcb1709222c99db052'
>>> workflow.type
descarteslabs.workflows.types.primitives.number.Int

wf.retrieve loads a saved Workflow by ID:

>>> same_workflow = wf.retrieve('f8be90ba80990f081cc8460d984ffcbcb1709222c99db052')
>>> same_workflow.name
"one-plus-one"
>>> same_workflow.description
"The result of 1 plus 1"
>>> same_workflow.type
descarteslabs.workflows.types.primitives.number.Int

Workflow.object contains the actual proxy object, which you can use in your code:

>>> same_workflow.object
<descarteslabs.workflows.types.primitives.number.Int at 0x1152ed110>
>>> (same_workflow.object + 2).compute()
[###############] | Steps: 0/0 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
4

wf.use is a shorthand for wf.retrieve(...).object, and you can use it like an import statement:

>>> two = wf.use('f8be90ba80990f081cc8460d984ffcbcb1709222c99db052')
# `two` is equivalent to `same_workflow.object` from above

Jobs

All computations are executed asynchronously; they will complete in the background without blocking. This execution is represented by the Job object which can stream updates from the running computation including its current status, stage, and progress.

>>> job = result.compute(block=False)
>>> job.id
'626e3036857d492fbc11e7fa09b25f16'

The Job also allows blocking until the result is available.

>>> wf.Job.get("626e3036857d492fbc11e7fa09b25f16").result()
[###############] | Steps: 1/1 | Stage: STAGE_DONE | Status: STATUS_SUCCESS
2
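
If the submit-now, collect-later pattern is unfamiliar, it is the same one used by futures in the Python standard library. The sketch below is an analogy only: it uses concurrent.futures rather than the Workflows Job API, and slow_add is a made-up stand-in for a remote computation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_add(a, b):
    """Stand-in for a computation that runs somewhere else."""
    time.sleep(0.1)
    return a + b


with ThreadPoolExecutor(max_workers=1) as pool:
    # Returns immediately with a handle, much like compute(block=False).
    future = pool.submit(slow_add, 1, 1)
    print("submitted; finished yet?", future.done())

    # Blocks until the value is ready, much like calling result() on a Job.
    value = future.result()
    print(value)
```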

Jobs execute in a queue and may be subject to a delay as the queue size grows. If you desire greater resources or a prioritized queue, please contact support@descarteslabs.com.

NumPy API (Experimental)

The Workflows Array type mimics a NumPy ndarray, supporting vectorized operations, broadcasting, and advanced indexing with the same syntax as NumPy. Workflows also contains a workflows.numpy submodule with equivalent versions of most NumPy ufuncs, and some other routines.

Note

Array is an experimental API. It may be changed in the future, will not necessarily be backwards compatible, and may have unexpected bugs. Please contact us with any feedback!

You can access a Workflows Array from Image.ndarray and ImageCollection.ndarray. You can also construct one from a local NumPy array or list, as long as it’s relatively small (< 10MiB when JSON-serialized as a list):

>>> import descarteslabs.workflows as wf
>>> import numpy as np
>>> np_arr = np.array([[1.1, 2.2, 4.4], [8.8, 9.9, 10.0]])
>>> wf.Array.from_numpy(np_arr)
<descarteslabs.workflows.types.array.array_.Array at 0x...>

>>> imgs = wf.ImageCollection.from_id("sentinel-2:L1C", start_datetime="2018-01-01", end_datetime="2018-03-01")
>>> imgs.ndarray
<descarteslabs.workflows.types.array.masked_array.MaskedArray at 0x...>

>>> arr = wf.Array[wf.Int, 2]([[1, 2, 4], [8, 9, 10]])
>>> arr
<descarteslabs.workflows.types.array.masked_array.MaskedArray at 0x...>

Slicing

Arrays support most of the indexing syntax from NumPy:

>>> arr = wf.Array([[1, 2, 4], [8, 9, 10]])

>>> arr[0]
<descarteslabs.workflows.types.array.array_.Array at 0x...>
>>> arr[0].compute()
array([1, 2, 4])

>>> arr[0, 0]
<descarteslabs.workflows.types.primitives.number.Int at 0x...>
>>> arr[0, 0].compute()
1

>>> arr[:, [0, 2]]
<descarteslabs.workflows.types.array.array_.Array at 0x...>
>>> arr[:, [0, 2]].compute()
array([[ 1,  4],
       [ 8, 10]])

>>> arr[np.newaxis, 0, 1]
<descarteslabs.workflows.types.array.array_.Array at 0x...>
>>> arr[np.newaxis, 0, 1].compute()
array([2])

>>> arr[..., 0]
<descarteslabs.workflows.types.array.array_.Array at 0x...>
>>> arr[..., 0].compute()
array([1, 8])


>>> (arr > 3)
<descarteslabs.workflows.types.array.array_.Array at 0x...>
>>> (arr > 3).compute()
array([[False, False,  True],
       [ True,  True,  True]])

>>> arr[arr > 3]
<descarteslabs.workflows.types.array.array_.Array at 0x...>
>>> arr[arr > 3].compute()
array([ 4,  8,  9, 10])

Array supports:

  • Slicing by integers and slices: x[0, :5]

  • Slicing by lists/arrays of integers: x[[1, 2, 4]]

  • Slicing by lists/arrays of booleans: x[[False, True, True, False, True]]

  • Slicing one Array with an Array of bools: x[x > 0]

  • Slicing one Array with a zero or one-dimensional Array of ints: a[b]

The only unsupported indexing operations are:

  • Lists or arrays in multiple axes (x[[1, 2, 3], [3, 2, 1]])

  • Slicing with a multi-dimensional Array of ints
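
For readers less familiar with NumPy's indexing vocabulary, the sketch below shows what each form in the lists above means, using a plain NumPy array. It runs locally with NumPy only and does not call Workflows; the last two forms work in NumPy but are the ones listed as unsupported on a Workflows Array.

```python
import numpy as np

x = np.arange(20).reshape(4, 5)

# Forms listed as supported, shown with their NumPy meaning:
print(x[0, :3])                       # integers and slices
print(x[[1, 3]])                      # list of integers along one axis (rows 1 and 3)
print(x[[True, False, True, False]])  # list of booleans along one axis (rows 0 and 2)
print(x[x > 15])                      # boolean array with the same shape as x

# Forms listed as unsupported in Workflows (valid in NumPy):
pointwise = x[[1, 2, 3], [3, 2, 1]]       # lists in multiple axes pick x[1, 3], x[2, 2], x[3, 1]
gathered = x[np.array([[0, 1], [2, 3]])]  # multi-dimensional integer index
print(pointwise)
print(gathered.shape)
```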

Operations and ufuncs

The workflows.numpy submodule contains equivalents of most of the NumPy ufuncs (elementwise numerical operators like np.add, np.sqrt, etc.), and some other routines. You can use them on proxy types (Array, Int, Float, Bool) as well as NumPy arrays and Python scalars (int, float). They always return proxy types.

Additionally, you can use the actual NumPy version of any of these on a Workflows Array; internally, NumPy will just dispatch to the Workflows version:

>>> import numpy as np
>>> arr = wf.Array.from_numpy(np.arange(4))

>>> wf.numpy.square(arr)
<descarteslabs.workflows.types.array.array_.Array at 0x7ff153cfead0>
>>> wf.numpy.square(arr).compute()
array([0, 1, 4, 9])

>>> np.square(arr)  # still returns Workflows array, even though using the NumPy function
<descarteslabs.workflows.types.array.array_.Array at 0x7ff153cfead0>
>>> np.square(arr).compute()
array([0, 1, 4, 9])
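
NumPy lets other libraries intercept ufunc calls through the __array_ufunc__ hook, which is how a call like np.square can return a non-NumPy object. The toy class below illustrates that hook. LazyArray is a hypothetical name invented for this sketch, and whether the Workflows Array relies on exactly this mechanism is an assumption, not something this guide states.

```python
import numpy as np


class LazyArray:
    """Toy array proxy (hypothetical): records ufunc calls instead of running them."""

    def __init__(self, data, ops=()):
        self._data = np.asarray(data)
        self._ops = tuple(ops)  # pending ufuncs, applied in order by compute()

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # NumPy calls this hook when a ufunc receives a LazyArray argument.
        # For brevity, only plain unary calls such as np.square(arr) are handled.
        if method != "__call__" or len(inputs) != 1 or kwargs:
            return NotImplemented
        return LazyArray(self._data, self._ops + (ufunc,))

    def compute(self):
        result = self._data
        for ufunc in self._ops:
            result = ufunc(result)
        return result


arr = LazyArray(np.arange(4))
squared = np.square(arr)       # NumPy hands the call to LazyArray.__array_ufunc__
print(type(squared).__name__)  # LazyArray
print(squared.compute())       # [0 1 4 9]
```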

Interacting with Imagery

Arrays and raster objects (Image and ImageCollection) are not directly interoperable (arr + img won’t work, for example).

However, you can access the array from a raster object with the ndarray field. And you can turn an Array back into an Image or ImageCollection with Array.to_imagery. Note that Array.to_imagery always returns an ImageCollection even if the Array is only 3D. If you are expecting an Image, you can index into the result like my_col[0]:

>>> imgs = wf.ImageCollection.from_id("sentinel-2:L1C", start_datetime="2018-01-01", end_datetime="2018-03-01")
>>> rgb = imgs.pick_bands("red green blue")

>>> spectral_target = [0.2, 0.3, 0.4]  # per-band spectral targets
>>> spectral_target_arr = wf.Array(spectral_target)

>>> delta = rgb.ndarray - spectral_target_arr[None, :, None, None]
>>> # ^ must expand to 4 dimensions to align with the ImageCollection's 4D Array

>>> delta_std = delta.std(axis=0)  # std deviation over axis 0, aka `axis="images"`

>>> delta_img = delta_std.to_imagery()[0]  # no properties/bandinfo given
>>> delta_img.compute(wf.map.geocontext())
ImageResult:
  * ndarray: MaskedArray<shape=(3, 135, 398), dtype=float64>
  * properties:
  * bandinfo: 'band_1', 'band_2', 'band_3'
  * geocontext: 'geometry', 'resolution', 'crs', 'bounds', ...
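
The axis bookkeeping in this example can be checked locally with plain NumPy, since the Array type follows the same indexing and broadcasting rules. The sketch below uses random numbers in place of real imagery; the array sizes are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
stack = rng.random((5, 3, 4, 4))             # (images, bands, y, x), like an ImageCollection's ndarray
spectral_target = np.array([0.2, 0.3, 0.4])  # one target value per band

# Add length-1 axes so the 1D targets line up with the bands axis of the 4D stack.
target_4d = spectral_target[None, :, None, None]
print(target_4d.shape)

delta = stack - target_4d
delta_std = delta.std(axis=0)  # collapse the images axis, leaving (bands, y, x)
print(delta_std.shape)
```

The result is three-dimensional, which is why the example above indexes the output of to_imagery with [0] to get a single Image.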

Array.to_imagery will create empty metadata for the raster object if you don’t pass any in, defaulting to empty properties and bands named band_1 through band_N. But when appropriate, you can pass in specific metadata:

>>> delta_img_with_metadata = delta_std.to_imagery({"foo": "bar"}, {"red_d": {}, "green_d": {}, "blue_d": {}})[0]
>>> delta_img_with_metadata.compute(wf.map.geocontext())
ImageResult:
  * ndarray: MaskedArray<shape=(3, 135, 398), dtype=float64>
  * properties: 'foo'
  * bandinfo: 'red_d', 'green_d', 'blue_d'
  * geocontext: 'geometry', 'resolution', 'crs', 'bounds', ...

Continue to Workflows API Reference.