Sharing
Just as most programming languages support importing packages created by others, or creating packages yourself, Workflows has a built-in mechanism to share your analyses so that others can reuse and build on top of them.
With publish, you can share any object you create in the Workflows client as a Workflow with a specific ID (indeed, that’s where the name “Workflows” comes from). Then, other users (or you) can call use with that ID and get back the same object, on any computer, without actually running or even seeing your Python code.
>>> import descarteslabs.workflows as wf
>>> number = wf.Int(42)
>>> number.publish("you@email.com:magic_number", "1.0.0")
>>> same_number = wf.use("you@email.com:magic_number", "1.0.0")
>>> same_number.inspect()
42
Since objects in Workflows are really just sets of instructions, this lets you create “virtual products”:
>>> cropland_data_layer = wf.ImageCollection.from_id("usda:cdl:v1")
>>> is_corn = (cropland_data_layer == 1) | (cropland_data_layer == 12) | (cropland_data_layer == 13)
>>> is_corn.publish(
...     id="you@email.com:corn_mask",
...     version="1.0.0",
...     title="Annual Corn Mask",
...     description="Binary mask of corn in the U.S.",
...     labels={"source_product": "usda:cdl:v1"},
...     tags=["cdl", "corn", "agriculture"],
...     docstring="1-band ImageCollection of corn presence, derived from USDA's CDL. True means corn."
... )
Or share specialized algorithms as Functions so others don’t have to re-implement them:
>>> @wf.publish(
...     id="you@email.com:mask_clouds",
...     version="1.0.0",
...     title="Apply cloud mask bands",
...     tags=["cloud-mask"]
... )
... def mask_by_cloud_bands(ic: wf.ImageCollection):
...     "Mask an ImageCollection by all of the commonly-named cloud mask bands"
...     cloud_bands = ic.pick_bands("cloud-mask cloud-mask-0 cirrus-cloud-mask", allow_missing=True)
...     cloud_mask = cloud_bands.sum(axis="bands") > 0
...     return ic.mask(cloud_mask)
You can combine Workflows together, building more complex analyses out of components others have developed:
>>> modis = wf.ImageCollection.from_id("modis:09:v2", start_datetime="2020-01-01", end_datetime="2021-01-01")
>>> mask_clouds = wf.use("you@email.com:mask_clouds", "1.0.0")
>>> cloud_masked_modis = mask_clouds(modis)
>>> corn_mask = wf.use("you@email.com:corn_mask", "1.0.0")
>>> corn_mask_2020 = corn_mask.filter(lambda img: img.properties["date"].year == 2020)
>>> corny_modis = cloud_masked_modis.mask(~corn_mask_2020)
>>> corny_ndvi_values = corny_modis.pick_bands("ndvi").mean(axis=("pixels", "bands"))
>>> dates = corny_modis.properties.map(lambda p: p["date"])
>>> corny_ndvi_timeseries = wf.zip(dates, corny_ndvi_values)
And then publish those as new Workflows, built out of existing Workflows:
>>> corny_ndvi_timeseries.publish(
...     id="you@email.com:corny_ndvi_timeseries",
...     version="0.0.1",
...     title="2020 corn mean NDVI timeseries",
...     tags=["cdl", "agriculture", "bad jokes"],
...     docstring="List of tuples of (Datetime, mean NDVI value)",
... )
Workflows and VersionedGrafts
A Workflow is a container that holds multiple immutable versions. Think of a Workflow like a library or package: it does one distinct thing, but may change over time with bug fixes or improvements. (Though unlike a typical software library, a Workflow should only hold one thing, not a collection of related things.)
Workflows have globally-unique IDs, which follow the format you@email.com:workflow_name where you@email.com is your email address, and workflow_name is any string starting with a letter and containing only letters, numbers, and underscores. Once set, the ID cannot be changed (though you could use Workflow.duplicate to copy it to a new ID). Workflows also have a few metadata fields, which can change over time, such as title, description, labels, and tags.
>>> corn_mask_workflow = wf.Workflow.get("you@email.com:corn_mask")
>>> corn_mask_workflow
Workflow: "Annual Corn Mask"
- id: you@email.com:corn_mask
- labels: {'source_product': 'usda:cdl:v1'}
- tags: ['cdl', 'corn', 'agriculture']
- versions: '1.0.0'
Binary mask of corn in the U.S.
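To make the ID format concrete, here is a minimal sketch of a checker written only from the description above. The function name is_valid_workflow_id and the regular expression are assumptions for illustration, not part of the Workflows client, and the service may well apply stricter or different rules (for instance, around what counts as a valid email address or a letter).

```python
import re

# Assumed pattern, derived only from the prose description:
# "<email>:<name>", where <name> starts with an ASCII letter and then
# contains only ASCII letters, digits, and underscores.
# The email part is checked very loosely on purpose.
WORKFLOW_ID_PATTERN = re.compile(r"[^@\s:]+@[^@\s:]+:[A-Za-z][A-Za-z0-9_]*")


def is_valid_workflow_id(workflow_id: str) -> bool:
    """Return True if workflow_id looks like 'you@email.com:workflow_name'."""
    return WORKFLOW_ID_PATTERN.fullmatch(workflow_id) is not None


print(is_valid_workflow_id("you@email.com:corn_mask"))  # True
print(is_valid_workflow_id("you@email.com:2020_mask"))  # False: name starts with a digit
print(is_valid_workflow_id("corn_mask"))                # False: no email prefix
```

A local check like this can only catch obvious mistakes before you call publish; the service remains the authority on which IDs are accepted.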
A Workflow contains multiple versions, called VersionedGrafts. Once a version is created, it’s immutable and cannot be edited in any way. If you have a reason to change it, you should create a new version instead. This immutability ensures that if your code is referencing another Workflow, it can’t change out from under you: if you don’t change your code, it will always behave the same way. (This is also why there’s no way to use the “latest” version of a Workflow: since “latest” could change at any point, your code could change or break unexpectedly.)
Versions are named following semantic versioning, like 1.0.0. Versions also have some metadata, like a docstring and labels. This metadata is immutable as well.
>>> vg = corn_mask_workflow["1.0.0"]
>>> vg
VersionedGraft: 1.0.0
- type: ImageCollection
- channel: master
- labels: {}
- viz_options: []
1-band ImageCollection of corn presence, derived from USDA's CDL. True means corn.
When you access the object stored for that version, you can use it just like any other object in Workflows:
>>> vg.object
<descarteslabs.workflows.types.geospatial.imagecollection.ImageCollection at 0x7f7b067de510>
>>> corn_count = vg.object.sum()
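Because there is no “latest” version, you may find yourself comparing version names by hand when deciding which one to pin. The sketch below shows one way to order plain MAJOR.MINOR.PATCH strings numerically in pure Python. The helper names parse_version and sort_versions are hypothetical, and the sketch deliberately ignores the pre-release and build-metadata suffixes that semantic versioning also allows.

```python
from typing import List, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


def sort_versions(versions: List[str]) -> List[str]:
    """Sort version strings numerically rather than alphabetically."""
    return sorted(versions, key=parse_version)


versions = ["1.10.0", "1.2.0", "0.0.1", "2.0.0"]
print(sort_versions(versions))  # ['0.0.1', '1.2.0', '1.10.0', '2.0.0']
print(sorted(versions))         # ['0.0.1', '1.10.0', '1.2.0', '2.0.0']
```

The second line of output shows why a plain string sort is not enough: alphabetically, 1.10.0 sorts before 1.2.0.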
Publishing Workflows
Using wf.publish, or calling .publish on any object, is the easiest way to create a Workflow, or add a new version to it.
When you call publish, if the Workflow ID doesn’t exist yet, it’s created and your new version is added.
>>> food = wf.Str("grapes")
>>> food.publish("you@email.com:favorite_food", "1.0.0")
Workflow: ""
- id: you@email.com:favorite_food
- labels: {}
- tags: []
- versions: '1.0.0'
If the Workflow ID already exists, then your new version is added. All the mutable fields on the Workflow (like title or description) are also updated to whatever you set (or cleared, if they’re not set).
>>> better_food = wf.Str("pizza")
>>> better_food.publish(
...     id="you@email.com:favorite_food",
...     version="2.0.0",
...     title="My favorite food",
...     description="I like this food the most"
... )
Workflow: "My favorite food"
- id: you@email.com:favorite_food
- labels: {}
- tags: []
- versions: '1.0.0', '2.0.0'
I like this food the most
Note that once a version is created, you can’t overwrite it:
>>> best_food = wf.Str("falafel")
>>> best_food.publish("you@email.com:favorite_food", "2.0.0")
---------------------------------------------------------------------------
AlreadyExists Traceback (most recent call last)
...
AlreadyExists: 409 Version '2.0.0' already exists with a different graft
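The publish rules above can be summarized with a toy, in-memory model. This is only a sketch of the behavior described in this section, not how the service is implemented: ToyRegistry, ToyWorkflow, and this AlreadyExists class are hypothetical stand-ins. The toy also treats re-publishing identical content under an existing version as a no-op, which is an assumption suggested only by the wording of the error message above.

```python
from typing import Dict


class AlreadyExists(Exception):
    """Stand-in for the error shown above; not the client's real class."""


class ToyWorkflow:
    def __init__(self, id: str):
        self.id = id
        self.title = ""
        self.description = ""
        self.versions: Dict[str, object] = {}  # version name -> stored object


class ToyRegistry:
    def __init__(self):
        self.workflows: Dict[str, ToyWorkflow] = {}

    def publish(self, id, version, obj, title="", description=""):
        # Create the Workflow on first publish, otherwise reuse it.
        workflow = self.workflows.setdefault(id, ToyWorkflow(id))
        # Existing versions are immutable: refuse to overwrite with new content.
        if version in workflow.versions and workflow.versions[version] != obj:
            raise AlreadyExists(
                f"Version {version!r} already exists with a different graft"
            )
        workflow.versions[version] = obj
        # Mutable fields are replaced by whatever was passed (or cleared).
        workflow.title = title
        workflow.description = description
        return workflow


registry = ToyRegistry()
registry.publish("you@email.com:favorite_food", "1.0.0", "grapes")
food = registry.publish(
    "you@email.com:favorite_food", "2.0.0", "pizza", title="My favorite food"
)
print(sorted(food.versions))  # ['1.0.0', '2.0.0']
print(food.title)             # My favorite food

try:
    registry.publish("you@email.com:favorite_food", "2.0.0", "falafel")
except AlreadyExists as error:
    print(error)  # Version '2.0.0' already exists with a different graft
```

Note that the title passed with version 2.0.0 replaces the empty title left by the first publish; publishing again without a title would clear it.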
Sharing Workflows
By default, workflows are private and are only accessible to the user (a.k.a. the owner) that created the workflow. However, the owner can choose to share the workflow with other users so that they are also able to access and execute the workflow:
>>> corn_mask_workflow = wf.Workflow.get("you@email.com:corn_mask")
>>> corn_mask_workflow.add_reader("jack@email.com")
>>> corn_mask_workflow.add_reader("jill@email.com")
>>> for reader in corn_mask_workflow.list_readers():
...     print(reader)
jack@email.com
jill@email.com
>>> corn_mask_workflow.remove_reader("jack@email.com")
>>> for reader in corn_mask_workflow.list_readers():
...     print(reader)
jill@email.com
It is also possible to share a workflow with all users of the Descartes Labs Platform, but additional permissions are required to do so:
>>> corn_mask_workflow = wf.Workflow.get("you@email.com:corn_mask")
>>> corn_mask_workflow.add_public_reader() # You need additional permissions
>>> corn_mask_workflow.has_public_reader()
True
>>> corn_mask_workflow.remove_public_reader()
>>> corn_mask_workflow.has_public_reader()
False
Please contact Descartes Labs if you have a need to share workflows publicly.
When you share a workflow (either with a specific reader or with the general public) you are still the only one who can edit it, add new versions, or delete it. When you grant access to another user by sharing it, that user can retrieve, search for, use, or execute the workflow. However, the user must also have access rights for any data or imagery accessed by the workflow.
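The access rules in the paragraph above can be written down as a small toy model. The ToyAcl class below is hypothetical and exists only to illustrate who can read and who can edit; it does not model the separate requirement that a reader also has access to the underlying data or imagery, and it is not how the platform enforces permissions.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ToyAcl:
    """Toy access-control record for one workflow (illustration only)."""

    owner: str
    readers: Set[str] = field(default_factory=set)
    public: bool = False

    def can_read(self, user: str) -> bool:
        # The owner, any listed reader, or anyone at all if shared publicly.
        return self.public or user == self.owner or user in self.readers

    def can_write(self, user: str) -> bool:
        # Only the owner can edit, add versions, or delete.
        return user == self.owner


acl = ToyAcl(owner="you@email.com")
acl.readers.add("jill@email.com")
print(acl.can_read("jill@email.com"))   # True
print(acl.can_write("jill@email.com"))  # False
print(acl.can_read("jack@email.com"))   # False

acl.public = True
print(acl.can_read("jack@email.com"))   # True
print(acl.can_write("jack@email.com"))  # False
```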
Importing Workflows
To import a Workflow, use wf.use, specifying the Workflow ID and version. This returns the object at that version, ready for you to use in your code. Think of it like an import statement. (In fact, we would have called it import, but that’s a reserved word in Python.)
>>> favorite_food = wf.use("you@email.com:favorite_food", "2.0.0")
>>> favorite_food.inspect()
'pizza'
>>> old_favorite_food = wf.use("you@email.com:favorite_food", "1.0.0")
>>> old_favorite_food.inspect()
'grapes'
If you want to look at the Workflow or VersionedGraft object itself, you can use Workflow.get or VersionedGraft.get.
Publishing Functions
In many real-world cases, you don’t want to publish a static object, but rather something users can run on their own data, or adjust by passing in their own parameters.
When wf.publish is used as a function decorator, it converts your Python function into a Workflows Function and publishes it:
>>> @wf.publish("you@email.com:say_hello", "0.0.1")
... def say_hello(name: wf.Str):
...     return "Hello, " + name
>>> greeter = wf.use("you@email.com:say_hello", "0.0.1")
>>> greeter("friend").inspect()
'Hello, friend'
>>> greeter(greeter("again")).inspect()
'Hello, Hello, again'
Note
You must use type annotations on all of your function arguments
In def say_hello(name: wf.Str), the : wf.Str part is a type annotation. Since Workflows is a strongly-typed system, you have to use type annotations in functions you publish. Additionally, you must only give Workflows types as type annotations: use wf.Str, not str, for example.
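If you want to catch a missing annotation before publishing, you could check a function yourself with the standard library. The sketch below is not how Workflows validates functions: unannotated_parameters is a hypothetical helper, and the Str class is a stand-in for wf.Str so that the example runs without the client installed.

```python
import inspect
from typing import Callable, List


class Str:
    """Hypothetical stand-in for wf.Str so the sketch runs without the client."""


def unannotated_parameters(func: Callable) -> List[str]:
    """Return the names of parameters that have no type annotation."""
    signature = inspect.signature(func)
    return [
        name
        for name, param in signature.parameters.items()
        if param.annotation is inspect.Parameter.empty
    ]


def say_hello(name: Str):
    return "Hello, " + name


def say_hello_untyped(name, punctuation: Str):
    return "Hello, " + name + punctuation


print(unannotated_parameters(say_hello))          # []
print(unannotated_parameters(say_hello_untyped))  # ['name']
```

This only detects that an annotation is missing. It does not check that each annotation is a Workflows type rather than a built-in like str.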
Publishing Functions, more easily
When working in a Jupyter notebook, it can be inconvenient to convert your exploratory code, spread across many cells, into one function.
Instead, if you use wf.widgets or wf.parameter for things that you’d want to be function arguments, when you call publish on an object, it’ll be automatically converted to a Function, where the function arguments are whatever widgets/parameters that object depended on.
>>> word = wf.parameter("word", wf.Str)
>>> repeats = wf.widgets.slider("repeats", min=0, max=5, step=1)
>>> repeated = (word + " ") * repeats
>>> repeated.publish("you@email.com:repeater", "1.0.0")
>>> repeater = wf.use("you@email.com:repeater", "1.0.0")
>>> # our object was converted into a wf.Function because it depended on parameters
>>> repeater
<descarteslabs.workflows.types.function.function.Function[{'word': Str, 'repeats': Int}, Str] at 0x7f7b040abd90>
>>> repeater("hello", 3).inspect()
'hello hello hello '
This is equivalent to:
>>> @wf.publish("you@email.com:repeater", "1.0.0")
... def repeater(word: wf.Str, repeats: wf.Int):
...     return (word + " ") * repeats
Using widgets to create published Functions supports an iterative, interactive process in Jupyter notebooks. While prototyping an analysis, you might add wf.widgets to your code to explore how tweaking a threshold changes the output, or how your model performs on different date ranges or products, by changing the widgets and seeing the results update live on wf.map. As you refine your code and want to share it for others to reuse, many of those widgets may in fact be the parameters you’d want users to be able to control as well. Then you don’t have to restructure your code in order to publish it—in fact, you can even leave your visualization and debugging code in the notebook, so if you need to revise the algorithm and publish a new version in the future, you can pick up where you left off.
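To build intuition for how an object that depends on parameters can become a function, here is a toy model in pure Python. It is a sketch of the general idea only (deferred expressions that remember which parameters they depend on) and makes no claim about how Workflows represents grafts internally. Expr, parameter, constant, and to_function are all hypothetical names.

```python
import operator
from typing import Callable, Dict, List


class Expr:
    """A deferred computation that remembers the parameters it depends on."""

    def __init__(self, evaluate: Callable[[Dict[str, object]], object], params: List[str]):
        self.evaluate = evaluate
        self.params = params  # parameter names, in order of first use

    def _binary(self, other, op):
        rhs = other if isinstance(other, Expr) else constant(other)
        # Merge parameter lists, keeping first-use order and dropping repeats.
        params = self.params + [p for p in rhs.params if p not in self.params]
        return Expr(lambda env: op(self.evaluate(env), rhs.evaluate(env)), params)

    def __add__(self, other):
        return self._binary(other, operator.add)

    def __mul__(self, other):
        return self._binary(other, operator.mul)

    def to_function(self):
        """Turn the expression into a function of its parameters."""
        def function(*args):
            if len(args) != len(self.params):
                raise TypeError(f"expected arguments for {self.params}")
            return self.evaluate(dict(zip(self.params, args)))
        return function


def constant(value) -> Expr:
    return Expr(lambda env: value, [])


def parameter(name: str) -> Expr:
    return Expr(lambda env: env[name], [name])


word = parameter("word")
repeats = parameter("repeats")
repeated = (word + " ") * repeats

print(repeated.params)                # ['word', 'repeats']
repeater = repeated.to_function()
print(repr(repeater("hello", 3)))     # 'hello hello hello '
```

As in the published example, the function's arguments are simply the parameters the final expression depends on, in the order they were first used.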