Tasks

Classes:

Tasks([url, auth, retries])

The Tasks API allows you to easily execute parallel computations on cloud infrastructure with high-throughput access to imagery.

FutureTask(guid, tuid[, client, args, kwargs])

A submitted task which may or may not have completed yet.

CloudFunction(group_id[, name, client, …])

Represents the asynchronous function of a task group.

Exceptions:

TransientResultException

alias of descarteslabs.common.tasks.futuretask.TransientResultError

GroupTerminalException

Raised when waiting on a task group that stopped accepting tasks.

BoundGlobalError

Raised when a global is referenced in a function where it won’t be available when executed remotely.

Functions:

as_completed(tasks[, show_progress])

Yields completed tasks from the list of given tasks as they become available, finishing when all given tasks have been completed.

TransientResultException

alias of descarteslabs.common.tasks.futuretask.TransientResultError

exception GroupTerminalException[source]

Raised when waiting on a task group that stopped accepting tasks.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
exception BoundGlobalError[source]

Raised when a global is referenced in a function where it won’t be available when executed remotely.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

args
class Tasks(url=None, auth=None, retries=None)[source]

The Tasks API allows you to easily execute parallel computations on cloud infrastructure with high-throughput access to imagery.

Parameters
  • url (str) – URL for the tasks service. Only change this if you are being asked to use a non-default Descartes Labs tasks service. If not set, then descarteslabs.config.get_settings().TASKS_URL will be used.

  • auth (Auth) – A custom user authentication (defaults to the user authenticated locally by token information on disk or by environment variables)

  • retries (urllib3.util.retry.Retry) – A custom retry configuration used for all API requests (defaults to a reasonable amount of retries)

Attributes:

ADAPTER

COMPLETION_POLL_INTERVAL_SECONDS

CONNECT_TIMEOUT

READ_TIMEOUT

RERUN_BATCH_SIZE

RETRY_CONFIG(*args, **kwargs)

Retry configuration.

TASK_RESULT_BATCH_SIZE

TIMEOUT

session

The session instance used by this service.

token

The bearer token used in the requests.

Methods:

create_function(f, image[, name, cpus, …])

Creates a new task group from a function and returns an asynchronous function that can be called to submit tasks to the group.

create_namespace()

This method has been deprecated. Manually creating a namespace is no longer required.

create_or_get_function(f, image[, name, …])

This method has been deprecated. Please use create_function() or get_function_by_id() instead.

create_webhook(group_id[, name, label_path, …])

Create a new webhook for submitting tasks to a task group.

delete_group_by_id(group_id)

Terminates a task group by id.

delete_webhook(group_id, webhook_id)

Delete an existing webhook.

get_default_session_class()

Get the default session class for Service.

get_function(name)

This method has been deprecated. Please use get_function_by_id() instead.

get_function_by_id(group_id)

Get an asynchronous function by group id.

get_group(group_id[, include])

Retrieves a single task group by id.

get_group_by_id(group_id[, include])

Retrieves a single task group by id.

get_group_by_name(name[, status])

Retrieves a single task group by name.

get_task_result(group_id, task_id[, include])

Retrieves a single task result.

get_task_result_batch(group_id, task_ids[, …])

Retrieves multiple task results by id.

get_task_results(group_id[, limit, offset, …])

Retrieves a portion of task results matching the given criteria.

get_webhook(webhook_id)

Returns a webhook’s configuration.

get_webhooks(group_id)

List all webhooks for a task group.

iter_groups([status, created, updated, …])

Iterates over all task groups matching the given criteria.

iter_task_results(group_id[, status, …])

Iterates over all task results matching the given criteria.

list_groups([status, created, updated, …])

Retrieves a limited list of task groups matching the given criteria.

list_task_results(group_id[, limit, offset, …])

Retrieves a portion of task results matching the given criteria.

list_webhooks(group_id)

List all webhooks for a task group.

new_group(function, container_image[, name, …])

Creates a new task group.

new_task(group_id[, arguments, parameters, …])

Submits a new task to a group.

new_tasks(group_id[, list_of_arguments, …])

Submits multiple tasks to a group.

rerun_failed_tasks(group_id[, retry_count])

Submits all failed tasks for a rerun, except for out-of-memory or version mismatch failures.

rerun_matching_tasks(group_id[, status, …])

Submits all completed tasks matching the given search arguments for a rerun.

rerun_tasks(group_id, task_id_iterable[, …])

Submits a list of completed tasks specified by ids for a rerun.

set_default_session_class(session_class)

Set the default session class for Service.

terminate_group(group_id)

Terminates a task group by id.

update_credentials()

Updates the credentials for the tasks run by this user.

update_group(group_id[, container_image, …])

Update attributes of a group.

wait_for_completion(group_id[, show_progress])

Waits until all submitted tasks for a given group are completed.

create_function(f, image, name=None, cpus=1, gpus=0, memory='2Gi', maximum_concurrency=None, minimum_concurrency=None, minimum_seconds=None, task_timeout=1800, retry_count=0, include_modules=None, include_data=None, requirements=None, **kwargs)[source]

Creates a new task group from a function and returns an asynchronous function that can be called to submit tasks to the group.

Parameters
  • f (function) – The function to be called in a task.

  • image (str) – The location of a docker image to be used for the environment in which the function is executed.

  • name (str) – An optional name used to later help identify the function.

  • cpus (int) – The number of CPUs requested for a single task. A task might be throttled if it uses more CPU. Default: 1. Maximum: 32.

  • gpus (int) – The number of GPUs requested for a single task. As of right now, a maximum of 1 GPU is supported. Default: 0. Maximum: 1.

  • memory (str) – The maximum memory requirement for a single task. If a task uses substantially more memory it will be killed. The value should be a string and can use postfixes such as Mi, Gi, MB, GB, etc (e.g. “4Gi”, “500MB”). If no unit is specified it is assumed to be in bytes. Default: 2Gi. Maximum: 96Gi.

  • maximum_concurrency (int) – The maximum number of tasks to run in parallel. Default: 5. Maximum: 500. If you need higher concurrency contact your Descartes Labs customer success representative.

  • minimum_concurrency (int) – The minimum number of tasks to run right away in parallel. Concurrency is usually scaled up slowly when submitting new tasks. Setting this can mean more immediate processing of this many newly submitted tasks. Note that setting this means the equivalent resources of this many permanently running tasks will be charged to your account while this group is active. Default: 0. Maximum: 4.

  • minimum_seconds (int) – The number of seconds to wait for new tasks before scaling down concurrency, after a task is finished. Default: 0. Maximum: 600.

  • task_timeout (int) – Maximum runtime for a single task in seconds. A task will be killed if it exceeds this limit. Default: 30 minutes. Minimum: 10 seconds. Maximum: 24 hours.

  • retry_count (int) – Number of times to retry a task if it fails. Default: 0. Maximum: 5.

  • include_modules (list(str)) – Locally importable Python (or Cython) names to include as modules in the task group, which can be imported by the entrypoint function, f.

  • include_data (list(str)) – Non python data files to include in the task group. Data path must be descendant of system path or python path directories.

  • requirements (list(str)) – A list of Python dependencies required by this function or a path to a file listing those dependencies, in standard setuptools notation (see PEP 508 https://www.python.org/dev/peps/pep-0508/). For example, if the packages foo and bar are required, then [‘foo’, ‘bar’] or [‘foo>2.0’, ‘bar>=1.0’] might be possible values.

Returns

A CloudFunction.

Return type

CloudFunction

Raises

BadRequest – Raised if any of the supplied parameters are invalid.
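As a rough illustration of the call pattern described above, the following sketch creates a task group from a small function and submits one task per input value. Here client is assumed to be a Tasks instance, the helper name submit_squares and the group name "square-example" are hypothetical, and each call to the returned asynchronous function is presumed to hand back a FutureTask.

```python
def submit_squares(client, image, values):
    """Create a task group for a small function and submit one task per value.

    `client` is assumed to be a Tasks instance; `image` is the docker image
    location expected by create_function().
    """

    def square(x):
        # Defined locally so that it references no module-level globals.
        return x * x

    async_square = client.create_function(
        square,
        image=image,
        name="square-example",  # hypothetical name, used only for identification
        cpus=1,
        memory="2Gi",
        maximum_concurrency=5,
        task_timeout=1800,
        retry_count=1,
    )
    # Each call submits one task to the newly created group.
    return [async_square(value) for value in values]
```

The parameter values shown are the documented defaults except for retry_count, which is raised to 1 only to show where it goes.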

create_namespace()[source]

This method has been deprecated. Manually creating a namespace is no longer required.

Creates a namespace for the user and sets up authentication within it from the current client id and secret. Must be called once per user before creating any tasks.

Returns

True if successful, False otherwise.

create_or_get_function(f, image, name=None, cpus=1, gpus=0, memory='2Gi', maximum_concurrency=None, minimum_concurrency=None, minimum_seconds=None, task_timeout=1800, retry_count=0, **kwargs)[source]

This method has been deprecated. Please use create_function() or get_function_by_id() instead.

Creates or gets an asynchronous function. If a task group with the given name exists, returns an asynchronous function for the newest existing group with that name. Otherwise creates a new task group.

Parameters
  • f (function) – The function to be called in a task.

  • image (str) – The location of a docker image to be used for the environment in which the function is executed.

  • name (str) – An optional name used to later help identify the function.

  • cpus (int) – The number of CPUs requested for a single task. A task might be throttled if it uses more CPU. Default: 1. Maximum: 32.

  • gpus (int) – The number of GPUs requested for a single task. As of right now, a maximum of 1 GPU is supported. Default: 0. Maximum: 1.

  • memory (str) – The maximum memory requirement for a single task. If a task uses substantially more memory it will be killed. The value should be a string and can use postfixes such as Mi, Gi, MB, GB, etc (e.g. “4Gi”, “500MB”). If no unit is specified it is assumed to be in bytes. Default: 2Gi. Maximum: 96Gi.

  • maximum_concurrency (int) – The maximum number of tasks to run in parallel. Default: 5. Maximum: 500. If you need higher concurrency contact your Descartes Labs customer success representative.

  • minimum_concurrency (int) – The minimum number of tasks to run right away in parallel. Concurrency is usually scaled up slowly when submitting new tasks. Setting this can mean more immediate processing of this many newly submitted tasks. Note that setting this means the equivalent resources of this many permanently running tasks will be charged to your account while this group is active. Default: 0. Maximum: 4.

  • minimum_seconds (int) – The number of seconds to wait for new tasks before scaling down concurrency, after a task is finished. Default: 0. Maximum: 600.

  • task_timeout (int) – Maximum runtime for a single task in seconds. A task will be killed if it exceeds this limit. Default: 30 minutes. Minimum: 10 seconds. Maximum: 24 hours.

  • retry_count (int) – Number of times to retry a task if it fails. Default: 0. Maximum: 5.

Returns

A CloudFunction.

Return type

CloudFunction

Raises

BadRequest – Raised if any of the supplied parameters are invalid.

create_webhook(group_id, name=None, label_path=None, label_separator=None)[source]

Create a new webhook for submitting tasks to a task group.

Once a POST request is made to the webhook’s URL, a new task will be submitted. If the request contains a valid JSON payload, that payload will be used as the function’s parameters (i.e, f(**payload)).

Optionally, label_path and label_separator provide a way to attach labels to the submitted task for future filtering. The labels will be extracted correspondingly from the request payload.

For example, given an invocation {“a”: {“b”: “foo, bar”}} with label_path set to a.b and label_separator set to a comma (,), the labels foo and bar will be attached to the task. Note that the field used for labels will not be removed from the invocation of the function.

Parameters
  • group_id (str) – The task group id.

  • name (str) – Desired name for the webhook.

  • label_path (str) – An optional path to the field to be used as the task’s labels. Note that JSONPath is not supported: a JSONPath expression such as $.foo.bar must be written as foo.bar instead.

  • label_separator (str) – An optional separator to be used if label_path refers to a string. If not provided, the whole field will be used as label(s).

Returns

A dictionary with properties of the newly created webhook.

Return type

DotDict

Raises

NotFoundError – Raised if the task group cannot be found.
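The label extraction described above can be mimicked locally to check what labels a given payload would produce. This is only a sketch of the documented behaviour, not the service's implementation: the function name extract_labels is hypothetical, and stripping whitespace around each label is an assumption inferred from the “foo, bar” example.

```python
def extract_labels(payload, label_path=None, label_separator=None):
    """Approximate how labels are derived from a webhook payload.

    `label_path` is a dotted path such as "a.b" (not JSONPath).
    """
    if not label_path:
        return []

    # Walk the dotted path through nested dictionaries.
    value = payload
    for key in label_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return []
        value = value[key]

    if isinstance(value, str):
        if label_separator:
            # Assumption: surrounding whitespace is stripped from each label.
            parts = (part.strip() for part in value.split(label_separator))
            return [part for part in parts if part]
        # Without a separator the whole field is used as a single label.
        return [value]
    if isinstance(value, list):
        return [str(item) for item in value]
    return [str(value)]
```

For the example payload above, extract_labels(payload, "a.b", ",") yields the labels foo and bar, while the payload itself is left unchanged.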

delete_group_by_id(group_id)

Terminates a task group by id. Once a group is terminated, no more tasks can be submitted to it and it stops using any resources. If the group with the given id is already terminated, nothing happens.

Parameters

group_id (str) – The group id.

Returns

A dictionary representing the terminated task group.

Return type

DotDict

Raises

NotFoundError – Raised if the task group cannot be found.

delete_webhook(group_id, webhook_id)[source]

Delete an existing webhook.

Parameters
  • group_id (str) – The task group id.

  • webhook_id (str) – The webhook id.

Returns

A boolean indicating if the deletion was successful.

Return type

bool

Raises

NotFoundError – Raised if the webhook cannot be found.

classmethod get_default_session_class()

Get the default session class for Service.

Returns

The default session class, which is Session itself or a derived class from Session.

Return type

Session

get_function(name)[source]

This method has been deprecated. Please use get_function_by_id() instead.

Gets an asynchronous function by name (the last function created with that name).

Parameters

name (str) – The name of the function to lookup.

Returns

A CloudFunction, or None if no function with the given name exists.

Return type

CloudFunction

get_function_by_id(group_id)[source]

Get an asynchronous function by group id.

Parameters

group_id (str) – The group id.

Returns

A CloudFunction.

Return type

CloudFunction

Raises
  • NotFoundError – Raised if the task group cannot be found.

  • RateLimitError – Raised when too many requests have been made within a given time period.

  • ServerError – Raised when an unknown error occurred on the server.

get_group(group_id, include=None)[source]

Retrieves a single task group by id.

Parameters
  • group_id (str) – The group id.

  • include (list(str)) – Extra fields to include in groups in the response. Allowed are: [‘build_log_url’, ‘build_log’]. Note that build logs over 10 Mi will not be returned; request the build log URL instead.

Returns

A dictionary representing the task group.

Return type

DotDict

Raises

NotFoundError – Raised if the task group cannot be found.

get_group_by_id(group_id, include=None)

Retrieves a single task group by id.

Parameters
  • group_id (str) – The group id.

  • include (list(str)) – Extra fields to include in groups in the response. Allowed are: [‘build_log_url’, ‘build_log’]. Note that build logs over 10 Mi will not be returned; request the build log URL instead.

Returns

A dictionary representing the task group.

Return type

DotDict

Raises

NotFoundError – Raised if the task group cannot be found.

get_group_by_name(name, status='running')[source]

Retrieves a single task group by name. Names are not unique; if there are multiple matches, returns the newest group.

Parameters
  • name (str) – The group name.

  • status (str) –

    Only consider groups with this status. The default is ‘running’. Allowed are:

    • ’awaiting_bundle’ – The request was received but is waiting for the corresponding code

    • ’building’ – The code was received and a task group image is being created

    • ’build_failed’ – The task group image could not be created

    • ’pending’ – The task group image has been built and is waiting for resources

    • ’running’ – The task group is ready to receive requests for tasks

    • ’terminated’ – The task group has been shut down

Returns

A dictionary representing the task group, or None if no group with the given name exists.

Return type

DotDict
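Because get_group_by_name() only considers one status at a time, looking for a group that may still be building takes several calls. A minimal sketch, assuming client is a Tasks instance; the helper name find_group and the order in which statuses are tried are illustrative choices, not part of the API.

```python
def find_group(client, name,
               statuses=("running", "pending", "building", "awaiting_bundle")):
    """Return (status, group) for the first status under which a group
    with the given name is found, or (None, None) if there is no match."""
    for status in statuses:
        group = client.get_group_by_name(name, status=status)
        if group is not None:
            return status, group
    return None, None
```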

get_task_result(group_id, task_id, include=None)[source]

Retrieves a single task result.

Parameters
  • group_id (str) – The group to get task results from.

  • task_id (str) – Specific ID of task to retrieve.

  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result’, ‘logs’, ‘result_url’, ‘logs_url’].

Returns

A dictionary representing the task result.

Return type

DotDict

Raises

NotFoundError – Raised if the task group or task itself cannot be found.

get_task_result_batch(group_id, task_ids, include=None)[source]

Retrieves multiple task results by id.

Parameters
  • group_id (str) – The group to get task results from.

  • task_ids (list(str)) – A list of task ids to retrieve, maximum 500.

  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result_url’, ‘logs_url’].

Returns

A dictionary with a key results containing the list of matching results. Results are in the order of the ids provided. Unknown ids are ignored.

Return type

DotDict
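Since a single call accepts at most 500 task ids, longer id lists have to be split up by the caller. The following sketch shows one way to do that, assuming client is a Tasks instance; the helper name get_results_in_batches is hypothetical.

```python
def get_results_in_batches(client, group_id, task_ids, include=None,
                           batch_size=500):
    """Fetch results for any number of task ids, at most `batch_size`
    ids per request (500 is the documented maximum)."""
    task_ids = list(task_ids)
    results = []
    for start in range(0, len(task_ids), batch_size):
        chunk = task_ids[start:start + batch_size]
        response = client.get_task_result_batch(group_id, chunk, include=include)
        # Results come back in the order of the ids; unknown ids are ignored.
        results.extend(response["results"])
    return results
```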

get_task_results(group_id, limit=100, offset=None, status=None, failure_type=None, updated=None, created=None, webhook=None, labels=None, include=None, sort_field='created', sort_order='asc', continuation_token=None)

Retrieves a portion of task results matching the given criteria.

Parameters
  • group_id (str) – The group to get task results from.

  • limit (int) – The number of results to get (max 1000 per page).

  • offset (int) – Where to start when getting task results (deprecated; use continuation_token).

  • status (str) – Filter tasks to this status. Allowed are [‘FAILURE’, ‘SUCCESS’].

  • failure_type (str) – Filter tasks to this type of failure. Allowed are [‘exception’, ‘oom’, ‘timeout’, ‘internal’, ‘unknown’, ‘py_version_mismatch’].

  • updated (str) – Filter tasks by updated date after this timestamp.

  • created (str) – Filter tasks by creation date after this timestamp.

  • webhook (str) – Filter by the webhook uid which spawned the task.

  • labels (list(str)) – Labels that must be present in the task’s labels list.

  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result_url’, ‘logs_url’].

  • sort_field (str) – The field to sort results on. Allowed are [‘created’, ‘runtime’, ‘peak_memory_usage’]. Default: ‘created’.

  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.

  • continuation_token (str) – A string returned from a previous call to list_task_results(), which you can use to get the next page of results.

Returns

A dictionary with two keys: results, containing the list of matching results, and continuation_token, containing a string if there are further matching results.

Return type

DotDict

get_webhook(webhook_id)[source]

Returns a webhook’s configuration.

Parameters

webhook_id (str) – The webhook id.

Returns

A dictionary of the webhook’s properties.

Return type

DotDict

Raises

NotFoundError – Raised if the webhook cannot be found.

get_webhooks(group_id)

List all webhooks for a task group.

Parameters

group_id (str) – The task group id.

Returns

A dictionary with one key webhooks containing a list of dictionaries representing the webhooks.

Return type

DotDict

iter_groups(status=None, created=None, updated=None, sort_field=None, sort_order='asc')[source]

Iterates over all task groups matching the given criteria.

Parameters
  • status (str) –

    Filter groups to this status. Allowed are:

    • ’awaiting_bundle’ – The request was received but is waiting for the corresponding code

    • ’building’ – The code was received and a task group image is being created

    • ’build_failed’ – The task group image could not be created

    • ’pending’ – The task group image has been built and is waiting for resources

    • ’running’ – The task group is ready to receive requests for tasks

    • ’terminated’ – The task group has been shut down

  • created (str) – Filter groups by creation date after this timestamp.

  • updated (str) – Filter groups by updated date after this timestamp.

  • sort_field (str) – The field to sort groups on. Allowed are [‘created’, ‘updated’].

  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.

  • limit (int) – The number of results to get (max 1000 per page).

  • continuation_token (str) – A string returned from a previous call to list_groups(), which you can use to get the next page of results.

Returns

An iterator over matching task groups.

Return type

generator(DotDict)

iter_task_results(group_id, status=None, failure_type=None, updated=None, created=None, webhook=None, labels=None, include=None, sort_field='created', sort_order='asc')[source]

Iterates over all task results matching the given criteria.

Parameters
  • group_id (str) – The group to get task results from.

  • status (str) – Filter tasks to this status. Allowed are [‘FAILURE’, ‘SUCCESS’].

  • failure_type (str) – Filter tasks to this type of failure. Allowed are [‘exception’, ‘oom’, ‘timeout’, ‘internal’, ‘unknown’, ‘py_version_mismatch’].

  • updated (str) – Filter tasks by updated date after this timestamp.

  • created (str) – Filter tasks by creation date after this timestamp.

  • webhook (str) – Filter by the webhook uid which spawned the task.

  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result_url’, ‘logs_url’].

  • labels (list(str)) – Labels that must be present in the task’s labels list.

  • sort_field (str) – The field to sort results on. Allowed are [‘created’, ‘runtime’, ‘peak_memory_usage’]. Default: ‘created’.

  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.

Returns

An iterator over matching task results.

Return type

generator(DotDict)
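The status and failure_type filters can be combined to get a quick breakdown of why tasks in a group failed. A minimal sketch, assuming client is a Tasks instance; the helper name count_failures_by_type is hypothetical, and it issues one iteration per failure type rather than inspecting fields of the result dictionaries.

```python
# Failure types listed as allowed values for the failure_type filter.
FAILURE_TYPES = [
    "exception", "oom", "timeout", "internal", "unknown", "py_version_mismatch",
]


def count_failures_by_type(client, group_id):
    """Return a dict mapping each failure type to the number of failed tasks."""
    counts = {}
    for failure_type in FAILURE_TYPES:
        matching = client.iter_task_results(
            group_id, status="FAILURE", failure_type=failure_type
        )
        # Consume the iterator and count its items without storing them.
        counts[failure_type] = sum(1 for _ in matching)
    return counts
```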

list_groups(status=None, created=None, updated=None, sort_field=None, include=None, sort_order='asc', limit=100, continuation_token=None)[source]

Retrieves a limited list of task groups matching the given criteria.

Parameters
  • status (str) –

    Filter groups to this status. Allowed are:

    • ’awaiting_bundle’ – The request was received but is waiting for the corresponding code

    • ’building’ – The code was received and a task group image is being created

    • ’build_failed’ – The task group image could not be created

    • ’pending’ – The task group image has been built and is waiting for resources

    • ’running’ – The task group is ready to receive requests for tasks

    • ’terminated’ – The task group has been shut down

  • created (str) – Filter groups by creation date after this timestamp.

  • updated (str) – Filter groups by updated date after this timestamp.

  • sort_field (str) – The field to sort groups on. Allowed are [‘created’, ‘updated’].

  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.

  • include (list(str)) – Extra fields to include in groups in the response. Allowed are: [‘build_log_url’].

  • limit (int) – The number of results to get (max 1000 per page).

  • continuation_token (str) – A string returned from a previous call to list_groups(), which you can use to get the next page of results.

Returns

A dictionary with two keys: groups, containing the list of matching groups, and continuation_token, containing a string if there are further matching groups.

Return type

DotDict

list_task_results(group_id, limit=100, offset=None, status=None, failure_type=None, updated=None, created=None, webhook=None, labels=None, include=None, sort_field='created', sort_order='asc', continuation_token=None)[source]

Retrieves a portion of task results matching the given criteria.

Parameters
  • group_id (str) – The group to get task results from.

  • limit (int) – The number of results to get (max 1000 per page).

  • offset (int) – Where to start when getting task results (deprecated; use continuation_token).

  • status (str) – Filter tasks to this status. Allowed are [‘FAILURE’, ‘SUCCESS’].

  • failure_type (str) – Filter tasks to this type of failure. Allowed are [‘exception’, ‘oom’, ‘timeout’, ‘internal’, ‘unknown’, ‘py_version_mismatch’].

  • updated (str) – Filter tasks by updated date after this timestamp.

  • created (str) – Filter tasks by creation date after this timestamp.

  • webhook (str) – Filter by the webhook uid which spawned the task.

  • labels (list(str)) – Labels that must be present in the task’s labels list.

  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result_url’, ‘logs_url’].

  • sort_field (str) – The field to sort results on. Allowed are [‘created’, ‘runtime’, ‘peak_memory_usage’]. Default: ‘created’.

  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.

  • continuation_token (str) – A string returned from a previous call to list_task_results(), which you can use to get the next page of results.

Returns

A dictionary with two keys: results, containing the list of matching results, and continuation_token, containing a string if there are further matching results.

Return type

DotDict
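Paging with continuation_token follows the usual pattern: pass the token from the previous response until none is returned. The sketch below assumes client is a Tasks instance and that the token is absent or empty on the last page; the helper name collect_task_results is hypothetical. iter_task_results() covers the same need when a plain iterator is enough.

```python
def collect_task_results(client, group_id, page_size=100, **filters):
    """Gather all task results matching `filters` by following
    continuation tokens page by page."""
    results = []
    token = None
    while True:
        page = client.list_task_results(
            group_id, limit=page_size, continuation_token=token, **filters
        )
        results.extend(page["results"])
        # A missing or empty token is taken to mean there are no more pages.
        token = page.get("continuation_token")
        if not token:
            return results
```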

list_webhooks(group_id)[source]

List all webhooks for a task group.

Parameters

group_id (str) – The task group id.

Returns

A dictionary with one key webhooks containing a list of dictionaries representing the webhooks.

Return type

DotDict

new_group(function, container_image, name=None, cpus=1, gpus=0, memory='2Gi', maximum_concurrency=None, minimum_concurrency=None, minimum_seconds=None, task_timeout=1800, include_modules=None, include_data=None, requirements=None, **kwargs)[source]

Creates a new task group.

Parameters
  • function (function) – The function to be called in a task. The function cannot reference any globals, or BoundGlobalError will be raised.

  • container_image (str) – The location of a docker image to be used for the environment in which the function is executed.

  • name (str) – An optional name used to later help identify the function.

  • cpus (int) – The number of CPUs requested for a single task. A task might be throttled if it uses more CPU. Default: 1. Maximum: 16.

  • gpus (int) – The number of GPUs requested for a single task. As of right now, a maximum of 1 GPU is supported. Default: 0. Maximum: 1.

  • memory (str) – The maximum memory requirement for a single task. If a task uses substantially more memory it will be killed. The value should be a string and can use postfixes such as Mi, Gi, MB, GB, etc (e.g. “4Gi”, “500MB”). If no unit is specified it is assumed to be in bytes. Default: 2Gi. Maximum: 96Gi.

  • maximum_concurrency (int) – The maximum number of tasks to run in parallel. Default: 5. Maximum: 500. If you need higher concurrency contact your Descartes Labs customer success representative.

  • minimum_concurrency (int) – The minimum number of tasks to run right away in parallel. Concurrency is usually scaled up slowly when submitting new tasks. Setting this can mean more immediate processing of this many newly submitted tasks. Note that setting this means the equivalent resources of this many permanently running tasks will be charged to your account while this group is active. Default: 0. Maximum: 4.

  • minimum_seconds (int) – The number of seconds to wait for new tasks before scaling down concurrency, after a task is finished. Default: 0. Maximum: 600.

  • task_timeout (int) – Maximum runtime for a single task in seconds. A task will be killed if it exceeds this limit. Default: 30 minutes. Minimum: 10 seconds. Maximum: 24 hours.

  • include_modules (list(str)) – Locally importable python names to include as modules in the task group, which can be imported by the entrypoint function, function.

  • include_data (list(str)) – Non python data files to include in the task group. Data path must be descendant of system path or python path directories.

  • requirements (list(str)) – A list of Python dependencies required by this function or a path to a file listing those dependencies, in standard setuptools notation (see PEP 508 https://www.python.org/dev/peps/pep-0508/). For example, if the packages foo and bar are required, then [‘foo’, ‘bar’] or [‘foo>2.0’, ‘bar>=1.0’] are possible values.

Returns

A dictionary representing the group created.

Return type

DotDict

Raises
  • BoundGlobalError – Raised if the given function refers to global variables.

  • BadRequest – Raised if any of the supplied parameters are invalid.

new_task(group_id, arguments=None, parameters=None, labels=None, retry_count=0)[source]

Submits a new task to a group. All positional and keyword arguments to the group’s function must be JSON-serializable (i.e., booleans, numbers, strings, lists, dictionaries).

Parameters
  • group_id (str) – The group id to submit to.

  • arguments (list) – The positional arguments to call the group’s function with.

  • parameters (dict) – The keyword arguments to call the group’s function with.

  • labels (list) – An optional list of labels to attach to the task. Task results can later be filtered by these labels.

  • retry_count (int) – Number of times to retry the task if it fails (maximum 5).

Returns

A dictionary with one key tasks containing a list with one element representing the submitted task.

Return type

DotDict

Raises
  • NotFoundError – Raised if the task group cannot be found.

  • BadRequest – Raised if any of the supplied parameters are invalid.
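Since all arguments must be JSON-serializable, it can be useful to check them locally before submitting. A minimal sketch, assuming client is a Tasks instance; the helper name submit_checked is hypothetical, and the check relies on json.dumps from the standard library raising TypeError for unsupported values.

```python
import json


def submit_checked(client, group_id, arguments=None, parameters=None,
                   labels=None):
    """Submit a single task after verifying locally that its arguments
    can be serialized to JSON."""
    arguments = list(arguments or [])
    parameters = dict(parameters or {})
    # Raises TypeError for values such as sets or arbitrary objects.
    json.dumps({"arguments": arguments, "parameters": parameters})
    return client.new_task(
        group_id, arguments=arguments, parameters=parameters, labels=labels
    )
```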

new_tasks(group_id, list_of_arguments=None, list_of_parameters=None, list_of_labels=None, retry_count=0)[source]

Submits multiple tasks to a group. All positional and keyword arguments to the group’s function must be JSON-serializable (i.e., booleans, numbers, strings, lists, dictionaries).

Parameters
  • group_id (str) – The group id to submit to.

  • list_of_arguments (list(list)) – The positional arguments to call the group’s function with, for each task.

  • list_of_parameters (list(dict)) – The keyword arguments to call the group’s function with, for each task.

  • list_of_labels (list(list)) – An optional list of labels to attach, for each task. Task results can later be filtered by these labels.

  • retry_count (int) – Number of times to retry the tasks if they fail (maximum 5).

Returns

A dictionary with one key tasks containing a list of dictionaries representing the submitted tasks.

Return type

DotDict

Raises
  • NotFoundError – Raised if the task group cannot be found.

  • BadRequest – Raised if any of the supplied parameters are invalid.
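Each task gets its own entry in the per-task lists. The sketch below submits one single-argument task per item and optionally attaches the same label to each, assuming client is a Tasks instance; the helper name submit_many is hypothetical.

```python
def submit_many(client, group_id, items, label=None):
    """Submit one task per item, each called with a single positional
    argument, and return the list of submitted task dictionaries."""
    items = list(items)
    # One inner list of positional arguments per task.
    list_of_arguments = [[item] for item in items]
    # One inner list of labels per task, or None for no labels at all.
    list_of_labels = [[label] for _ in items] if label else None
    response = client.new_tasks(
        group_id,
        list_of_arguments=list_of_arguments,
        list_of_labels=list_of_labels,
    )
    return response["tasks"]
```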

rerun_failed_tasks(group_id, retry_count=0)[source]

Submits all failed tasks for a rerun, except for out-of-memory or version mismatch failures. These tasks will be run again with the same arguments as before.

Tasks that are currently already being rerun will be ignored.

Parameters
  • group_id (str) – The group in which to rerun tasks.

  • retry_count (int) – Number of times to retry a task if it fails (maximum 5).

Returns

A list of dictionaries representing the tasks that have been submitted.

Return type

DotList

rerun_matching_tasks(group_id, status=None, failure_type=None, updated=None, created=None, webhook=None, labels=None, retry_count=0)[source]

Submits all completed tasks matching the given search arguments for a rerun. These tasks will be run again with the same arguments as before.

Tasks that are currently already being rerun will be ignored.

Parameters
  • group_id (str) – The group in which to rerun tasks.

  • status (str) – Filter tasks to this status. Allowed are [‘FAILURE’, ‘SUCCESS’].

  • failure_type (str) – Filter tasks to this type of failure. Allowed are [‘exception’, ‘oom’, ‘timeout’, ‘internal’, ‘unknown’].

  • updated (str) – Filter tasks by updated date after this timestamp.

  • created (str) – Filter tasks by creation date after this timestamp.

  • webhook (str) – Filter by the webhook uid which spawned the task.

  • labels (list(str)) – Labels that must be present in tasks labels list.

  • retry_count (int) – Number of times to retry a task if it fails (maximum 5)

Returns

A list of dictionaries representing the tasks that have been submitted.

Return type

DotList
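The allowed filter values listed above can be checked locally before making a request. The following is a minimal sketch, assuming you want to fail fast on a typo. build_rerun_filter is a hypothetical helper, and the service remains the authority on what it accepts.

```python
ALLOWED_STATUS = ("FAILURE", "SUCCESS")
ALLOWED_FAILURE_TYPES = ("exception", "oom", "timeout", "internal", "unknown")
MAX_RETRY_COUNT = 5  # "maximum 5" from the parameter list


def build_rerun_filter(status=None, failure_type=None, labels=None, retry_count=0):
    """Return validated keyword arguments; hypothetical, not part of the client."""
    if status is not None and status not in ALLOWED_STATUS:
        raise ValueError(f"status must be one of {ALLOWED_STATUS}")
    if failure_type is not None and failure_type not in ALLOWED_FAILURE_TYPES:
        raise ValueError(f"failure_type must be one of {ALLOWED_FAILURE_TYPES}")
    if not 0 <= retry_count <= MAX_RETRY_COUNT:
        raise ValueError(f"retry_count must be between 0 and {MAX_RETRY_COUNT}")
    filters = {"retry_count": retry_count}
    if status is not None:
        filters["status"] = status
    if failure_type is not None:
        filters["failure_type"] = failure_type
    if labels is not None:
        filters["labels"] = list(labels)
    return filters


filters = build_rerun_filter(
    status="FAILURE", failure_type="timeout", labels=["batch-1"], retry_count=2
)
# With a Tasks client instance named `client` (not created here), the result
# could be passed on as client.rerun_matching_tasks(group_id, **filters).
```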

rerun_tasks(group_id, task_id_iterable, retry_count=0)[source]

Submits a list of completed tasks specified by ids for a rerun. The completed tasks with the given ids will be run again with the same arguments as before.

Tasks that are currently already being rerun will be ignored. Unknown or invalid task ids will be ignored.

Parameters
  • group_id (str) – The group in which to rerun tasks.

  • task_id_iterable (iterable(str)) – An iterable of the task ids to be rerun.

  • retry_count (int) – Number of times to retry a task if it fails (maximum 5)

Returns

A list of dictionaries representing the tasks that have been submitted.

Return type

DotList

Raises

NotFoundError – Raised if the task group cannot be found.
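Because task_id_iterable can be any iterable, ids can be streamed from a generator. If you want to submit reruns in bounded chunks yourself, for example to log progress between calls, a sketch like the following may help. batched_ids is a hypothetical helper. Its default of 200 mirrors the RERUN_BATCH_SIZE constant listed for this class, but whether the client already batches internally is not stated here, so manual chunking is optional.

```python
from itertools import islice


def batched_ids(task_ids, batch_size=200):
    """Yield lists of at most `batch_size` ids from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(task_ids)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Made-up ids, produced lazily by a generator expression.
batches = list(batched_ids(f"task-{n}" for n in range(450)))
# With a Tasks client instance named `client` (not created here), each batch
# could be passed on as client.rerun_tasks(group_id, batch).
```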

classmethod set_default_session_class(session_class)

Set the default session class for Service.

The default session is used for any Service that is instantiated without specifying the session class.

Parameters

session_class (class) – The session class to use when instantiating the session. This must be the class Session itself or a derived class from Session.

terminate_group(group_id)[source]

Terminates a task group by id. Once a group is terminated, no more tasks can be submitted to it and it stops using any resources. If the group with the given id is already terminated, nothing happens.

Parameters

group_id (str) – The group id.

Returns

A dictionary representing the terminated task group.

Return type

DotDict

Raises

NotFoundError – Raised if the task group cannot be found.

update_credentials()[source]

Updates the credentials used by the tasks run by this user, within the user’s tasks namespace. Call this method after invalidating existing credentials so that tasks run with the new ones.

Note that when you create a new task group, your credentials are automatically updated.

Returns

True if successful, False otherwise.

update_group(group_id, container_image=None, name=None, cpus=None, gpus=None, memory=None, maximum_concurrency=None, minimum_concurrency=None, minimum_seconds=None, task_timeout=None, **kwargs)[source]

Update attributes of a group.

Parameters
  • group_id (str) – The group id.

  • container_image (str) – The optional new location of a docker image to be used for the environment in which the function is executed.

  • name (str) – An optional new name used to later help identify the function.

  • cpus (int) – The optional new number of CPUs requested for a single task. A task might be throttled if it uses more CPU. Maximum: 16.

  • gpus (int) – The optional new number of GPUs requested for a single task. As of right now, a maximum of 1 GPU is supported. Maximum: 1. May not be changed from zero to non-zero or non-zero to zero.

  • memory (str) – The optional new maximum memory requirement for a single task. If a task uses substantially more memory it will be killed. The value should be a string and can use suffixes such as Mi, Gi, MB, GB, etc. (e.g. “4Gi”, “500MB”). If no unit is specified it is assumed to be in bytes. Maximum: 96Gi.

  • maximum_concurrency (int) – The optional new maximum number of tasks to run in parallel. Maximum: 500. If you need higher concurrency contact your Descartes Labs customer success representative.

  • minimum_concurrency (int) – The optional new minimum number of tasks to run right away in parallel. Concurrency is usually scaled up slowly when submitting new tasks. Setting this can mean more immediate processing of this many newly submitted tasks. Note that setting this means the equivalent resources of this many permanently running tasks will be charged to your account while this group is active. Maximum: 4.

  • minimum_seconds (int) – The optional new number of seconds to wait for new tasks before scaling down concurrency, after a task is finished. Maximum: 600.

  • task_timeout (int) – The optional new maximum runtime for a single task in seconds. A task will be killed if it exceeds this limit. Minimum: 10 seconds. Maximum: 24 hours.

Returns

A dictionary representing the updated task group.

Return type

DotDict

Raises

NotFoundError – Raised if the task group cannot be found.
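To make the memory format concrete, here is a minimal sketch of converting such strings to bytes and checking them against the documented 96Gi ceiling. parse_memory is a hypothetical helper. The unit table assumes the common convention that Ki/Mi/Gi are powers of 1024 and KB/MB/GB are powers of 1000; the service’s own parsing is not shown here and may accept other suffixes.

```python
import re

# Assumed multipliers: binary suffixes are powers of 1024,
# decimal suffixes are powers of 1000. No suffix means bytes.
UNITS = {
    "": 1,
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3,
    "KB": 1000, "MB": 1000**2, "GB": 1000**3,
}
MAX_MEMORY_BYTES = 96 * 1024**3  # "Maximum: 96Gi" from the parameter list


def parse_memory(value):
    """Convert a string such as "4Gi" or "500MB" to a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([A-Za-z]*)", value.strip())
    if match is None or match.group(2) not in UNITS:
        raise ValueError(f"unrecognized memory value: {value!r}")
    size = int(float(match.group(1)) * UNITS[match.group(2)])
    if size > MAX_MEMORY_BYTES:
        raise ValueError(f"{value!r} exceeds the 96Gi maximum")
    return size


four_gib = parse_memory("4Gi")
half_gb = parse_memory("500MB")
```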

wait_for_completion(group_id, show_progress=False)[source]

Waits until all submitted tasks for a given group are completed.

If the task group stops accepting tasks, this method raises GroupTerminalException and stops waiting.

Parameters
  • group_id (str) – The group id.

  • show_progress (bool) – Whether to log progress information.

Raises

GroupTerminalException
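Waiting of this kind is typically a poll-and-sleep loop. The sketch below shows the general pattern with a 5-second default interval, matching the COMPLETION_POLL_INTERVAL_SECONDS constant listed for this class. wait_until, is_done, and the injectable sleep and clock arguments are all hypothetical; how the client checks group progress internally is not shown here.

```python
import time


def wait_until(is_done, poll_interval=5.0, timeout=None,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `is_done()` until it returns True; return the number of polls.

    Raises the builtin TimeoutError if `timeout` seconds elapse first.
    """
    deadline = None if timeout is None else clock() + timeout
    polls = 0
    while True:
        polls += 1
        if is_done():
            return polls
        if deadline is not None and clock() >= deadline:
            raise TimeoutError(f"not done after {timeout} seconds")
        sleep(poll_interval)


# Simulated progress: "done" on the third check. Sleeping is stubbed out
# so the example finishes immediately.
states = iter([False, False, True])
polls_needed = wait_until(lambda: next(states), sleep=lambda seconds: None)
```

Injecting sleep and clock keeps the loop testable without real delays.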

ADAPTER = <descarteslabs.common.threading.local.ThreadLocalWrapper object>
COMPLETION_POLL_INTERVAL_SECONDS = 5
CONNECT_TIMEOUT = 9.5
READ_TIMEOUT = 30
RERUN_BATCH_SIZE = 200
RETRY_CONFIG(*args, **kwargs)

Retry configuration.

Each retry attempt will create a new Retry object with updated values, so they can be safely reused.

Retries can be defined as a default for a pool:

retries = Retry(connect=5, read=2, redirect=5)
http = PoolManager(retries=retries)
response = http.request('GET', 'http://example.com/')

Or per-request (which overrides the default for the pool):

response = http.request('GET', 'http://example.com/', retries=Retry(10))

Retries can be disabled by passing False:

response = http.request('GET', 'http://example.com/', retries=False)

Errors will be wrapped in MaxRetryError unless retries are disabled, in which case the causing exception will be raised.

Parameters
  • total (int) –

    Total number of retries to allow. Takes precedence over other counts.

    Set to None to remove this constraint and fall back on other counts.

    Set to 0 to fail on the first retry.

    Set to False to disable and imply raise_on_redirect=False.

  • connect (int) –

    How many connection-related errors to retry on.

    These are errors raised before the request is sent to the remote server, which we assume has not triggered the server to process the request.

    Set to 0 to fail on the first retry of this type.

  • read (int) –

    How many times to retry on read errors.

    These errors are raised after the request was sent to the server, so the request may have side-effects.

    Set to 0 to fail on the first retry of this type.

  • redirect (int) –

    How many redirects to perform. Limit this to avoid infinite redirect loops.

    A redirect is a HTTP response with a status code 301, 302, 303, 307 or 308.

    Set to 0 to fail on the first retry of this type.

    Set to False to disable and imply raise_on_redirect=False.

  • status (int) –

    How many times to retry on bad status codes.

    These are retries made on responses, where status code matches status_forcelist.

    Set to 0 to fail on the first retry of this type.

  • other (int) –

    How many times to retry on other errors.

    Other errors are errors that are not connect, read, redirect or status errors. These errors might be raised after the request was sent to the server, so the request might have side-effects.

    Set to 0 to fail on the first retry of this type.

    If total is not set, it’s a good idea to set this to 0 to account for unexpected edge cases and avoid infinite retry loops.

  • allowed_methods (iterable) –

    Set of uppercased HTTP method verbs that we should retry on.

    By default, we only retry on methods which are considered to be idempotent (multiple requests with the same parameters end with the same state). See Retry.DEFAULT_ALLOWED_METHODS.

    Set to a False value to retry on any verb.

    Warning

    Previously this parameter was named method_whitelist; that usage is deprecated in v1.26.0 and will be removed in v2.0.

  • status_forcelist (iterable) –

    A set of integer HTTP status codes that we should force a retry on. A retry is initiated if the request method is in allowed_methods and the response status code is in status_forcelist.

    By default, this is disabled with None.

  • backoff_factor (float) –

    A backoff factor to apply between attempts after the second try (most errors are resolved immediately by a second try without a delay). urllib3 will sleep for:

    {backoff factor} * (2 ** ({number of total retries} - 1))
    

    seconds. If the backoff_factor is 0.1, then sleep() will sleep for [0.0s, 0.2s, 0.4s, …] between retries. It will never be longer than Retry.DEFAULT_BACKOFF_MAX.

    By default, backoff is disabled (set to 0).

  • raise_on_redirect (bool) – Whether, if the number of redirects is exhausted, to raise a MaxRetryError, or to return a response with a response code in the 3xx range.

  • raise_on_status (bool) – Similar meaning to raise_on_redirect: whether we should raise an exception, or return a response, if status falls in status_forcelist range and retries have been exhausted.

  • history (tuple) – The history of the request encountered during each call to increment(). The list is in the order the requests occurred. Each list item is of class RequestHistory.

  • respect_retry_after_header (bool) – Whether to respect Retry-After header on status codes defined as Retry.RETRY_AFTER_STATUS_CODES or not.

  • remove_headers_on_redirect (iterable) – Sequence of headers to remove from the request when a response indicating a redirect is returned before firing off the redirected request.
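The backoff_factor formula in the list above can be checked with a few lines of arithmetic. This is a sketch of the documented schedule, not urllib3’s implementation: backoff_schedule is a hypothetical helper, and the 120-second cap is an assumed value for Retry.DEFAULT_BACKOFF_MAX that may differ between urllib3 versions.

```python
def backoff_schedule(backoff_factor, retries, backoff_max=120.0):
    """Return the sleep time in seconds before each of the first `retries` retries.

    The first retry happens without delay; later ones follow
    backoff_factor * 2 ** (retry_number - 1), capped at `backoff_max`.
    """
    delays = []
    for retry_number in range(1, retries + 1):
        if retry_number == 1:
            delays.append(0.0)
        else:
            delay = backoff_factor * 2 ** (retry_number - 1)
            delays.append(min(backoff_max, delay))
    return delays


schedule = backoff_schedule(0.1, 3)  # [0.0, 0.2, 0.4], as in the description
```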

TASK_RESULT_BATCH_SIZE = 100
TIMEOUT = (9.5, 30)
property session

The session instance used by this service.

Type

Session

property token

The bearer token used in the requests.

Type

str

class FutureTask(guid, tuid, client=None, args=None, kwargs=None)[source]

A submitted task which may or may not have completed yet. Accessing any attributes only available on a completed task (for example result) blocks until the task completes.

Attributes:

COMPLETION_POLL_INTERVAL_SECONDS

FAILURE

SUCCESS

exception

Property indicating the name of the exception raised during the function execution, if any

exception_name

Property indicating the name of the exception raised during the function execution, if any

failure_type

The type of failure if this task did not succeed.

is_success

Did this task succeed?

log

Property indicating the log output for this completed task.

peak_memory_usage

Property indicating the peak memory usage for this completed task, in bytes.

ready

Property indicating whether the task has completed

result

Property indicating the return value of the function for this completed task.

runtime

Property indicating the time spent executing the function for this task, in seconds.

stacktrace

Property indicating the stacktrace of the exception raised during the function execution, if any.

status

Property indicating the status (SUCCESS or FAILURE) for this completed task.

traceback

Property indicating the stacktrace of the exception raised during the function execution, if any.

Methods:

get_result([wait, timeout])

Attempt to load the result for this task.

get_result(wait=False, timeout=None)[source]

Attempt to load the result for this task. After returning from this method without an exception raised, the return value for the task is available through the result property.

Parameters
  • wait (bool) – Whether to wait for the task to complete, or raise a TransientResultError if the task hasn’t completed yet.

  • timeout (int) – How long to wait for the task to complete, or None to wait indefinitely.

COMPLETION_POLL_INTERVAL_SECONDS = 3
FAILURE = 'FAILURE'
SUCCESS = 'SUCCESS'
property exception

Property indicating the name of the exception raised during the function execution, if any

Return type

str

Returns

The name of the exception or None

property exception_name

Property indicating the name of the exception raised during the function execution, if any

Return type

str

Returns

The name of the exception or None

property failure_type

The type of failure if this task did not succeed.

Return type

str

Returns

The failure type

property is_success

Did this task succeed?

Return type

bool

Returns

Whether this task succeeded.

property log

Property indicating the log output for this completed task.

Return type

str

Returns

The log output

property peak_memory_usage

Property indicating the peak memory usage for this completed task, in bytes.

Return type

int

Returns

The peak memory usage

property ready

Property indicating whether the task has completed

Return type

bool

Returns

True if the task has completed and status is available, otherwise False.

property result

Property indicating the return value of the function for this completed task.

Return type

json or pickled type

Returns

The return value of the function for this completed task.

property runtime

Property indicating the time spent executing the function for this task, in seconds.

Return type

int

Returns

The time spent executing the function

property stacktrace

Property indicating the stacktrace of the exception raised during the function execution, if any.

Return type

str

Returns

The stacktrace of the exception or None

property status

Property indicating the status (SUCCESS or FAILURE) for this completed task.

Return type

str

Returns

The status for this completed task.

property traceback

Property indicating the stacktrace of the exception raised during the function execution, if any.

Return type

str

Returns

The stacktrace of the exception or None
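The properties above are enough to triage a batch of finished tasks. A minimal sketch follows, using stand-in objects that expose the same attribute names as FutureTask, so that it runs without a service connection. summarize is a hypothetical helper. On real FutureTask objects, reading these attributes blocks until the task completes, as noted in the class description.

```python
from collections import Counter
from types import SimpleNamespace


def summarize(tasks):
    """Collect results of successful tasks and count failures by type."""
    results = []
    failures = Counter()
    for task in tasks:
        if task.is_success:
            results.append(task.result)
        else:
            failures[task.failure_type] += 1
    return results, dict(failures)


# Stand-ins for completed FutureTask objects (made-up values).
finished = [
    SimpleNamespace(is_success=True, result=4, failure_type=None),
    SimpleNamespace(is_success=False, result=None, failure_type="timeout"),
    SimpleNamespace(is_success=True, result=9, failure_type=None),
    SimpleNamespace(is_success=False, result=None, failure_type="timeout"),
]
results, failures = summarize(finished)
```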

class CloudFunction(group_id, name=None, client=None, retry_count=0)[source]

Represents the asynchronous function of a task group. When called, new tasks are submitted to the group with the positional and keyword arguments given. A map() method allows submitting multiple tasks more efficiently than making individual function calls.

Attributes:

TASK_SUBMIT_SIZE

Methods:

map(args, *iterargs)

Submits multiple tasks efficiently, passing positional arguments to each function call, mimicking the behaviour of the builtin map() function.

wait_for_completion([show_progress])

Waits until all tasks submitted through this function are completed.

map(args, *iterargs)[source]

Submits multiple tasks efficiently, passing positional arguments to each function call, mimicking the behaviour of the builtin map() function. When submitting multiple tasks this is preferred over calling the function repeatedly.

All positional arguments must be JSON-serializable (i.e., booleans, numbers, strings, lists, dictionaries).

Parameters
  • args (iterable) – An iterable of arguments. A task will be submitted with each element of the iterable as the first positional argument to the function.

  • iterargs (list(iterable)) – If additional iterable arguments are passed, the function must take that many arguments and is applied to the items from all iterables in parallel (mimicking builtin map() behaviour).

Returns

A list of all submitted tasks.

Return type

list(descarteslabs.client.services.tasks.FutureTask)
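To illustrate how args and iterargs combine, the sketch below pairs the iterables element by element, the way the builtin map() and zip() do. pair_arguments is a hypothetical helper that only shows the pairing semantics, not how CloudFunction submits tasks. Stopping at the shortest iterable is builtin behaviour and is assumed, not confirmed, for CloudFunction.map.

```python
def pair_arguments(args, *iterargs):
    """Return one list of positional arguments per task, paired like zip()."""
    return [list(items) for items in zip(args, *iterargs)]


# One iterable: each element becomes the single positional argument.
single = pair_arguments([1, 2, 3])

# Two iterables: the function would need to accept two positional arguments.
paired = pair_arguments([1, 2, 3], ["a", "b", "c"])
```

With two iterables of three items each, three tasks would be submitted, one per pair.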

wait_for_completion(show_progress=False)[source]

Waits until all tasks submitted through this function are completed.

If the task group stops accepting tasks, this method raises GroupTerminalException and stops waiting.

Parameters

show_progress (bool) – Whether to log progress information.

TASK_SUBMIT_SIZE = 100
as_completed(tasks, show_progress=True)[source]

Yields completed tasks from the list of given tasks as they become available, finishing when all given tasks have been completed.

If you don’t care about the particular results of the tasks and only want to wait for all tasks to complete, use wait_for_completion.

If a task group stops accepting tasks, this function raises GroupTerminalException and stops waiting.

Parameters

Classes:

FutureTask(guid, tuid[, client, args, kwargs])

A submitted task which may or may not have completed yet.

ResultType

Possible types of return values for a function.

Exceptions:

TimeoutError([message])

Raised when attempting to access results for a task that hasn’t completed.

TransientResultError([message])

Raised when attempting to access results for a task that hasn’t completed.

exception TimeoutError(message='Timeout exceeded')[source]

Raised when attempting to access results for a task that hasn’t completed.

exception TransientResultError(message='Result not yet ready')[source]

Raised when attempting to access results for a task that hasn’t completed.

class FutureTask(guid, tuid, client=None, args=None, kwargs=None)[source]

A submitted task which may or may not have completed yet. Accessing any attributes only available on a completed task (for example result) blocks until the task completes.

Attributes:

COMPLETION_POLL_INTERVAL_SECONDS

FAILURE

SUCCESS

exception

Property indicating the name of the exception raised during the function execution, if any

exception_name

Property indicating the name of the exception raised during the function execution, if any

failure_type

The type of failure if this task did not succeed.

is_success

Did this task succeed?

log

Property indicating the log output for this completed task.

peak_memory_usage

Property indicating the peak memory usage for this completed task, in bytes.

ready

Property indicating whether the task has completed

result

Property indicating the return value of the function for this completed task.

runtime

Property indicating the time spent executing the function for this task, in seconds.

stacktrace

Property indicating the stacktrace of the exception raised during the function execution, if any.

status

Property indicating the status (SUCCESS or FAILURE) for this completed task.

traceback

Property indicating the stacktrace of the exception raised during the function execution, if any.

Methods:

get_result([wait, timeout])

Attempt to load the result for this task.

get_result(wait=False, timeout=None)[source]

Attempt to load the result for this task. After returning from this method without an exception raised, the return value for the task is available through the result property.

Parameters
  • wait (bool) – Whether to wait for the task to complete, or raise a TransientResultError if the task hasn’t completed yet.

  • timeout (int) – How long to wait for the task to complete, or None to wait indefinitely.

COMPLETION_POLL_INTERVAL_SECONDS = 3
FAILURE = 'FAILURE'
SUCCESS = 'SUCCESS'
property exception

Property indicating the name of the exception raised during the function execution, if any

Return type

str

Returns

The name of the exception or None

property exception_name

Property indicating the name of the exception raised during the function execution, if any

Return type

str

Returns

The name of the exception or None

property failure_type

The type of failure if this task did not succeed.

Return type

str

Returns

The failure type

property is_success

Did this task succeed?

Return type

bool

Returns

Whether this task succeeded.

property log

Property indicating the log output for this completed task.

Return type

str

Returns

The log output

property peak_memory_usage

Property indicating the peak memory usage for this completed task, in bytes.

Return type

int

Returns

The peak memory usage

property ready

Property indicating whether the task has completed

Return type

bool

Returns

True if the task has completed and status is available, otherwise False.

property result

Property indicating the return value of the function for this completed task.

Return type

json or pickled type

Returns

The return value of the function for this completed task.

property runtime

Property indicating the time spent executing the function for this task, in seconds.

Return type

int

Returns

The time spent executing the function

property stacktrace

Property indicating the stacktrace of the exception raised during the function execution, if any.

Return type

str

Returns

The stacktrace of the exception or None

property status

Property indicating the status (SUCCESS or FAILURE) for this completed task.

Return type

str

Returns

The status for this completed task.

property traceback

Property indicating the stacktrace of the exception raised during the function execution, if any.

Return type

str

Returns

The stacktrace of the exception or None

class ResultType[source]

Possible types of return values for a function.

Attributes:

JSON

LEGACY_PICKLE

JSON = 'json'
LEGACY_PICKLE = 'pickle'