Tasks

TransientResultException

alias of descarteslabs.common.tasks.futuretask.TransientResultError

AsyncTasks

alias of descarteslabs.client.services.tasks.tasks.Tasks

class Tasks(url=None, auth=None)[source]
create_function(f, image=None, name=None, cpus=1, gpus=0, memory='2Gi', maximum_concurrency=None, minimum_concurrency=None, minimum_seconds=None, task_timeout=1800, retry_count=0, **kwargs)[source]

Creates a new task group from a function and returns an asynchronous function that can be called to submit tasks to the group.

Parameters:
  • f (function) – The function to be called in a task.
  • image (str) – The location of a docker image to be used for the environment in which the function is executed.
  • name (str) – An optional name used to later help identify the function.
  • cpus (int) – The number of CPUs requested for a single task. A task might be throttled if it uses more CPU. Default: 1. Maximum: 32.
  • gpus (int) – The number of GPUs requested for a single task. As of right now, a maximum of 1 GPU is supported. Default: 0. Maximum: 1.
  • memory (str) – The maximum memory requirement for a single task. If a task uses substantially more memory it will be killed. The value should be a string and can use postfixes such as Mi, Gi, MB, GB, etc (e.g. “4Gi”, “500MB”). If no unit is specified it is assumed to be in bytes. Default: 2Gi. Maximum: 96Gi.
  • maximum_concurrency (int) – The maximum number of tasks to run in parallel. Default: 500. Maximum: 500. If you need higher concurrency contact your Descartes Labs customer success representative.
  • minimum_concurrency (int) – The minimum number of tasks to run right away in parallel. Concurrency is usually scaled up slowly when submitting new tasks. Setting this can mean more immediate processing of this many newly submitted tasks. Note that setting this means the equivalent resources of this many permanently running tasks will be charged to your account while this group is active. Default: 0. Maximum: 4.
  • minimum_seconds (int) – The number of seconds to wait for new tasks before scaling down concurrency, after a task is finished. Default: 0. Maximum: 600.
  • task_timeout (int) – Maximum runtime for a single task in seconds. A task will be killed if it exceeds this limit. Default: 30 minutes. Minimum: 10 seconds. Maximum: 24 hours.
  • retry_count (int) – Number of times to retry a task if it fails Default: 0. Maximum: 5.
Returns:

A CloudFunction.

create_namespace()[source]

Creates a namespace for the user and sets up authentication within it from the current client id and secret. Must be called once per user before creating any tasks.

Returns:True if successful, False otherwise.
create_or_get_function(f, image=None, name=None, cpus=1, gpus=0, memory='2Gi', maximum_concurrency=None, minimum_concurrency=None, minimum_seconds=None, task_timeout=1800, retry_count=0, **kwargs)[source]

Creates or gets an asynchronous function. If a task group with the given name exists, returns an asynchronous function for the newest existing group with that. Otherwise creates a new task group.

Parameters:
  • f (function) – The function to be called in a task.
  • image (str) – The location of a docker image to be used for the environment in which the function is executed.
  • name (str) – An optional name used to later help identify the function.
  • cpus (int) – The number of CPUs requested for a single task. A task might be throttled if it uses more CPU. Default: 1. Maximum: 32.
  • gpus (int) – The number of GPUs requested for a single task. As of right now, a maximum of 1 GPU is supported. Default: 0. Maximum: 1.
  • memory (str) – The maximum memory requirement for a single task. If a task uses substantially more memory it will be killed. The value should be a string and can use postfixes such as Mi, Gi, MB, GB, etc (e.g. “4Gi”, “500MB”). If no unit is specified it is assumed to be in bytes. Default: 2Gi. Maximum: 96Gi.
  • maximum_concurrency (int) – The maximum number of tasks to run in parallel. Default: 500. Maximum: 500. If you need higher concurrency contact your Descartes Labs customer success representative.
  • minimum_concurrency (int) – The minimum number of tasks to run right away in parallel. Concurrency is usually scaled up slowly when submitting new tasks. Setting this can mean more immediate processing of this many newly submitted tasks. Note that setting this means the equivalent resources of this many permanently running tasks will be charged to your account while this group is active. Default: 0. Maximum: 4.
  • minimum_seconds (int) – The number of seconds to wait for new tasks before scaling down concurrency, after a task is finished. Default: 0. Maximum: 600.
  • task_timeout (int) – Maximum runtime for a single task in seconds. A task will be killed if it exceeds this limit. Default: 30 minutes. Minimum: 10 seconds. Maximum: 24 hours.
  • retry_count (int) – Number of times to retry a task if it fails Default: 0. Maximum: 5.
Returns:

A CloudFunction.

create_webhook(group_id, name=None, label_path=None, label_separator=None)[source]
delete_group_by_id(group_id)

Terminates a task group by id. Once a group is terminated, no more tasks can be submitted to it and it stops using any resources. If the group with the given id is already terminated, nothing happens.

Parameters:group_id (str) – The group id.
Returns:A dictionary representing the terminated task group.
delete_webhook(group_id, webhook_id)[source]
get_function(name)[source]

Gets an asynchronous function by name (the last function created with that name).

Parameters:name (str) – The name of the function to lookup.
Returns:A CloudFunction, or None if no function with the given name exists.
get_group(group_id)[source]

Retrieves a single task group by id.

Parameters:group_id (str) – The group id.
Returns:A dictionary representing the task group.
get_group_by_id(group_id)

Retrieves a single task group by id.

Parameters:group_id (str) – The group id.
Returns:A dictionary representing the task group.
get_group_by_name(name, status='running')[source]

Retrieves a single task group by name. Names are not unique; if there are multiple matches, returns the newest group.

Parameters:
  • group_id (str) – The group name.
  • status (str) – Only consider groups with this status. Allowed are [‘running’, ‘terminated’]. Default: ‘running’.
Returns:

A dictionary representing the task group, or None if no group with the given name exists.

get_task_result(group_id, task_id, include=None)[source]

Retrieves a single task result.

Parameters:
  • group_id (str) – The group to get task results from.
  • task_id (str) – Specific ID of task to retrieve.
  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result’, ‘logs’, ‘result_url’, ‘logs_url’].
Returns:

A dictionary representing the task result.

get_task_result_batch(group_id, task_ids, include=None)[source]

Retrieves a multiple task results by id.

Parameters:
  • group_id (str) – The group to get task results from.
  • task_ids (list(str)) – A list of task ids to retrieve, maximum 500.
  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result_url’, ‘logs_url’].
Returns:

A dictionary with a key results containing the list of matching results. Results are in the order of the ids provided. Unknown ids are ignored.

get_task_results(group_id, limit=100, offset=None, status=None, failure_type=None, updated=None, created=None, webhook=None, labels=None, include=None, sort_field='created', sort_order='asc', continuation_token=None)

Retrieves a limited list of task results matching the given criteria.

Parameters:
  • group_id (str) – The group to get task results from.
  • limit (int) – The number of results to get (max 1000 per page).
  • offset (int) – Where to start when getting task results (deprecated; use continuation_token).
  • status (str) – Filter tasks to this status. Allowed are [‘FAILURE’, ‘SUCCESS’].
  • failure_type (str) – Filter tasks to this type of failure. Allowed are [‘exception’, ‘oom’, ‘timeout’, ‘internal’, ‘unknown’, ‘py_version_mismatch’].
  • updated (str) – Filter tasks by updated date after this timestamp.
  • created (str) – Filter tasks by creation date after this timestamp.
  • webhook (str) – Filter by the webhook uid which spawned the task.
  • labels (list(str)) – Labels that must be present in tasks labels list.
  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result_url’, ‘logs_url’].
  • sort_field (str) – The field to sort results on. Allowed are [‘created’, ‘runtime’, ‘peak_memory_usage’]. Default: ‘created’.
  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.
  • continuation_token (str) – A string returned from a previous call to list_task_results(), which you can use to get the next page of results.
Returns:

A dictionary with two keys; results containing the list of matching results, continuation_token containting a string if there are further matching results.

get_webhook(group_id, webhook_id)[source]
get_webhooks(group_id)
iter_groups(status=None, created=None, updated=None, sort_field=None, sort_order='asc')[source]

Iterates over all task groups matching the given criteria.

Parameters:
  • status (str) – Filter groups to this status. Allowed are [‘running’, ‘terminated’].
  • created (str) – Filter groups by creation date after this timestamp.
  • updated (str) – Filter groups by updated date after this timestamp.
  • sort_field (str) – The field to sort groups on. Allowed are [‘created’, ‘updated’].
  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.
  • limit (int) – The number of results to get (max 1000 per page).
  • continuation_token (str) – A string returned from a previous call to list_groups(), which you can use to get the next page of results.
Returns:

An iterator over matching task groups.

iter_task_results(group_id, status=None, failure_type=None, updated=None, created=None, webhook=None, labels=None, include=None, sort_field='created', sort_order='asc')[source]

Iterates over all task results matching the given criteria.

Parameters:
  • group_id (str) – The group to get task results from.
  • status (str) – Filter tasks to this status. Allowed are [‘FAILURE’, ‘SUCCESS’].
  • failure_type (str) – Filter tasks to this type of failure. Allowed are [‘exception’, ‘oom’, ‘timeout’, ‘internal’, ‘unknown’, ‘py_version_mismatch’].
  • updated (str) – Filter tasks by updated date after this timestamp.
  • created (str) – Filter tasks by creation date after this timestamp.
  • webhook (str) – Filter by the webhook uid which spawned the task.
  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result_url’, ‘logs_url’].
  • labels (list(str)) – Labels that must be present in tasks labels list.
  • sort_field (str) – The field to sort results on. Allowed are [‘created’, ‘runtime’, ‘peak_memory_usage’]. Default: ‘created’.
  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.
Returns:

An iterator over matching task results.

list_groups(status=None, created=None, updated=None, sort_field=None, sort_order='asc', limit=100, continuation_token=None)[source]

Retrieves a limited list of task groups matching the given criteria.

Parameters:
  • status (str) – Filter groups to this status. Allowed are [‘running’, ‘terminated’].
  • created (str) – Filter groups by creation date after this timestamp.
  • updated (str) – Filter groups by updated date after this timestamp.
  • sort_field (str) – The field to sort groups on. Allowed are [‘created’, ‘updated’].
  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.
  • limit (int) – The number of results to get (max 1000 per page).
  • continuation_token (str) – A string returned from a previous call to list_groups(), which you can use to get the next page of results.
Returns:

A dictionary with two keys; groups containing the list of matching groups, continuation_token containting a string if there are further matching groups.

list_task_results(group_id, limit=100, offset=None, status=None, failure_type=None, updated=None, created=None, webhook=None, labels=None, include=None, sort_field='created', sort_order='asc', continuation_token=None)[source]

Retrieves a limited list of task results matching the given criteria.

Parameters:
  • group_id (str) – The group to get task results from.
  • limit (int) – The number of results to get (max 1000 per page).
  • offset (int) – Where to start when getting task results (deprecated; use continuation_token).
  • status (str) – Filter tasks to this status. Allowed are [‘FAILURE’, ‘SUCCESS’].
  • failure_type (str) – Filter tasks to this type of failure. Allowed are [‘exception’, ‘oom’, ‘timeout’, ‘internal’, ‘unknown’, ‘py_version_mismatch’].
  • updated (str) – Filter tasks by updated date after this timestamp.
  • created (str) – Filter tasks by creation date after this timestamp.
  • webhook (str) – Filter by the webhook uid which spawned the task.
  • labels (list(str)) – Labels that must be present in tasks labels list.
  • include (list(str)) – Extra fields to include in the task results. Allowed values are [‘arguments’, ‘stacktrace’, ‘result_url’, ‘logs_url’].
  • sort_field (str) – The field to sort results on. Allowed are [‘created’, ‘runtime’, ‘peak_memory_usage’]. Default: ‘created’.
  • sort_order (str) – Allowed are [‘asc’, ‘desc’]. Default: ‘asc’.
  • continuation_token (str) – A string returned from a previous call to list_task_results(), which you can use to get the next page of results.
Returns:

A dictionary with two keys; results containing the list of matching results, continuation_token containting a string if there are further matching results.

list_webhooks(group_id)[source]
new_group(function, container_image=None, name=None, cpus=1, gpus=0, memory='2Gi', maximum_concurrency=None, minimum_concurrency=None, minimum_seconds=None, task_timeout=1800, **kwargs)[source]

Creates a new task group.

Parameters:
  • function (function) – The function to be called in a task.
  • container_image (str) – The location of a docker image to be used for the environment in which the function is executed.
  • name (str) – An optional name used to later help identify the function.
  • cpus (int) – The number of CPUs requested for a single task. A task might be throttled if it uses more CPU. Default: 1. Maximum: 32.
  • gpus (int) – The number of GPUs requested for a single task. As of right now, a maximum of 1 GPU is supported. Default: 0. Maximum: 1.
  • memory (str) – The maximum memory requirement for a single task. If a task uses substantially more memory it will be killed. The value should be a string and can use postfixes such as Mi, Gi, MB, GB, etc (e.g. “4Gi”, “500MB”). If no unit is specified it is assumed to be in bytes. Default: 2Gi. Maximum: 96Gi.
  • maximum_concurrency (int) – The maximum number of tasks to run in parallel. Default: 500. Maximum: 500. If you need higher concurrency contact your Descartes Labs customer success representative.
  • minimum_concurrency (int) – The minimum number of tasks to run right away in parallel. Concurrency is usually scaled up slowly when submitting new tasks. Setting this can mean more immediate processing of this many newly submitted tasks. Note that setting this means the equivalent resources of this many permanently running tasks will be charged to your account while this group is active. Default: 0. Maximum: 4.
  • minimum_seconds (int) – The number of seconds to wait for new tasks before scaling down concurrency, after a task is finished. Default: 0. Maximum: 600.
  • task_timeout (int) – Maximum runtime for a single task in seconds. A task will be killed if it exceeds this limit. Default: 30 minutes. Minimum: 10 seconds. Maximum: 24 hours.
Returns:

A dictionary representing the group created.

new_task(group_id, arguments=None, parameters=None, labels=None, retry_count=0)[source]

Submits a new task to a group. All positional and keyword arguments to the group’s function must be JSON-serializable (i.e., booleans, numbers, strings, lists, dictionaries).

Parameters:
  • group_id (str) – The group id to submit to.
  • arguments (list) – The positional arguments to call the group’s function with.
  • parameters (dict) – The keyword arguments to call the group’s function with.
  • labels (list) – An optional list of labels to attach to the task. Task results can later be filtered by these labels.
  • retry_count (int) – Number of times to retry the task if it fails (maximum 5).
Returns:

A dictionary with one key tasks containing a list with one element representing the submitted task.

new_tasks(group_id, list_of_arguments=None, list_of_parameters=None, list_of_labels=None, retry_count=0)[source]

Submits multiple tasks to a group. All positional and keyword arguments to the group’s function must be JSON-serializable (i.e., booleans, numbers, strings, lists, dictionaries).

Parameters:
  • group_id (str) – The group id to submit to.
  • arguments (list(list)) – The positional arguments to call the group’s function with, for each task.
  • parameters (list(dict)) – The keyword arguments to call the group’s function with, for each task.
  • labels (list(list)) – An optional list of labels to attach, for each task. Task results can later be filtered by these labels.
  • retry_count (int) – Number of times to retry the tasks if they fails (maximum 5).
Returns:

A dictionary with one key tasks containing a list of dictionaries representing the submitted tasks.

rerun_failed_tasks(group_id, retry_count=0)[source]

Submits all failed tasks for a rerun, except for tasks that had an out-of-memory or version mismatch failure. These tasks will be run again with the same arguments as before.

Tasks that are currently already being rerun will be ignored.

Parameters:
  • group_id (str) – The group in which to rerun tasks.
  • retry_count (int) – Number of times to retry a task if it fails (maximum 5)
Returns:

A list of dictionaries representing the tasks that have been submitted.

rerun_matching_tasks(group_id, status=None, failure_type=None, updated=None, created=None, webhook=None, labels=None, retry_count=0)[source]

Submits all completed tasks matching the given search arguments for a rerun. These tasks will be run again with the same arguments as before.

Tasks that are currently already being rerun will be ignored.

Parameters:
  • group_id (str) – The group in which to rerun tasks.
  • status (str) – Filter tasks to this status. Allowed are [‘FAILURE’, ‘SUCCESS’].
  • failure_type (str) – Filter tasks to this type of failure. Allowed are [‘exception’, ‘oom’, ‘timeout’, ‘internal’, ‘unknown’].
  • updated (str) – Filter tasks by updated date after this timestamp.
  • created (str) – Filter tasks by creation date after this timestamp.
  • webhook (str) – Filter by the webhook uid which spawned the task.
  • labels (list(str)) – Labels that must be present in tasks labels list.
  • retry_count (int) – Number of times to retry a task if it fails (maximum 5)
Returns:

A list of dictionaries representing the tasks that have been submitted.

rerun_tasks(group_id, task_id_iterable, retry_count=0)[source]

Submits a list of completed tasks specified by ids for a rerun. The completed tasks with the given ids will be run again with the same arguments as before.

Tasks that are currently already being rerun will be ignored. Unknown or invalid task ids will be ignored.

Parameters:
  • group_id (str) – The group in which to rerun tasks.
  • task_id_iterable (iterable(str)) – An iterable of the task ids to be rerun.
  • retry_count (int) – Number of times to retry a task if it fails (maximum 5)
Returns:

A list of dictionaries representing the tasks that have been submitted.

terminate_group(group_id)[source]

Terminates a task group by id. Once a group is terminated, no more tasks can be submitted to it and it stops using any resources. If the group with the given id is already terminated, nothing happens.

Parameters:group_id (str) – The group id.
Returns:A dictionary representing the terminated task group.
wait_for_completion(group_id, show_progress=False)[source]

Waits until all submitted tasks for a given group are completed.

Parameters:
  • group_id (str) – The group id.
  • show_progress (bool) – Whether to log progress information.
COMPLETION_POLL_INTERVAL_SECONDS = 5
RERUN_BATCH_SIZE = 200
TASK_RESULT_BATCH_SIZE = 100
class FutureTask(guid, tuid, client=None, args=None, kwargs=None)[source]

A submitted task which may or may not have completed yet. Accessing any attributes only available on a completed task (for example result) blocks until the task completes.

get_result(wait=False, timeout=None)[source]

Attempt to load the result for this task. After returning from this method without an exception raised, the return value for the task is available through the result property.

Parameters:
  • wait (bool) – Whether to wait for the task to complete or raise a TransientResultError if the task hasnt completed yet.
  • timeout (int) – How long to wait for the task to complete, or None to wait indefinitely.
COMPLETION_POLL_INTERVAL_SECONDS = 3
FAILURE = 'FAILURE'
SUCCESS = 'SUCCESS'
exception

The name of the exception raised during the function execution, if any.

Type:return
exception_name

The name of the exception raised during the function execution, if any.

Type:return
failure_type

The type of failure if this task did not succeed.

Type:return
is_success

Whether this task succeeded.

Type:return
log

The log output for this completed task.

Type:return
peak_memory_usage

The peak memory usage for this completed task, in bytes.

Type:return
result

The return value of the function for this completed task.

Type:return
runtime

The time spent executing the function for this task, in seconds.

Type:return
stacktrace

The stacktrace of the exception raised during the function execution, if any.

Type:return
status

The status (SUCCESS or FAILURE) for this completed task.

Type:return
traceback

The stacktrace of the exception raised during the function execution, if any.

Type:return
class CloudFunction(group_id, name=None, client=None, retry_count=0)[source]

Represents the asynchronous function of a task group. When called, new tasks are submitted to the group with the positional and keyword arguments given. A map() method allows submitting multiple tasks more efficiently than making individual function calls.

map(args, *iterargs)[source]

Submits multiple tasks efficiently with positional argument to each function call, mimicking the behaviour of the builtin map() function. When submitting multiple tasks this is preferred over calling the function repeatedly.

All positional arguments must be JSON-serializable (i.e., booleans, numbers, strings, lists, dictionaries).

Parameters:
  • args (iterable) – An iterable of arguments. A task will be submitted with each element of the iterable as the first positional argument to the function.
  • iterargs (list(iterable)) – If additional iterable arguments are passed, the function must take that many arguments and is applied to the items from all iterables in parallel (mimicking builtin map() behaviour).
Returns:

A list of FutureTask for all submitted tasks.

wait_for_completion(show_progress=False)[source]

Waits until all tasks submitted through this function are completed.

Parameters:show_progress (bool) – Whether to log progress information.
TASK_SUBMIT_SIZE = 100
as_completed(tasks, show_progress=True)[source]

Yields completed tasks from the list of given tasks as they become available, finishing when all given tasks have been completed.

If you don’t care about the particular results of the tasks and only want to wait for all tasks to complete, use wait_for_completion.

Parameters:
  • tasks (list) – List of FutureTask objects.
  • show_progress (bool) – Whether to log progress information.
exception TimeoutError(message='Timeout exceeded')[source]

Raised when attempting to access results for a task that hasn’t completed.

exception TransientResultError(message='Result not yet ready')[source]

Raised when attempting to access results for a task that hasn’t completed.

class FutureTask(guid, tuid, client=None, args=None, kwargs=None)[source]

A submitted task which may or may not have completed yet. Accessing any attributes only available on a completed task (for example result) blocks until the task completes.

get_result(wait=False, timeout=None)[source]

Attempt to load the result for this task. After returning from this method without an exception raised, the return value for the task is available through the result property.

Parameters:
  • wait (bool) – Whether to wait for the task to complete or raise a TransientResultError if the task hasnt completed yet.
  • timeout (int) – How long to wait for the task to complete, or None to wait indefinitely.
COMPLETION_POLL_INTERVAL_SECONDS = 3
FAILURE = 'FAILURE'
SUCCESS = 'SUCCESS'
exception

The name of the exception raised during the function execution, if any.

Type:return
exception_name

The name of the exception raised during the function execution, if any.

Type:return
failure_type

The type of failure if this task did not succeed.

Type:return
is_success

Whether this task succeeded.

Type:return
log

The log output for this completed task.

Type:return
peak_memory_usage

The peak memory usage for this completed task, in bytes.

Type:return
result

The return value of the function for this completed task.

Type:return
runtime

The time spent executing the function for this task, in seconds.

Type:return
stacktrace

The stacktrace of the exception raised during the function execution, if any.

Type:return
status

The status (SUCCESS or FAILURE) for this completed task.

Type:return
traceback

The stacktrace of the exception raised during the function execution, if any.

Type:return
class ResultType[source]

Possible types of return values for a function.

JSON = 'json'
LEGACY_PICKLE = 'pickle'