Storage

The Storage API provides a mechanism to store arbitrary data and later retrieve it using simple key-value semantics. A few examples of how the Storage API could be used:

  • Store an auxiliary dataset useful for your particular analysis.
  • Store custom shapes for regions of interest for your application.
  • Upload raster data that can later be registered via the catalog client.

A basic interaction consists of a PUT API call that sets a blob (a chunk of bytes) at some key, followed by a GET API call that retrieves the data by referencing the same key. There is no limit in place on the total size of a blob, though in general it is better to store a few large files than many small ones.
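
Because a few large blobs are preferred over many small ones, one option is to bundle small records into a single serialized blob before storing it. The sketch below illustrates that idea with JSON. The dict named fake_storage is a hypothetical in-memory stand-in for the service, not the real client, and the key and record names are invented for illustration.

```python
import json

# Hypothetical stand-in for the storage service: a plain dict mapping key -> blob.
fake_storage = {}

# Many small records that could each have been stored under their own key.
records = {
    "site-001": {"lat": 35.68, "lon": -105.94},
    "site-002": {"lat": 40.71, "lon": -74.01},
    "site-003": {"lat": 47.61, "lon": -122.33},
}

# Bundle them into one blob and store it under a single key (the PUT step).
fake_storage["sites/all.json"] = json.dumps(records).encode("utf-8")

# Later, fetch the blob by the same key (the GET step) and unpack it.
restored = json.loads(fake_storage["sites/all.json"].decode("utf-8"))
print(restored["site-002"]["lat"])
```

With this layout, a single key holds all three records, so one GET retrieves everything at the cost of having to rewrite the whole blob when any record changes.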

Note

For information about API Quotas and limits see our Quotas & Limits page.

Storage Types

Several storage types can be chosen when interacting with the Storage API: data, tmp, products, result, and logs. Each has a different use case, but all behave similarly.

The data storage type should be considered the default storage type. If you don’t know where data should go, it should probably go there.

The tmp storage type is meant to be used for temporary assets, and may be deleted after seven days.

The products storage type is for storing raster data and has several restrictions. Uploading is the only operation you may perform on the products storage type directly; deleting data must be done through the Image.delete() interface.

The result and logs storage types are where the output and logs from task processing are stored.

Set / Get

The most basic interactions exposed by the storage client are the set() and get() operations. The set() operation supports uploading strings and file-like objects.

For example, to store the string "Hello Storage", you would write storage_client.set("test", "Hello Storage"). To later retrieve that value, you run print(storage_client.get("test")).
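
The round trip described above can be tried without access to the service. The following is a minimal sketch, assuming only the key-value behavior described in this section: InMemoryStorage is a hypothetical stand-in written for illustration, not the real storage client, and details such as what get() returns or how a missing key is reported may differ in the actual API.

```python
import io


class InMemoryStorage:
    """Hypothetical stand-in that mimics set()/get() key-value semantics."""

    def __init__(self):
        self._blobs = {}

    def set(self, key, value):
        # Accept either a string or a file-like object, as the text describes.
        if hasattr(value, "read"):
            value = value.read()
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        self._blobs[key] = value

    def get(self, key):
        # Raises KeyError for an unknown key; the real service may differ.
        return self._blobs[key]


storage_client = InMemoryStorage()

# Store a string, then read it back with the same key.
storage_client.set("test", "Hello Storage")
print(storage_client.get("test"))

# Store the contents of a file-like object under another key.
storage_client.set("from-file", io.StringIO("Hello from a file-like object"))
print(storage_client.get("from-file"))
```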

Uploading and Downloading Large Files

Sometimes you may want to upload a large file. If you were to use the set() operation, the entire file would need to exist in memory, which may be limiting. To circumvent this limit, you can use the set_file() method, which accepts both filenames and file-like objects.

>>> # using the file object
... with open("filename", "rb") as fobj:
...     storage_client.set_file("testkey", fobj)
...
>>> # using the filename
... storage_client.set_file("testkey", "filename")
...

To download a large file, use the corresponding get_file() method. Like set_file(), it avoids reading all response content into memory at once and accepts both filenames and file-like objects.
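
The memory saving comes from moving data in bounded chunks rather than as one object. The sketch below shows that general technique using only the standard library. It copies between local file-like objects and does not call the storage service; copy_in_chunks and the chunk size are hypothetical choices made for illustration, and the real set_file() and get_file() may be implemented differently.

```python
import io


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst, reading at most chunk_size bytes per iteration."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# Simulate a 1 MiB source without touching the real service.
source = io.BytesIO(b"x" * (1024 * 1024))
destination = io.BytesIO()

copied = copy_in_chunks(source, destination)
print(copied)
```

Each read is bounded by chunk_size, so the copy itself needs roughly constant memory regardless of file size, provided the source and destination are real files or streams rather than in-memory buffers.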