Raster¶
The Raster
API has two purposes: retrieving data from the Descartes Labs Catalog, and providing a consistent way of splitting that data into smaller tiles.
Note
The Raster API is currently deprecated and should not be used directly. The functionality is now available as part of the Catalog V2 API and the latter should be used for all new code.
Note
For information about API Quotas and limits see our Quotas & Limits page.
Retrieving data¶
The ndarray
, stack
, and raster
methods retrieve raster data, returning them either in an NumPy array or, for raster
, an image file. These methods accept a list of Image
id`s, which can be found by using the :mod:`Catalog ~descarteslabs.catalog client search available imagery. These methods enable you to rescale, resample, reproject, clip, resize, select bands of interest, and convert data type of the imagery in a single call.
These three methods differ in how they return raster data, but all follow the same pattern. Here’s an example of using ndarray
to retrieve a visible-light image from the Sentinel-2 constellation as a NumPy array, taken over the booming town of McCarthy, Alaska in the lower-right (population 28).
>>> from descarteslabs.client.services.raster import Raster
>>> from descarteslabs.utils import display
>>> import matplotlib.pyplot as plt
>>>
>>> raster = Raster()
>>> s2_scene_id = "sentinel-2:L1C:2017-08-07_07VCJ_99_S2A_v1"
>>>
>>> s2_arr, s2_meta = raster.ndarray(
... s2_scene_id,
... resolution=120,
... bands=["red", "green", "blue", "alpha"],
... scales=[[0,8000], [0,8000], [0,8000], [0,1]],
... data_type="Byte",
... drop_alpha=True,
... )
>>> display(s2_arr, bands_axis=-1, title=s2_scene_id)

Notice that we’ve already chosen a specific scene to retrieve: the ID "sentinel-2:L1C:2017-08-07_07VCJ_99_S2A_v1"
. Typically, you’d find the scenes you want by searching with the Catalog
client, which we’ve just done in advance here. This illustrates the difference between Catalog
and Raster
: Catalog
is for searching for imagery, Raster
is for retrieving it.
The three methods generally take the same parameters, but differ in how they return data:
ndarray¶
ndarray
returns images as a NumPy array with 3 dimensions (height, width, and bands), for analysis and manipulation with Python scientific computing tools. These are masked arrays by default which return the data along with a mask that represents pixels with no data.
You can see an example of using ndarray
above and below.
>>> # percent of pixels valid
>>> import numpy as np
>>> np.sum(s2_arr.mask == 0) / s2_arr.data.size
1.0
By default masked=True
but setting masked=False
will return an unmasked NumPy array. Note that in the above call, the drop_alpha
parameter was used to remove the alpha band from the result (after using it to mask the result). This is because masking and the alpha band are redundant, and the display
function accepts either a four-band image (with the alpha channel) or masked data but not both.
raster¶
raster
will save an image file
to disk.
>>> file, meta = raster.raster(
... "landsat:LC08:PRE:TOAR:meta_LC80330352017056_v1",
... resolution=150,
... bands=["red", "green", "blue", "alpha"],
... )
>>> file
landsat:LC08:PRE:TOAR:meta_LC80330352017056_v1.tif
stack¶
stack
returns multiple images as a 4-D NumPy array. This is particularly useful for creating time-stacks.
>>> # Two Landsat-8 scenes over Santa Fe, NM in Feb and Apr 2017
>>> scene_ids = [
... "landsat:LC08:PRE:TOAR:meta_LC80330352017056_v1",
... "landsat:LC08:PRE:TOAR:meta_LC80330352017104_v1"
... ]
>>>
>>> # use a DLTile (see bottom of this guide) to specify explicit
>>> # resolution, bounds, & SRS. required by stack so that each
>>> # ndarray "layer" has the same shape.
>>> dltile_key = "1024:0:60.0:13:-2:64"
>>> stack, raster_info = raster.stack(
... scene_ids,
... dltile=dltile_key,
... bands=["red", "green", "blue", "alpha"],
... scales=[[0,4000], [0,4000], [0,4000], [0,1]],
... data_type="Byte",
... drop_alpha=True
... )
>>>
>>> stack.shape
(2, 1024, 1024, 3)
>>> display(*stack, title=scene_ids, bands_axis=-1) # easily show multiple images

Mosaicking¶
You can also pass in multiple IDs to these methods, which will mosaic those images together. This is useful when your area of interest spans multiple images.
The mosaic ordering follows the ordering of the list of inputs, bottom to top. Thus, if there is any area of overlap between images, the mosaic image will contain the data from the last image.
As an example, here are two Landsat 8 scenes taken near Rome in early 2018:

You can mosaic the two into a single image. Note that where the images overlap, the pixels from the last image in the list simply “cover up” the others. There is no compositing (as you might do with a median, mean, etc.) of the two images.
>>> scene_ids = [
... "landsat:LC08:01:RT:TOAR:meta_LC08_L1TP_191031_20180130_20180130_01_RT_v1",
... "landsat:LC08:01:RT:TOAR:meta_LC08_L1TP_190031_20180123_20180123_01_RT_v1"
... ]
>>>
>>> mosaic1, meta1 = raster.ndarray(
... scene_ids,
... resolution=150,
... bands=["red", "green", "blue", "alpha"],
... drop_alpha=True
... )
>>>
>>> display(mosaic1, bands_axis=-1, size=5) # use to easily show uint16 images

Notice how the order changes which image is “on top”:
>>> mosaic2, meta2 = raster.ndarray(
... list(reversed(scene_ids)),
... resolution=150,
... bands=["red", "green", "blue", "alpha"],
... drop_alpha=True
... )
>>>
>>> display(mosaic2, bands_axis=-1, size=5)

Note
In most cases, but always when mosiacing, you should include the alpha band as the last band. The alpha band indicates which pixels are valid (1) or invalid (0).
Pixels ouside the valid data area (where alpha would be 0) may not be exactly 0, due to compression artifacts. Without including the alpha band, invalid areas of one scene may mask out valid areas of another.

Raster Parameters¶
Raster methods can take many parameters to handle many different use-cases, but it’s not necessary to use (or know) all of them at first.
Bands¶
Use the bands
parameter to choose which bands of the images you want, and in which order. Though photographs typically have exactly three bands—red, green, and blue—remotely-sensed data often has many more bands than that outside the visible spectrum. You can select as many or few bands as you want, in any order. Alpha (mask) bands are handled specially only when they are the last band in the list.
Note
matplotlib
can only display images with one or three bands (plus an optional alpha band).
Here are two rasters of the same scene of a wildfire in Oklahoma, one in the visible-light spectrum, one falsely colored with the short-wave infrared, near-infrared 2, and near-infrared 1 bands:
>>> bands_scene_id = "sentinel-2:L1C:2018-04-13_14SME_99_S2B_v1"
>>>
>>> bands_visible, _ = raster.ndarray(
... bands_scene_id,
... bands=["red", "green", "blue", "alpha"], # visible-light bands
... resolution=150,
... scales=[[0,4000], [0,4000], [0,4000], [0,1]],
... data_type="Byte",
... drop_alpha=True
... )
>>>
>>> bands_ir, _ = raster.ndarray(
... bands_scene_id,
... bands=["swir2", "swir1", "nir", "alpha"], # infrared bands
... resolution=150,
... scales=[[0,4000], [0,4000], [0,4000], [0,1]],
... data_type="Byte",
... drop_alpha=True
... )

Scales¶
You only should specify scales if you specify a data_type smaller than the product’s native data type.
Scales allow you to map pixel values from their native domain to a range of your choosing. For example, a scale of [0, 4000, 0, 255]
would map 4000 to 255, 0 to 0, and 2000 to 127.5. If the range is left out (the second two values), it defaults to [0, 255]
.
>>> raster_params = {
... "data_type": "Byte",
... "bands": ["red", "green", "blue", "alpha"],
... "resolution": 150,
... "drop_alpha": True
... }
>>>
>>> full_TOAR_arr, full_TOAR_meta = raster.ndarray(
... s2_scene_id,
... scales=[[0,10000], [0,10000], [0,10000], [0,1]],
... **raster_params
... )
>>>
>>> partial_TOAR_arr, partial_TOAR_meta = raster.ndarray(
... s2_scene_id,
... scales=[[0,4000], [0,4000], [0,4000], [0,1]],
... **raster_params
... )

Notice how in the second image, the snowy peaks look blown out, but you can see more detail in the darker valleys. Every pixel above 4000 is rescaled to 255, so detail is lost in areas with high reflectance, and more values between 0-255 are used to express dark areas, giving them more dynamic range.
Scaling is only necessary when requesting data in a smaller data type than its native format—like above, where we requested UInt16
data as Byte
. The 65,535 possible values in a UInt16
must be represented with just 255 values in a Byte
. Though you could simply map [0..65,535] to [0..255], when you know something about the data, you can be more clever to reduce loss of precision.
In typical images, most pixels will be below around 40% reflectance. So rather than using only 0-101 to represent 0-40%, and 102-255 for 40-100%—values which rarely occur—it’s common to use a scaling like [0, 4000, 0, 255]
for each band (as seen in the image above). This allows the values containing most of the useful information to utilize the full output range.
Note
Data range vs physical range
The values in raster data have physical meaning—but for efficiency, are often stored on-disk as different numbers.
Most satellite imagery, for example, measures reflectance: the fraction of the sun’s light reflected by the Earth. (Technically, it measures radiance, which is converted to reflectance, then also possibly corrected for atmospheric absorption.)
It would seem logical to represent these fractions as floating-point numbers between 0 and 1. However, they can also be represented as integers between 0 and 10,000 without loss of precision (due to the number of significant figures to which the instrument is actually accurate).
So storing the data as 16-bit unsigned integers is as precise as 64-bit floats, but uses four times less space (and is faster to operate on for the CPU).
We call the data range the range that values can have in their on-disk format (for example, 0-10000), and the physical range the numbers with physical meaning that the data range represents (in this case, reflectance from 0 to 1).
Again: scales are unnecessary if you just request the data in its native data type. The parameter exists only to allow for the optimization of downsampling to a smaller data type (from UInt16
to Byte
), while retaining control over how that downsampling occurs. Unless you need to optimize your memory usage in that way, it’s easier to specify neither dtype
nor scales
and work with data in its native format. If necessary, you can use display
to visualize it (since matplotlib’s imshow
can only handle uint8
arrays).
Warning
You should always specify a scaling when requesting data in a different data type than its native format.
Otherwise, you’ll get unexpected results, as the original binary values are simply cast to the new data type.
>>> do_this, _ = raster.ndarray(
... s2_scene_id,
... scales=[[0, 10000], [0, 10000], [0, 10000], [0, 1]],
... data_type="Byte", # original data is uint16
... bands=["red", "green", "blue", "alpha"],
... resolution=200,
... masked=False
... )
>>>
>>> dont_do_this, _ = raster.ndarray(
... s2_scene_id,
... scales=None, # no! specify a scaling!
... data_type="Byte", # original data is uint16
... bands=["red", "green", "blue", "alpha"],
... resolution=200,
... masked=False
... )


Cutline¶
A cutline
clips raster data to geometry of your choice. Cutlines can be any GeoJSON object, or a WKT string. As a GeoJSON object, coordinates are expected to be in WGS84 lat-lon.
>>> geometry = {
... "type": "Polygon",
... "coordinates": [[
... [-105.718463, 36.995841],
... [-105.200117, 36.915628],
... [-105.53038, 36.013014],
... [-106.058364, 36.296978],
... [-105.938469, 36.465516],
... [-106.006634, 36.995343],
... [-105.718463, 36.995841]
... ]]
... }
>>>
>>> taos_scene_id = "landsat:LC08:PRE:TOAR:meta_LC80330352017056_v1"
>>>
>>> clipped_arr, c_meta = raster.ndarray(
... taos_scene_id,
... cutline=geometry, # clip to shape
... bands=["red", "green", "blue", "alpha"],
... resolution=150,
... scales=[[0,4000], [0,4000], [0,4000], [0,1]],
... data_type="Byte",
... drop_alpha=True
... )
>>>
>>> unclipped_arr, u_meta = raster.ndarray(
... taos_scene_id,
... cutline=None, # no cutline
... bands=["red", "green", "blue", "alpha"],
... resolution=150,
... scales=[[0,4000], [0,4000], [0,4000], [0,1]],
... data_type="Byte",
... drop_alpha=True
... )

Bounds¶
Bounds clips rasters to a rectangular spatial region defined by [min_x, min_y, max_x, max_y]
. If bounds_srs
is not set, those coordinates should be in the same spatial reference system as the output.
Bounds define a spatial region more explicitly than a cutline; if you find that different images with the same cutline have slightly different numbers of pixels, try setting the bounds.
SRS¶
The SRS (Spatial Reference System, also known as Coordinate Reference System) sets the map projection of the output raster. It can be specified as an EPSG code (like “EPSG:4326”), a PROJ.4 definition, or OGC CRS Well-Known Text.
>>> s2_arr_wgs84, m_wgs84 = raster.ndarray(
... s2_scene_id,
... srs="EPSG:4326", # WGS84 (lat-lon)
... resolution=0.001, # in decimal degrees
... bands=["red", "green", "blue", "alpha"],
... scales=[[0,8000], [0,8000], [0,8000], [0,1]],
... data_type="Byte",
... drop_alpha=True
... )
>>>
>>> s2_arr_polar, m_polar = raster.ndarray(
... s2_scene_id,
... srs="EPSG:3995", # WGS84 Arctic Polar Stereographic
... resolution=150,
... bands=["red", "green", "blue", "alpha"],
... scales=[[0,8000], [0,8000], [0,8000], [0,1]],
... data_type="Byte",
... drop_alpha=True
... )

Here’s the first image we saw over Alaska, in its (native) UTM projection, versus a WGS84 geographic coordinate system (Plate Carrée, where latitude and longitude are mapped directly to x and y), versus an Arctic Plolar Stereographic projection, if you’re into that.
The affine geotransform matrix for the raster is available in the rasterization metadata dictionary that’s also returned from raster calls if you need it for interacting with other geospatial software, or uploading data back to the Descartes Labs Catalog.
>>> m_wgs84["geoTransform"]
[-144.84256570972653, 0.001, 0.0, 62.22092380677686, 0.0, -0.001]
Resolution¶
Resolution specifies the distance that the edge of each pixel represents on the ground, in units of the SRS. So in UTM coordinates, for example, it should be given in meters; in lat-lon coordinates, it should be in decimal degrees. Therefore, the smaller the number, the finer resolution.
Using as coarse of a resolution as possible for your analysis will be both faster and cheaper (less data to transfer over the network, and to operate on). In particular, loading an entire scene at full resolution is rarely necessary, and will take a while.
Dimensions¶
Dimensions specify a width and height, in pixels, within which the output raster should fit.
You can either specify the resolution or the dimensions, but not both. If you use resolution, the output image is resampled to the specified pixel size, which determines its width and height. If you use dimensions, a resolution is chosen that fits the output image into the specified width and height. However, to maintain the aspect ratio, the output dimensions may end up being smaller than you specify.
>>> singapore_s2 = "sentinel-2:L1C:2016-04-16_48NUG_99_S2A_v1"
>>>
>>> arr_smalldims, _ = raster.ndarray(
... singapore_s2,
... dimensions=(100, 100), # width x height
... bands=["red", "green", "blue", "alpha"],
... scales=[[0,4000], [0,4000], [0,4000], [0,1]],
... data_type="Byte",
... drop_alpha=True
... )
>>>
>>> arr_smalldims.shape
(100, 100, 3)
>>> arr_biggerdims, _ = raster.ndarray(
... singapore_s2,
... dimensions=(600, 400), # width x height, unequal
... bands=["red", "green", "blue", "alpha"],
... scales=[[0,4000], [0,4000], [0,4000], [0,1]],
... data_type="Byte",
... drop_alpha=True
... )
>>>
>>> # notice the array is still 400 x 400; it just fits *within* 600 x 400
>>> arr_biggerdims.shape
(400, 600, 3)

Align Pixels¶
The align_pixels
parameter ensures that pixels’ top-left corners in the output raster correspond to non-fractional intervals of resolution
in the output SRS.
What does that mean? Consider requesting two rasters in some imaginary coordinate system with units in meters, one requested with the bounds [0, 0, 10, 10], and another with the bounds [0.5, 0.8, 10.5, 10.8]. Since both areas are 10x10 meters, both rasters would be 10x10 pixels at 1-meter resolution. However, pixel (0, 0) in the first raster would refer to the coordinates (0, 0) in our imaginary SRS, whereas pixel (0, 0) in the second would refer to (0.5, 0.8). Therefore, though the pixels would have the same array indicies, they wouldn’t be quite aligned to the same place on Earth, and comparisons between them would be invalid.
Specifying align_pixels=True
in this case would internally shift bounds
for you, so that the coordinates fell on whole-number intervals of 1 meter. In that case, the bounds of the output would end up being [0, 0, 10, 10].
Therefore, when you’re loading different images covering the same area, and you want to compare pixel-to-pixel between them, set align_pixels=True
in most cases.
>>> s1_id = "sentinel-1:GRD:meta_2018-06-16_085D5608_S1A"
>>>
>>> utm_bounds = (
... 418080.5,
... 3379200.25,
... 518080.5,
... 3579200.25
... )
>>>
>>> s1_arr, s1_meta = raster.ndarray(
... s1_id,
... align_pixels=False,
... bounds=utm_bounds,
... resolution=200,
... bands=["vv", "vh"]
... )
>>>
>>> s1_arr_tap, s1_meta_tap = raster.ndarray(
... s1_id,
... align_pixels=True,
... bounds=utm_bounds,
... resolution=200,
... bands=["vv", "vh"]
... )
>>>
>>> # without align_pixels, the pixels start on a fraction of a meter
>>> s1_meta['cornerCoordinates']['upperLeft'], s1_meta['cornerCoordinates']['lowerRight']
([418080.5, 3579200.25], [518080.5, 3379200.25])
>>> # with it, they are "snapped" to whole meters
>>> s1_meta_tap['cornerCoordinates']['upperLeft'], s1_meta_tap['cornerCoordinates']['lowerRight']
([418000.0, 3579400.0], [518200.0, 3379200.0])
>>> s1_arr.shape
(1000, 500, 2)
>>> # notice that this changes the output dimensions
>>> # to cover the extra fraction of a pixel
>>> s1_arr_tap.shape
(1001, 501, 2)
All Touched¶
The cutline_all_touched
controls how source image pixels are mapped into the output pixels. Specifically it controls the GDAL CUTLINE_ALL_TOUCHED
warp option. By default this parameter is False
, and a source pixel will only contribute to the destination pixel if the bounds of the destination pixel contain the centroid of the source pixel (the so-called “pixel-as-point” approach). If set to True
then a source pixel will contribute to the destination pixel if the bounds of the destination pixel intersect with the bounds of the source pixel.
It is recommended to leave cutline_all_touched
at its default value. In general it guards against systematic bias in the results. However, there are circumstances where all_touched=True
is warranted. This is the case when the destination resolution is smaller than the source resolution and the bounds of a destination pixel may not contain the centroid of any source pixel, resulting in a masked pixel.
Here’s an example:
>>> scene_id = "ecmwf:era5:v0:ERA5_20210401T20_tile4"
>>>
>>> geometry = {
... "type": "Polygon",
... "coordinates": ((
... (-50.21819862632718, -6.033494420295789),
... (-50.21973462385856, -6.021475529166609),
... (-50.09615761215656, -6.021324708843302),
... (-50.09368786752497, -6.121996683190829),
... (-50.21768560099267, -6.122384050341935),
... (-50.21819862632718, -6.033494420295789)
... ),)
... }
>>>
>>> bounds = (
... -50.21973462385856,
... -6.122384050341935,
... -50.09368786752497,
... -6.021324708843302
... )
>>>
>>>
>>> arr, meta = raster.ndarray(
... scene_id,
... align_pixels=False,
... outputBounds=bounds,
... outputBoundsSRS="EPSG:4326",
... srs="EPSG:3857",
... resolution=100.,
... bands=["tp"],
... cutline=geometry
... )
>>>
>>> arr_at, meta_at = raster.ndarray(
... scene_id,
... align_pixels=False,
... outputBounds=bounds,
... outputBoundsSRS="EPSG:4326",
... srs="EPSG:3857",
... resolution=100.,
... bands=["tp"],
... cutline=geometry,
... cutline_all_touched=True
... )
>>>
>>> # without ``cutline_all_touched`` the result contains no data (all masked)
>>> import numpy as np
>>> arr.mask.sum() == np.prod(arr.shape)
True
>>> # with the ``cutline_all_touched`` the result contains data which comes from outside the cutline
>>> arr_at.mask.sum() == 0
True
Resampler¶
This determines the algorithm used to interpolate between pixels when scaling up, scaling down, or otherwise warping a raster to a new resolution or SRS.
Nearest-neighbor is the default (fastest but lowest-quality). Bilinear produces visually smoother images when downsampling. Be aware, though, that if using discretely-valued data (such as a raster classification layer where 1 means brick, 3 means wood, etc.), use a resampler like nearest-neighbor or median—an interpolating resampler could produce values that don’t exist in your input data (like 2).
>>> resamplers = [
... "bilinear",
... "near",
... "cubic"
... ]
>>>
>>> arrs = [
... raster.ndarray(
... s1_id,
... resampler=resampler,
... resolution=1000,
... bands=["vv", "vh", "vv"]
... )[0]
... for resampler in resamplers
... ]
>>>
>>> display(arrs[0], title=resamplers[0], bands_axis=-1)

>>> display(arrs[1], title=resamplers[1], bands_axis=-1)

>>> display(arrs[2], title=resamplers[2], bands_axis=-1)
