About

Radiant Earth's registry contains information about geospatial training datasets that are available via Radiant MLHub. For instructions on accessing these open datasets, please refer to the Radiant MLHub API documentation. Each training dataset has its own license and citation, which should be followed when using the data.

See all usage examples for datasets listed in this registry.




Update registry

If you want to update the entry for a geospatial training dataset (e.g., by adding new tutorials or publications), please follow the instructions on the GitHub repository for the Radiant MLHub geospatial training data registry.

If you want Radiant Earth to share/host your training dataset on Radiant MLHub, please contact us by filling out this form.

BigEarthNet

image classification, land cover, sentinel-2

BigEarthNet is a new large-scale Sentinel-2 benchmark archive, consisting of 590,326 Sentinel-2 image patches. To construct BigEarthNet, 125 Sentinel-2 tiles acquired between June 2017 and May 2018 over 10 European countries (Austria, Belgium, Finland, Ireland, Kosovo, Lithuania, Luxembourg, Portugal, Serbia, Switzerland) were initially selected. All the tiles were atmospherically corrected by the Sentinel-2 Level 2A product generation and formatting tool (sen2cor). Then, they were divided into 590,326 non-overlapping image patches. Each image patch was annotated by the multiple land-cove...

Details →

Usage examples

See 2 usage examples →

CV4A Kenya Crop Type Competition

crop type, segmentation, sentinel-2

This dataset was produced as part of the Crop Type Detection competition at the Computer Vision for Agriculture (CV4A) Workshop at the ICLR 2020 conference. The objective of the competition was to create a machine learning model to classify fields by crop type from images collected during the growing season by the Sentinel-2 satellites.

The ground reference data were collected by the PlantVillage team, and Radiant Earth Foundation curated the training dataset after inspecting and selecting more than 4,000 fields from the original ground reference data. The dataset has been split into training and test sets (3,286 in the training set and 1,402 in the test set).

The dataset is cataloged in four tiles. These tiles are smaller than the original Sentinel-2 tiles: each has been clipped and chipped to the geographical area where labels were collected.

Each tile has a) 13 multi-band observations throughout the growing s...

Details →

Usage examples

See 5 usage examples →

Chesapeake Land Cover

building footprints, land cover, landsat 8, naip, nlcd, segmentation

This dataset contains high-resolution aerial imagery from the USDA NAIP program, high-resolution land cover labels from the Chesapeake Conservancy, low-resolution land cover labels from the USGS NLCD 2011 dataset, low-resolution multi-spectral imagery from Landsat 8, and high-resolution building footprint masks from Microsoft Bing, formatted to accelerate machine learning research into land cover mapping. The Chesapeake Conservancy spent over 10 months and $1.3 million creating a consistent six-class land cover dataset covering the Chesapeake Bay watershed. While the purpose of the mapping eff...

Details →

Dalberg Data Insights Crop Type Uganda

crop type, segmentation, sentinel-2

This dataset contains crop types and field boundaries along with other metadata collected in a campaign run by Dalberg Data Insights at the end of September 2017, as close as possible to the 2017 harvest period. GeoODK apps were used to collect approximately four points per field to get the widest coverage during two field campaigns.

Post ground data collection, Radiant Earth Foundation conducted a quality control of the polygons using Sentinel-2 imagery of the growing season as well as Google basemap imagery, and removed several polygons that overlapped with infrastructure or built-up areas. F
...

Details →

Usage examples

See 1 usage example →

Great African Food Company Crop Type Tanzania

crop type, segmentation, sentinel-2

This dataset contains field boundaries and crop types from farms in Tanzania. Great African Food Company used the Farmforce app to collect a point within each field, and recorded other properties including the area of the field.

The Radiant Earth Foundation team used the point measurements from the ground data collection and the area of each field, overlaid on satellite imagery (multiple Sentinel-2 scenes during the growing season, and Google basemap), to draw the polygons for each field. These polygons do not cover the entirety of the field, and are always enclosed within the field. Therefore, they should
...

Details →

Usage examples

See 1 usage example →

LandCoverNet

land cover, segmentation, sentinel-2

LandCoverNet is a global annual land cover classification training dataset with labels for the multi-spectral satellite imagery from the Sentinel-2 mission in 2018. Version 1.0 of the dataset contains data across Africa, which accounts for ~1/5 of the global dataset. Each pixel is identified as one of the seven land cover classes based on its annual time series. These classes are water, natural bare ground, artificial bare ground, woody vegetation, cultivated vegetation, (semi) natural vegetation, and permanent snow/ice.

There are a total of 1980 image chips of 256 x 256 pixels in V1.0, spanning 66 Sentinel-2 tiles. Each image chip contains temporal observations from the Sentinel-2 surface reflectance product (L2A) at 10 m spatial resolution and an annual class label, all stored in a raster format (GeoTIFF files).
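As a rough illustration of the figures above, the following Python sketch works out the ground footprint of one chip and the total number of labeled pixels in V1.0. The constant and function names are hypothetical. The chips-per-tile value is only a mean derived from the stated totals; the description does not say that chips are distributed evenly across tiles.

```python
# Figures taken from the dataset description; names are illustrative only.
CHIP_COUNT = 1980      # image chips in V1.0
CHIP_SIDE_PX = 256     # chip width and height in pixels
PIXEL_SIZE_M = 10      # spatial resolution in meters
TILE_COUNT = 66        # Sentinel-2 tiles spanned by V1.0


def chip_statistics():
    """Derive simple size statistics from the published chip counts."""
    chip_side_m = CHIP_SIDE_PX * PIXEL_SIZE_M          # 2,560 m per side
    chip_area_km2 = chip_side_m ** 2 / 1_000_000       # square kilometers
    pixels_per_chip = CHIP_SIDE_PX ** 2                # 65,536 pixels
    return {
        "chip_side_km": chip_side_m / 1000,
        "chip_area_km2": chip_area_km2,
        "pixels_per_chip": pixels_per_chip,
        "total_labeled_pixels": CHIP_COUNT * pixels_per_chip,
        # Mean only: an assumption-free average, not a per-tile guarantee.
        "mean_chips_per_tile": CHIP_COUNT / TILE_COUNT,
    }


if __name__ == "__main__":
    for name, value in chip_statistics().items():
        print(f"{name}: {value}")
```

Under these figures, each chip covers 2.56 km by 2.56 km on the ground, and each label raster holds 65,536 pixels.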

Radiant Earth Foundation designed and generated this dataset with a grant from
...

Details →

Usage examples

See 1 usage example →

PlantVillage Crop Type Kenya

crop type, segmentation, sentinel-2

This dataset contains field boundaries and crop type information for fields in Kenya. The PlantVillage app is used to collect multiple points around each field, and collectors have access to basemap imagery in the app during data collection. They use the basemap as a guide in collecting and verifying the points.

Post ground data collection, Radiant Earth Foundation conducted a quality control of the polygons using Sentinel-2 imagery of the growing season as well as Google basemap imagery. Two actions were taken on the data: 1) several polygons that had overlapping areas with different crop labels we
...

Details →

Usage examples

See 1 usage example →

SpaceNet 1

building footprints, segmentation, worldview-3

The commercialization of the geospatial industry has led to an explosive amount of data being collected to characterize our changing planet. One area for innovation is the application of computer vision and deep learning to extract information from satellite imagery at scale. CosmiQ Works, Radiant Solutions and NVIDIA have partnered to release the SpaceNet data set to the public to enable developers and data scientists to work with this data.

Today, map features such as roads, building footprints, and points of interest are primarily created through manual techniques. We believe that advanc
...

Details →

Usage examples

See 2 usage examples →

SpaceNet 2

building footprints, segmentation, worldview-3

The commercialization of the geospatial industry has led to an explosive amount of data being collected to characterize our changing planet. One area for innovation is the application of computer vision and deep learning to extract information from satellite imagery at scale. CosmiQ Works, Radiant Solutions and NVIDIA have partnered to release the SpaceNet data set to the public to enable developers and data scientists to work with this data.

Today, map features such as roads, building footprints, and points of interest are primarily created through manual techniques. We believe that advanc
...

Details →

Usage examples

See 2 usage examples →

SpaceNet 3

road network, segmentation, worldview-3

The commercialization of the geospatial industry has led to an explosive amount of data being collected to characterize our changing planet. One area for innovation is the application of computer vision and deep learning to extract information from satellite imagery at scale. CosmiQ Works, Radiant Solutions and NVIDIA have partnered to release the SpaceNet data set to the public to enable developers and data scientists to work with this data.

Today, map features such as roads, building footprints, and points of interest are primarily created through manual techniques. We believe that advanc
...

Details →

Usage examples

See 2 usage examples →

SpaceNet 4

building footprints, off-nadir, segmentation, worldview-3

The commercialization of the geospatial industry has led to an explosive amount of data being collected to characterize our changing planet. One area for innovation is the application of computer vision and deep learning to extract information from satellite imagery at scale. CosmiQ Works, Radiant Solutions and NVIDIA have partnered to release the SpaceNet data set to the public to enable developers and data scientists to work with this data.

Today, map features such as roads, building footprints, and points of interest are primarily created through manual techniques. We believe that advanc
...

Details →

Usage examples

See 2 usage examples →

SpaceNet 5

road network, segmentation, worldview-3

Determining optimal routing paths in near real-time is at the heart of many humanitarian, civil, military, and commercial challenges. This statement is as true today as it was two years ago when the SpaceNet Partners announced the SpaceNet Challenge 3 focused on road network detection and routing. In a disaster response scenario, for example, pre-existing foundational maps are often rendered useless due to debris, flooding, or other obstructions. Satellite or aerial imagery often provides the first large-scale data in such scenarios, rendering such imagery attractive.

The SpaceNet 5 challenge sought to build upon the advances from SpaceNet 3 and test challenge participants to automatically extract road networks and routing information from satellite imagery, along with travel time estimates along all roadways, thereby permitting true optimal routing.

The task of this challenge was to output a detailed graph structure with edges corresponding to roadways and nodes corresponding to intersections and end points, with estimates for route travel times on all detected edges. You can find a detailed description of CosmiQ Works’ algorithmic baseline on their blog at The DownLinQ.
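The kind of graph described above can be illustrated with a small sketch. The Python example below is not the SpaceNet 5 submission format or the CosmiQ Works baseline. It assumes a hypothetical adjacency dictionary (`ROAD_GRAPH`) whose nodes are intersections or end points and whose edge weights are made-up travel-time estimates in seconds, and it uses Dijkstra's algorithm, a standard shortest-path technique chosen here purely for illustration, to find the fastest route.

```python
import heapq

# Hypothetical road graph: nodes are intersections or end points,
# edges are roadways weighted by estimated travel time in seconds.
ROAD_GRAPH = {
    "A": {"B": 30.0, "C": 80.0},
    "B": {"A": 30.0, "C": 20.0, "D": 90.0},
    "C": {"A": 80.0, "B": 20.0, "D": 40.0},
    "D": {"B": 90.0, "C": 40.0},
}


def fastest_route(graph, start, goal):
    """Return (total_travel_time, path), or (inf, []) if goal is unreachable."""
    best = {start: 0.0}       # lowest known travel time to each node
    previous = {}             # predecessor of each node on its best route
    queue = [(0.0, start)]    # min-heap of (elapsed_time, node)
    while queue:
        elapsed, node = heapq.heappop(queue)
        if node == goal:
            break
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry; a faster route was already found
        for neighbor, travel_time in graph.get(node, {}).items():
            candidate = elapsed + travel_time
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if goal not in best:
        return float("inf"), []
    # Walk predecessors back from the goal to rebuild the route.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return best[goal], path


if __name__ == "__main__":
    print(fastest_route(ROAD_GRAPH, "A", "D"))
```

Weighting edges by travel time rather than by length is what lets a routing query prefer a longer but faster road, which is the distinction the challenge draws between road extraction and optimal routing.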

SpaceNet o...

Details →

Usage examples

See 3 usage examples →

SpaceNet 6

building footprints, off-nadir, sar, segmentation, worldview-2

Synthetic Aperture Radar (SAR) is a unique form of radar that can penetrate clouds, collect during all-weather conditions, and capture data day and night. Overhead collects from SAR satellites could be particularly valuable in the quest to aid disaster response in instances where weather and cloud cover can obstruct traditional electro-optical sensors. However, despite these advantages, there is limited open data available to researchers to explore the effectiveness of SAR for such applications, particularly at ultra-high resolutions.

The task of SpaceNet 6 was to automatically extract building footprints with computer vision and artificial intelligence (AI) algorithms using a combination of SAR and electro-optical imagery datasets. This openly-licensed dataset features a unique combination of half-meter Synthetic Aperture Radar (SAR) imagery from Capella Space and half-meter electro-optical (EO) imagery from Maxar’s WorldView 2 satellite. The area of interest for the challenge was centered over the largest port in Europe: Rotterdam, the Netherlands. This area features thousands of buildings, vehicles, and boats of various sizes, to make an effective test bed for SAR and the fusion of these two types of data.

In this challenge, the training dataset contained both SA
...

Details →

Usage examples

See 2 usage examples →

Western USA Live Fuel Moisture

landsat 8, live fuel moisture, sar, sentinel-1

This dataset contains manually collected live fuel moisture measurements in the western United States and remotely sensed variables. Live fuel moisture represents the mass of water in live vegetation elements like leaves, needles, and twigs divided by their oven-dried mass, expressed as a percentage. The higher the live fuel moisture, the wetter the vegetation elements, and vice versa. Live fuel moisture measurements were collected by the United States Forest Service and are available from the National Fuel Moisture Database. Each row of the data corresponds to one unique ground measurement of live ...
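The definition of live fuel moisture given above can be written as a short calculation. The Python sketch below is illustrative only: the function name and arguments are hypothetical, and it assumes the water mass is obtained as the difference between the fresh sample mass and its oven-dried mass, which the description does not state explicitly.

```python
def live_fuel_moisture_percent(fresh_mass_g, oven_dry_mass_g):
    """Return live fuel moisture as a percentage of oven-dried mass.

    Assumption (not stated in the dataset description): the mass of
    water equals the fresh mass minus the oven-dried mass.
    """
    if oven_dry_mass_g <= 0:
        raise ValueError("oven-dried mass must be positive")
    if fresh_mass_g < oven_dry_mass_g:
        raise ValueError("fresh mass cannot be less than oven-dried mass")
    water_mass_g = fresh_mass_g - oven_dry_mass_g
    return 100.0 * water_mass_g / oven_dry_mass_g


if __name__ == "__main__":
    # Made-up sample: 25 g fresh, 10 g after oven drying.
    print(live_fuel_moisture_percent(25.0, 10.0))
```

Because the denominator is the dry mass rather than the fresh mass, values above 100% are possible whenever a sample holds more water than dry matter.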

Details →

Usage examples

See 1 usage example →