About

This registry contains information about datasets that are available via Radiant MLHub. For instructions on accessing these datasets, please refer to our API documentation. Each dataset has a specific license and citation that should be followed when using the data.

See all usage examples for datasets listed in this registry.


Search datasets (currently 13 matching datasets)


Update registry

If you want to update the entry for a dataset (e.g., to add new tutorials or publications), please follow the instructions in the GitHub repository for the Radiant MLHub data registry.

If you want to share/host your training dataset on Radiant MLHub, please contact us by filling out this form.

CV4A Kenya Crop Type Competition

crop type, segmentation, sentinel-2

This dataset was produced as part of the Crop Type Detection competition at the Computer Vision for Agriculture (CV4A) Workshop at the ICLR 2020 conference. The objective of the competition was to create a machine learning model to classify fields by crop type from images collected during the growing season by the Sentinel-2 satellites.

The ground reference data were collected by the PlantVillage team, and Radiant Earth Foundation curated the training dataset after inspecting and selecting more than 4,000 fields from the original ground reference data. The dataset has been split into training and test sets (3,286 in the training set and 1,402 in the test set).
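As a quick sanity check on the split described above, the following minimal sketch adds the two quoted counts and reports the share of each set. The constant and function names are illustrative only, and it assumes the two counts are in the same unit as the "more than 4,000 fields" figure.

```python
# Counts quoted in the dataset description (assumed to be in the same unit).
TRAIN_COUNT = 3286
TEST_COUNT = 1402


def split_fractions(train, test):
    """Return (total, train fraction, test fraction) for a two-way split."""
    total = train + test
    return total, train / total, test / total


total, train_frac, test_frac = split_fractions(TRAIN_COUNT, TEST_COUNT)
print(f"total={total}, train={train_frac:.1%}, test={test_frac:.1%}")
# prints: total=4688, train=70.1%, test=29.9%
```

The total of 4,688 is consistent with "more than 4,000", and the split is roughly 70/30.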

The dataset is cataloged in four tiles. Each tile is smaller than an original Sentinel-2 tile, because the imagery has been clipped and chipped to the geographical area where labels were collected.

Each tile has a) 13 multi-band observations throughout the growing s...

Details →

Usage examples

See 5 usage examples →

BigEarthNet

image classification, land cover, sentinel-2

BigEarthNet is a new large-scale Sentinel-2 benchmark archive, consisting of 590,326 Sentinel-2 image patches. To construct BigEarthNet, 125 Sentinel-2 tiles acquired between June 2017 and May 2018 over 10 European countries (Austria, Belgium, Finland, Ireland, Kosovo, Lithuania, Luxembourg, Portugal, Serbia, Switzerland) were initially selected. All the tiles were atmospherically corrected by the Sentinel-2 Level 2A product generation and formatting tool (sen2cor). Then, they were divided into 590,326 non-overlapping image patches. Each image patch was annotated by the multiple land-cove...

Details →

Usage examples

See 2 usage examples →

SpaceNet 2

building footprints, segmentation, worldview-3

The commercialization of the geospatial industry has led to an explosive amount of data being collected to characterize our changing planet. One area for innovation is the application of computer vision and deep learning to extract information from satellite imagery at scale. CosmiQ Works, Radiant Solutions and NVIDIA have partnered to release the SpaceNet data set to the public to enable developers and data scientists to work with this data.

Today, map features such as roads, building footprints, and points of interest are primarily created through manual techniques. We believe that advanc
...

Details →

Usage examples

See 2 usage examples →

Dalberg Data Insights Crop Type Uganda

crop type, segmentation, sentinel-2

This dataset contains crop types and field boundaries along with other metadata collected in a campaign run by Dalberg Data Insights at the end of September 2017, as close as possible to the harvest period of 2017. GeoODK apps were used to collect approximately four points per field to get the widest coverage during two field campaigns.

After the ground data collection, Radiant Earth Foundation conducted a quality control of the polygons using Sentinel-2 imagery of the growing season as well as Google basemap imagery, and removed several polygons that overlapped with infrastructure or built-up areas. F
...

Details →

Usage examples

See 1 usage example →

Great African Food Company Crop Type Tanzania

crop type, segmentation, sentinel-2

This dataset contains field boundaries and crop types from farms in Tanzania. Great African Food Company used the Farmforce app to collect a point within each field, and recorded other properties including the area of the field.

The Radiant Earth Foundation team used the point measurements from the ground data collection and the area of each field, overlaid on satellite imagery (multiple Sentinel-2 scenes during the growing season, and Google basemap), to draw the polygons for each field. These polygons do not cover the entirety of the field, and are always enclosed within the field. Therefore, they should
...

Details →

Usage examples

See 1 usage example →

PlantVillage Crop Type Kenya

crop type, segmentation, sentinel-2

This dataset contains field boundaries and crop type information for fields in Kenya. The PlantVillage app is used to collect multiple points around each field, and collectors have access to basemap imagery in the app during data collection. They use the basemap as a guide in collecting and verifying the points.

After the ground data collection, Radiant Earth Foundation conducted a quality control of the polygons using Sentinel-2 imagery of the growing season as well as Google basemap imagery. Two actions were taken on the data: 1) several polygons that had overlapping areas with different crop labels we
...

Details →

Usage examples

See 1 usage example →

Chesapeake Land Cover

building footprints, land cover, landsat 8, naip, nlcd, segmentation

This dataset contains high-resolution aerial imagery from the USDA NAIP program, high-resolution land cover labels from the Chesapeake Conservancy, low-resolution land cover labels from the USGS NLCD 2011 dataset, low-resolution multi-spectral imagery from Landsat 8, and high-resolution building footprint masks from Microsoft Bing, formatted to accelerate machine learning research into land cover mapping. The Chesapeake Conservancy spent over 10 months and $1.3 million creating a consistent six-class land cover dataset covering the Chesapeake Bay watershed. While the purpose of the mapping eff...

Details →