SeagrassFinder: An Underwater Eelgrass Image Classification Dataset

  • Jannik Elsäßer (Creator)
  • Laura Weihl (Creator)
  • Lisbeth Tangaa Nielsen (Creator)
  • Veronika Cheplygina (Creator)

Dataset

Description

This dataset accompanies the paper “SeagrassFinder: Deep Learning for Eelgrass Detection and Coverage Estimation in the Wild”, published in the journal Ecological Informatics. It is a machine learning dataset for training computer vision models to classify the presence of eelgrass, and was created by the main author, Jannik Elsäßer, as part of his bachelor's thesis. The original video transect data comes from DHI A/S's work providing By og Havn with a “Summer Status” report on the maritime environmental impacts of the Lynetteholm project. More information on the project and the report is available here: https://byoghavn.dk/mediebibliotek/lynetteholm-sommerstatus-2023/

The dataset consists of underwater images taken from a sled dragged through the water by a survey vessel. The camera used is a Subsea HD-Camera made by LH-Camera. Images were created by extracting 5 video frames per second and then randomly sampling from them. Each image is labeled True or False for eelgrass presence. In total, the dataset consists of approximately 8500 images from 6 different transects, with 4482 images containing eelgrass and 4042 images not containing eelgrass.

All images have been annotated by both domain experts and non-domain experts. Images were annotated using a uniform sampling process. Whenever annotators disagreed on an image, that image was removed from the dataset. For more information on the dataset creation, please refer to the corresponding paper.

We recommend using one transect as the test set rather than a random split of all images. A random split introduces a form of data leakage: some images can be very similar to others, so near-duplicates could end up in both the training and test sets.

An unfortunate limitation, which we believe is caused by the compression of the videos in the camera system, is that some frames contain an echo or a form of motion trail. This can lead to ghost-like eelgrass features in some frames and should be taken into consideration when applying the dataset in future locations.
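
The recommendation to hold out one full transect as the test set can be sketched as follows. This is a minimal illustration, not the authors' code: the record fields (`path`, `transect`, `eelgrass`), the transect names, and the file paths are hypothetical, and it assumes a transect identifier is available for each image, which the description above does not specify.

```python
def split_by_transect(records, test_transect):
    """Hold out every image from one transect as the test set.

    `records` is a list of dicts with a "transect" key (hypothetical
    layout; adapt to however the transect is encoded in your copy).
    """
    train, test = [], []
    for record in records:
        if record["transect"] == test_transect:
            test.append(record)
        else:
            train.append(record)
    if not test:
        raise ValueError(f"no images found for transect {test_transect!r}")
    return train, test


# Hypothetical example records; real paths and labels come from the dataset.
records = [
    {"path": "transect_1/img_0001.jpg", "transect": "transect_1", "eelgrass": True},
    {"path": "transect_1/img_0002.jpg", "transect": "transect_1", "eelgrass": False},
    {"path": "transect_2/img_0001.jpg", "transect": "transect_2", "eelgrass": True},
    {"path": "transect_3/img_0001.jpg", "transect": "transect_3", "eelgrass": False},
    {"path": "transect_3/img_0002.jpg", "transect": "transect_3", "eelgrass": True},
]

train, test = split_by_transect(records, "transect_3")

# No transect appears on both sides, so near-duplicate frames from the
# same transect cannot leak from training into the test set.
train_transects = {r["transect"] for r in train}
test_transects = {r["transect"] for r in test}
```

Because the split is made on the transect rather than on individual images, all frames from the held-out transect stay together. Repeating the split once per transect would give a leave-one-transect-out evaluation.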
Date made available: 8 Oct 2024
Publisher: ZENODO
