In our latest article, Kaya ter Burg, researcher in multi-modal underwater computer vision for litter detection at Delft University of Technology, shares insights about her work for SeaClear2.0 and the collection and curation of a dataset for marine litter, flora and fauna.
Read the full article below:
Written by: Kaya ter Burg
One of the key components of the SeaClear2.0 system is the MiniTortuga, a remotely operated vehicle (ROV) whose task is to scan the seabed to discover where litter is located and what type of litter it is. For the MiniTortuga to do this, we need to integrate algorithms that can locate and classify marine litter. For this, we use deep-learning-based artificial intelligence (AI) techniques.
However, before we can do any deep learning, we need a large amount of data. Most deep-learning techniques work by processing huge amounts of data in order to learn a particular task. The neural networks are then able to grasp the patterns that occur in the data and, based on this, can perform their task autonomously. For the SeaClear2.0 system, this task is the localization and classification of marine litter. Localization means knowing where in an image an object occurs; classification means knowing what type of object it is.
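To make the two notions concrete, a single detection combines both: a bounding box for localization and a class label for classification. The sketch below is purely illustrative; the names and fields are assumptions, not the actual SeaClear2.0 code.

```python
from dataclasses import dataclass

# Illustrative sketch: one detection = localization (a pixel-coordinate
# bounding box) + classification (a class label with a confidence score).
@dataclass
class Detection:
    x_min: int   # left edge of the bounding box, in pixels
    y_min: int   # top edge
    x_max: int   # right edge
    y_max: int   # bottom edge
    label: str   # predicted class, e.g. "plastic_bottle"
    score: float # the network's confidence in this prediction, between 0 and 1

# One hypothetical detection in a camera frame:
det = Detection(x_min=412, y_min=230, x_max=598, y_max=415,
                label="plastic_bottle", score=0.91)
print(f"{det.label} at ({det.x_min}, {det.y_min})-({det.x_max}, {det.y_max})")
```

A trained detector typically emits a list of such detections per frame, one entry per object it finds.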
The first step in getting enough data to feed our neural networks is the actual collection of it. This involves getting out in the water and recording camera and sonar videos of the seabed. We collect camera and sonar video simultaneously, so that we can later use both to detect marine litter. In the videos below, you can see a snippet of the camera and sonar footage we collect for the dataset.




Data is collected from all pilot and demo sites involved in the SeaClear2.0 project, so that we can curate a diverse dataset for our algorithms. This is very important, since the differences in the underwater environment between locations can be huge! Some example frames, each from a different location, collected by the SeaClear[1] team are shown below. The turbidity, reflections and colors change drastically; this is one of the main challenges in underwater computer vision. To combat it, we want our algorithms to have seen many different circumstances during training, so that they work well under various conditions, even ones they might not have seen before. Therefore, we need to collect data from many different locations.
After collection has taken place, we have camera and sonar videos that contain various marine litter items. However, this "raw" data, as it is called, cannot be used as-is to train our algorithms. We first need to process the videos further. The first processing step is to extract frames from the videos, so that we end up with images.
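Frame extraction can be sketched as follows. This is a minimal example of the general technique, not the SeaClear2.0 pipeline itself: it samples one frame every so many seconds (consecutive video frames are nearly identical, so keeping all of them adds little) and assumes OpenCV is available for the actual video reading.

```python
def select_indices(total_frames: int, fps: float, seconds_between: float) -> list:
    """Indices of the frames to keep when sampling one frame every
    `seconds_between` seconds from a video running at `fps`."""
    step = max(1, round(fps * seconds_between))
    return list(range(0, total_frames, step))

def extract_frames(video_path: str, out_dir: str, seconds_between: float = 1.0) -> int:
    """Save the sampled frames of `video_path` into `out_dir` as PNG images.

    Requires OpenCV (pip install opencv-python); imported here so the
    sampling helper above stays dependency-free. Returns the number of
    frames written.
    """
    import os
    import cv2
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    keep = set(select_indices(total, fps, seconds_between))
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in keep:
            cv2.imwrite(os.path.join(out_dir, f"frame_{idx:06d}.png"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved
```

For example, `select_indices(100, 25.0, 1.0)` keeps every 25th frame of a 25 fps clip: indices 0, 25, 50 and 75.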
Then comes the most important step: labeling the data. This means that we manually annotate every image with the location and type of each litter object that occurs in it. The location of an object is denoted by a bounding box, which is the smallest box you can draw around the object without missing any part of it. Each object also gets a class label, which describes the precise type of the object, such as a plastic bottle. These bounding boxes and class labels are the information we give to the algorithms in order to teach them how to perform this localization and classification autonomously, without the need for further human annotations. Labeling the data is a very time-consuming task, and it is important to do it precisely, since the quality of the annotations is directly tied to the performance of our algorithms after training.

For the SeaClear2.0 dataset, we label many different types of litter, as well as animals, plants and ROV parts that might be visible in a frame. That means our algorithms will be able to detect not only marine litter, but also marine flora and fauna. This ensures that the system can collect litter without harming the surrounding environment.
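To give an idea of what such annotations look like in practice, here is a hypothetical labeled frame in a COCO-style `[x, y, width, height]` convention. The file name, class names and field layout are illustrative assumptions; the actual SeaClear2.0 label format may differ.

```python
import json

# Hypothetical annotation for a single extracted frame. Each object carries a
# bounding box (COCO-style [x, y, width, height], in pixels) and a class label.
annotation = {
    "image": "frame_000123.png",
    "width": 1920,
    "height": 1080,
    "objects": [
        {"bbox": [412, 230, 186, 185], "label": "plastic_bottle"},
        {"bbox": [900, 640, 140, 60], "label": "fish"},
    ],
}

def bbox_valid(obj: dict, img_w: int, img_h: int) -> bool:
    """A bounding box must lie fully inside the image and have positive area —
    a basic sanity check used when curating hand-made annotations."""
    x, y, w, h = obj["bbox"]
    return x >= 0 and y >= 0 and w > 0 and h > 0 and x + w <= img_w and y + h <= img_h

# Verify every annotated object before it enters the dataset:
assert all(bbox_valid(o, annotation["width"], annotation["height"])
           for o in annotation["objects"])
print(json.dumps(annotation, indent=2))
```

Simple validity checks like this catch annotation slips (boxes sticking out of the frame, zero-size boxes) early, before they degrade training.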


The next step is actually using this data for its intended purpose: training neural networks to detect marine litter!
[1] SeaClear2.0 builds on the work implemented within the Horizon 2020-funded project SeaClear, which developed the first autonomous robotic system for seafloor litter collection and ended in December 2023.