Jellyfish blooms pose a significant economic threat to the global aquaculture industry, causing mass mortality events in farmed fish. Automated detection systems are needed for early warning, but existing computer vision models have not been tested in the challenging visual conditions of aquaculture pens. We evaluated domain adaptation approaches and jellyfish object detection models trained specifically for aquaculture environments, comparing eight architectures including convolutional (YOLOv11) and transformer-based (RT-DETR, DINO) models. Our novel dataset comprised 31,875 jellyfish annotations across 2,558 images extracted from 118 unique videos recorded in salmon farms in Tasmania, Australia, capturing jellyfish and gelatinous zooplankton under challenging conditions of high turbidity, complex backgrounds, and low visibility. To evaluate domain adaptation strategies, we combined publicly available datasets with two newly collected non-aquaculture datasets, creating a diverse corpus of 17,622 images with 32,025 annotations. Unlike previous studies, we implemented a strict evaluation protocol accounting for spatial–temporal correlations of images extracted from the same video sequence. Detection transformers consistently outperformed fully convolutional architectures. The DINO architecture with a transformer backbone achieved 56.5% mAP50 when pre-trained on the combined non-aquaculture data and fine-tuned on aquaculture data, a 4.6 percentage point improvement over training on aquaculture data alone, with the strongest improvements in detecting challenging categories such as ctenophores (AP50 improved from 36.9% to 51.3%). This work provides the first jellyfish detection system trained and evaluated for aquaculture environments and demonstrates that transformer-based architectures with large-scale pre-training on out-of-domain data offer the most effective approach for developing jellyfish early warning systems in fish farms.
