Real-time object detection and tracking is an active area of aerial remote sensing research, enabling many environmental and ecological monitoring and preservation applications. Despite the development of several solutions tailored to these applications, trade-offs between cost efficiency and feature richness persist. This paper proposes a lightweight, low-cost, and modular approach to real-time object detection and instance tracking that supports a wide range of use cases. By integrating real-time object detection models with affordable embedded hardware, we present a system that uses image metadata to geolocate detected objects, with computational overhead low enough for real-time operation. The pipeline generates cleaner 'areas of interest' by passing geolocated detections through a clustering algorithm that removes false positives. In our experiments, this proved a viable solution, achieving real-time processing speeds and GPS positioning accuracy within one meter. While there is room for improvement, our proposed pipeline represents a significant step toward lowering the costs of applying computer vision to conservation applications.
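The abstract above does not name the clustering algorithm used to filter false positives. As an illustration only, a minimal density-based filter in the spirit of DBSCAN, which discards geolocated detections that lack enough nearby neighbors, might look like this (the distance threshold and neighbor count are assumed parameters, not values from the paper):

```python
import math

def filter_detections(points, eps_m=2.0, min_pts=3):
    """Keep geolocated detections with at least `min_pts` neighbors
    within `eps_m` meters; isolated points are treated as likely
    false positives. Points are (easting, northing) pairs in meters.
    """
    kept = []
    for i, (xi, yi) in enumerate(points):
        neighbors = sum(
            1 for j, (xj, yj) in enumerate(points)
            if i != j and math.hypot(xi - xj, yi - yj) <= eps_m
        )
        if neighbors >= min_pts:
            kept.append((xi, yi))
    return kept
```

A dense cluster of detections survives the filter, while a single far-away detection, likely spurious, is dropped before an area of interest is drawn.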
Forest diebacks pose a major threat to global ecosystems. Identifying and mapping both living and dead trees is crucial for understanding the causes and implementing effective management strategies. This study explores the efficacy of Mask R-CNN for automated forest dieback monitoring. The method detects individual trees, delineates their crowns, and classifies them as alive or dead. We evaluated the approach using aerial imagery and canopy height models in the Harz Mountains, Germany, a region severely affected by forest dieback. To assess the model's ability to track changes over time, we applied it to images from three separate flight campaigns (2009, 2016, and 2022). This evaluation considered variations in acquisition dates, cameras, post-processing techniques, and image tilting. Forest changes were analyzed based on the number, spatial distribution, and height of the detected trees. A comprehensive accuracy assessment demonstrated Mask R-CNN's robust performance, with precision scores ranging from 0.80 to 0.88 and F1-scores from 0.88 to 0.91. These results confirm the model's ability to generalize across diverse image acquisition conditions. While only minor changes were observed between 2009 and 2016, the period between 2016 and 2022 saw substantial dieback, with a 64.57% loss of living trees. Notably, taller trees appeared to be particularly affected. This study highlights Mask R-CNN's potential as a valuable tool for automated forest dieback monitoring. It enables efficient detection, delineation, and classification of both living and dead trees, providing crucial data for informed forest management practices.
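Since the abstract reports precision and F1 but not recall, it is worth noting that recall is implied by the other two via F1 = 2PR/(P + R). A short sketch inverting that relation (the function name is ours, not from the paper):

```python
def implied_recall(precision, f1):
    """Invert F1 = 2PR / (P + R) to recover the recall R
    implied by a reported precision P and F1-score."""
    return f1 * precision / (2 * precision - f1)
```

For the upper end of the reported range (precision 0.88, F1 0.91), the implied recall is about 0.94, i.e., the detector misses relatively few trees while occasionally over-detecting.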
The progressing industrialization of oceans mandates reliable, accurate, and automatable subsea survey methods. Close-range photogrammetry is a promising discipline, frequently applied by archaeologists, fish farmers, and the offshore energy industry. This paper presents a robust approach for the reliable detection and identification of photogrammetric markers in subsea images. The proposed method is robust to the severe image degradation frequently observed underwater due to turbidity, light absorption, and optical aberrations. It is the first step towards a highly automated workflow for single-camera underwater photogrammetry. The newly developed approach comprises several machine learning models, trained on 10,122 real-world subsea images showing a total of 338,301 photogrammetric markers. Performance is evaluated using standard object detection metrics and through a comparison with the commercial software Agisoft Metashape. Metashape delivers satisfactory results when the image quality is good; in images with strong noise, haze, or little light, only the novel approach retrieves sufficient information for a high degree of automation of the subsequent bundle adjustment. The approach reduces the need for offshore personnel and the time to results while increasing the robustness of the survey.
Several industrial and commercial bulk material management applications rely on accurate, current stockpile volume estimation. Proximal imaging and LiDAR sensing modalities can be used to derive stockpile volume estimates in outdoor and indoor storage facilities. Of the two, LiDAR is more advantageous for indoor storage facilities due to its ability to capture scans under poor lighting conditions. Evaluating volumes from such sensing modalities requires the pose (i.e., position and orientation) parameters of the sensors used relative to a common reference frame. For outdoor facilities, a Global Navigation Satellite System (GNSS) combined with an Inertial Navigation System (INS) can be used to derive the sensors' pose relative to a global reference frame. For indoor facilities, GNSS signal outages preclude this capability. Prior research has developed strategies for establishing the sensor position and orientation for stockpile volume estimation while relying on multi-beam spinning LiDAR units. These approaches are feasible due to the large range and Field of View (FOV) of such systems, which can capture the internal surfaces of indoor storage facilities.
The mechanical movement of multi-beam spinning LiDAR units, together with the harsh conditions within indoor facilities (e.g., excessive humidity, wide temperature variation, dust, and the corrosive environment of deicing salt storage facilities), limits the use of such systems. With the increasing availability of solid-state LiDAR units, there is interest in exploring their potential for stockpile volume estimation. Despite their higher robustness to harsh conditions, solid-state LiDAR units have a shorter measurement range and a more limited FOV than multi-beam spinning LiDAR. This research presents a strategy for the extrinsic calibration (i.e., estimating the relative pose parameters) of solid-state LiDAR units installed inside stockpile storage facilities. The extrinsic calibration is made possible by deployed spherical targets and a complete reference scan of the facility from another LiDAR sensing modality. The proposed research introduces strategies for: 1) automated extraction of the spherical targets; 2) automated matching of these targets in the solid-state LiDAR and reference scans using invariant relationships among them; and 3) coarse-to-fine estimation of the calibration parameters. Experimental results in several facilities have shown the feasibility of the proposed methodology, conducting the extrinsic calibration and volume evaluation with an error below 3.5% even with occlusion reaching up to 50%.
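The automated extraction of spherical targets (step 1 above) implies estimating each sphere's center from LiDAR returns on its surface. The abstract does not detail the fitting method; as one plausible sketch, an algebraic least-squares sphere fit, which linearizes the sphere equation so the center and radius fall out of a single linear solve, could look like this:

```python
import numpy as np

def fit_sphere(points):
    """Algebraic least-squares sphere fit.

    Rewrites |p - c|^2 = r^2 as the linear relation
    |p|^2 = 2 c . p + (r^2 - |c|^2) and solves for the
    center c and radius r. `points` is an (N, 3) array of
    LiDAR returns on the target surface.
    """
    p = np.asarray(points, dtype=float)
    A = np.hstack([2 * p, np.ones((len(p), 1))])  # unknowns: cx, cy, cz, t
    b = (p ** 2).sum(axis=1)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    center, t = x[:3], x[3]
    radius = np.sqrt(t + center @ center)         # t = r^2 - |c|^2
    return center, radius
```

Because the fit is linear, it needs no initial guess and tolerates partial coverage of the sphere, which matters when targets are seen from one side only; a nonlinear geometric refinement could then serve as the "fine" stage of a coarse-to-fine scheme.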

