Meng Xiao, Teng Hu, Zhizhong Kang, Haifeng Zhao, Feng Liu
Impact craters are geomorphological features widely distributed on the lunar surface. Their morphological parameters are crucial for studying the causes of their formation, the thickness of the lunar regolith at the impact site and the age of the impact crater. However, current research on extracting multiple morphological parameters from large numbers of impact craters over extensive geographical regions faces several challenges, including coordinate offsets in heterogeneous data, insufficient interpretation of impact crater profile morphology and incomplete extraction of morphological parameters. To address these challenges, this paper proposes an automatic morphological parameter extraction method based on a digital elevation model (DEM) and an impact crater database. It comprises correction of coordinate offsets in heterogeneous data, simulation of impact crater profile morphology and automatic extraction of multiple morphological parameters. The method is designed to handle large numbers of impact craters over wide areas, making it particularly useful for regional-scale impact crater analysis. Experiments were carried out in geological units of different ages and the accuracy of the method was analysed. The results show that, first, the proposed method effectively corrects the offset of the impact crater centre position. Second, the impact crater profile shape fitting is accurate: the R-squared value (R2) ranges from 0.97 to 1 and the mean absolute percentage error (MAPE) is between 0.032% and 0.568%, reflecting a high goodness of fit. Finally, the eight morphological parameters automatically extracted by this method, such as depth, depth–diameter ratio and internal and external slope, are essentially consistent with those extracted manually. A comparison with a similar approach demonstrates that the proposed method is effective and can provide data support for relevant lunar surface research.
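As an illustration of the kind of morphological parameters discussed above, the following minimal Python sketch derives depth, depth–diameter ratio and mean interior/exterior slopes from a single radial elevation profile sampled from a DEM. It is not the authors' implementation; the profile array, the sampling step and the rim-detection rule are simplifying assumptions.

import numpy as np

def crater_profile_parameters(elev, step):
    """Derive simple morphological parameters from a radial elevation profile.

    elev : 1-D array of elevations (m) sampled outward from the crater centre.
    step : horizontal spacing between samples (m).
    """
    rim_idx = int(np.argmax(elev))                   # highest point taken as the rim crest
    floor_idx = int(np.argmin(elev[:rim_idx + 1]))   # lowest point inside the rim
    depth = elev[rim_idx] - elev[floor_idx]          # rim-to-floor depth
    diameter = 2.0 * rim_idx * step                  # twice the centre-to-rim distance
    # Mean interior slope: floor to rim; mean exterior slope: rim to profile end.
    interior = np.degrees(np.arctan2(depth, (rim_idx - floor_idx) * step))
    ext_run = (len(elev) - 1 - rim_idx) * step
    exterior = np.degrees(np.arctan2(elev[rim_idx] - elev[-1], ext_run)) if ext_run > 0 else float("nan")
    return {
        "depth": depth,
        "diameter": diameter,
        "depth_diameter_ratio": depth / diameter if diameter else float("nan"),
        "interior_slope_deg": interior,
        "exterior_slope_deg": exterior,
    }

# Example: a synthetic bowl-shaped profile with a raised rim.
r = np.linspace(0, 1500, 151)                        # metres from the crater centre
profile = np.where(r < 1000, -80 * np.cos(np.pi * r / 2000), 20 * np.exp(-(r - 1000) / 300))
print(crater_profile_parameters(profile, step=10.0))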
{"title":"Automatic extraction of multiple morphological parameters of lunar impact craters","authors":"Meng Xiao, Teng Hu, Zhizhong Kang, Haifeng Zhao, Feng Liu","doi":"10.1111/phor.12483","DOIUrl":"https://doi.org/10.1111/phor.12483","url":null,"abstract":"Impact craters are geomorphological features widely distributed on the lunar surface. Their morphological parameters are crucial for studying the reasons for their formation, the thickness of the lunar regolith at the impact site and the age of the impact crater. However, current research on the extraction of multiple morphological parameters from a large number of impact craters within extensive geographical regions faces several challenges, including issues related to coordinate offsets in heterogeneous data, insufficient interpretation of impact crater profile morphology and incomplete extraction of morphological parameters. To address the aforementioned challenges, this paper proposes an automatic extraction method of morphological parameters based on the digital elevation model (DEM) and impact crater database. It involves the correction of heterogeneous data coordinate offset, simulation of impact crater profile morphology and various impact crater morphological parameter automatic extraction. And the method is designed to handle large numbers of impact craters in a wide range of areas. This makes it particularly useful for studies involving regional‐scale impact crater analysis. Experiments were carried out in geological units of different ages and we analysed the accuracy of this method. The analysis results show that: first, the proposed method has a relatively effective impact crater centre position offset correction. Second, the impact crater profile shape fitting result is relatively accurate. The <jats:italic>R</jats:italic>‐squared value (<jats:italic>R</jats:italic><jats:sup><jats:italic>2</jats:italic></jats:sup>) is distributed from 0.97 to 1, and the mean absolute percentage error (<jats:italic>MAPE</jats:italic>) is between 0.032% and 0.568%, which reflects high goodness of fit. Finally, the eight morphological parameters automatically extracted using this method, such as depth, depth–diameter ratio, and internal and external slope, are basically consistent with those extracted manually. By comparing the proposed method with a similar approach, the results demonstrate that it is effective and can provide data support for relevant lunar surface research.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"234 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140316408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most existing point cloud segmentation methods ignore directional information when extracting neighbourhood features. These methods are ineffective at extracting neighbourhood features because point cloud data are not uniformly distributed and the methods are restricted by the size of the convolution kernel. We therefore propose a method that accounts for both multiple directions and hole sampling (MDHS). First, for every point in the data we perform spherically sparse sampling with directional encoding in the surrounding domain to enlarge the local perceptual field. The input consists of basic geometric features. A graph convolutional neural network is used to maximise point cloud characteristics within a local neighbourhood. The more representative local point features are then automatically weighted and fused by an attention pooling layer. Finally, spatial attention is added to strengthen the connections between remote points, which improves segmentation accuracy. Experimental results show that the overall accuracy (OA) and mean intersection over union (mIoU) are 1.3% and 4.0% higher than those of PointWeb and 0.6% and 0.7% higher than those of the baseline method RandLA-Net. For indoor point cloud semantic segmentation, the proposed network outperforms the other methods.
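The attention pooling described above can be sketched as follows. This is a generic PyTorch illustration rather than the MDHS code; the feature dimensions and the single shared scoring layer are assumptions.

import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Weight and fuse the K neighbour features of each point with learned attention scores."""

    def __init__(self, channels):
        super().__init__()
        # Per-neighbour, per-channel scores normalised over the K neighbours.
        self.score_fn = nn.Sequential(nn.Linear(channels, channels), nn.Softmax(dim=-2))

    def forward(self, neighbour_feats):
        # neighbour_feats: (B, N, K, C) features of K neighbours for each of N points.
        scores = self.score_fn(neighbour_feats)
        pooled = torch.sum(scores * neighbour_feats, dim=-2)   # weighted fusion over K
        return pooled                                           # (B, N, C)

feats = torch.randn(2, 1024, 16, 32)      # 2 clouds, 1024 points, 16 neighbours, 32 channels
print(AttentionPooling(32)(feats).shape)  # torch.Size([2, 1024, 32])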
{"title":"Indoor point cloud semantic segmentation based on direction perception and hole sampling","authors":"Xijiang Chen, Peng Li, Bufan Zhao, Tieding Lu, Xunqiang Gong, Hui Deng","doi":"10.1111/phor.12482","DOIUrl":"https://doi.org/10.1111/phor.12482","url":null,"abstract":"Most existing point cloud segmentation methods ignore directional information when extracting neighbourhood features. Those methods are ineffective in extracting point cloud neighbourhood features because the point cloud data is not uniformly distributed and is restricted by the size of the convolution kernel. Therefore, we take into account both multiple directions and hole sampling (MDHS). First, we execute spherically sparse sampling with directional encoding in the surrounding domain for every point inside the data to increase the local perceptual field. The data input is the basic geometric features. We use the graph convolutional neural network to conduct the maximisation of point cloud characteristics in a local neighbourhood. Then the more representative local point features are automatically weighted and fused by an attention pooling layer. Finally, spatial attention is added to increase the connection between remote points, and then the segmentation accuracy is improved. Experimental results show that the OA and mIoU are 1.3% and 4.0% higher than the method PointWeb and 0.6% and 0.7% higher than the baseline method RandLA-Net. For the indoor point cloud semantic segmentation, the segmentation effect of the proposed network is superior to other methods.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140045403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning-based multi-view stereo (MVS) methods have made remarkable progress in recent years. However, these methods exhibit limited robustness when faced with occlusion and weakly or repetitively textured regions in the image. These factors often lead to holes in the final point cloud model owing to excessive pixel-matching errors. To address these challenges, we propose a novel MVS network assisted by monocular prediction for 3D reconstruction. Our approach combines the strengths of the monocular and multi-view branches, leveraging the internal semantic information extracted from a single image through monocular prediction along with the strict geometric relationships between multiple images. Moreover, we adopt a coarse-to-fine strategy that gradually reduces the number of assumed depth planes and narrows the interval between them as the resolution of the input images increases during network iteration. This strategy balances computational resource consumption against model effectiveness. Experiments on the DTU, Tanks and Temples, and BlendedMVS datasets demonstrate that our method achieves outstanding results, particularly in textureless regions.
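The coarse-to-fine depth hypothesis schedule can be illustrated with a short sketch: at each stage the number of sampled depth planes shrinks while the sampling interval narrows around the previous estimate. The stage counts and shrink factor below are arbitrary assumptions, not the values used in Mono-MVS.

import numpy as np

def depth_hypotheses(depth_min, depth_max, planes_per_stage=(48, 32, 8), shrink=0.25):
    """Yield per-stage depth planes, centring each finer stage on the previous search range."""
    centre = 0.5 * (depth_min + depth_max)
    half_range = 0.5 * (depth_max - depth_min)
    for stage, n_planes in enumerate(planes_per_stage):
        planes = np.linspace(centre - half_range, centre + half_range, n_planes)
        yield stage, planes
        # In the real network the next centre comes from the refined depth estimate;
        # here we simply keep the centre and shrink the search range.
        half_range *= shrink

for stage, planes in depth_hypotheses(425.0, 935.0):
    print(f"stage {stage}: {len(planes)} planes, interval {planes[1] - planes[0]:.2f}")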
{"title":"Mono-MVS: textureless-aware multi-view stereo assisted by monocular prediction","authors":"Yuanhao Fu, Maoteng Zheng, Peiyu Chen, Xiuguo Liu","doi":"10.1111/phor.12480","DOIUrl":"https://doi.org/10.1111/phor.12480","url":null,"abstract":"The learning-based multi-view stereo (MVS) methods have made remarkable progress in recent years. However, these methods exhibit limited robustness when faced with occlusion, weak or repetitive texture regions in the image. These factors often lead to holes in the final point cloud model due to excessive pixel-matching errors. To address these challenges, we propose a novel MVS network assisted by monocular prediction for 3D reconstruction. Our approach combines the strengths of both monocular and multi-view branches, leveraging the internal semantic information extracted from a single image through monocular prediction, along with the strict geometric relationships between multiple images. Moreover, we adopt a coarse-to-fine strategy to gradually reduce the number of assumed depth planes and minimise the interval between them as the resolution of the input images increases during the network iteration. This strategy can achieve a balance between the computational resource consumption and the effectiveness of the model. Experiments on the DTU, Tanks and Temples, and BlendedMVS datasets demonstrate that our method achieves outstanding results, particularly in textureless regions.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"148 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140008129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhihua Xu, Yiru Niu, Yan Cui, Rongjun Qin, Wenbin Sun
Camera localisation is an essential task in the field of computer vision. The objective is to determine the precise position and orientation of a newly introduced camera station from a collection of geographically referenced control images. Traditional feature-based approaches face difficulties when localising images that exhibit significant disparities in viewpoint. Modern deep learning approaches, by contrast, aim to regress camera poses directly from image content, taking a holistic view to remedy the problem of viewpoint disparities. This paper posits that although deep networks can learn robust and invariant visual features, incorporating geometric models provides rigorous constraints on the pose estimation process. Following the classic structure-from-motion (SfM) pipeline, we propose a PL-Pose framework for camera localisation. First, to improve feature correlations for images with large viewpoint disparities, we combine point and line features based on a deep learning framework and the geometric relations of wireframes. Then, a cost function is constructed using the combined point and line features to impose constraints on the bundle adjustment process. Finally, the camera pose parameters and 3D points are estimated through an iterative optimisation process. We verify the accuracy of the PL-Pose approach on two datasets, the publicly available S3DIS dataset and the self-collected CUMTB_Campus dataset. The experimental results demonstrate that in both indoor and outdoor scenes, our PL-Pose method achieves localisation errors of less than 1 m for 82% of the test points, whereas the best result among the four comparison methods is merely 72%. Moreover, the PL-Pose method successfully obtains the camera pose parameters in all scenes with small or large viewpoint disparities, indicating good stability and adaptability.
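A minimal sketch of a combined point-and-line reprojection cost of the kind described above is given below. It assumes a simple pinhole projection, point-to-point residuals for point features and endpoint-to-infinite-line distances for line features; the weighting and parametrisation in PL-Pose may differ.

import numpy as np

def project(K, R, t, X):
    """Pinhole projection of 3D points X (N, 3) into pixel coordinates (N, 2)."""
    x_cam = (R @ X.T + t.reshape(3, 1)).T
    x_img = (K @ x_cam.T).T
    return x_img[:, :2] / x_img[:, 2:3]

def point_line_cost(K, R, t, pts3d, pts2d, lines3d, lines2d, w_line=1.0):
    """Sum of squared point reprojection errors plus squared point-to-line distances."""
    res_pts = project(K, R, t, pts3d) - pts2d          # point term
    cost = np.sum(res_pts ** 2)
    for (A, B), (a, b) in zip(lines3d, lines2d):       # line term
        proj = project(K, R, t, np.vstack([A, B]))     # reprojected 3D line endpoints
        d = b - a
        n = np.array([-d[1], d[0]]) / np.linalg.norm(d)  # unit normal of the observed 2D line
        cost += w_line * np.sum(((proj - a) @ n) ** 2)
    return cost

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
R, t = np.eye(3), np.zeros(3)
pts3d = np.array([[0.0, 0.0, 5.0], [1.0, -0.5, 6.0]])
pts2d = project(K, R, t, pts3d)                        # perfect observations
lines3d = [(np.array([0.0, 1.0, 5.0]), np.array([1.0, 1.0, 5.0]))]
lines2d = [(project(K, R, t, lines3d[0][0][None])[0], project(K, R, t, lines3d[0][1][None])[0])]
print(point_line_cost(K, R, t, pts3d, pts2d, lines3d, lines2d))   # ~0 for the correct pose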
{"title":"PL‐Pose: robust camera localisation based on combined point and line features using control images","authors":"Zhihua Xu, Yiru Niu, Yan Cui, Rongjun Qin, Wenbin Sun","doi":"10.1111/phor.12481","DOIUrl":"https://doi.org/10.1111/phor.12481","url":null,"abstract":"Camera localisation is an essential task in the field of computer vision. The objective is to determine the precise position and orientation of one newly introduced camera station based on a collection of control images that are geographically referenced. Traditional feature‐based approaches have been found to face difficulties when confronted with the task of localising images that exhibit significant disparities in viewpoint. Modern deep learning approaches, on the contrary, aim to directly regress camera poses from input image content, being holistic to remedy the problem of viewpoint disparities. This paper posits that although deep networks possess the ability to learn robust and invariant visual features, the incorporation of geometry models can provide rigorous constraints in the process of pose estimation. Following the classic structure‐from‐motion (SfM) pipeline, we propose a PL‐Pose framework to perform camera localisation. First, to improve feature correlations for images with large viewpoint disparities, we perform the combination of point and line features based on a deep learning framework and geometric relation of wireframes. Then, a cost function is constructed using the combined point and line features in order to impose constraints on the bundle adjustment process. Finally, the camera pose parameters and 3D points are estimated through an iterative optimisation process. We verify the accuracy of the PL‐Pose approach through the utilisation of two datasets, that is, the publicly available S3DIS dataset and the self‐collected dataset CUMTB_Campus. The experimental results demonstrate that in both indoor and outdoor scenes, our PL‐Pose method can achieve localisation errors of less than 1 m for 82% of the test points. In contrast, the other four comparison methods yield a best result of merely 72%. Meanwhile, the PL‐Pose method can successfully obtain the camera pose parameters in all the scenes with small or large viewpoint disparities, indicating its good stability and adaptability.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"60 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140008346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Structure from motion (SfM) using optical images is an important prerequisite for reconstructing three-dimensional (3D) landforms. Although various algorithms have been developed, they suffer from the need to match features across many image pairs and to search recursively for the most suitable image to add to the SfM reconstruction, so carrying out SfM is computationally costly. This research proposes a boosting SfM (B-SfM) pipeline with two phases, an indexing graph network (IGN) and graph tracking, to accelerate SfM reconstruction. The IGN forms image pairs with desirable spatial correlation to reduce the time spent on feature matching. Building on the IGN, graph tracking integrates ant colony optimisation and greedy sorting algorithms to encode an optimum image sequence for SfM reconstruction. Compared with other available methods, the experimental results show that the proposed approach accelerates the two phases, feature matching and 3D reconstruction, by up to 14 times. The quality of the recovered camera poses is retained or even slightly improved. As a result, the developed B-SfM achieves efficient SfM reconstruction by reducing the time cost of image pair selection for feature matching and of image order determination.
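The greedy part of the image-ordering idea can be illustrated with a small sketch: given a symmetric matrix scoring how strongly each image pair is related (for example by shared matches), repeatedly append the unused image most strongly connected to the images already in the sequence. Only the greedy-sorting half of the strategy is shown, with a random score matrix standing in for the IGN; the ant colony optimisation component is omitted.

import numpy as np

def greedy_image_order(scores):
    """Order images so each newcomer is the one best connected to the current sequence.

    scores : (N, N) symmetric matrix of pairwise relatedness (higher = more overlap).
    """
    n = scores.shape[0]
    start = int(np.argmax(scores.sum(axis=1)))     # begin with the best-connected image
    order, remaining = [start], set(range(n)) - {start}
    while remaining:
        best = max(remaining, key=lambda j: scores[j, order].max())
        order.append(best)
        remaining.remove(best)
    return order

rng = np.random.default_rng(0)
m = rng.random((6, 6)); m = (m + m.T) / 2; np.fill_diagonal(m, 0)   # stand-in pair scores
print(greedy_image_order(m))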
{"title":"Associating UAS images through a graph-based guiding strategy for boosting structure from motion","authors":"Min-Lung Cheng, Yuji Fujita, Yasutaka Kuramoto, Hiroyuki Miura, Masashi Matsuoka","doi":"10.1111/phor.12479","DOIUrl":"https://doi.org/10.1111/phor.12479","url":null,"abstract":"Structure from motion (SfM) using optical images has been an important prerequisite for reconstructing three-dimensional (3D) landforms. Although various algorithms have been developed in the past, they suffer from many image pairs for feature matching and recursive searching for the most suitable image to add to SfM reconstruction. Thus, carrying out SfM is computationally costly. This research proposes a boosting SfM (B-SfM) pipeline containing two phases, indexing graph network (IGN) and graph tracking, to accelerate SfM reconstruction. The IGN intends to form image pairs presenting desirable spatial correlation to reduce the time costs spent for feature matching. Building on the IGN, graph tracking integrates ant colony optimisation and greedy sorting algorithms to encode an optimum image sequence to boost SfM reconstruction. Compared to the results derived from other available means, the experimental results show that the proposed approach can accelerate the two phases, feature matching and 3D reconstruction, by up to 14 times faster. The quality of the camera poses recovered is retained or even slightly improved. As a result, the developed B-SfM can efficiently achieve SfM reconstruction by suppressing the time cost in the fashion of image pair selection for feature matching and image order determination for more efficient SfM reconstruction.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"11 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139669542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhi Zheng, Yi Wan, Yongjun Zhang, Zhonghua Hu, Dong Wei, Yongxiang Yao, Chenming Zhu, Kun Yang, Rang Xiao
Recent studies have demonstrated that deep learning-based stereo matching methods (DLSMs) can far exceed conventional methods on most benchmark datasets, both improving visual performance and decreasing the mismatching rate. However, applying DLSMs to high-resolution satellite stereos with broad image coverage and wide terrain variety is still challenging. First, the broad coverage of satellite stereos brings a wide disparity range, whereas DLSMs are in most cases limited to a narrow disparity range, resulting in incorrect disparity estimation in areas whose disparities fall outside it. Second, high-resolution satellite stereos comprise various terrain types and are more complicated than carefully prepared datasets, so the performance of DLSMs on satellite stereos is unstable, especially in intractable regions such as texture-less and occluded regions. Third, generating DSMs requires occlusion-aware disparity maps, while traditional occlusion detection methods are not always applicable to DLSMs with continuous disparity. To tackle these problems, this paper proposes a novel DLSM-based DSM generation workflow comprising three steps: pre-processing, disparity estimation and post-processing. The pre-processing step introduces low-resolution terrain to shift unmatched disparity ranges into a fixed scope and crops the satellite stereos into regular patches. The disparity estimation step proposes a hybrid feature fusion network (HF2Net) to improve matching performance. In detail, HF2Net comprises a cross-scale feature extractor (CSF) and a multi-scale cost filter. The feature extractor differentiates structural-context features in complex scenes and thus enhances HF2Net's robustness to satellite stereos, especially in intractable regions. The cost filter removes most matching errors to ensure accurate disparity estimation. The post-processing step generates initial DSM patches from the estimated disparity maps and then refines them into the final large-scale DSMs. Initial experiments on the public US3D dataset showed better accuracy than state-of-the-art methods, indicating HF2Net's superiority. We then created a self-made Gaofen-7 dataset to train HF2Net and conducted DSM generation experiments on two Gaofen-7 stereos to further demonstrate the effectiveness and practical capability of the proposed workflow.
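The pre-processing idea of using low-resolution terrain to move a wide, scene-dependent disparity range into a fixed scope can be sketched as follows. The linear disparity-height relation, the 0.2 px/m factor and the fixed network range are assumptions for illustration only, not the geometry used in the paper.

import numpy as np

def shift_disparity_range(true_disp, coarse_terrain_disp, network_range=(-64, 64)):
    """Subtract a terrain-predicted disparity so the residual fits the network's fixed range.

    true_disp           : (H, W) full-resolution disparity implied by the scene (for simulation).
    coarse_terrain_disp : (H, W) disparity predicted from a low-resolution DEM, upsampled.
    """
    residual = true_disp - coarse_terrain_disp     # what the network actually has to estimate
    lo, hi = network_range
    in_range = np.mean((residual >= lo) & (residual <= hi))
    return residual, in_range

# Simulated mountainous scene: disparities spanning several hundred pixels.
h = np.linspace(0, 3000, 512)                      # terrain height profile (m)
true_disp = np.tile(0.2 * h, (512, 1))             # assume ~0.2 px of disparity per metre of height
coarse = np.tile(0.2 * np.round(h, -2), (512, 1))  # what a coarse DEM would predict
residual, frac = shift_disparity_range(true_disp, coarse)
print(f"residual range {residual.min():.1f}..{residual.max():.1f} px, {frac:.0%} inside the network range")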
{"title":"Digital surface model generation from high-resolution satellite stereos based on hybrid feature fusion network","authors":"Zhi Zheng, Yi Wan, Yongjun Zhang, Zhonghua Hu, Dong Wei, Yongxiang Yao, Chenming Zhu, Kun Yang, Rang Xiao","doi":"10.1111/phor.12471","DOIUrl":"https://doi.org/10.1111/phor.12471","url":null,"abstract":"Recent studies have demonstrated that deep learning-based stereo matching methods (DLSMs) can far exceed conventional ones on most benchmark datasets by both improving visual performance and decreasing the mismatching rate. However, applying DLSMs on high-resolution satellite stereos with broad image coverage and wide terrain variety is still challenging. First, the broad coverage of satellite stereos brings a wide disparity range, while DLSMs are limited to a narrow disparity range in most cases, resulting in incorrect disparity estimation in areas with contradictory disparity ranges. Second, high-resolution satellite stereos always comprise various terrain types, which is more complicated than carefully prepared datasets. Thus, the performance of DLSMs on satellite stereos is unstable, especially for intractable regions such as texture-less and occluded regions. Third, generating DSMs requires occlusion-aware disparity maps, while traditional occlusion detection methods are not always applicable for DLSMs with continuous disparity. To tackle these problems, this paper proposes a novel DLSM-based DSM generation workflow. The workflow comprises three steps: pre-processing, disparity estimation and post-processing. The pre-processing step introduces low-resolution terrain to shift unmatched disparity ranges into a fixed scope and crops satellite stereos to regular patches. The disparity estimation step proposes a hybrid feature fusion network (HF<sup>2</sup>Net) to improve the matching performance. In detail, HF<sup>2</sup>Net designs a cross-scale feature extractor (CSF) and a multi-scale cost filter. The feature extractor differentiates structural-context features in complex scenes and thus enhances HF<sup>2</sup>Net's robustness to satellite stereos, especially on intractable regions. The cost filter filters out most matching errors to ensure accurate disparity estimation. The post-processing step generates initial DSM patches with estimated disparity maps and then refines them for the final large-scale DSMs. Primary experiments on the public US3D dataset showed better accuracy than state-of-the-art methods, indicating HF<sup>2</sup>Net's superiority. We then created a self-made Gaofen-7 dataset to train HF<sup>2</sup>Net and conducted DSM generation experiments on two Gaofen-7 stereos to further demonstrate the effectiveness and practical capability of the proposed workflow.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"210 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139414818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a novel method for the real-time generation of seamless spherical panoramic videos from an omnidirectional multi-camera system (OMS). First, a multi-view video alignment model called spherical projection constrained thin-plate spline (SP-TPS) is established and estimated using an approximately symmetrical seam-line, maintaining structural consistency around the seam-line. Then, a look-up table is designed to support real-time video re-projection, video dodging and seam-line updates. The table pre-stores, as a whole, the overlapping areas in the OMS multi-view videos, the seam-lines between the spherical panoramas and the OMS multi-view videos and the pixel coordinate mapping between the spherical panoramas and the OMS multi-view videos. Finally, a spherical panoramic video is output in real time through look-up table computation on an ordinary GPU. Experiments were conducted on multi-view video taken by "1 + 4" and "1 + 7" OMS configurations, respectively. The results demonstrate that, compared with four state-of-the-art methods reported in the literature and two commercial video-stitching software packages, the proposed method excels at eliminating visual artefacts and adapts better to scenes with varying depths of field. Provided the OMS does not move within the scene, this method can generate seamless spherical panoramic videos at 8K resolution in real time, which is of great value to the surveillance field.
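The look-up-table idea can be sketched with plain array indexing: once the source camera index and source pixel coordinates have been pre-computed for every panorama pixel, each new set of frames is assembled by a single gather operation. The random maps below are placeholders for the pre-computed SP-TPS mapping, which is where the actual method's work lies.

import numpy as np

def build_panorama(frames, cam_idx, src_y, src_x):
    """Assemble a panorama frame from multi-camera frames using a pre-computed look-up table.

    frames       : (n_cams, H, W, 3) current frames from the OMS.
    cam_idx      : (Hp, Wp) index of the source camera for every panorama pixel.
    src_y, src_x : (Hp, Wp) source pixel coordinates in that camera's image.
    """
    return frames[cam_idx, src_y, src_x]     # one vectorised gather per output frame

n_cams, H, W = 5, 480, 640
Hp, Wp = 512, 1024
rng = np.random.default_rng(1)
frames = rng.integers(0, 255, (n_cams, H, W, 3), dtype=np.uint8)
cam_idx = rng.integers(0, n_cams, (Hp, Wp))
src_y = rng.integers(0, H, (Hp, Wp))
src_x = rng.integers(0, W, (Hp, Wp))
print(build_panorama(frames, cam_idx, src_y, src_x).shape)   # (512, 1024, 3)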
{"title":"Real-time generation of spherical panoramic video using an omnidirectional multi-camera system","authors":"Jiongli Gao, Jun Wu, Mingyi Huang, Gang Xu","doi":"10.1111/phor.12474","DOIUrl":"https://doi.org/10.1111/phor.12474","url":null,"abstract":"This paper presents a novel method for real-time generation of seamless spherical panoramic videos from an omnidirectional multi-camera system (OMS). Firstly, a multi-view video alignment model called spherical projection constrained thin-plate spline (SP-TPS) was established and estimated using an approximately symmetrical seam-line, maintaining the structure inconsistency around the seam-line. Then, a look-up table was designed to support real-time processing of video re-projection, video dodging and seam-line updates. In the table, the overlapping areas in OMS multi-view videos, the seam-lines between spherical panoramas and OMS multi-view videos and the pixel coordinate mapping relationship between spherical panoramas and OMS multi-view videos were pre-stored as a whole. Finally, a spherical panoramic video was outputted in real-time through look-up table computation under an ordinary GPU processor. The experiments were conducted on multi-view video taken by “1 + 4” and “1 + 7” OMS, respectively. Experimental results demonstrate that compared with four state-of-the-art methods reported in the literature and two bits of commercial software for video stitching, the proposed method excels in eliminating visual artefacts and demonstrates superior adaptability to scenes with varying depths of field. Assuming that OMS is not movable in the scene, this method can generate seamless spherical panoramic videos with a resolution of 8 K in real time, which is of great value to the surveillance field.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139414805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The widely used unmanned aerial vehicle oblique photogrammetry often suffers from information loss in complex urban environments, leading to geometric and textural defects in the resulting models. In this study, a close-range panoramic optimal viewpoint selection assisted 3D urban scene reconstruction enhancement method is proposed for areas prone to defects. We first introduce the ground panoramic data acquisition equipment and strategy, which are different from those of the single-lens supplementary photography method. Data acquisition is accomplished through a single and continuous surround-style collection approach. The full coverage of the panoramic video in the space–time dimension enables the acquisition of texture details without considering camera station planning. Then, a panoramic multiview image generation approach is proposed. Adaptive viewpoint selection is achieved using unbiased sampling points from the rough scene model, and viewpoint optimisation is adopted to ensure sufficient image overlap and intersection effects, thus improving the scene reconstructability. Finally, the 3D model is generated by photogrammetric processing of the panoramic multiview images, resulting in an enhanced modelling effect. To validate the proposed method, we conducted experiments using real data from Qingdao, China. Both the qualitative and quantitative results demonstrate a significant improvement in the quality of geometric and textural reconstruction. The tie-point reprojection errors are less than 1 pixel, and the registration accuracy with the model from oblique photogrammetry is comparable to that of optimised-view photography. By eliminating the need for on-site camera station planning or manual flight operations and effectively minimising the redundancy of panoramic videos, our approach significantly reduces the photography and computation costs associated with reconstruction enhancement. Thus, it presents a feasible technical solution for the generation of urban 3D fine models.
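A minimal sketch of viewpoint selection by coverage is shown below: sample points on the rough scene model, mark which candidate panoramic viewpoints see each point, and greedily keep the viewpoints that add the most new coverage. The random visibility matrix stands in for an actual visibility test, and the coverage target is invented; the paper's reconstructability criterion is richer.

import numpy as np

def greedy_viewpoint_selection(visibility, coverage_target=0.95):
    """Pick candidate viewpoints until the fraction of covered sample points reaches the target.

    visibility : (n_viewpoints, n_points) boolean matrix, True if viewpoint i sees point j.
    """
    covered = np.zeros(visibility.shape[1], dtype=bool)
    chosen = []
    while covered.mean() < coverage_target:
        gains = (visibility & ~covered).sum(axis=1)   # new points each viewpoint would add
        best = int(np.argmax(gains))
        if gains[best] == 0:
            break                                     # nothing left to gain
        chosen.append(best)
        covered |= visibility[best]
    return chosen, covered.mean()

rng = np.random.default_rng(2)
vis = rng.random((40, 2000)) < 0.08                   # 40 candidate viewpoints, 2000 sample points
sel, cov = greedy_viewpoint_selection(vis)
print(f"{len(sel)} viewpoints selected, {cov:.0%} of sample points covered")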
{"title":"A 3D urban scene reconstruction enhancement approach based on adaptive viewpoint selection of panoramic videos","authors":"Xujie Zhang, Zhenbiao Hu, Qingwu Hu, Jun Zhao, Mingyao Ai, Pengcheng Zhao, Jiayuan Li, Xiaojie Zhou, Zongqiang Chen","doi":"10.1111/phor.12467","DOIUrl":"https://doi.org/10.1111/phor.12467","url":null,"abstract":"The widely used unmanned aerial vehicle oblique photogrammetry often suffers from information loss in complex urban environments, leading to geometric and textural defects in the resulting models. In this study, a close-range panoramic optimal viewpoint selection assisted 3D urban scene reconstruction enhancement method is proposed for areas prone to defects. We first introduce the ground panoramic data acquisition equipment and strategy, which are different from those of the single-lens supplementary photography method. Data acquisition is accomplished through a single and continuous surround-style collection approach. The full coverage of the panoramic video in the space–time dimension enables the acquisition of texture details without considering camera station planning. Then, a panoramic multiview image generation approach is proposed. Adaptive viewpoint selection is achieved using unbiased sampling points from the rough scene model, and viewpoint optimisation is adopted to ensure sufficient image overlap and intersection effects, thus improving the scene reconstructability. Finally, the 3D model is generated by photogrammetric processing of the panoramic multiview images, resulting in an enhanced modelling effect. To validate the proposed method, we conducted experiments using real data from Qingdao, China. Both the qualitative and quantitative results demonstrate a significant improvement in the quality of geometric and textural reconstruction. The tie-point reprojection errors are less than 1 pixel, and the registration accuracy with the model from oblique photogrammetry is comparable to that of optimised-view photography. By eliminating the need for on-site camera station planning or manual flight operations and effectively minimising the redundancy of panoramic videos, our approach significantly reduces the photography and computation costs associated with reconstruction enhancement. Thus, it presents a feasible technical solution for the generation of urban 3D fine models.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"38 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139414542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hong Hu, Qing Tan, Ruihong Kang, Yanlan Wu, Hui Liu, Baoguo Wang
Unmanned aerial vehicles (UAVs) capture oblique point clouds of outdoor scenes that contain considerable building information. Building features extracted from images are affected by viewpoint, illumination, occlusion, noise and image conditions, which makes them difficult to extract. Ground elevation changes can provide a powerful aid to extraction, and point cloud data precisely reflect this information, so oblique photogrammetry point clouds have significant research value. Traditional building extraction methods filter and sort the raw data to separate buildings, which causes the point clouds to lose spatial information and reduces building extraction accuracy. We therefore develop an intelligent building extraction method based on deep learning that incorporates an attention mechanism module into the sampling and PointNet operations within the set abstraction layer of the PointNet++ network. To assess the efficacy of our approach, we train on and extract buildings from a dataset created using UAV oblique point clouds from five regions of the city of Bengbu, China. Impressive performance metrics are achieved, including 95.7% intersection over union, 96.5% accuracy, 96.5% precision, 98.7% recall and a 97.8% F1 score. With the addition of the attention mechanism, the overall training accuracy of the model improves by about 3%. This method shows potential for advancing the accuracy and efficiency of digital urbanization construction projects.
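The reported metrics can be reproduced from a binary confusion matrix as follows. This is standard metric arithmetic, not the paper's evaluation code, and the masks in the example are made up.

import numpy as np

def binary_segmentation_metrics(pred, truth):
    """IoU, accuracy, precision, recall and F1 for a binary building / non-building labelling."""
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    tn = np.sum(~pred & ~truth)
    iou = tp / (tp + fp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return dict(iou=iou, accuracy=accuracy, precision=precision, recall=recall, f1=f1)

rng = np.random.default_rng(3)
truth = rng.random(100000) < 0.3                  # made-up ground-truth building mask
pred = truth ^ (rng.random(100000) < 0.02)        # predictions with a small error rate
print({k: round(v, 3) for k, v in binary_segmentation_metrics(pred, truth).items()})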
{"title":"Building extraction from oblique photogrammetry point clouds based on PointNet++ with attention mechanism","authors":"Hong Hu, Qing Tan, Ruihong Kang, Yanlan Wu, Hui Liu, Baoguo Wang","doi":"10.1111/phor.12476","DOIUrl":"https://doi.org/10.1111/phor.12476","url":null,"abstract":"Unmanned aircraft vehicles (UAVs) capture oblique point clouds in outdoor scenes that contain considerable building information. Building features extracted from images are affected by the viewing point, illumination, occlusion, noise and image conditions, which make building features difficult to extract. Currently, ground elevation changes can provide powerful aids for the extraction, and point cloud data can precisely reflect this information. Thus, oblique photogrammetry point clouds have significant research implications. Traditional building extraction methods involve the filtering and sorting of raw data to separate buildings, which cause the point clouds to lose spatial information and reduce the building extraction accuracy. Therefore, we develop an intelligent building extraction method based on deep learning that incorporates an attention mechanism module into the Samling and PointNet operations within the set abstraction layer of the PointNet++ network. To assess the efficacy of our approach, we train and extract buildings from a dataset created using UAV oblique point clouds from five regions in the city of Bengbu, China. Impressive performance metrics are achieved, including 95.7% intersection over union, 96.5% accuracy, 96.5% precision, 98.7% recall and 97.8% F1 score. And with the addition of attention mechanism, the overall training accuracy of the model is improved by about 3%. This method showcases potential for advancing the accuracy and efficiency of digital urbanization construction projects.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139372840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Compared with optical remote sensing satellites, the geometric positioning accuracy of synthetic aperture radar (SAR) satellites is not affected by satellite attitude or weather conditions. SAR satellites can achieve relatively high positioning accuracy without ground control points, which is particularly important in global surveying and mapping. However, the stereo positioning accuracy of SAR satellites is mainly affected by the SAR systematic delay and the atmospheric propagation delay of the radar signals. An iterative compensation method for the SAR systematic time delay, based on a digital elevation model, is proposed to improve the stereo positioning accuracy of SAR satellites without control points. In addition, to address the non-real-time updates of external reference atmospheric parameters, an iterative compensation method for estimating the atmospheric propagation delay of the radar signals is proposed based on standard atmospheric models. In this study, SAR images from the Gaofen-3 (GF-3) satellite with 5 m resolution were used as experimental data to verify the effectiveness of the proposed method. The 2D positioning accuracy was better than 3 m, an improvement of 42.9%, and the elevation positioning accuracy was better than 3 m, an improvement of 90.2%.
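As a rough illustration of atmospheric path-delay compensation, the sketch below converts a zenith delay from a simple exponential, standard-atmosphere-style model into a slant-range correction using the incidence angle. The surface zenith delay value, the scale height and the cosine mapping are coarse assumptions for illustration only, not the models used in the paper.

import numpy as np

def atmospheric_range_correction(ground_height_m, incidence_deg,
                                 zenith_delay_sea_level=2.4, scale_height=8000.0):
    """Approximate one-way tropospheric path delay (m) along the radar line of sight.

    Assumes the zenith delay decays exponentially with ground height and maps to the
    slant direction with 1 / cos(incidence angle).
    """
    zenith_delay = zenith_delay_sea_level * np.exp(-ground_height_m / scale_height)
    return zenith_delay / np.cos(np.radians(incidence_deg))

# Example: a target at 1200 m elevation observed at a 35 degree incidence angle.
print(f"{atmospheric_range_correction(1200.0, 35.0):.2f} m slant-range correction")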
{"title":"Improvement of the spaceborne synthetic aperture radar stereo positioning accuracy without ground control points","authors":"Yu Wei, Ruishan Zhao, Qiang Fan, Jiguang Dai, Bing Zhang","doi":"10.1111/phor.12475","DOIUrl":"https://doi.org/10.1111/phor.12475","url":null,"abstract":"Compared with optical remote sensing satellites, the geometric positioning accuracy of synthetic aperture radar (SAR) satellite is not affected by satellite attitude or weather conditions. SAR satellites can achieve relatively high positioning accuracy without ground control points, which is particularly important in global surveying and mapping. However, the stereo positioning accuracy of SAR satellites is mainly affected by the SAR systematic delay and the atmospheric propagation delay of radar signals. An iterative compensation method for the SAR systematic time delay is proposed based on digital elevation model to improve the stereo positioning accuracy of SAR satellites without control points. In addition, to address the non-real-time updates of external reference atmospheric param, an iterative compensation method to estimate the atmospheric propagation delay of radar signals is proposed based on standard atmospheric models. In this study, SAR images from the Gaofen-3 (GF-3) satellite with 5 m resolutions were used as experimental data to verify the effectiveness of our proposed method. Simultaneously, the 2D positioning accuracy was better than 3 m and increased by 42.9%, and the elevation positioning accuracy was better than 3 m and increased by 90.2%.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139372872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}