Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116895
Pix2Pix-based Stain-to-Stain Translation: A Solution for Robust Stain Normalization in Histopathology Images Analysis
Pegah Salehi, A. Chalechale
The diagnosis of cancer is performed mainly through visual analysis by pathologists, who examine the morphology of the tissue slices and the spatial arrangement of the cells. If the microscopic image of a specimen is not stained, it appears colorless and textured; chemical staining is therefore required to create contrast and help identify specific tissue components. Owing to differences in chemicals, scanners, cutting thicknesses, and laboratory protocols during tissue preparation, similar tissues usually vary significantly in appearance. This diversity in staining, in addition to interpretive disparity among pathologists, is one of the main challenges in designing robust and flexible systems for automated analysis. To address the staining color variations, several stain normalization methods have been proposed. Our proposed method uses a Stain-to-Stain Translation (STST) approach for stain normalization of Hematoxylin and Eosin (H&E) stained histopathology images, which not only learns the specific color distribution but also preserves the corresponding histopathological patterns. We perform the translation with the "pix2pix" framework, which is based on conditional generative adversarial networks (cGANs). Our approach showed excellent results, both mathematically and experimentally, against state-of-the-art methods. We have made the source code publicly available.
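To illustrate the underlying mechanics, here is a minimal PyTorch sketch of one pix2pix-style training step on paired stain patches. The toy generator and discriminator, the dummy tensors, and the L1 weight of 100 (the value from the original pix2pix paper) are illustrative stand-ins, not the authors' architecture.

```python
# One conditional-GAN training step, assuming paired (source, target) stains.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(64, 3, 3, padding=1))            # toy generator
D = nn.Sequential(nn.Conv2d(6, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                  nn.Conv2d(64, 1, 4, stride=2, padding=1))  # toy patch critic

bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))

src = torch.rand(4, 3, 256, 256)   # source-stain patches (dummy data)
tgt = torch.rand(4, 3, 256, 256)   # reference-stain patches (dummy data)

fake = G(src)
# The discriminator scores (input, output) pairs, as in conditional GANs.
d_real = D(torch.cat([src, tgt], dim=1))
d_fake = D(torch.cat([src, fake.detach()], dim=1))
loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

d_fake = D(torch.cat([src, fake], dim=1))
# Adversarial term plus an L1 term that ties the output to the target stain
# while preserving tissue structure.
loss_g = bce(d_fake, torch.ones_like(d_fake)) + 100 * l1(fake, tgt)
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```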
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116877
Brain MR Image Classification for ADHD Diagnosis Using Deep Neural Networks
Sahar Abdolmaleki, M. S. Abadeh
Attention-deficit/hyperactivity disorder (ADHD) is one of the most prevalent neurodevelopmental disorders in childhood and adolescence. ADHD diagnosis currently includes psychological tests and depends on ratings of behavioral symptoms, which can be unreliable. Thus, an objective diagnostic tool based on non-invasive imaging could improve the understanding and diagnosis of ADHD. The purpose of this study is to classify brain images with artificial intelligence methods, in the form of a clinical decision support system for the diagnosis of ADHD. Following a medical image classification pipeline, image pre-processing is performed first. Then, a deep multi-modal 3D CNN is trained on gray matter (GM) maps from structural MRI and fractional amplitude of low-frequency fluctuations (fALFF) maps from functional MRI, using the ADHD-200 training dataset. Finally, to classify the extracted features, early and late fusion schemes are employed, and the output scores are classified with the SVM, KNN, and LDA algorithms. Evaluation of the proposed approach on the ADHD-200 testing dataset revealed that adding personal characteristics alone increased the classification accuracy by 3.79%, while combining early fusion, late fusion, and personal characteristics together improved it by 5.84%. Among the three classifiers, LDA showed the best results, achieving a classification accuracy of 74.93%. The comparison of results shows that combining early and late fusion, together with personal characteristics, has a significant effect on classification accuracy and thus increases the reliability of this medical decision support system.
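A hedged sketch of the late-fusion step follows: per-modality scores are concatenated with personal characteristics and classified with LDA. The feature names, shapes, and random data are illustrative assumptions, not taken from the paper.

```python
# Late fusion by score concatenation, then LDA classification.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
gm_scores    = rng.random((200, 2))  # softmax scores from the GM (sMRI) branch
falff_scores = rng.random((200, 2))  # softmax scores from the fALFF (fMRI) branch
phenotype    = rng.random((200, 3))  # e.g. age, gender, handedness (illustrative)
y = rng.integers(0, 2, 200)          # 0 = control, 1 = ADHD (dummy labels)

X = np.hstack([gm_scores, falff_scores, phenotype])  # late fusion
clf = LinearDiscriminantAnalysis().fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```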
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116907
High-Resolution Document Image Reconstruction from Video
Hossein Motamednia, Mohammad Minouei, Pooryaa Cheraaqee, M. Soheili
Today, smartphones with high-quality built-in cameras are very common, and people often prefer to photograph documents with a smartphone instead of scanning them. Because of the limited input size of scanners, it is difficult to scan everything with them, yet the resolution and quality of smartphone cameras are not sufficient to capture large documents, such as posters, in a single picture. In this paper, we propose a pipeline that reconstructs a high-resolution image of a document from a video of it. We assume that, during recording, the camera is moved slowly over the entire surface of the document from a close distance. The proposed method locates each frame within the document and uses a sharpness criterion to select, for each region of the document, the highest-quality content among all available frames. We evaluated our method on the SmartDoc Video dataset and report promising results.
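A minimal sketch of the per-region sharpness selection, assuming frames have already been registered to document coordinates. Variance of the Laplacian is one common sharpness criterion; the paper's exact measure may differ.

```python
import cv2
import numpy as np

def sharpness(patch: np.ndarray) -> float:
    # Variance of the Laplacian: higher means more in-focus detail.
    gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def best_patch(candidates: list[np.ndarray]) -> np.ndarray:
    # Among all registered frames covering a document region, keep the sharpest.
    return max(candidates, key=sharpness)
```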
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116914
Fully Convolutional Networks for Fluid Segmentation in Retina Images
Behnam Azimi, A. Rashno, S. Fadaei
Retinal diseases manifest in optical coherence tomography (OCT) images, since many signs of retinal abnormality are visible in OCT. Fluid regions can reveal signs of age-related macular degeneration (AMD) and diabetic macular edema (DME), and automatic segmentation of these regions can help ophthalmologists in diagnosis and treatment. This work presents a fully automated method for fluid segmentation based on graph shortest-path layer segmentation and fully convolutional networks (FCNs). The proposed method has been evaluated on a dataset containing 600 OCT scans of 24 subjects. Results showed that the proposed FCN model outperforms three existing fluid segmentation methods, with improvements of 4.44% in Dice coefficient and 6.28% in sensitivity.
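For reference, the Dice coefficient used to score the segmentations can be computed as in this small sketch; the masks are assumed to be binary NumPy arrays of the same shape.

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    # Dice = 2|A ∩ B| / (|A| + |B|); eps guards against empty masks.
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + eps)
```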
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116885
Ensemble P-spectral Semi-supervised Clustering
S. Safari, F. Afsari
This paper proposes an ensemble p-spectral semi-supervised clustering algorithm for very high-dimensional data sets. Traditional clustering and semi-supervised clustering approaches have several shortcomings: they do not use the prior knowledge of experts and researchers, they handle high-dimensional data poorly, and they exploit too few constraint pairs. To overcome these shortcomings, we first apply the transitive closure operator to the pairwise constraints. The whole feature space is then divided into several subspaces to obtain an ensemble semi-supervised p-spectral clustering of the whole data, and we search for the best subspace using three operators. Experiments show that the proposed ensemble p-spectral clustering method outperforms existing semi-supervised clustering methods on several high-dimensional data sets.
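A sketch of the constraint pre-processing step: taking the transitive closure of must-link pairs with a union-find structure, so that (a, b) and (b, c) also imply (a, c). The variable names and example pairs are illustrative.

```python
parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

must_link = [(0, 1), (1, 2), (5, 6)]
for a, b in must_link:
    union(a, b)

# After closure, 0 and 2 share a group although (0, 2) was never given.
assert find(0) == find(2)
```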
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116878
Random Forest with Attribute Profile for Remote Sensing Image Classification
M. Imani
Although hyperspectral images contain rich spectral information, thanks to the high number of spectral bands acquired over a wide and continuous range of wavelengths, there are also valuable spatial features in adjacent regions, i.e., neighboring pixels. Three spectral-spatial fusion frameworks are introduced in this work. The extended multi-attribute profile (EMAP) is used for spatial feature extraction, and its performance is assessed when fed to the random forest classifier. The use of EMAP alone, as well as the fusion of EMAP with spectral features, is investigated in both the full-band and reduced-dimensionality cases. Advanced binary ant colony optimization is used to implement the feature reduction. The three fusion frameworks integrate EMAP with the spectral bands, and the classification results are compared against the use of EMAP alone. Experimental results on three popular hyperspectral images show the superior performance of EMAP features fed to the random forest classifier.
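The following hedged sketch shows the fusion-and-classify stage: spatial profiles stacked with spectral bands and fed to a random forest. Grey-scale openings and closings over increasing filter sizes stand in for the attribute filters of a true EMAP, and the cube, labels, and sizes are dummy data.

```python
import numpy as np
from scipy.ndimage import grey_opening, grey_closing
from sklearn.ensemble import RandomForestClassifier

cube = np.random.rand(64, 64, 8)          # dummy hyperspectral cube (H, W, bands)
base = cube.mean(axis=2)                  # one base image, e.g. a PCA component

profile = [base]
for size in (3, 5, 7):                    # profile over increasing filter sizes
    profile.append(grey_opening(base, size=size))
    profile.append(grey_closing(base, size=size))
spatial = np.stack(profile, axis=2)

# Spectral-spatial fusion by per-pixel concatenation of both feature sets.
X = np.concatenate([cube, spatial], axis=2).reshape(-1, cube.shape[2] + spatial.shape[2])
y = np.random.randint(0, 4, X.shape[0])   # dummy per-pixel class labels
rf = RandomForestClassifier(n_estimators=100).fit(X, y)
```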
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116900
EEG-based Motor Imagery Classification through Transfer Learning of the CNN
Saman Taheri, M. Ezoji
A brain-computer interface (BCI) is a system that translates EEG signals into commands a computer can interpret. EEG-based motor imagery (MI) signals are among the most widely used signals in this field. In this paper, an efficient algorithm is introduced for classifying two-class MI signals with a convolutional neural network (CNN) through transfer learning. To this end, different 3D representations of the EEG signals are fed into the CNN. These representations are built by combining frequency and time-frequency techniques such as the Fourier transform, common spatial patterns (CSP), the discrete cosine transform (DCT), and empirical mode decomposition (EMD). The CNN is then trained to classify the MI-EEG signals. The average classification accuracy over five subjects reached 98.5% on dataset IVa of BCI Competition III.
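As a hedged illustration, the sketch below stacks |FFT| and DCT views of one EEG trial into a three-channel "image" and fine-tunes a pretrained CNN on it. The choice of views, the VGG16 backbone, and all shapes are assumptions for illustration; the paper combines FFT, CSP, DCT, and EMD.

```python
import numpy as np
import torch
import torch.nn.functional as F
import torchvision.models as models
from scipy.fft import fft, dct

trial = np.random.randn(118, 350)                # channels x samples (dummy trial)
views = np.stack([
    np.abs(fft(trial, axis=1)),                  # magnitude spectrum view
    dct(trial, axis=1),                          # DCT view
    trial,                                       # raw time-domain view
])                                               # -> (3, channels, samples)
x = torch.tensor(views, dtype=torch.float32).unsqueeze(0)

net = models.vgg16(weights="IMAGENET1K_V1")      # ImageNet-pretrained backbone
for p in net.features.parameters():              # freeze convolutional layers
    p.requires_grad = False
net.classifier[-1] = torch.nn.Linear(4096, 2)    # new two-class MI head
logits = net(F.interpolate(x, size=(224, 224)))  # resize views to CNN input size
```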
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116909
Offline Handwritten Signature Verification Based on Circlet Transform and Statistical Features
A. Foroozandeh, A. A. Hemmat, H. Rabbani
Handwritten signatures are widely used to register ownership in banking systems and in administrative and financial applications all over the world. With the advancement of technology, the increasing volume of financial transactions, and the possibility of signature fraud, it is necessary to develop more accurate, convenient, and cost-effective signature-based authentication systems. In this paper, a signature verification method based on the circlet transform and the statistical properties of the circlet coefficients is presented. Experiments have been conducted on three benchmark datasets: GPDS Synthetic and MCYT-75 as two Latin signature datasets, and UTSig as a Persian signature dataset. The experimental results, compared with the literature, confirm the effectiveness of the presented method.
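The circlet transform has no common off-the-shelf implementation, so this sketch substitutes 2D wavelet subbands to illustrate the overall recipe only: transform the signature image, take statistics of the coefficients, and feed them to a classifier. All data, labels, and the SVM choice are illustrative assumptions.

```python
import numpy as np
import pywt
from sklearn.svm import SVC

def subband_stats(img: np.ndarray) -> np.ndarray:
    # One-level 2D DWT stands in for the circlet transform here.
    cA, (cH, cV, cD) = pywt.dwt2(img, "db2")
    feats = []
    for band in (cA, cH, cV, cD):
        feats += [band.mean(), band.std(), np.abs(band).sum()]  # mean, std, energy
    return np.array(feats)

X = np.stack([subband_stats(np.random.rand(128, 256)) for _ in range(40)])
y = np.repeat([0, 1], 20)                  # 0 = genuine, 1 = forged (dummy labels)
clf = SVC(kernel="rbf").fit(X, y)
```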
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116882
Image Colorization Using Generative Adversarial Networks and Transfer Learning
Leila Kiani, Masoudnia Saeed, H. Nezamabadi-pour
Automatic colorization is one of the most interesting problems in computer graphics. During the colorization process, single-channel grayscale images are converted to three-channel images with color components. Convolutional neural networks (CNNs) are a typical technique that has been well studied and applied to automatic colorization. In these networks, information that is generalized away in the top layers is still available in the intermediate layers. Although many applications use only the output of the last CNN layer, in this paper we use the concept of a "hypercolumn", borrowed from neuroscience, to exploit information at all levels and develop a fully automated image colorization system. Millions of training samples are not always available in the real world for complex deep learning models; therefore, a VGG19 model pre-trained on the large ImageNet dataset is used in the generator network, and the hypercolumn idea is implemented with the DIV2K dataset. We train our model to predict the color of each pixel. The results obtained indicate that the proposed method is superior to competing models.
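A hedged sketch of hypercolumn extraction follows: activations from several VGG19 layers are upsampled to a common size and concatenated per pixel. The tapped layer indices are illustrative choices, not necessarily the ones used in the paper.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

vgg = models.vgg19(weights="IMAGENET1K_V1").features.eval()
taps = {3, 8, 17, 26}                      # relu1_2, relu2_2, relu3_4, relu4_4

def hypercolumn(x: torch.Tensor) -> torch.Tensor:
    h, w = x.shape[2:]
    cols = []
    with torch.no_grad():
        for i, layer in enumerate(vgg):
            x = layer(x)
            if i in taps:                  # upsample tapped activations to input size
                cols.append(F.interpolate(x, size=(h, w), mode="bilinear",
                                          align_corners=False))
    return torch.cat(cols, dim=1)          # per-pixel stack across all tapped levels

gray = torch.rand(1, 3, 224, 224)          # grayscale replicated to 3 channels
features = hypercolumn(gray)
```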
Pub Date: 2020-02-01 | DOI: 10.1109/MVIP49855.2020.9116894
Deep Learning based Classification of Color Point Cloud for 3D Reconstruction of Interior Elements of Buildings
Shima Sahebdivani, H. Arefi, M. Maboudi
In architecture and engineering, producing 3D models of objects that are both simple and as close to reality as possible is of particular importance. In this article, we model the interior elements of a building in three general steps. In the first step, the point clouds of a room are semantically segmented using the PointNet deep learning network. Each object class is then reconstructed using three methods: Poisson surface reconstruction, ball pivoting, and a combination of volumetric triangulation and marching cubes. In the last step, each model is simplified by vertex clustering and by edge collapse with quadric error. Results are evaluated quantitatively and qualitatively for two types of objects, one with simple geometry and one with complex geometry. After selecting the optimal surface reconstruction and simplification methods, all the objects are modeled. According to the results, Poisson surface reconstruction simplified by edge collapse provides the better geometric accuracy of 0.1 mm for classes with simpler geometry, while for more complex geometry the model produced by combined volumetric triangulation and marching cubes, simplified by edge collapse, is more suitable, with a higher accuracy of 0.022 mm.
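A hedged sketch of the reconstruct-then-simplify stage with Open3D, applied to one segmented object class. The file name, Poisson depth, and decimation ratio are illustrative parameters, not the paper's settings.

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("chair_segment.ply")   # hypothetical segmented class
pcd.estimate_normals()                               # Poisson requires normals

# Poisson surface reconstruction, then quadric-error edge-collapse simplification.
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)
simplified = mesh.simplify_quadric_decimation(
    target_number_of_triangles=len(mesh.triangles) // 10)
o3d.io.write_triangle_mesh("chair_simplified.ply", simplified)
```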