Pub Date : 2023-06-01 DOI: 10.1109/IS3C57901.2023.00055
Pratamagusta Parawita Muhammad Dharmawan, Chuan-Wang Chang
Generative adversarial networks (GANs) are a generative modeling approach capable of learning highly complex data distributions. In particular, they do not require direct pairing between the data in the input and output domains, which makes them well suited to image translation tasks. Translating photos into anime-style images with GANs is a fast and efficient way to generate art for the creative industries. Recently, several algorithms, including U-GAT-IT, CycleGAN, AnimeGAN, and CartoonGAN, have emerged to accomplish this task. This paper compares the performance of these algorithms on photo-to-anime image-to-image translation and discusses their results.
{"title":"Exploring Generative Adversarial Networks for Photo-to-Anime Image-to-Image Translation: A Comparative Study","authors":"Pratamagusta Parawita Muhammad Dharmawan, Chuan-Wang Chang","doi":"10.1109/IS3C57901.2023.00055","DOIUrl":"https://doi.org/10.1109/IS3C57901.2023.00055","journal":{"name":"2023 Sixth International Symposium on Computer, Consumer and Control (IS3C)"},"publicationDate":"2023-06-01"}
We propose the first polarization-independent mode splitter and converter based on subwavelength grating metamaterials, which realizes 3-dB splitting and mode-order conversion simultaneously. Simulated insertion losses below 0.75 dB and crosstalk below −14.5 dB are achieved over a 70-nm bandwidth for both TE and TM modes.
{"title":"On-chip polarization-independent nanophotonics mode splitter and converter enabled by subwavelength metamaterials","authors":"Zhenzhao Guo, Ya-Chun Yu, Shengbao Wu, Jinbiao Xiao","doi":"10.1109/is3c57901.2023.00066","DOIUrl":"https://doi.org/10.1109/is3c57901.2023.00066","journal":{"name":"2023 Sixth International Symposium on Computer, Consumer and Control (IS3C)"},"publicationDate":"2023-06-01"}
Pub Date : 2023-06-01 DOI: 10.1109/IS3C57901.2023.00098
Jia-Chun Sheng, Yiping Liao, Chun-Rong Huang
Instance segmentation can be applied to the discrimination and diagnosis of cancer cells in pathology images. Accurately segmenting each pathological cell in these images can improve the efficiency of clinical diagnosis. In this paper, we evaluate the state-of-the-art transformer-based instance segmentation method, the masked-attention mask transformer (Mask2Former) [1], on pathology datasets. Starting from a Mask2Former model pretrained on a natural-image instance segmentation dataset, we show that Mask2Former can adapt to small pathology datasets and achieve instance segmentation performance comparable to, or even better than, state-of-the-art task-specific pathology image instance segmentation methods.
{"title":"Apply Masked-attention Mask Transformer to Instance Segmentation in Pathology Images","authors":"Jia-Chun Sheng, Yiping Liao, Chun-Rong Huang","doi":"10.1109/IS3C57901.2023.00098","DOIUrl":"https://doi.org/10.1109/IS3C57901.2023.00098","journal":{"name":"2023 Sixth International Symposium on Computer, Consumer and Control (IS3C)"},"publicationDate":"2023-06-01"}
Pub Date : 2023-06-01 DOI: 10.1109/IS3C57901.2023.00021
Kei Masaoka, Irawati Nurmala Sari, Weiwei Du
Depth estimation using vanishing points is important in computer vision and is widely used in applications such as robotics and autonomous driving. A vanishing point is the point in the image at which lines that are parallel in 3D space appear to converge. Detecting vanishing points therefore plays a crucial role in estimating the depth of a scene. However, line detectors can produce noisy or non-converging line segments, which reduces the accuracy of vanishing point detection, so a method is needed that extracts accurate vanishing points from noisy line segments. This paper proposes an algorithm that detects vanishing points by projecting line segments onto a Gaussian sphere. The proposed method has three steps: (1) detecting line segments with Mobile LSD [1], (2) classifying the line segments by their angle, and (3) projecting the classified line segments onto a Gaussian sphere and converting them into a 2D image. In this transformed 2D image, the vanishing point is assumed to correspond to the region in which the line segments overlap the most. In other words, the method does not need to compute all intersection points of the line segments and then identify the vanishing points among them. Experiments show that the proposed method both improves accuracy and reduces the time required for vanishing point detection.
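The projection and voting idea in step (3) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the line segments are already detected (Mobile LSD and the angle classification of step (2) are omitted), it assumes a pinhole camera with a known focal length f and principal point (cx, cy), and the function names, grid resolution, and tolerance are all hypothetical. Each segment, together with the camera centre, defines a plane that cuts the unit sphere in a great circle; the sphere cell crossed by the most great circles is taken as the vanishing direction.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def plane_normal(segment, f, cx, cy):
    """Unit normal of the plane through the camera centre and an image segment."""
    (x1, y1), (x2, y2) = segment
    return unit(cross((x1 - cx, y1 - cy, f), (x2 - cx, y2 - cy, f)))

def cell_direction(i, j, n_el, n_az):
    """Unit direction at the centre of cell (i, j) on the hemisphere z > 0."""
    el = (i + 0.5) * (math.pi / 2) / n_el
    az = (j + 0.5) * (2 * math.pi) / n_az
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))

def accumulate(segments, f, cx, cy, n_el=90, n_az=180, tol=0.01):
    """2D vote image: acc[i][j] counts great circles passing near cell (i, j)."""
    normals = [plane_normal(s, f, cx, cy) for s in segments]
    acc = [[0] * n_az for _ in range(n_el)]
    for i in range(n_el):
        for j in range(n_az):
            d = cell_direction(i, j, n_el, n_az)
            # A direction lies on a segment's great circle when it is
            # (nearly) perpendicular to that segment's plane normal.
            acc[i][j] = sum(
                abs(n[0] * d[0] + n[1] * d[1] + n[2] * d[2]) < tol for n in normals
            )
    return acc

def vanishing_point(segments, f, cx, cy, n_el=90, n_az=180, tol=0.01):
    """Image coordinates of the most-voted sphere cell (finite vanishing points only)."""
    acc = accumulate(segments, f, cx, cy, n_el, n_az, tol)
    cells = ((i, j) for i in range(n_el) for j in range(n_az))
    i, j = max(cells, key=lambda ij: acc[ij[0]][ij[1]])
    dx, dy, dz = cell_direction(i, j, n_el, n_az)
    return (cx + f * dx / dz, cy + f * dy / dz)
```

Because the votes are collected on a fixed grid, no pairwise intersections between segments are computed, which matches the motivation stated in the abstract. The accuracy of the estimate is bounded by the cell size and the tolerance chosen here.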
{"title":"Vanishing Points Detection with Line Segments of Gaussian Sphere","authors":"Kei Masaoka, Irawati Nurmala Sari, Weiwei Du","doi":"10.1109/IS3C57901.2023.00021","DOIUrl":"https://doi.org/10.1109/IS3C57901.2023.00021","journal":{"name":"2023 Sixth International Symposium on Computer, Consumer and Control (IS3C)"},"publicationDate":"2023-06-01"}
Pub Date : 2023-06-01 DOI: 10.1109/IS3C57901.2023.00054
Rahmi Liza, Chen-Kun Tsung
Multi-Dimensional Voice Program (MDVP) parameters are widely used by physicians and clinicians to detect vocal pathologies and analyze diseases of the vocal cords. In this paper, voice pathologies are detected automatically from MDVP parameters. Because MDVP is commercial software, this work reimplements MDVP parameter extraction in Python so that the parameters can be used in experiments, in the automatic detection of voice pathologies, and in automatic voice classification. The study evaluates the MDVP parameters and applies an XGBoost model to classify diseases. It considers three vocal cord disorders that are common in clinical practice (polyps, nodules, and Reinke's edema), using samples from the Saarbruecken Voice Database (SVD) for training and testing. Test results show that the extracted MDVP parameters identify healthy voices well and discriminate accurately between healthy and pathological voices. The best overall accuracy is 98% with the XGBoost classifier.
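To make "MDVP parameters" concrete, the sketch below computes two perturbation measures that are commonly listed among them: jitter percent and shimmer percent. It assumes the pitch periods and peak amplitudes of consecutive glottal cycles have already been extracted from the recording, which is the harder part and is not shown. The function names are hypothetical, and the abstract does not state which parameters the authors extract or how.

```python
def _relative_perturbation(values):
    """Mean absolute difference of consecutive values, as a percentage of their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value

def jitter_percent(periods_s):
    """Cycle-to-cycle variability of the pitch period (seconds), in percent."""
    return _relative_perturbation(periods_s)

def shimmer_percent(amplitudes):
    """Cycle-to-cycle variability of the peak amplitude, in percent."""
    return _relative_perturbation(amplitudes)

# Four cycles alternating between 10 ms and 11 ms periods.
print(jitter_percent([0.010, 0.011, 0.010, 0.011]))
```

Values like these, computed per recording, would form the feature vector passed to a classifier such as XGBoost.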
{"title":"Analyzing Voice Quality with Multi-Dimensional Voice Program for Disease Determination","authors":"Rahmi Liza, Chen-Kun Tsung","doi":"10.1109/IS3C57901.2023.00054","DOIUrl":"https://doi.org/10.1109/IS3C57901.2023.00054","journal":{"name":"2023 Sixth International Symposium on Computer, Consumer and Control (IS3C)"},"publicationDate":"2023-06-01"}
Pub Date : 2023-06-01 DOI: 10.1109/IS3C57901.2023.00043
Y. Cheng, Yen-Ting Chiu
In mechanical machining, measuring cutting force has become increasingly important. A tool wears during machining, which can cause damage unless the tool is replaced in time. However, changing tools too often prolongs the overall processing time and wastes money, while changing them too late degrades machining quality and reduces cost-effectiveness. Currently, most machining factories still rely on experienced engineers to judge whether a tool needs replacing by watching and listening to the machining process. This takes considerable manpower and time and can also lead to misjudgments. To address the inability to measure data in real time during machining, this paper proposes a smart tool holder design with a sensing system installed inside the holder. The system uses a three-axis accelerometer to measure and analyze data. In addition, the design uses a wireless power transmission system to power the entire system and transmits data wirelessly to a host for signal processing and analysis. By displaying the data in a window interface, the system achieves real-time monitoring and is expected to offer advantages such as fault diagnosis, improved accuracy, and reduced labor costs.
{"title":"Intelligent Tool Holder Design: Effective Management of Tool Wear Through Real-Time Monitoring During Machining Processes","authors":"Y. Cheng, Yen-Ting Chiu","doi":"10.1109/IS3C57901.2023.00043","DOIUrl":"https://doi.org/10.1109/IS3C57901.2023.00043","journal":{"name":"2023 Sixth International Symposium on Computer, Consumer and Control (IS3C)"},"publicationDate":"2023-06-01"}
Pub Date : 2023-06-01 DOI: 10.1109/IS3C57901.2023.00031
K. Tan, Meng-Yang Li, Chih-Chan Hu
This study develops a droop-controlled microgrid with a distribution static compensator (DSTATCOM) to improve power quality. Because of the reactive power/voltage (Q-V) droop characteristic and the presence of unbalanced, linear inductive, and nonlinear loads, power quality problems, including voltage drop, unbalanced currents, lagging power factor (PF), and current harmonics, are severe in the islanded microgrid. Moreover, the instantaneous power flowing into or out of the DC-link capacitor of the DSTATCOM under load variation seriously degrades the DSTATCOM's ability to improve power quality. Hence, to improve both the power quality of the droop-controlled microgrid and the transient response of the DSTATCOM's DC-link voltage under load variation, an online-trained polynomial Petri fuzzy neural network (PPFNN) controller is proposed for the first time as the DC-link voltage controller, replacing the conventional proportional-integral (PI) controller in the DSTATCOM. The network structure and the online learning strategy of the proposed PPFNN are derived in detail. Finally, the study verifies the effectiveness of the DSTATCOM with the proposed PPFNN controller in balancing the currents, reducing the total harmonic distortion (THD) of the current, and compensating reactive power for voltage support and PF correction in the droop-controlled microgrid.
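The voltage drop attributed above to the Q-V droop characteristic follows from the usual linear droop law, in which the voltage reference falls as the source supplies more reactive power. The sketch below illustrates only that relationship; the nominal voltage, droop gain, and load values are hypothetical numbers chosen for illustration, and the PPFNN controller itself is not modelled.

```python
def droop_voltage(q_var, v_nominal=220.0, q_set=0.0, k_q=0.002):
    """Linear Q-V droop: V = V_nominal - k_q * (Q - Q_set).

    q_var     reactive power supplied by the droop-controlled source (var)
    k_q       droop gain (V per var), a hypothetical value
    """
    return v_nominal - k_q * (q_var - q_set)

load_q = 5000.0        # reactive demand of an inductive load (var), hypothetical
compensated = 4000.0   # part of that demand supplied by a compensator (var)

# Without compensation the source supplies all reactive power and the voltage sags.
print(droop_voltage(load_q))
# With compensation the source supplies less, so the droop law restores the voltage.
print(droop_voltage(load_q - compensated))
```

This is why reactive power compensation by the DSTATCOM provides voltage support in a droop-controlled microgrid: it moves the source's operating point back along the droop line.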
{"title":"Microgrid Using Intelligent Controlled DSTATCOM for Power Quality Enhancement","authors":"K. Tan, Meng-Yang Li, Chih-Chan Hu","doi":"10.1109/IS3C57901.2023.00031","DOIUrl":"https://doi.org/10.1109/IS3C57901.2023.00031","journal":{"name":"2023 Sixth International Symposium on Computer, Consumer and Control (IS3C)"},"publicationDate":"2023-06-01"}
Pub Date : 2023-06-01 DOI: 10.1109/IS3C57901.2023.00069
Yufei Chen, Shengbao Wu, Jinbiao Xiao
On-chip polarization beam splitters (PBSs) are necessary elements in polarization diversity circuits and coherent optical communication systems, where they manage the polarization of silicon nanophotonic devices. However, achieving low insertion loss (IL), a compact footprint, broad bandwidth, and a high extinction ratio (ER) in a PBS at the same time remains challenging. Here we propose and experimentally demonstrate a compact, high-ER PBS based on a directional coupler (DC) consisting of two metamaterial waveguides, which introduces the asymmetry of the DC-based PBS in a new way. For TM polarization, the proposed PBS behaves as a DC, while for TE polarization, the device acts as two isolated waveguides because one of the metamaterial waveguides has a large bandgap for TE polarization. In simulation, an ultra-compact coupling region of $1.3 \times 6\ \mu\mathbf{m}^{2}$ is achieved, and the proposed PBS has an ER > 20 dB and an IL < 0.5 dB over the wavelength range of 1500-1590 nm, which covers the entire C-band. The ER values at the central wavelength of 1550 nm are 48 dB and 36.6 dB for the TE and TM modes, respectively. The experimental results show that the fabricated PBS achieves an ER > 15 dB and an IL < 1.2 dB over a wide bandwidth of 90 nm.
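For readers less used to the decibel figures quoted above, the sketch below converts between optical power ratios and the two metrics. It uses the standard definitions of insertion loss (input power relative to power at the intended port) and extinction ratio (power at the intended port relative to power leaking to the other port). The function names are hypothetical and nothing here is specific to the proposed device.

```python
import math

def insertion_loss_db(p_in, p_out):
    """Insertion loss in dB: how much of the input power fails to reach the intended port."""
    return -10.0 * math.log10(p_out / p_in)

def extinction_ratio_db(p_wanted, p_unwanted):
    """Extinction ratio in dB: power at the intended port over power at the other port."""
    return 10.0 * math.log10(p_wanted / p_unwanted)

def db_to_ratio(db):
    """Linear power ratio corresponding to a value in dB."""
    return 10.0 ** (db / 10.0)

# An IL of 0.5 dB means this fraction of the input power is transmitted.
print(1.0 / db_to_ratio(0.5))
# An ER of 20 dB means the unwanted port carries this fraction of the wanted power.
print(1.0 / db_to_ratio(20.0))
```

On this scale, an IL below 0.5 dB corresponds to transmitting roughly 89% or more of the input power, and an ER above 20 dB corresponds to less than 1% leakage relative to the intended port.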
{"title":"An ultra-compact and broadband high extinction ratio polarization splitting directional coupler using nanohole-based metamaterial waveguides","authors":"Yufei Chen, Shengbao Wu, Jinbiao Xiao","doi":"10.1109/IS3C57901.2023.00069","DOIUrl":"https://doi.org/10.1109/IS3C57901.2023.00069","journal":{"name":"2023 Sixth International Symposium on Computer, Consumer and Control (IS3C)"},"publicationDate":"2023-06-01"}
Pub Date : 2023-06-01 DOI: 10.1109/IS3C57901.2023.00023
Shang-Xian Lin, Yueh-Shen Tu, Jenhui Chen
We address the problem of basketball jersey number recognition, where the jersey numbers may be partially occluded, deformed, or bent. These conditions make jersey numbers difficult to recognize accurately, especially in real-world scenarios with varying player postures, angles, lighting conditions, jersey colors, and patterns. In this paper, we propose a novel method for basketball jersey number recognition, named the deformable hourglass network (DHN), which integrates the deformable convolutional network v3 (DCNv3) into an hourglass architecture inspired by the convolutional character networks (CharNet) model. We also provide a new dataset that contains a variety of occluded jersey numbers from real basketball game scenarios. We show that the DHN achieves better accuracy and robustness when recognizing deformed basketball jersey numbers in challenging real-world scenarios.
{"title":"Occluded and Deformed Jersey Numbers Recognition by Hourglass Networks with Deformable Convolutional Networks","authors":"Shang-Xian Lin, Yueh-Shen Tu, Jenhui Chen","doi":"10.1109/IS3C57901.2023.00023","DOIUrl":"https://doi.org/10.1109/IS3C57901.2023.00023","journal":{"name":"2023 Sixth International Symposium on Computer, Consumer and Control (IS3C)"},"publicationDate":"2023-06-01"}