Pub Date: 2023-12-22; DOI: 10.5755/j01.itc.52.4.33239
Xuan Zhou, Jixiang Ma, Jianping Yi
Temporal reasoning is crucial for action recognition. Previous works use 3D CNNs to jointly capture spatiotemporal information, but at a high computational cost. To address this problem, we propose a general channel split spatiotemporal network (CSST-Net) for effective spatiotemporal feature representation learning. The CSST module consists of a grouped spatiotemporal modeling (GSTM) module and a parameter-free feature fusion (PFFF) module. The GSTM module splits features along the channel dimension into spatial and temporal parts that are processed in parallel, focusing on spatial and temporal cues respectively. We combine group-wise and point-wise convolution to reduce the number of parameters in the temporal branch, thus alleviating the overfitting of 3D CNNs. For spatiotemporal feature fusion, the PFFF module recalibrates and fuses the spatial and temporal features with a soft attention mechanism that introduces no extra parameters, preserving the flow of information through the network. Extensive experiments on three benchmark datasets (Sth-Sth V1, Sth-Sth V2, and Jester) indicate that CSST-Net achieves competitive performance compared to existing methods while significantly reducing the number of parameters and FLOPs of the 3D CNN baseline.
Title: CSST-Net: Channel Split Spatiotemporal Network for Human Action Recognition; Journal: Information Technology and Control
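To illustrate why combining group-wise and point-wise convolution can shrink the temporal branch, the sketch below counts the weights of a standard 3D convolution against a grouped 3D convolution followed by a 1x1x1 point-wise convolution. The channel count, kernel size, and number of groups are illustrative assumptions, not values taken from the paper, and the actual layer layout of CSST-Net may differ.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3), groups=1):
    """Number of weights in a bias-free 3D convolution.

    With `groups` > 1, each output channel only sees c_in / groups
    input channels, which divides the weight count by `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    kt, kh, kw = kernel
    return (c_in // groups) * c_out * kt * kh * kw


# Hypothetical sizes chosen only for illustration.
channels = 64
groups = 8

standard = conv3d_params(channels, channels)                       # 110592
grouped = conv3d_params(channels, channels, groups=groups)         # 13824
pointwise = conv3d_params(channels, channels, kernel=(1, 1, 1))    # 4096
factorized = grouped + pointwise                                   # 17920

print(f"standard 3D conv:      {standard} weights")
print(f"group-wise + 1x1x1:    {factorized} weights")
print(f"reduction factor:      {standard / factorized:.2f}x")
```

The point-wise layer restores mixing across channel groups that the grouped convolution alone would not provide, which is the usual motivation for pairing the two.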
Pub Date: 2023-12-22; DOI: 10.5755/j01.itc.52.4.34039
Teng Xu, Ling Ren, Tian Shi, Yuan Gao, Jian-Bang Ding, Rong-Chen Jin
This paper proposes a novel PVF-YOLO model to extract multi-scale traffic sign features more effectively during driving. First, the original convolution module is replaced with Omni-Dimensional convolution (ODConv), and feature information from the shallow feature layer is incorporated into the network. Second, a parallel structure block is proposed for capturing multi-scale features. This block uses Large Kernel Attention (LKA) and a Visual Multilayer Perceptron (Visual MLP) to capture the information generated by the network, enhancing the representation ability of the feature maps. Next, during training, the gradient concentration algorithm is used to optimize the original Stochastic Gradient Descent (SGD), which improves detection accuracy while maintaining real-time detection. Finally, extensive experiments are conducted to verify the robustness of the model. The Tsinghua-Tencent 100K (TT100K) dataset and the CSUST Chinese Traffic Sign Detection Benchmark (CCTSDB) from Changsha University of Science and Technology are used as training data. The results verify that PVF-YOLO enhances the detection of traffic signs at different scales and that its detection speed and accuracy are better than those of the original model.
Title: Traffic Sign Detection Algorithm Based on Improved Yolox; Journal: Information Technology and Control
Pub Date: 2023-12-22; DOI: 10.5755/j01.itc.52.4.34079
Jinhang Liu, Lin Li
A social-media user portrait is an important means of improving the quality of Internet information services. Current user profiling methods do not distinguish the emotional differences between users of different genders and ages on social media, where data is multi-modal and domain sentiment labels are lacking. This paper applies sentiment analysis of images and text to improve label classification, incorporating gender and age differences into the sentiment analysis of multi-modal social-media user profiles. In the absence of domain sentiment labels, instance transfer learning is used to learn sentiment representations of text and images; semantic association learning over multi-modal image and text data is realized; and a multi-modal attention mechanism is introduced to establish hidden image-text alignment relationships, which address the semantic and modal gaps between modalities. On this basis, a multi-modal user portrait label classification model (MPCM) is constructed. In an analysis of user sentiment data from Facebook, Twitter, and News, MPCM is compared with the naive Bayes, Latent Dirichlet Allocation, Tweet-LDA and LUBD-CM(3) methods in terms of accuracy, precision, recall and the F1-score. At a 95% confidence level, the MPCM method improves performance by 1% to 4%.
Title: MPCM: Multi-modal User Portrait Classification Model Based on Collaborative Learning; Journal: Information Technology and Control
Pub Date: 2023-12-22; DOI: 10.5755/j01.itc.52.4.33765
Tung-Tso Tsai, Han-Yu LIN, Cheng-Ye Wu
To maintain the confidentiality of private data, encryption mechanisms have become prevalent. Researchers continually strive to design secure and efficient encryption mechanisms in both symmetric and asymmetric key systems. Certificate-based public key systems (CB-PKS) belong to the family of asymmetric key systems. CB-PKS solves both the key escrow problem present in identity-based public key systems and the need to construct a public key infrastructure in traditional public key systems. A wealth of past research has addressed encryption mechanisms in the CB-PKS, called certificate-based encryption (CBE). Encrypted data (ciphertext) can also be used in other applications, such as the comparison of personal medical data, because two ciphertexts can be compared to determine whether they contain the same data (plaintext). However, the equality test of two ciphertexts in the CB-PKS remains an open issue, since little research has studied it. The purpose of this paper is to propose the first certificate-based encryption with equality test (CBEET) and to prove that it is secure under the bilinear Diffie-Hellman (BDH) assumption.
Title: CBEET: Constructing Certificate-based Encryption with Equality Test in the CB-PKS; Journal: Information Technology and Control
Pub Date: 2023-12-22; DOI: 10.5755/j01.itc.52.4.33125
Jianming Zhu, Wei Tang, Jian-Wei Dong
Recently, the growing structural complexity and performance requirements of aero-engines have placed higher demands on their control systems. For the control of such aero-thermodynamic systems, intelligent control methods with self-learning ability are a promising choice. In this paper, we propose an aero-engine intelligent controller design method based on the twin delayed deep deterministic policy gradient (TD3) algorithm. The method enables the intelligent controller to learn continuously from environment feedback and to control the aero-engine. The paper takes the intelligent controller design of the JT9D turbofan engine as an example. First, the aero-engine control problem is formulated as a Markov decision process suitable for deep reinforcement learning algorithms. Second, a complete intelligent controller design process is constructed by carefully designing the network structure and reward function. Finally, comparative simulations are conducted to verify the effectiveness of the proposed method. The simulation results show that the TD3 controller outperforms deep deterministic policy gradient (DDPG) and proportional-integral-derivative (PID) controllers in the aero-engine control task, and that it tracks the low-pressure turbine speed with quick response and small overshoot.
Title: Design of Intelligent Controller for Aero-engine Based on TD3 Algorithm; Journal: Information Technology and Control
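As a rough illustration of what distinguishes TD3 from DDPG, the sketch below computes the TD3 critic target for a single transition with a scalar action: the target action is perturbed by clipped noise (target policy smoothing) and the smaller of two target critics is used (clipped double Q-learning). The function name `td3_target`, the callables passed in, and all hyperparameter values are hypothetical placeholders; the paper's network structure, reward function, and settings are not reproduced here.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-1.0, act_high=1.0, rng=random):
    """Critic target y = r + gamma * (1 - done) * min(Q1', Q2')(s', a~)."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise, act_low, act_high)
    # Clipped double Q-learning: take the more pessimistic of the two critics.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    return reward + gamma * (0.0 if done else 1.0) * q_min


# Toy stand-ins for the target networks (assumptions for illustration only).
def toy_actor(state):
    return 0.1 * state


def toy_q1(state, action):
    return state + action


def toy_q2(state, action):
    return state - action


y = td3_target(reward=1.0, next_state=2.0, done=False,
               target_actor=toy_actor, target_q1=toy_q1, target_q2=toy_q2)
print(f"TD3 critic target: {y:.4f}")
```

In the full algorithm the actor and target networks are also updated less frequently than the critics (the "delayed" part of TD3), which this single-step sketch does not show.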
Pub Date: 2023-12-22; DOI: 10.5755/j01.itc.52.4.32042
Mohd. Rehan Ghazi, N. S. Raghava
The availability of automated data collection techniques and the growth in the amount of data collected from cloud network traffic and cloud resource activities have created a big data challenge that requires big data tools to handle, manage, and interpret the data. A single classification method may fail to execute successfully on data of this volume. Despite being more complex and consuming more computational resources, stacking-based ensemble Machine Learning (ML) methods have been shown to classify data better than single classifiers. This research proposes two Intrusion Detection System (IDS) techniques, both based on an ensemble of ML algorithms built on the Stacked Generalization Approach (SGA) and on big data technology. The suggested approaches are tested and assessed on the NSL-KDD and UNSW-NB15 datasets, using a Gain Ratio (GR) based Feature Selection (FS) approach; the J48, OneR, Support Vector Machine (SVM), Random Forest (RF), Multi-layer Perceptron (MLP) and Extreme Gradient Boosting (XGBoost) classifiers; and Apache Spark, a prominent big data processing platform. The first technique involves storing data on HDFS, while the second involves selecting the most suitable subset of base classifiers for stacking. A thorough performance investigation reveals that the proposed model outperforms other current IDS models in accuracy, FPR, or other performance metrics when detecting intrusions in the cloud.
Title: A Scalable and Stacked Ensemble Approach to Improve Intrusion Detection in Clouds; Journal: Information Technology and Control
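Gain ratio, the feature selection criterion named in the abstract, is information gain normalized by the entropy of the feature's own value distribution (its split information). The sketch below computes it for categorical features in plain Python; the tiny traffic table and its feature names are made up for illustration and are not drawn from NSL-KDD or UNSW-NB15, and the paper's Spark-based pipeline is not reproduced.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split info."""
    n = len(labels)
    # Group the class labels by feature value.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A feature with a single value carries no information.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records: two candidate features and a class label.
protocol = ["tcp", "tcp", "udp", "udp"]
flag = ["S0", "SF", "S0", "SF"]
label = ["attack", "attack", "normal", "normal"]

ranking = sorted(
    [("protocol", gain_ratio(protocol, label)), ("flag", gain_ratio(flag, label))],
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: gain ratio = {score:.3f}")
```

Features would then be kept or dropped according to this ranking before the base classifiers are trained and stacked.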
Pub Date: 2023-12-22; DOI: 10.5755/j01.itc.52.4.34232
B. Krishnakumar, K. Kousalya
Breast cancer is the most widespread cancer among women. According to analysis by the International cancer research center, it causes the highest number of cancer deaths among women. Detecting breast cancer as early as possible may therefore help oncologists make appropriate decisions. Because breast tissue density varies, precise diagnosis and classification remain challenging. To overcome this challenge, a novel OTDEM-based breast cancer segmentation and classification approach is proposed with four stages: preprocessing, segmentation, feature extraction, and classification. In the first stage, the input image is enhanced using the CLAHE filter. The preprocessed image is then passed to the segmentation stage, where correlation-based deep joint segmentation divides it into sub-segments. Next, statistical features, improved LGXP, texton features, and shape-based features are derived from the segmented image. The derived features are fed to an ensemble model comprising CNN, DBN, and Bi-GRU classifiers, which produces the final classification outcome. To enhance the performance of the ensemble model, the weights of the Bi-GRU are optimized with a new algorithm termed SIPOA, which ensures optimal training and makes the model better suited to the classification task. Finally, the performance of the proposed work is validated against traditional models on several performance measures.
Title: Optimal Trained Deep Learning Model for Breast Cancer Segmentation and Classification; Journal: Information Technology and Control
Pub Date: 2023-12-22; DOI: 10.5755/j01.itc.52.4.33766
Xiaoping Yang, Zhehong Li, Yuan Liu, Ran Huang, Kai Tan, Lin Huang
Pedestrian detection in autonomous driving systems is vulnerable to the external environment, and detectors based on a single sensor modality perform poorly under large, unconstrained changes in illumination. To address this problem, this paper proposes a pedestrian detection method that fuses the visible-light and thermal infrared modalities. First, 1 × 1 convolution and dilated convolution are introduced into the residual network, and ROI Align replaces ROI Pooling for mapping candidate boxes to the feature layer, optimizing Faster R-CNN. Second, the generalized intersection over union (GIoU) loss is used as the loss function for bounding-box localization regression. Finally, based on the improved Faster R-CNN, four multimodal neural network structures are designed to fuse visible and thermal infrared images. Experimental results show that the proposed method outperforms current mainstream detection algorithms on the KAIST dataset. Compared with the conventional ACF + T + THOG pedestrian detector, the AP is 8.38 percentage points higher. Compared with the visible-light pedestrian detector, the miss rate is 5.34 percentage points lower.
Title: Research on Pedestrian Detection Based on Multimodal Information Fusion; Journal: Information Technology and Control
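For readers unfamiliar with GIoU, the sketch below computes it for two axis-aligned boxes and derives the corresponding regression loss, 1 - GIoU. Unlike plain IoU, GIoU stays informative when the boxes do not overlap, because it penalizes the empty area of the smallest enclosing box. The `(x1, y1, x2, y2)` box format and the function names are assumptions for illustration; how the loss is wired into the paper's improved Faster R-CNN is not shown.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero area if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    iou = inter / union
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """Regression loss in [0, 2): 0 for a perfect match."""
    return 1.0 - giou(pred_box, target_box)


# Disjoint boxes: IoU is 0, but GIoU still reflects how far apart they are.
near = giou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0))
far = giou_loss((0.0, 0.0, 1.0, 1.0), (9.0, 0.0, 10.0, 1.0))
print(f"loss for nearby box: {near:.4f}, for distant box: {far:.4f}")
```

Because the loss grows as disjoint boxes move further apart, it still gives the regressor a useful signal in cases where an IoU-based loss would be flat.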
Pub Date: 2023-12-22; DOI: 10.5755/j01.itc.52.4.33503
S. Nivedha, S. Shankar
This study focuses on melanoma, a rapidly spreading and dangerous type of skin cancer, and presents a reliable technique for its detection. Melanoma is one of the most prevalent types of cancer and can be challenging for medical professionals to diagnose. Artificial intelligence can improve diagnostic accuracy when used in conjunction with the expertise of medical specialists. This study introduces a computer-aided method for the diagnosis of skin cancer. The proposed method combines the Artificial Gorilla Troops Optimizer (AGTO), a recently introduced meta-heuristic optimization algorithm, with a deep learning model, the Faster Region Convolutional Neural Network (Faster R-CNN). To reduce the complexity of the analysis, valuable features are selected using the AGTO method, and classification is then performed using Faster R-CNN. The proposed model is applied to the ISIC-2020 skin cancer dataset. Compared with four existing works, the proposed system outperforms the existing models, reaching an accuracy of 98.55%.
Title: Melanoma Diagnosis Using Enhanced Faster Region Convolutional Neural Networks Optimized by Artificial Gorilla Troops Algorithm; Journal: Information Technology and Control
Pub Date: 2023-12-22 DOI: 10.5755/j01.itc.52.4.33712
Xiaoping Yang, Zhehong Li, Kai Tan, Xing Zhu, Guanghui Liu, Li Jiang
Landslides significantly impact economic development and public safety. To address the insufficient accuracy of the traditional grey Verhulst model in predicting displacement data series, this paper proposes a fractional Verhulst model optimized using the beetle antennae search algorithm. First, a fractional-order operator is introduced into the grey Verhulst model to adjust the magnitude between cumulative values accurately, yielding a fractional-order grey Verhulst model; expanding the range of the accumulation order improves prediction performance. Second, the fractional operator is optimized: the beetle antennae search algorithm finds the fractional order between 0 and 1 that minimizes the average relative error of the grey Verhulst model. Finally, simulation experiments on displacement data from the Heifangtai landslide group in Gansu Province verify that the model has higher fitting and prediction accuracy than the traditional grey Verhulst model, Huang's improved Verhulst model, the GM(1,1) model, the cubic exponential smoothing model, and the DGM(2,1) model, with an average relative error of 2.949%. The results show that the fractional-order grey prediction model optimized by the beetle antennae search algorithm significantly improves fitting and prediction on the data, and that the optimized fractional Verhulst model is more suitable for predicting landslide displacement.
Title: Design of Fractional Verhulst Model for Displacement Prediction of Landslide Based on the Optimization of Beetle Antennae Algorithm. Journal: Information Technology and Control. https://doi.org/10.5755/j01.itc.52.4.33712
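The landslide abstract names three building blocks: fractional-order accumulation, the average relative error, and a beetle antennae search over the order. They can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the exact operator or search variant, so the code assumes the accumulation form commonly used in fractional grey models (generalised binomial weights) and a basic one-dimensional beetle antennae search. All function names and default parameters are hypothetical.

```python
import random


def frac_accumulate(series, r):
    """Accumulate a series with fractional order r.

    r = 1 gives the ordinary cumulative sum and r = 0 the identity.
    Weights are the generalised binomial coefficients C(r + j - 1, j).
    """
    n = len(series)
    coeffs = [1.0]
    for j in range(1, n):
        # Recurrence: c_j = c_{j-1} * (r + j - 1) / j
        coeffs.append(coeffs[-1] * (r + j - 1) / j)
    return [sum(coeffs[k - i] * series[i] for i in range(k + 1)) for k in range(n)]


def frac_restore(acc_series, r):
    """Undo frac_accumulate by accumulating with order -r."""
    return frac_accumulate(acc_series, -r)


def average_relative_error(actual, predicted):
    """Mean of |predicted - actual| / |actual|, expressed in percent."""
    total = sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def beetle_antennae_search(objective, lo, hi, iters=60, step=0.25, decay=0.9, seed=0):
    """Minimise a 1-D objective on [lo, hi] with a basic beetle antennae search.

    Assumption: the antenna length equals the current step size.
    """
    rng = random.Random(seed)

    def clamp(v):
        return min(hi, max(lo, v))

    x = 0.5 * (lo + hi)
    best_x, best_f = x, objective(x)
    for _ in range(iters):
        b = rng.choice((-1.0, 1.0))            # random antenna direction
        right = clamp(x + step * b)
        left = clamp(x - step * b)
        diff = objective(right) - objective(left)
        sign = (diff > 0) - (diff < 0)
        x = clamp(x - step * b * sign)         # step toward the better antenna
        fx = objective(x)
        if fx < best_f:
            best_x, best_f = x, fx
        step *= decay                          # shrink the step each iteration
    return best_x, best_f
```

In the setting the abstract describes, `objective` would map a candidate order `r` to the average relative error of the fractional Verhulst model fitted with that order, and the search interval would be `[0, 1]`. Fitting the model is outside this sketch, so any objective passed in here is a stand-in.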