Yanlei Li, Chenguang Yang, Amit Bahl, Raj Persad, Chris Melhuish
Prostate brachytherapy is a validated treatment for prostate cancer. During the procedure, the accuracy of needle placement is critical to the treatment's effectiveness. However, the inserted needle can deflect from the planned trajectory because of needle bending, tissue shifting caused by needle-tissue interaction, and the effects of previously inserted needles. Needle placement is particularly challenging in prostate brachytherapy because multiple needles are required for effective radiation delivery. To overcome these limitations, relevant research has been carried out in mechanical engineering, computer science and materials science. With the development of surgical robotics, researchers are also exploring how robotic assistance can raise the accuracy of needle placement. This study reviews the last three decades of work in each of the component research areas that constitute a surgical robotic system, including needle steering approaches, needle-tissue deformation models, path-planning algorithms, and surgical robotic systems with different levels of autonomy used for prostate cancer treatment, especially prostate brachytherapy. Further directions for researchers are also suggested.
Title: A review on the techniques used in prostate brachytherapy. Authors: Yanlei Li, Chenguang Yang, Amit Bahl, Raj Persad, Chris Melhuish. DOI: 10.1049/ccs2.12067. Published in Cognitive Computation and Systems, 2022-07-11. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12067
There has been remarkably increasing interest in music technology in the past few years. It is a multi-disciplinary, overlapping research area involving digital signal processing, acoustics, mechanics, computer science, electronic engineering, artificial intelligence, psychophysiology, cognitive neuroscience, and music performance, theory and analysis. Among these sub-domains of music technology, music perception and cognition are important parts of computational musicology, as musicking is a whole activity extending from the music itself to its perception and cognition by human beings. In addition to computing the basic elements of music itself, such as rhythm, pitch, timbre, harmony and structure, the perception of music by the human ear and the creative cognitive process deserve more attention from researchers, because they serve as a bridge between humanity and technology.
Music perception is present in almost every activity related to music, such as composing, playing, improvising, performing, teaching and learning. It is so comprehensive that a range of disciplines can be incorporated, including cognitive musicology, musical timbre perception, music emotion, acoustics, audio-based music signal processing, music interaction, cognitive modelling and music information retrieval.
This special issue aims to bring together scientists from the humanities and technology working on music technology in areas such as music performance art, creativity, computer science, experimental psychology, and cognitive science. It comprises 10 outstanding contributions covering auditory attention selection behaviours, emotional music generation, instrument and performance-skill recognition, perception and musical elements, music educational robots, affective computing, music-related social behaviour, and cross-cultural music datasets.
Li et al. studied the automatic recognition of traditional Chinese musical instrument audio. In the instrument-type identification experiment, the Mel-spectrum is used as input to train an 8-layer convolutional neural network, which achieves 99.3% accuracy. Performance-skill recognition experiments were conducted at both the single-instrument level and the same-family level, where the regularity of the same playing technique across different instruments can be exploited. With a similar training configuration, the recognition accuracy for the four instrument families is as follows: 95.7% for wind instruments, 82.2% for plucked-string instruments, 88.3% for string instruments, and 97.5% for percussion instruments.
Wang et al. used a cross-cultural approach to explore the correlations between perception and musical elements by comparing music emotion recognition models. In this approach, participants were asked to rate valence, tension arousal and energy arousal on labelled nine-point analogical-categorical scales for four types of classical music: Chinese ensemble,
Title: Guest editorial: Music perception and cognition in music technology. Authors: Zijin Li, Stephen McAdams. DOI: 10.1049/ccs2.12066. Published in Cognitive Computation and Systems, 2022-06-30. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12066
Aiming to improve the quality of life of visually impaired people, this paper presents a novel wearable, helmet-shaped aid that helps them find objects in indoor scenes. An object-goal navigation system based on the wearable device is developed, consisting of four modules: object relation prior knowledge (ORPK), perception, decision and feedback. To make the aid work well in unfamiliar environments, ORPK is used for sub-goal inference to help the user find the target. A method is also proposed that learns the ORPK from unlabelled images using a scene graph and a knowledge graph. The effectiveness of the aid is demonstrated in real-world experiments.
Title: Knowledge driven indoor object-goal navigation aid for visually impaired people. Authors: Xuan Hou, Huailin Zhao, Chenxu Wang, Huaping Liu. DOI: 10.1049/ccs2.12061. Published in Cognitive Computation and Systems, 2022-06-27. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12061
Nanfei Jiang, Zhexiao Xiong, Hui Tian, Xu Zhao, Xiaojie Du, Chaoyang Zhao, Jinqiao Wang
Face detection is the basic step of many face analysis tasks. In practice, face detectors usually run on mobile devices with limited memory and computing resources, so it is important to keep them lightweight. To this end, current methods usually focus on directly designing lightweight detectors. Nevertheless, whether the resource consumption of these lightweight detectors can be further reduced without sacrificing much accuracy has not been fully explored. In this study, we apply network pruning to a lightweight face detection network to further reduce its parameter count and floating-point operations. To identify the channels of least importance, we train the network with sparsity regularisation on the channel scaling factors of each layer. We then remove the connections and corresponding weights whose scaling factors are near zero after sparsity training. We apply the proposed pruning pipeline to a state-of-the-art face detection method, EagleEye, and obtain a shrunken EagleEye model with fewer computing operations and parameters. The shrunken model achieves accuracy comparable to the unpruned model: a 56.3% reduction in parameter size with almost no accuracy loss on the WiderFace dataset.
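The pruning recipe the abstract describes (sparsity-train channel scaling factors, then drop near-zero channels) can be sketched in a few lines. This is a minimal illustration with toy numpy values, not the authors' implementation; the threshold and the per-layer factor arrays are hypothetical.

```python
import numpy as np

def prune_by_scaling_factors(gammas, threshold=1e-2):
    """Return one boolean keep-mask per layer: channels whose scaling
    factor survived sparsity (L1) training, i.e. |gamma| > threshold."""
    return [np.abs(g) > threshold for g in gammas]

# Toy example: after training with an added L1 penalty sum(|gamma|),
# many scaling factors collapse towards zero and can be pruned away.
layer_gammas = [
    np.array([0.9, 0.001, 0.45, 0.0003, 0.7]),  # layer 1 factors
    np.array([0.002, 0.6, 0.0001, 0.8]),        # layer 2 factors
]
masks = prune_by_scaling_factors(layer_gammas)
kept = sum(int(m.sum()) for m in masks)
total = sum(len(g) for g in layer_gammas)
print(f"kept {kept}/{total} channels")  # kept 5/9 channels
```

In practice the scaling factors are typically the learnable weights of each batch-normalisation layer, and the surviving channels are copied into a narrower network that is then fine-tuned.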
Title: PruneFaceDet: Pruning lightweight face detection network by sparsity training. Authors: Nanfei Jiang, Zhexiao Xiong, Hui Tian, Xu Zhao, Xiaojie Du, Chaoyang Zhao, Jinqiao Wang. DOI: 10.1049/ccs2.12065. Published in Cognitive Computation and Systems, 2022-06-09. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12065
Jun Yang, Zhenbo Cheng, Gang Xiao, Xuesong Xu, Yaming Wang, Haonan Ding, Diting Zhou
The way engineers solve engineering design problems can be regarded as a gradual optimisation process that involves strategising, and this process can be modelled in a reinforcement learning (RL) framework. This article presents an RL model with episodic controllers for solving engineering problems. Episodic controllers provide a mechanism for using short-term and long-term memories to improve the efficiency of searching for solutions. This work demonstrates that both kinds of memory model can be incorporated into the existing RL framework. Finally, the optimised design of a crane girder is used to illustrate RL with episodic controllers. The model presented in this study has been shown to mimic human problem solving in engineering design optimisation.
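The core of an episodic controller is a long-term memory that stores the best return ever obtained from each visited state, so later episodes can reuse past experience instead of re-exploring. A minimal sketch, with a hypothetical state key for a girder design (not from the paper):

```python
def episodic_update(memory, state_key, episode_return):
    """Episodic-control memory update: keep, per state, the best return
    ever observed, so the agent can later act greedily on stored values."""
    if episode_return > memory.get(state_key, float("-inf")):
        memory[state_key] = episode_return
    return memory

memory = {}
episodic_update(memory, "girder:h=500", 0.7)   # first visit stores 0.7
episodic_update(memory, "girder:h=500", 0.4)   # worse return is ignored
episodic_update(memory, "girder:h=550", 0.5)
```

At decision time the agent compares the stored values of candidate next states and picks the highest, falling back to ordinary exploration for unseen states.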
Title: Engineering design optimisation using reinforcement learning with episodic controllers. Authors: Jun Yang, Zhenbo Cheng, Gang Xiao, Xuesong Xu, Yaming Wang, Haonan Ding, Diting Zhou. DOI: 10.1049/ccs2.12063. Published in Cognitive Computation and Systems, 2022-06-06. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12063
Ke Xu, Bin Liu, Jianhua Tao, Zhao Lv, Cunhang Fan, Leichao Song
Existing methods struggle to capture the semantic emotion of a sentence when cross-language corpora are scarce, which makes effective cross-language sentiment analysis difficult. To solve this problem, we propose a neural network architecture called the Attention-Based Hybrid Robust Neural Network (AHRNN). The architecture combines pre-trained word embeddings with fine-tuning to obtain prior semantic information, two sub-networks and an attention mechanism to capture the global semantic emotional information in the text, and a fully connected layer with a softmax function to perform the final emotion classification. The convolutional neural network sub-network captures local semantic emotional information, the BiLSTM sub-network captures contextual semantic emotional information, and the attention mechanism dynamically integrates them to extract the key emotional information. We conduct experiments on Chinese (International Conference on Natural Language Processing and Chinese Computing) and English (SST) datasets, divided into three subtasks. Our method improves the accuracy of single-sentence positive/negative classification from 79% to 86% in the single-language emotion recognition task, improves recognition of fine-grained emotion tags by 9.6%, and improves cross-language emotion recognition accuracy by 1.5%. Even on faulty data, the performance of our model does not degrade significantly when the error rate is below 20%. These experimental results demonstrate the superiority of our method.
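The attention step described above (dynamically weighting per-timestep features from the sub-networks into one summary vector) can be illustrated with a generic attention-pooling sketch in numpy. This is a common formulation, not the paper's exact layer; `H` and `w` stand in for the BiLSTM hidden states and a learned scoring vector.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(H, w):
    """Attention pooling over timestep features H (T x d): a learned
    vector w scores each timestep, and the softmax-weighted sum of the
    rows of H gives a single d-dimensional summary for classification."""
    scores = softmax(H @ w)   # (T,) attention weights, sum to 1
    return scores @ H         # (d,) weighted combination of timesteps

H = np.arange(12, dtype=float).reshape(4, 3)  # 4 timesteps, 3 features
w = np.zeros(3)                               # zero scores -> uniform weights
summary = attention_pool(H, w)                # equals the mean over timesteps
```

The summary vector would then feed the fully connected layer and softmax classifier.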
Title: AHRNN: Attention-Based Hybrid Robust Neural Network for emotion recognition. Authors: Ke Xu, Bin Liu, Jianhua Tao, Zhao Lv, Cunhang Fan, Leichao Song. DOI: 10.1049/ccs2.12038. Published in Cognitive Computation and Systems, 2022-02-22. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12038
The article presents the design and development of a virtual fretless Chinese stringed-instrument app, taking the Duxianqin as an example, whose playing experience is intended to be indistinguishable from the real instrument. The digital simulation of a fretless instrument divides into two parts: simulating the continuous pitch of the strings, and simulating the sound produced by plucking them. Starting from mechanics and wave theory, the article derives the quantitative relationship between string frequency and the string's deformation and elongation. Because the Duxianqin is fretless, it cannot be fully simulated from recorded source samples alone; playing and sound production require real-time synthesis with pitch processing. This approach can serve as a reference for realising other fretless instruments.
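The physics underlying continuous pitch on a string can be sketched from the ideal-string formula f = (1/2L)·sqrt(T/μ): elongating the string raises its tension (roughly by E·A·ΔL/L under Hooke's law) and therefore its frequency. This is a textbook illustration of the kind of relationship the article derives, with hypothetical parameter values, not the paper's actual model.

```python
import math

def string_frequency(length_m, tension_n, mu_kg_per_m):
    # Fundamental frequency of an ideal string: f = (1 / 2L) * sqrt(T / mu)
    return math.sqrt(tension_n / mu_kg_per_m) / (2.0 * length_m)

def bent_frequency(length_m, tension_n, mu_kg_per_m, ea_n, elongation_m):
    """Illustrative pitch-bend model: elongating the string by dL adds
    tension dT ~= E*A*dL/L (Hooke's law), raising the pitch continuously."""
    d_tension = ea_n * elongation_m / length_m
    return string_frequency(length_m, tension_n + d_tension, mu_kg_per_m)

# Hypothetical string: 0.65 m long, 60 N tension, 0.2 g/m linear density.
f0 = string_frequency(0.65, 60.0, 0.0002)
f_bent = bent_frequency(0.65, 60.0, 0.0002, 5e4, 0.001)
```

Mapping a touch position to an elongation and recomputing the frequency in real time is one way such an app can produce continuous, fretless pitch.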
Title: Design and app development of a virtual fretless Chinese musical instrument. Authors: Rongfeng Li, Ke Lyu. DOI: 10.1049/ccs2.12046. Published in Cognitive Computation and Systems, 2022-02-17. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12046
Jing Jiang, Jingyu Liu, Zijin Li, Tingyu Zhang, Hong Yang
String, wind and percussion instruments are the three traditional categories of musical instruments, among which wind instruments play an important role. The pitch of a wind instrument is usually determined by the vibrating air column and is affected by multiple properties of the air flow. In this article, the mechanism of sound production in a pipe is analysed in terms of the coupling between the edge tone and the vibration of the air column in the tube. Experiments and computational fluid dynamics calculations are combined to study how the jet velocity influences the oscillation frequency of the edge tone and the musical sound produced by the tube, giving deeper insight into the relation between physics and music.
Title: Analysis on the mechanism of sound production and effects of musical flue pipe. Authors: Jing Jiang, Jingyu Liu, Zijin Li, Tingyu Zhang, Hong Yang. DOI: 10.1049/ccs2.12048. Published in Cognitive Computation and Systems, 2022-02-17. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12048
This paper is part of a special issue on Music Technology. We first study type recognition of traditional Chinese musical instrument audio in the common way: using Mel-spectrum features as input, we train an 8-layer convolutional neural network and achieve 99.3% accuracy. The paper then focusses on performance-skill recognition for traditional Chinese instruments. Firstly, for single instruments, features were extracted with a pre-trained ResNet model and classified with an SVM, reaching 99% accuracy across all instruments. Then, to improve the generalisation of the model, the paper proposes performance-skill recognition within the same instrument family, which exploits the regularity of the same playing technique across different instruments. The resulting recognition accuracy for the four instrument families is as follows: 95.7% for wind instruments, 82.2% for plucked-string instruments, 88.3% for string instruments, and 97.5% for percussion instruments.
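The Mel-spectrum input mentioned above rests on the Mel scale, which spaces filterbank bands uniformly in perceived pitch rather than in Hz. A minimal sketch of how the band edges of such a filterbank are placed, using the standard HTK-style conversion formula (the sample rate and band count below are illustrative, not the paper's settings):

```python
import numpy as np

def hz_to_mel(f_hz):
    # HTK-style Mel scale: mel = 2595 * log10(1 + f / 700)
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_band_edges(f_min_hz, f_max_hz, n_bands):
    """Edge frequencies (Hz) for n_bands triangular Mel filters, spaced
    uniformly on the Mel scale between f_min_hz and f_max_hz; returns
    n_bands + 2 points (each filter spans three consecutive edges)."""
    mels = np.linspace(hz_to_mel(f_min_hz), hz_to_mel(f_max_hz), n_bands + 2)
    return 700.0 * (10.0 ** (mels / 2595.0) - 1.0)

# Illustrative: 64 Mel bands covering 0-8 kHz.
edges = mel_band_edges(0.0, 8000.0, 64)
```

Applying these triangular filters to short-time power spectra, then taking the log, yields the Mel-spectrogram image that the CNN consumes.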
Title: Audio recognition of Chinese traditional instruments based on machine learning. Authors: Rongfeng Li, Qin Zhang. DOI: 10.1049/ccs2.12047. Published in Cognitive Computation and Systems, 2022-02-17. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12047
Patients with cerebral haemorrhages need haematomas drained. Fresh blood may appear during the haematoma drainage process, so it must be observed and detected in real time. To address this problem, this paper studies images produced during haematoma drainage and designs a framework for blood-image feature selection, recognition and classification. First, because colour differences in blood images are small, features in the general RGB colour space are not distinctive; this study therefore proposes an optimal colour channel selection method. The colour information extracted from the images is recombined into a 3 × 3 matrix, and the normalised 4-neighbourhood contrast and variance are calculated for quantitative comparison. The optimised colour channel is selected to overcome the weak features caused by relying on a single colour space. The effective region of the image is then cropped, and the image in that region is transformed into the best colour channels. The first, second and third moments of the three best colour channels are extracted to form a nine-dimensional feature vector. K-means clustering is used to remove outliers from the image feature vectors, and the results are then passed to a hidden Markov model (HMM) and support vector machine (SVM) for classification. After selecting the best colour channels, the classification accuracy of the HMM-SVM is greatly improved, and the proposed method compares favourably with other classification algorithms. Experiments show that the recognition accuracy of this method reaches 98.9%.
{"title":"Classification and detection using hidden Markov model-support vector machine algorithm based on optimal colour space selection for blood images","authors":"Lei Guo, Yao Wang, Yuan Song, Tengyue Sun","doi":"10.1049/ccs2.12045","DOIUrl":"https://doi.org/10.1049/ccs2.12045","url":null,"abstract":"<p>Patients with cerebral haemorrhages need to drain haematomas. Fresh blood may appear during the haematoma drainage process, so this needs to be observed and detected in real time. To solve this problem, this paper studies images produced during the haematoma drainage process. A blood image feature selection recognition and classification framework is designed. First, aiming at the characteristics of the small colour differences in blood images, the general RGB colour space feature is not obvious. This study proposes an optimal colour channel selection method. By extracting the colour information from the images, it is recombined into a 3 × 3 matrix. The normalised 4-neighbourhood contrast and variance are calculated for quantitative comparison. The optimised colour channel is selected to overcome the problem of weak features caused by a single colour space. After that, the effective region in the image is intercepted, and the best colour channel of the image in the region is transformed. The first, second and third moments of the three best colour channels are extracted to form a nine-dimensional eigenvector. K-means clustering is used to obtain the image eigenvector, outliers are removed, and the results are then transferred to the hidden Markov model (HMM) and support vector machine (SVM) for classification. After selecting the best color channel, the classification accuracy of HMM-SVM is greatly improved. Compared with other classification algorithms, the proposed method offers great advantages. 
Experiments show that the recognition accuracy of this method reaches 98.9%.</p>","PeriodicalId":33652,"journal":{"name":"Cognitive Computation and Systems","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2022-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/ccs2.12045","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"92314250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
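The abstract above builds its nine-dimensional feature vector from the first, second and third colour moments of three channels. A minimal NumPy sketch of that step follows; the function name and the exact moment definitions (mean, standard deviation, and signed cube root of the third central moment) are plausible assumptions, since the paper does not spell out its formulas here.

```python
import numpy as np

def colour_moments(image):
    """Nine-dimensional feature vector: the first, second and third colour
    moments (mean, standard deviation, cube root of the third central
    moment) of each of the image's three channels.

    image: array of shape (H, W, 3).
    """
    feats = []
    for ch in range(3):
        x = image[..., ch].astype(np.float64).ravel()
        mean = x.mean()                          # first moment
        std = x.std()                            # second moment
        third = np.cbrt(((x - mean) ** 3).mean())  # third moment, sign kept
        feats.extend([mean, std, third])
    return np.array(feats)
```

Feature vectors of this kind would then be clustered with K-means to drop outliers before training the HMM/SVM classifiers, as the abstract describes.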