Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563430
Tatsiana Viarbitskaya, A. Dobrucki
The article addresses the recognition of instruments and playing techniques in order to detect and correct errors in a given music sample. It shows how to extract the characteristics of a recorded sound and how to compare the amplitudes and frequencies of the same piece of music played by different performers and on different instruments. To this end, it uses signal processing algorithms available in standard Python libraries such as “numpy” and “scipy”. The key idea of the processing is to detect errors while preserving the playing technique and individual style of the player.
{"title":"Audio processing with using Python language science libraries","authors":"Tatsiana Viarbitskaya, A. Dobrucki","doi":"10.23919/SPA.2018.8563430","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563430","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
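The comparison of amplitudes and frequencies mentioned in the abstract can be illustrated with a minimal numpy sketch. It is not the authors' code: the function name `describe_tone` and the two synthetic "performances" are assumptions made for illustration, and a real recording would need framing and pitch tracking rather than a single FFT.

```python
import numpy as np

def describe_tone(signal, sample_rate):
    """Return (dominant frequency in Hz, RMS amplitude) of a mono signal."""
    # A Hann window reduces spectral leakage before the FFT.
    windowed = signal * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    dominant = freqs[np.argmax(spectrum)]
    rms = np.sqrt(np.mean(signal ** 2))
    return dominant, rms

# Two synthetic renditions of the same note (hypothetical data).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate           # one second of samples
player_a = 0.8 * np.sin(2 * np.pi * 440.0 * t)
player_b = 0.5 * np.sin(2 * np.pi * 446.0 * t)     # slightly sharp and quieter

freq_a, rms_a = describe_tone(player_a, sample_rate)
freq_b, rms_b = describe_tone(player_b, sample_rate)
print(f"frequency difference: {freq_b - freq_a:.1f} Hz")
print(f"amplitude ratio: {rms_b / rms_a:.2f}")
```

A frequency difference beyond some tolerance could be flagged as a possible error, while a different amplitude ratio might simply reflect the player's style; where to draw that line is exactly the judgment the abstract describes.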
Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563391
Marianna Parzych, T. Marciniak, A. Dabrowski
The paper presents an analysis of methods for visualizing crowd density. The generated density maps take changes over time into account. Three methods have been implemented and tested. The first uses motion detection based on background subtraction. The second is based on the analysis of BLOBs (binary large objects). The third uses interest points, i.e. points in the image that can be used to track the movement of objects. The tests were performed using the PETS2009 video sequence database. The resulting maps were evaluated and the processing times were estimated.
{"title":"Adaptive methods of time-dependent crowd density distribution visualization","authors":"Marianna Parzych, T. Marciniak, A. Dabrowski","doi":"10.23919/SPA.2018.8563391","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563391","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
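The first method, background subtraction feeding a time-dependent density map, might look roughly like the sketch below. The abstract does not give the actual update rule, so the exponential decay, the threshold, the learning rate and the name `update_density` are all assumptions, and the frames are synthetic.

```python
import numpy as np

def update_density(density, background, frame, threshold=25.0, decay=0.9, learn=0.05):
    """One step of a time-decayed motion density map (illustrative only)."""
    frame = frame.astype(np.float64)
    # Foreground mask: pixels that differ strongly from the background model.
    moving = np.abs(frame - background) > threshold
    # Older motion fades out, new motion is added on top.
    density = decay * density + moving
    # Adapt the background only where no motion was detected.
    background = np.where(moving, background, (1 - learn) * background + learn * frame)
    return density, background

# Synthetic sequence: a bright 2x2 "person" walking right across a dark scene.
height, width = 6, 10
background = np.zeros((height, width))
density = np.zeros((height, width))
for step in range(8):
    frame = np.zeros((height, width), dtype=np.uint8)
    frame[2:4, step:step + 2] = 200
    density, background = update_density(density, background, frame)
```

After the loop, `density` is largest where the object was seen most recently and fades along the path it came from, which is one simple way for a map to "take changes over time into account".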
Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563429
Michał Bednarek, K. Walas
Convolutional Neural Networks (CNNs) have brought exceptionally significant improvements in the performance of a variety of visual tasks, such as object classification, semantic segmentation or linear regression. However, these powerful neural models suffer from a lack of spatial invariance. In this paper, we introduce an end-to-end system that is able to learn such invariance, including invariance to in-plane and out-of-plane rotations. We performed extensive experiments on variations of the widely known MNIST dataset, which consist of images subjected to deformations. Our comparative results show that we can successfully improve the classification score by implementing the so-called Spatial Transformer module.
{"title":"Spatial Transformations in Deep Neural Networks","authors":"Michał Bednarek, K. Walas","doi":"10.23919/SPA.2018.8563429","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563429","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563432
Maciej Sabiniok, S. Brachmański
Notch-filter-based howling suppression is one of the most popular gain reduction methods for dealing with the acoustic feedback problem. The main goal of this paper is to analyze the possibility of using the grey prediction model GM(1,1) to accelerate the feedback detection process of the algorithm. Computer-based comparative simulations of the algorithm with and without the prediction model in the detection stage were performed. Simulations were performed for different prediction orders, numbers of predicted samples and analysis window lengths. The comparison and evaluation were carried out for different source signals: music, speech and noise.
{"title":"Analysis of application possibilities of Grey System Theory to detection of acoustic feedback","authors":"Maciej Sabiniok, S. Brachmański","doi":"10.23919/SPA.2018.8563432","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563432","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
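For readers unfamiliar with the grey prediction model, the textbook form of GM(1,1) can be sketched in a few lines of numpy. This is the standard formulation, not necessarily the variant used in the paper; the function name `gm11_forecast` and the test series are assumptions, and how the forecast feeds the howling detector is not shown.

```python
import numpy as np

def gm11_forecast(series, steps):
    """Forecast `steps` future values of a short positive series with GM(1,1)."""
    x0 = np.asarray(series, dtype=float)
    n = len(x0)
    x1 = np.cumsum(x0)                         # accumulated generating operation
    z1 = 0.5 * (x1[1:] + x1[:-1])              # background values (adjacent means)
    # Least-squares fit of the grey equation x0[k] + a * z1[k] = b.
    B = np.column_stack((-z1, np.ones(n - 1)))
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    # Time response of the whitened equation (assumes a != 0).
    k = np.arange(n + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.diff(x1_hat, prepend=0.0)      # inverse accumulation
    return x0_hat[n:]

# Hypothetical example: a magnitude that grows by 10 % per frame,
# as a spectral peak might during the onset of howling.
history = 2.0 * 1.1 ** np.arange(6)
forecast = gm11_forecast(history, steps=2)
print(forecast)
```

Because the model needs only a handful of past samples, predicting the next values of a growing spectral peak is a plausible way to flag feedback a few frames earlier, which appears to be the motivation stated in the abstract.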
Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563401
Krzysztof Krupa, Marcin Grochowina
Sound direction estimation can be used in many different mechatronic systems, and the use of bare-metal programmed microcontrollers allows for miniaturization and a broader range of applications. The paper presents a microprocessor implementation of a system that determines the azimuth of a sound source. The device operates by measuring the phase shift of the signal arriving at two spaced-apart microphones. The research used an algorithm that calculates the correlation of the sound signals using the FFT algorithm.
{"title":"Microprocessor implementation of the sound source location process based on the correlation of signals","authors":"Krzysztof Krupa, Marcin Grochowina","doi":"10.23919/SPA.2018.8563401","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563401","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
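The principle of estimating azimuth from FFT-based correlation of two microphone signals can be sketched on a desktop with numpy. The paper targets a bare-metal microcontroller, so this is only an illustration of the idea: the far-field geometry, the speed of sound, the sign convention and the name `estimate_azimuth` are assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def estimate_azimuth(mic_left, mic_right, sample_rate, mic_distance):
    """Azimuth in degrees; positive when the source is closer to the right microphone."""
    n = len(mic_left) + len(mic_right) - 1     # zero-pad to avoid circular wrap-around
    # Cross-correlation computed in the frequency domain.
    spectrum = np.fft.rfft(mic_left, n) * np.conj(np.fft.rfft(mic_right, n))
    correlation = np.fft.irfft(spectrum, n)
    lag = int(np.argmax(correlation))
    if lag > n // 2:                           # upper half holds negative lags
        lag -= n
    delay = lag / sample_rate                  # seconds by which the left signal lags
    # Far-field model: path difference = mic_distance * sin(azimuth).
    sine = np.clip(delay * SPEED_OF_SOUND / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(sine))

# Synthetic test: the left microphone hears the same noise 5 samples later.
rng = np.random.default_rng(0)
source = rng.standard_normal(1024)
right = source
left = np.concatenate((np.zeros(5), source[:-5]))
angle = estimate_azimuth(left, right, sample_rate=48000, mic_distance=0.2)
print(f"estimated azimuth: {angle:.1f} degrees")
```

The angular resolution of this sketch is limited to whole-sample lags; an embedded implementation with closely spaced microphones would likely need interpolation around the correlation peak or a higher sampling rate.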
Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563359
J. Rafalko
The paper presents an approach to marking the boundaries of allophones in the speech signal based on the Dynamic Time Warping (DTW) algorithm. Setting and marking allophone boundaries in continuous speech is difficult because adjacent phonemes influence each other. On the one hand, this neighbourhood creates the variants of phonemes, that is, allophones; on the other hand, it makes the border between allophones very difficult to determine in some cases. Nowadays, this task is carried out manually in cooperation with specialists in the field of phonetics. The presented approach makes it possible to build a system that automates this process. The aim of the work currently carried out by the author is a method that facilitates the processing of training material for the development of multimodal speech recognition systems. For this purpose, the difficult problem of marking allophone boundaries is solved in this report based on the Polish dictionary, in the context of creating allophone bases for speech synthesis. This choice was made because it simplifies organizing critical listening and subjective evaluation of the obtained allophones by a large group of Polish native speakers (73 people). Strengthening the method will allow it to be used to extract allophones for a system, currently under development, for automatic transcription of English speech and its notation according to the IPA standard. The analysed continuous speech is combined in the DTW algorithm with a synthesized speech signal. The two signals are compared not in the time domain, as in classical DTW, but in the frequency domain, so that it is the phonetic content of the two signals that is compared.
The paper describes the process of marking allophone boundaries for the Polish language; however, after appropriate modifications, the approach can be used to determine allophone boundaries in other languages, especially English.
{"title":"Marking the Allophones Boundaries Based on the DTW Algorithm","authors":"J. Rafalko","doi":"10.23919/SPA.2018.8563359","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563359","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
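The idea of transferring known boundaries from a synthesized reference to recorded speech through a DTW alignment can be sketched as follows. This is plain textbook DTW over per-frame feature vectors, not the author's implementation: the name `dtw_path` is hypothetical, and the one-dimensional toy features stand in for the frame spectra that a frequency-domain comparison would use.

```python
import numpy as np

def dtw_path(reference, query):
    """Align two feature sequences (frames x features); return total cost and path."""
    n, m = len(reference), len(query)
    # Pairwise Euclidean distances between all reference and query frames.
    dist = np.linalg.norm(reference[:, None, :] - query[None, :, :], axis=2)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the end to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin((cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[n, m], path[::-1]

# Toy data: the synthesized reference has a known boundary at frame 2;
# the "recorded" query says the same two sounds more slowly.
reference = np.array([[0.0], [0.0], [1.0], [1.0]])
query = np.array([[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]])
total, path = dtw_path(reference, query)
# The boundary in the query is the first query frame aligned to reference frame 2.
boundary = min(j for i, j in path if i == 2)
print(total, boundary)
```

Because the boundaries of the synthesized signal are known by construction, each one can be mapped through the warping path onto the recorded utterance, which is the step that would replace manual marking.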
Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563417
A. Konieczka, Ewelina Michałowicz, Karol Piniarski
In this paper, we propose a new thermal camera-based system for tram drivers. It aims to increase the safety of tram traffic at night. The proposed solution uses a standard vision camera and a thermal camera. First, it processes the acquired images in order to detect the tram tracks. Second, it detects people or obstacles on the tracks and generates warnings for the driver. The solution has been tested in static conditions using a standard-gauge tram. The results show that this prototype system can effectively warn of dangerous situations, especially in dark places.
{"title":"Infrared thermal camera-based system for tram drivers warning about hazardous situations","authors":"A. Konieczka, Ewelina Michałowicz, Karol Piniarski","doi":"10.23919/SPA.2018.8563417","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563417","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563381
P. Strumiłło
Visual impairment is one of the most serious sensory disabilities. It deprives a human being of an active professional and social life. EU reports indicate that 4 in every 1000 European citizens are blind or suffer from serious visual impairment, and this number is predicted to increase over time due to our ageing society. In spite of numerous worldwide research efforts focusing on building innovative aids for the blind, no single electronic travel aid (ETA) solution has been widely accepted by the blind community. The aim of the tutorial is to review the current state of the art in the field of electronic interfaces aiding the blind in independent travel, navigation and access to information. Functional solutions and outcomes of recent research projects devoted to assistive technologies for the visually impaired will be presented.
{"title":"Electronic Systems and Interfaces Aiding the Visually Impaired","authors":"P. Strumiłło","doi":"10.23919/SPA.2018.8563381","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563381","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563390
A. Platonov, I. Zaitsev
This paper presents the background of an approach to the optimization and design of software-defined adaptive feedback communication systems (AFCS) for applications at the physical (PHY) layer of wireless sensor networks. A particular feature of AFCS is that they transmit the signals from digital or analog sensors to the base stations (BS) using pulse-amplitude (PAM) modulators adaptively adjusted by controls formed in the BS, without coding. The absence of coders makes it possible to derive optimal transmission-reception algorithms that determine how an optimal AFCS should be designed. The adaptive properties of the systems make it possible to transmit data to the BS perfectly, i.e. with energy and spectral efficiencies attaining Shannon's limits. Adequate measures of AFCS performance that have not been used before are discussed and used to investigate the functioning of a designed prototype of an optimal AFCS. Optimal AFCS may become a promising class of highly efficient narrowband low-energy communication channels for wireless sensor networks.
{"title":"Perfect Low Power Narrowband Transmitters for Dense Wireless Sensor Networks","authors":"A. Platonov, I. Zaitsev","doi":"10.23919/SPA.2018.8563390","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563390","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
Pub Date : 2018-09-01 DOI: 10.23919/SPA.2018.8563341
A. Ikuta, H. Orimoto
In the measurement and evaluation of an actual random signal in a sound environment, the observed data often contain fuzziness due to several causes. Furthermore, background noise usually exists in addition to the specific signal of interest, and the specific signal is often partly or completely buried in the background noise. In this paper, a fuzzy Bayesian filter is proposed for estimating a specific signal based on observed data that contain fuzziness and the effects of non-Gaussian background noise. More specifically, a state estimation method is theoretically derived by focusing on energy variables, which satisfy the additive property for the specific signal and background noise, and by introducing a new type of membership function suitable for the energy variable and for observation on the decibel scale. The proposed theory is applied to an actual estimation problem in a sound environment, and its usefulness is experimentally verified.
{"title":"Fuzzy Bayesian Filter for Sound Environment by Considering Additive Property of Energy Variable and Fuzzy Observation in Decibel Scale","authors":"A. Ikuta, H. Orimoto","doi":"10.23919/SPA.2018.8563341","DOIUrl":"https://doi.org/10.23919/SPA.2018.8563341","journal":{"name":"2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)"},"publicationDate":"2018-09-01"}
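The additive property the abstract relies on holds for energies, not for decibel levels, which is why the conversion between the two scales matters. The sketch below shows only that relationship; it is not the fuzzy Bayesian filter itself, and all function names are hypothetical.

```python
import numpy as np

def level_to_energy(level_db):
    """Convert a sound level in dB to a relative energy."""
    return 10.0 ** (np.asarray(level_db, dtype=float) / 10.0)

def energy_to_level(energy):
    """Convert a relative energy back to a level in dB."""
    return 10.0 * np.log10(energy)

def combined_level(signal_db, noise_db):
    """Level observed when signal and background noise energies add."""
    return energy_to_level(level_to_energy(signal_db) + level_to_energy(noise_db))

def signal_level_from_observation(observed_db, noise_db):
    """Invert the sum; only meaningful when the observation exceeds the noise."""
    return energy_to_level(level_to_energy(observed_db) - level_to_energy(noise_db))

observed = combined_level(60.0, 60.0)      # two equal sources: about 3 dB above either
recovered = signal_level_from_observation(observed, 60.0)
print(f"observed {observed:.2f} dB, recovered signal {recovered:.2f} dB")
```

When the signal is buried in the noise and the observation is fuzzy, this direct inversion becomes ill-conditioned, which gives some intuition for why the paper turns to a probabilistic estimator instead.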