Low Cost Embedded Multimodal Opto-Inertial Human Motion Tracking System
Published in: 2020 31st Irish Signals and Systems Conference (ISSC)
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180190
Mariusz P. Wilk, M. Walsh, B. O’Flynn
Human motion tracking systems are widely used in application areas such as motion capture, rehabilitation and sports. A number of such systems exist in the state of the art (SOA), varying in price, complexity, accuracy and target application. With continued advances in system integration and miniaturization, wearable motion trackers are gaining popularity in the research community. Opto-inertial trackers with multimodal sensor fusion algorithms are among the common approaches found in the SOA. However, these trackers tend to be expensive and have high computational requirements. In this work, we present a prototype version of our opto-inertial motion tracking system, which offers a low-cost alternative. The 3D position and orientation are determined by fusing optical and inertial sensor data with knowledge about two external reference points, using a purpose-designed data fusion algorithm. An experimental validation was carried out on one of the use cases that this system is intended for, i.e. the barbell squat in strength training. The results showed that the total RMSE in position and orientation was 32.8 mm and 0.89 degrees, respectively. The system operated in real time at 20 frames per second.
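The abstract above reports a total position RMSE without stating how it was computed. As a minimal sketch, assuming "total RMSE" means the root-mean-square of the Euclidean distance between estimated and reference 3D positions (the paper may define it differently), the calculation could look like this. The function name `total_rmse` and the sample points are hypothetical and are not taken from the paper.

```python
import math

def total_rmse(estimates, references):
    """Root-mean-square of the Euclidean error between paired 3D points.

    Assumption: this is one common definition of a 'total' RMSE;
    the paper's exact definition is not given in the abstract.
    """
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [
        sum((e - r) ** 2 for e, r in zip(est, ref))
        for est, ref in zip(estimates, references)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical tracker output and reference positions, in millimetres.
estimated = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 20.0, 0.0)]
reference = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (10.0, 20.0, 12.0)]

print(f"total RMSE: {total_rmse(estimated, reference):.2f} mm")
```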
Digital Image Exchange using a No-key(s) Protocol with Phase-only Encryption
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180215
J. Blackledge, N. Mosola
This paper considers an algorithm for transferring a digital image over an open network using a No-key(s) Protocol, or Three-Way Pass, and phase-only encryption/decryption. After a short study of the theoretical background to the method, the algorithm is presented step by step. Cryptanalysis is undertaken for the three-intercept and single-intercept cases, in which the encrypted data is assumed to be intercepted in its entirety for every pass or for any single pass, respectively. The algorithm focuses on the exchange of a JPEG image although, in principle, the approach is independent of the image file format used. Prototype MATLAB functions are provided for validation of the approach and for further development by interested readers.
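The three-pass idea in the abstract above relies on the two parties' encryption operations commuting. The following is a minimal sketch of that principle only, not the authors' MATLAB algorithm. It assumes phase-only encryption means multiplying each spectral coefficient by a unit-magnitude factor exp(iθ). The helper names, the 8-sample "image row" and the seeded phases are hypothetical, and `random.Random` is not a cryptographically secure generator.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for a short demo."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out

def random_phases(n, seed):
    """Private phase mask; seeded only so the demo is repeatable."""
    rng = random.Random(seed)
    return [rng.uniform(-cmath.pi, cmath.pi) for _ in range(n)]

def apply_phase(spectrum, phases, sign=1):
    """Multiply each coefficient by exp(sign * i * phase); magnitudes are unchanged."""
    return [s * cmath.exp(sign * 1j * p) for s, p in zip(spectrum, phases)]

pixels = [12, 200, 35, 90, 255, 0, 17, 64]    # hypothetical image row
theta_a = random_phases(len(pixels), seed=1)  # Alice's private phases
theta_b = random_phases(len(pixels), seed=2)  # Bob's private phases

pass1 = apply_phase(dft(pixels), theta_a)     # Alice -> Bob: Alice's lock on
pass2 = apply_phase(pass1, theta_b)           # Bob -> Alice: both locks on
pass3 = apply_phase(pass2, theta_a, sign=-1)  # Alice -> Bob: Alice's lock off
recovered = dft(apply_phase(pass3, theta_b, sign=-1), inverse=True)
decoded = [round(v.real) for v in recovered]

print(decoded == pixels)
```

No key is ever exchanged: each party only applies and removes its own phase mask. In this simplified form, an eavesdropper who captures all three passes can combine them coefficient by coefficient (pass1 × pass3 / pass2, wherever pass2 is non-zero) to recover the spectrum, which illustrates why the three-intercept case deserves the cryptanalysis the abstract mentions.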
Time-Domain Multiply-Accumulator using Digital-to-Time Multiplier for CNN Processors in 28-nm CMOS
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180202
Xutong Wu, T. Siriburanon, R. Staszewski
This paper proposes a time-domain multiply-accumulator (MAC) circuit, designed in 28-nm CMOS, that uses a digital-to-time multiplier (DTM) unit to perform convolution operations. As the foundational calculation of state-of-the-art convolutional neural network (CNN) models, the convolution operation is typically executed millions of times in one CNN task. The proposed circuit is designed to support this large computational requirement during CNN computations. It operates in the time domain with 6-bit resolution (1 sign bit) and performs calculations based on corresponding time delays. Compared with other analog-domain proposals, time-domain designs perform better, with higher operating frequencies of up to 50 MHz. In schematic simulations, the proposed DTM unit, operating with 5 bits, achieves an ideal throughput of 0.1 GOPS and consumes 74.79 µW at a 1.0 V supply.
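The throughput and power figures in the abstract above imply an energy cost per operation. The short calculation below derives it from the quoted numbers only. Whether the DTM unit itself runs at the quoted 50 MHz is an assumption made here for the operations-per-cycle line, and the variable names are illustrative.

```python
# Figures quoted in the abstract (schematic simulation, ideal throughput).
THROUGHPUT_OPS = 0.1e9  # 0.1 GOPS
POWER_W = 74.79e-6      # 74.79 µW at a 1.0 V supply
CLOCK_HZ = 50e6         # 50 MHz; assumed here to be the DTM clock rate

energy_per_op_j = POWER_W / THROUGHPUT_OPS             # joules per operation
efficiency_tops_per_w = THROUGHPUT_OPS / POWER_W / 1e12
ops_per_cycle = THROUGHPUT_OPS / CLOCK_HZ              # valid only under the clock assumption

print(f"energy per operation: {energy_per_op_j * 1e12:.4f} pJ")
print(f"efficiency: {efficiency_tops_per_w:.3f} TOPS/W")
print(f"operations per cycle: {ops_per_cycle:.1f}")
```

Under the clock assumption, two operations per cycle would be consistent with counting one multiply and one accumulate per MAC, although the abstract does not state how operations are counted.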
Detection of ringforts from aerial photography using machine learning
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180159
Keith Phelan, D. Riordan
Ringforts are among the most numerous field monuments in Ireland, with approximately 45,000 examples surviving to date. Their distribution and dispersal patterns are key to our understanding of the habitation patterns of our ancestors. Owing to the nature of these structures and the construction materials used, centuries of abandonment mean that they often go unnoticed at ground level while being easily identified from an aerial perspective. Growing demand for land for urban development, infrastructure and industrialised farming means that these monuments are under threat. Recent developments in machine learning, coupled with access to high-resolution multi-spectral satellite imagery from Open Data sources, present the opportunity to investigate a system for the automated detection of these features. If successful, such a system could provide an automated, efficient and cost-effective tool for detecting interference with or destruction of known sites, as well as for discovering new ones.
Open Source Power Quality Meter with cloud monitoring
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180176
Cathal Ferry, J. Connolly
Energy saving and energy conservation are fast becoming key principles in the construction of modern data centres and IT infrastructure. This applies to large-scale deployments as well as to small and intermediate-scale sites. Data centres consume large quantities of energy and contribute to carbon dioxide (CO2) emissions. Reducing CO2 output through methods such as sustainable power generation and better energy efficiency can help mitigate the effects of global warming. This paper proposes methods of saving energy in IT equipment by monitoring key power statistics, such as power factor, to determine how efficiently network equipment uses power. This is achieved using an open-source power factor meter that is both low cost and accurate. The meter measures power factor as well as true power, apparent power, reactive power, mains voltage, current and mains frequency to determine the energy efficiency of the installation or equipment. Readings are taken using three primary sensors: a current transformer, a voltage transformer and a mains frequency sensor. The system is designed for single-phase systems and incorporates a local HMI and a cloud-based CMS. All of the software and hardware elements used are open source and therefore low cost.
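The quantities listed in the abstract above are related by standard single-phase definitions. The sketch below computes them from sampled voltage and current waveforms. It is an illustration of those textbook relationships, not the meter's firmware. The function name `power_metrics` and the synthetic 230 V, 2 A load are hypothetical, and deriving reactive power as sqrt(S² − P²) assumes sinusoidal waveforms without harmonics.

```python
import math

def power_metrics(voltage, current):
    """Return RMS values, true/apparent/reactive power and power factor
    from equally spaced samples covering a whole number of mains cycles."""
    n = len(voltage)
    if n == 0 or n != len(current):
        raise ValueError("need two non-empty sample lists of equal length")
    v_rms = math.sqrt(sum(v * v for v in voltage) / n)
    i_rms = math.sqrt(sum(i * i for i in current) / n)
    true_power = sum(v * i for v, i in zip(voltage, current)) / n  # watts
    apparent_power = v_rms * i_rms                                 # volt-amperes
    # Assumes sinusoidal waveforms; clamp guards against rounding below zero.
    reactive_power = math.sqrt(max(apparent_power**2 - true_power**2, 0.0))
    power_factor = true_power / apparent_power if apparent_power else 0.0
    return {
        "v_rms": v_rms,
        "i_rms": i_rms,
        "true_power": true_power,
        "apparent_power": apparent_power,
        "reactive_power": reactive_power,
        "power_factor": power_factor,
    }

# Synthetic one-cycle capture: 230 V RMS, 2 A RMS, current lagging by 60 degrees.
SAMPLES = 1000
angles = [2 * math.pi * k / SAMPLES for k in range(SAMPLES)]
volts = [230 * math.sqrt(2) * math.sin(a) for a in angles]
amps = [2 * math.sqrt(2) * math.sin(a - math.pi / 3) for a in angles]

metrics = power_metrics(volts, amps)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A power factor well below 1, as in this synthetic load, means the equipment draws more current than its useful power requires, which is the inefficiency the monitoring approach aims to expose.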
On-demand updates after a node failure in a wireless network
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180207
Davide Villa, Chih-kuang Lin, Adam Kuenzi, Michael Lang
Wireless networks are ubiquitous in our modern world, and we rely more and more on their continuous and reliable operation for battery-powered devices. Networks that self-maintain and self-heal are inherently more reliable. We study efficient and effective network self-healing and update methods for routing recovery following routing failures in a wireless multi-hop network. Network update processes are important because they enable local nodes to maintain up-to-date neighbor information for routing as the network changes due to failures. Network updates also introduce control-signal overhead. In this paper, we investigate the trade-off between routing performance and overhead cost for different network update algorithms, and we characterize the performance of the proposed algorithms using network simulations. We show that network updates have a positive impact on routing. In particular, the on-demand route update method provides the best results among the compared techniques. The improvement varies with the network topology and the failure scenario.
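The abstract above does not describe the update algorithms in detail. As a generic illustration of the on-demand idea only, the sketch below keeps using a cached route until it is found to cross a failed node, and only then recomputes a path. The five-node topology, the function names and the use of breadth-first search are assumptions for illustration and are not the authors' protocol.

```python
from collections import deque

def shortest_path(links, src, dst, failed=frozenset()):
    """Breadth-first search for a fewest-hop path that avoids failed nodes."""
    if src in failed or dst in failed:
        return None
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links[node]:
            if neighbor not in failed and neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

def route_on_demand(links, cached_route, failed, src, dst):
    """Reuse the cached route unless it crosses a failed node.

    Returns (route, recomputed); recomputing stands in for the control
    overhead that an update would cost in a real network.
    """
    if cached_route and not failed.intersection(cached_route):
        return cached_route, False
    return shortest_path(links, src, dst, failed), True

# Hypothetical multi-hop topology: A-B-D is the short route, A-C-E-D the detour.
links = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}

route, _ = route_on_demand(links, None, set(), "A", "D")
print(route)
kept, recomputed = route_on_demand(links, route, {"E"}, "A", "D")
print(kept, recomputed)
repaired, recomputed_after_failure = route_on_demand(links, route, {"B"}, "A", "D")
print(repaired, recomputed_after_failure)
```

A failure of node E leaves the cached route untouched and costs no update, whereas a failure of node B triggers a recomputation. That asymmetry is the trade-off between routing performance and overhead that the abstract refers to.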
Flexible Manufacturing using Automated Material Handling and Autonomous Intelligent Vehicles
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180172
Con Cronin, A. Conway, Anshul Awasthi, Joseph Walsh
This paper details a proof-of-concept material handling system integrated with a small-scale, self-navigating autonomous intelligent vehicle (AIV). The prototype can navigate autonomously from one manufacturing cell to another. It loads and unloads material from one fixed conveyor to another in an environment that includes confined and populated passageways. The paper also advances the concept of the material handling system communicating directly with the production equipment. Successful development of a material handling system with an AIV promises to increase productivity in manufacturing while securing jobs in a competitive market. AIVs promote flexibility on the factory floor and advance the realisation of Industry 4.0.
Implementing wearable sensor technology for the determination of a biomarker profile for cancer-related fatigue
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180194
N. Akhtar, M. Kelly, William N. Scott, J. Connolly
Cancer-related fatigue (CRF) is a well-recognised symptom of malignant breast disease and may affect up to 70% of those undergoing therapy or deemed to be in remission. The condition is frequently subject to unpredictable recurrence that can result in unavoidable and unforeseen detriment to quality of life. Moreover, management of the condition can place a significant financial burden on health and social care facilities. CRF is distinct from normal tiredness, which may be resolved by periods of sleep or rest. Consumers' extensive use of wearable technologies has contributed to the evolution of clinical trial procedures and, as a result, health data can also be obtained using wearables [1]. New technologies have the potential to improve data accuracy and timeliness, improve efficiency and increase patient engagement in the clinical trial process. Medical-quality tracking devices are already supporting patient care in several clinical areas [1]. The main aim of this study is to define an accurate fatigue baseline for individuals diagnosed with breast cancer in order to determine potential relationships between possible fatigue markers, measurable daily activity and individual perceptions of fatigue.
Derivation of E-model Equipment Impairment Factors for Narrowband and Wideband Opus Codec Using the Instrumental Method
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180160
Mohannad Al-Ahmadi, P. Počta, H. Melvin
Real-time multimedia applications such as Web Real-Time Communication (WebRTC) support a wide range of codecs, from standard narrowband up to fullband codecs. The IETF-standardized Opus codec, which supports a wide range of bitrates, is the default codec used by WebRTC speech and audio applications. In current best-effort networks, impairments such as packet loss, delay and jitter affect the quality of VoIP. The E-model, standardized in ITU-T Rec. G.107, can be used to assess the impact of such impairments and so estimate the quality experienced by the end users of speech applications. In this paper we derive the codec-specific parameters required by the E-model to estimate the quality degradation in speech applications deploying the narrowband and wideband Opus codec, namely the equipment impairment factor Ie and the packet loss robustness factor Bpl. We followed the ITU-T methods designed for this purpose and share the results of all the experiments, covering all the narrowband and wideband Opus codec conditions. The derived values make it possible to integrate the E-model into real-time communication applications, including WebRTC, to assess the quality experienced by the end user.
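To show where Ie and Bpl enter the E-model, the sketch below applies the narrowband ITU-T G.107 relations: the effective equipment impairment under packet loss, a simplified rating in which every other parameter keeps its default value (giving a base rating of 93.2), and the conversion from rating to estimated MOS. The Ie and Bpl values used are hypothetical placeholders, not the values derived in the paper, and the wideband case follows ITU-T G.107.1, which is not covered here.

```python
def effective_impairment(ie, bpl, ppl, burst_r=1.0):
    """Ie,eff = Ie + (95 - Ie) * Ppl / (Ppl / BurstR + Bpl), with Ppl in percent."""
    return ie + (95.0 - ie) * ppl / (ppl / burst_r + bpl)

def r_factor(ie_eff, base_rating=93.2, delay_impairment=0.0):
    """Simplified rating: all other E-model parameters assumed at G.107 defaults."""
    return base_rating - delay_impairment - ie_eff

def r_to_mos(r):
    """G.107 mapping from transmission rating R to estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6

# Hypothetical codec parameters, for illustration only.
IE_EXAMPLE = 10.0
BPL_EXAMPLE = 20.0

for loss_percent in (0.0, 2.0, 5.0, 10.0):
    ie_eff = effective_impairment(IE_EXAMPLE, BPL_EXAMPLE, loss_percent)
    mos = r_to_mos(r_factor(ie_eff))
    print(f"loss {loss_percent:4.1f}%  Ie,eff {ie_eff:6.2f}  MOS {mos:.2f}")
```

A larger Bpl makes the codec more robust: the same packet loss adds less impairment, so the estimated MOS falls more slowly as loss increases.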
Gender Classification using Twitter Text Data
Pub Date: 2020-06-01 DOI: 10.1109/ISSC49989.2020.9180161
Pradeep Vashisth, Kevin Meehan
Content-sharing websites such as social media have become increasingly popular in many countries across the world. Classifying the gender of a person based on these short messages is an interesting research area that could benefit legal investigation, forensics, marketing analysis, advertising and recommendation. This research explores the use of Natural Language Processing (NLP) techniques and tweets in a gender classification system. The investigation compares multiple techniques, such as Bag of Words (Term Frequency - Inverse Document Frequency), Word Embedding (W2Vec, GloVe) and traditional Machine Learning techniques (Logistic Regression, Support Vector Machine and Naïve Bayes), in this context. A new dataset comprising user gender and associated tweets was generated for this study, because no public standard dataset with the required volume was available. The results show that the traditional Bag of Words model did not provide any significant results in classification. However, word embedding models performed significantly better across multiple machine learning techniques. Therefore, word embedding models proved to be the most effective technique for classifying gender based on Twitter text data.
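For readers unfamiliar with the Bag of Words baseline named in the abstract above, the sketch below computes plain TF-IDF weights for a few short messages. It uses the unsmoothed textbook weighting (term frequency times ln(N / document frequency)), whereas libraries often apply smoothing. The function name `tfidf` and the three example messages are hypothetical and are not drawn from the study's dataset.

```python
import math
from collections import Counter

def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * ln(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    total_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(total_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical short messages standing in for tweets.
messages = ["great match today", "great coffee today", "new phone review"]
vectors = tfidf(messages)

for message, vector in zip(messages, vectors):
    ranked = sorted(vector.items(), key=lambda item: item[1], reverse=True)
    print(message, "->", [(term, round(weight, 3)) for term, weight in ranked])
```

Terms shared by several messages receive lower weights than terms unique to one message. Each message becomes a sparse vector with no notion of word similarity, which is one plausible reason an embedding-based representation could separate classes better on short texts.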