A Real-Time Text Analysis System
Chi Mai Nguyen, Phat Thai, Duy Khang Lam, Van Tuan Nguyen
2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), published 2023-06-01
DOI: https://doi.org/10.1109/COMPSAC57700.2023.00053
Abstract
We live in an age of information overload. Manual information processing is increasingly overwhelmed by the enormous amount of information generated by the explosive growth of news portals and online social networks. This situation calls for an automatic system that supports handling, analyzing, and filtering information, especially information from online sources. In this work, we propose a text analysis system that automatically collects, extracts, and analyzes information from public-source text documents, such as those published on news portals and social media networks. The proposed system handles both long and short text documents. It also has real-time features and is not restricted to any input data domain, so it can be used in areas such as scientific research, marketing, and security. Moreover, the system is modular and flexible: each module is an independent microservice that can also be used as a separate standalone application. The system is also extensible, since new modules can be added easily.
Index Terms: text analysis system, data mining, natural language processing
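
The modular design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify module interfaces, so the `Document` shape, the `Pipeline` class, and the `tokenize` and `count_tokens` modules are all hypothetical. In-process functions stand in here for what the paper describes as independent microservices.

```python
from typing import Any, Callable, Dict, List

# Hypothetical contract (an assumption, not taken from the paper): every
# module takes a document dict and returns an enriched copy of it.
Document = Dict[str, Any]
Module = Callable[[Document], Document]


def tokenize(doc: Document) -> Document:
    """Standalone module: split the raw text into lowercase tokens."""
    tokens = str(doc["text"]).lower().split()
    return {**doc, "tokens": tokens}


def count_tokens(doc: Document) -> Document:
    """Standalone module: attach a token count (expects a 'tokens' field)."""
    return {**doc, "n_tokens": len(doc["tokens"])}


class Pipeline:
    """Registry of named modules; new ones are added without touching old ones."""

    def __init__(self) -> None:
        self._modules: Dict[str, Module] = {}

    def register(self, name: str, module: Module) -> None:
        self._modules[name] = module

    def run(self, doc: Document, order: List[str]) -> Document:
        # Apply the requested modules in sequence, passing the document along.
        for name in order:
            doc = self._modules[name](doc)
        return doc


pipeline = Pipeline()
pipeline.register("tokenize", tokenize)
pipeline.register("count", count_tokens)

result = pipeline.run({"text": "Real-time text analysis"}, ["tokenize", "count"])
print(result["tokens"], result["n_tokens"])
```

Each function can be called on its own, which mirrors the claim that a module works as a standalone application, and registering another function mirrors extensibility. In a deployed system the calls inside `run` would presumably be network requests to separate services.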