2022 IEEE 2nd Conference on Information Technology and Data Science (CITDS): Latest Papers

Neural Network Based Comparison of Real and Synthetic Data Series in TeraHertz Domain
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914076
Yousif Mudhafar, Djamila Talbi, Zoltán Gál
Extending real data with synthetic data is becoming an increasingly important aspect of virtualization techniques. In this paper we demonstrate how synthetic data generated from real data can be used in the supervised classification process of three different recurrent neural networks: Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM) and Gated Recurrent Unit (GRU). We also examine the influence of noise on the classification of real and synthetic data series. The paper demonstrates that the LSTM network has better classification performance than GRU, even though the latter has higher accuracy during training. Synthetic data can preserve only part of the features of the original real data, and the extraction efficiency of these characteristics depends on the applied neural network.
Citations: 0
Auditory and Haptic Solutions for Access and Feedback in Internet of Digital Reality Applications
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914161
G. Wersényi, Á. Csapó
The concept of the Internet of Digital Reality (IoD) was introduced as the next-level organization of cognitive entities, following the concepts of the Internet of Things (IoT) and the Internet of Everything (IoE). As virtual-immersive environments are a fundamental component of IoD, allowing human and non-human entities to interact in real time, the ability to use a wide range of communication modalities is crucial. This paper briefly presents the concept of IoD together with an overview of various I/O solutions for human users, with a focus on research directions and (re)emerging technologies in the near future.
Citations: 0
M-tree index for music search based on similarity of cosine contours and tags
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914099
G. Gombos, Zsolt Zoltán Sajti, J. Szalai-Gindl
The similarity between songs is an important part of Music Information Retrieval (MIR), but the definition of similarity is very subjective. Similarity can be used, for example, in recommendation systems. These systems recommend similar songs from a database based on the user's history. To find similar songs quickly, we have to index the data. Most database indexes are created for exact item searches. GiST (Generalized Search Tree) makes it possible to create an index with a distance function between items. These distances can be used as similarity measures. In this paper, we show how music similarity can be used as the distance in an M-tree, which is a distance-based index. Two similarity metrics are used to create an index of music data: song tags and cosine contour.
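The abstract above does not define the two distances, so the following is only a minimal sketch under stated assumptions: tags are compared with the Jaccard distance, each contour is assumed to be stored as a fixed-length coefficient vector compared with the Euclidean distance, and the two are combined with a hypothetical `tag_weight`. The `range_query` helper is not an M-tree; it only illustrates the triangle-inequality pruning that distance-based indexes rely on, using a single hypothetical pivot.

```python
from math import sqrt


def jaccard_distance(tags_a, tags_b):
    """Return 1 - |A & B| / |A | B|, a metric on tag sets."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def contour_distance(coeffs_a, coeffs_b):
    """Euclidean distance between two equal-length coefficient vectors."""
    if len(coeffs_a) != len(coeffs_b):
        raise ValueError("contour vectors must have the same length")
    return sqrt(sum((x - y) ** 2 for x, y in zip(coeffs_a, coeffs_b)))


def song_distance(song_a, song_b, tag_weight=0.5):
    """Weighted sum of the two distances (assumed combination, not the paper's)."""
    d_tags = jaccard_distance(song_a["tags"], song_b["tags"])
    d_contour = contour_distance(song_a["contour"], song_b["contour"])
    return tag_weight * d_tags + (1.0 - tag_weight) * d_contour


def range_query(query, songs, pivot, radius, dist=song_distance):
    """Return titles within `radius` of `query`, pruning with one pivot."""
    d_query_pivot = dist(query, pivot)
    hits = []
    for song in songs:
        # A real index would store this distance instead of recomputing it.
        d_song_pivot = dist(song, pivot)
        # Triangle inequality: |d(q,p) - d(o,p)| <= d(q,o), so skip if too far.
        if abs(d_query_pivot - d_song_pivot) > radius:
            continue
        if dist(query, song) <= radius:
            hits.append(song["title"])
    return hits
```

Because a non-negative weighted sum of two metrics still satisfies the triangle inequality, the pruning step never discards a true match.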
Citations: 0
CatMat: 3D Object Recognition Using Catenarian Matching
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914341
Máté Michelisz, D. Varga, J. Szalai-Gindl
Object recognition in 3D point clouds is an important and widely researched topic. We propose a novel method based on local point descriptors. We detect edge points on the scene and object clouds, and construct a weighted edge graph on the object clouds. We find point chains on the objects based on the constructed graph, and seek similar point chains on the scene cloud using local descriptor matching and geometric constraints. We estimate transformations using corresponding point chains, and validate the transformations with a voxel-based method. Our method is capable of multi-instance object recognition. In this paper we present our method and compare it with a similar solution. Based on our evaluation, the proposed method is able to find various objects in scene clouds and is robust to noise.
Citations: 0
Multimodal E-Commerce Product Classification Using Hierarchical Fusion
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914136
Tsegaye Misikir Tashu, Sara Fattouh, Peter Kiss, Tomáš Horváth
In this work, we present a multimodal model for commercial product classification that combines features extracted by multiple neural network models from textual data (CamemBERT and FlauBERT) and visual data (SE-ResNeXt-50), using simple fusion techniques. The proposed method significantly outperformed the unimodal models, as well as the reported performance of similar models on our specific task. We experimented with multiple fusion techniques and found that the best-performing technique for combining the individual embeddings of the unimodal networks is based on a combination of concatenating and averaging the feature vectors. Each modality complemented the shortcomings of the other modalities, demonstrating that increasing the number of modalities can be an effective method for improving the performance of multi-label and multimodal classification problems.
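The abstract above names concatenation and averaging but does not say in which order or over which vectors they are applied. The sketch below shows one plausible arrangement, stated here as an assumption: the text embeddings (assumed to share a dimension) are averaged element-wise, and the result is concatenated with the image embedding. The function names are hypothetical.

```python
def average(vectors):
    """Element-wise mean of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def fuse(text_embeddings, image_embedding):
    """Average the text embeddings, then concatenate the image embedding.

    This ordering is an illustrative assumption, not the paper's stated scheme.
    """
    return average(text_embeddings) + list(image_embedding)
```

For example, `fuse([[1.0, 3.0], [3.0, 5.0]], [0.5])` yields a three-element vector: the two averaged text features followed by the single image feature. The fused vector would then be passed to a classifier head.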
Citations: 0
Estimating road traffic flows in macroscopic Markov model
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914332
P. Jeszenszky, Renátó Besenczi, M. Szabó, M. Ispány
Traffic flows are gaining more and more attention in transportation engineering. One possible means of understanding the traffic flow in a city is to gather sequences of traffic position data, called link flows, measured by vehicle-mounted sensors, which are increasingly available to municipalities from various providers. Link flows can be used for planning operation and maintenance, and for forecasting future traffic events. In this paper, we investigate how the microscopic Markov traffic model can be used to predict traffic congestion on the roads between different nodes or regions of a city. The proposed model is evaluated in a numerical study using real traffic data recorded in the city of Porto. The results show that the model developed for simulation is of limited use for predicting the traffic between different areas of a city.
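The abstract above does not give the model's equations. As a minimal sketch of the general idea, assuming trips are recorded as sequences of region identifiers (a hypothetical data layout), a first-order Markov chain can be estimated by counting observed transitions and then used to propagate a distribution of vehicles one step forward.

```python
from collections import defaultdict


def estimate_transitions(trips):
    """Estimate P(next region | current region) from observed trip sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for trip in trips:
        for src, dst in zip(trip, trip[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: c / total for dst, c in row.items()}
    return probs


def step(distribution, probs):
    """Propagate a distribution over regions by one transition.

    Mass in regions with no observed outgoing transition is dropped.
    """
    nxt = defaultdict(float)
    for src, mass in distribution.items():
        for dst, p in probs.get(src, {}).items():
            nxt[dst] += mass * p
    return dict(nxt)
```

Applying `step` repeatedly gives a multi-step forecast of where traffic is expected to be, which is the kind of prediction the abstract reports to be of limited accuracy.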
Citations: 0
A Comparison of HOG-SVM and SIFT-SVM Techniques for Identifying Brown Planthoppers in Rice Fields
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914061
Christopher G. Harris, I. Andika, Y. Trisyono
Brown planthoppers (BPH) are insect pests that cause significant damage to rice crop yields throughout the Asia-Pacific region. Early identification of BPH forms has ramifications for forecasting potential outbreaks. To address this, we use AdaBoost and Haar features to discover areas of interest in images of rice plants. We then compare two separate techniques for identifying BPH in images: one that utilizes HOG descriptors and another that utilizes SIFT feature descriptors. With each of these techniques, we apply a Support Vector Machine (SVM) to classify the areas of interest in the images. Our approach achieves a weighted average classification rate of 95.38% for HOG and 96.38% for SIFT, improving upon state-of-the-art BPH detection methods, and our findings lay the groundwork for other insect pest identification and detection efforts.
Citations: 0
Machine Learning Techniques Applied To Bangla Crime News Classification
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914240
Nusrat Islam, Rokeya Siddiqua, S. Momen
The methodical approach to crime detection, crime pattern classification and crime tendency estimation is called crime analysis and prediction. Crime is naturally unpredictable and socially disruptive. With the increase in the population of Bangladesh, the tendency toward crime is also increasing, which is harming our society in various ways. Therefore, crime data analysis has become essential in order to predict future crime types. In our research paper, six types of machine learning algorithms were used to classify crime news. Crime news was fetched from online Bangla newspapers and TV channels using Web Scraper. To extract the features (important words), two types of feature extractors were used, CountVectorizer and TfidfVectorizer, where CountVectorizer was from a well-known Python pre-trained package named BnVec. Accuracies of 87.69% and 86.09% were obtained from the Logistic Regression and SVM models respectively. In addition, Logistic Regression produced fewer false negatives, with 86.65% recall and an 86.58% F1-score. This research has the potential to be used to prevent crime and to apprehend, investigate and prosecute criminals.
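The abstract above links recall to the number of false negatives. As a small illustration of that relationship, the sketch below computes precision, recall and F1-score from raw counts for a single class. This is the standard binary definition; how the paper averages these scores across its crime categories is not stated, so the per-class view here is an assumption.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall falls as false negatives rise: recall = tp / (tp + fn).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With a fixed number of true positives, every additional false negative lowers recall, which is why a model with fewer false negatives reports a higher recall.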
Citations: 0
Clustering-based customer representation learning from dynamic transactional data
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914293
Gleb Glukhov, Klavdiya Olegovna Bochenina
We propose a new clustering-based method for extracting customer feature vectors from the history of their financial transactions. Customer vector representations can be used to solve downstream tasks, such as customer segmentation or next purchase category prediction. The main advantage of the proposed method is that the obtained feature vectors can be interpreted in terms of temporal activity while preserving sufficient quality for solving downstream tasks. Using this method, we were able to extract well-interpreted customer segments (using debit card transaction data from a large Russian bank) which are useful for various business cases (e.g., planning of marketing campaigns or customized recommendations of financial products). This interpretation helps with analyzing typical customer behavior and its causes. In addition, we demonstrate that, on several downstream tasks (customer purchase category forecasting, missing category prediction, and campaign targeting), our method of constructing embeddings provides quality comparable to that of non-interpretable approaches such as word2vec and autoencoders.
Citations: 0
Leveraging Fog Computing for Geographically Distributed Smart Cities
Pub Date : 2022-05-16 DOI: 10.1109/CITDS54976.2022.9914276
Rasha S. Gargees
Recently, the emergence of smart cities (SC), where data streams come from various geographically distributed places, has posed new challenges. Cloud computing provides excellent services for smart cities, such as powerful computation and storage. However, processing geographically distributed data using only cloud computing is not an ideal solution in some cases. Additionally, moving all the large raw data to the remote cloud is another challenge for cloud computing, since it incurs delay and high bandwidth consumption. A solution that allows fog-to-cloud or fog-to-fog communication can address these limitations, as fogs are typically located near the data sources. However, questions related to efficient framework design, workload distribution, cost, and various key technologies and communication challenges remain. To this end, this research investigates through experiments the impact of fog, employing our proposed architecture, on the efficient utilization and management of resources in highly distributed systems. The comparison showed that fog computing reduces the cost in terms of time and resource utilization. Additionally, the collaboration of autonomous agents locally (within one fog) or globally (across multiple fogs and the cloud) supports scalability and automation. It also facilitates large-scale data processing across various real-world distributed locations.
Citations: 0