
Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis: Latest Articles

A New Term Weight Scheme and Ensemble Technique for Authorship Identification
Hanan Alshaher, Jinsheng Xu
A few previous studies on authorship identification have applied term weighting to features. The present study introduced a new term weight scheme, called 1/sigma, that rescales the values of a feature set to a mean of zero and a standard deviation of one. In other words, the 1/sigma scheme standardizes the values of a feature set. Three experiments showed the robustness of the proposed term weight scheme from different perspectives. These experiments showed that the proposed term weight scheme worked perfectly with different feature sets and classifiers in comparison to two popular term weight schemes: TF and TF-IDF. Furthermore, 1/sigma was shown to work successfully with the following different types of datasets: literary texts (fiction) and online messages (blogs, emails, and tweets). Although these experiments did not directly examine the effects of the numbers of documents and authors, the results indicated that these factors did not have any effects because the numbers of documents and authors vary from dataset to dataset.
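The standardization described in this abstract can be illustrated with a minimal sketch. It assumes a small matrix of raw term counts (rows are documents, columns are features) and uses the population standard deviation; the function name `standardize_columns` and the toy data are hypothetical, and the paper's exact estimator is not given in the abstract.

```python
from statistics import mean, pstdev


def standardize_columns(rows):
    """Rescale each feature (column) to mean 0 and standard deviation 1.

    Assumption: population standard deviation is used; a constant
    column is mapped to zeros to avoid division by zero.
    """
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        mu = mean(col)
        sigma = pstdev(col)
        if sigma == 0:
            scaled_columns.append([0.0] * len(col))
        else:
            scaled_columns.append([(x - mu) / sigma for x in col])
    # Transpose back so that each row is a document again.
    return [list(row) for row in zip(*scaled_columns)]


# Toy term-frequency matrix: 3 documents, 3 features (illustrative values).
tf = [[3, 0, 1], [1, 2, 1], [2, 4, 1]]
z = standardize_columns(tf)
```

After the call, each non-constant column of `z` has mean zero and unit standard deviation, which is the property the abstract attributes to the 1/sigma scheme.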
DOI: https://doi.org/10.1145/3388142.3388159 (published 2020-03-09)
Citations: 3
Exponential triplet loss
Ē. Urtāns, A. Ņikitenko, Valters Vecins
This paper introduces a novel variant of the Triplet Loss function that converges faster and gives better results. This function can separate class instances homogeneously through the whole embedding space. Alongside the Exponential Triplet Loss function, we also introduce two novel types of embedding space regularization, Unit-Range and Unit-Bounce, that utilize Euclidean space more efficiently and resemble features of the cosine distance. We also examined factors for choosing the best embedding vector size for specific embedding spaces. Finally, we demonstrate how the new function can train models for one-shot learning and re-identification tasks.
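For orientation, the sketch below shows the standard (hinge-style) triplet loss that variants of this kind start from. It is not the exponential variant from the paper, whose formula is not given in the abstract; the function name `triplet_loss`, the margin value, and the toy vectors are assumptions for illustration.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    d is the Euclidean distance; the loss is zero once the negative
    is farther from the anchor than the positive by at least `margin`.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Well-separated triplet: the negative is already far away, so no loss.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Violating triplet: the "negative" is closer than the positive.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

The design point is that only triplets violating the margin contribute to training, which is the behavior the paper's variant modifies.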
DOI: https://doi.org/10.1145/3388142.3388163 (published 2020-03-09)
Citations: 4
A Review of Applications of Formal Specification in Safety-Critical System Development
Emanuel S. Grant, S. P. Nanda
Since the advent of the computer and computer programming there have been many attempts to improve the quality of the software systems developed. At various stages in this evolution of development techniques, processes, and methodologies, a review of the current trend in software development is conducted. One such current trend is in the realm of safety-critical system development. Safety-critical systems are characterized by the potential for harm to, or loss of, life if such systems should fail during operation. A strategy applied in developing such systems is the use of formal specification techniques. Formal specification techniques are the application of rigorous techniques to assess the correctness of system design. Formal specification techniques have been used in safety-critical system development for a number of decades, and there have been multiple reviews and comparisons of their successful and failed applications. This report reviews examples of the application of formal specification techniques in a number of application domains, with a focus on the types of error detection and correction associated with the particular technique. The benefit of this work lies in assessing the suitability of a specific formal specification technique for a particular problem domain.
DOI: https://doi.org/10.1145/3388142.3388175 (published 2020-03-09)
Citations: 3
A Comparative Study of Subject-Dependent and Subject-Independent Strategies for EEG-Based Emotion Recognition using LSTM Network
Debarshi Nath, Anubhav, Mrigank Singh, Divyashikha Sethia, Diksha Kalra, S. Indu
This paper addresses the problem of EEG-based emotion recognition and classification and investigates the performance of classifiers for subject-independent and subject-dependent models separately. The results are compared with other classifiers and with existing work in the domain. We perform the experiments on the publicly available DEAP dataset with band power as the feature, and classification accuracies are reported with respect to the widely accepted Valence-Arousal Model. The best results were obtained by the LSTM model in the case of the subject-dependent model, with accuracies of 94.69% and 93.13% on the valence and arousal scales, respectively. SVM performed best for the subject-independent model, with accuracies of 72.19% on the valence scale and 71.25% on the arousal scale.
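The difference between the two evaluation strategies can be sketched as two ways of splitting the same trials. This is an illustration under assumptions, not the paper's protocol: the subject-independent case is shown as leave-one-subject-out, the subject-dependent case as an unshuffled per-subject hold-out, and the function names and toy data are hypothetical.

```python
def subject_dependent_splits(samples, test_fraction=0.2):
    """Yield (subject, train, test); both parts come from the same subject."""
    by_subject = {}
    for subject, features, label in samples:
        by_subject.setdefault(subject, []).append((features, label))
    for subject, trials in by_subject.items():
        n_test = max(1, round(len(trials) * test_fraction))
        cut = len(trials) - n_test
        yield subject, trials[:cut], trials[cut:]


def subject_independent_splits(samples):
    """Yield (held_out, train, test); the test subject is never seen in training."""
    subjects = sorted({subject for subject, _, _ in samples})
    for held_out in subjects:
        train = [(f, y) for s, f, y in samples if s != held_out]
        test = [(f, y) for s, f, y in samples if s == held_out]
        yield held_out, train, test


# Toy data: 3 subjects, 5 trials each, one band-power value per trial.
samples = [(s, [float(t)], t % 2) for s in ("s01", "s02", "s03") for t in range(5)]
dependent = list(subject_dependent_splits(samples))
independent = list(subject_independent_splits(samples))
```

In the subject-independent setting the classifier must generalize to an unseen person, which is consistent with the lower accuracies reported for that case.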
DOI: https://doi.org/10.1145/3388142.3388167 (published 2020-03-09)
Citations: 25
Attack-tolerant Unequal Probability Sampling Methods over Sliding Window for Distributed Streams
Yann Busnel, Yves Tillé
Distributed systems increasingly require the processing of large amounts of data, for metrology, safety or security purposes. The online processing of these large data streams requires the development of algorithms to efficiently calculate parameters. Although elegant solutions have been proposed recently, their approximation is commonly calculated from the inception of the data stream. In a distributed execution context, it would be preferable to collect information only on the recent past (for resource saving or relevancy of the most recent information). We therefore consider here the sliding window model. In this article, we propose a family of new sampling techniques that take into account both the sliding window model and the presence of a malicious adversary. Wayne Fuller proposed in 1970 a very ingenious method of sampling with unequal inclusion probabilities. After doing justice to this precursor paper and proposing a fast and simple implementation of it, we completely generalize Fuller's method in order to enable the use of a tuning parameter of spreading. The analytical results of these techniques show the excellent performance of the generalized pivotal approach. This generalization makes the sampling method less predictable and appears suited to protecting against malicious attacks when sampling from a stream.
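To make "sampling with unequal inclusion probabilities" concrete, here is a minimal sketch of the basic sequential pivotal method. It is not the paper's generalization, Fuller's method, or the sliding-window version; the function name `pivotal_sample` and the example probabilities are assumptions. Probabilities are passed as strings or `Fraction` objects so that the arithmetic stays exact.

```python
import random
from fractions import Fraction


def pivotal_sample(probs, rng=random):
    """Return the sorted indices selected by the sequential pivotal method.

    Two units "duel" at each step: their probabilities are updated so that
    one of them becomes exactly 0 or 1, while each unit's expected
    probability is preserved.
    """
    pi = [Fraction(p) for p in probs]
    selected = []
    carry = None  # index of the unit still holding a probability in (0, 1)
    for k in range(len(pi)):
        if pi[k] == 0:
            continue
        if pi[k] == 1:
            selected.append(k)
            continue
        if carry is None:
            carry = k
            continue
        i, j = carry, k
        s = pi[i] + pi[j]
        if s < 1:
            # One unit absorbs the combined mass; the other is rejected.
            if rng.random() < pi[i] / s:
                pi[i], pi[j] = s, Fraction(0)
            else:
                pi[i], pi[j] = Fraction(0), s
        else:
            # One unit is selected for sure; the other keeps the remainder.
            if rng.random() < (1 - pi[j]) / (2 - s):
                pi[i], pi[j] = Fraction(1), s - 1
            else:
                pi[i], pi[j] = s - 1, Fraction(1)
        carry = None
        for u in (i, j):
            if pi[u] == 1:
                selected.append(u)
            elif pi[u] != 0:
                carry = u
    # If the probabilities do not sum to an integer, one unit is left over.
    if carry is not None and rng.random() < pi[carry]:
        selected.append(carry)
    return sorted(selected)


# Inclusion probabilities summing to 2, so every sample has exactly 2 units.
example = pivotal_sample(["0.2", "0.3", "0.5", "0.4", "0.6"], random.Random(0))
```

When the inclusion probabilities sum to an integer, the sample size is fixed at that integer, which is one reason this family of methods is attractive for stream summaries.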
DOI: https://doi.org/10.1145/3388142.3388162 (published 2020-03-09)
Citations: 4
Detecting Active Sites in Protein 3D Structures
Jimmy Li, S. Wang
Active sites in proteins are three-dimensional structures that appear on the surface of proteins. Drug designers often look for certain active sites that can be used to inhibit some specific pathway. Detecting active sites of proteins has been a very popular research area. Previous research efforts in this area often use the one-dimensional sequence of the protein. Many approaches have been developed to identify a potential active site represented as a segment in the protein sequence. However, an active site can function only in its 3D structure when folded appropriately. In other words, a potential active site detected in the sequence still needs to be verified in the 3D structure. In this paper, we introduce an approach that takes the three-dimensional structure of a protein and discovers potential active sites from the 3D structure directly.
DOI: https://doi.org/10.1145/3388142.3388151 (published 2020-03-09)
Citations: 0
Developing Practical Management Support System for Regional Public Transportation Service Provided by Municipalities
Chinasa Sueyoshi, Hideya Takagi, K. Inenaga
Public transportation is especially important in regions with decreasing population. In Japan, regional transportation suffers financially, because transporting local residents alone cannot cover, through taxes, the costs of fixed-route public transportation. Although the situation varies between municipalities, human and financial constraints prevent these local communities from appropriately addressing this problem. Instead, municipalities hire external traffic consultants to conduct surveys, at a significant cost. As a result, the municipalities receive regular reports from the external traffic consultant on their public transportation situation. However, despite the large cost of this service borne by the municipality, the results are not fully available and are merely indicative of a temporary situation in terms of both quality and quantity. These results cannot actually be used to improve bus service management through timetable revision and adjustment of the fixed route. This would instead require medium-to-long term data on the usage of their community buses. For this reason, our laboratory has been developing a practical service management support system for the regional public transportation provided by municipalities. Within this system, we developed two applications for tablets named ASHIYA and SHINGU. ASHIYA can be used for conducting simple questionnaire surveys of passengers inside community buses, whereas SHINGU records the number of passengers getting on or off.
DOI: https://doi.org/10.1145/3388142.3388155 (published 2020-03-09)
Citations: 1
Word Cloud Analysis of Customer Satisfaction in Cosmetic Products in Thailand
O. Thinnukool, Phasit Charoenkwan, P. Khuwuthyakorn, Pachara Tinamat
This research aims to investigate customer satisfaction in cosmetic products by utilizing a low-cost tool, the word cloud, to analyze online reviews, based on the following research questions: how do customers feel about each type of cosmetic product, what is their feedback, and which words are positive and which are negative? The dataset for the investigation comprises reviews collected over a four-year period, from 2015 to 2018, from popular social networking sites in Thailand including Facebook and Pantip. A hierarchical clustering approach, the Linkage algorithm, was employed in the context of text mining. The result shows that factors that influence customer satisfaction are based on customer experience affecting positive or negative words in online reviews of each product.
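A word cloud is driven by per-word frequencies, so the counting step can be sketched in a few lines. The sketch assumes English-like text that splits on non-letter characters (Thai reviews would need a dedicated word segmenter); the function name `word_frequencies`, the stopword list, and the sample reviews are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "is", "it", "and", "this", "but", "my", "i", "for", "to", "of"}


def word_frequencies(reviews, stopwords=STOPWORDS):
    """Count how often each non-stopword token occurs across all reviews."""
    counts = Counter()
    for text in reviews:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts


reviews = ["Great lipstick, great color", "The color fades fast"]
freq = word_frequencies(reviews)
top_words = freq.most_common(2)
```

The resulting counts are what a word cloud renderer would scale font sizes by; labeling words as positive or negative is a separate step not shown here.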
DOI: https://doi.org/10.1145/3388142.3388152 (published 2020-03-09)
Citations: 0
A Framework for Intelligent Navigation Using Latent Dirichlet Allocation on Reddit Posts About Opiates
Peter Akioyamen, Levi C Nicklas, R. Sanchez-Arias
Many people look to the internet for support and assistance when faced with issues in life, particularly when these issues are related to behaviors or conditions that are stigmatized within society, generally making open discussion difficult. In this study, we utilize the unique characteristics of the news aggregation and discussion internet forum, Reddit, to demonstrate the potential for text mining as an intelligent content filtering and navigation framework; we use online discussion surrounding opiates as a case study. Topic modeling is used as a text mining approach to organize and discover hidden semantic structures within Reddit posts, developing a representation of a post through the topics and the words which comprise them. These characterizations may act as an intelligent navigation system of an online community, providing users the ability to actively navigate through similar posts and identify dissimilar ones based on their specific interests.
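The navigation idea can be sketched as ranking posts by the similarity of their topic mixtures. The sketch assumes each post already has a topic distribution (for example, from a fitted LDA model, which is not shown) and uses cosine similarity as one possible measure; the function names, post identifiers, and numbers are hypothetical and do not come from the paper.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def rank_similar_posts(query_id, topic_mix):
    """Return (post_id, similarity) pairs, most similar to the query first."""
    query = topic_mix[query_id]
    others = [
        (post_id, cosine_similarity(query, mix))
        for post_id, mix in topic_mix.items()
        if post_id != query_id
    ]
    return sorted(others, key=lambda pair: pair[1], reverse=True)


# Toy topic mixtures over 3 topics for 3 posts (illustrative values).
topic_mix = {
    "post_a": [0.8, 0.1, 0.1],
    "post_b": [0.7, 0.2, 0.1],
    "post_c": [0.1, 0.1, 0.8],
}
ranking = rank_similar_posts("post_a", topic_mix)
```

Reading the ranking from the top gives similar posts, and reading it from the bottom gives dissimilar ones, matching the two navigation directions the abstract mentions.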
DOI: https://doi.org/10.1145/3388142.3388156 (published 2020-03-09)
Citations: 0
Design of the base isolation system with artificial neural network models
Samer M. Barakat
This work presents the application of artificial neural networks (ANN) for modeling and designing Seismic-Isolation (SI) systems consisting of Natural Rubber Bearings and Viscous Fluid Dampers subject to Near-Field (NF) earthquake ground motion. Four lumped-mass stick models representing realistic five-, ten-, fifteen-, and twenty-story base-isolated buildings are used. The key response parameters selected to represent the behavior of the SI system are the Damper Force (PDF), the Total Maximum Displacement (DTM), the Peak Top Story Acceleration Ratio (TSAR) of the isolated structure compared to the fixed-base structure, and the maximum amplified drift ratio (δmax). Twenty-four NF earthquake records representing two seismic hazard levels are used. The commercial analysis program SAP2000 was used to perform the Time-History Analysis (THA) of the MDOF system (a stick model representing a realistic N-story base-isolated building) subject to all 24 records. Different combinations of damping coefficients (c) and damping exponents (α) are investigated under the 24 earthquake records to develop the database of feasible combinations for the SI system. A total of 751,680 THA combinations were considered and used for training and testing the neural network. Mathematical models for the key response parameters are established via ANN and produce acceptable results with significantly less computation. The results of this study show that ANN models can be a powerful tool in the design process of Seismic-Isolation (SI) systems, especially at the preliminary stages.
DOI: https://doi.org/10.1145/3388142.3388169 (published 2020-03-09)
Citations: 1