
2018 6th International Conference on Information and Communication Technology (ICoICT): Latest Papers

Tokenization and N-Gram for Indexing Indonesian Translation of the Quran
S. Putra, M. Gunawan, Agung Suryatno
Tokenization is an important process that breaks text into tokens such as words. N-gram models are now widely used in computational linguistics to predict the next item in a contiguous sequence of $\mathbf{n}$ items from a given sample of text. This paper focuses on implementing tokenization and an n-gram model in RapidMiner to produce unigram and bigram terms for indexing the Indonesian Translation of the Quran (ITQ). The study uses an ITQ data set consisting of 114 documents. The method comprises data extraction and text preprocessing, including tokenization, stemming, stopword removal, case transformation, and n-gram generation. The results show that the model produces 6794 and 60323 token combinations for unigrams and bigrams, which are used to index the ITQ. The main contribution of this study is an enhanced digital index of the ITQ.
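The preprocessing chain named in this abstract (tokenization, case transformation, stopword removal, then unigram and bigram generation) can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' RapidMiner process: the stopword list, the two sample sentences, the underscore joiner, and the function names are all assumptions made for the example, and stemming is left out.

```python
import re
from collections import defaultdict

# Hypothetical stopword list for illustration; the paper's list is not given.
STOPWORDS = {"dan", "yang", "di", "ke", "dari"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def ngrams(tokens, n):
    """Return every contiguous run of n tokens, joined by underscores."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_index(docs):
    """Map each unigram and bigram to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = [t for t in tokenize(text) if t not in STOPWORDS]
        for term in ngrams(tokens, 1) + ngrams(tokens, 2):
            index[term].add(doc_id)
    return index

# Made-up sample documents (not taken from the ITQ data set).
docs = {1: "Buku dan pena di meja", 2: "Pena yang baru dari toko buku"}
index = build_index(docs)
print(sorted(index["buku"]))       # [1, 2]
print(sorted(index["buku_pena"]))  # [1]
```

Because stopwords are removed before the bigrams are formed, "buku" and "pena" become adjacent in document 1 even though "dan" separated them in the raw text. Whether that ordering matches the paper's pipeline is not stated in the abstract.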
DOI: https://doi.org/10.1109/ICOICT.2018.8528762
Publication date: 2018-05-01
Citations: 6
Automatic Tweet Classification Based on News Category in Indonesian Language
Jaka E. Sembodo, E. B. Setiawan, M. Bijaksana
Tweets can be as informative as news articles, so an automatic tweet classifier based on news categories could make it easier to search for tweets in a category of interest. We identified 11 categories: religion, business, entertainment, law and crime, health, motivation, sport, government, education, politics, and technology. In the learning process, we use the ZeroR, Naive Bayes Multinomial (NBM), Support Vector Machine (SVM), Random Forest (RF), and Sequential Minimal Optimization (SMO) algorithms, following previous work on similar topics. In the experiments, we evaluate the classifiers using all tweets and using various maximum numbers of tweets and terms per category. To evaluate system performance, we use 10-fold cross-validation with accuracy (correctly classified instances) as the performance parameter. In the experimental results, NBM achieves the highest performance, with 77.47% accuracy when the maximum number of tweets and terms per category is 500 tweets and 1000 terms. Finally, we built a web-based automatic tweet classifier using NBM, because this classifier performed best in the experiments.
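The evaluation described in this abstract, a multinomial Naive Bayes classifier scored by 10-fold cross-validation accuracy, can be illustrated with a small self-contained sketch. It assumes Laplace (add-one) smoothing, a simple interleaved fold assignment, and made-up token lists standing in for tweets; none of these details come from the paper, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nbm(samples):
    """Collect class and term counts from (tokens, label) pairs."""
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        class_docs[label] += 1
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, term_counts, vocab

def predict_nbm(model, tokens):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        counts = term_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(class_docs[label] / total_docs)
        for t in tokens:
            if t in vocab:  # terms never seen in training are ignored
                score += math.log((counts[t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_val_accuracy(samples, k=10):
    """Fraction of samples classified correctly when each is held out once.

    Fold i holds samples i, i+k, i+2k, ...; use k >= 2.
    """
    correct = 0
    for fold in range(k):
        test = samples[fold::k]
        train = [s for i, s in enumerate(samples) if i % k != fold]
        model = train_nbm(train)
        correct += sum(predict_nbm(model, toks) == lab for toks, lab in test)
    return correct / len(samples)

# Made-up toy data: 10 "sport" and 10 "technology" samples, interleaved.
samples = []
for _ in range(10):
    samples.append((["bola", "gol"], "sport"))
    samples.append((["ponsel", "aplikasi"], "technology"))

accuracy = cross_val_accuracy(samples, k=10)
print(accuracy)  # 1.0 on this trivially separable toy set
```

The toy classes share no vocabulary, so perfect accuracy here says nothing about real tweets; the point is only the shape of the procedure: train on nine folds, test on the tenth, and count correctly classified instances over all folds.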
DOI: https://doi.org/10.1109/ICOICT.2018.8528788
Publication date: 2018-05-01
Citations: 4
Toward Full Enterprise Software Support on nDPI
Gregorius Aldo Radityatama, Charles Lim, Heru Purnomo Ipung
A Next Generation Firewall (NGFW) extends a standard firewall with the ability to inspect packet contents, thus increasing precision. The three main uses of an NGFW are to improve the Quality of Service (QoS) of a business, to act as an application-based filtering firewall, and to protect the network from known security threats. A complete NGFW system has three main components: Deep Packet Inspection (DPI), an Intrusion Prevention System (IPS), and an extra-firewall intelligence mechanism. nDPI is one open-source DPI implementation. As the number of enterprise applications (those used in commercial organizations) continues to rise, nDPI lags in its coverage of enterprise software. The aim of this research is to design and implement better support for enterprise-grade software protocols in nDPI. Five common enterprise applications were chosen and implemented. The experimental results were then compared with a commercial NGFW implementation in terms of overall precision and performance. The results show that the accuracy of nDPI with the new protocols exceeds 90%, with a small (less than 3.5%) increase in CPU execution time and a very small (less than 1%) increase in peak heap memory usage.
DOI: https://doi.org/10.1109/ICOICT.2018.8528792
Publication date: 2018-05-01
Citations: 0
Mapping Walls of Indoor Environment Using Moving RGB-D Sensor
Ismail Rusli, B. Trilaksono, W. Adiprawita
Inferring the wall configuration of an indoor environment could help a robot “understand” the environment better. This allows the robot to execute tasks that involve inter-room navigation, such as picking up an object in the kitchen. In this paper, we present a method for inferring the wall configuration from a moving RGB-D sensor. Our goal is to combine a simple wall configuration model with a fast wall detection method to obtain a system that works online, runs in real time, and does not need a Manhattan World assumption. We tested our preliminary work, i.e., wall detection and measurement from a moving RGB-D sensor, on the MIT Stata Center Dataset. The performance of our method is reported in terms of accuracy and execution speed.
DOI: https://doi.org/10.1109/ICOICT.2018.8528805
Publication date: 2018-03-28
Citations: 2