
Latest publications from the 2015 International Conference on Advanced Computer Science and Information Systems (ICACSIS)

Road detection system based on RGB histogram filterization and boundary classifier
M. D. Enjat Munajat, D. H. Widyantoro, R. Munir
The purpose of this paper is to describe a new approach to road detection. Unlike previous work on road detection, this research combines two detection processes: RGB histogram filterization and a boundary classifier. RGB histogram filterization converts the camera reading to greyscale and then applies color segmentation; the final step of this process determines the area between the slopes, which is taken to be the road area. The boundary classification process then applies RGB indexing to the slope ranges and maps them onto real pictures of roads and their surroundings. Next, line boundaries are located using the Hough transform and Canny edge detection and converted into binary values of `0' and `1', where `1' represents road boundaries and `0' represents the surrounding area. The `1' coordinates are then connected by a cubic spline, which ultimately produces sharp images of the boundaries between road and non-road. The model has proven able to detect road conditions and distinguish road from non-road precisely. The system was tested on real roads in Bandung, Indonesia, and the results are promising for both straight and curved road sections.
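To make the boundary-classification stage concrete, the sketch below (not the authors' implementation) uses Canny edge detection and a probabilistic Hough transform to obtain candidate boundary pixels (the `1' values), then connects them with a cubic spline; the OpenCV/SciPy parameter values and input file are assumptions.

```python
import cv2
import numpy as np
from scipy.interpolate import CubicSpline

def road_boundary(path="road_frame.jpg"):
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)

    # Binary edge map: non-zero pixels play the role of `1' (boundary),
    # zero pixels the role of `0' (surrounding area).
    edges = cv2.Canny(gray, 50, 150)

    # Probabilistic Hough transform keeps only pixels lying on line segments,
    # discarding isolated edge noise.
    segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                               minLineLength=40, maxLineGap=10)
    if segments is None:
        return None

    # Segment endpoints become the candidate boundary coordinates.
    pts = segments.reshape(-1, 4)
    xs = np.concatenate([pts[:, 0], pts[:, 2]]).astype(float)
    ys = np.concatenate([pts[:, 1], pts[:, 3]]).astype(float)

    # The cubic spline needs strictly increasing ordinates, so average the x
    # positions of points that share an image row before interpolating.
    uniq_y = np.unique(ys)
    if len(uniq_y) < 4:
        return None
    mean_x = np.array([xs[ys == y].mean() for y in uniq_y])
    spline = CubicSpline(uniq_y, mean_x)

    rows = np.arange(uniq_y.min(), uniq_y.max())
    return np.stack([spline(rows), rows], axis=1)   # smooth (x, y) boundary curve
```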
Citations: 13
Dynamic, auto-adaptive software product lines using the ABS language
Radu Muschevici
Modern software systems must support a high degree of variability to adapt to a wide range of requirements and operating conditions. While static adaptation based on software product lines is becoming more common, dynamic adaptation is less well explored. Yet runtime adaptation has a host of advantages, ranging from downtime avoidance to performance improvements. Auto-adaptation is a particularly promising form of runtime adaptation that enables a running program to adapt autonomously, in swift response to changing conditions in its environment. This paper focuses on the design of a programming language facility to support the runtime auto-configuration of dynamic software product lines (DSPL). We implement this facility for the Abstract Behavioural Specification (ABS) language by introducing MetaABS, a dynamic, reflection-based meta-programming facility for ABS, together with a runtime environment that readily supports dynamically auto-adapting systems written in MetaABS.
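As a rough, language-agnostic illustration of the dynamic product-line idea (written in Python rather than ABS, and not reflecting the MetaABS API), the sketch below models a running product whose features can be activated and deactivated at runtime; all names here are hypothetical.

```python
class Feature:
    """A hypothetical feature 'delta': a guard plus an implementation."""
    def __init__(self, can_handle, handle):
        self.can_handle = can_handle
        self.handle = handle

class DynamicProductLine:
    def __init__(self, core):
        self.core = core          # base product behaviour
        self.active = {}          # feature name -> Feature, applied at runtime

    def activate(self, name, feature):
        self.active[name] = feature

    def deactivate(self, name):
        self.active.pop(name, None)

    def handle(self, request):
        # Most recently activated features take precedence over the core.
        for feature in reversed(list(self.active.values())):
            if feature.can_handle(request):
                return feature.handle(request)
        return self.core.handle(request)

line = DynamicProductLine(Feature(lambda r: True, lambda r: "core:" + r))
line.activate("cache", Feature(lambda r: r.startswith("GET"), lambda r: "cached:" + r))
print(line.handle("GET /index"))   # served by the dynamically added feature
line.deactivate("cache")           # an auto-adaptation rule could trigger this
print(line.handle("GET /index"))   # falls back to the core product
```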
Citations: 0
A classification system for jamu efficacy based on formula using support vector machine and k-means algorithm as a feature selection
M. N. Puspita, W. Kusuma, A. Kustiyo, R. Heryanto
Jamu is an Indonesian herbal medicine made from natural materials such as roots, leaves, fruits, and animal products. The purpose of this research is to develop a classification system for jamu efficacy based on plant composition using a Support Vector Machine (SVM), with the k-means clustering algorithm applied as a feature selection method. The results were compared with previous research that used SVM without feature selection, and cluster quality was evaluated using variance. A total of 3,138 jamu records and 465 plant species were grouped into 100 clusters with a variance of 0.0094. The grouping successfully reduced the data dimension to 3,047 jamu samples with 236 species of herbs and plants as features. SVM classification with this feature selection yielded an accuracy of 71.5%.
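A minimal sketch of this pipeline (not the authors' code) is given below: k-means groups the plant-ingredient columns of the formula matrix, one representative plant per cluster is kept as a feature, and an SVM is trained on the reduced matrix. The data is a random stand-in, and the variable names and cluster count are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def cluster_feature_selection(X, n_clusters=100, random_state=0):
    """X: (n_formulas, n_plants) composition matrix; returns selected column indices."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = km.fit_predict(X.T)                 # cluster the plant columns
    selected = []
    for c in range(n_clusters):
        members = np.where(labels == c)[0]
        if len(members) == 0:
            continue
        # keep the plant whose usage pattern is closest to the cluster centroid
        dist = np.linalg.norm(X.T[members] - km.cluster_centers_[c], axis=1)
        selected.append(members[np.argmin(dist)])
    return np.array(selected)

# Hypothetical usage with a random stand-in for the 3,138 x 465 jamu matrix.
rng = np.random.default_rng(0)
X = (rng.random((3138, 465)) < 0.05).astype(float)   # formula x plant incidence
y = rng.integers(0, 9, size=3138)                    # efficacy class labels
cols = cluster_feature_selection(X)
print("selected features:", len(cols))
print("CV accuracy:", cross_val_score(SVC(kernel="rbf"), X[:, cols], y, cv=5).mean())
```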
Citations: 6
ECG signal compression by predictive coding and Set Partitioning in Hierarchical Trees (SPIHT)
G. Jati, Aprinaldi, S. M. Isa, W. Jatmiko
In this paper we present a method for multi-lead ECG signal compression using predictive coding combined with Set Partitioning In Hierarchical Trees (SPIHT). We use linear prediction between beats to exploit the high correlation among them, optimizing the redundancy between adjacent samples and adjacent beats. Predictive coding is applied after the beat-reordering step; its purpose is to minimize the amplitude variance of the 2D ECG array so that the compression error can be minimized. Experiments on selected records from the MIT-BIH arrhythmia database show that the proposed method compresses ECG signals more efficiently than the original SPIHT and has relatively lower distortion at the same compression ratios than other wavelet transformation techniques.
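The residual step can be sketched as follows (the SPIHT encoder itself is omitted): beats are stacked into a 2D array and each beat is replaced by its difference from the previous beat, which reduces the amplitude variance the wavelet/SPIHT stage must encode. The beat length and toy signal are assumptions rather than the MIT-BIH records used in the paper.

```python
import numpy as np

def beats_to_2d(signal, r_peaks, beat_len=256):
    """Cut a 1D ECG at detected R-peaks and resample each beat to a fixed length."""
    beats = []
    for start, end in zip(r_peaks[:-1], r_peaks[1:]):
        beat = signal[start:end]
        idx = np.linspace(0, len(beat) - 1, beat_len)
        beats.append(np.interp(idx, np.arange(len(beat)), beat))
    return np.asarray(beats)                  # shape: (n_beats, beat_len)

def predictive_coding(beats):
    """First-order prediction across beats: residual[i] = beat[i] - beat[i-1]."""
    residual = beats.copy()
    residual[1:] -= beats[:-1]
    return residual

def decode(residual):
    return np.cumsum(residual, axis=0)        # exact inverse of the predictor

# Toy demonstration of the variance reduction that motivates the step.
t = np.linspace(0, 2 * np.pi, 256)
beats = np.array([np.sin(t) * (1 + 0.01 * i) for i in range(100)])
res = predictive_coding(beats)
print("variance before:", beats.var(), "after:", res.var())
assert np.allclose(decode(res), beats)        # the transform itself is lossless
```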
Citations: 4
Application of hierarchical clustering ordered partitioning and collapsing hybrid in Ebola Virus phylogenetic analysis
Hengki Muradi, A. Bustamam, D. Lestari
Gene clustering can be achieved through hierarchical or partitioning methods, and the two can be combined by alternating partitioning and hierarchical phases. This approach is known as the hierarchical clustering ordered partitioning and collapsing hybrid (HOPACH) method. The partitioning phase can be performed with PAM, SOM, or k-means; the partitioning process is followed by an ordering process and then corrected with an agglomerative process in order to obtain more accurate clustering results. The main clusters are determined using the MSS (Median Split Silhouette) value, and we select the clustering result that minimizes the MSS. In this work, we cluster 136 Ebola virus DNA sequences from GenBank. A global alignment is performed first, followed by genetic distance calculation using the Jukes-Cantor correction. In our implementation, we applied the global alignment process and the HOPACH-PAM clustering combination using the R open-source programming tool. The maximum genetic distance obtained is 0.6153407, while the minimum is 0; the genetic distance matrix can be used as a basis for sequence clustering and phylogenetic analysis. Our HOPACH-PAM clustering yields 10 main clusters with an MSS value of 0.8873843, and the Ebola virus clusters can be identified by species and by epidemic year.
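For illustration, the distance step can be sketched as below: the Jukes-Cantor correction d = -(3/4) ln(1 - 4p/3), with p the fraction of differing aligned sites, applied pairwise to build the distance matrix. The toy sequences are placeholders for the 136 GenBank genomes, and the HOPACH-PAM clustering itself (available as the R `hopach` package) is not reproduced here.

```python
import numpy as np

def jukes_cantor(seq_a, seq_b):
    """Corrected distance between two equal-length aligned sequences (gaps skipped)."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != '-' and b != '-']
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:                          # correction is undefined at saturation
        return float('inf')
    return -0.75 * np.log(1.0 - 4.0 * p / 3.0)

def distance_matrix(aligned_seqs):
    n = len(aligned_seqs)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = jukes_cantor(aligned_seqs[i], aligned_seqs[j])
    return d

aligned = ["ACGTACGTAC", "ACGTACGTTC", "ACGAACGTTC"]   # placeholder alignment
print(distance_matrix(aligned))
```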
Citations: 18
Segmenting and targeting customers through clusters selection & analysis
I. Pranata, G. Skinner
This paper investigates the use of machine learning clustering techniques to segment and target the customers of a wholesale distributor. It describes the selection, analysis, and interpretation of clusters for evaluating customers' annual spending on products. We show how circular statistics can categorize customers based on annual spending across six essential product categories. Several clusters were created with the k-means clustering algorithm, and an in-depth analysis of these clusters was performed using several techniques to carefully select the best cluster. Automated clustering was able to suggest the groups these customers fall into, and the evaluation and interpretation of the clusters provided insights into purchase behaviors and identified the best customer group to target.
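A minimal sketch of the clustering step is shown below: customers' standardized spending on six categories is clustered with k-means for several values of k, and the partition with the best silhouette score is kept. The CSV file and column names follow the familiar wholesale-customer layout and are assumptions; the paper's own cluster-selection and interpretation steps were richer.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

CATEGORIES = ["Fresh", "Milk", "Grocery", "Frozen", "Detergents_Paper", "Delicassen"]

def segment_customers(csv_path="wholesale_customers.csv", k_range=range(2, 9)):
    spending = pd.read_csv(csv_path)[CATEGORIES]
    X = StandardScaler().fit_transform(spending)      # put categories on one scale
    best = None
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        score = silhouette_score(X, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best        # (silhouette, chosen k, cluster assignment per customer)

score, k, labels = segment_customers()
print(f"best k = {k} with silhouette = {score:.3f}")
```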
Citations: 9
Gestalt geometric CAPTCHA
Nuttanont Hongwarittorrn, Suttikiat Meelap
This research investigated a new Image-Based CAPTCHA called Gestalt Geometric CAPTCHA, which does not require the use of a database of images, and is based on the Gestalt Theory of human recognition. The aim was to develop a type of CAPTCHA that is easier for human users and harder for bots. We experimentally tested the use and effectiveness of Gestalt Geometric CAPTCHA in terms of time for completion, authentication pass rate, and user satisfaction, in comparison with reCAPTCHA, Ironclad CAPTCHA, and ShapeCAPTCHA. We also tested our novel CAPTCHA for robustness against two shape detection and classification programs, ShapeChecker and Shape-detect. The results were promising, but some issues remain for future improvement.
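Purely as an illustration of a CAPTCHA that needs no image database (this is not the authors' generator), the sketch below renders a target geometric shape only as scattered outline dots among noise, so a human can complete the figure through the Gestalt closure principle; the shapes, sizes, and noise levels are all assumptions.

```python
import math
import random
from PIL import Image, ImageDraw

def outline_points(shape, size, n):
    cx, cy, r = size / 2, size / 2, size * 0.3
    if shape == "circle":
        return [(cx + r * math.cos(2 * math.pi * i / n),
                 cy + r * math.sin(2 * math.pi * i / n)) for i in range(n)]
    # triangle: walk along the three edges between its vertices
    v = [(cx, cy - r), (cx - r, cy + r), (cx + r, cy + r)]
    pts = []
    for k in range(3):
        (x0, y0), (x1, y1) = v[k], v[(k + 1) % 3]
        for i in range(n // 3):
            t = i / (n // 3)
            pts.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return pts

def gestalt_captcha(size=200, shape_dots=60, noise_dots=120):
    shape = random.choice(["circle", "triangle"])     # the expected answer
    img = Image.new("RGB", (size, size), "white")
    draw = ImageDraw.Draw(img)
    for x, y in outline_points(shape, size, shape_dots):
        draw.ellipse([x - 2, y - 2, x + 2, y + 2], fill="black")
    for _ in range(noise_dots):                       # distractor dots
        x, y = random.uniform(0, size), random.uniform(0, size)
        draw.ellipse([x - 2, y - 2, x + 2, y + 2], fill="gray")
    return img, shape

img, answer = gestalt_captcha()
img.save("captcha.png")
print("expected answer:", answer)
```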
Citations: 0
Spark-gram: Mining frequent N-grams using parallel processing in Spark
Prasetya Ajie Utama, Bayu Distiawan
Mining sequence patterns in the form of n-grams (sequences of words that appear consecutively) from large text data is a fundamental component of several information retrieval and natural language processing applications. In this work, we present Spark-gram, a method for large-scale frequent sequence mining on Spark adapted from its MapReduce equivalent, Suffix-σ. The Spark-gram design allows the discovery of all n-grams with maximum length σ and minimum occurrence frequency τ using an iterative algorithm with only a single shuffle phase. We show that Spark-gram can outperform Suffix-σ mainly when τ is high, but is potentially worse as the value of σ grows.
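As a point of reference, the brute-force version of the task can be written in a few lines of PySpark, emitting every n-gram up to length σ and thresholding with a single reduceByKey shuffle. This is only the naive baseline implied by the problem statement, not the Suffix-σ or Spark-gram algorithms, which avoid materializing every n-gram; the input path and thresholds are assumptions.

```python
from pyspark import SparkContext

def ngrams_up_to(tokens, sigma):
    """Yield every n-gram of length 1..sigma starting at each position."""
    for i in range(len(tokens)):
        for n in range(1, sigma + 1):
            if i + n <= len(tokens):
                yield (tuple(tokens[i:i + n]), 1)

def frequent_ngrams(sc, path, sigma=5, tau=10):
    return (sc.textFile(path)
              .flatMap(lambda line: ngrams_up_to(line.split(), sigma))
              .reduceByKey(lambda a, b: a + b)     # the single shuffle phase
              .filter(lambda kv: kv[1] >= tau))

if __name__ == "__main__":
    sc = SparkContext(appName="ngram-baseline")
    for gram, count in frequent_ngrams(sc, "corpus.txt").take(20):
        print(" ".join(gram), count)
```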
Citations: 2
Developing smart telehealth system in Indonesia: Progress and challenge
W. Jatmiko, M. A. Ma'sum, S. M. Isa, E. Imah, R. Rahmatullah, B. Wiweko
Indonesia is a developing country with a large population of more than 200 million residents, and it faces several health problems. First, Indonesia has a high mortality rate from heart and cardiovascular diseases. A major cause is the lack of medical checkups, especially heart monitoring, owing to the limited number of medical instruments such as ECGs in hospitals and public health centers. A contributing factor is the small number of cardiologists: there are 365 cardiologists across the country, a very small number compared with Indonesia's population of 200 million, and they are not distributed evenly across the provinces but are concentrated in Jakarta and other capital cities. It is therefore difficult for residents to obtain appropriate heart monitoring. Second, the maternal and infant mortality rate during childbirth in Indonesia is also high. One way to address this is a system in which health clinics in rural areas can perform fetal biometry detection before sending the results to expert physicians in other areas for consultation. The proposed system is equipped with algorithms for automatic fetal detection and biometry measurement. By the end of this development we have obtained several results: a classifier for automatic heartbeat disease prediction with an accuracy of more than 95%, a compression method based on wavelet decomposition, and detection and approximation of a fetus in an ultrasound image with a hit rate of more than 93%.
Citations: 21
Weather forecasting using deep learning techniques
A. G. Salman, Bayu Kanigoro, Y. Heryadi
Weather forecasting has gained the attention of many researchers from various research communities due to its effect on global human life. The deep learning techniques that have emerged in the last decade, coupled with the wide availability of massive weather observation data and advances in information and computer technology, have motivated much research into the hidden hierarchical patterns in large volumes of weather data. This study investigates deep learning techniques for weather forecasting. In particular, it compares the prediction performance of Recurrent Neural Network (RNN), Conditional Restricted Boltzmann Machine (CRBM), and Convolutional Network (CN) models. The models are tested using a weather dataset provided by BMKG (the Indonesian Agency for Meteorology, Climatology, and Geophysics), collected from a number of weather stations in the Aceh area from 1973 to 2009, and an El Niño Southern Oscillation (ENSO) dataset provided by international institutions such as the NOAA National Weather Service Center for Environmental Prediction. The forecasting accuracy of each model is evaluated using the Frobenius norm. The results of this study are expected to contribute to weather forecasting in application domains ranging from flight navigation to agriculture and tourism.
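As a minimal sketch of one of the compared models (a small recurrent network used as a one-step-ahead forecaster, scored with the Frobenius norm of the error matrix), the code below uses a synthetic series as a stand-in for the BMKG/ENSO data; the window length, network size, and training settings are assumptions.

```python
import numpy as np
import tensorflow as tf

def make_windows(series, window=30):
    """Sliding windows of past observations paired with the next value."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    return X[..., None], series[window:]          # (samples, window, 1), targets

series = np.sin(np.linspace(0, 60, 2000)) + 0.1 * np.random.randn(2000)  # stand-in
X, y = make_windows(series)
split = int(0.8 * len(X))

model = tf.keras.Sequential([
    tf.keras.Input(shape=X.shape[1:]),
    tf.keras.layers.SimpleRNN(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:split], y[:split], epochs=5, batch_size=64, verbose=0)

pred = model.predict(X[split:], verbose=0).ravel()
frobenius = np.linalg.norm(np.atleast_2d(y[split:] - pred), "fro")
print("Frobenius norm of forecast error:", frobenius)
```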
Citations: 130