
Latest publications: 2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)

Evaluate different machine learning techniques for classifying sleep stages on single-channel EEG
Shahnawaz Qureshi, S. Vanichayobon
In this paper, we propose three different machine learning techniques, namely Random Forest, Bagging, and Support Vector Machine, together with time-domain features for classifying sleep stages based on single-channel EEG. Whole-night polysomnograms from 25 subjects were recorded and scored using the R&K standard. The proposed process analyzed the EEG signal of channel C4-A1 for sleep staging. Automatic and manual scoring results were compared on an epoch-by-epoch basis. In total, 96,000 30-second sleep EEG epochs were used for performance evaluation. The epoch-by-epoch assessment classified EEG epochs into six stages (W/S1/S2/S3/S4/REM) according to the proposed method and manual scoring. Results show that the Random Forest classifier achieves overall accuracy, specificity, and sensitivity of 97.73%, 96.3%, and 99.51%, respectively.
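As a rough sketch of the kind of pipeline the abstract describes, the snippet below computes a few simple time-domain features per 30-second epoch and trains a Random Forest. It assumes scikit-learn and NumPy; the sampling rate, feature set, and hyperparameters are illustrative guesses rather than the paper's actual choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

FS = 100           # Hz, hypothetical sampling rate
EPOCH = 30 * FS    # 30-second epochs, as in the paper

def time_domain_features(epoch):
    """Simple time-domain descriptors of one EEG epoch."""
    diff1 = np.diff(epoch)
    return [
        epoch.mean(), epoch.std(),               # amplitude statistics
        np.abs(diff1).mean(),                    # mean absolute first difference
        ((epoch[:-1] * epoch[1:]) < 0).mean(),   # zero-crossing rate
    ]

def stage_epochs(eeg, labels):
    """eeg: 1-D C4-A1 signal; labels: one of W/S1/S2/S3/S4/REM per epoch."""
    n = len(eeg) // EPOCH
    X = np.array([time_domain_features(eeg[i * EPOCH:(i + 1) * EPOCH])
                  for i in range(n)])
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    print(cross_val_score(clf, X, labels[:n], cv=5).mean())
```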
{"title":"Evaluate different machine learning techniques for classifying sleep stages on single-channel EEG","authors":"Shahnawaz Qureshi, S. Vanichayobon","doi":"10.1109/JCSSE.2017.8025949","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025949","url":null,"abstract":"In this paper, we propose 3 different machine learning techniques such as Random Forest, Bagging and Support Vector Machine along with time domain feature for classifying sleep stages based on single-channel EEG. Whole-night polysomnograms from 25 subjects were recorded employing R&K standard. The evolved process investigated the EEG signals of (C4-A1) for sleep staging. Automatic and manual scoring results were associated on an epoch-by-epoch basis. An entire 96,000 data samples 30s sleep EEG epoch were calculated and applied for performance evaluation. The epoch-by-epoch assessment was created by classifying the EEG epochs into six stages (W/S1/S2/S3/S4/REM) according to proposed method and manual scoring. Result shows that Random Forest classifiers achieve the overall accuracy; specificity and sensitivity level of 97.73%, 96.3% and 99.51% respectively.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"28 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82759120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
A robust algorithm for R peak detection based on optimal Discrete Wavelet Transform
Anurak Thungtong
Automated ECG signal processing can assist in diagnosing several heart diseases. Many R peak detection methods have been studied because the accuracy of R peak detection significantly affects the quality of subsequent ECG feature extraction. Two important steps in an R peak detection algorithm that draw researchers' attention are the preprocessing and thresholding stages. Among several methods, the wavelet transform is widely used for removing noise in the preprocessing stage. Many proposed algorithms require prior knowledge of the frequency spectrum of the signal under consideration in order to select the wavelet detail coefficients in the reconstruction process. Moreover, threshold selection generally involves parameter fine-tuning to achieve high detection accuracy. As a result, it may be difficult to apply these methods to general ECG data sets. Accordingly, we propose an automatic, parameter-free method that optimally selects the appropriate detail components for wavelet reconstruction as well as an adaptive threshold. The proposed algorithm analyzes the probability density function of the processed ECG signal. The algorithm was validated on the MIT-BIH database and produced an average sensitivity of 99.63% and specificity of 99.78%, in the same range as previously proposed approaches.
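A minimal sketch of DWT-based R-peak detection, assuming PyWavelets and SciPy. The paper derives both the detail-level selection and the threshold automatically from the signal's probability density function; here both are simplified placeholders.

```python
import numpy as np
import pywt
from scipy.signal import find_peaks

def detect_r_peaks(ecg, fs, wavelet="db4", keep_levels=(3, 4)):
    # Decompose, keep only the detail levels where QRS energy dominates, and
    # reconstruct a denoised signal (dropping the approximation also removes
    # baseline wander).
    coeffs = pywt.wavedec(ecg, wavelet, level=5)
    kept = [c if (len(coeffs) - i) in keep_levels else np.zeros_like(c)
            for i, c in enumerate(coeffs)]
    rec = pywt.waverec(kept, wavelet)[:len(ecg)]
    env = rec ** 2  # emphasize QRS complexes
    # Crude stand-in for the paper's adaptive, PDF-derived threshold.
    thr = 0.3 * np.percentile(env, 99)
    peaks, _ = find_peaks(env, height=thr, distance=int(0.25 * fs))
    return peaks
```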
{"title":"A robust algorithm for R peak detection based on optimal Discrete Wavelet Transform","authors":"Anurak Thungtong","doi":"10.1109/JCSSE.2017.8025931","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025931","url":null,"abstract":"Automated ECG signal processing can assist in diagnosing several heart diseases. Many R peak detection methods have been studied because the accuracy of R peak detection significantly affects the quality of subsequent ECG feature extraction. Two important steps in R peak detection algorithm that draw attention over researchers are the preprocessing and thresholding stages. Among several methods, wavelet transform is a widely used method for removing noise in the preprocessing stage. Various proposed algorithms require prior knowledge of frequency spectrum of the signal under consideration in order to select the wavelet detail coefficients in the reconstruction process. Moreover, parameter fine tuning is generally involved in threshold selection to accomplish high detection accuracy. As a result, it may be difficult to utilize these methods for general ECG data sets. Accordingly, we propose an automatic and parameter free method that optimally selects the appropriate detail components for wavelet reconstruction as well as the adaptive threshold. The proposed algorithm employs the analysis of probability density function of the processed ECG signal. The validation of the algorithm was performed over the MIT-BIH database and has produced an average sensitivity of 99.63% and specificity of 99.78% which is in the same range as the previously proposed approaches.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"18 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73620870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Separation of occluded leaves using direction field
Nicha Piemkaroonwong, U. Watchareeruetai
This paper proposes a method that separates the region of each leaf from an image of occluded leaves and produces a set of single-leaf images as output. To identify the region of a single leaf, intersection points and a direction field are required. An intersection point, defined as a concave point between leaves, is used as the starting position of the leaf estimation process. The direction field, which describes the average direction of edges in a local area, is used to guide the estimation process. The leaf separation process then applies the result of leaf estimation to create the output. Experimental results show that 71.23% of the test leaf images were correctly separated from each other, with a segmentation accuracy of 88.80%.
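The direction field described in the abstract (the average edge direction in a local area) can be approximated with a structure tensor. The sketch below assumes OpenCV and NumPy and covers only that step; the concave-point detection and leaf estimation stages are not reproduced.

```python
import cv2
import numpy as np

def direction_field(gray, win=15):
    # Image gradients.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    # Structure tensor entries, averaged over a local window so the result
    # reflects the neighborhood rather than a single pixel.
    jxx = cv2.boxFilter(gx * gx, -1, (win, win))
    jxy = cv2.boxFilter(gx * gy, -1, (win, win))
    jyy = cv2.boxFilter(gy * gy, -1, (win, win))
    # Dominant gradient orientation per pixel; edges run perpendicular to it.
    return 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)

# theta = direction_field(cv2.imread("leaves.png", cv2.IMREAD_GRAYSCALE))
```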
{"title":"Separation of occluded leaves using direction field","authors":"Nicha Piemkaroonwong, U. Watchareeruetai","doi":"10.1109/JCSSE.2017.8025929","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025929","url":null,"abstract":"This paper proposes a method that separates the region of each leaf from an image of occluded leaves and produces a set of single-leaf images as an output. To identify the region of a single leaf, intersection points and direction field are required. An intersection point, which is defined as a concave point between leaves, is used as the starting position of leaf estimation process. Direction field, which describes the average direction of edges in a local area, is used to guide the estimation process. Leaf separation process applies the result of leaf estimation process to create an output. Experimental results show that 71.23% of testing leaf images were correctly separated from each other with a segmentation accuracy of 88.80%.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"143 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78589159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Modeling realistic virtual pulse of radial artery pressure waveform using haptic interface
Moragot Kandee, P. Boonbrahm, Valla Tantayotai
This paper investigates the use of various waveforms generated by a mathematical model on a haptic device. Realistic virtual pulse measurement and diagnosis can be performed using a haptic device with model-generated pulse waveforms, an Augmented Reality (AR) environment, and a mannequin. The aim of this work is to propose a mathematical model for generating pulse patterns for different types of abnormal pulse waves and to test them on the Phantom Omni device in an AR environment. Radial arterial waveforms were generated by setting pulse parameters and superimposing sine waves to produce new waveforms representing various diseases. The system can simulate the radial arterial pulse waves of several diseases. This modeling technique can be used to train nursing or health sciences students to classify the types of disease associated with a pulse waveform.
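A minimal sketch of the superimposed-sine idea for synthesizing a radial pulse waveform, using only NumPy; the harmonic amplitudes, phases, heart rate, and pressure scaling below are illustrative values, not the paper's fitted parameters.

```python
import numpy as np

def radial_pulse(t, hr_bpm=72, harmonics=((1.0, 0.0), (0.45, 1.2), (0.18, 2.4))):
    """Superimpose sinusoids at multiples of the heart rate.

    harmonics: (amplitude, phase) per harmonic; illustrative values only.
    """
    f0 = hr_bpm / 60.0  # beats per second
    return sum(a * np.sin(2 * np.pi * (k + 1) * f0 * t + phi)
               for k, (a, phi) in enumerate(harmonics))

t = np.linspace(0.0, 5.0, 5000)           # 5 s sampled at 1 kHz
pressure = 80.0 + 20.0 * radial_pulse(t)  # scaled into a mmHg-like range
```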
{"title":"Modeling realistic virtual pulse of radial artery pressure waveform using haptic interface","authors":"Moragot Kandee, P. Boonbrahm, Valla Tantayotai","doi":"10.1109/JCSSE.2017.8025954","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025954","url":null,"abstract":"This paper shows an investigation of the ability of using various waveform generated by mathematical model on haptic device. Realistic virtual pulse measurement and diagnostic can be done using haptic device with pulse generated waveform, Augmented Reality (AR) environment and mannequin. The aim of this work is to propose a mathematical model for generating pulse pattern in different type of abnormal pulse waves and test them on the Phantom Omni device under AR environment. The radial arterial waveforms were generated by the setting of pulse parameters and superimposed sine waves to make the new waveforms representing various diseases. The system can simulate the radial arterial pulse waves of some diseases. This modeling technique can be used in training the nursing or health sciences students on the ability to classify various type of diseases that related to the pulse waveform.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"115 1","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79344052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Distributed consensus-based Sybil nodes detection in VANETs
Chea Sowattana, Wantanee Viriyasitavat, A. Khurat
Vehicular Ad-hoc Networks (VANETs) are a research area focusing on improving road safety and traffic management. However, VANETs remain vulnerable to various security attacks because of their infrastructure-less networking. The Sybil attack is a well-known attack on VANETs: it forges multiple nodes with different identities to broadcast fake messages that manipulate road traffic and information. In this paper, we propose a distributed detection mechanism using neighborhood information. In our approach, a node is considered a Sybil node if its position lies inside the intersection of the communication areas of two nodes but it is not acknowledged by one of them. Each vehicle periodically exchanges information about its neighbors via beacon messages. The neighbor information received from each neighbor is used to vote on whether each of the receiver's neighbors is a Sybil node. Simulations on different test cases were performed to observe the performance of our algorithm in terms of detection rate and false positive rate. The results show an increased detection rate in scenarios where the number of surrounding neighbors is high.
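A sketch of the geometric consistency check at the core of the vote, under simplifying assumptions that the paper does not necessarily make: unit-disc radios with a fixed range R and truthful position reports.

```python
import math

R = 300.0  # assumed fixed communication range in meters

def in_range(p, q):
    return math.dist(p, q) <= R

def vote_sybil(claimed_pos, pos_a, pos_b, neighbors_a, neighbors_b, node_id):
    """Vote that node_id is a Sybil node if its claimed position lies inside
    the intersection of A's and B's ranges, yet one of them never heard it."""
    inside_both = in_range(claimed_pos, pos_a) and in_range(claimed_pos, pos_b)
    heard_by_both = node_id in neighbors_a and node_id in neighbors_b
    return inside_both and not heard_by_both

# Example: a node claiming (100, 0) that B never heard draws a Sybil vote.
print(vote_sybil((100.0, 0.0), (0.0, 0.0), (150.0, 0.0),
                 neighbors_a={"v42"}, neighbors_b=set(), node_id="v42"))
```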
{"title":"Distributed consensus-based Sybil nodes detection in VANETs","authors":"Chea Sowattana, Wantanee Viriyasitavat, A. Khurat","doi":"10.1109/JCSSE.2017.8025908","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025908","url":null,"abstract":"Vehicular Ad-hoc Networks (VANETs) is a research area focusing on improving road safety and traffic management. However, VANETs are still vulnerable to different kind of security attacks due to its infrastructure-less networking. Sybil Attack is a well-known attack in VANET. It forges multiple nodes with different identities to broadcast fake messages to manipulate the road traffic and information. In this paper, we propose a distributed detection mechanism using the neighborhood information. In our approach, a node is considered as a Sybil node if its position is inside the intersected area of two communication nodes, but it does not acknowledge by one of them. Each vehicle exchanges the information of their neighbors periodically via beacon message. The received neighbor information, from each neighbor, will be used to vote on each of the receiver node's neighbor whether they are Sybil. Simulation on different test cases are performed to observe the performance of our algorithm in term of its detection rate and false positive rate. The result depicts the increase of detection rate in the scenario where the number of surrounding neighbors is high.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"30 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84199587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
The classification of sets of medical procedures used in the treatment of Diabetes and/or Hypertension
Kanokwan Rungreangsuparat, S. Kitisin, K. Sripanidkulchai
Advances in storage technology make it easy to store large volumes of data. To make stored data widely usable, data storage standards have been established. The World Health Organization defines codes for standard medical procedures covering all treatments, without classifying the procedures by disease. The selection of medical procedures is based on a patient's symptoms. Therefore, if the sets of medical procedures can be identified, we may infer the patient's diseases, or the information can be used in disease surveillance. In addition, diabetes and hypertension are silent killers that threaten many Thai people and lead to many serious diseases. This research identified sets of medical procedures related to diabetes and/or hypertension using the C4.5 and Naive Bayes algorithms. The results showed that C4.5 identified sets of medical procedures related to diabetes and/or hypertension more effectively than the Naive Bayes algorithm.
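A sketch of the comparison on multi-hot procedure vectors, assuming scikit-learn. Note that scikit-learn's entropy-criterion decision tree is CART-based and only approximates C4.5, and the data here is random placeholder input, not real treatment records.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# X: one row per visit, one 0/1 column per standard procedure code (placeholder).
X = rng.integers(0, 2, size=(500, 40))
# y: 1 if the visit relates to diabetes and/or hypertension (placeholder labels).
y = rng.integers(0, 2, size=500)

for name, clf in [("entropy tree (C4.5-like)",
                   DecisionTreeClassifier(criterion="entropy")),
                  ("Naive Bayes", BernoulliNB())]:
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```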
{"title":"The classification of sets of medical procedures used in the treatment of Diabetes and/or Hypertension","authors":"Kanokwan Rungreangsuparat, S. Kitisin, K. Sripanidkulchai","doi":"10.1109/JCSSE.2017.8025939","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025939","url":null,"abstract":"The advancement of technology to support data storage is easy to store with large volumes of data. In order to make data storage to be used extensively, the standard of data storage is formed. The World Health Organization defines numbers of the standard medical procedures to cover the all treatments without classifying any medical procedures by diseases. The selection of medical procedures is based on a patient's symptoms. Therefore, if the sets of medical procedures can identified, we may know the diseases of the patient or it can be used in disease surveillance. In addition, diabetes and hypertension are silent killers that have been threatening numbers of Thai people and also lead to many serious diseases. This research identified sets of medical procedures related to diabetes and/or hypertension using C4.5 and Naive Bayes algorithms. The results showed that C4.5 could identify sets of medical procedures related to Diabetes and/or Hypertension more effectively than the Naive Bayes algorithm.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"68 1","pages":"1-7"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76290406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Vertebral pose segmentation on low radiation image using Convergence Gravity Force
Jakapong Boonyai, Suwanna Rasmequan
Vertebral pose segmentation is an important factor in diagnosing diseases such as osteoporosis, osteopenia, and scoliosis. Low-radiation X-ray images are often used to diagnose such diseases in order to reduce patients' exposure to the cumulative radiation dose of a series of treatments, but this lowers the accuracy of vertebral pose detection. In this paper, we propose a more generalized technique to improve the automated segmentation of vertebral poses in low-quality images. The proposed method has three main steps. First, in the preprocessing step, auto-cropping, multi-thresholding, and Canny edge detection are applied to extract the vertebral bone structure from the original image. Second, feature analysis and a gravity force are used to find the region of interest for each pose. Finally, colormaps, intensity diagnosis, and angle analysis are adopted to segment each vertebral pose from the candidate areas retrieved in the second step. Experimental results compared with ground truth show that the proposed approach can estimate vertebral poses with a precision of 79.61% and a recall of 77.11%.
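A sketch of the preprocessing step only (auto-crop, multi-threshold, Canny), assuming OpenCV and NumPy; the crop heuristic, intensity bands, and Canny thresholds are illustrative, and the later gravity-force and angle-analysis stages are not reproduced.

```python
import cv2
import numpy as np

def preprocess(xray_gray):
    # Auto-crop to the bright content via Otsu thresholding (a crude stand-in).
    _, mask = cv2.threshold(xray_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    x, y, w, h = cv2.boundingRect(mask)
    roi = xray_gray[y:y + h, x:x + w]
    # Multi-threshold: quantize intensities into a few bands to separate bone.
    bands = np.digitize(roi, bins=[60, 120, 180]).astype(np.uint8) * 85
    # Canny edges over the band image expose the vertebral outlines.
    return cv2.Canny(bands, 50, 150)
```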
{"title":"Vertebral pose segmentation on low radiation image using Convergence Gravity Force","authors":"Jakapong Boonyai, Suwanna Rasmequan","doi":"10.1109/JCSSE.2017.8025959","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025959","url":null,"abstract":"Vertebral pose segmentation is an important factor in diagnosing diseases such as osteoporosis, osteopenia and scoliosis. Low radiation X-ray images are often used to diagnose such diseases. This has been done to reduce patients risk exposure of over dose radiation which may cause from a series of treatments. In this respect, it led to a low accuracy in vertebral pose detection. In this paper, we proposed to improve the automate segmentation of low quality image of vertebral pose with a more generalized technique. In the proposed method, there are three main steps. Firstly, in the pre-processing step, Auto Cropped, Multi-Threshold and Canny Edge Detection are applied to find the vertebral bone structure from the original image. Secondly, Feature Analysis and Gravity Force were used to find the region of interest or the area of each pose. Finally, Colormaps, Intensity Diagnosis and Angle Analysis are adopted to segment each vertebral pose from candidate areas retrieved from second step. The experimental results which were compared with ground truth shown that the proposed approach can estimate vertebral pose with Precision at 79.61% and Recall at 77.11%.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"16 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87688592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Extracting UML class diagrams from software requirements in Thai using NLP
Mathawan Jaiwai, Usa Sammapun
In software development, requirements, normally written in natural language, are documents that specify what users want in software products. Software developers analyze these requirements to create domain models, represented as UML diagrams, in order to understand what users need. These domain models are usually converted into design models and finally carried over into classes in source code, so domain models have an impact on the final software product. However, creating correct domain models can be difficult for unskilled developers, and even for skilled developers, wading through a large set of requirements to create domain models takes time and can introduce errors. Researchers have therefore studied various approaches that apply natural language processing techniques to transform requirements written in natural language into UML diagrams, but that work focuses on requirements written in English. This paper proposes an approach that processes requirements written in Thai to extract UML class diagrams using natural language processing techniques. The extraction is based on transformation rules that identify classes and attributes from requirements. The results are evaluated in terms of recall and precision against ground truth created by humans. Future work includes identifying operations and relationships from requirements to complete the class diagram extraction. Our research should benefit Thai software developers by reducing requirement analysis time and by helping novice developers create correct domain models represented as UML class diagrams.
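As a toy illustration of rule-based class/attribute extraction (using an English pattern as a stand-in, since the paper's transformation rules target Thai text), the sketch below maps "X has Y" sentences to classes and attributes; the pattern and sample sentences are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical 'owner has attribute' pattern; the paper's transformation
# rules operate on Thai text and are richer than this.
HAS = re.compile(r"(\w+) has (?:an |a |the )?(\w+)", re.I)

def extract_classes(requirements):
    model = defaultdict(list)  # class name -> attribute names
    for sentence in requirements:
        for owner, owned in HAS.findall(sentence):
            model[owner.capitalize()].append(owned.lower())
    return dict(model)

print(extract_classes(["Each customer has a name.",
                       "An order has a price."]))
# {'Customer': ['name'], 'Order': ['price']}
```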
{"title":"Extracting UML class diagrams from software requirements in Thai using NLP","authors":"Mathawan Jaiwai, Usa Sammapun","doi":"10.1109/JCSSE.2017.8025938","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025938","url":null,"abstract":"In software development, requirements, normally written in natural language, are documents that specify what users want in software products. Software developers then analyze these requirements to create domain models represented in UML diagrams in an attempt to comprehend what users need in the software products. These domain models are usually converted into design models and finally carried over into classes in source code. Thus, domain models have an impact on the final software products. However, creating correct domain models can be difficult when software developers are not skilled. Moreover, even for skilled developers, when requirements are large, wading through all requirements to create domain models can take times and might result in errors. Therefore, researchers have studied various approaches to apply natural language processing techniques to transform requirements written in natural language into UML diagrams. Those researches focus on requirements written in English. This paper proposes an approach to process requirements written in Thai to extract UML class diagrams using natural language processing techniques. The UML class diagram extraction is based on transformation rules that identify classes and attributes from requirements. The results are evaluated with recall and precision using truth values created by humans. Future works include identifying operations and relationships from requirements to complete class diagram extraction. Our research should benefit Thai software developers by reducing time in requirement analysis and also helping novice software developers to create correct domain models represented in UML class diagram.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"15 1","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89596192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
A performance comparison of Apache Tez and MapReduce with data compression on Hadoop cluster
Kritwara Rattanaopas
Big data is a popular topic in cloud computing research. The main characteristics of big data are volume, velocity, and variety, which are difficult to handle with traditional software and methods. Hadoop is open-source framework software developed to handle several domains of big data problems. For big data analytics, the MapReduce framework is the main engine of a Hadoop cluster and is widely used today; it performs batch-oriented processing. Apache also developed an alternative engine called Tez, which supports interactive queries and does not write temporary data to HDFS. In this paper, we focus on the performance comparison between MapReduce and Tez, and we investigate the performance of the two engines with compression of input files and map output files. Bzip2 is used to compress input files and snappy to compress map output files. Word-count and terasort benchmarks are used in our experiments. For the word-count benchmark, the results show that the Tez engine always has a better execution time than the MapReduce engine for both compressed and uncompressed data, reducing execution time by up to 39% compared with MapReduce. In contrast, for the terasort benchmark the Tez engine usually has an execution time up to 13% higher than MapReduce. The results also show that compressing map output files with snappy improves execution time in both benchmarks.
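As a sketch of how one such run could be scripted, assuming a configured Hadoop client, the snippet below submits the standard word-count example with snappy-compressed map output; the -D properties are standard MapReduce settings, while the jar filename and paths are placeholders.

```python
import subprocess

# Submit the stock word-count example with snappy map-output compression.
# Bzip2-compressed input is detected by its file extension; the jar name
# and HDFS paths below are placeholders for a real deployment.
cmd = [
    "hadoop", "jar", "hadoop-mapreduce-examples.jar", "wordcount",
    "-D", "mapreduce.map.output.compress=true",
    "-D", "mapreduce.map.output.compress.codec="
          "org.apache.hadoop.io.compress.SnappyCodec",
    "/data/wordcount-input.bz2", "/data/wordcount-output",
]
subprocess.run(cmd, check=True)
```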
{"title":"A performance comparison of Apache Tez and MapReduce with data compression on Hadoop cluster","authors":"Kritwara Rattanaopas","doi":"10.1109/JCSSE.2017.8025950","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025950","url":null,"abstract":"Big data is a popular topic on cloud computing research. The main characteristics of big data are volume, velocity and variety. These characteristics are difficult to handle by using traditional softwares and methods. Hadoop is open-source framework software which was developed to provide solutions for handling several domains of big data problems. For big data analytic, MapReduce framework is a main engine of Hadoop cluster and widely used nowadays. It uses a batch oriented processing. Apache also developed an alternative engine called “Tez”. It supports an interactive query and does not write temporary data into HDFS. In this paper, we focus on the performance comparison between MapReduce and Tez. We also investigate the performance of these two engines with the compression of input files and map output files. Bzip is a compression algorithm used for input files and snappy is used for map output files. Word-count and terasort benchmarks are used in our experiments. For the word-count benchmark, the results show that Tez engine always has better execution-time than MapReduce engine for both of compressed data or non-compressed data. It can reduce an execution-time up to 39% comparing with the execution time of MapReduce engine. In contrast, the results show that Tez engine usually has higher execution-time than MapReduce engine up to 13% for terasort benchmark. The results also show that the performance of compressing map output files with snappy provides better performance on execution time for both benchmarks.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"42 1","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90120331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Strabismus screening by Eye Tracker and games
Udomchai Saisara, P. Boonbrahm, Achara Chaiwiriya
More than 5% of Thai people have strabismus. Strabismus is known as being cross-eyed or wall-eyed because the visual axes of the two eyes are not parallel. Amblyopia is a cause of strabismus in children, and strabismus can be completely cured if screening is performed at an early stage. Current strabismus screening includes methods such as the Hirschberg test, the cover test, and the Krimsky test. Screening children for strabismus is difficult and takes a long time in a special examination room. This research develops a computer system to assist strabismus screening by combining computer games with eye tracking devices, so that screening results are more accurate and exact. This screening technique requires less time and is easy to use, so it improves efficiency and reduces the time needed for strabismus screening.
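Purely as an illustration of what game-plus-eye-tracker screening could compute, the sketch below flags samples where the two eyes' reported gaze points disagree beyond a tolerance while the player fixates a game target; the sample format, field names, and threshold are all assumptions, not the paper's method.

```python
import math

TOLERANCE_PX = 40  # assumed screen-space disagreement allowed between the eyes

def misalignment_rate(samples):
    """samples: iterable of dicts with 'left' and 'right' (x, y) gaze points
    recorded while the player fixates a known game target."""
    flags = []
    for s in samples:
        dx = s["left"][0] - s["right"][0]
        dy = s["left"][1] - s["right"][1]
        flags.append(math.hypot(dx, dy) > TOLERANCE_PX)
    return sum(flags) / len(flags)  # fraction of misaligned samples

print(misalignment_rate([{"left": (500, 300), "right": (505, 298)},
                         {"left": (500, 300), "right": (620, 310)}]))
```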
{"title":"Strabismus screening by Eye Tracker and games","authors":"Udomchai Saisara, P. Boonbrahm, Achara Chaiwiriya","doi":"10.1109/JCSSE.2017.8025956","DOIUrl":"https://doi.org/10.1109/JCSSE.2017.8025956","url":null,"abstract":"More than 5% of Thai people have strabismus. Strabismus is known as cross-eyed or wall-eyed because the visual field angle of two eyes is not parallel. The amblyopia disease is the cause of strabismus in kids. Strabismus can be completely cured if the strabismus screening can be made in early stage. Currently, strabismus screening includes methods such as Hirschberg test, cover test and Krimsky test, and etc. The strabismus screening in kids is difficult and takes a lot time in special room. This research intend to develop a computer system to assist strabismus screening using the combination of computer games and eye tracking devices so that the screening results will be more accurate and exact. This screening technique requires shorter time and it is easy to use, so it is better in terms of efficiency and reducing time for strabismus screening.","PeriodicalId":6460,"journal":{"name":"2017 14th International Joint Conference on Computer Science and Software Engineering (JCSSE)","volume":"14 1","pages":"1-5"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82921637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7