
Proceedings. Data Compression Conference: Latest Articles

Design and implementation of an OpenFlow hardware abstraction layer
Pub Date : 2014-08-18 DOI: 10.1145/2627566.2627577
D. Parniewicz, R. D. Corin, L. Ogrodowczyk, Mehdi Rashidi-Fard, J. Matías, M. Gerola, Victor Fuentes, U. Toseef, A. Zaalouk, B. Belter, E. Jacob, K. Pentikousis
OpenFlow is a leading standard for Software-Defined Networking (SDN) and has already played a significant role in reshaping network infrastructures. However, many existing provider domains still lack a framework that supports deployment of an OpenFlow-based control plane beyond Ethernet-dominated networks. We address this gap by introducing a Hardware Abstraction Layer (HAL) that can transform legacy network elements into OpenFlow-capable devices. This paper details the functional architecture of HAL, discusses the key design aspects, and explains how HAL can support a number of network device classes. In addition, it presents the implementation details of HAL for hardware platforms such as DOCSIS (Data over Cable Service Interface Specification) and DWDM (Dense Wavelength Division Multiplexing), which have so far received little attention from the OpenFlow research community despite their wide real-world deployment.
Pages: 71-76
Citations: 33
Response time-optimized distributed cloud resource allocation
Pub Date : 2014-08-18 DOI: 10.1145/2627566.2627570
Matthias Keller, H. Karl
In the near future, many more compute resources will be available at different geographical locations. To minimize the response time of requests, application servers closer to the user can be used to shorten network round trip times. However, this advantage is neutralized if the chosen data centre is highly loaded, because the processing time of requests matters as well. We model the request response time as the network round trip time plus the processing time at a data centre. We present a capacitated facility location problem formalization in which the processing time is modelled as the sojourn time of a queueing model. We discuss the Pareto trade-off between the number of data centres used and the resulting response time. For example, using fewer data centres could cut expenses but results in high utilization, high response time, and smaller revenues. Previous work presented a non-linear cost function. We prove its convexity and exploit this property in two ways: first, we transform the convex model into a linear model while controlling the maximum approximation error; second, we use a convex solver instead of a slower non-linear solver. Numerical results on network topologies exemplify our work.
Pages: 47-52
Citations: 21
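The response-time model in the abstract above (network round trip time plus sojourn time at the data centre) can be illustrated with a minimal sketch. The abstract does not say which queueing model is used, so the M/M/1 queue here is an assumption, and the function names and all numeric values are hypothetical.

```python
def sojourn_time(arrival_rate: float, service_rate: float) -> float:
    """Mean sojourn time (waiting plus service) of an M/M/1 queue.

    Assumption: M/M/1 is used purely for illustration; the abstract
    only says "a queueing model".
    """
    if arrival_rate >= service_rate:
        return float("inf")  # the queue is unstable at or above full load
    return 1.0 / (service_rate - arrival_rate)


def response_time(rtt: float, arrival_rate: float, service_rate: float) -> float:
    """Response time = network round trip time + processing (sojourn) time."""
    return rtt + sojourn_time(arrival_rate, service_rate)


# Hypothetical numbers: seconds for RTT, requests per second for rates.
near_loaded = response_time(rtt=0.010, arrival_rate=95.0, service_rate=100.0)
far_idle = response_time(rtt=0.050, arrival_rate=20.0, service_rate=100.0)

print(f"near but loaded data centre: {near_loaded:.4f} s")
print(f"far but lightly loaded data centre: {far_idle:.4f} s")
```

With these made-up inputs the nearby data centre is slower overall, which is the effect the abstract describes: a short round trip time is neutralized when the data centre is highly loaded.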
Achieving efficient and fast update for multiple flows in software-defined networks
Pub Date : 2014-08-18 DOI: 10.1145/2627566.2627572
Yujie Liu, Yong Li, Yue Wang, A. Vasilakos, Jian Yuan
Flow updates are carried out frequently in Software-Defined Networks (SDN) to change the data plane configuration in order to adapt to traffic dynamics, deal with network errors, perform planned maintenance, and so on. How to update flows efficiently and successfully is an important and challenging problem. In this work, we address the multi-flow update problem and present a polynomial-time heuristic algorithm that aims to complete the update in the shortest time under link bandwidth and flow table size constraints. Through extensive simulations under real network settings, we demonstrate the effectiveness and efficiency of our algorithm, which has near-optimal performance and is hundreds of times faster than computing the optimal solution.
Pages: 77-82
Citations: 9
Modeling and simulation of concurrent workload processing in cloud-distributed enterprise information systems
Pub Date : 2014-08-18 DOI: 10.1145/2627566.2627575
Alexandru-Florian Antonescu, T. Braun
Cloud Computing enables provisioning and distribution of highly scalable services in a reliable, on-demand and sustainable manner. Distributed Enterprise Information Systems (dEIS) are a class of applications with important economic value and with strong requirements in terms of performance and reliability. In order to validate dEIS architectures, stability, scaling and SLA compliance, large testing deployments are necessary, adding complexity to the design and testing of such systems. To fill this gap, we present and validate a methodology for modeling and simulating such complex distributed systems using the CloudSim cloud computing simulator, based on measurement data from an actual distributed system. We present an approach for creating a performance-based model of a distributed cloud application using recorded service performance traces. We then show how to integrate the created model into CloudSim. We validate the CloudSim simulation model by comparing performance traces gathered during distributed concurrent experiments with simulation results using different VM configurations. We demonstrate the usefulness of using a cloud simulator for modeling properties of real cloud-distributed applications.
Pages: 11-16
Citations: 11
Capacity of inter-cloud layer-2 virtual networking
Pub Date : 2014-08-18 DOI: 10.1145/2627566.2627573
Yufeng Xin, I. Baldin, Chris Heermann, A. Mandal, P. Ruth
Owing to the economy of scale of Ethernet networks and the dynamic circuit capability available from the major national research and education networks, VLAN (Virtual LAN) based virtual networking solutions have been successfully adopted in some advanced distributed cloud systems. However, this adoption faces two major constraints: (1) dynamic circuit service is far from pervasive; (2) regional network service providers offer only a limited number of VLAN tags. In this paper, after examining layer-2 networking in large-scale distributed cloud environments, we present a graph theoretical model to study the network capacity in terms of the number of inter-cloud connections that can co-exist. We further design algorithms that achieve this capacity for both point-to-point and multi-point inter-cloud connections in both static and dynamic scenarios. We also study a general topology embedding problem based on this model. As tagging is a common mechanism for isolating communication channels in other network layers, the proposed models and algorithms can be extended to optical and IP networks.
Pages: 31-36
Citations: 3
Optimizing job reliability via contention-free, distributed scheduling of vm checkpointing
Pub Date : 2014-08-18 DOI: 10.1145/2627566.2627568
Yu Xiang, Hang Liu, Tian Lan, H. H. Huang, S. Subramaniam
Checkpointing a virtual machine (VM) is a proven technique for improving reliability in modern datacenters. Inspired by the CSMA protocol in wireless congestion control, we propose a novel framework for distributed and contention-free scheduling of VM checkpointing to offer reliability as a transparent, elastic service in datacenters. In this work, we quantify the reliability in closed form by studying system stationary behaviors, and maximize the job reliability through utility optimization. We implement a proof-of-concept prototype based on our design. Evaluation results show that the proposed checkpoint scheduling can significantly reduce the performance interference from checkpointing and improve reliability by as much as one order of magnitude over a contention-oblivious scheme.
Pages: 59-64
Citations: 2
A decomposition-based architecture for distributed virtual network embedding
Pub Date : 2014-03-14 DOI: 10.1145/2627566.2627569
Flavio Esposito, I. Matta
Network protocols have historically been developed on an ad-hoc basis, and cloud computing is no exception. A fundamental management protocol, not yet standardized, that cloud providers need to run to support wide-area virtual network services is the virtual network (VN) embedding protocol. In this paper, we use decomposition theory to provide a unifying architecture for the VN embedding problem. We show how our architecture subsumes existing solutions, and how it can be used by cloud providers to design a distributed VN embedding protocol that adapts to different scenarios, by merely instantiating different decomposition policies. We analyze key representative tradeoffs via simulation, and with our VN embedding testbed that uses a Linux system architecture to reserve virtual node and link capacities. In contrast with existing VN embedding solutions, we found that partitioning a VN request not only increases the signaling overhead, but may decrease cloud providers' revenue.
Pages: 53-58
Citations: 11
An Adaptive Difference Distribution-based Coding with Hierarchical Tree Structure for DNA Sequence Compression.
Pub Date : 2013-01-01 Epub Date: 2013-03-22 DOI: 10.1109/DCC.2013.45
Wenrui Dai, Hongkai Xiong, Xiaoqian Jiang, Lucila Ohno-Machado

Previous reference-based compression methods for DNA sequences do not fully exploit the intrinsic statistics because they consider only approximate matches. In this paper, an adaptive difference distribution-based coding framework is proposed that operates on nucleotide fragments organized in a hierarchical tree structure. To keep the distribution of the difference sequence between the reference and target sequences concentrated, the sub-fragment size and the matching offset used for prediction adapt to the stepped size structure. Matching against approximate repeats in the reference uses a Hamming-like weighted distance measure within a local region close to the current fragment, so that the accuracy of matching and the overhead of describing the matching offset can be balanced. A well-designed coding scheme compacts both the difference sequence and the additional parameters, e.g. sub-fragment size and matching offset. Experimental results show that the proposed scheme achieves 150% compression improvement in comparison with the best reference-based compressor GReEn.

Pages: 371-380
Citations: 5
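The local matching step described in the abstract above (a Hamming-like weighted distance evaluated in a region close to the current fragment) might look roughly like the sketch below. The abstract does not define the weights, the window, or the tie-breaking rule, so uniform weights, a symmetric search radius, and a preference for the smallest offset are all assumptions, and every name is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def weighted_hamming(a: str, b: str, weights: Sequence[float]) -> float:
    """Sum the weights of the positions where two equal-length strings differ."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("a, b and weights must have the same length")
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def best_local_match(reference: str, fragment: str,
                     center: int, radius: int) -> Tuple[int, float]:
    """Return (offset, distance) of the best match within `radius` of `center`.

    Assumptions (not from the paper): uniform weights, and ties broken in
    favour of the smallest absolute offset, since a small offset should be
    cheaper to describe.
    """
    n = len(fragment)
    weights = [1.0] * n  # assumption: every position weighs the same
    lo = max(0, center - radius)
    hi = min(len(reference) - n, center + radius)
    best: Optional[Tuple[int, float]] = None
    for start in range(lo, hi + 1):
        dist = weighted_hamming(reference[start:start + n], fragment, weights)
        offset = start - center
        if best is None or (dist, abs(offset)) < (best[1], abs(best[0])):
            best = (offset, dist)
    if best is None:
        raise ValueError("no candidate window fits inside the reference")
    return best


offset, dist = best_local_match("ACGTACGTTTGA", "ACGA", center=4, radius=4)
print(offset, dist)
```

Restricting the search to a local window keeps the offset small, which is one way to read the abstract's balance between matching accuracy and the overhead of describing the offset.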
Virtual Impression Networks for Capturing Deep Impressions
Pub Date : 2011-09-01 DOI: 10.1007/978-94-007-0510-4_30
T. Taura, E. Yamamoto, M. Y. N. Fasiha, Y. Nagai
Pages: 559-578
Citations: 20
A Redefinition of the Paradox of Choice
Pub Date : 2011-09-01 DOI: 10.1007/978-94-007-0510-4_19
M. Piasecki, S. Hanna
Pages: 347-366
Citations: 15