
2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC): Latest Papers

Resisting label-neighborhood attacks in outsourced social networks
Yang Wang, Fudong Qiu, Fan Wu, Guihai Chen
With the popularity of cloud computing, many companies outsource their social network data to cloud service providers, where privacy leaks have become an increasingly serious problem. However, most previous studies have ignored an important fact: in real social networks, users possess various attributes and can decide for themselves which attributes of their profiles are sensitive. These sensitive attributes should be protected from disclosure when a social network is outsourced to a cloud service provider. In this paper, we consider the problem of resisting privacy attacks that use neighborhood information, namely both the network structure and the labels of one-hop neighbors, as background knowledge. To tackle this problem, we propose a Global Similarity-based Group Anonymization (GSGA) method that generates an anonymized social network while maintaining as much utility as possible. We also extensively evaluate our approach on both a real data set and synthetic data sets. Evaluation results show that the social network anonymized by our approach can still be used to answer aggregation queries with high accuracy.
DOI: https://doi.org/10.1109/PCCC.2014.7017106 (published 2014-12-01)
Citations: 5
Let more nodes have a second choice
Haijun Geng, Xingang Shi, Xia Yin, Zhiliang Wang, Han Zhang, Jiangyuan Yao
Current intra-domain routing protocols compute only shortest paths for any pair of nodes, and therefore cannot provide good fast reroute when network failures occur. Multipath routing can be fundamentally more efficient than the single-path routing protocols currently in use. It can significantly reduce network congestion by shifting traffic to unused network resources, which improves network utilization and provides load balancing. To enhance failure resiliency, we propose a new scheme, More Nodes Have At Least Two Choices (MNTC), whose goal is to maximize the number of nodes that have at least two next-hops towards their destinations. We evaluate the algorithm over a wide range of relevant topologies, and the results show that it achieves good reliability while keeping path stretch low.
DOI: https://doi.org/10.1109/PCCC.2014.7017018 (published 2014-12-01)
Citations: 0
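The goal stated in the MNTC abstract above, counting the nodes that have at least two next-hops towards a destination, can be illustrated with a small sketch. The abstract does not say how MNTC selects next-hops, so this is not the authors' algorithm: it assumes the well-known downstream criterion (a neighbor qualifies as a loop-free next-hop if it is strictly closer to the destination), and the graph layout and function names are hypothetical.

```python
import heapq


def shortest_distances(graph, dest):
    """Dijkstra from dest over an undirected graph given as {node: {neighbor: cost}}."""
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def nodes_with_two_next_hops(graph, dest):
    """Nodes that have at least two neighbors strictly closer to dest.

    Assumption: the downstream criterion stands in for the paper's
    (unspecified) next-hop selection rule.
    """
    dist = shortest_distances(graph, dest)
    protected = []
    for node in graph:
        if node == dest or node not in dist:
            continue
        next_hops = [n for n in graph[node] if n in dist and dist[n] < dist[node]]
        if len(next_hops) >= 2:
            protected.append(node)
    return sorted(protected)


# Hypothetical four-node ring a-b-d-c-a with unit link costs.
ring = {
    "a": {"b": 1, "c": 1},
    "b": {"a": 1, "d": 1},
    "c": {"a": 1, "d": 1},
    "d": {"b": 1, "c": 1},
}
print(nodes_with_two_next_hops(ring, "d"))  # ['a']
```

In this ring only node a has a second choice towards d; b and c each have a single neighbor that is closer to d, which is the kind of node a scheme like MNTC aims to cover.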
Parallelization of tree-to-TLV serialization
Makoto Nakayama, K. Yamazaki, Satoshi Tanaka, H. Kasahara
A serializer/deserializer (SerDe) is needed to serialize a data object into a byte array and to deserialize it in the reverse direction. A fast SerDe that is used worldwide is the Protocol Buffer (ProtoBuf), which serializes a tree-structured data object into the Type-Length-Value (TLV) format. Accelerating SerDe processing is beneficial because SerDes are used in various fields. This paper proposes a new method that accelerates tree-to-TLV serialization through 2-way parallel processing, called “parallelized serialization” and “parallelization with streaming”. Experimental results show that parallelized serialization with 4 worker threads achieves a 1.97-fold shorter serialization time than a single worker thread, and that the combination of the 2-way parallel processing achieves a 2.11-fold shorter output time than ProtoBuf when 4 worker threads, FileOutputStream, and trees of 10,080 container nodes are used.
DOI: https://doi.org/10.1109/PCCC.2014.7017059 (published 2014-12-01)
Citations: 1
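To make the tree-to-TLV idea in the abstract above concrete, here is a minimal sketch under stated assumptions. The framing (1-byte type, 4-byte big-endian length) is a simplification chosen for illustration and is not the ProtoBuf wire format, which encodes tags and lengths as varints. The node representation and function names are hypothetical, and the thread-pool variant only illustrates the structure of serializing subtrees independently; it is not the paper's method, and CPython threads would not reproduce the reported speedups.

```python
import struct
from concurrent.futures import ThreadPoolExecutor


def serialize(node):
    """Serialize (type_id, payload) into TLV bytes.

    payload is bytes for a leaf, or a list of child nodes for a container.
    Assumed framing: 1-byte type, 4-byte big-endian length, then the value.
    """
    type_id, payload = node
    if isinstance(payload, bytes):
        value = payload
    else:
        # A container's value is the concatenation of its children's TLVs.
        value = b"".join(serialize(child) for child in payload)
    return struct.pack(">BI", type_id, len(value)) + value


def serialize_parallel(node, workers=4):
    """Serialize the top-level children in a thread pool, then frame the result."""
    type_id, payload = node
    if isinstance(payload, bytes):
        return serialize(node)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the output matches serialize().
        parts = list(pool.map(serialize, payload))
    value = b"".join(parts)
    return struct.pack(">BI", type_id, len(value)) + value


# Hypothetical tree: type 1 marks containers, type 2 marks byte-string leaves.
tree = (1, [(2, b"ab"), (1, [(2, b"c")])])
assert serialize_parallel(tree) == serialize(tree)
print(serialize(tree).hex())
```

The sketch also shows why the container length is the awkward part: a parent's header cannot be written until the sizes of all its subtrees are known, which is what makes independent subtree serialization a natural unit of parallel work.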
Virtual data center allocation with dynamic clustering in clouds
Li Shi, D. Katramatos, Dantong Yu
Clouds are widely used for leasing resources to users in the form of on-demand virtual data centers, which comprise sets of virtual machines interconnected by sets of virtual links. Given a user request for a virtual data center with specific resource requirements, a critical problem is to select a set of servers and links in the physical data center of a cloud to satisfy the request in a manner that minimizes the amount of reserved resources. In this paper, we study the main aspects of this Virtual Data Center Allocation (VDCA) problem and decompose it into three subproblems: virtual data center clustering, virtual machine allocation, and virtual link allocation. We prove the NP-hardness of VDCA and propose an algorithm that solves the problem by dynamically clustering the requested virtual data center and jointly optimizing virtual machine and virtual link allocation. Through simulations, we further compare the performance and scalability of the proposed algorithm with two existing algorithms, LoCo and SecondNet. We demonstrate that our algorithm generates 30%-200% more revenue than LoCo and 55%-300% more than SecondNet, while being up to 12 times faster.
DOI: https://doi.org/10.1109/PCCC.2014.7017105 (published 2014-12-01)
Citations: 2
Design and analysis of fault tolerance mechanism for sparrow
Wenzhuo Li, Chuang Lin
Big data processing frameworks are moving towards higher degrees of parallelism and shorter task durations in order to achieve lower response times. Scheduling highly parallel tasks that complete in around 100 milliseconds poses a major challenge for task schedulers. To meet this challenge, researchers have turned to decentralized frameworks to relieve the pressure on task schedulers, among which Sparrow is a good choice. However, little effort has been devoted to the fault tolerance of Sparrow, which does not handle worker failures and therefore leaves tasks incomplete. We present a fault tolerance mechanism named Heartbeat on Sparrow to handle failures of worker machines. Through simulation, we compare it with a simple mechanism. The results show that Heartbeat on Sparrow can detect worker failures faster and reschedule all failed tasks more efficiently, recovering tasks and states in sub-second time. We hope this mechanism will contribute to the fault tolerance of Sparrow and other decentralized designs.
DOI: https://doi.org/10.1109/PCCC.2014.7017054 (published 2014-12-01)
Citations: 2
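The general idea behind a heartbeat-based failure detector, as named in the abstract above, can be sketched as follows. The abstract gives no protocol details, so everything here is an assumption for illustration: the class name, the timeout value, and the rule that a worker is declared failed once its last heartbeat is older than the timeout. It is a generic sketch, not the Heartbeat on Sparrow design.

```python
class HeartbeatMonitor:
    """Tracks worker heartbeats and returns the tasks of timed-out workers."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a worker is declared failed
        self.last_seen = {}      # worker id -> time of last heartbeat
        self.running = {}        # worker id -> set of task ids placed on that worker

    def heartbeat(self, worker, now):
        self.last_seen[worker] = now

    def assign(self, worker, task):
        self.running.setdefault(worker, set()).add(task)

    def reap(self, now):
        """Drop workers that have been silent too long and return their tasks."""
        failed = [w for w, t in self.last_seen.items() if now - t > self.timeout]
        to_reschedule = []
        for worker in failed:
            to_reschedule.extend(sorted(self.running.pop(worker, set())))
            del self.last_seen[worker]
        return to_reschedule


# Hypothetical run with explicit timestamps (seconds) so it is deterministic.
monitor = HeartbeatMonitor(timeout=0.3)
monitor.heartbeat("w1", 0.0)
monitor.heartbeat("w2", 0.0)
monitor.assign("w1", "t1")
monitor.assign("w1", "t2")
monitor.assign("w2", "t3")
monitor.heartbeat("w2", 0.25)   # w2 keeps reporting, w1 goes silent
print(monitor.reap(0.4))        # ['t1', 't2']
```

The timeout is the main design knob in a detector of this kind: with tasks that finish in roughly 100 milliseconds, a shorter timeout recovers work sooner but raises the risk of declaring a slow worker dead.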
Measuring path divergence in the Internet
Nazim Ahmed, K. Saraç
Path divergence refers to a situation where the path from a source to an intermediate router on a source-to-destination path is not a prefix of that source-to-destination path. Studying path divergence helps us understand various operational characteristics of the underlying network. In this paper, we perform an active measurement study to observe the magnitude, causes, and types of path divergence in the Internet. We observe that most path divergence cases are caused by load-balancing routers, but policy-based inter-domain routing practices also contribute to divergence. We also observe that most routers causing path divergence are positioned in the backbone of the network, but routers closer to the sources cause a larger number of divergences. Combined with peering relationship data between neighboring domains, our study can also point out potential routing anomalies in the inter-domain routing process in the Internet. Finally, our techniques for tracing to intermediate routers can discover new IP addresses, routers, and Autonomous Systems (ASes), which can potentially help enrich topology mapping procedures and infer new peering relationships among ASes.
DOI: https://doi.org/10.1109/PCCC.2014.7017052 (published 2014-12-01)
Citations: 2
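The definition of path divergence given in the abstract above amounts to a prefix test on two hop lists, which the following sketch makes explicit. The hop identifiers and function names are hypothetical, and real traceroute output would first need alias resolution and handling of unresponsive hops, which are not modeled here.

```python
def diverges(path_to_intermediate, path_to_destination):
    """True if the path to the intermediate router is NOT a prefix of the path to the destination."""
    n = len(path_to_intermediate)
    return path_to_destination[:n] != path_to_intermediate


def first_differing_hop(path_to_intermediate, path_to_destination):
    """Index of the first hop where the two paths differ, or None if the shared part agrees."""
    for i, (a, b) in enumerate(zip(path_to_intermediate, path_to_destination)):
        if a != b:
            return i
    return None


# Hypothetical traces: router "r" lies on the source-to-destination path,
# but a trace aimed directly at "r" goes through "x" instead of "b".
to_destination = ["s", "a", "b", "r", "d"]
to_router_same = ["s", "a", "b", "r"]
to_router_diff = ["s", "a", "x", "r"]

print(diverges(to_router_same, to_destination))             # False
print(diverges(to_router_diff, to_destination))             # True
print(first_differing_hop(to_router_diff, to_destination))  # 2
```

The hop just before the first differing index (here "a") is the router at which the two paths split, so it is the natural place to look for a load balancer or a policy decision.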
Session-based access control in information-centric networks: Design and analyses
Yu Wang, Mingwei Xu, Zhenyang Feng, Qing Li, Qi Li
Information-Centric Networking (ICN) has been proposed recently to improve the efficiency of content delivery in current IP networks. ICN employs data names, instead of host addresses, as routing and forwarding indicators. By default, content in ICN carries only the signature of the content provider and does not contain the identity of the content consumer. Such information is, however, essential for many web applications, such as email, online social networking, online games, e-commerce, and other session-based web services. In this paper, we propose a session-based access control (SAC) mechanism for ICN to bridge this gap. Key distribution protocols are designed to protect the confidentiality of the content during information delivery. We also employ a dynamic naming scheme to enhance user privacy. According to our security analysis, the access control mechanism provides communication security and privacy protection for both sides of the session. Our design can be easily applied to session-based applications in ICN with negligible overhead.
DOI: https://doi.org/10.1109/PCCC.2014.7017094 (published 2014-12-01)
Citations: 17
Data analytics workloads: Characterization and similarity analysis
Reena Panda, L. John
The performance of modern computer systems depends greatly on the wide range of workloads that run on them. Thus, computer designers and researchers need a representative set of workloads, covering the different classes of real-world applications, for processor design-space evaluation studies. While a number of benchmark suites are available, a few common ones, such as the SPEC CPU2006 benchmarks, are widely used by researchers because of ease of setup or simulation time constraints. However, as popular benchmarks such as SPEC CPU2006 do not capture the characteristics of the wide variety of emerging real-world applications, using them as the basis for performance evaluation may lead to either suboptimal designs or misleading results. In this paper, we characterize the behavior of data analytics workloads, an important class of emerging applications, and perform a systematic similarity analysis against the popular SPEC CPU2006 and SPECjbb2013 benchmark suites. To characterize the workloads, we use hardware performance counter based measurements and a variety of extracted micro-architecture independent workload characteristics. Then, we use statistical data analysis techniques, namely principal component analysis and clustering, to analyze the similarity and dissimilarity among these different classes of applications. We demonstrate the inherent differences between the characteristics of the different classes of applications and show how to arrive at meaningful subsets of benchmarks, which will help in faster and more accurate targeted early evaluation of hardware system performance.
DOI: https://doi.org/10.1109/PCCC.2014.7017065 (published 2014-12-01)
Citations: 20
Optimal phase control for joint transmission and reception with beamforming
Seonghyun Kim, Hojae Lee, Beom Kwon, Inwoong Lee, Sanghoon Lee
In this paper, we propose a joint transmission and reception scheme with phase control for beamforming in a multi-cell environment. Given the transmit weight vectors generated by multiple base stations (BSs), a mobile station (MS) calculates phases that maximize the achievable rate and reports them with low-rate feedback. Using these phases, the multiple transmit weight vectors are coordinated to improve the signal-to-noise ratio (SNR). To find the optimal phases, we present a phase control method for the effective channel via a geometrical approach.
DOI: https://doi.org/10.1109/PCCC.2014.7017038 (published 2014-12-01)
Citations: 0
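Why per-BS phases can raise the SNR is easiest to see in a simplified scalar model, sketched below. This is not the paper's geometrical method, which the abstract does not detail. The assumptions are: the MS sees one complex effective channel per BS (the channel combined with that BS's transmit weight vector), the received amplitude is the magnitude of the phase-rotated sum, and low-rate feedback is modeled as uniform quantization of each phase to a few bits. All names are hypothetical.

```python
import cmath
import math


def cophasing_phases(effective_channels, bits=None):
    """Phase per BS that rotates each effective channel onto the positive real axis.

    With bits set, each phase is rounded to a uniform grid of 2**bits points
    (an assumed stand-in for low-rate feedback).
    """
    phases = []
    for g in effective_channels:
        theta = -cmath.phase(g)
        if bits is not None:
            step = 2 * math.pi / (2 ** bits)
            theta = round(theta / step) * step
        phases.append(theta)
    return phases


def combined_gain(effective_channels, phases):
    """Squared magnitude of the coherent sum after applying the phases."""
    total = sum(g * cmath.exp(1j * t) for g, t in zip(effective_channels, phases))
    return abs(total) ** 2


# Hypothetical effective channels from three base stations.
channels = [1 + 1j, -2j, -0.5 + 0j]
no_control = combined_gain(channels, [0.0, 0.0, 0.0])
ideal = combined_gain(channels, cophasing_phases(channels))
coarse = combined_gain(channels, cophasing_phases(channels, bits=2))
print(no_control, ideal, coarse)
```

In this scalar model the unquantized phases reach the triangle-inequality upper bound, the square of the sum of the channel magnitudes, so any quantized feedback can only approach that value from below.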
A hybrid erasure-coded ECC scheme to improve performance and reliability of solid state drives
P. Subedi, Ping Huang, Xubin He, Ming Zhang, Jizhong Han
The high performance and ever-increasing capacity of flash memory has led to the rapid adoption of Solid-State Disks (SSDs) in mass storage systems. In order to increase disk capacity, multi-level cells (MLC) are used in the design of SSDs, but the use of such SSDs in persistent storage systems raise concerns for users due to the low reliability of such disks. In this paper, we present a hybrid erasure-coded (EECC) architecture that incorporates ECC schemes and erasure codes to improve both performance and reliability. As weak error-correction codes have faster decoding speed than complex error correction codes (ECC), we propose the use of weak-ECC at the segment level rather than complex ECC. To compensate the reduced correction ability of weak-ECC, we use an erasure code that is striped across segments rather than pages or blocks. We use a small sized HDD to store parities so that we can leverage parallelism across multiple devices and remove the parity updates from the critical write path. We carry out simulation experiments based on Disksim to demonstrate that our proposed scheme is able reduce the SSD average read-latency by up to 31.23% and along with tolerance from double chip failures, it dramatically reduces the uncorrectable page error rate.
DOI: 10.1109/PCCC.2014.7017095 (https://doi.org/10.1109/PCCC.2014.7017095). Published 2014-12-01.
Citations: 3