Hadoop based clustering system for genome sequencing
Authors: Anju Ramesh Ekre, R. Mante
Published in: 2016 Second International Conference on Science Technology Engineering and Management (ICONSTEM), 2016-03-30
DOI: 10.1109/ICONSTEM.2016.7560916
Genomics is an interdisciplinary branch of science that is bringing vital changes to medicine and agriculture. Many scientific and technological advances of the 21st century are expected to depend on processing, manipulating, and analyzing the vast amount of information generated by sequencing the genomes of living organisms. Genome sequencing is therefore both a scientific problem and a big data problem. The sequence fragments produced by a sequencer are called reads. Next-generation sequencing plays a crucial role in driving the development of read alignment algorithms, an effort to which computer scientists, mathematicians, and physicists all contribute. However, data sizes keep growing, and scientists and researchers need faster access to the data, which is pushing genome alignment research toward acceleration approaches. This paper presents a MapReduce acceleration scheme for faster sequence alignment that runs on a cluster of commodity hardware. Using MapReduce programming together with a clustering algorithm to distribute genome data across multiple nodes may reduce processing time and may also improve the accuracy of genome sequencing.
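To make the MapReduce idea concrete, the following is a minimal sketch of how read alignment could be split into map, shuffle, and reduce steps. It is an in-memory Python simulation, not Hadoop code and not the scheme from the paper, whose details the abstract does not give. The seed-and-compare matching, the seed length `K`, and all function names (`build_seed_index`, `map_read`, `shuffle`, `reduce_read`, `align`) are assumptions chosen for illustration.

```python
from collections import defaultdict

K = 4  # seed length; a hypothetical value chosen for this sketch


def build_seed_index(reference, k=K):
    """Index every k-mer of the reference by its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read_id, read, index, reference, k=K):
    """Map step: emit (read_id, (position, mismatches)) per candidate site.

    Candidates are the reference positions whose k-mer equals the
    first k bases of the read.
    """
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # read would run past the end of the reference
        mismatches = sum(a != b for a, b in zip(read, window))
        yield read_id, (pos, mismatches)


def shuffle(pairs):
    """Shuffle step: group mapper output by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_read(read_id, candidates):
    """Reduce step: keep the candidate with the fewest mismatches.

    Ties are broken in favor of the leftmost position.
    """
    pos, mismatches = min(candidates, key=lambda c: (c[1], c[0]))
    return read_id, pos, mismatches


def align(reference, reads, k=K):
    """Run map, shuffle, and reduce over a dict of read_id -> sequence."""
    index = build_seed_index(reference, k)
    pairs = [
        pair
        for read_id, read in reads.items()
        for pair in map_read(read_id, read, index, reference, k)
    ]
    grouped = shuffle(pairs)
    return [reduce_read(rid, cands) for rid, cands in sorted(grouped.items())]


if __name__ == "__main__":
    reference = "ACGTACGTTAGC"
    reads = {"r1": "ACGTT", "r2": "TAGC", "r3": "ACGTA"}
    for read_id, pos, mismatches in align(reference, reads):
        print(read_id, pos, mismatches)
```

Because each call to `map_read` depends only on one read and the shared index, the reads can be partitioned across nodes and mapped independently, which is the property a cluster of commodity machines would exploit. Reads whose seed does not occur in the reference produce no output in this sketch.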