Authors: L. Zaslavsky, Yīmíng Bào, T. Tatusova
Published in: 2007 IEEE International Conference on Bioinformatics and Biomedicine Workshops, 2007-11-01
DOI: 10.1109/BIBMW.2007.4425408 (https://doi.org/10.1109/BIBMW.2007.4425408)
Multiresolution approaches to representation and visualization of large influenza virus sequence datasets
The rapid growth of genome sequence data calls for better exploratory analysis tools that are both fast and robust. Users need data representations that serve different purposes, from viewing the overall structure and data coverage to examining evolutionary processes during a particular season. Our approach is to construct hierarchies of data representations and to provide users with representations adaptable to specific goals. This can be done efficiently because a typical influenza dataset has a low estimated Kolmogorov (box) dimension. Multiscale methodologies allow interactive visual representation of the dataset and accelerate computations through importance sampling. Our tree visualization approach is based on subtree aggregation with subscale resolution, and it allows interactive refinement and coarsening of subtree views. For importance sampling of large influenza datasets, we construct sets of well-scattered points (ε-nets). A tree built for a global sample provides a coarse-level representation of the whole dataset, and it can be complemented by trees showing more detail in chosen areas. To reflect both the global dataset structure and local details correctly, we perform local refinement gradually, using a multiscale hierarchy of ε-nets. Our hierarchical representations also allow fast metadata searching.
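To make the idea of well-scattered points concrete, the following is a minimal sketch of one common way to build an ε-net and a nested multiscale hierarchy of such nets. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the distance measure or the selection procedure, so the Hamming distance on pre-aligned sequences, the greedy selection order, and the names `hamming`, `greedy_eps_net`, and `eps_net_hierarchy` are all hypothetical choices made for this example.

```python
from typing import Callable, List, Sequence, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences.

    Assumption: the abstract does not name a distance; Hamming distance
    is used here only because it is simple to state.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def greedy_eps_net(
    points: Sequence[str],
    eps: float,
    dist: Callable[[str, str], float] = hamming,
    seed: Sequence[int] = (),
) -> List[int]:
    """Return indices of a greedy eps-net of `points`.

    A point becomes a new center only if it is farther than `eps` from
    every center chosen so far. As a result, every point lies within
    `eps` of some center (coverage), and each newly added center is more
    than `eps` away from all earlier ones (scatter).
    `seed` holds centers carried over from a coarser level.
    """
    centers = list(seed)
    for i, p in enumerate(points):
        if all(dist(p, points[c]) > eps for c in centers):
            centers.append(i)
    return centers


def eps_net_hierarchy(
    points: Sequence[str],
    radii: Sequence[float],
    dist: Callable[[str, str], float] = hamming,
) -> List[Tuple[float, List[int]]]:
    """Build nested eps-nets from the coarsest radius to the finest.

    Each finer net is seeded with the coarser one, so the coarse sample
    is always a subset of the finer sample.
    """
    levels: List[Tuple[float, List[int]]] = []
    centers: List[int] = []
    for eps in sorted(radii, reverse=True):
        centers = greedy_eps_net(points, eps, dist, seed=centers)
        levels.append((eps, centers))
    return levels
```

In this sketch, a tree built on the coarsest level would play the role of the global sample, while the finer levels add representatives only where the data are not yet covered at the smaller radius. The greedy pass compares each point with all current centers, so it is adequate for illustration but would need a spatial index or batching for a large dataset.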