Pub Date: 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.16
Yang Yang, Adrian Osmond, Xiaoyu Chen, M. Weal, G. Wills, D. D. Roure, J. Joseph, L. Yardley
Behavioural interventions - packages of advice and support for behaviour change - are one of the most important methodologies and technologies employed by social scientists for understanding and changing behaviour. A typical web-based behavioural intervention study includes the designing, deploying, piloting and trialling of the intervention as well as data analysis. We have developed a research environment named LifeGuide, which covers the full scope of this process, enabling social scientists to carry out intervention studies with minimal technical expertise. In this paper, we present how the LifeGuide can assist and accelerate intervention research, particularly focusing on supporting the running and analysis of trials of web-based behavioural interventions along with the case study of an intervention that has been developed within the LifeGuide.
Title: Supporting the Running and Analysis of Trials of Web-Based Behavioural Interventions: The LifeGuide. Venue: 2009 Fifth IEEE International Conference on e-Science.
Pub Date: 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.40
André Merzky, K. Stamou, S. Jha, D. Katz
Most workflow-based applications currently have to adapt to available tools. While this keeps the cost of development low, it can lead to performance and flexibility tradeoffs that the application developer and deployer must make. In this paper, we use the Montage astronomical image mosaicking application as a prototypical DAG-based workflow application to lay out the development and deployment decisions for distributed applications. We discuss and explain the lack of simple (easy-to-use), scalable, and extensible distributed applications. We then introduce SAGA as a technology that permits the construction of abstractions that aid the development and execution of the applications, and thus addresses some of the common shortcomings of traditional distributed application development. We use Montage together with SAGA to examine how legacy applications can be made to run on distributed infrastructures, to see if our reasons are valid, and to compare potential new methods for creating distributed applications with existing technologies that are currently used. We demonstrate the ability to (i) scale out and (ii) use different production infrastructure, while maintaining performance comparable to established systems. Our hope is that by demonstrating the simplicity of development along with other advantages (performance, scalability, extensibility, and infrastructure independence), this example will encourage others to think more broadly about how distributed applications are created and how new programming models such as Dryad can be supported in an infrastructure-independent way, thus eventually leading to more applications that can seamlessly scale out.
Title: A Fresh Perspective on Developing and Executing DAG-Based Distributed Applications: A Case-Study of SAGA-Based Montage
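To make the notion of a DAG-based workflow in the abstract above concrete, the following is a minimal sketch of ordering tasks so that each one runs only after its dependencies, using Kahn's algorithm. It does not use SAGA or Montage; the function name `topological_order` and the stage names in `WORKFLOW` are hypothetical and chosen only for illustration.

```python
from collections import deque


def topological_order(deps):
    """Return tasks in an order that respects deps (Kahn's algorithm).

    deps maps each task name to the list of tasks it depends on.
    """
    # Collect every task mentioned, either as a key or as a dependency.
    tasks = set(deps)
    for parents in deps.values():
        tasks.update(parents)

    indegree = {t: 0 for t in tasks}
    children = {t: [] for t in tasks}
    for task, parents in deps.items():
        for parent in parents:
            indegree[task] += 1
            children[parent].append(task)

    # Sorting keeps the result deterministic for a given input.
    ready = deque(sorted(t for t in tasks if indegree[t] == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(children[task]):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("dependency graph contains a cycle")
    return order


# Hypothetical stage names, loosely inspired by an image-mosaicking pipeline.
WORKFLOW = {
    "project_a": [],
    "project_b": [],
    "diff_ab": ["project_a", "project_b"],
    "background": ["diff_ab"],
    "add": ["background"],
}
```

Calling `topological_order(WORKFLOW)` yields one valid execution order; tasks that become ready at the same time (here the two projection stages) are independent and could in principle be dispatched to different resources.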
Pub Date: 2009-12-09 DOI: 10.1109/e-Science.2009.20
P. Ribeiro, Fernando M A Silva, Marcus Kaiser
Complex networks from domains like Biology or Sociology are present in many e-Science data sets. Dealing with networks can often form a workflow bottleneck as several related algorithms are computationally hard. One example is detecting characteristic patterns or "network motifs" - a problem involving subgraph mining and graph isomorphism. This paper provides a review and runtime comparison of current motif detection algorithms in the field. We present the strategies and the corresponding algorithms in pseudo-code yielding a framework for comparison. We categorize the algorithms outlining the main differences and advantages of each strategy. We finally implement all strategies in a common platform to allow a fair and objective efficiency comparison using a set of benchmark networks. We hope to inform the choice of strategy and critically discuss future improvements in motif detection.
Title: Strategies for Network Motifs Discovery
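As an illustration of the subgraph-counting step that underlies motif detection, here is a minimal brute-force sketch that counts the connected induced 3-node subgraphs of an undirected graph. It is not one of the algorithms reviewed in the paper, and it covers only the census: a full motif analysis would also compare these counts against randomized networks. The function name `three_node_census` is hypothetical.

```python
from itertools import combinations


def three_node_census(edges):
    """Count connected induced 3-node subgraphs of an undirected graph."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    census = {"path": 0, "triangle": 0}
    # Enumerate every node triple; this costs O(n^3) and is only for small graphs.
    for a, b, c in combinations(sorted(adjacency), 3):
        n_edges = (b in adjacency[a]) + (c in adjacency[a]) + (c in adjacency[b])
        if n_edges == 2:
            census["path"] += 1      # open path: two of the three edges present
        elif n_edges == 3:
            census["triangle"] += 1  # all three edges present
    return census
```

The cubic cost of this enumeration, and its much faster growth for larger subgraph sizes, is one way to see why such computations can become a workflow bottleneck.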
Pub Date: 2009-12-09 DOI: 10.1109/e-Science.2009.22
M. S. Avila-Garcia, Anne E. Trefethen, N. Joshi, F. Gleeson, W. Ba-alawi
Image analysis researchers would benefit considerably from sharing and reusing image processing algorithms. We consider some of the issues that researchers face in trying to provide algorithms in a shareable and reusable form, illustrating our approach in the context of medical imaging needs and workflow for colorectal cancer. We consider the use of workflow as a model for developing and reusing components of medical imaging, and specifically we consider a solution built using .Net and Windows Workflow Foundation.
Title: Sharing and Reusing Cancer Image Segmentation Algorithms Using Scientific Workflows: Pros and Cons
Pub Date: 2009-12-09 DOI: 10.1109/e-Science.2009.52
Yogesh L. Simmhan, C. Ingen, A. Szalay, R. Barga, J. Heasley
The growing amount of scientific data from sensors and field observations is posing a challenge to “data valets” responsible for managing it in data repositories. These repositories, built on commodity clusters, need to reliably ingest data continuously and ensure its availability to a wide user community. Workflows provide several benefits to modeling data-intensive science applications, and many of these benefits can help manage the data ingest pipelines too. But using workflows is not a panacea in itself, and data valets need to consider several issues when designing workflows that behave reliably on fault-prone hardware while retaining the consistency of the scientific data. In this paper, we propose workflow designs for reliable data ingest in a distributed environment and identify workflow framework features to support resilience. We illustrate these using the data pipeline for the Pan-STARRS repository, one of the largest digital surveys, which accumulates 100TB of data annually to support 300 astronomers.
Title: Building Reliable Data Pipelines for Managing Community Data Using Scientific Workflows
Pub Date: 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.19
Maurício O. Tsugawa, Andréa M. Matsunaga, J. Fortes
With the emergence of multiple cloud providers of Infrastructure-as-a-Service, it becomes possible to envision a near future in which high-performance computing users could combine services from different clouds to access huge numbers of resources. However, as more administrative privileges are exposed to end users, providers are required to deploy network security measures that present challenges to the network virtualization technologies that are needed to enable inter-cloud communication. This paper studies these challenges and proposes techniques to enable unmodified applications to run on resources across distinct clouds. The techniques are implemented in TinyViNe, an extension to ViNe, a virtual networking technology for distributed resources in different administrative domains. The results of evaluating TinyViNe on a WAN-based testbed across three sites are reported for a bioinformatics application (BLAST) and MPI benchmarks. The results confirm that TinyViNe enables cross-cloud computing while having little impact on application performance. TinyViNe also has auto-configuration and “download-and-run” capabilities for easy deployment by users who are not knowledgeable about networking.
Title: User-Level Virtual Network Support for Sky Computing
Pub Date: 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.33
Yong Liu, Kailash Kotwani, Alejandro Rodríguez, J. Futrelle, R. McGrath, J. Myers
Every modern portal ships with some form of internal document library portlet tools that can be used to enable groups to share files. Unfortunately, certain limitations -- that the portlet can only view data managed directly by the portal and that document and data files are the only resources that can be browsed -- make these tools less valuable in many real-world collaborations. This paper describes a semantically-enhanced scientific resource library portlet that extends the traditional document library to enable interaction with multiple distributed repositories in the cloud and to broaden the set of resources that can be viewed beyond the simple hierarchical folder-and-file structure to include people, sensors, data streams and other complex digital entities and their relationships. Our technology is based on semantic content abstraction and context aggregation functionality supported by Tupelo, a semantic content middleware, and is implemented as a portlet plugin to the Liferay-based CyberCollaboratory portal. We describe the architecture components and the browsing features currently implemented and present a water science use case in which users are able to share documents, raw sensor data streams, and derived virtual sensor data (rainfall) from distributed sources within the same semantic resource library.
Title: Beyond the Document Library: Portal-Based Browsing and Exploration of Community Data Clouds
Pub Date: 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.57
E. Santos, Julien Tierny, Ayla Khan, Brad Grimm, L. Lins, J. Freire, Valerio Pascucci, Cláudio T. Silva, S. Klasky, Roselyne B. Tchoua, N. Podhorszki
Simulations that require massive amounts of computing power and generate tens of terabytes of data are now part of the daily lives of scientists. Analyzing and visualizing the results of these simulations as they are computed can lead not only to early insights but also to useful knowledge that can be provided as feedback to the simulation, avoiding unnecessary use of computing power. Our work is aimed at making advanced visualization tools available to scientists in a user-friendly, Web-based environment where they can be accessed anytime from anywhere. In the context of turbulent combustion, for example, visualization is used to understand the coupling between turbulence and the turbulent mixing of scalars. Although isosurface generation is a useful technique in this scenario, computing and rendering isosurfaces one at a time is expensive and not particularly well-suited for such a Web-based framework. In this paper we propose the use of a summary structure, called a contour tree, that captures the topological structure of a scalar field and guides the user in identifying useful isosurfaces. We have also designed an interface, integrated with a Web-based simulation monitoring system, that allows users to interact with and explore multiple isosurfaces.
Title: Enabling Advanced Visualization Tools in a Web-Based Simulation Monitoring System
Pub Date: 2009-12-09 DOI: 10.1109/e-Science.2009.17
Priyanka Katariya, Sathish S. Vadhiyar
A phylogenetic or evolutionary tree is constructed from a set of species or DNA sequences and depicts the relatedness between the sequences. Predictions of future sequences in a phylogenetic tree are important for a variety of applications including drug discovery, pharmaceutical research and disease control. In this work, we predict future DNA sequences in a phylogenetic tree using cellular automata. Cellular automata are used for modeling neighbor-dependent mutations from an ancestor to a progeny in a branch of the phylogenetic tree. Since the number of possible transformations from an ancestor to a progeny is huge, we use computational grids and middleware techniques to explore the large number of cellular automata rules used for the mutations. We use the popular and recurring neighbor-based transitions or mutations to predict the progeny sequences in the phylogenetic tree. We performed predictions for three types of sequences, namely, triose phosphate isomerase, pyruvate kinase, and polyketide synthase sequences, by obtaining cellular automata rules on a grid consisting of 29 machines in 4 clusters located in 4 countries, and compared the predictions of the sequences using our method with predictions by random methods. We found that in all cases, our method gave about 40% better predictions than the random methods.
Title: Phylogenetic Predictions on Grids
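The idea of a neighbor-dependent mutation modeled as a cellular automaton, as described in the abstract above, can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' method: the rule format (a mapping from a left/center/right triple to a new base), the handling of the two end bases, and the names `ca_step`, `match_fraction` and `EXAMPLE_RULE` are all hypothetical.

```python
def ca_step(sequence, rule):
    """Apply one synchronous cellular-automaton update to a DNA string.

    rule maps a (left, center, right) triple of bases to the new center base;
    triples missing from the rule leave the base unchanged. The two end bases
    have no full neighborhood and are copied as-is (an assumption).
    """
    if len(sequence) < 3:
        return sequence
    updated = [sequence[0]]
    for i in range(1, len(sequence) - 1):
        # Always read neighbors from the original sequence, not the partial update.
        triple = (sequence[i - 1], sequence[i], sequence[i + 1])
        updated.append(rule.get(triple, sequence[i]))
    updated.append(sequence[-1])
    return "".join(updated)


def match_fraction(predicted, observed):
    """Return the fraction of positions at which two equal-length sequences agree."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not predicted:
        return 1.0
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(predicted)


# Hypothetical rule with two neighbor-dependent substitutions.
EXAMPLE_RULE = {
    ("A", "C", "G"): "T",
    ("C", "G", "T"): "A",
}
```

With four bases there are 64 possible triples, each of which can map to any of four bases, so the space of such rules is very large; this is consistent with the abstract's motivation for exploring rules on a grid.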
Pub Date: 2009-12-09 DOI: 10.1109/e-Science.2009.15
J. Lindh, A. Eriksson
The project described here may be seen as a continuation of an earlier project, SweDia 2000, and aims at transforming the database collected in that project into a full-fledged e-science database. The database consists of recordings of Swedish dialects from 107 locations in Sweden and Swedish-speaking parts of Finland. The goal of the present project is to make the material searchable in a flexible and simple way, so that it becomes available to a much wider sector of the research community than is the case at present. The database will be accessible over the Internet via user-friendly interfaces specifically designed for this type of data. Other more specialized research interfaces will also be designed to facilitate phonetic acoustic research and orientation of the database.
Title: The SweDat Project and Swedia Database for Phonetic and Acoustic Research