Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00084
Title: Implementation of the ATLAS Trigger Within the ATLAS Multi-threaded Software Framework AthenaMT
Authors: T. Martin
Pages: 339-339
Abstract: n/a

Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00106
Title: Estimating Subgraph Generation Models to Understand Large Network Formation
Authors: L. Bogaardt, Frank W. Takes
Pages: 375-376
Abstract: Recently, a new network formation model was proposed: the subgraph generation model (SUGM). Our research looks into a method to estimate the parameters of this model based on the subgraph census.

Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00113
Title: Linking Natural History Collections
Authors: Lise Stork, Andreas Weber, E. Miracle, K. Wolstencroft
Pages: 388-389
Abstract: n/a

Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00095
Title: A Spark-Based Platform to Extract Phenological Information from Satellite Images
Authors: Viktor Bakayov, R. Goncalves, R. Zurita-Milla, E. Izquierdo-Verdiguier
Pages: 354-355
Abstract: Phenology is the study of periodic plant and animal life cycle events and how these are influenced by seasonal and inter-annual variations in weather and climate, as well as in other environmental factors. Time series of remote sensing (RS) images can be used to characterize land surface phenology at continental to global scales. For this, the RS images are typically transformed into various vegetation indices (VI) such as the normalized difference vegetation index (NDVI) or the enhanced vegetation index (EVI). These indices can then be used to extract various phenological metrics. In our previous work we used cloud computing to generate temperature-based phenological indices [1], [2], and to relate one phenological metric, namely the Start-of-Season (SOS), with those indices [3], [4]. Here we present an extension of our work in which we use a Spark-based platform to efficiently extract phenological metrics from time series of NDVI and EVI. This platform allows users to obtain and analyze high spatial resolution metrics (in this case 1 km) from 10-day composites. The platform uses the same architecture as in [3], i.e., it is organized into three layers: a storage layer, a processing layer, and JupyterHub services for user interaction. It is designed to store the data in well-known file formats like GeoTIFFs and Hierarchical Data Format (HDF). For the data analysis, the user expresses the operations in Jupyter notebooks as Python, R, or Scala code (Fig. 1). Hence, with a browser and a remote connection, the user can express a research question and/or collect insights from large data sets. All computations are pushed down to the computational platform, and results are fetched back for data visualization. To extract the phenological metrics, we rely on TimeSat [5]. TimeSat is a software package that can be used to fit a function (e.g. a double logistic) to time series of VIs. After that, it uses various approaches to extract vegetation seasonality metrics such as SOS. The program's numerical and graphical routines are coded in Matlab and Fortran. These routines are highly vectorized and efficient for use with large data sets. However, distributed processing is required to determine SOS at continental scales. Through an efficient partitioning of the data, and Spark's scheduling policies, these single-core routines are scheduled for parallel execution over multiple machines. The study evaluates which VIs and fitting functions are most suitable for certain vegetation types by comparing the SOS metrics to volunteered phenological observations curated by the USA National Phenology Network [6]. Our preliminary results show that there can be differences of up to 20-30 days in the SOS depending on the fitting function, the VI, and the approach used to extract the SOS metric. In the South, SOS is around mid-February or March, whereas in mountainous regions and the North, the SOS can be as late as June-July. We plan to further evaluate how our results compare to the ground volunteered observations.

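Note: as a hedged sketch of the per-pixel operation such a platform distributes, the code below fits a double-logistic curve to one NDVI series and takes SOS as the first day the fitted curve exceeds a fraction of its seasonal amplitude. TimeSat's actual Matlab/Fortran routines are more elaborate; the initial guess and the 50% amplitude threshold are assumptions, not the paper's settings.

```python
# Per-pixel step: fit a double logistic to a 10-day NDVI composite series
# and extract SOS as the first day above a fraction of the fitted amplitude.
import numpy as np
from scipy.optimize import curve_fit

def double_logistic(t, vmin, vamp, s1, t1, s2, t2):
    """Green-up sigmoid minus senescence sigmoid on top of a baseline."""
    return vmin + vamp * (1.0 / (1.0 + np.exp(-s1 * (t - t1)))
                          - 1.0 / (1.0 + np.exp(-s2 * (t - t2))))

def extract_sos(days, ndvi, threshold=0.5):
    # Rough initial guess: green-up around day 120, senescence around day 270.
    p0 = [float(ndvi.min()), float(np.ptp(ndvi)), 0.1, 120.0, 0.1, 270.0]
    params, _ = curve_fit(double_logistic, days, ndvi, p0=p0, maxfev=10000)
    fine = np.arange(days[0], days[-1] + 1)
    curve = double_logistic(fine, *params)
    level = curve.min() + threshold * (curve.max() - curve.min())
    return int(fine[np.argmax(curve >= level)])  # first day above the level

# With Spark, each record of a pair RDD holds one pixel's time series:
#   sos_by_pixel = pixel_rdd.mapValues(lambda ts: extract_sos(days, ts))
```
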
Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00126
Title: Smart Data Scouting in Professional Soccer: Evaluating Passing Performance Based on Position Tracking Data
Authors: M. Kempe, F. Goes, K. Lemmink
Pages: 409-410
Abstract: Sports analytics in general, and soccer analytics in particular, have evolved in recent years due to the increased availability of large amounts of (tracking) data. Especially in terms of evaluating tactical behavior, data science could change the way we think about soccer. In this study, we evaluate passing performance in soccer to test the hypothesis that tactical behavior in team sports can be analyzed based exclusively on tracking data. To this end, we explore the relationship between changes in spatiotemporal variables in relation to passing and key performance indicators. Our results demonstrate the ability of spatiotemporal variables to predict pass accuracy and key performance indicators on an individual level, confirming our hypothesis. Furthermore, we calculated a simple composite performance indicator to evaluate passes and players based on tracking data. In conclusion, our results can be used as an approach for real-time evaluation of tactical behavior and as a new method to scout and evaluate players in soccer and team sports in general.

Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00104
Title: De-duplicating the OpenAIRE Scholarly Communication Big Graph
Authors: Claudio Atzori, P. Manghi, A. Bardi
Pages: 372-373
Abstract: The OpenAIRE infrastructure populates a scholarly communication big graph interlinking metadata objects of publications, datasets, software, organizations, funders, and projects. In order to de-duplicate this graph, OpenAIRE has developed GDup, an integrated, scalable, general-purpose system for entity deduplication over big information graphs. GDup offers functionalities to realize a fully-fledged entity deduplication workflow over a generic input graph, inclusive of Ground Truth support, end-user feedback, and strategies for identifying and merging duplicates to obtain an output disambiguated graph.

Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00011
Title: Lesson Development for Open Source Software Best Practices Adoption
Authors: Mateusz Kuzak, Jen Harrow, R. Jiménez, P. Martínez, Fotis Psomopoulos, R. Vareková, A. Via
Pages: 19-20
Abstract: The ELIXIR Training Platform is partnering with The Carpentries (Software and Data Carpentry) to train life science researchers in computing and data management skills. The ELIXIR Software development best practices group, which is part of the ELIXIR Tools Platform, has proposed "Four simple recommendations to encourage best practices in research software", aiming to help researchers and developers adopt Open Source Software (OSS) practices and thus improve the quality and sustainability of research software. To encourage researchers and developers to adopt the four recommendations (4OSS) and build FAIR software, we are developing specific and practical training materials, taking advantage of the Carpentries' approach to, and experience in, training material development and maintenance.

Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00061
Title: Workflows Orchestrating Workflows: Thousands of Queries and Their Fault Tolerance Using APIs of Omics Web Resources
Authors: Yassene Mohammed
Pages: 299-300
Abstract: High-throughput omics fields like proteomics and genomics allow detailed molecular studies of organisms. Such studies are inherently on the Big Data side in terms of volume and complexity. Following the FAIR principles and reaching for transparency in publication, raw data and results are often shared in public repositories. However, despite the steadily increasing amount of shared omics data, it is still challenging to compare, correlate, and integrate it to answer new questions. Here we report on our experience in reusing and repurposing publicly available proteomics and genomics data to design new targeted proteomics experiments. We have developed a scientific workflow to retrieve and integrate information from various repositories and domain knowledge bases including UniProtKB [1], GPMDB [2], PRIDE [3], PeptideAtlas [4], ProteomicsDB [5], MassIVE [6], ExPASy [7], NCBI's dbSNP [8], and PeptideTracker [9]. Following a "Map-Reduce" approach [10], the workflow selects the best proteotypic peptides for Multiple Reaction Monitoring (MRM) experiments. In an attempt to gain insights into the human proteome, we designed a second workflow to orchestrate the selection workflow. Hundreds of thousands of queries were sent to online repositories to determine whether peptides had been seen in previous experiments. Fault tolerance ranged from dealing with no-reply to wrong annotations. A three-month run of the workflow generated a comprehensive list of 165k+ suitable proteotypic peptides covering most human proteins. The main challenge has been the evolving APIs of the resources, which continuously affect the components of our integrative bioinformatics solutions.

Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00051
Title: A Web Service Architecture for Objective Station Classification Purposes
Authors: M. Schultz, Sander Apweiler, Jan Vogelsang, B. Hagemeier, F. Kleinert, Daniel Mallmann
Pages: 283-284
Abstract: The Tropospheric Ozone Assessment Report (TOAR) has recently pioneered the use of global Earth Observation data to derive a globally consistent scheme for characterizing the local environment of stations measuring weather and atmospheric composition. Here, we build on the TOAR concept and expand it into a set of web services that will allow a flexible, automated characterization of any point location through standardized REST APIs. These services will be freely available to the community and thus pave the way for new concepts to analyze global monitoring data and evaluate numerical models.

Pub Date: 2018-10-01 | DOI: 10.1109/eScience.2018.00035
Title: Educational Outreach & Stakeholder Role Evolution in a Cyberinfrastructure Project
Authors: David P. Randall, Drew Paine, Charlotte P. Lee
Pages: 201-211
Abstract: Over the last several years, a growing body of work has examined the nature of large-scale virtual organizations for data-intensive cooperative science. These projects, known as Cyberinfrastructures (CI) in the United States, are established realms of inquiry for the eScience and Computer Supported Cooperative Work (CSCW) communities. Scholarship in these communities extends technology-focused inquiries to investigate the sociotechnical concerns involved in creating and maintaining such infrastructures. In this paper we present findings from our qualitative study of a federated cyberinfrastructure organization known as GENI. We contribute to this body of scholarship by investigating how stakeholders in the GENI project position existing, and newly created, resources for use in educational settings. We examine how stakeholders acquaint new potential stakeholders with this CI in order to draw them into the community, and the ways in which stakeholders' roles evolve over time. Our findings illustrate several ways stakeholders leverage and align existing relationships and resources to expand the CI project's user base. Finally, this paper suggests avenues of further inquiry and implications for organizing future CI projects.