Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00110
M. Vos, Berend Weel, Adrienne Mendrik, A. Dekker, J. V. Soest
n/a
{"title":"Fast and Easy Mapping of Relational Data to RDF for Rapid Learning Health Care","authors":"M. Vos, Berend Weel, Adrienne Mendrik, A. Dekker, J. V. Soest","doi":"10.1109/eScience.2018.00110","DOIUrl":"https://doi.org/10.1109/eScience.2018.00110","url":null,"abstract":"n/a","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"1 1","pages":"382-383"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79894146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00023
Daniyal Kazempour, A. Beer, Friederike Herzog, Daniel Kaltenthaler, Johannes-Y. Lohrer, T. Seidl
Analyzing the flyways of birds is one approach ornithologists pursue, e.g., to detect potential risks during the animals' migration. This analysis is not trivial, however, and existing supporting tools are neither perfect nor all-encompassing. In this paper, we introduce our new FATBIRD Tool, which not only visualizes flyways or arbitrary trajectories but also supports researchers in several aspects of the analysis. Similarities between all trajectories of the individual birds are calculated via Dynamic Time Warping distances; to the best of our knowledge this is the first such usage in this field, and it delivers promising results. We demonstrate the tool's functionality in a use case based on real data from a GPS/GSM telemetry study of Eurasian curlews by the "Bavarian Society for the Protection of Birds". The similarities are shown in an intuitively understandable heat-map-colored distance matrix as well as a hierarchical clustering dendrogram. All data points are clustered and displayed, and the data can be filtered by several parameters. Potential stop-over and wintering areas can thus be detected quickly and easily. Having obtained the similarities and differences of the trajectories automatically, researchers can focus on the biological reasons behind the results generated by the FATBIRD Tool. These can lead to a better understanding of, e.g., why certain birds die on their flyways, and thus to new approaches for developing optimized conservation measures for the specific species.
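The abstract names Dynamic Time Warping (DTW) as the trajectory similarity measure and a pairwise distance matrix as one of the outputs. The following is a minimal sketch of textbook DTW, not the FATBIRD implementation: the function names `dtw_distance` and `distance_matrix` are hypothetical, and the use of plain Euclidean distance between coordinate pairs is an assumption (a real tool may well use a geodesic distance on latitude/longitude).

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two sequences of (x, y) points.

    Assumption: the point-to-point cost is Euclidean distance.
    Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats a point
                cost[i][j - 1],      # b advances, a repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def distance_matrix(trajectories):
    """Pairwise DTW distances, the kind of matrix a heat map would color."""
    k = len(trajectories)
    return [
        [dtw_distance(trajectories[i], trajectories[j]) for j in range(k)]
        for i in range(k)
    ]


if __name__ == "__main__":
    bird_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    bird_b = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # same route, one extra fix
    bird_c = [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)]              # parallel route
    for row in distance_matrix([bird_a, bird_b, bird_c]):
        print(row)
```

Because DTW allows points to be repeated during alignment, two birds that follow the same route at different sampling rates or speeds get a distance of zero, which is the property that makes it attractive for comparing trajectories of unequal length.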
{"title":"FATBIRD: A Tool for Flight and Trajectories Analyses of Birds","authors":"Daniyal Kazempour, A. Beer, Friederike Herzog, Daniel Kaltenthaler, Johannes-Y. Lohrer, T. Seidl","doi":"10.1109/eScience.2018.00023","DOIUrl":"https://doi.org/10.1109/eScience.2018.00023","url":null,"abstract":"Abstract-Analyzing flyways of birds is one approach ornithologists pursue e.g. to be able to detect potential risks during the animal's migration. But this analysis is not trivial and the functionalities of existing supporting tools are neither perfect nor all-encompassing. In this paper, we introduce our new FATBIRD Tool, which not only visualizes flyways or arbitrary trajectories, but also helps the researchers in several aspects of the analysis. Similarities between all trajectories of the individual birds are calculated via Dynamic Time Warping distances, which is to the best of our knowledge the first usage in this field and delivers promising results. We show the functionalities of our tool on a use case based on real data of a GPS/GSM telemetry study of Eurasian curlews of the \"Bavarian Society for the Protection of Birds\". The similarities are shown in an intuitively understandable heat map colored distance matrix as well as a hierarchical clustering dendrogram. The clustering of all data points is performed and shown, and the data can be filtered by several parameters. With that, potential stop-over and wintering areas can be detected very fast and easily. After having obtained the similarities and differences of the trajectories in an automatic way, the researchers can focus on the biological reasons of the generated results of the FATBIRD Tool. These can lead to a better understanding of e.g. why certain birds die on their flyways and thus to new approaches to develop optimized conservation measures for the specific species.","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"22 1","pages":"75-82"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74052355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00134
Roee Ebenstein, G. Agrawal, Jiali Wang, J. Boley, R. Kettimuthu
Scientific data is rapidly increasing not only in size but also in the complexity of the operations performed on it. Compared to the prevalent use of ad-hoc approaches, structured operators provide many benefits. In this paper, we introduce FDQ - an Analytical Functions Distributed Querying Engine intended for Array Data. Motivated by the needs of climate scientists in terms of both functionality and scalability, we make three major contributions. First, we introduce a new class of analytical querying: querying over windows where the planes that construct these windows are internally ordered. An example of this querying type is the newly introduced MINUS analytical function, which supports querying over accumulative measurements with data resets. Second, we describe in detail memory management optimizations for efficient processing of analytical querying (and other structured operators) over large datasets. Last, we provide efficient methods to execute these queries in parallel, using a sectioned (tiled) approach. We evaluate our methods using real multi-dimensional climate datasets and show that they outperform existing approaches. When running locally (not in a distributed manner), we observed an average performance improvement of 538% compared to other engines for analytical calculations. We also show that the performance of our methods improves linearly with the provided computing resources (scale up and out).
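The phrase "accumulative measurements with data resets" can be illustrated with a small sketch. One plausible reading is a running total (for example accumulated precipitation) that is periodically reset, from which per-step increments must be recovered. The code below shows only that general idea; it is not FDQ's MINUS function or its query syntax, the name `deaccumulate` is hypothetical, and the rules that a drop in value signals a reset to zero and that the first sample counts from zero are assumptions.

```python
def deaccumulate(values):
    """Convert a running total into per-step increments.

    Assumptions (not taken from the paper):
    - the series is ordered along the accumulation axis (e.g. time);
    - a value lower than its predecessor means the counter was reset
      to zero just before that sample;
    - the first sample is an increment from zero.
    """
    increments = []
    previous = None
    for value in values:
        if previous is None or value < previous:
            # First sample, or a reset: the value itself is the increment.
            increments.append(value)
        else:
            increments.append(value - previous)
        previous = value
    return increments


if __name__ == "__main__":
    accumulated = [1, 3, 6, 2, 5]  # reset occurs between 6 and 2
    print(deaccumulate(accumulated))
```

The point of the example is that each output cell depends on its predecessor along an ordered axis, which is why such a query needs internally ordered windows rather than an unordered aggregate.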
{"title":"FDQ: Advance Analytics Over Real Scientific Array Datasets","authors":"Roee Ebenstein, G. Agrawal, Jiali Wang, J. Boley, R. Kettimuthu","doi":"10.1109/eScience.2018.00134","DOIUrl":"https://doi.org/10.1109/eScience.2018.00134","url":null,"abstract":"Scientific data is not only rapidly increasing in size, but in complexity of operations performed upon as well. Compared to the prevalent use of ad-hoc approaches, structured operators provide many benefits. In this paper, we introduce FDQ - an Analytical Functions Distributed Querying Engine intended for Array Data. Motivated by needs of climate scientists in terms of both functionality and scalability, we make three major contributions: First, we introduce a new class of analytical querying - querying over windows where the planes that construct these windows are internally ordered. An example of this querying type is the introduced MINUS analytical function, a function that supports querying over accumulative measurements with data resets. Second, we describe in detail memory management optimizations for efficient processing of analytical (and other structured operators) querying over large datasets. Last, we provide efficient methods to execute these queries in parallel, using a sectioned (tiled) approach. We evaluate our methods using real multi-dimensional climate datasets, and show they outperform existing approaches. When running locally (not in a distributed manner), we observed an average performance improvement of 538% compared to other engines for analytical calculations. We also show our methods performance improve linearly with the provided computing resources (scale up and out).","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"1 1","pages":"453-463"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82063966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00059
Rebecka Weegar
This study describes an approach for mining events preceding a cervical cancer diagnosis from health records.
{"title":"Mining Events Preceding a Cancer Diagnosis","authors":"Rebecka Weegar","doi":"10.1109/eScience.2018.00059","DOIUrl":"https://doi.org/10.1109/eScience.2018.00059","url":null,"abstract":"This study describes an approach for mining events preceding a cervical cancer diagnosis from health records.","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"17 1","pages":"295-296"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84338179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00092
B. V. Werkhoven, Adrienne Mendrik, R. V. Nieuwpoort
n/a
{"title":"Poster Abstracts eScience 2018 Conference","authors":"B. V. Werkhoven, Adrienne Mendrik, R. V. Nieuwpoort","doi":"10.1109/eScience.2018.00092","DOIUrl":"https://doi.org/10.1109/eScience.2018.00092","url":null,"abstract":"n/a","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"45 1","pages":"349-349"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87214843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00101
J. Farnes, B. Mort, F. Dulwich, K. Adámek, Anna Brown, Jan Novotný, S. Salvini, W. Armour
The Square Kilometre Array (SKA) will be the largest radio telescope constructed to date and the largest Big Data project in the known Universe. The first phase of the project will generate 160 terabytes every second. This amounts to 5 zettabytes (5 million petabytes) of data that will be generated by the facility each year - a data rate equivalent to 5 times the estimated global internet traffic in 2015. These data need to be reduced and then continuously ingested by the SKA Science Data Processor (SDP). Within the SDP Consortium, we are contributing to various roles in the development of the telescope including building a lightweight end-to-end prototype of the major components of the SDP system - a project we call the SDP Integration Prototype (SIP). The aim is to build a mini, fully-operational SDP, for which we have been developing realistic SKA-like science pipelines that can handle these unprecedented data volumes.
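The step from 160 terabytes per second to roughly 5 zettabytes per year can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: decimal (SI) units, a continuous data stream all year round, and a Julian year of 365.25 days; the function name `yearly_volume_bytes` is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year, continuous operation
BYTES_PER_TB = 10**12                  # assumption: SI (decimal) prefixes
BYTES_PER_PB = 10**15
BYTES_PER_ZB = 10**21


def yearly_volume_bytes(rate_tb_per_s):
    """Bytes produced in one year at a constant rate given in TB/s."""
    return rate_tb_per_s * BYTES_PER_TB * SECONDS_PER_YEAR


if __name__ == "__main__":
    volume = yearly_volume_bytes(160)
    print(f"{volume / BYTES_PER_ZB:.2f} ZB per year")
    print(f"{volume / BYTES_PER_PB / 1e6:.2f} million PB per year")
```

Under these assumptions the result is about 5.05 ZB, or about 5.05 million PB, per year, which agrees with the rounded figures quoted in the abstract.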
{"title":"Building the World's Largest Radio Telescope: The Square Kilometre Array Science Data Processor","authors":"J. Farnes, B. Mort, F. Dulwich, K. Adámek, Anna Brown, Jan Novotný, S. Salvini, W. Armour","doi":"10.1109/eScience.2018.00101","DOIUrl":"https://doi.org/10.1109/eScience.2018.00101","url":null,"abstract":"The Square Kilometre Array (SKA) will be the largest radio telescope constructed to date and the largest Big Data project in the known Universe. The first phase of the project will generate 160 terabytes every second. This amounts to 5 zettabytes (5 million petabytes) of data that will be generated by the facility each year - a data rate equivalent to 5 times the estimated global internet traffic in 2015. These data need to be reduced and then continuously ingested by the SKA Science Data Processor (SDP). Within the SDP Consortium, we are contributing to various roles in the development of the telescope including building a lightweight end-to-end prototype of the major components of the SDP system - a project we call the SDP Integration Prototype (SIP). The aim is to build a mini, fully-operational SDP, for which we have been developing realistic SKA-like science pipelines that can handle these unprecedented data volumes.","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"75 1","pages":"366-367"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88961051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00065
M. H. Nguyen, Ehab Abdelmaguid, Jolene Huang, Sanjay Kenchareddy, D. Singla, L. Wilke, Marcus Bobar, Eric D. Carruth, Dylan Uys, I. Altintas, E. Muse, Giorgio Quer, S. Steinhubl
n/a
{"title":"Analytics Pipeline for Left Ventricle Segmentation and Volume Estimation on Cardiac MRI Using Deep Learning","authors":"M. H. Nguyen, Ehab Abdelmaguid, Jolene Huang, Sanjay Kenchareddy, D. Singla, L. Wilke, Marcus Bobar, Eric D. Carruth, Dylan Uys, I. Altintas, E. Muse, Giorgio Quer, S. Steinhubl","doi":"10.1109/eScience.2018.00065","DOIUrl":"https://doi.org/10.1109/eScience.2018.00065","url":null,"abstract":"n/a","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"28 1","pages":"305-306"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89284851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00098
Anna-Lena Lamprecht, Magnus Palmblad, J. Ison, V. Schwämmle
Numerous software utilities operating on mass spectrometry (MS) data are described in the literature; they provide specific operations that serve as building blocks for assembling purpose-specific workflows. Working out which tools and combinations are applicable or optimal is often hard: insufficient annotation of tool functions and interfaces impedes finding viable tool combinations, and potentially compatible tools may not, in practice, operate together. Researchers thus face difficulties in selecting practical and effective data analysis pipelines for a specific experimental design.
{"title":"Automated Composition of Scientific Workflows in Mass Spectrometry-Based Proteomics","authors":"Anna-Lena Lamprecht, Magnus Palmblad, J. Ison, V. Schwämmle","doi":"10.1109/eScience.2018.00098","DOIUrl":"https://doi.org/10.1109/eScience.2018.00098","url":null,"abstract":"Numerous software utilities operating on mass spectrometry (MS) data are described in the literature that provide specific operations as building blocks for the assembly of purposespecific workflows. Working out which tools and combinations are applicable or optimal is often hard: insufficient annotation of tool functions and interfaces impedes finding viable tool combinations, and potentially compatible tools may not, in practice, operate together. Thus researchers face difficulties in selecting practical and effective data analysis pipelines for a specific experimental design.","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"101 1","pages":"360-361"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77443171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00125
Seyran Khademi, Xiangwei Shi, Tino Mager, R. Siebes, C. Hein, V. D. Boer, J. V. Gemert
We address the interpretability of convolutional neural networks (CNNs) for predicting a geo-location from an image. In a pilot experiment, we classify images of Pittsburgh vs. Tokyo and visualize the learned CNN filters. We found that varying the CNN architecture leads to variation in the visualized filters. This calls for further investigation of the parameters that affect the interpretability of CNNs.
{"title":"Sight-Seeing in the Eyes of Deep Neural Networks","authors":"Seyran Khademi, Xiangwei Shi, Tino Mager, R. Siebes, C. Hein, V. D. Boer, J. V. Gemert","doi":"10.1109/eScience.2018.00125","DOIUrl":"https://doi.org/10.1109/eScience.2018.00125","url":null,"abstract":"We address the interpretability of convolutional neural networks (CNNs) for predicting a geo-location from an image. In a pilot experiment we classify images of Pittsburgh vs Tokyo and visualize the learned CNN filters. We found that varying the CNN architecture leads to variating in the visualized filters. This calls for further investigation of the effective parameters on the interpretability of CNNs.","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"107 1","pages":"407-408"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77863354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-10-01 DOI: 10.1109/eScience.2018.00088
P. Calafiura, S. Farrell, H. Gray, J. Vlimant, V. Innocente, A. Salzburger, S. Amrouche, T. Golling, M. Kiehn, Victor Estrade, Cécile Germain, Isabelle M Guyon, E. Moyse, D. Rousseau, Y. Yilmaz, V. Gligorov, M. Hushchyn, A. Ustyuzhanin
n/a
{"title":"TrackML: A High Energy Physics Particle Tracking Challenge","authors":"P. Calafiura, S. Farrell, H. Gray, J. Vlimant, V. Innocente, A. Salzburger, S. Amrouche, T. Golling, M. Kiehn, Victor Estrade, Cécile Germain, Isabelle M Guyon, E. Moyse, D. Rousseau, Y. Yilmaz, V. Gligorov, M. Hushchyn, A. Ustyuzhanin","doi":"10.1109/eScience.2018.00088","DOIUrl":"https://doi.org/10.1109/eScience.2018.00088","url":null,"abstract":"n/a","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"163 1","pages":"344-344"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80300544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}