R. Devarakonda, Kavya Guntupally, M. Thornton, Yaxing Wei, Debjani Singh, D. Lunga
Several factors must be considered in designing highly accurate, reliable, scalable, and user-friendly geospatial data search interfaces. This paper examines four critical questions that ought to be considered during the design phase: (1) Is the search interface or API that provides the search capability usable by both humans and machines? (2) Are the results consistent and reliable? (3) Is the output response format free to use, community-defined, and non-proprietary? (4) Does the API clearly state the usage clauses? This paper discusses how certain data repositories at the US Department of Energy's Oak Ridge National Laboratory apply FAIR data principles to enable geospatial searches and address these questions.
{"title":"FAIR Interfaces for Geospatial Scientific Data Searches","authors":"R. Devarakonda, Kavya Guntupally, M. Thornton, Yaxing Wei, Debjani Singh, D. Lunga","doi":"10.1145/3486640.3491391","DOIUrl":"https://doi.org/10.1145/3486640.3491391","journal":{"name":"Proceedings of the 1st ACM SIGSPATIAL International Workshop on Searching and Mining Large Collections of Geospatial Data"},"publicationDate":"2021-11-02","platform":"Semanticscholar","paperid":"125426824"}
Zhe Zhang, Zhangyang Wang, A. Li, Xinyue Ye, E. L. Usery, Diya Li
Geospatial data has been widely used in Geographic Information Systems to understand spatial relationships in application domains such as disaster response, agricultural risk management, environmental planning, and water resource protection. Many data sharing platforms, such as the NASA Open Data Portal and the USGS Geo Data Portal, have been developed to enhance spatial data sharing services. However, enabling intelligent and efficient spatial data sharing and communication across different domains and stakeholders (e.g., data producers, researchers, and domain experts) is a formidable task. The challenge lies in building meaningful semantic relationships between data products, using spatiotemporal similarity measures, so that users can efficiently find all the relevant data and information at the space-time scale. In this paper, we developed a novel AI-based graph embedding algorithm to build semantic relationships between different spatial data sets to enable efficient and accurate data search. We applied the graph embedding algorithm to 30,000 NASA metadata records to test its performance. Finally, we visualized the knowledge graph using the Neo4j database graphical user interface.
{"title":"An AI-based Spatial Knowledge Graph for Enhancing Spatial Data and Knowledge Search and Discovery","authors":"Zhe Zhang, Zhangyang Wang, A. Li, Xinyue Ye, E. L. Usery, Diya Li","doi":"10.1145/3486640.3491393","DOIUrl":"https://doi.org/10.1145/3486640.3491393","journal":{"name":"Proceedings of the 1st ACM SIGSPATIAL International Workshop on Searching and Mining Large Collections of Geospatial Data"},"publicationDate":"2021-11-02","platform":"Semanticscholar","paperid":"122450771"}
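The abstract above does not say which spatiotemporal similarity measure is used, so the following is only a minimal sketch of one plausible choice: a weighted Jaccard overlap of bounding boxes and time ranges between metadata records, with pairs above a threshold becoming candidate graph edges. The `Record` fields, the `w_space` weight, and the `threshold` value are all hypothetical, coordinates are treated as planar (no antimeridian handling), and this is not the authors' graph embedding algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record: bounding box in degrees, time range in years."""
    name: str
    west: float
    south: float
    east: float
    north: float
    start: float
    end: float


def interval_overlap(a_lo, a_hi, b_lo, b_hi):
    """Length of the overlap between two 1-D intervals (0 if disjoint)."""
    return max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))


def spatial_jaccard(a, b):
    """Intersection-over-union of the two bounding boxes (planar approximation)."""
    inter = (interval_overlap(a.west, a.east, b.west, b.east)
             * interval_overlap(a.south, a.north, b.south, b.north))
    area_a = (a.east - a.west) * (a.north - a.south)
    area_b = (b.east - b.west) * (b.north - b.south)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def temporal_jaccard(a, b):
    """Intersection-over-union of the two time ranges."""
    inter = interval_overlap(a.start, a.end, b.start, b.end)
    union = (a.end - a.start) + (b.end - b.start) - inter
    return inter / union if union > 0 else 0.0


def similarity(a, b, w_space=0.5):
    """Weighted mix of spatial and temporal overlap; w_space is an assumed weight."""
    return w_space * spatial_jaccard(a, b) + (1.0 - w_space) * temporal_jaccard(a, b)


def build_edges(records, threshold=0.3):
    """Return (name, name, score) for every pair whose similarity meets the threshold."""
    edges = []
    for i, a in enumerate(records):
        for b in records[i + 1:]:
            score = similarity(a, b)
            if score >= threshold:
                edges.append((a.name, b.name, score))
    return edges
```

Edges produced this way could then be loaded into a graph database for inspection; the pairwise loop is quadratic, so a collection of tens of thousands of records would likely need a spatial index or blocking step first.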
Y. Ogawa, Takuya Oki, Shenglong Chen, Y. Sekimoto
This paper proposes a new method to join building footprint GIS data with the relevant buildings in street-view images taken by a vehicle-mounted camera. This is achieved by segmenting buildings in the street-view images and identifying the relevant building coordinates in each image. The building coordinates in the image are then estimated from the building vertices in the building footprint GIS data and the vehicle trajectory history. Finally, the target building is identified, and the building attributes corresponding to each building image are linked to it. This method enables the development of building image datasets with associated building attributes. The building image data, when linked to the relevant building attributes, could contribute to many innovative urban analyses, such as urban monitoring, the development of three-dimensional (3D) city models, and training image datasets annotated with building attributes.
{"title":"Joining Street-View Images and Building Footprint GIS Data","authors":"Y. Ogawa, Takuya Oki, Shenglong Chen, Y. Sekimoto","doi":"10.1145/3486640.3491395","DOIUrl":"https://doi.org/10.1145/3486640.3491395","journal":{"name":"Proceedings of the 1st ACM SIGSPATIAL International Workshop on Searching and Mining Large Collections of Geospatial Data"},"publicationDate":"2021-11-02","platform":"Semanticscholar","paperid":"126285196"}
Piotr Gramacki, Szymon Woźniak, Piotr Szymański
We selected 48 European cities and gathered their public transport timetables in the GTFS format. We utilized Uber's H3 spatial index to divide each city into hexagonal micro-regions. Based on the timetable data, we created features describing the quantity and variety of public transport availability in each region. Next, we trained an auto-associative deep neural network to embed each of the regions. With these representations, we then used a hierarchical clustering approach to identify similar regions. To do so, we utilized an agglomerative clustering algorithm with a Euclidean distance between regions and Ward's method to minimize in-cluster variance. Finally, we analyzed the obtained clusters at different levels to identify a number of clusters that qualitatively describe public transport availability. We showed that our typology matches the characteristics of the analyzed cities and allows successful searching for areas with similar public transport schedule characteristics.
{"title":"gtfs2vec: Learning GTFS Embeddings for comparing Public Transport Offer in Microregions","authors":"Piotr Gramacki, Szymon Woźniak, Piotr Szymański","doi":"10.1145/3486640.3491392","DOIUrl":"https://doi.org/10.1145/3486640.3491392","journal":{"name":"Proceedings of the 1st ACM SIGSPATIAL International Workshop on Searching and Mining Large Collections of Geospatial Data"},"publicationDate":"2021-11-01","platform":"Semanticscholar","paperid":"129709095"}
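The clustering step described in the gtfs2vec abstract (agglomerative clustering with Euclidean distance and Ward's method) can be illustrated with a small, dependency-free sketch. It assumes the region embeddings are already available as plain lists of floats; the function names and the toy vectors in the usage note are hypothetical, and the naive search over all cluster pairs is for illustration only, not a reconstruction of the authors' implementation.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged.

    Ward's criterion: |a||b| / (|a| + |b|) * squared Euclidean distance
    between the two cluster centroids.
    """
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clustering(embeddings, n_clusters):
    """Greedily merge the cheapest pair of clusters until n_clusters remain.

    Returns one integer label per input embedding.
    """
    if not 1 <= n_clusters <= len(embeddings):
        raise ValueError("n_clusters must be between 1 and the number of embeddings")

    # cluster id -> indices of member embeddings; every point starts alone
    clusters = {i: [i] for i in range(len(embeddings))}

    def cost(pair):
        i, j = pair
        return ward_cost([embeddings[k] for k in clusters[i]],
                         [embeddings[k] for k in clusters[j]])

    while len(clusters) > n_clusters:
        i, j = min(combinations(clusters, 2), key=cost)
        clusters[i] = clusters[i] + clusters.pop(j)

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters.values()):
        for k in members:
            labels[k] = label
    return labels
```

For example, `ward_clustering([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]], 3)` places the first two vectors in one cluster, the next two in another, and the last vector on its own. Stopping the merge loop at different values of `n_clusters` corresponds to cutting the hierarchy at different levels; for real workloads a library implementation such as scikit-learn's `AgglomerativeClustering` with `linkage="ward"` would be the more practical choice.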