A Geocoding Framework Powered by Delivery Data
Vishal Srivastava, Priyam Tejaswin, Lucky Dhakad, Mohit Kumar, Amar Dani
Proceedings of the 28th International Conference on Advances in Geographic Information Systems, 2020-11-03
DOI: https://doi.org/10.1145/3397536.3422254
Citations: 13
Abstract
Over the last decade, India has witnessed an explosion in the e-commerce industry. Adoption of e-commerce is increasing in smaller towns and cities, beyond the densely populated urban centers. In this paper, we discuss the practical challenges involved in developing high-precision geocoding engines for these regions of India. These challenges motivate the next iteration of our geocoding framework. In particular, we focus on three core areas of improvement: 1) leveraging customer delivery data for geocoding, 2) understanding and handling the diversity and variation of addresses in these new regions, and 3) overcoming the limited coverage of our reference corpus. To this end, we present GeoCloud. The key contributions of GeoCloud are 1) a training algorithm for learning reference-representations from delivery coordinates and 2) a retrieval algorithm for geocoding new addresses. We test GeoCloud extensively across India to capture the regional, socioeconomic and linguistic diversity of the country. Our evaluation data is sampled from the delivery addresses of a large e-commerce platform in India and covers 72 cities across 21 states. The results show a significant improvement in precision and recall over the state-of-the-art geocoding system for India, and demonstrate the effectiveness of our intuitive, robust and generic approach. While we have shown the effectiveness of the framework for Indian addresses, we believe it can be applied to other countries as well, particularly where addresses are unstructured. To the best of our knowledge, this is the first instance of geocoding by learning reference-representations from large-scale delivery data.
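The abstract does not describe how GeoCloud's training and retrieval algorithms work, so the following is only a toy illustration of the general idea of learning location references from delivery coordinates and then using them to geocode a new address. It is not the authors' method. Every name here (`tokenize`, `learn_references`, `geocode`, `min_support`) and every design choice (per-token median coordinates, picking the token with the tightest spatial spread) is an assumption made for the sketch.

```python
import re
from collections import defaultdict
from statistics import median


def tokenize(address):
    """Lowercase an address and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", address.lower()) if t]


def learn_references(deliveries, min_support=2):
    """Toy 'training' step (assumed, not from the paper).

    deliveries: iterable of (address, (lat, lon)) pairs.
    Returns a dict mapping token -> (median_lat, median_lon, spread),
    keeping only tokens seen in at least `min_support` deliveries.
    """
    points = defaultdict(list)
    for address, (lat, lon) in deliveries:
        # Count each token once per address.
        for tok in set(tokenize(address)):
            points[tok].append((lat, lon))

    refs = {}
    for tok, pts in points.items():
        if len(pts) < min_support:
            continue
        lats = [p[0] for p in pts]
        lons = [p[1] for p in pts]
        # Spread in degrees: a crude proxy for how specific a token is.
        spread = max(max(lats) - min(lats), max(lons) - min(lons))
        refs[tok] = (median(lats), median(lons), spread)
    return refs


def geocode(address, refs):
    """Toy 'retrieval' step (assumed): use the most spatially specific known token."""
    candidates = [refs[t] for t in tokenize(address) if t in refs]
    if not candidates:
        return None
    lat, lon, _ = min(candidates, key=lambda r: r[2])
    return (lat, lon)


# Made-up sample data, for illustration only.
deliveries = [
    ("12 Rose Villa, Indiranagar, Bangalore", (12.9710, 77.6400)),
    ("14 Rose Villa, Indiranagar, Bangalore", (12.9712, 77.6402)),
    ("Lake View Apts, Koramangala, Bangalore", (12.9350, 77.6200)),
]
refs = learn_references(deliveries)
print(geocode("Rose Villa near park, Indiranagar", refs))
```

In this sketch a city-level token such as "bangalore" has a wide spread and is used only when no more specific token is known, while an address with no known tokens returns `None`. A real system would need far more care with spelling variation, noisy coordinates and token ambiguity, which are exactly the challenges the abstract names.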