EDGE: Entity-Diffusion Gaussian Ensemble for Interpretable Tweet Geolocation Prediction

Bo Hui; Haiquan Chen; Da Yan; Wei-Shinn Ku

doi:10.1109/ICDE51399.2021.00099

Conference proceeding

2021 IEEE 37th International Conference on Data Engineering (ICDE), pp.1092-1103

04/2021

DOI: https://doi.org/10.1109/ICDE51399.2021.00099

Handle: https://hdl.handle.net/20.500.12741/rep:6833

Keywords: Correlation; Gaussian Mixture Model; Event detection; Geology; Tweet Geolocation; Semantics; Graph neural networks; Probabilistic logic; Noise measurement

Abstract

Knowing the locations of tweets can benefit a wide variety of applications such as venue recommendation, event detection, and disaster monitoring. However, fine-grained tweet geolocation prediction is challenging because tweets are short and therefore may contain no geo-indicative words, or may contain ambiguous, noisy information. Existing solutions either yield unsatisfactory accuracy in practical applications or make predictions that even experts struggle to interpret, failing to engender sufficient trust and actionability for real-world deployment. Our paper presents a tweet geolocation prediction framework, EDGE (Entity-Diffusion Gaussian Ensemble), which delivers predictions that are both accurate and highly interpretable without requiring any additional contextual information such as user profile and location history. In EDGE, we cast the geolocation problem as a neural network optimization problem by learning probabilistic generative models. Compared with existing works, EDGE has two distinctive features: (1) the inference builds on mining the correlation between non-geo-indicative entities and geo-indicative entities by diffusing their semantic embeddings over the constructed graph neural network (Entity Diffusion), and (2) each prediction result is represented as a Gaussian mixture instead of specific geographical coordinates (Gaussian Ensemble). Extensive experiments using real-world tweet datasets validate the superiority of EDGE over the state of the art in terms of all distance-based and POI-based metrics.
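The two ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the propagation rule, the graph construction, or how mixture parameters are learned. The sketch assumes a simple mean-over-neighbours propagation step for "diffusion" and a two-component bivariate Gaussian mixture over (latitude, longitude) treated as planar coordinates. All names (`diffuse`, `mixture_density`), the toy graph, and the numeric values are hypothetical.

```python
import numpy as np

def diffuse(embeddings, adjacency):
    """One propagation step (assumed rule, not taken from the paper):
    each entity's embedding becomes the mean of its own embedding and
    those of its neighbours, using a row-normalised A + I."""
    a_hat = adjacency + np.eye(adjacency.shape[0])      # add self-loops
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)    # row-normalise
    return a_hat @ embeddings

def mixture_density(point, weights, means, covs):
    """Density of a 2-D Gaussian mixture evaluated at `point`."""
    total = 0.0
    for w, mu, cov in zip(weights, means, covs):
        diff = point - mu
        inv = np.linalg.inv(cov)
        norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
        total += w * norm * np.exp(-0.5 * diff @ inv @ diff)
    return total

# Toy entity graph: entities 0 and 1 are linked, entity 2 is isolated.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0]])
embeddings = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [2.0, 2.0]])
smoothed = diffuse(embeddings, adjacency)

# Hypothetical prediction: a mixture with two candidate locations
# instead of a single coordinate pair.
weights = np.array([0.7, 0.3])
means = np.array([[40.71, -74.00],
                  [34.05, -118.24]])
covs = np.array([np.eye(2) * 0.01, np.eye(2) * 0.01])
density_at_first_mean = mixture_density(means[0], weights, means, covs)
```

In this toy setting the linked entities end up sharing information after one step, which is the intuition behind letting non-geo-indicative entities borrow signal from geo-indicative ones. The mixture keeps more than one candidate location, each with a weight, so a reader can see where the probability mass sits rather than receiving one opaque point.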

Details

Title: EDGE: Entity-Diffusion Gaussian Ensemble for Interpretable Tweet Geolocation Prediction
Creators: Bo Hui - Auburn University
Haiquan Chen - California State University, Sacramento
Da Yan - University of Alabama, Birmingham
Wei-Shinn Ku - Auburn University
Academic Unit: Computer Science Department
Publisher: IEEE
Publication Details: 04/2021
Grant note: National Science Foundation (10.13039/100000001)
Identifiers: 99257880271601671; https://hdl.handle.net/20.500.12741/rep:6833; https://doi.org/10.1109/ICDE51399.2021.00099
Language: English
