Block-LDA: Jointly Modeling Entity-Annotated Text and Entity-Entity Links

Authored by: Edoardo M. Airoldi, David M. Blei, Elena A. Erosheva, Stephen E. Fienberg, Ramnath Balasubramanyan, William W. Cohen

Handbook of Mixed Membership Models and Their Applications

Print publication date: November 2014
Online publication date: November 2014

Print ISBN: 9781466504080
eBook ISBN: 9781466504097

10.1201/b17520-17


Abstract

Identifying latent groups of entities from observed interactions between pairs of entities is a frequently encountered problem in areas such as the analysis of protein interactions and social networks. We present a model that combines aspects of mixed membership stochastic blockmodels and topic models to improve entity-entity link modeling by jointly modeling links and text about the entities that are linked. We apply the model to two datasets: a protein-protein interaction (PPI) dataset supplemented with a corpus of abstracts of scientific publications annotated with the proteins in the PPI dataset, and an Enron email corpus. The induced topics’ ability to help understand the nature of the data provides a qualitative evaluation of the model. Quantitative evaluation shows that the joint model improves functional category prediction of proteins and perplexity over baselines that use only link or only text information. For the PPI dataset, the coherence of the emergent topics and the ability of the model to retrieve scientific articles and proteins relevant to each topic are compared with those of a text-only approach that does not make use of the protein-protein interaction matrix. Evaluation of the results by biologists shows that joint modeling yields better topic coherence and improves retrieval performance in the task of identifying the top related papers and proteins.
