Calendar - University of Houston

[Seminar] Adapting Neural Models to Address Challenges in Information Extraction from Social Media Data

Friday, February 5, 2021

11:00 am - 12:00 pm


Online via MS Teams


Social media data poses several interesting challenges for information extraction technology. My group has been studying how and why sequence labelling methods perform worse on social media data than the same models do on more heavily edited text, such as newswire. These studies have informed our design choices for models that are more robust to naturalistic data, including data with language switching. My goal is to help increase the range of language abilities that NLP technology covers.

During this talk, I’ll briefly discuss the proposals we have developed, which include simple adaptations to contextualized embeddings and a subword tokenization approach that is more flexible than the byte-pair encoding commonly used in language models. I’ll conclude with a discussion of possible research lines for the near future.

About the Speaker

Thamar Solorio is an Associate Professor in the Department of Computer Science at the University of Houston (UH). She holds graduate degrees in Computer Science from the Instituto Nacional de Astrofísica, Óptica y Electrónica in Puebla, Mexico. Her research interests include information extraction from social media data, enabling technology for code-switched data, stylistic modeling of text, and, more recently, multimodal approaches to online content understanding. She is the founder and director of the Research in Text Understanding and Language Analysis Lab at UH. She received an NSF CAREER award for her work on authorship attribution, as well as the 2014 Emerging Leader ABIE Award in Honor of Denice Denton. She is an elected board member of the North American Chapter of the Association for Computational Linguistics (2020-2021). Her research is currently funded by the National Science Foundation and Adobe, and in the past she has received support from the Office of Naval Research and the Defense Advanced Research Projects Agency (DARPA).
