Computer Science Seminar - University of Houston

Computer Science Seminar

Project Mélange: I Know When and Why You Switched from Inglés a Español

When: Thursday, March 3, 2016
Where: PGH 550
Time: 11:00 AM - 12:00 PM

Speakers: Monojit Choudhury and Kalika Bali

Host: Prof. Thamar Solorio

Multilingual communities exhibit code-switching, that is, the mixing of two or more socially stable languages in a single conversation, sometimes even in a single utterance. This phenomenon has been widely studied by linguists and interaction scientists in the spoken language of such communities. However, with the prevalence of social media and other informal interactive platforms, code-switching is now also ubiquitous in user-generated text. Because multilingual communities are the norm rather than the exception globally, it is essential that language technologies and natural user interfaces (NUIs) such as Skype Translator and Cortana handle code-switched text and speech adequately.

Project Mélange aims to analyze and understand code-switching behavior at two levels: first, the formal structural level that dictates the grammar of such a construct, and second, the functional level that motivates its use from a cognitive, pragmatic and socio-cultural perspective. This in turn would allow us to process mixed language as well as better model conversations and dialogues in a multilingual setting.

In this talk we will present an overview of our work towards making this possible. In particular, we will discuss word-level language detection, as well as the linguistic analysis and automatic identification of the functions of switching on Twitter.

Website: http://research.microsoft.com/en-us/projects/melange/

Blog: https://pocomixmaadi.wordpress.com/

Bios:

Monojit Choudhury is a Researcher at Microsoft Research Lab India. Prior to this, he earned his PhD (2007) and B.Tech (2002), both in Computer Science and Engineering, from the Indian Institute of Technology Kharagpur. His research interests include NLP for low-resource languages, technologies for multilingual communities, and computational approaches to linguistics, sociolinguistics, evolutionary linguistics and cognition. Monojit is actively involved in organizing the International Linguistics Olympiad (http://www.ioling.org) and its Indian national counterpart, the Panini Linguistics Olympiad (http://plo-in.org). Both programs try to attract the brightest high school students to linguistics and NLP through challenging yet interesting and thought-provoking puzzles.

Website: http://research.microsoft.com/en-us/people/monojitc/

Kalika Bali is a Researcher at Microsoft Research Lab India. A linguist and an acoustic phonetician by training, she has worked for the last 15 years in the area of Speech and Language Technology, especially for resource-poor languages. Her brief stint as a lecturer at the University of the South Pacific, Fiji, left her with a lasting interest in how technology can be used to enhance and further education. Some of her current research lies at the intersection of ICT and education, for learners ranging from primary school students to adults learning new skills. The primary focus of her research is on how Natural Language systems can help Human-Computer Interaction, including computer-mediated interaction, in the domains of education and social media.

Website: http://research.microsoft.com/en-us/people/kalikab/