"Making NLP Systems Robust to Language Variation: The Case of Slang" - Zhewei Sun, Research at TTIC
- Published December 20, 2024
- Originally presented on: Friday, October 25th, 2024 at 12:30am CT, TTIC, 6045 S. Kenwood Avenue, 5th Floor, Room 530
Title: "Making NLP Systems Robust to Language Variation: The Case of Slang"
Speaker: Zhewei Sun, TTIC
Abstract: Humans can leverage a rich set of linguistic expressions to describe similar concepts. Such variations often reflect one's emotions and social identity. Natural language processing (NLP) systems, on the other hand, tend to be trained on formal language corpora that lack variety. As a result, systems show diminished performance when tasked with processing under-represented varieties of language.
In this talk, I will discuss different avenues of research that can be pursued to address this limitation. In particular, I will focus on both data-driven and knowledge-driven approaches that have been proposed to allow NLP systems to process slang, a specific variety of contextual language. The first part of my talk will cover knowledge-driven approaches that inject linguistic and cognitive knowledge about slang into foundational NLP models for zero-shot generation and interpretation of novel slang. Next, I will discuss emerging data-driven approaches based on large language models (LLMs), examining the extent of the knowledge LLMs have acquired about slang and how such knowledge may have been obtained. Finally, I will discuss potential future directions for further enhancing NLP systems' robustness to language variation.
Timestamps:
00:05 Intro
00:44 Lecture
55:00 Q&A
#llms #largelanguagemodels #nlp #naturallanguageprocessing #artificialintelligence #machinelearning #algorithm #computervision #robotics #research