Wei Xu

[phonetic pronunciation: way shoo]

Associate Professor
College of Computing
Georgia Institute of Technology
  wei.xu@cc.gatech.edu
  @cocoweixu

I am a faculty member of the School of Interactive Computing and the Machine Learning Center at Georgia Tech. My research lies at the intersection of machine learning, natural language processing, and social media. I direct the NLP X Lab, which currently focuses on (1) large language models, including cultural bias, multilingual capability, temporal shifts, and personalization; (2) text generation, including constrained decoding and learnable evaluation metrics; and (3) interdisciplinary NLP applications that can make an impact in education, security, accessibility, and beyond. I received the NSF CAREER Award, Faculty Research Awards from Google, Sony, and Criteo, the CrowdFlower AI for Everyone Award, and Best Paper Awards at COLING'18 and ACL'24, as well as research funding from DARPA and IARPA. I am a member of the NAACL executive board. I was a postdoctoral researcher at the University of Pennsylvania. I received my PhD in Computer Science from New York University and my BS/MS from Tsinghua University.

  I'm recruiting 1-2 PhD students every year (apply to the Machine Learning or CS PhD program and list me as a potential advisor; if you have an EE background, consider also applying to the ML ECE program). I also recruit MS students (apply to the MSCS program and email me) and undergraduates who have sufficient time and motivation to pursue research theses.
What's New
  Apr 2025, talk at Apple ML Research (virtual)
  May 2025, guest lecture at Sungkyunkwan University
Research Highlights

Multilingual Multicultural LLMs

While LLMs have demonstrated impressive performance, their success is largely concentrated in English and other high-resource languages. In contrast, many non-English languages remain underrepresented and underserved. Moreover, these models often reflect Western cultural biases and struggle to capture the nuances of non-Western cultural contexts (Naous et al., ACL 2024; Naous et al., NAACL 2025). We work on identifying and closing these gaps in performance and cultural adaptation. Addressing these challenges calls for a deeper analysis of pre-training data to identify and mitigate representational gaps, as well as alignment (Guo et al., arXiv 2025) and inference-time algorithms (Le et al., ICLR 2024) that can dynamically adapt model behavior to diverse linguistic and cultural contexts.

Robustness and Reasoning of LLMs

Artificial General Intelligence (AGI) benchmarks seek to assess an AI system’s capacity to perform tasks that require human-level intelligence, including reasoning, learning, and adapting to novel situations (Zheng et al., ACL 2024; Mendes et al., EMNLP 2024). While current systems fall short of true AGI, there is growing interest in moving beyond static benchmarks toward more realistic, dynamic evaluations. Our research focuses on designing real-world tasks that better reflect practical challenges faced by LLMs, and on developing innovative methods (Zheng et al., arXiv 2025) to enhance their robustness and performance in these complex settings.

Interdisciplinary NLP+X Research

We actively collaborate with researchers to explore impactful real-world applications of large language models in Human-Computer Interaction, Security and Privacy, Healthcare, and Law (Jiang et al., EMNLP 2024; Dou et al., ACL 2024). As LLMs continue to advance, they offer exciting new capabilities across specialized domains. There are many opportunities here: LLMs often exhibit promising but inconsistent performance on domain-specific tasks, where precision, context sensitivity, and domain knowledge are critical.

NLP X Lab
    Yao Dou (CS PhD; human-centered LLM evaluation, generation)
    Tarek Naous (ECE ML PhD; multilingual multicultural LLM)
    Duong Minh Le (CS PhD; controllable text generation -- co-advisor: Alan Ritter)
    Jonathan Zheng (ML PhD; reasoning, robustness of LLM -- co-advisor: Alan Ritter)
    Geyang Guo (CS PhD; LLM alignment -- co-advisor: Alan Ritter)
    Junmo Kang (CS PhD; efficiency -- co-advisor: Alan Ritter)
    Chao Jiang (CS PhD; scientific writing)
    Anton Lavrouk (MS, autumn 2022 -- ; multilingual LLM)
    Xiaofeng Wu (MS, autumn 2023 -- ; LLM subcharacter)
    Romain Froger (MS, spring 2025)
    Neel Kothari (MS, spring 2025)
    Tanish Patwa (MS, spring 2025)
    Vishnesh Jayanthi (Undergrad, summer 2022 -- ; stylistics)
    Rachel Choi (Undergrad, summer 2022 -- )
    Ian Ligon (Undergrad, summer 2022 -- )
    Govind Ramesh (Undergrad, winter 2022 -- ; LLM safety)
    Nour Allah El Senary (Undergrad, winter 2022 -- )
    Joseph Thomas (Undergrad, summer 2024 -- )
    Oleksandr Lavreniuk (Undergrad, summer 2024 -- )
    Jad Matthew Bardawil (Undergrad, autumn 2024 -- )

Preprints
Publications
Teaching
Current Offering:
Previous Offerings:

Service

I serve or have served as an executive board member of NAACL (2023-2024); a best paper award committee member for EMNLP 2022 and 2024; a senior area chair for EMNLP 2024 (resource and evaluation) and 2022 (generation), NAACL 2025 (generation), 2022 (machine learning for NLP), and 2021 (generation), and ACL 2020 (generation); an area chair for COLM 2024, ACL 2023 (semantics), EMNLP 2021 (computational social science), EMNLP 2020 (generation), AAAI 2020 (NLP), ACL 2019 (semantics), NAACL 2019 (generation), EMNLP 2018 (social media), COLING 2018 (semantics), and EMNLP 2016 (generation); a workshop chair for ACL 2017; and the publicity chair for EMNLP 2019, NAACL 2018, and NAACL 2016.

Miscellaneous

When I have spare time, I enjoy visiting art museums, hiking, biking, and snowboarding.

I wrote a biography of my PhD advisor Ralph Grishman, along with some early history of Information Extraction research, in 2017.

I also made a list of the best dressed NLP researchers in 2016/17, 2015, and 2014.