Proposals

MPhil Projects

These are proposals for MPhil projects with the NLIP group. If you are an MPhil student in the ACS at Cambridge and are interested in any of these projects, send me an email!

Project 0: LLMs for 0-shot Word Sense Disambiguation

Proposer: David Strohmaier

Supervisors: David Strohmaier and Paula Buttery

Use LLMs for 0-shot word sense disambiguation and break the ceiling of human performance!

Word sense disambiguation has long been one of the biggest challenges in NLP. It has even been considered AI-complete. While recent advances using transformer models have started to reach the human ceiling, these breakthroughs typically rely on large amounts of data. For some dictionaries, such as the Cambridge Advanced Learner’s Dictionary (CALD), no such extensive resources are available. Humans can nonetheless disambiguate words using CALD. Can machines do the same? You’ll use LLMs to address this challenge and solve the problem of word sense disambiguation in a 0-shot setting.

You’ll be provided with access to GPU resources and a digital version of CALD.
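To make the 0-shot setting concrete, here is a minimal sketch of how a target word, its sentence and a list of dictionary senses could be turned into a prompt, and how a model's reply could be mapped back to a sense. Everything in it is an assumption for illustration: the `Sense` class, the function names, the prompt wording and the two example definitions are invented and do not reflect CALD's actual data format or any particular LLM API. The model call itself is left out; the `reply` string stands in for whatever the model returns.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sense:
    """One dictionary sense. Hypothetical schema, not CALD's real format."""
    sense_id: str
    definition: str


def build_prompt(word: str, sentence: str, senses: List[Sense]) -> str:
    """Format a 0-shot prompt listing the candidate senses as numbered options."""
    options = "\n".join(
        f"{i}. {sense.definition}" for i, sense in enumerate(senses, start=1)
    )
    return (
        f'Sentence: "{sentence}"\n'
        f'Which sense of "{word}" is used in the sentence?\n'
        f"{options}\n"
        "Answer with the number of the best-matching sense."
    )


def parse_answer(reply: str, senses: List[Sense]) -> Optional[Sense]:
    """Map the first number in the reply to a sense; None if it is missing or out of range."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    index = int(match.group()) - 1
    return senses[index] if 0 <= index < len(senses) else None


# Invented example senses for illustration only.
senses = [
    Sense("bank.1", "an organization where people keep and borrow money"),
    Sense("bank.2", "the land along the side of a river"),
]
prompt = build_prompt("bank", "We sat on the bank and fished.", senses)
reply = "2"  # stand-in for the text an LLM would return for `prompt`
predicted = parse_answer(reply, senses)
print(prompt)
print(predicted.sense_id if predicted else "no valid answer")
```

Numbering the options keeps the reply easy to parse, but it is only one possible design; a project could equally score each definition separately or ask the model to quote the definition it chose.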

Relevant Literature

Maru, M., Conia, S., Bevilacqua, M., & Navigli, R. (2022). Nibbling at the Hard Core of Word Sense Disambiguation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 4724–4737. https://doi.org/10.18653/v1/2022.acl-long.324

Navigli, R. (2009). Word Sense Disambiguation: A Survey. ACM Computing Surveys, 41(2), 1–69. https://doi.org/10.1145/1459352.1459355

Navigli, R. (2012). A Quick Tour of Word Sense Disambiguation, Induction and Related Approaches. In M. Bieliková, G. Friedrich, G. Gottlob, S. Katzenbeisser, & G. Turán (Eds.), SOFSEM 2012: Theory and Practice of Computer Science (pp. 115–129). Springer. https://doi.org/10.1007/978-3-642-27660-6_10

Oele, D., & Noord, G. van. (2017). Distributional Lesk: Effective Knowledge-Based Word Sense Disambiguation. IWCS 2017 - 12th International Conference on Computational Semantics: Short Papers, W17-6931. https://www.rug.nl/research/portal/en/publications/distributional-lesk-effective-knowledgebased-word-sense-disambiguation(b49623ea-3683-4173-89a9-2594f023aff4).html

Scarlini, B., Pasini, T., & Navigli, R. (2020). With More Contexts Comes Better Performance: Contextualized Sense Embeddings for All-Round Word Sense Disambiguation. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 3528–3539. https://doi.org/10.18653/v1/2020.emnlp-main.285

Vandenbussche, P.-Y., Scerri, T., & Daniel Jr., R. (2021). Word Sense Disambiguation with Transformer Models. Proceedings of the 6th Workshop on Semantic Deep Learning (SemDeep-6), 7–12. https://aclanthology.org/2021.semdeep-1.2