Basics
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed Representations of Words and Phrases and Their Compositionality. Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, 3111–3119.
- Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543.
- Levy, O., & Goldberg, Y. (2014). Neural Word Embedding as Implicit Matrix Factorization. Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, 2177–2185.
- Baroni, M., Dinu, G., & Kruszewski, G. (2014). Don’t Count, Predict! A Systematic Comparison of Context-Counting vs. Context-Predicting Semantic Vectors. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 238–247.
Advanced
- Arora, S., Li, Y., Liang, Y., Ma, T., & Risteski, A. (2016). A Latent Variable Model Approach to PMI-based Word Embeddings. Transactions of the Association for Computational Linguistics, 4, 385–399.
- Arora, S., Li, Y., Liang, Y., Ma, T., & Risteski, A. (2018). Linear Algebraic Structure of Word Senses, with Applications to Polysemy. Transactions of the Association for Computational Linguistics, 6, 483–495.
- Hashimoto, T. B., Alvarez-Melis, D., & Jaakkola, T. S. (2016). Word Embeddings as Metric Recovery in Semantic Spaces. Transactions of the Association for Computational Linguistics, 4, 273–286.
- Kenyon-Dean, K., Newell, E., & Cheung, J. C. K. (2020). Deconstructing Word Embedding Algorithms. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 8479–8484.