Veselin Stoyanov

Adjunct Professor of Natural Language Processing

Research interests

Stoyanov’s research interests center on large language models, including pretraining, fine-tuning for instruction following, and applications designed to solve real-world problems. He is also interested in efficient, sparsely activated models such as mixtures of experts (MoE), as well as multilingual LLMs and training models to perform tasks cross-lingually. He seeks to develop new paradigms for applying LLMs in real-world, interactive scenarios that augment the creative process while allowing people to be more efficient.

Since April 2023, Stoyanov has served as the head of AI/ML at Tome, a productivity company based in San Francisco, where he leads the development of new approaches for AI-powered products. Before joining Tome, he worked for nearly a decade at Facebook and Meta, where he most recently served as an applied research scientist manager and led the development of pretrained language models such as RoBERTa, XLM-R and OPT. His work at Facebook and Meta spanned NLP for search, neural machine translation, self-supervised methods for identifying hate speech, and multilingual language models. He was also integral to the team that built MultiRay, a service that runs multiple very large, accurate self-supervised models on the same input. Prior to Facebook, Stoyanov was an assistant research scientist at Johns Hopkins University's Center for Language and Speech Processing, where he held a Computing Innovation Fellowship and focused on learning for structured prediction.

Education

  • Ph.D. in computer science from Cornell University
  • M.Sc. in computer science from Cornell University
  • B.Sc. with distinction in computer science from the University of Delaware

Honors

  • Computing Innovation Fellowship, Computing Research Association, 2010
  • Graduate Research Fellowship, National Science Foundation, 2005

Selected publications

  • A Ni, S Iyer, D Radev, V Stoyanov, W Yih, S Wang, XV Lin. LEVER: Learning to Verify Language-to-Code Generation with Execution. International Conference on Machine Learning, 2023.
  • P Hase, M Diab, A Celikyilmaz, X Li, Z Kozareva, V Stoyanov, M Bansal, S Iyer. Methods for Measuring, Updating, and Visualizing Factual Beliefs in Language Models. Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023.
  • A Halevy, C Canton-Ferrer, H Ma, U Ozertem, P Pantel, M Saeidi, F Silvestri, V Stoyanov. Preserving integrity in online social networks. Communications of the ACM, 2022.
  • B Gunel, J Du, A Conneau, V Stoyanov. Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning. https://arxiv.org/pdf/2011.01403.pdf, 2021.
  • A Conneau, K Khandelwal, N Goyal, V Chaudhary, G Wenzek, F Guzmán, E Grave, M Ott, L Zettlemoyer, V Stoyanov. Unsupervised Cross-lingual Representation Learning at Scale. https://arxiv.org/pdf/1911.02116.pdf, 2020.
  • A Conneau, S Wu, H Li, L Zettlemoyer, V Stoyanov. Emerging Cross-lingual Structure in Pretrained Language Models. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020.
  • M Lewis, Y Liu, N Goyal, M Ghazvininejad, A Mohamed, O Levy, V Stoyanov, L Zettlemoyer. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. https://arxiv.org/pdf/1910.13461.pdf, 2019.
  • Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, L Zettlemoyer, V Stoyanov. RoBERTa: A Robustly Optimized BERT Pretraining Approach. https://arxiv.org/pdf/1907.11692.pdf, 2019.
