MBZUAI Professor Timothy Baldwin delivered the presidential keynote address on Monday, May 23, 2022, at the 60th Annual Meeting of the Association for Computational Linguistics (ACL). Alongside his contributions as a publishing academic, Baldwin headlined the global conference, held in Dublin, in his capacity as president of the association.
Baldwin published three conference papers at ACL 2022, which show both the breadth of his academic work in Natural Language Processing (NLP) and the reach of his network of global collaborations.
His first paper, entitled “The patient is more dead than alive: exploring the current state of the multi-document summarisation of the biomedical literature,” was published in collaboration with researchers from The University of Melbourne and the ARC Training Centre in Cognitive Computing for Medical Technologies. The authors set out to propose a new approach to assessing how well a document summarizes a series of medical documents, for example for a systematic review, and identified major shortcomings in current multi-document summarization systems.
“Long term, we are looking to better understand and propose a way to reduce the human effort associated with the grueling task of reviewing medical documents,” Baldwin said. “Ultimately, we are aiming to identify important linkages, warning signs, and opportunities to improve healthcare through the application of NLP.”
In his second conference paper, Baldwin and 11 co-authors from around the world wrestle with the gargantuan challenge of digitally preserving and developing language technologies for more than 700 Indonesian indigenous languages. The paper provides an overview of the current state of NLP research for Indonesia’s languages and, from that, offers recommendations to help develop NLP technologies globally.
Baldwin’s third paper, also with co-authors from The University of Melbourne, investigates procedural texts such as recipes and chemical patents. Baldwin and his co-authors demonstrate empirically that referential language (such as pronouns referring to entities introduced earlier in the text) is a core component of translating procedural texts into structured workflows, and that computational models can, with reasonable success, capture such workflows automatically.
“It’s relatively easy to tell a five-year-old child how to make a peanut butter sandwich, as they implicitly understand things like how to use a knife or where the peanut butter should be spread,” Baldwin said. “It is quite a bit more challenging to tell a computer how to do it because of the amount of implicit knowledge there is in procedural text. We’ve found that ‘bridging’ terms are crucial to understanding procedural language: the things we take for granted, such as bread changing state from a loaf to a slice, and then into one side of a sandwich.”
As an association, the ACL holds annual meetings and regional chapter and special-interest events, and sponsors the journals Computational Linguistics and Transactions of the Association for Computational Linguistics, which are both published by MIT Press. This year’s event was the first hybrid event, with a particularly large number of submissions from researchers in Asia who were not able to travel and present papers in person. The event, which was held in Dublin, Ireland, had a special theme on low-resource languages, of which Irish, the local language of the host country, is one.
As a conference, ACL has a relatively long lineage for an AI conference, dating back to 1962. At that time, the association was named the Association for Machine Translation and Computational Linguistics. It wasn’t until 1968 that the association’s name was changed to its current form, ACL.
Sixty years might seem a long time for an AI conference, but Baldwin, who works in NLP, attributes the association’s origins in part to the Cold War-era need for fast, accurate intelligence, and the subsequent rise of machine translation.
As the flagship conference in the ACL calendar, the 2022 conference was supported by a range of globally known tech companies including Amazon, Bloomberg, Google, Meta, Baidu, IBM, Microsoft, Duolingo, Adobe and many more.
Earlier this year, Baldwin was appointed Associate Provost for Academic and Student Affairs, and the Acting Department Chair of the MBZUAI NLP department. Prior to joining MBZUAI, Baldwin spent 17 years at The University of Melbourne, including roles as Melbourne Laureate Professor, director of the ARC Training Centre in Cognitive Computing for Medical Technologies (in partnership with IBM), Associate Dean Research Training in the Melbourne School of Engineering, and deputy head of the Department of Computing and Information Systems.
Baldwin has previously held visiting positions at Cambridge University, the University of Washington, the University of Tokyo, Saarland University, NTT Communication Science Laboratories, and the National Institute of Informatics. His primary research focus is NLP, including deep learning, algorithmic fairness, computational social science, and social media analytics.
The paper, entitled “MuCoT: Multilingual Contrastive Training For Question-Answering In Low-resource Languages,” was accepted for oral presentation at ACL 2022 in the Workshop on Speech and Language Technologies for Dravidian Languages.
The team’s work aimed to show that fine-tuning the mBERT model with translations from the same language family boosts question-answering performance, whereas performance degrades when the translations come from a different language family. This was done in Tamil and Hindi with translations from Telugu, Malayalam, Bengali, and Marathi. Student authors Gokul Karthik Kumar, Abhishek Singh Gehlot, and Sahal Shaji Mullappilly co-authored the paper with Nandakumar.