MBZUAI affiliated professor of Machine Learning Chih-Jen Lin gave one of the keynotes at the recent 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, held in Taipei, Taiwan.
Lin, also a distinguished professor at the Department of Computer Science, National Taiwan University, joined a select group of keynote speakers that included Marc Najork, Distinguished Research Scientist, Google DeepMind; Ranjitha Kumar, Associate Professor, University of Illinois at Urbana-Champaign; and Ryen W. White, General Manager and Deputy Lab Director, Microsoft Research.
Titled ‘On the “Rough Use” of Machine Learning Techniques’, Lin’s address focused on instances where machine learning techniques were employed inappropriately. Lin emphasized that such challenges are not unusual and can sometimes arise unavoidably.
Lin introduced the concept of the “rough use” of these techniques through two real-life stories. The first concerned graph representation learning, where the learned representations are often evaluated on a node classification problem with multiple labels. Lin pointed out a common and unrealistic assumption: many researchers assume that the number of labels for each test instance is known at prediction time. Such ground-truth information, he noted, is rarely available in practice.
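The difference between the two evaluation settings can be illustrated with a minimal Python sketch. It is not taken from Lin's talk: the scores, the labels, the 0.5 threshold, and the function names are all made up for illustration. The "oracle" prediction borrows the true number of labels for each test node, while the "realistic" prediction has to decide from the scores alone.

```python
# Hypothetical decision scores for 3 test nodes over 4 labels (made-up numbers).
scores = [
    [0.90, 0.60, 0.20, 0.10],
    [0.40, 0.30, 0.80, 0.70],
    [0.55, 0.10, 0.20, 0.30],
]
# Hypothetical ground-truth label sets for the same 3 nodes.
true_labels = [{0}, {2, 3}, {0, 3}]


def predict_top_k(row, k):
    """Return the k highest-scoring labels; k must be supplied by the caller."""
    ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
    return set(ranked[:k])


def predict_threshold(row, threshold=0.5):
    """Return every label whose score reaches the threshold (assumed value)."""
    return {j for j, s in enumerate(row) if s >= threshold}


def micro_f1(truth, pred):
    """Micro-averaged F1 over all (node, label) pairs."""
    tp = sum(len(t & p) for t, p in zip(truth, pred))
    fp = sum(len(p - t) for t, p in zip(truth, pred))
    fn = sum(len(t - p) for t, p in zip(truth, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# "Rough use": k is read from the test labels, which is unavailable in practice.
oracle_pred = [predict_top_k(r, len(t)) for r, t in zip(scores, true_labels)]
# Realistic setting: the number of labels is decided from the scores alone.
realistic_pred = [predict_threshold(r) for r in scores]

oracle_f1 = micro_f1(true_labels, oracle_pred)
realistic_f1 = micro_f1(true_labels, realistic_pred)
print(f"oracle-k micro-F1:  {oracle_f1:.2f}")
print(f"threshold micro-F1: {realistic_f1:.2f}")
```

On this toy data the oracle setting scores higher, which shows how borrowing the label count from the test set can make results look better than a deployable predictor would achieve.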
The second story concerned how deep neural networks are trained. Lin described a common misunderstanding in which users incorrectly combine training, validation, and test sets in certain scenarios. Drawing on real cases, he highlighted the widespread confusion about how these sets relate to one another and the pitfalls that can follow.
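The article does not describe the specific scenarios Lin discussed, so the following Python sketch only shows one conventional way to keep the three sets in their separate roles. The toy data, the threshold "model", the candidate values, and the function names are all hypothetical. The hyperparameter is chosen on the validation set only, the model is then refit on training plus validation data, and the test set is used once for the final report.

```python
import random


def split_indices(n, seed=0):
    """Shuffle indices and cut them into disjoint 60/20/20 train/val/test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n * 6 // 10
    n_val = n * 2 // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def fit(points, bias):
    """Toy 'training': threshold = midpoint of the class means plus a bias."""
    pos = [x for x, y in points if y == 1]
    neg = [x for x, y in points if y == 0]
    if not pos or not neg:  # guard for a degenerate split
        return 0.5 + bias
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2 + bias


def accuracy(threshold, points):
    """Fraction of points whose thresholded prediction matches the label."""
    return sum(int(x > threshold) == y for x, y in points) / len(points)


# Made-up 1-D data set: the label is 1 when the feature exceeds 0.5.
data = [(i / 100, int(i > 50)) for i in range(100)]
train_idx, val_idx, test_idx = split_indices(len(data))
train = [data[i] for i in train_idx]
val = [data[i] for i in val_idx]
test = [data[i] for i in test_idx]

# Model selection looks at the validation set only, never at the test set.
candidates = [-0.2, 0.0, 0.2]  # hypothetical hyperparameter grid
best_bias = max(candidates, key=lambda b: accuracy(fit(train, b), val))

# Refit with the chosen hyperparameter on train + validation, then test once.
final_threshold = fit(train + val, best_bias)
test_accuracy = accuracy(final_threshold, test)
print(f"selected bias: {best_bias}, test accuracy: {test_accuracy:.2f}")
```

Whether to refit on the combined training and validation data is a design choice. The point of the sketch is only that the test set plays no part in choosing the hyperparameter.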
“Although the rough use of machine learning methods is common and sometimes unavoidable, the community should work together to change the culture and improve the practical use,” Lin said.
Lin’s presentation concluded with a call to action. In the intricate landscape of machine learning, he said, perfection is elusive and missteps are sometimes inevitable. The key to improving the situation, he argued, lies in developing high-quality, user-friendly software. Such software, he posited, would significantly enhance the practical application of machine learning techniques and reduce instances of misuse.