As the world’s first graduate-level, research-based artificial intelligence (AI) university, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) is continuing to increase the breadth and pace of its publication of ground-breaking AI research.
Between January and June 2024, the MBZUAI community—made up of more than 80 world-class faculty, 200-plus researchers, and hundreds of students—published more than 300 papers at top-tier AI venues. This included 39 papers at the prestigious International Conference on Learning Representations 2024 (ICLR) held in May.
This follows the university’s success in 2023, when it published 612 papers at top-tier venues. Highlights included 30 papers at the International Conference on Computer Vision (ICCV), 34 papers at the IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 44 papers at Empirical Methods in Natural Language Processing (EMNLP), and 53 papers at the Conference on Neural Information Processing Systems (NeurIPS).
Five years since its inception, MBZUAI is now recognized as one of the world’s top 100 universities across all of computer science, and is ranked in the top 20 globally across AI, computer vision, machine learning, natural language processing (NLP), and robotics (CSRankings).
Five stand-out research papers published by MBZUAI in the past six months are listed below:
A team of MBZUAI researchers, working with international collaborators, developed a series of resources for identifying text created by large language models (LLMs), which could have a profound impact in fields such as journalism, academia, and education. Previous research in this field was limited to one or two languages, a single text generator, or a single domain (such as news) and use case (such as text summarization). In contrast, the M4 analyzer that came out of this work covers multiple languages, various LLMs, and diverse domains, enabling more general machine-generated text detection. Additionally, the dataset associated with this work will lay the foundation for future research on more robust approaches to the pressing societal problems associated with LLM-created text.
The paper, ‘M4: Multi-generator, multi-domain, and multi-lingual black-box machine-generated text detection’, was awarded the Best Resource Paper Award at the European Chapter of the Association for Computational Linguistics Conference 2024 (EACL) held in March.
Gene regulatory networks (GRNs) represent the causal relationships governing gene activities in cells that are vital for understanding biological processes and diseases. A common problem in investigating GRNs is dropouts, or zero values, in single-cell RNA sequencing data. These zero values could be true dropouts, reflecting no gene activity, or false dropouts caused by the sequencing process itself. In their paper, ‘Gene Regulatory Network Inference In The Presence Of Dropouts: A Causal View’, the researchers propose a new causal dropout model that provides a more accurate picture by focusing on data without zeros when testing gene relationships.
The paper was presented at ICLR 2024 and marked a significant step forward in genetic research.
Researchers from MBZUAI have led a global team in developing an advanced vision language model called Grounding Large Multimodal Model (GLaMM), which supports higher-fidelity interaction between text and images. It is capable of generating natural language responses related to objects in an image at the pixel-level, offering enhanced automated image captioning, reasoning, and the ability to switch objects in images.
In their paper, ‘GLaMM: Pixel grounding large multimodal model’, the researchers detail how the model is trained to allow users to interact using both text and visual prompts, generating natural language responses seamlessly intertwined with corresponding object segmentation masks. To test GLaMM, the authors created a novel dataset with millions of detailed image annotations. GLaMM’s advanced capabilities make AI more intuitive and effective in tasks like grounded conversation generation, referring expression segmentation, image and region-level captioning, and vision-language conversations. Real-world applications include sectors such as e-commerce, fashion, safe and smart cities, and home retail. GLaMM was published at CVPR 2024, held in Seattle in June, which is the highest-ranked engineering and computer science venue worldwide. It has already received more than 50 citations and 600 stars on GitHub.
Dr. Xiaodan Liang and Professor Xiaojun Chang, both faculty members in MBZUAI’s Computer Vision Department, have teamed up with international collaborators to develop a new technique that can make vision transformers, a core component of most modern models for image and video analysis, more efficient. As set out in their paper, ‘MLP can be a good Transformer Learner’, the key discovery is that certain layers in the transformer can be replaced with much simpler multilayer perceptron (MLP) layers. This change, guided by a measure of randomness known as entropy, maintains performance with much smaller models. The new method supports more streamlined and efficient AI model training, potentially paving the way for faster and less resource-intensive technologies.
The paper was presented as an oral at CVPR 2024 and was nominated for a Best Paper Award.
MBZUAI research shows how a better understanding of the relationships between variables can benefit fundamental scientific research.
The Arabic language is underrepresented in the digital world, making AI inaccessible for many of its 400.....
A team from MBZUAI used instruction tuning to help multimodal LLMs generate HTML code and answer questions.....