Hisham Cholakkal

Assistant Professor of Computer Vision

Research Interests

Professor Cholakkal’s research lies at the intersection of computer vision and multimodal learning, with a focus on foundation models and large multimodal models (LMMs). His objective is to build omnimodal AI companions that seamlessly integrate vision, audio, speech, and text across different languages and cultures, and to deploy these systems in wearables such as smart glasses. He is also interested in applying LMMs and AI companions to healthcare and social good. To support this vision, his research program is structured around three interconnected pillars: multimodal learning, healthcare foundation models, and advanced visual recognition architectures.

Prior to joining MBZUAI, Professor Cholakkal held research and technical leadership positions at the Inception Institute of Artificial Intelligence (IIAI) in the UAE, Mercedes-Benz R&D India, BEL Central Research Laboratory (India), and the Advanced Digital Sciences Center in Singapore. With more than 12 years of experience in computer vision and multimodal AI, he bridges fundamental research, teaching, and AI product development at scale.

As a principal investigator at MBZUAI, Professor Cholakkal has secured more than eight research grants and awards, including the Meta Llama Impact Innovation Award (2024), NVIDIA Academic Grant (2025), Meta Regional Research Grant (2025), and Google Gift Research Award (2023). His research as a PI has received paper awards and recognitions, including the SAC Highlights Award at EMNLP 2025, awarded to selected top papers at the conference.

His teaching contributions at MBZUAI have been recognized through the inaugural MBZUAI Teaching Excellence Award (2025). His role in building the University as one of its founding faculty members was acknowledged through the MBZUAI Founding Service Award.

He serves in leadership roles at top AI conferences, including General Chair of ACM Multimedia Asia 2026 and Local Chair of ACCV 2028. He has also held Area Chair positions at leading conferences such as CVPR, ICLR, NeurIPS, ACM Multimedia, ECCV, and BMVC. In addition, he has organized workshops on foundation models and vision transformers at major venues including CVPR, ICCV, NeurIPS, ACCV, and ICME.

  • Ph.D. in Computer Engineering, Nanyang Technological University (NTU), Singapore
  • Master of Technology (M.Tech), Indian Institute of Technology Guwahati (IIT Guwahati), India
  • Google Gift Research Award 2023 (Role: PI)
  • Meta Llama Impact Innovation Award 2024 (Role: PI)
  • Meta Regional Research Grant 2025 (Role: PI)
  • NVIDIA Academic Grant 2025 (Role: PI)
  • Paper Awards: Senior Area Chair Highlights Award at EMNLP 2025 (Role: PI)
  • MBZUAI Teaching Excellence Award 2025: Inaugural recipient, awarded to one faculty member across all departments.
  • Computer Vision Department Teaching Award 2026
  • Conference Organization: General Chair for ACM Multimedia Asia 2026, Abu Dhabi.
  • Conference Organization: Local Chair for ACCV 2028, Abu Dhabi.
  • Area Chair: CVPR 2026, ICLR 2026, NeurIPS 2025, ACM Multimedia 2025, ECCV 2024, BMVC 2024.
  • Guest Editor: Computer Vision and Image Understanding (CVIU) journal, special issue on "Foundational Models for Pixel-level Scene Understanding", 2025.
  • Workshop Organization: Organized four workshops as the primary organizer at top conferences such as CVPR 2024, ICCV 2023, NeurIPS 2022, and ACCV 2022. Additionally, co-organized workshops at ICME 2025.
  • Associate Editor: Pattern Analysis and Applications (PAA), IET Computer Vision.
  • Grant Proposal Reviewer: Organizations such as the National Science Center, Poland.
  • Co-inventor of the first US patent granted to MBZUAI and several other granted US patents.
  • Best Student Paper Award, VISAPP 2023 (project led by Professor Fahad Khan).
  • Research paper Agent-X received second place at the AgentX Competition hosted by UC Berkeley (project led by Professor Salman Khan).

Professor Cholakkal has published more than 100 research papers and holds more than eight granted U.S. patents across three primary research pillars: multimodal learning, healthcare foundation models, and efficient visual recognition architectures. His representative research publications are listed below:

  • Yevheniia Kryklyvets, Mohammed Irfan Kurpath, Sahal Shaji Mullappilly, Jinxing Zhou, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Hisham Cholakkal: “MAviS: An Audio-visual Conversational Assistant for Avian Species”. EMNLP, 2025 (main conference, SAC Highlights Award – among top 36 papers at the conference and oral).
  • Amrin Kareem, Jean Lahoud, Hisham Cholakkal: “PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model”. ECCV, 2024.
  • Sambal Shikhar, Mohammed Irfan Kurpath, Sahal Shaji Mullappilly, Jean Lahoud, Fahad Khan, Rao Muhammad Anwer, Salman Khan, Hisham Cholakkal: “LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM”. Findings of ACL, 2025.
  • Komal Kumar, Rao Muhammad Anwer, Fahad Shahbaz Khan, Salman Khan, Ivan Laptev, Hisham Cholakkal: “DEFT: Decompositional Efficient Fine-Tuning for Text-to-Image Models”. NeurIPS, 2025.
  • ALM team: “All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages”. CVPR, 2025 (Highlights).
  • Sara Pieri, Sahal Shaji Mullappilly, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman H. Khan, Timothy Baldwin, Hisham Cholakkal: “BiMediX: Bilingual Medical Mixture of Experts LLM”. Findings of EMNLP, 2024.
  • Sahal Shaji Mullappilly, Mohammed Irfan Kurpath, Sara Pieri, Saeed Yahya Alseiari, Shanavas Cholakkal, Khaled Aldahmani, Fahad Khan, Rao Anwer, Salman Khan, Timothy Baldwin, Hisham Cholakkal: “BiMediX2: Bio-Medical Expert LMM for Diverse Medical Modalities”. Findings of EMNLP, 2025.
  • Mohammad Almansoori, Komal Kumar, Hisham Cholakkal: “Self-Evolving Multi-Agent Simulations for Realistic Clinical Interactions”. MICCAI 2025 (Oral, early accept, top 2%).
  • Sara Pieri, J.R. Resto, S. Horvath, Hisham Cholakkal: “Handling Data Heterogeneity via Architectural Design for Federated Visual Recognition”. NeurIPS, 2023.
  • Hisham Cholakkal, G. Sun, S. Khan, F.S. Khan, L. Shao, L.V. Gool: “Towards Partial Supervision for Generic Object Counting in Natural Scenes”. IEEE TPAMI, 2022.
  • H. Rasheed, M. Maaz, Sahal Shaji, A. Shaker, S. Khan, Hisham Cholakkal, R.M. Anwer, E.P. Xing, M.-H. Yang, F.S. Khan: “GLaMM: Pixel Grounding Large Multimodal Model”. CVPR, 2024.
