Salman Khan

Associate Professor of Computer Vision

Research Interests

Professor Khan's research is in computer vision and multimodal AI, centered on building general-purpose visual and multimodal reasoning systems that operate reliably in open-world settings. His group works on large multimodal models (LMMs) for images, video, and Earth observation; pixel-grounded vision-language models; geospatial and climate foundation models; and multilingual, culturally inclusive multimodal systems. A continuing thread across this work is efficiency and robustness: designing compact transformer architectures and models that stay dependable under distribution shift, so that AI can support real-world recognition, reasoning, and decision-making, including UAE national priorities in climate resilience, agriculture, and sustainability.

Professor Khan is a founding faculty member of MBZUAI, where he leads an independent research program in computer vision and multimodal AI. His group is recognized for work on open-world detection, pixel-grounded vision-language models, geospatial and climate foundation models, and culturally diverse multilingual multimodal systems. He has published more than 150 papers in premier venues including TPAMI, IJCV, CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, and ACL. He was named a Clarivate Highly Cited Researcher (2025) and ranks among the top 0.4% of AI scientists worldwide in the Stanford global ranking.

He leads several nationally strategic programs in climate, agriculture, and Earth observation as principal investigator, with partners including IBM, ADIA, ADQ, NASA, and the Gates Foundation. He serves as Area Chair at all major AI and vision venues and has accepted leadership roles in the community. He is a steering committee member of the AI Alliance and an inventor on 18 US patents, including MBZUAI's first granted US patent. His group has released more than 100 open-source repositories (25,000+ GitHub stars) and 30+ datasets and benchmarks, with more than 10 million Hugging Face downloads.

Prior to joining MBZUAI, Professor Khan was a senior scientist at the Inception Institute of Artificial Intelligence (IIAI) (2018–2020) and has been an honorary faculty member at the Australian National University (ANU) since 2016. He previously worked as a research scientist with Data61–CSIRO (2016–2018) and as a visiting researcher with National ICT Australia (NICTA) in 2015. He received his Ph.D. from the University of Western Australia (UWA) in 2016.

Education

  • Ph.D. in Computer Science, University of Western Australia (UWA), Australia (honorable mention on Dean’s list)

Awards

  • Provost's Distinguished Research Award, MBZUAI (2025).
  • ADIA Lab Fellow (2025), for agentic decision-support systems.
  • Microsoft AIEI Senior Fellow (2026).
  • Clarivate Highly Cited Researcher in Computer Science (2025).
  • Ranked among the top 0.4% of AI scientists worldwide, Stanford global ranking.
  • UAE AI Award 2025 Finalist (Scientific Research category), for geospatial vision-language models.
  • Meta Llama Impact Innovation Award (2024), for the BiMediX bilingual biomedical LMM.
  • NASSCOM AI Game Changer Award (2022), for open-world object detection.
  • Best Paper Award Finalist, CVPR 2022 (burst image restoration).
  • Best Student Paper Honorable Mention, ACCV 2024 (ObjectCompose).
  • Best Paper Award, TerraBytes Workshop at ICML 2025 (AirCast).
  • Best Paper Award, MONTI Workshop (MIT) at CVPR 2026 (ThinkGeo).
  • 2nd place, Berkeley RDI AgentX Competition 2025 (Reasoning & Planning track).
  • Multiple NTIRE Challenge wins and podiums at CVPR (2019, 2021).
  • Restormer and MPRNet repeatedly listed among Paper Digest's "Most Influential CVPR Papers"; UNETR++ the most-read IEEE TMI article (2025).
  • Repeated Outstanding Reviewer awards at CVPR and ICCV (most recently CVPR 2026); Publons Top Reviewer (top 1% globally).
  • Recipient of prestigious scholarships including Fulbright and IPRS.

Publications

Professor Khan has published more than 150 papers in top journals and conferences such as TPAMI, IJCV, CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, and ACL, with more than 52,000 citations and an h-index of 94. His full and current publication list is available on Google Scholar. Selected and recent works across his main research directions:

  • M. S. Danish, M. A. Munir, S. R. A. Shah, M. H. Khan, R. M. Anwer, J. Laaksonen, F. S. Khan, S. Khan. "TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation." ICLR 2026.
  • T. Ashraf, A. Saqib, H. Gani, M. AlMahri, Y. Li, N. Ahsan, U. Nawaz, J. Lahoud, H. Cholakkal, M. Shah, P. Torr, F. S. Khan, R. M. Anwer, S. Khan. "Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks." ICLR 2026. (2nd place, Berkeley RDI AgentX Competition 2025)
  • S. Soni, A. Dudhane, H. Debary, M. Fiaz, M. A. Munir, M. S. Danish, P. Fraccaro, C. D. Watson, L. J. Klein, F. S. Khan, S. Khan. "EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues." CVPR 2025. (UAE AI Award 2025 Finalist)
  • S. Munasinghe, H. Gani, W. Zhu, J. Cao, E. P. Xing, F. S. Khan, S. Khan. "VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos." CVPR 2025.
  • A. Vayani, D. Dissanayake, H. Watawana, N. Ahsan, …, S. Khan, F. S. Khan. "All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages (ALM-Bench)." CVPR 2025 (Highlight).
  • A. Shabbir, M. Zumri, M. Bennamoun, F. S. Khan, S. Khan. "GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing." ICML 2025.
  • O. Thawakar, D. Dissanayake, K. More, R. Thawkar, A. Heakl, N. Ahsan, Y. Li, M. Zumri, J. Lahoud, R. M. Anwer, H. Cholakkal, I. Laptev, M. Shah, F. S. Khan, S. Khan. "LlamaV-o1: Rethinking Step-by-Step Visual Reasoning in LLMs." ACL 2025.
  • S. Ghaboura, A. Heakl, O. C. Thawakar, A. H. S. A. Alharthi, I. Riahi, A. Radman, J. Laaksonen, F. S. Khan, S. Khan, R. M. Anwer. "CAMEL-Bench: A Comprehensive Arabic LMM Benchmark." NAACL 2025.
  • M. S. Danish, M. A. Munir, S. R. A. Shah, K. Kuckreja, F. S. Khan, P. Fraccaro, A. Lacoste, S. Khan. "GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks." ICCV 2025 (Highlight).
  • V. Nedungadi, M. A. Munir, M. Rußwurm, R. Sarafian, I. N. Athanasiadis, Y. Rudich, F. S. Khan, S. Khan. "AirCast: Improving Air Pollution Forecasting Through Multi-Variable Data Alignment." TerraBytes Workshop, ICML 2025. (Best Paper Award)
  • H. A. Rasheed, M. Maaz, S. S. Mullappilly, A. M. Shaker, S. Khan, H. Cholakkal, R. M. Anwer, E. P. Xing, M.-H. Yang, F. S. Khan. "GLaMM: Grounding Large Multimodal Model." CVPR 2024.
  • K. Kuckreja, M. S. Danish, M. Naseer, A. Das, S. Khan, F. S. Khan. "GeoChat: Grounded Large Vision-Language Model for Remote Sensing." CVPR 2024.
  • M. Maaz, H. Rasheed, S. Khan, F. S. Khan. "Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models." ACL 2024.
  • A. Shaker, M. Maaz, H. Rasheed, S. Khan, M.-H. Yang, F. S. Khan. "SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications." ICCV 2023.
  • S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, M.-H. Yang. "Restormer: Efficient Transformer for High-Resolution Image Restoration." CVPR 2022 (Oral).
  • S. Khan, M. Naseer, M. Hayat, S. W. Zamir, F. S. Khan, M. Shah. "Transformers in Vision: A Survey." ACM Computing Surveys, 2022.
  • M. Naseer, K. Ranasinghe, S. Khan, M. Hayat, F. S. Khan, M.-H. Yang. "Intriguing Properties of Vision Transformers." NeurIPS 2021 (Spotlight).
  • J. K. Joseph, S. Khan, F. S. Khan, V. N. Balasubramanian. "Towards Open World Object Detection." CVPR 2021 (Oral). (NASSCOM AI Game Changer Award)
