Center for Integrative Artificial Intelligence (CIAI)

About

MBZUAI’s Center for Integrative Artificial Intelligence (CIAI) has been created to further develop an AI operating system (OS) based on the foundation of the CASL project.

The CIAI is led by MBZUAI President Professor Eric Xing and center director Dr. Kun Zhang. It has brought together a team of highly experienced senior systems and machine learning engineers and researchers who are passionate about building systems.

The aim of the center is to develop a next-generation OS that supports easy composition, experimentation, and deployment of even the most advanced ML pipelines, such as building GPT-3-like language models for new tasks or full-stack AI systems for clinical management.

Objectives

An integrative AI system is not a monolithic black box, but a modular, standardizable, and certifiable assembly of building blocks at all levels: data, model, algorithm, computing, and infrastructure. At CIAI, we seek to develop principled approaches, including representations, optimization formalisms, intra- and inter-level mapping strategies, theoretical analysis, and production platforms, for optimal and potentially autonomous creation and configuration of AI solutions at ALL LEVELS: data harmonization, model composition, learning to learn, scalable computing, and infrastructure orchestration.

We believe machine learning at all levels is a necessity, not just a preference, for industrializing AI that can be considered transparent, trustworthy, and cost-effective.

Research

  • Hu and E. P. Xing, Panoramic Learning with A Standardized Machine Learning Formalism, arXiv:2108.07783
  • A. Qiao, S. K. Choe, S. J. Subramanya, W. Neiswanger, Q. Ho, H. Zhang, G. R. Ganger, and E. P. Xing, Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning, The 15th USENIX Symposium on Operating Systems Design and Implementation, 2021 (OSDI ’21). (Recipient of the Jay Lepreau Best Paper Award)
  • X. Zheng, B. Aragam, P. Ravikumar, and E. P. Xing, DAGs with NO TEARS: Continuous Optimization for Structure Learning, Advances in Neural Information Processing Systems 31, 2018 (NeurIPS ’18).
  • M. Al-Shedivat, J. Gillenwater, E. P. Xing, A. Rostamizadeh, Federated Learning via Posterior Averaging: A New Perspective and Practical Algorithms, Proceedings of 9th International Conference on Learning Representations, 2021. (ICLR ’21).
  • M. Al-Shedivat, A. Dubey, and E. P. Xing, Contextual Explanation Networks, Journal of Machine Learning Research, 21 (194), 1-44, 2020.
  • K. Kandasamy, K. Vysyaraju, W. Neiswanger, B. Paria, C. Collins, J. Schneider, B. Poczos, and E. P. Xing, Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly, Journal of Machine Learning Research, 21 (81), 1-27, 2020.
  • Xie, B. Huang, Z. Chen, Y. He, Z. Geng, K. Zhang, “Estimation of Linear Non-Gaussian Latent Hierarchical Structure,” International Conference on Machine Learning (ICML) 2022
  • Huang, C. Lu, L. Leqi, J. M. Hernandez-Lobato, C. Glymour, B. Schölkopf, K. Zhang, “Action-Sufficient State Representation Learning for Control with Structural Constraints,” International Conference on Machine Learning (ICML) 2022
  • Huang, F. Feng, C. Lu, S. Magliacane, K. Zhang, “AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning,” International Conference on Learning Representations (ICLR) 2022 (spotlight)
  • Yao, Y. Sun, A. Ho, C. Sun, K. Zhang, “Learning Temporally Latent Causal Processes from General Temporal Data,” International Conference on Learning Representations (ICLR) 2022
  • Adams, N. R. Hansen, K. Zhang, “Identification of Partially Observed Linear Causal Models: Graphical Conditions for the Non-Gaussian and Heterogeneous Cases,” Conference on Neural Information Processing Systems (NeurIPS) 2021
  • Xie, M. Gong, Y. Xu, and K. Zhang, “Unaligned Image-to-Image Translation by Learning to Reweight,” In Proceedings of International Conference on Computer Vision (ICCV) 2021
  • Xie, R. Cai, B. Huang, C. Glymour, Z. Hao, K. Zhang, “Generalized Independent Noise Condition for Estimating Linear Non-Gaussian Latent Variable Causal Graphs,” Conference on Neural Information Processing Systems (NeurIPS) 2020 (spotlight).
  • B. Huang, K. Zhang, J. Zhang, J. Ramsey, R. Sanchez-Romero, C. Glymour, B. Schölkopf, “Causal Discovery from Heterogeneous/Nonstationary Data,” Journal of Machine Learning Research (JMLR), 2020

Projects

Understanding the developmental processes leading to the formation of tissues and organs from a fertilized egg is one of the most fundamental questions in developmental biology. Although advances in microscopy and imaging systems have made imaging of mammalian embryos possible, there is a major gap between data collection and data analysis of the developing live cells. In this project, our objective at MBZUAI is to develop novel computer vision algorithms suitable for the automatic analysis of embryos.

Congenital heart diseases (CHD) are among the most frequent birth defects, affecting around 1 million children a year globally. Ultrasound screening of the fetus is used to acquire different views of the heart that may help detect heart abnormalities. However, detection of heart defects by human experts is error-prone, subjective, and time-consuming. The fetal heart is typically checked in the second-trimester scan, when most fetal organs can be reviewed. In this work, we aim to develop state-of-the-art machine learning models to classify fetal heart views and check for fetal abnormalities. This is expected to help clinicians make more accurate, real-time diagnostic decisions.

Sports-related knee injuries are the leading cause of the knee surgeries performed annually. Anterior cruciate ligament (ACL) tears and meniscus tears are the most prevalent injuries among athletes and the general population. These injuries are often detected using arthroscopy or knee magnetic resonance imaging (MRI). Because arthroscopy is an invasive way to assess knee injuries, knee MRI is generally preferred for diagnosis.

Heart disease is a major problem worldwide, encompassing many different kinds of disease. Coronary artery disease (CAD) in particular afflicts the blood vessels that supply the heart with blood. Ultrasound imaging, which is non-invasive, inexpensive, and quick, can be used as part of the diagnostic process for CAD. This project aims to automate CAD diagnosis with ultrasound imaging to reduce the load on clinical experts, who are in short supply, and to address human observer variability, leading to more reliable and consistent diagnoses.

According to the WHO, cancer accounted for around 10 million deaths in 2020, or about one in six deaths. Many cancers can be cured with early diagnosis and effective treatment. In our project, we use self-supervised learning to extract features from unlabelled data to perform cancer type classification. We work on multi-omics data obtained from Next Generation Sequencing for cancer diagnosis. Our approach works well even with a limited amount of labelled data.

Current digital EHR systems gather and organize information from thousands, or even millions, of individuals into curated databases. The medical information collected in these systems for each patient can overwhelm attending clinicians. We propose PICUT, a novel, efficient, and explainable transformer-based framework that aggregates patients' EHR information and produces a broader, global medical language understanding of each patient's history.

Many heart-related problems can be diagnosed via cardiac function assessment. Expert physicians perform cardiac function assessment on multiple cardiac cycles. However, such assessment is time-consuming and may be hindered by the variability and accuracy of measurements from cardiac imaging data. Furthermore, although cardiac ultrasound is widely available, inexpensive, and safe compared to cardiac CT or MRI, it is operator-dependent, and hence image quality varies significantly between scans. Therefore, automatic machine learning solutions that use big data to analyze echocardiographic scans and measure important cardiac functions might provide physicians with tools to support their daily clinical routines.

The early prediction of Acute Kidney Injury (AKI) could be a considerable support for clinicians, since about 11% of deaths in hospitals could be prevented by promptly recognizing and treating patients at risk. To achieve this, we develop a deep learning NLP-based solution for early prediction of patients at risk of AKI using Electronic Health Records (EHR).

Cancer is one of the leading causes of death worldwide, and head and neck (H&N) cancer is one of the most common types of cancer. In oncology, H&N cancer is diagnosed using imaging modalities such as computed tomography (CT) and positron emission tomography (PET). Clinicians spend hours, if not days, manually delineating the tumor region. Deep learning (DL) can help automate this task, allowing faster, more consistent, and equally accurate diagnosis and prognosis. In this work, we study different DL approaches for the diagnosis of H&N cancer using multimodal CT and PET data. Additionally, we perform prognosis using the imaging data and clinical records, achieving clinically reasonable results on both tasks.

This project focuses on building a comprehensive image and video understanding model that can automatically answer challenging queries such as "what", "where", "how", and "how many" related to various visual content. The objective is to develop robust and efficient computer vision frameworks that can be utilized for real-world problems.

Any deployed machine learning system trusted with patients must be robust to domain shifts over time. Additionally, a large variety of clinical applications today must operate efficiently on-device, under finite memory and resource constraints. We propose methods that achieve state-of-the-art robustness to continuous domain shifts in resource-constrained settings.

Knowledge transfer between tasks has greatly benefited the computer vision community over the years by reducing the reliance on large annotated training data. This has had a particular impact on the medical imaging domain, where data is scarce, and transfer learning in particular has proved effective in this regard. The multi-attribute nature of medical imaging presents a potential direction in which the transfer relationships between images with different attributes (domain, modality, organ, pathology) can be exploited for more robust and efficient transfers.

Neonatal respiratory distress syndrome (NRDS) is a condition often seen in premature babies whose lungs are not fully developed. It is the most common respiratory disorder in premature newborns, and its prevalence is directly proportional to the premature birth rate. At present, physical examination, a blood test to measure blood oxygen saturation, and X-ray images are used for diagnosis. Early diagnosis of the condition is highly important because management methods are available. Therefore, the development of methods to diagnose NRDS accurately and efficiently can significantly contribute to improving the chances of treatment.

The rapid spread of COVID-19 infections and the resulting strain on healthcare institutions worldwide made it clear that artificial intelligence (AI) assisted screening and diagnosis can alleviate some of this strain. Because of this, researchers around the world set out to develop deep learning models to assist with the screening and diagnosis of COVID-19 infections with the aim of supporting the medical community in curbing the spread of the virus and managing the treatment for infected cases.

Fetal gestational age (GA) is a vital piece of clinical information that is estimated during pregnancy in order to assess fetal growth. This is usually done by measuring the crown-rump length (CRL) on an ultrasound image in the dating scan, which is then correlated with fetal age and growth trajectory. Although clinical guidelines specify the criteria for the correct CRL view, sonographers may not regularly adhere to such rules. In this work, we propose a new deep learning-based solution that can verify the adherence of a CRL image to clinical guidelines in order to assess image quality and facilitate the accurate estimation of GA. This will be evaluated on real-world scans and, if successful, may have a significant impact on how sonographers acquire dating scans.

This project aims to develop computational models for spontaneous acquisition of infant-level perceptual understanding from realistic data in an unsupervised manner, i.e., without external guidance, similar to the development of such capabilities in early childhood. In particular, we are interested in building AI systems that can learn, with no external supervision, powerful and robust visual representations, which are useful for recognizing objects and understanding their interactions with the surroundings, including with other intelligent agents or objects.

Despite the tremendous advancements achieved in the field of artificial intelligence (AI) in the past decade, there are tasks where AI systems lag behind their biological counterparts. Here, we propose to combine insights gained through the study of biological active perception and state-of-the-art AI, together with specialized biomimetic hardware, to bridge this gap. Specifically, we propose to develop an efficient video-based visual recognition system capable of continuous learning.

Vacancies

CIAI at MBZUAI is looking for post-doctoral fellows in system ML, ML, causal representation learning, computational biology, computer vision, natural language processing, explainable AI, and other fields. Please send your CV to Guangyi.Chen@mbzuai.ac.ae if you are interested.

The center is always on the lookout for highly experienced senior systems and machine learning engineers and programmers. They must be passionate about building systems.

MBZUAI offers a highly competitive salary package and computing facilities, as well as freedom, collaboration, and opportunities to work with world-renowned faculty and students on a long-term basis.

Facilities

The CIAI has its own supercomputing cluster comprising 64 GPU compute nodes (each with 4x A100 GPUs and InfiniBand networking) and four high-capacity GPU nodes (each with 8x GPUs and InfiniBand networking), for a grand total of 288 GPUs.
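
As a quick check of the stated total, the arithmetic can be sketched in Python. The node and GPU counts are taken from the description above; the constant and function names are illustrative only and do not correspond to any CIAI tooling.

```python
# Node and per-node GPU counts as stated in the cluster description.
# All names here are hypothetical, chosen only for this illustration.
COMPUTE_NODES = 64
GPUS_PER_COMPUTE_NODE = 4        # A100 GPUs per compute node
HIGH_CAPACITY_NODES = 4
GPUS_PER_HIGH_CAPACITY_NODE = 8  # GPUs per high-capacity node


def total_gpus() -> int:
    """Sum the GPUs across both node types."""
    compute = COMPUTE_NODES * GPUS_PER_COMPUTE_NODE                    # 64 * 4 = 256
    high_capacity = HIGH_CAPACITY_NODES * GPUS_PER_HIGH_CAPACITY_NODE  # 4 * 8 = 32
    return compute + high_capacity                                     # 256 + 32 = 288


if __name__ == "__main__":
    print(total_gpus())
```

The 64 compute nodes contribute 256 GPUs and the four high-capacity nodes contribute 32, which matches the stated total of 288.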

People

CIAI currently has more than 25 members including faculty, students, postdoctoral research fellows and research assistants.

Faculty

...

Eric Xing

President and University Professor

...

Kun Zhang

Acting Chair of Machine Learning, Professor of Machine Learning, and Director of Center for Integrative Artificial Intelligence (CIAI)

...

Qirong Ho

Assistant Professor of Machine Learning

...

Zhiqiang Shen

Assistant Professor of Machine Learning

Personal website

Ph.D. students

...

Ding Bai

Master’s students

...

Kirill Vishniakov

...

Munachiso Samuel Nwadike

GitHub profile

...

Zhenhao Chen

GitHub profile

...

Ashraf Haddad

...

Eman Hisham Zaki Al Suradi

...

Akbobek Abilkaiyrkyzy

...

Sarah Al Barri

...

Zhengqing Gao

GitHub profile

...

Shunxing Fan

...

Boyang Sun

...

Kevin Michael Toner

...

Juwayni Macadato Lucman

...

Omar Ali Ahmed Alsuwaidi

Postdoctoral research fellows

...

Guangyi Chen

Research assistants

...

Shentong Mo

GitHub profile

...

Hexu Zhao

...

Jiayou Zhang

...

Navish Kumar

Google Scholar

Visiting students

...

Zheng-Mao Zhu

...

Yuanyuan Jiang

...

Yu-Ren Liu

Personal website

Noticeboard

August 2, 2022

Watch the latest CIAI colloquium, held on July 14, 2022, titled “On the Utility of Gradient Compression in Distributed Training Systems” with Dr. Hongyi Wang from Carnegie Mellon University. The session was moderated by MBZUAI’s Qirong Ho.

September 29, 2022

In case you missed the CIAI colloquium on September 15, 2022 by FedML’s Dr. Chaoyang He, you can view it as part of MBZUAI’s AI Talks. The talk was titled “FedML – Building Open and Collaborative Machine Learning Anywhere at Any Scale” and was moderated by MBZUAI’s Qirong Ho.

Contact the CIAI