MBZUAI’s Center for Integrative Artificial Intelligence (CIAI) has been created to further develop an AI operating system (OS) based on the foundation of the CASL project.
The CIAI is led by MBZUAI President Professor Eric Xing and center director Dr. Kun Zhang, and has brought together a team of highly experienced senior systems and machine learning engineers and researchers who are passionate about building systems.
The aim of the center is to develop a next-generation OS that can support easy composition, experimentation, and deployment of even the most advanced ML pipelines, such as building GPT-3-like language models for new tasks, or full-stack AI systems for clinical management.
Objectives
An integrative AI system is not a monolithic black box, but a modular, standardizable, and certifiable assembly of building blocks at all levels: data, model, algorithm, computing, and infrastructure. At CIAI, we seek to develop principled approaches, including representations, optimization formalisms, intra- and inter-level mapping strategies, theoretical analysis, and production platforms, for optimal and potentially autonomous creation and configuration of AI solutions at all levels: data harmonization, model composition, learning to learn, scalable computing, and infrastructure orchestration.
We believe machine learning at all levels is a necessity, not just a preference, toward industrializing AI that is transparent, trustworthy, and cost-effective.
Projects
Understanding the developmental processes that lead to the formation of tissues and organs from a fertilized egg represents one of the most fundamental questions in developmental biology. Although advances in microscopy and imaging systems have made imaging of mammalian embryos possible, there is a major gap between data collection and data analysis of developing live cells. In this project, our objective at MBZUAI is to develop novel computer vision algorithms suitable for the automatic analysis of embryos.
Congenital heart diseases (CHD) are among the most frequent birth defects, affecting around 1 million children a year globally. Ultrasound screening of the fetus is used to acquire different views of the heart that may help detect heart abnormalities. However, human detection of heart defects is error-prone, subjective, and time-consuming. The fetal heart is typically checked in the second-trimester scan, where most fetal organs can be reviewed. In this work, we aim to develop state-of-the-art machine learning models to classify fetal heart views and check for fetal abnormalities. This could have a significant effect in supporting clinicians to make more accurate and real-time diagnostic decisions.
Sports-related knee injuries are the leading cause of most knee surgeries performed annually. Anterior cruciate ligament (ACL) tears and meniscus tears are the most prevalent such injuries among both athletes and the general population. These injuries are often detected using arthroscopy or knee magnetic resonance imaging (MRI). Since arthroscopy is an invasive method for analyzing knee injuries, knee MRI is generally preferred for diagnosis.
Heart disease is a major problem worldwide, encompassing many different kinds of disease. Coronary artery disease (CAD) in particular afflicts the blood vessels that supply the heart with blood. As part of the diagnostic process for CAD, ultrasound imaging can be used, which is non-invasive, inexpensive, and quick. This project aims to automate the process of CAD diagnosis with ultrasound imaging to reduce the load on clinical experts, who are in short supply, and to address the problem of human observer variability, leading to more reliable and consistent diagnoses.
According to the WHO, cancer accounted for around 10 million deaths in 2020, or about one in six deaths. Many cancers can be cured with early diagnosis and effective treatment. In our project, we use self-supervised learning to extract features from unlabelled data to perform cancer type classification. We work on multi-omics data obtained from next-generation sequencing for cancer diagnosis. Our approach works well even with a limited amount of labelled data.
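The general recipe described here (pretrain a representation on unlabelled data, then classify with few labels) can be sketched in a few lines. Everything below is an illustrative assumption, not the project's actual pipeline: the synthetic "multi-omics" vectors, the linear masked-autoencoder pretext task, and the nearest-centroid classifier are all invented for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for multi-omics feature vectors: 500 unlabelled samples,
# 40 features, drawn from three hypothetical "cancer type" clusters.
centers = rng.normal(size=(3, 40))
types_unlab = rng.integers(0, 3, size=500)
X_unlab = centers[types_unlab] + 0.3 * rng.normal(size=(500, 40))

def train_masked_autoencoder(X, dim=8, epochs=300, lr=5e-3, mask_frac=0.3):
    """Self-supervised pretext task: reconstruct randomly masked features
    with a linear encoder/decoder trained by full-batch gradient descent."""
    n, d = X.shape
    W_enc = 0.1 * rng.standard_normal((d, dim))
    W_dec = 0.1 * rng.standard_normal((dim, d))
    for _ in range(epochs):
        keep = (rng.random(X.shape) > mask_frac).astype(float)  # 0 = masked
        X_in = X * keep
        Z = X_in @ W_enc              # encode the corrupted input
        err = Z @ W_dec - X           # reconstruction target is the clean input
        W_dec -= lr * Z.T @ err / n
        W_enc -= lr * X_in.T @ (err @ W_dec.T) / n
    return W_enc

W_enc = train_masked_autoencoder(X_unlab)

# Tiny labelled set: only five samples per cancer type.
types_lab = np.repeat(np.arange(3), 5)
X_lab = centers[types_lab] + 0.3 * rng.normal(size=(15, 40))

# Nearest-centroid classifier in the learned feature space.
Z_lab = X_lab @ W_enc
centroids = np.stack([Z_lab[types_lab == t].mean(axis=0) for t in range(3)])

def predict(X):
    Z = X @ W_enc
    d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

# Evaluate on fresh synthetic samples.
types_test = rng.integers(0, 3, size=200)
X_test = centers[types_test] + 0.3 * rng.normal(size=(200, 40))
accuracy = (predict(X_test) == types_test).mean()
print(f"accuracy with 5 labels per class: {accuracy:.2f}")
```

The point of the sketch is the division of labour: the encoder is fit purely on unlabelled data via a reconstruction pretext task, so the handful of labels is only needed to place class centroids in the learned space.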
Current digital EHR systems gather and organize information from thousands, or even millions, of individuals into curated databases. The medical information collected in these systems for each patient can overwhelm attending clinicians. We propose PICUT, a novel, efficient, and explainable transformer-based framework that aggregates patients' EHR information and produces a broader, global medical-language understanding of patients' histories.
The diagnosis of many heart-related problems can be done via cardiac function assessment. Expert physicians perform cardiac function assessment over multiple cardiac cycles. However, such assessment is time-consuming and may be hindered by the variability and accuracy of measurements from cardiac imaging data. Furthermore, although cardiac ultrasound is widely available, inexpensive, and safe compared to cardiac CT or MRI, it is operator-dependent, and hence image quality varies significantly between scans. Therefore, automatic machine learning solutions that use big data to analyze echocardiographic scans and measure important cardiac functions could provide physicians with tools to support their daily clinical routines.
The early prediction of Acute Kidney Injury (AKI) could be a considerable support for clinicians, since about 11% of deaths in hospitals could be prevented by promptly recognizing and treating patients at risk. To achieve this, we develop a deep learning NLP-based solution for early prediction of patients at risk of AKI using Electronic Health Records (EHR).
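The core idea, scoring AKI risk from the text of a patient's records, can be illustrated with a deliberately simple stand-in model. The vocabulary, the "risk marker" terms, the synthetic notes, and the bag-of-words logistic regression below are all invented for illustration; the project's actual deep learning model is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented toy vocabulary; the first three "risk marker" terms are illustrative only.
vocab = ["creatinine", "oliguria", "nephrotoxic", "stable", "discharge", "routine"]
index = {word: i for i, word in enumerate(vocab)}

def make_note(at_risk):
    # Synthetic stand-in for a clinical note: at-risk notes over-sample risk terms.
    p = [0.25, 0.20, 0.15, 0.15, 0.15, 0.10] if at_risk else \
        [0.05, 0.05, 0.05, 0.30, 0.30, 0.25]
    return " ".join(rng.choice(vocab, size=20, p=p))

labels = rng.integers(0, 2, size=400)
notes = [make_note(y) for y in labels]

def bag_of_words(texts):
    X = np.zeros((len(texts), len(vocab)))
    for r, text in enumerate(texts):
        for tok in text.split():
            if tok in index:          # ignore out-of-vocabulary tokens
                X[r, index[tok]] += 1
    return X

# Logistic regression trained with full-batch gradient descent.
X = bag_of_words(notes)
w, b = np.zeros(len(vocab)), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    g = p - labels
    w -= 0.01 * X.T @ g / len(labels)
    b -= 0.01 * g.mean()

def aki_risk(note):
    """Estimated probability that a note signals AKI risk."""
    x = bag_of_words([note])[0]
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

print(round(aki_risk("oliguria and rising creatinine"), 2))
```

A real EHR pipeline would replace the bag-of-words features with learned contextual representations, but the shape of the task, text in, calibrated risk score out, is the same.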
Cancer is one of the leading causes of death worldwide, and head and neck (H&N) cancer is one of the most common types. Clinically, H&N cancer is diagnosed using imaging modalities such as computed tomography (CT) and positron emission tomography (PET). Clinicians spend hours, if not days, manually delineating the tumor region. Deep learning (DL) can help automate this task, allowing faster, more consistent, and equally accurate diagnosis and prognosis. In this work, we study different DL approaches for the diagnosis of H&N cancer using multimodal CT and PET data. Additionally, we perform prognosis using the imaging data and clinical records, achieving clinically reasonable results on both tasks.
This project focuses on building a comprehensive image and video understanding model that can automatically answer challenging queries such as "what", "where", "how", and "how many" related to various visual contents. The objective is to develop robust and efficient computer vision frameworks that can be utilized for real-world problems.
Any deployed machine learning system trusted with patients must be robust to domain shifts over time. Additionally, a large variety of clinical applications today must operate efficiently on-device, under finite memory and resource constraints. We propose methods that achieve state-of-the-art robustness to continuous domain shifts under resource-constrained settings.
Knowledge transfer between tasks has greatly benefited the computer vision community over the years by reducing the reliance on large annotated training data. This has had a particular impact on the medical imaging domain, where data is scarce. Transfer learning in particular has proved effective in this regard. The multi-attribute nature of medical imaging presents a promising direction in which the transfer relationships between images with different attributes (domain, modality, organ, pathology) can be exploited for more robust and efficient transfer.
Neonatal respiratory distress syndrome (NRDS) is a condition often seen in premature babies whose lungs are not fully developed. It is the most common respiratory disorder in premature newborns, and its prevalence is directly proportional to the premature birth rate. At present, physical examination, a blood test to measure blood oxygen saturation, and X-ray images are used for diagnosis. Early diagnosis of the condition is highly important because effective management methods are available. Therefore, developing methods to carry out NRDS diagnosis accurately and efficiently can significantly improve the chances of successful treatment.
The rapid spread of COVID-19 infections and the resulting strain on healthcare institutions worldwide made it clear that artificial intelligence (AI) assisted screening and diagnosis can alleviate some of this strain. Because of this, researchers around the world set out to develop deep learning models to assist with the screening and diagnosis of COVID-19 infections with the aim of supporting the medical community in curbing the spread of the virus and managing the treatment for infected cases.
Fetal gestational age (GA) is vital clinical information estimated during pregnancy in order to assess fetal growth. This is usually done by measuring the crown-rump length (CRL) on an ultrasound image during the dating scan, which is then correlated with fetal age and growth trajectory. Although clinical guidelines specify the criteria for the correct CRL view, sonographers may not regularly adhere to such rules. In this work, we propose a new deep learning-based solution that is able to verify the adherence of a CRL image to clinical guidelines in order to assess image quality and facilitate the accurate estimation of GA. This will be evaluated on real-world scans, and if successful, it may have a significant impact on how sonographers acquire dating scans.
This project aims to develop computational models for spontaneous acquisition of infant-level perceptual understanding from realistic data in an unsupervised manner, i.e., without external guidance, similar to the development of such capabilities in early childhood. In particular, we are interested in building AI systems that can learn, with no external supervision, powerful and robust visual representations, which are useful for recognizing objects and understanding their interactions with the surroundings, including with other intelligent agents or objects.
Despite the tremendous advancements achieved in the field of artificial intelligence (AI) in the past decade, there are tasks where AI systems lag behind their biological counterparts. Here, we propose to combine insights gained through the study of biological active perception and state-of-the-art AI, together with specialized biomimetic hardware, to bridge this gap. Specifically, we propose to develop an efficient video-based visual recognition system capable of continuous learning.
CIAI at MBZUAI is looking for post-doctoral fellows in systems ML, ML, causal representation learning, computational biology, computer vision, natural language processing, explainable AI, and other fields. Please send your CV to Guangyi.Chen@mbzuai.ac.ae if you are interested.
The center is always on the lookout for highly experienced senior systems and machine learning engineers and programmers who are passionate about building systems.
MBZUAI offers a highly competitive salary package and computing facilities, as well as freedom, collaboration, and opportunities to work with world-renowned faculty and students on a long-term basis.
The CIAI has its own supercomputing cluster composed of 64 GPU compute nodes (each with 4x A100 GPUs and InfiniBand networking) and four high-capacity GPU nodes (each with 8x GPUs and InfiniBand networking), for a total of 288 GPUs.
CIAI currently has more than 25 members including faculty, students, postdoctoral research fellows and research assistants.
August 2, 2022
Watch the latest CIAI colloquium, held on July 14, 2022, titled “On the Utility of Gradient Compression in Distributed Training Systems” with Dr. Hongyi Wang from Carnegie Mellon University, moderated by MBZUAI’s Qirong Ho.
September 29, 2022
In case you missed the CIAI colloquium on September 15, 2022 by FedML’s Dr. Chaoyang He, you can view it as part of MBZUAI’s AI Talks. It was titled “FedML – Building Open and Collaborative Machine Learning Anywhere at Any Scale” and moderated by MBZUAI’s Qirong Ho.
Contact the CIAI