Master of Science in

Natural Language Processing

Overview

NLP enables computers to communicate with people using everyday language. Large language models (LLMs), in particular, are key drivers of language-based interaction, potentially including extra data modalities such as structured data or images. Such systems also enable sophisticated tasks such as language translation, semantic understanding, text summarisation, and natural language dialogue. Applications of NLP include interactive speech-based applications, automated translators, digital personal assistants, and chatbots.

  • Mode: Full time
  • Credits: 36
  • Location: On campus

As we look to the future, our department at MBZUAI will continue to be a driving force in shaping the next generation of NLP technologies. Whether it’s in the realms of large language models, dialog systems, chatbots, Arabic NLP, machine translation, speech recognition, multimodality (language + speech/images/video), or understanding the subtleties of language, our department will be at the forefront of innovation.

Preslav Nakov

Department Chair of Natural Language Processing, and Professor of Natural Language Processing

Meet the faculty

Timothy Baldwin

Provost and Professor of Natural Language Processing

Preslav Nakov

Department Chair of Natural Language Processing, and Professor of Natural Language Processing

Shady Shehata

Associate Professor of Practice

Hanan Aldarmaki

Assistant Professor of Natural Language Processing

Ted Briscoe

Deputy Department Chair of Natural Language Processing, and Professor of Natural Language Processing

Kentaro Inui

Professor of Natural Language Processing

Monojit Choudhury

Professor of Natural Language Processing

Alham Fikri Aji

Assistant Professor of Natural Language Processing

Bhiksha Raj

Visiting Professor of Natural Language Processing

Ekaterina Kochmar

Assistant Professor of Natural Language Processing

Iryna Gurevych

Adjunct Professor of Natural Language Processing

Muhammad Abdul-Mageed

Associate Professor of Natural Language Processing

Thamar Solorio

Senior Director, Graduate Student Affairs, and Professor of Natural Language Processing

Xiuying Chen

Assistant Professor of Natural Language Processing

Fajri Koto

Assistant Professor of Natural Language Processing

Veselin Stoyanov

Adjunct Professor of Natural Language Processing

Yova Kementchedjhieva

Assistant Professor of Natural Language Processing

Analyze and model textual and speech data with applications to real-world scenarios.

Identify and explain the syntactic and semantic structures in speech and textual data (e.g., the predicate-argument structure).

Implement cutting-edge NLP algorithms and benchmark the achieved results.

Formulate their own research questions, analyze the existing body of knowledge, and propose and develop solutions to new problems.

Use and deploy NLP-related programming tools for a variety of NLP problems.

Work independently and as part of a team, in a collegial manner, on NLP-related projects.

Effectively communicate the feasibility and sustainability of experimental results, innovations and research findings orally and in writing, and critique the existing body of work.

The minimum degree requirement for the Master of Science in Natural Language Processing is 36 credits, distributed as follows:

Component | Number of Courses | Credit Hours
Core | 4 | 16
Electives | 2 | 8
Internship | At least one internship of up to six weeks’ duration must be satisfactorily completed as a graduation requirement | 2
Introduction to Research Methods | 1 | 2
Research Thesis | 1 | 8

The Master of Science in Natural Language Processing is primarily a research-based degree. The purpose of coursework is to equip students with the right skill set, so they can successfully accomplish their research project (thesis). Students are required to take AI701, MTH701, NLP701, and NLP702 as mandatory courses. They can select two electives.

Course Title Credit Hours
AI701 Foundations of Artificial Intelligence

This course provides students with a comprehensive introduction to artificial intelligence. It builds upon fundamental concepts in machine learning. Students will learn about supervised and unsupervised learning, various learning algorithms, and the basics of neural networks, deep learning, and reinforcement learning.

4
MTH701 Mathematical Foundations for Artificial Intelligence

This course provides a comprehensive mathematical foundation for the field of artificial intelligence. It builds upon fundamental concepts in linear algebra, probability theory, statistics, and calculus. Students will learn how these mathematical concepts can be used to solve problems frequently encountered in AI applications.

4
NLP701 Natural Language Processing

This course provides a comprehensive introduction to natural language processing (NLP). It builds upon fundamental concepts in mathematics, specifically probability and statistics, linear algebra, and calculus, and assumes familiarity with programming.

4
NLP702 Advanced Natural Language Processing

This course provides a comprehensive introduction to natural language processing (NLP). It builds upon fundamental concepts in NLP and assumes familiarity with mathematical concepts and programming.

4

Students will select a minimum of two elective courses, totaling 8 or more credit hours. Two must be selected from the list below based on interest, proposed research thesis, and career aspirations, in consultation with their supervisory panel. The elective courses available for the Master of Science in Natural Language Processing are listed in the table below:

Course Title Credit Hours
AI702 Deep Learning

This course provides a comprehensive overview of different concepts and methods related to deep learning. Students will first learn the foundations of deep learning, after which they will be introduced to a series of deep models: convolutional neural networks, autoencoders, recurrent neural networks, and deep generative models. Students will work on case studies of deep learning in different fields such as computer vision, medical imaging, and natural language processing.

4
CB703 Introduction to Single Cell Biology and Bioinformatics

This course provides a broad overview of bioinformatics for single cell omics technologies, a new and fast-growing family of biological assays that enables measuring the molecular contents of individual cells with very high resolution and is key to advancing precision medicine. The course starts with an accessible introduction to basic molecular biology: the cell structure, the central dogma of molecular biology, the flow of biological information in the cell, the different types of molecules in the cell, and how we can measure them. The course then introduces students to the diverse landscape of biological data, including its types and characteristics, and explores the foundational principles of single-cell omics bioinformatics, encompassing key methodologies, tools, and computational workflows, with an emphasis on the development of foundation models for single cell omics data.

4
CS721 Computer and Network Security

This course provides an overview of foundational principles and contemporary topics in information security. Students will examine system protection strategies, structural security frameworks, software resilience, and detection of security threats. The course integrates theoretical concepts with practical applications to enhance the understanding of securing complex information systems.

4
CV701 Human and Computer Vision

This course provides a comprehensive introduction to the basics of the human visual system and color perception, image acquisition and processing, linear and nonlinear image filtering, image feature description and extraction, classification, and segmentation strategies. Moreover, students will be introduced to quality assessment methodologies for computer vision and image processing algorithms.

4
CV702 Geometry for Computer Vision

The course provides a comprehensive introduction to the concepts, principles and methods of geometry-aware computer vision which helps in describing the shape and structure of the world. In particular, the objective of the course is to introduce the formal tools and techniques that are necessary for estimating depth, motion, disparity, volume, pose and shapes in 3D scenes.

4
CV703 Visual Object Recognition and Detection

This course provides a comprehensive overview of different concepts and methods related to visual object recognition and detection. In particular, the students will learn a large family of successful and recent state-of-the-art architectures of deep neural networks to solve the tasks of visual recognition, detection, and tracking.

4
CV707 Digital Twins

This course provides a comprehensive introduction to digital twins. Students will learn about digital twin technology, its common applications and benefits, how to create a digital twin for predictive analytics using sensory data fusion, the primary predictive modeling methods, and how to implement and interact with a digital twin using different platforms.

4
DS701 Data Mining

This course is an introductory course on data mining, which is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

4
DS702 Big Data Processing

This course is an introductory course on big data processing, which is the process of analyzing and utilizing big data. The course involves methods at the intersection of parallel computing, machine learning, statistics, database systems, etc.

4
DS703 Information Retrieval

This course is an introductory course on Information Retrieval (IR). The explosive growth of available digital information (e.g., Web pages, emails, news, Tweets, Wikipedia pages) demands intelligent information agents that can sift through all available information and find the most valuable and relevant information. Web search engines, such as Google and Bing, are examples of such tools. This course studies the basic principles and practical algorithms used for information retrieval and text mining. It will cover algorithms, design, and implementation of modern information retrieval systems. Topics include: retrieval system design and implementation, text analysis techniques, retrieval models (e.g., Boolean, vector space, probabilistic, and learning-based methods), search evaluation, retrieval feedback, search log mining, and applications in web information management.

4
HC701 Medical Imaging: Physics and Analysis

This course provides a graduate-level introduction to the principles and methods of medical imaging, with thorough grounding in the physics of the imaging problems. This course covers the fundamentals of X-ray, CT, MRI, ultrasound, and PET imaging. In addition, the course provides an overview of the 3D geometry of medical images and a few classical problems in medical image analysis, including classification, segmentation, registration, quantification, reconstruction, and radiomics.

4
ML701 Machine Learning

This course provides a comprehensive introduction to machine learning. It builds upon fundamental concepts in mathematics, specifically probability and statistics, linear algebra, and calculus. Students will learn about supervised and unsupervised learning, various learning algorithms, and basics of learning theory, graphical models, and reinforcement learning.

4
ML702 Advanced Machine Learning

This course focuses on recent advances in machine learning and on developing skills for performing research to advance the state of the art in machine learning. Students will learn concepts in kernel methods, statistical complexity, statistical decision theory, computational complexity of learning algorithms, and reinforcement learning. This course builds upon concepts from Machine Learning (ML701) and assumes familiarity with fundamental concepts in machine learning, optimization, and statistics.

4
ML703 Probabilistic and Statistical Inference

Probabilistic and statistical inference is the process of drawing useful conclusions about data populations or scientific truths from uncertain and noisy data. This course will cover different modes of performing inference including statistical modelling, data-oriented strategies, and explicit use of design and randomization in analyses. Furthermore, it will provide an in-depth treatment of the broad theories (frequentist, Bayesian, likelihood) and numerous practical complexities (missing data, observed and unobserved confounding, biases) for performing inference. This course presents the fundamentals of statistical and probabilistic inference and shows how these fundamental concepts are applied in practice.

4
ML707 Smart City Services and Applications

This course provides a comprehensive introduction to using AI/ML in smart city services and applications. The course will start by reviewing basic concepts. Students will learn how to apply AI/ML to develop, design and improve smart city services. They will be able to demonstrate an understanding of the smart city concept, applications, requirements, and system design. They will develop capabilities of integrating emerging technologies in smart city components and be able to implement them. In addition, they will gain knowledge in applying security, data analytics, Internet of Things (IoT), communications and networking, and will work on case-study solutions for smart city infrastructures.

4
ML708 Trustworthy Artificial Intelligence

This course provides students with a comprehensive introduction to various trust-related issues in applications of artificial intelligence and machine learning. Students will learn about attacks against computer systems that use machine learning, as well as defense mechanisms to mitigate such attacks.

4
ML709 Internet of Things, Services and Applications

This course provides a comprehensive introduction to using AI/ML in Internet of Things (IoT) smart systems, services and applications. The course will start by reviewing advanced concepts. Students will learn how to apply AI/ML to develop, design and improve IoT systems and services. They will be able to demonstrate an understanding of IoT concepts, applications, requirements and system design. They will develop capabilities of integrating emerging technologies in smart IoT components and be able to implement them. In addition, they will gain knowledge and skills in applying security, data analytics, AI models, communications and networking, and will work on case-study solutions for IoT infrastructures.

4
ML710 Parallel and Distributed Machine Learning Systems

As Machine Learning (ML) programs increase in data and parameter size, their growing computational and memory requirements demand parallel and distributed execution across multiple network-connected machines. In this course, students will learn the fundamental principles and representations for parallelizing ML programs and learning algorithms. Students will also learn how to design, evaluate (using standard metrics), and compare complex parallel ML strategies composed of basic parallel ML “aspects”, and how to evaluate and compare the architectures of different software systems that use such strategies to execute ML programs. Students will also use standard metrics to explain how compilation and resource management affect the performance of parallel ML programs.

4
ML711 Intermediate Music AI

What is sound and music from a computer science perspective? How can we use AI and ML to better appreciate, perform, and compose music? When music meets computer science, could computers generate something truly creative by closing the loop of analysis and synthesis? Could computers interact with us humans in real time and offer us new experiences? Let’s explore the possibilities in this course.

It is a Music AI course, but most of the content is orthogonal to programming or traditional computer science. If you are a great computer science student or even a great programmer, you will be able to use your special skills in this class to your advantage. On the other hand, if you are a musician with intro-level programming skills, you can get by without writing a lot of difficult programs. Your musical knowledge and intuition will also be of great value. Students will learn the fundamentals of digital audio, basic sound synthesis algorithms, techniques for human-computer music interaction, and, most importantly, machine learning algorithms for media generation. In a final project, students will demonstrate their mastery of tools and techniques through a publicly performed music composition.

4
MTH702 Optimization

This course provides a graduate-level introduction to the principles and methods of optimization, with a thorough grounding in the mathematical formulation of optimization problems. The course covers fundamentals of convex functions and sets, 1st order and 2nd order optimization methods, problems with equality and/or inequality constraints, and other advanced problems.

4
NLP703 Speech Processing

This course provides a comprehensive introduction to speech processing. It builds upon fundamental concepts in speech processing and assumes familiarity with mathematical and signal processing concepts.

4
ROB701 Introduction to Robotics

The course covers the mathematical foundation of robotic systems and introduces students to the fundamental concepts of ROS (Robot Operating System) as one of the most popular and reliable platforms to program modern robots. It also highlights techniques to formally model and study robot kinematics, dynamics, perception, motion control, navigation, and path planning. Students will also learn to interface with different types of sensors, read and analyze their data, and apply it in various robotic applications.

4

Master’s thesis research exposes students to an unsolved research problem, where they are required to propose new solutions and contribute towards the body of knowledge. Students pursue an independent research study, under the guidance of a supervisory panel, for a period of one year.

Course Title Credit Hours
NLP699 Natural Language Processing Master’s Research Thesis

Master’s thesis research exposes students to an unsolved research problem, where they are required to propose new solutions and contribute towards the body of knowledge. Students pursue an independent research study, under the guidance of a supervisory panel, for a period of one year. Master’s thesis research helps train graduates to pursue more advanced research in their Ph.D. degree. Further, it enables graduates to independently pursue an industrial project involving a research component.

8
RES799 Introduction to Research Methods

This course focuses on teaching students how to develop innovative research-based approaches that can be implemented in an organization. It covers various research designs and methods, including scientific methods, ethical issues in research, measurement, experimental research, survey research, qualitative research, and mixed methods research. Students will gain knowledge in selecting, evaluating, and collecting data to address specific research questions. Additionally, they will learn design thinking skills to connect their research-based topic to practicality. After completing the course, students will have the skills to develop a full research topic that can be innovative, entrepreneurial, and sustainable and can be applied in any organization related to the topic of research.

2

The MBZUAI internship with industry is intended to provide the student with hands-on experience, blending practical experiences with academic learning.

Course Title Credit Hours
INT799 M.Sc. Internship (up to six weeks)

2

MBZUAI accepts applicants of all nationalities who hold a completed bachelor’s degree in a STEM field such as Computer Science, Electrical Engineering, Computer Engineering, Mathematics, Physics, or another relevant Science or Engineering major from a university accredited or recognized by the UAE Ministry of Education (MoE), with a minimum CGPA of 3.2 (on a 4.0 scale) or equivalent.


Applicants must provide their completed degree certificates and official transcripts when submitting their application. Senior-level students can apply initially with a copy of their official transcript and expected graduation letter and, upon admission, must submit the official completed degree certificate and transcript. A degree attestation from the UAE MoE (for degrees from the UAE) or a Certificate of Recognition from the UAE MoE (for degrees acquired outside the UAE) should also be furnished within the student’s first semester at MBZUAI.

All submitted documents must either be originally in English or include official English translations. Additionally, official academic documents should be stamped and signed by the university authorities.

Each applicant must show proof of English language ability by providing valid certificate copies of one of the following:

  • TOEFL iBT with a minimum total score of 90
  • IELTS Academic with a minimum overall score of 6.5
  • EmSAT English with a minimum score of 1550

TOEFL iBT and IELTS Academic certificates are valid for two (2) years from the date of the exam, while EmSAT results are valid for eighteen (18) months. Only standard versions (i.e. conducted at physical test centers) of the accepted English language proficiency exams will be considered.

Waiver requests will be processed for eligible applicants who are citizens (by passport or nationality) of the UK, the USA, Australia, or New Zealand and who completed their studies from K-12 through their bachelor’s degree and master’s degree (if applicable) in those same countries. They need to submit notarized copies of their documents during the application stage and attested documents upon admission. Waiver decisions will be given within seven (7) days after receiving all requirements.

Submission of GRE scores is optional for all applicants but will be considered a plus during the evaluation.

In a 500- to 1000-word essay, explain why you would like to pursue a graduate degree at MBZUAI and include the following information:

  • Motivation for applying to the university
  • Personal and academic background and how it makes you suitable for the program you are applying for
  • Experience in completing a diverse range of projects related to artificial intelligence
  • Stand-out achievements, e.g. awards, distinction, etc.
  • Goals as a prospective student
  • Preferred career path and plans after graduation
  • Any other details that will support the application

Applicants will be required to nominate referees who can recommend their application. M.Sc. applicants should have a minimum of two (2) referees: one a previous course instructor or faculty/research advisor, and the other a current or previous work supervisor.

To avoid issues and delays in the provision of the recommendation, applicants have to inform their referees of their nomination beforehand and provide the latter’s accurate information in the online application portal. Automated notifications will be sent out to the referees upon application submission.

All applicants with complete files, including the required number of recommendations, will be invited to participate in an online screening exam to assess their knowledge and skills. Completion of the exam is not mandatory but highly encouraged as it would provide additional information to the evaluation committee. Waiving the exam is only recommended for those students who can provide strong evidence of their research capability, subject matter expertise, and technical skills.

Exam Topics

Math: Calculus, probability theory, linear algebra, trigonometry and optimization

Programming: Knowledge of specific programming concepts and principles such as algorithms, data structures, logic, OOP, and recursion, as well as language-specific knowledge of Python

Applicants are highly encouraged to complete the following online courses to further improve their qualifications:

The exam instructions are available here

A select number of applicants may be invited to an interview with faculty as part of the screening process. The time and instructions for this will be communicated to applicants in a timely manner.

Applicants should submit only one application per admission cycle; multiple submissions are discouraged.

Application portal opens: 1st October 2024 (8:00 AM UAE time)
Regular deadline: 15th January 2025 (5:00 PM UAE time)
Decision notification date: 31st March 2025 (5:00 PM UAE time)
Late deadline: 31st May 2025 (5:00 PM UAE time)

High-calibre applicants who apply by the ‘Regular Deadline’ and have complete applications (including the required recommendations) will be given full consideration. The online application portal will remain open until the ‘Late Deadline’. We do not guarantee that these late applications will be given full consideration.

Detailed information on the application process and scholarships is available here.

A typical study plan is as follows:


SEMESTER 1

AI701 Foundations of Artificial Intelligence
MTH701 Mathematical Foundations for Artificial Intelligence
NLP701 Natural Language Processing

SEMESTER 2

NLP702 Advanced Natural Language Processing
+ 2 electives from list

SUMMER

INT799 Internship (up to six weeks)

SEMESTER 3

NLP699 Master’s Research Thesis
RES799 Introduction to Research Methods

SEMESTER 4

NLP699 Master’s Research Thesis

Disclaimer: Subject to change.

