Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) gifted Gokul Karthik Kumar the freedom to explore several areas of artificial intelligence (AI) during his two-year master’s journey. Kumar credits this interdisciplinary approach to AI with helping him discover his true path and earn several accolades early in his career.
The MBZUAI Class of 2023 master’s graduate works at the intersection of his major, computer vision, and his passion, natural language processing (NLP). Kumar chose to seize the opportunity presented by the recently established graduate research institution in Abu Dhabi rather than accept an offer from the esteemed University of Waterloo in Canada, and says he has never looked back.
“I have had the flexibility to explore different domains of AI research and choose the ones that I am most passionate about,” Kumar said. “While my major is in computer vision, my supervisor has supported me in pursuing projects in other domains like natural language processing and speech processing, which has been immensely fulfilling and helped me to identify the areas that I’m currently passionate about.”
Originally from Tamil Nadu, India, Kumar is quick-thinking and does everything at pace. And at the speed at which AI changes, Kumar can more than keep up. “I have learned from some of the most knowledgeable professors in artificial intelligence,” he admits. “I have also enhanced my research skills from problem identification to research proposal to presentation. My experience at MBZUAI has been transformative, preparing me well for my future career in AI research and development.”
Kumar, like his beloved IPL cricket team Chennai Super Kings, believes he has been on the winning team throughout his whole master’s journey. His favorite UAE memory is witnessing the Chennai Super Kings secure victory in IPL 2021 in Dubai at the beginning of his master’s journey. Adding to the excitement at the end of his journey, his team won again in 2023 just days before his commencement ceremony.
His next challenge is to help develop large language models (LLMs) for the country that supported him through his master’s journey, the United Arab Emirates (UAE), which has identified leveraging AI for good as a key priority. “I will be joining G42’s Inception Institute of Artificial Intelligence (IIAI) as an applied scientist, where my focus will be working collaboratively in a team to develop large language models tailored specifically for UAE-focused applications,” Kumar said.
While the release of public language models such as OpenAI’s ChatGPT and Google’s Bard has thrown AI into the public spotlight, Kumar and his colleagues around the world have been developing and training AI models to address diverse challenges. Generative AI is now mainstream because it can create human-like content that is relatively accurate and fast to generate.
“When ChatGPT was released to the public, it brought about a complete transformation in the field,” Kumar said. “Previously, different organizations were engaged in developing numerous models to tackle various tasks. However, now a single interface can enhance the efficiency of numerous daily tasks. ChatGPT exhibits impressive performance in solving a wide range of problems, yet concerns about security and privacy persist.”
“Confidential information can inadvertently be shared as input, and there is also the risk of generating misleading fake news online. Some of my colleagues are actively involved in the identification of such deceptive generated content, while others are dedicated to creating more sustainable and energy-efficient models. The impact it has had on the field of artificial intelligence is truly remarkable, and there is immense potential for further advancements. It is an exhilarating period, and I eagerly anticipate the opportunity to contribute to these developments within the industry.”
Kumar has an extensive background in machine learning across text, image, speech, and time-series data, having worked with top technology organizations like Microsoft Research India, TCS Research, MBZUAI, and IIT Madras. Notably, he has won numerous hackathons, including the IEEE SLT 2022 international hackathon in Qatar and eight national-level hackathons in the UAE and India. He also holds a US patent.
He has co-authored articles published at major conferences and workshops, including the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), an Association for Computational Linguistics (ACL 2022) workshop, an Empirical Methods in Natural Language Processing (EMNLP 2022) workshop, and the International Joint Conference on Neural Networks (IJCNN).
Following the commencement ceremony on June 4, he travelled to Greece to attend ICASSP 2023, where he presented his research paper titled “Towards Building Text-To-Speech Systems for the Next Billion Users.” Taking part in leading conferences throughout his master’s program has allowed Kumar to connect with fellow researchers and expand his professional network.
“This project was initiated during my summer internship at Microsoft Research India, where I collaborated with my co-author, Praveen from IIT Madras,” Kumar explains. “Our work involved a systematic evaluation of design choices for text-to-speech systems, leading to the release of state-of-the-art models for 13 Indian languages. Most open-source text-to-speech is available in English, but extending it to local languages could reach masses, especially people who don’t know how to read.”
Kumar’s thesis research explores efficient representation methods for multilingual and multimodal data. His work addresses crucial tasks such as question answering, hateful meme classification, text-to-speech, and text-image retrieval. In the current era of social media, where online bullying has become increasingly prevalent, Kumar’s research holds significant importance.
Hateful memes, which encompass hate speech targeted at individuals on social media, pose a concerning challenge. While various techniques exist for classifying such memes, Kumar has devised a straightforward approach that effectively combines image and textual features to predict the probability of hatefulness. This could empower social media platforms to make informed decisions about what should and should not be published.
He has also been involved in the joint development of the award-winning Autodub, a human-in-the-loop AI dubbing platform that aims to eliminate language barriers in educational video content and extend remote, online learning to all corners of the globe. Autodub seamlessly integrates transcription, translation, voiceover, and background audio separation to create accurate translations and promote accessibility for all. Because many educational videos are primarily in English, non-native English speakers can face a barrier to learning. Autodub offers a viable solution to this challenge.
“What truly excites me about my future career is the opportunity to make a tangible impact. If I can develop something that enhances processes and, consequently, positively influences a significant number of individuals, it would be truly remarkable. Only a few fields or technologies have the power to create something that instantly captures widespread attention and sparks conversations across various communities.”
Kumar is the first in his family to earn a master’s degree. He also holds a Bachelor of Technology in Information Technology from Anna University, India. He is one of 59 computer vision, machine learning, and natural language processing (NLP) students who graduated as part of the Class of 2023. Fellow graduates have secured full-time positions both nationally and internationally, with esteemed organizations such as ADNOC, SnapChat (UK), Transco, Abu Dhabi Police, and the Abu Dhabi startup FortyGuard. More than half of the graduates undertook voluntary internships during the previous summer, which provided them with invaluable industry experience.