This talk will cover preliminary work in the space of multimodal representation learning for the task of detecting comic mischief. In this work, we investigate two multimodal pretraining mechanisms to improve prediction performance on the downstream task. Our evaluation shows that multimodal contrastive learning yields larger gains than the more common multimodal pretraining approach, where the model is trained to predict whether the unimodal representations belong to the same instance. We also examine the value of two common multimodal corpora. If time allows, I’ll give a brief overview of other research projects going on in my group. Hopefully this talk will help identify and spark future research collaborations at MBZUAI.
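The multimodal contrastive pretraining mentioned above can be sketched, very roughly, as a symmetric InfoNCE-style objective over paired video and text embeddings. This is an illustrative assumption on my part; the talk does not specify the exact loss, architecture, or hyperparameters (the temperature value below is a common default, not one from this work):

```python
import numpy as np

def info_nce_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    Row i of each matrix is a matched (positive) video/text pair;
    every other pairing in the batch acts as an in-batch negative.
    This is a sketch of one common contrastive formulation, not
    necessarily the one used in the work described above.
    """
    # L2-normalize so dot products become cosine similarities
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature          # (batch, batch) similarity matrix
    labels = np.arange(len(v))              # positives sit on the diagonal

    def cross_entropy(lg, y):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # Symmetric: video-to-text retrieval plus text-to-video retrieval
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

By contrast, the more common pretraining approach the abstract compares against amounts to a binary classifier over (video, text) pairs, predicting matched versus mismatched instances rather than ranking all in-batch pairings at once.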
Thamar Solorio is a Professor of Computer Science at the University of Houston (UH) and a visiting scientist at Bloomberg LP. She holds graduate degrees in Computer Science from the Instituto Nacional de Astrofísica, Óptica y Electrónica in Puebla, Mexico. Her research interests include information extraction from social media data, enabling technology for code-switched data, stylistic modeling of text, and, more recently, multimodal approaches for online content understanding. She is the founder and director of the RiTUAL Lab at UH. She is the recipient of an NSF CAREER award for her work on authorship attribution and of the 2014 Emerging Leader ABIE Award in Honor of Denice Denton. She is currently serving a second term as an elected board member of the North American Chapter of the Association for Computational Linguistics and was PC co-chair for NAACL 2019. She recently joined the team of Editors-in-Chief for the ACL Rolling Review (ARR) system. Her research is currently funded by the NSF and by Adobe.