RBCDSAI Seminar - Balaji Vasan



Topic: Generating Tailored Multimodal Fragments from Documents & Videos

Multimodal content is central to digital communications and has been shown to increase user engagement, making it indispensable in today's digital economy. The image-text combination is a common multimodal form seen across digital channels (e.g., banners, online ads, and social posts) and has been shown to be effective for both communication and cognition. The choice of a specific image-text combination is dictated by the information to be represented, the strength of the image and text modalities in representing that information, and the requirements of the underlying task. In this talk, I will walk through some of our recent work on the automatic synthesis of such multimodal fragments to generate teasers for an article, to answer questions on a multimodal document, and to enable effective navigation of long videos.


Balaji is a Principal Scientist at Adobe Research, India. His research interests span multimodal content generation and natural language generation, with the aim of automating various authoring workflows. With 10+ years of experience in industrial research, he has over 30 patents granted by the USPTO and has authored over 50 papers at several top conferences. He completed his Ph.D. in Computer Science at the University of Maryland in September 2011, his M.S. in Electrical Engineering at the University of Maryland in 2008, and his B.E. in Electrical Engineering at Anna University (India) in 2006.