Augmenting Human Creativity with Machine Learning
DOI: https://doi.org/10.1609/aaai.v40i47.41344
Abstract
In this talk, I will survey my work in three main research directions: 1) generative models for music creation, 2) AI-assisted music creation tools, and 3) multimodal generative models for content creation. In particular, I will discuss our recent work on AI-assisted video editing, which explores novel machine learning models that can cut, select, and rearrange a long video into a short one. In the first project, TeaserGen, we proposed a narration-centered teaser generation system that can effectively compress documentaries longer than 30 minutes into teasers under 3 minutes, leveraging pretrained LLMs and vision-language models. In the second project, REGen, we proposed a retrieval-embedded generation framework that allows an LLM to quote multimodal resources while maintaining a coherent narrative. I will conclude by discussing our future work towards next-generation video editing interfaces using multimodal LLMs and retrieval-embedded generation, as well as playful human-AI music co-creation systems in which the user can control a music generation system through hand gestures and body movements.
Published
2026-03-14
How to Cite
Dong, H.-W. (2026). Augmenting Human Creativity with Machine Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(47), 39818–39819. https://doi.org/10.1609/aaai.v40i47.41344
Section
New Faculty Highlights