Emotion-Conditioned Music Generation with Transformer GANs

Project Members: Chloe Liu, Emma Wang, Dingkun Yang

Project Description

Generative models are now widely used for natural language and images, but automatic music generation remains a comparatively niche area, and it is the one we explore in this project. As the nineteenth-century poet Henry Wadsworth Longfellow wrote, “Music is the universal language of mankind.” We use generative models to produce music automatically and ask whether the output can sound as "realistic" or "human-like" as the text that language models produce.

Automatic music generation has advanced notably with the rise of deep learning. Most of that progress, however, has come from unconditional models, which give users no way to steer the output and therefore no meaningful, practical influence over the creative process. We therefore aim to generate music conditioned on emotion, so that each piece conveys an intended emotion. Most current state-of-the-art generative music models are based on the Transformer architecture. We plan to explore both a conventional Transformer and Generative Adversarial Network (GAN)-based models and to compare their quality and performance on conditional music generation.

Poster

poster