Technology | Author: Yunfeng Zhang | Apr 09, 2023 01:35 AM (GMT+8)

Currently, DeepMusic's (Chinese: 灵动音科技) music structure standard UMP has been applied in many scenarios in TME's National Karaoke and QQ Music.


DeepMusic, an artificial intelligence music service provider, has announced the closing of a Series A+ financing round of nearly USD 10 million, led by GGV Ji Yuan Capital and followed by Feng Yuan Capital. The proceeds will fund research and development of its self-developed AIGC music engine and its applications, and accelerate expansion in domestic and international markets. Yawei Capital served as the long-term exclusive financial advisor.

Founded in 2018, DeepMusic is the first AI music service provider in China to build a music engine on self-developed AIGC capabilities, and is committed to turning AI music technology into scenario-level applications and products for music lovers of all kinds.

"The emergence of AIGC has greatly changed the relationship between people and content." Liu, founder and CEO of DeepMusic, said in an interview that at a time when big models are rapidly dividing the minds of some content creators, the audio modality is not yet as widely used and scenario-based as natural language, images, etc. And while music is actually like other art forms such as painting and video, although those who have studied systematically are always a tiny minority of enthusiasts, each person actually has very individual preferences and understandings. With the increasingly vertical technology and capital trend of AIGC, a more inclusive way of music creation and the value of technical interaction brought by AI is coming.

Data shows that global music users now number as many as 1.63 billion. Yet although music is one of the most important means of self-expression in the age of social media and short-video platforms, its professional threshold has not come down. In recent years, digital audio workstations with built-in sound libraries have somewhat lowered the barrier to entry for arranging, but the obstacles of music theory knowledge and instrument performance skills mean the "era of universal music creation" is still a long way off.

Behind this industry opportunity, on the one hand, the traditional arrangement process is almost monopolized by record labels and by scarce, expensive professional arranging resources; on the other hand, the creative influence of individual users is growing by the day. According to Douyin data, 62 percent of videos with more than 10,000 views come from everyday creators with fewer than 10,000 followers.

"We want to make it possible for music lovers to express their musical talent without having to spend a lot of time learning music systematically, and for people without knowledge of music theory to express themselves." Liu says bluntly that the digital world currently offers very limited opportunities for non-professional music lovers to participate in creative work. The high threshold for music creation and the long time it takes to create music discourage ordinary people due to the lack of an excellent underlying infrastructure for music. In the same way, it is hard to imagine how many people who love singing were good at using professional recording software on computers to mix and shrink soundtracks before the emergence of cover song applications such as National Karaoke.

Compared with the modalities represented by AI painting and ChatGPT, one of the difficulties in generating music with AI is the larger semantic gap, i.e., the correspondence between linguistic description and musical content. Bridging that gap is the most important task of a traditional music producer: after receiving a score, the producer must not only arrange it, but also translate the composer's abstract stylistic descriptions and emotional language into musical notation by communicating and coordinating with musicians.

Beyond the more downstream capabilities of natural language processing, structured data has historically been the more critical pain point for AI in the music industry.

"It's fair to say that music knowledge has never been retrievable by humans."

DeepMusic, which created the UMP music structure standard and automated annotation technology, has spent the past few years analyzing large numbers of audio files, identifying the pitches, chords, sections, and other musical symbols used in each bar. It has converted the music theory information of more than 20,000 songs into a database that can be used to train models, achieving recognition accuracy above 90 percent, which is enough for most consumer-facing scenarios.
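UMP's actual schema has not been published, but a minimal sketch helps illustrate what bar-level structured annotation of this kind might look like. All names and fields below (BarAnnotation, chord, section, and so on) are illustrative assumptions, not DeepMusic's real format.

```python
# Hypothetical sketch of bar-level music annotation, assuming a
# UMP-like schema; field names are illustrative, not DeepMusic's.
from dataclasses import dataclass, field
from typing import List

@dataclass
class BarAnnotation:
    index: int                      # position of the bar in the song
    chord: str                      # harmonic label, e.g. "Am7"
    pitches: List[str] = field(default_factory=list)  # melody notes, e.g. ["A3", "C4"]
    section: str = "unknown"        # structural label, e.g. "verse", "chorus"

@dataclass
class SongAnnotation:
    title: str
    bpm: float
    bars: List[BarAnnotation] = field(default_factory=list)

# One annotated song becomes one training example for the automatic
# annotation model: raw audio in, symbolic labels like these out.
song = SongAnnotation(
    title="demo",
    bpm=96.0,
    bars=[
        BarAnnotation(0, "C",  ["C4", "E4", "G4"], "intro"),
        BarAnnotation(1, "Am", ["A3", "C4", "E4"], "verse"),
    ],
)
print(len(song.bars), song.bars[1].chord)  # 2 Am
```

Representing a song this way is what makes music theory knowledge "retrievable": once every bar carries symbolic labels, the catalog can be queried and used as supervised training data.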

As the AIGC technical environment entered the era of large natural language models, DeepMusic's self-developed AIGC music engine "Mutrix" expanded into a multimodal model built on compatible open-source language models, finally achieving natural language control over music.
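The article gives no architectural detail on Mutrix. As a rough, hypothetical sketch, one common way to get natural-language control over generation is to condition an autoregressive music-token decoder on a text embedding produced by an open-source language model; everything below (module names, dimensions) is an assumption for illustration, not Mutrix's actual design.

```python
# Hypothetical sketch of text-conditioned music generation -- NOT
# Mutrix's published architecture, which is not publicly documented.
import torch
import torch.nn as nn

class TextConditionedDecoder(nn.Module):
    def __init__(self, music_vocab=512, text_dim=768, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(music_vocab, hidden)        # music-token embeddings
        self.project = nn.Linear(text_dim, hidden)            # map text embedding to decoder state
        self.gru = nn.GRU(hidden, hidden, batch_first=True)   # autoregressive decoder
        self.head = nn.Linear(hidden, music_vocab)            # next-token logits

    def forward(self, tokens, text_embedding):
        h0 = torch.tanh(self.project(text_embedding)).unsqueeze(0)  # (1, B, H)
        out, _ = self.gru(self.embed(tokens), h0)                   # condition via initial state
        return self.head(out)                                       # (B, T, vocab)

# In practice the text embedding would come from a language model
# encoding a prompt like "calm piano ballad"; here we fake it.
text_emb = torch.randn(1, 768)
tokens = torch.randint(0, 512, (1, 16))
print(TextConditionedDecoder()(tokens, text_emb).shape)  # torch.Size([1, 16, 512])
```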

Currently, DeepMusic's music structure standard UMP has been applied in many scenarios in TME's National Karaoke and QQ Music. At the same time, after many iterations, UMP has independently completed AI automatic annotation of 400,000 songs. In terms of output music styles, DeepMusic is also expanding its overseas content and accelerating its layout of overseas markets.