The result is The Aficionado – the first and only machine learning system that can accurately identify the musical style of live piano music.
Using deep learning technology cultivated in our Digital Greenhouse, The Aficionado quickly and accurately classifies solo piano music in real time.
The AI consistently and overwhelmingly outperforms classical hand-written software algorithms – both in terms of speed and accuracy – showing uncannily human-like judgement.
Unprecedented automation potential
Though its focus is music, The Aficionado illustrates the ability of machine learning to identify and recognise patterns in complex data.
From detection of faults in an industrial system to rapid assessment of patient health from sensor waveforms, the potential applications – and scope for commercial advantage – are endless.
As an internal experiment that grew organically out of our Digital Greenhouse, our approach to The Aficionado was unlike any client project we have ever undertaken.
We picked a problem that was deemed almost impossible and continuously developed, tested, learned and re-developed our machine learning system throughout the project.
We didn’t want the system to simply learn the music – we wanted it to actually understand the music.
Creating the Digital Greenhouse
The Digital Greenhouse is both a physical facility and an experimental approach to developing and perfecting machine learning.
The ‘greenhouse’ analogy is an apt one, as it is here that our technologists breed and test new models of machine learning, using state-of-the-art computing resources to assess new algorithms against calibrated challenges. The more new strains we grow, the better we understand where the richest pickings are to be found.
Performing machine learning in real time is inherently complex. We started by tackling distinctly different genres of music, such as hip-hop, classical and rock. This gave us an early indication of the challenges we would face with solo piano music and helped us home in on the best deep learning techniques.
We quickly discovered that, in order to achieve real-time analysis, we could not rely on off-the-shelf toolboxes. Instead, we needed to develop a fully bespoke deep learning system.
We used our own in-house GPU-powered data farm – providing far superior speed compared with cloud services – together with a dedicated machine for processing the piano audio signal and a carefully chosen deep learning architecture to deliver near-instant results.
Dealing with noisy data
Practical machine learning using real-world data is an entirely different proposition to using generic datasets.
We intentionally fed The Aficionado messy data, such as practice sessions featuring barking dogs, crying babies and other background noise. We also synthesised additional data by subtly changing the sound, tempo and rhythm of the playing, to increase the quantity and complexity of the examples it learned from.
We then applied expert data science to teach The Aficionado to generalise, ensuring consistent genre outcomes from any pianist on any piano.
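As a rough illustration of the kind of data synthesis described above, the sketch below mixes background noise into a clean recording and applies a small speed change. It is a minimal example under stated assumptions, not The Aficionado's actual pipeline: the function names, the parameter ranges and the use of plain linear-interpolation resampling (which shifts pitch along with tempo) are all hypothetical choices made for illustration.

```python
import numpy as np


def mix_background(clean, noise, snr_db):
    """Mix a noise clip into a clean clip at a target signal-to-noise ratio (dB)."""
    # Loop or trim the noise so it matches the clean clip's length.
    noise = np.resize(noise, clean.shape)
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so clean_power / scaled_noise_power == 10 ** (snr_db / 10).
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise


def change_speed(signal, factor):
    """Crude speed change by resampling; factor > 1 plays faster (and higher)."""
    n_out = int(round(len(signal) / factor))
    old_idx = np.arange(len(signal))
    new_idx = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_idx, old_idx, signal)


def augment(clean, noise, rng):
    """Produce one randomised variant of a clean recording (assumed ranges)."""
    factor = rng.uniform(0.95, 1.05)   # subtle speed change
    snr_db = rng.uniform(5.0, 20.0)    # how loud the background noise is
    gain = rng.uniform(0.7, 1.0)       # overall level change
    return gain * mix_background(change_speed(clean, factor), noise, snr_db)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_rate = 16000
    t = np.arange(sample_rate) / sample_rate
    clean = np.sin(2 * np.pi * 440.0 * t)   # stand-in for a piano recording
    noise = rng.normal(size=4000)            # stand-in for a barking dog
    variant = augment(clean, noise, rng)
    print(len(clean), len(variant))
```

Calling `augment` repeatedly with different noise clips yields many distinct training examples from a single recording, which is the general idea behind increasing both the quantity and the complexity of the data.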
AI vs hand-coded software
An important part of The Aficionado project was to compare how the AI fared against conventional hand-coded software. Could the machine find a better way of telling one genre from another than classic code?
In practice, The Aficionado conclusively outperformed the software written by our engineers, proving more accurate and more decisive in classifying the genre of the music being played. To demonstrate this, we created a graphical interface showing the real-time output of the two approaches side by side.
Tales of the team
“We used a large GPU cluster to train and test around 400 different machine learning systems, varying key learning parameters for each.”
Dominic Kelly, lead AI engineer
“Preparing a dataset was a challenge. We built an online platform where colleagues could label and comment on recordings as they worked.”
Luke Smallwood, data wrangler
“For such a technical project, it’s interesting how often the team gathered around the piano. Laughter was common following earnest requests such as ‘please can you play something one third jazz and two thirds baroque!’”
Monty Barlow, director of machine learning
“The Aficionado required finely-tuned AI development processes and cross-validation of candidate models, network architectures and algorithms.”
James Clemoes, AI engineer
“We taught The Aficionado to recognise a set of musical genres, but the same approach could be applied to almost any task that humans are innately good at.”
Lucy Wright, audio processing