By Dan Brotman, Editor-In-Chief
Special To FutureMusic
We all know about the Graphical User Interface, more affectionately known as a GUI or Goo-ey, but have you heard of the Conversational User Interface? Maybe not by that name, but you’ve certainly seen it in action. Think Star Trek’s “Computer, locate the nearest hospitable planet,” Tony Stark’s Jarvis, or the luscious Scarlett Johansson in Spike Jonze’s Her…and even, to some extent, Apple’s Siri, Google Now and Microsoft’s Cortana. While most of us won’t be falling for Siri, the Conversational User Interface (CUI) arena is definitely heating up.
According to Tim Tuttle, founder of Expect Labs, voice recognition and artificial intelligence have become 30% more accurate in the last two years, which is more progress than the industry made in the prior ten years. Tuttle’s company has developed MindMeld, cloud-based voice-recognition and artificial intelligence software, which is being used by over 1,200 companies to power voice recognition in their computer programs and mobile devices. And he’s certainly not alone.
Nuance Communications, which you may know quite intimately if you drive a Ford and use the Sync system, is another leader in the field right now. Apple’s Siri relies on its technology. However, these systems are still pretty dumb. Bark a command like “turn on the radio” and Sync obliges; ask “how long will it take to get home with all this traffic?” and it becomes hopelessly lost. In fact, using Siri, Sync or the majority of voice-recognition systems out there can leave you feeling like you’ve been left for dead in a telephone voice-menu system. “Representative. Representative! REPRESENTATIVE!!!”
Now SoundHound, one of the music recognition apps you use on your cell phone, could be leading the charge in intelligent voice recognition. CEO Keyvan Mohajer has been showcasing the company’s new Hound technology around Silicon Valley as of late, and he may have good reason. The technology promises not only to deliver accurate answers to simple questions, but also to respond to ever more complex questions, using information acquired earlier in the conversation. Thus, not only could you ask Hound how long it would take to get home, but you could then ask how long it would take if you avoided the expressway. That is a big leap forward, and it is just an elementary example of what Hound can process.
Now think about how wonderful this technology would be if it were implemented in your digital audio workstation (DAW). By reading FutureMusic and keeping abreast of the industry, you know how many hardware, software and mobile app releases debut every day. It’s actually staggering. I thought it was mind-blowing five years ago, but now with crowdfunding, 3D printers and an even lower barrier to entry with apps, it’s like trying to drink from a firehose. What’s worse, even when you do finally get your arms around that complex DAW, an update comes out that changes the entire paradigm. The result? Frustration and countless hours figuring out how to make things work, rather than actually making music. But what if Apple decided to add a vastly more intelligent Siri to Logic Pro X?
“Siri, what is the shortcut for swapping the left and right locators?”
“How come my Moog Taurus is not receiving MIDI data?”
“Siri, sidechain the hi-hat pattern on track six into Massive.”
Now your Moog Taurus may never be able to actually communicate back to Logic Pro, informing the DAW that it needs a firmware update, but the idea that you can discuss the problem with Logic and ask for some possible solutions would be a tremendous help. What about music intelligence? “Siri, can you suggest some 1-4-5 chord progressions for the bass line on track ten?”
Or even the biggest time suck of them all…
“Siri, can you audition all the patches listed as ‘metallic’ in Alchemy, Sculpture and Retro in a four bar loop on track twelve?”
The possibilities are staggering, and we’re just talking about your DAW. Look at other arenas: mixing, mastering, digital DJing, sound design and so on. Anything that allows me to make music more efficiently, instead of laboring through technical concerns, wading through patches, dealing with workflow modifications from an update or trying to recall complex engineering acrobatics, is a huge win.
Clearly, Apple is in a leading position to implement a CUI as part of its music and video product lines. However, with so many voice recognition and artificial intelligence companies in the market (and new startups mushrooming up every month), other companies such as Native Instruments, Ableton and Image-Line, to name just a few, could enter this arena as well.