Venturing into current AI research from time to time, I came across this today and wanted to share it with you.
This is what AI can do today already. Is it just me, or does it induce an uncanny feeling in you too? I'd say it's only a matter of time until disinformation can no longer be separated from the truth (deep fakes). And that doesn't sound like a future I want to live in, does it to you?
Background
These neural networks are fed thousands of pictures and then automatically discover (learn) the features that distinguish them (a process called auto-encoding). In the video, a researcher uses sliders to gradually fade between the features that were just learned: age, smile, gender, ethnicity, etc.
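To make that a bit more tangible, here is a minimal, hypothetical sketch of the general idea (not the actual network from the video): encode two images into the learned feature (latent) space, blend the feature vectors, and decode the blend. The PyTorch model, its dimensions and all names are purely illustrative.

```python
# Toy sketch: interpolating in the latent space of an autoencoder.
# A real system would be trained on thousands of images first.
import torch
import torch.nn as nn

class ToyAutoencoder(nn.Module):
    def __init__(self, image_dim=64 * 64, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(image_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, image_dim), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = ToyAutoencoder()          # untrained here; stands in for a trained model
face_a = torch.rand(1, 64 * 64)   # dummy stand-ins for two photos
face_b = torch.rand(1, 64 * 64)

z_a, z_b = model.encoder(face_a), model.encoder(face_b)
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):    # the "slider" position
    z = (1 - alpha) * z_a + alpha * z_b      # blend in feature space
    blended_image = model.decoder(z)         # decode the blend back to pixels
```

In a well-trained network, individual directions in that latent space can line up with human-readable features such as age or smile, which is what the sliders in the video appear to manipulate.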
Deep Fakes
Similar AI techniques are already used to make people say and do things they never did. These deep fakes are getting closer to reality by the day. The potential for political and criminal abuse is huge. Imagine a video showing you (yes: you) discussing a criminal plot with fellow accomplices, and then some surveillance camera footage that shows you actually shooting someone. Good luck defending yourself in court.
And now imagine similarly faked stuff about politicians emerging shortly before a general election. Truth will be no more. Our daily lives will consist of tribes grouping around "leaders" they have decided to trust, engaged in violent shouting matches, with nobody able to be sure about the truth anymore. The end of civilization as we know it.
What About Music
Ok, enough of that. Now what about using this tech for music? A while ago I already mused about this, but every now and then I come back to the topic and re-think it. The issue with music is that it's not continuous, but discrete serial information, more similar to language than to pictures. There is no easy blending between features that would not corrupt rhythmic, melodic and - most of all - semantic coherence.
In other words: You'll get something that surprisingly sounds like music, but isn't actually music anymore, because its inner logic is gone: The self-referentiality and structure, usually carefully assembled and arranged by a composer/songwriter, turns into a wishy-washy dream-like echo of fragments that aimlessly wander around.
This is because current AI blends stuff at the surface, without understanding the (temporal, structural, cultural) logic behind it.
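To make the "blending at the surface" point concrete, here is a deliberately naive toy example of my own (nothing to do with Synfire): interpolating two short phrases note by note, the way an image interpolation averages pixels. The surface values blend smoothly, but the result belongs to no key and to neither motif.

```python
# Toy illustration: "blending" two melodies like pixels are blended in images.
c_major_phrase = [60, 64, 67, 72, 67, 64, 60, 64]   # MIDI notes, C major arpeggio
e_minor_phrase = [64, 67, 71, 76, 74, 71, 67, 64]   # E minor arpeggio

def blend(a, b, alpha):
    """Linear interpolation per note, rounded to the nearest semitone."""
    return [round((1 - alpha) * x + alpha * y) for x, y in zip(a, b)]

print(blend(c_major_phrase, e_minor_phrase, 0.5))
# -> [62, 66, 69, 74, 70, 68, 64, 64]: a row of notes that implies no single key,
#    no functional harmony, and no motivic relationship to either source phrase.
```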
There's certainly a place for this kind of "special effect" in music, but if your goal is to write/compose original music that is recognizable and stands out, this technology leads in exactly the opposite direction.
Thanks for reading. Going back to work now ;-)
Sat, 2020-05-02 - 20:40
Hi Andre,
With regard to the negative side of AI for music, I believe having individuals who are knowledgeable and apparently ethical would help to keep AI on the positive side. That individual is you. Synfire is a testament to what can be done when the best Human Intelligence guides AI.
Thank you!
Mon, 2020-05-04 - 09:53
Oh thanks.
I didn't mean to say however that AI isn't useful for music. It absolutely is. It is merely that neural networks, as one particular example of many AI techniques, are not that useful for composing/generating music. They may well be useful for other tasks, e.g. classification, motif recognition and similar. That's why we are still running some research in that direction.
Wed, 2020-05-06 - 18:51
Cognitone extended VisualWorks in-house to meet the demands of a desktop audio application. The product however includes large parts built in C++, C, Objective-C and other languages. There is no way to make a product of this complexity with only a single language, especially when AI and knowledge-based inferencing is involved. That would be extremely hard to do in C++.
Synfire 2.0 will even include a new programming language that was specifically designed by Cognitone to implement the new generative features.
Thu, 2020-05-07 - 12:15
Didn't know KeyKit yet. There are similar platforms, like Symbolic Composer by Peter Stone, or more recently, Opus Modus, if you want to code your music like a LISP program. The very first (unpublished) incarnations of Synfire, named Leviathan back in 1992, also allowed for writing music using a simple Smalltalk syntax. However, I found writing music as code too abstract and inconvenient. It's so much more intuitive to toss around figures and other parameters and place them in containers.
The new generative language used in 2.0, however, has nothing in common with the above. If anything, it is more akin to Prolog. And it builds on the powerful data structures, knowledge base and transformation functions of Synfire.
With 2.0 there will be new ways to generate parameters out of the blue. I am convinced that manually building an arrangement from generated fragments is more productive and fun than writing an entire arrangement as code.
After all, music is a message from humans to humans.
Wed, 2020-06-03 - 18:44
25 years in software development. I think we all know that, for the most part, AI is a marketing buzzword ("rules-based engine" just doesn't sell anymore). That said, there certainly are new paradigms in software systems that learn/extract rules from the signals/data being fed in. Though I am not aware of any model that actually creates new rules that are not already implied in the system being observed.
As far as AI and music, I think we are a long way off from feeding an acoustical signal to software and having it tell us what genre of music it is, and further telling us how popular the song would be with the masses. Or software outright developing a new genre of music that would be predictably novel and popular. Most of what I see called AI is just based on symbols we have created over the years for musical sound and rules we have codified. If I tell my software "AI" music engine that I-II-#v is a "cadence", it won't tell me I am wrong, or that an augmented chord played in brass typically causes "tension" (or that it even understands the feeling of tension). Or, even more exotic, why the Gamelan scales make sense in their context but not necessarily in use with harmonic instruments.
Music is a pure human experience. Some people like certain styles and some people hate those. I like odd sounding music. Others are irritated by it. How do you model that?
I think a tool set doing automagical software things can aid in the workflow of composition. But in the end it is still the ear of the human mind.
I have been on a journey myself, trying to develop a music prototype language and engine. It has a top-down and a bottom-up approach that meet at the compositional layer, and also a deep low level as the foundation (e.g. how the mind does fusion/fission, segregation etc. in the basics of auditory scene and acoustical stream analysis, plus the basics of "timbre" cognition, and metre and pulse/beat cognition).
Correct me if I am wrong, but Synfire (as an AI component) seems more focused on pitch-based rules, specifically in 12 TET for harmonic spectral instruments. Out of curiosity, is there any roadmap to exploring engine rules with regard to timbre (e.g. in Gamelan, where timbre determines which scales or even bell-like sounds fit the context) and temporal aspects (non-Western concepts of rhythm) of meter interpretation?
Wed, 2020-06-03 - 19:14
To sum up: We can have software (AI) generate interesting music. But by the same token, will that AI appreciate its accomplishment? Will it WANT to actually "listen" to it? If software generates a song in the woods and no one is there to listen to it, is it actually music?
Maybe software can tell us someone in a picture is smiling, but can it tell us why? Does it know what it feels like to smile and the motivations behind it?
Ex Machina was a great movie. I can imagine her writing a love song for the guy, knowing it would get her closer to her goal and knowing he would like it, but having no emotional experience of it herself. It is no wonder that we typically describe people with extreme sociopathic/psychopathic disorders as emotionless robots. I have always thought it would be an interesting experiment to do brain scans of psychopaths while they listen to music.
Wed, 2020-06-03 - 20:13
Correct me if I'm wrong, but I thought Synfire can already do brain scans of psychopaths while they listen to music. It's not in the manual though, so you're just going to have to figure it out. Try pressing random keys, or get your cat to walk across the keyboard ;)
Wed, 2020-06-03 - 22:49
I need to apologize for those previous posts. I left my laptop unattended and walked in on a monkey banging at the keyboard. You know what they say about giving an infinite number of monkeys typewriters... Though I think my good fortune with him has run out, as he has moved on to banging on my guitar and it sounds nothing like Django. I look forward to that new feline interface in Synfire 2.0.
Mon, 2020-06-08 - 12:06
As far as AI and music, I think we are a long way off from feeding an acoustical signal to software and having it tell us what genre of music it is, and further telling us how popular the song would be with the masses
The former works fine with properly trained neural networks. Any attempt at the latter is futile. However, the labeling of musical genres in itself is a constantly disputed matter of subcultures, vanities and personal preferences. AI will never handle that to anyone's satisfaction.
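As an aside, here is a minimal, hypothetical sketch of what "the former" typically looks like: a small neural network classifying mel spectrograms. The PyTorch model, the placeholder genre labels and the random tensors standing in for real, labeled clips are all my own illustrative assumptions, not anything from Synfire or a production system.

```python
# Toy sketch of audio genre classification: a tiny CNN over mel spectrograms.
import torch
import torch.nn as nn

GENRES = ["rock", "jazz", "classical", "electronic"]   # placeholder labels

class GenreNet(nn.Module):
    def __init__(self, n_genres=len(GENRES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_genres)

    def forward(self, spectrogram):          # shape: (batch, 1, mel_bins, frames)
        return self.classifier(self.features(spectrogram).flatten(1))

model = GenreNet()
clips = torch.randn(8, 1, 64, 256)                   # stand-ins for 8 spectrograms
labels = torch.randint(0, len(GENRES), (8,))         # stand-ins for human labels

loss = nn.CrossEntropyLoss()(model(clips), labels)   # one training step, schematically
loss.backward()

predicted = model(clips).argmax(dim=1)                # "what genre is this clip?"
print([GENRES[i] for i in predicted])
```

A real system would of course be trained on a large labeled corpus and would compute the spectrograms from actual audio; predicting popularity, as noted above, is another matter entirely.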
Correct me if I am wrong, but Synfire (as an AI component) seems more focused on pitch-based rules, specifically in 12 TET for harmonic spectral instruments.
That's correct. Some common denominator was needed to even begin with a formalization of the process, and 12 TET is the broadest base available. Also, most if not all MIDI-based instruments are pretty much limited to that, except for vendor-specific implementations of micro-tunings and other features that are hard to get under one hood.
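For readers wondering what the 12 TET assumption boils down to: every MIDI note number maps to one fixed frequency, twelve equal steps per octave. A minimal sketch, assuming the usual A4 = 440 Hz reference; anything finer, such as micro-tuning, needs vendor-specific extensions on top of this grid.

```python
# 12-tone equal temperament: each semitone multiplies frequency by 2**(1/12).
# Standard MIDI assumes exactly this grid (A4 = note 69 = 440 Hz), which is
# why 12 TET is the practical common denominator for MIDI-based instruments.
def midi_to_hz(note: int, a4: float = 440.0) -> float:
    return a4 * 2.0 ** ((note - 69) / 12.0)

for note in (60, 61, 69, 72):                 # C4, C#4, A4, C5
    print(note, round(midi_to_hz(note), 2))   # 261.63, 277.18, 440.0, 523.25
```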
The philosophy of Synfire is less about composing music along emotional parameters or by instructing it to achieve a desired effect with the listener (which is pretty much impossible, as elviscb3577 explained quite well).
Synfire is more about building music from arbitrary snippets and fragments as comfortably as possible. Its AI is mostly for making the snippets adapt automatically to a huge variety of contexts, allowing the user to revel in unlimited what-if scenarios. I consider that approach the most productive and unconstrained so far.
And of course, by comfortably playing around with so many things all the time, you quickly learn how to achieve a desired effect first hand.
Sun, 2020-07-05 - 16:02
Synfire 2.0 will even include a new programming language that was specifically designed by Cognitone to implement the new generative features.
What development tools are you using for that? Are you using just a bunch of macros or something like https://www.jetbrains.com/mps/?
I'm assuming you are talking about writing version 2 in this language and not this language being a scripting language within Synfire.