The Future of Virtual Instruments?

By now you probably know that I am personally a huge fan of software instruments that are playable, meaning you add all articulation changes and expression controls by recording them live.

The downside with physical modeling has so far been that many people have not liked “the tone quality”. Some instruments have been modeled almost to perfection, though, like piano (Pianoteq 6) and bass (MODO Bass), and in some cases drums have been passable for tone (imho).

Then there is the “hybrid” approach of, for example, the Sample Modeling series and the Infinite Series: using samples as a foundation, but then doing very clever scripting and acoustic calculations to create the final instrument.

Or do you believe developers will continue with the same old approach: record every note of every articulation separately, and then let the user switch articulations in programming by using key switches? I sincerely hope this trend will die soon, but I welcome your view even if it is completely opposed to mine! :slight_smile:

So, what do you believe we will see in the coming years in the technology and expression of virtual instruments?


Well, as you know, I’m not particularly impressed with the tone of most modeled instruments so far, and I believe the reason they’re not “there” yet is that it’s essentially synth programming. Even with meticulous research, audio analysis, custom tools for deriving model parameters from recorded audio etc., it’s a massive project to even come up with an “acceptable” generic instrument, let alone one that sounds like a really good real one.

I mean, after hundreds of years, we only have a rough idea how a violin actually works, and building really great sounding ones - even with the help of computers and modern science, as some luthiers have started doing - is still more art than science. To “fake” our way around that, we’d at least have to record impulse responses of the top, back and other significant parts of good violins, and feed those into the models, to avoid trying to model things we don’t fully understand, such as the acoustic properties of a good piece of spruce.

Sample modeling partially avoids these problems by instead using the recorded tone, transients and the like from the real instrument, and only “modeling” the higher level behavior and playing techniques of the instrument.

Anyway, I’m not sure what the next step will be for the “mainstream,” but I’m leaning towards something along the lines of sample modeling. I suspect “true” modeling is just too much work to get right with the current tools and methods, and most of the work needs to be redone for every single instrument to be modeled. Meanwhile, traditional style “brute force” deep sampling just doesn’t scale to the levels of control and detail we want.

As for playability and control, most modeling approaches are automatically more “real time,” whereas traditional deep sampling inherently suffers from the need to select samples upfront, which is why they need to either depend on explicit “out-of-band” controls (keyswitches, CCs, …), or have to add latency to have enough time to reliably interpret input in clever ways.
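
To make that distinction concrete, here’s a minimal sketch of how a keyswitch-based sampler typically handles this. All note numbers, names, and ranges below are invented for illustration:

```python
# Hypothetical sketch of "out-of-band" articulation selection in a
# keyswitch-based sampler. Note numbers, names and ranges are invented.

KEYSWITCHES = {24: "legato", 25: "staccato", 26: "pizzicato"}  # C1, C#1, D1
PLAYABLE_RANGE = range(55, 104)                                # G3 and up

class KeyswitchSampler:
    def __init__(self):
        self.articulation = "legato"   # latched state, set *before* the note

    def note_on(self, note, velocity):
        if note in KEYSWITCHES:
            # The keyswitch makes no sound; it only mutates latched state.
            self.articulation = KEYSWITCHES[note]
        elif note in PLAYABLE_RANGE:
            # The sample must be chosen *right now*, from already-known
            # state; the engine can't wait to see how the phrase continues
            # without adding latency.
            self.trigger_sample(self.articulation, note, velocity)

    def trigger_sample(self, articulation, note, velocity):
        print(f"play {articulation} sample: note={note} vel={velocity}")

s = KeyswitchSampler()
s.note_on(25, 100)   # silent: just latches "staccato"
s.note_on(67, 90)    # plays a staccato sample for note 67
```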

On the downside, modeling demands more from the player and controllers (hello breath controllers, Touché, Seaboard, Osmose etc!), and I’m not sure composers, orchestrators etc in general are prepared, or even willing, to deal with that…

Are we going to see a market for “virtual studio musicians,” specializing in expressive playing of virtual instruments?

Or will there be “AI musicians” integrated into the instruments? If so, how will that be implemented, without reintroducing the dreaded latencies, keyswitches and whatnot that we were supposed to leave behind when moving beyond traditional sampling…?


It takes time for new ideas to take root. Schopenhauer said: “All truth passes through three stages. First, it is ridiculed. Second, it is violently opposed. Third, it is accepted as being self-evident.”
When scrolling the internet, it seems like the majority thinks that samples still sound best. Personally, I’m not sure I know.
I like blind tests, so I avoid too much influence. This is how I usually work, whether it’s a taste (I’ve done it with coffee, brandy, wine, plugins) or a sound:
My girlfriend puts up a selection to try that I’m not allowed to see.
I listen/taste.
I pick.

So far I’ve gone with samples.
The playability is another matter. I think modeling will be bigger in the future.
And inventions like the ROLI Seaboard.

If dreaming about the future:
Maybe Elon Musk’s Neuralink can make plugins connected to the brain, so we can think tones and compositions.


Well, I would be able to cope with “samples” if I could still perform them with minimal key switching. An example of this is CSS short notes, which are controlled by the mod wheel, from spiccato to sforzando, I believe?

I would also like to be able to play legatos and sustains and easily add accents (marcato, staccato, sforzato etc.) without having to program it in.

Am I in the minority among composers in longing so passionately for more expressive options without the need to program with key switches? Perhaps I am, but I do believe the market is still big enough to make room for such changes in expressive workflow. :slight_smile:


Btw, I just want to add that even with my strong “pull” towards physically modeled and “hybrid” alternatives… I still agree that the tonal character is for the most part not there in any of the options. Well… I guess Pianoteq 6 managed to make it, but since the piano is essentially a percussion instrument (albeit a tuned one), it is much, much easier to model than any instrument that can vary its dynamics and pitch over time, from attack through sustain…


Well, I don’t mind keyswitches any more than the other options - or rather, I hate them all equally. :wink:

That is, in the context of the limited set of articulations we have in sample libraries, I prefer selecting them explicitly, and although I can see why one might prefer the modwheel for physical UI reasons, it doesn’t really make sense IMHO.

Same deal with the vibrato “controls” in the Spitfire libs: there are only two or three vibrato levels in most of them, with no morphing or anything in between, so the continuous controls are just misleading and annoying. The good thing is that it makes those controls more consistent across Spitfire libraries, but that becomes a bit of a moot point when a value around 64 is “medium” in a lib with three levels, but sits right at the none/full switch point in a lib with only two.
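
To illustrate with made-up (but structurally typical) thresholds - the same CC value means completely different things depending on how many layers the library happens to have:

```python
# Invented thresholds, but the structure matches libraries with two vs.
# three discrete vibrato layers hidden behind one continuous CC.

def vibrato_three_layer(cc):            # none / medium / full
    if cc < 43:
        return "none"
    if cc < 86:
        return "medium"
    return "full"

def vibrato_two_layer(cc):              # none / full, hard switch
    return "none" if cc < 64 else "full"

for cc in (0, 60, 64, 68, 127):
    print(cc, vibrato_three_layer(cc), vibrato_two_layer(cc))
# cc=64 sits safely inside "medium" in one library, but right on the
# none/full switch point in the other -- same gesture, different result.
```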

That is, with modeling of some sort, where all controls are real time and continuous, it becomes a different situation entirely, as the CCs become an active part of playing the instrument, rather than pre-selecting from a limited set of canned articulations. Without that, I don’t really care how the selection is done, as long as it’s not a total PITA to handle in the DAW.

With all my passion for recording all the expression in the performance itself, I probably should do what you do and learn to play the violin! :wink:


I wonder if the newly announced MIDI 2.0 standard has thought about this and added actual standards for articulations to MIDI itself. Then we could have a standard “articulation mapping track” defined by the MIDI 2.0 standard, and wouldn’t have to keep hunting for those pesky key switches that are different in every library.
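
For what it’s worth, MIDI 2.0’s note-on messages do carry an “attribute type” byte plus 16 bits of attribute data alongside the 16-bit velocity. As far as I know no articulation attribute has been standardized, but if one ever were, per-note articulation could ride in-band with the note itself. A sketch of the idea (the attribute type value and the articulation table below are entirely made up):

```python
# Hypothetical: if a future MIDI 2.0 profile standardized an "articulation"
# attribute, it could travel inside the note-on itself. The attribute type
# 0x0F and the articulation table are NOT real assignments.

ATTR_ARTICULATION = 0x0F   # invented, for illustration only
ARTICULATIONS = {0: "sustain", 1: "staccato", 2: "pizzicato", 3: "tremolo"}

def ump_note_on(group, channel, note, velocity16, attr_type, attr_data):
    """Pack a 64-bit MIDI 2.0 channel-voice note-on (message type 0x4)."""
    word1 = (0x4 << 28) | (group << 24) | (0x9 << 20) | (channel << 16) \
            | (note << 8) | attr_type
    word2 = (velocity16 << 16) | attr_data
    return word1, word2

w1, w2 = ump_note_on(group=0, channel=0, note=67,
                     velocity16=0xB000,
                     attr_type=ATTR_ARTICULATION,
                     attr_data=1)           # 1 = staccato in our made-up map
print(f"{w1:08X} {w2:08X}")
```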


For solo violin, have you tried this? https://youtu.be/oTxgHJMpKsw


On that note, maybe MIDI violin (or cello) is the controller we need for all this? Electric instrument with per-string pickups, some accelerometers in the bow, and some DSP magic… :wink:
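
Part of that DSP magic would presumably be per-string pitch tracking, which is a fairly tractable problem once each pickup gives you a monophonic signal. A rough sketch of the idea (plain autocorrelation, none of the real-world robustness):

```python
# Toy per-string pitch tracker: each pickup is assumed monophonic, which is
# exactly what makes per-string pickups attractive in the first place.

import numpy as np

def track_pitch(frame, sr, fmin=190.0, fmax=2800.0):
    """Estimate f0 of a short mono frame (one string) via autocorrelation."""
    frame = frame - frame.mean()
    n = len(frame)
    ac = np.correlate(frame, frame, mode="full")[n - 1:]
    ac = ac / (n - np.arange(n))             # unbias: undo the linear taper
    lo, hi = int(sr / fmax), int(sr / fmin)  # plausible lag range
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

sr = 48_000
t = np.arange(2048) / sr
g_string = np.sin(2 * np.pi * 196.0 * t)     # violin open G = 196 Hz
print(round(track_pitch(g_string, sr), 1))   # ~195.9 (whole-sample lag grid)
```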

Wouldn’t be much easier to play than the normal ones, though - but I think that’s pretty much what all this boils down to: There’s always a compromise. “Easy” instruments are not very expressive, while expressive instruments are notoriously hard to learn. Even if a digital controller could offer a much better skill/expression ratio, one would still need to learn how to play it properly.


Interesting! The sound is… about what I’ve come to expect from modeled strings at this point - but using the Touché for bowing is interesting. (I have one!)

Another idea I’ve had is to use a long ribbon controller for bowing, preferably pressure sensitive as well. Maybe 2D, or a pedal or something, for sul pont/sul tasto control.
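
Something like this, mapping position changes to bow speed and pressure to bow force - the two main inputs of a bowed-string excitation model (all scaling constants here are invented):

```python
# Toy mapping from a (position, pressure) ribbon to bowing parameters.
# Bow *speed* comes from how fast the finger moves along the ribbon;
# pressure maps to bow force.

import time

class RibbonBow:
    def __init__(self, ribbon_length_m=0.45):
        self.length = ribbon_length_m
        self.last_pos = None
        self.last_t = None

    def update(self, pos_norm, pressure_norm):
        """pos_norm, pressure_norm in 0..1; returns (bow speed m/s, force)."""
        now = time.monotonic()
        speed = 0.0
        if self.last_pos is not None:
            dt = max(now - self.last_t, 1e-4)
            speed = (pos_norm - self.last_pos) * self.length / dt
        self.last_pos, self.last_t = pos_norm, now
        return speed, pressure_norm   # sign of speed = up-bow vs down-bow
```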


Yes, I know you are correct about the expressive capabilities needing lots of training. It is perhaps a safer bet to learn the violin than to perform with a ROLI Seaboard or whatever digital tool is in the news for a few years until it fades, while the violin has been around for hundreds of years and is not going anywhere soon.

Even so, as a composer, I am constantly searching for those holy grail compromises between artistic expression and the final musical result.


Yeah, violin and cello are much like the guitar and the piano that way; playable, great sounding, tried and tested in all genres, there’s lots of accumulated knowledge around their construction, maintenance, and playing, and there are non-acoustic variants when you want to reach beyond the traditional sound. So, pretty safe bets from a “usefulness” perspective.

Unfortunately, “MIDIfying” most instruments either isn’t (yet) an option, or basically turns them into different instruments. The reason for the slow development there is probably a combination of technical challenges and a Catch-22 between limited interest and disappointing results, but I think it’s only a matter of time. It seems unlikely that anything (short of bypassing physical controllers altogether) will ever come close to the popularity and versatility of the keyboard, but I think truly viable alternatives will pop up at an accelerating rate as the technology matures.


I think we’ll see more creative ways of incorporating sampling, synthesis, and AI into modeling instruments. Even though the sound of most instruments is incredibly complex, with layered harmonics and interactions, our ears seem able to pick up when it’s not right or realistic.
I think MIT was doing some interesting work putting sensors in a violin bow, and also a flute, so that a more detailed performance could be recorded, with the aim of doing more realistic synthesis.

At the moment I think we need a major leap in the development of sampling software, maybe a wavetable synth hybrid.

Kontakt is good, especially with scripting etc., but you do get some latency with complex sample libraries. I can see more hard-coded plugins being made which run low-level code rather than higher-level scripting, to reduce latency and provide more complex processing.

I should imagine producing sample libraries for orchestral instruments is pretty tedious, and in the grand scheme of things there aren’t that many producers out there compared to, say, audio plugins.


Interesting thoughts Phil. I also believe that MIDI 2.0, now out, together with new expressive controllers like the ROLI Seaboard and the Osmose, will give us composers and creative music artists so much more flexibility in our performances! :slight_smile:


Hello,
In my opinion, we still have both technologies: sampled instruments and emulated ones.
Software has now reached a high level of realism, but for some instruments it remains hard to get close to the real sound; I am thinking of the sax and the brass with certain articulations.
Try to imagine software that can emulate every available, playable articulation of a violin, a woodwind, or a brass instrument. I am thinking of the classical ones first, but if you also add contemporary or extended playing techniques, it becomes extremely hard to reproduce or emulate the instrument.
Take the violin, for example:
The classical articulations are (if I don’t forget one): sustain, legato, staccato, spiccato, con sordino, pizzicato, ricochet, trill, tremolo, glissando, vibrato, sforzando, piano, pianissimo, fortissimo, fortississimo, mezzo piano, mezzo forte, Bartók pizz, col legno, col legno battuto, harmonics, sul ponticello, mezzo sul ponticello, sul tasto, mezzo sul tasto.

Now try to imagine software that can emulate all these classical articulations; for me, it doesn’t exist yet, which is why a lot of libraries use samples (Spitfire Audio, Strezov Sampling, VSL, EW). With samples you can ask the player to play the desired articulation, but for software you have to develop the right algorithm for each articulation.

It is not for today, but perhaps in many years we will have a great emulated instrument with every articulation.


Hello guys,

sorry for being away for weeks. However, I am not super far away from you and see regularly what you guys are writing and doing :slight_smile:

Let me put a couple of words together about this interesting topic!

First, I believe we can all proudly say that percussion instruments have already reached a super high level, and if they ever improve, the margin will be very small. Why? When we look at what Hollywood is doing - besides recording ten drums at once for Man of Steel - almost every production uses samples. Writing for percussion in a DAW is super fast nowadays, and we almost don’t need to think about the “realism” as long as we know how to write for percussion.
I think there are quite a few pianos out there which I really like as a keyboardist myself. In the early 2010s, I didn’t like what the market had to offer: a lot of trouble in the low and high registers – they never sounded real. I have played concert Yamahas and Steinways and even Bösendorfers, and the samples just couldn’t give me that feeling. Today, I like more and more of the newest libraries.
String sections, for me as a non-string-player, can reach a high standard; however, I can still hear “false” programming.
Brass is maybe the biggest issue, in my opinion. If you have heard a good brass section with your own ears, I still believe there is something that samples and microphones can’t quite capture in terms of sound and overall feeling. But I need to check out what Junkie XL Brass is doing first. He said something like: it’s not “another” brass library, it’s something special.
Woodwinds maybe stand a little better today than brass, but not by much.

I would say that if I have the chance to record a real live orchestra, I will do it without a doubt. First, I know that my music is played by real human beings and recorded by real human beings, and I know that they will make the music feel special. Even if I can program good mockups, they will never be as human as a real live performance. There are so many nuances – little mistakes, a false note here and there (wrong pitch), timing issues – but all of that makes the music feel real and emotional. I have many compositions I feel would be amazing to hear in a real live context. I know that when a performance is on point, I get goosebumps, and I have realized it takes a lot of emotional content to give me that. I mean, take any piece by Beethoven or Mahler and hear the best orchestras play it. Even if it’s not YOUR MUSIC, it’s astounding. You can’t program samples to get the same emotional result.

And the second reason, which is just as important for me, as I see things going more and more in the wrong direction due to “saving money agendas”: the musicians who have practiced their instruments for years and decades should still be able to make a living from the gift they have! There are so many talented people out there who would do anything to make a performance stand out. But if budgets are going down, no wonder the quality is going down. If there are fewer and fewer human beings involved, we lose more and more emotion from every project.

I am happy if the technology improves, but the “heart” and “mind” know that technology will never replace what we feel in that very moment :wink:

Have a great week guys!
Alexey :slight_smile:


I think VSTs have matured quite a bit from their buzzy, raspy days. I’ve worked in A/D conversion and sampling (military applications). Fundamentally, we have three components in a signal, all dynamic in nature: frequency, phase, and amplitude. You can extend that to harmonic content as well, and to the intermodulation or beat products of frequencies adding and subtracting. The problem with modeling is that you can’t model for “infinite parameters”, and you’re dealing with multiple non-linear functions chained back to back. Articulating small parameters like rosin on a violin bow, as simple as it seems, involves very complex modeling.
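
A tiny demo of that point about intermodulation: even a single memoryless non-linearity turns two clean partials into a whole family of sum/difference products, and real instruments chain many such stages (the frequencies and coefficient below are just for illustration):

```python
# Two sines through a cubic nonlinearity: energy appears at 2*f1-f2,
# 2*f2-f1, 2*f1+f2, 3*f1, etc. One second at 48 kHz gives 1 Hz FFT bins,
# and integer frequencies land exactly on bins (no leakage).

import numpy as np

sr = 48_000
t = np.arange(sr) / sr
f1, f2 = 440.0, 550.0
x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

y = x + 0.3 * x**3                     # memoryless cubic nonlinearity

spectrum = np.abs(np.fft.rfft(y))
for f in (f1, f2, 2*f1 - f2, 2*f2 - f1, 2*f1 + f2, 3*f1):
    print(f"{f:6.0f} Hz  ->  {spectrum[int(f)]:10.1f}")
```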


Maybe with 5G we’ll have some new possibilities, not only for memory size [definitely] but for other parameters, and from quantum computing, which I think will be a big breakthrough.

As for what needs improving, it seems to me that present modeling doesn’t take into account what happens in a room with a large number of real instruments, like the strings, where each one sounds a little different, is played a little differently, isn’t in perfect time, etc. What I mean is that I don’t think a modeled sound is quite enough, and I’m not sure what is missing, but something happens with a real group of stringed instruments that we don’t yet understand - kind of like how birds that fly in formation influence each other - I don’t know how to say it. But you’ve probably noticed that recording 12 violin tracks doesn’t sound the same as 12 violins recorded together… not even close.
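
Part of that gap is at least nameable: in a real section every player has their own slightly different tuning, timing, and vibrato phase, and those never line up the way twelve copies of one performance do. A toy sketch of that per-player variation (the acoustic coupling in a shared room is exactly the part this does not capture):

```python
# Twelve "players" from one idealized tone: random detune, onset jitter,
# and desynchronized vibrato per player. All amounts are invented.

import numpy as np

rng = np.random.default_rng(1)
sr, dur, f0 = 48_000, 2.0, 440.0
t = np.arange(int(sr * dur)) / sr

def one_player():
    detune = f0 * 2 ** (rng.normal(0, 8) / 1200)        # ~8 cents std dev
    onset = rng.uniform(0.0, 0.03)                      # up to 30 ms late
    vib_phase = rng.uniform(0, 2 * np.pi)               # desynced vibrato
    freq = detune * (1 + 0.003 * np.sin(2*np.pi*5.5*t + vib_phase))
    phase = 2 * np.pi * np.cumsum(freq) / sr            # integrate frequency
    tone = np.sin(phase)
    tone[t < onset] = 0.0
    return tone

section = sum(one_player() for _ in range(12)) / 12
```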

But I think that, due to A.I., the computers built to work on this problem will figure out and apply things that engineers may not even know about, and won’t understand. That is, computers will take over the sample creation process in ways that no one can explain to us, and probably come up with something spectacular - not just realism, but some kind of HYPER-realism.

What I’ve noticed about people [not to beat a dead horse] is that they consistently underestimate the future, in both directions: on the one hand, how much things stay the same, and on the other, how far outside our wildest expectations things can go. Anyone who lived in the 60s could not have dreamed of what computers have brought, especially the internet. It was literally outside anything we could have thought up. You could not even have described it to us, because we wouldn’t have had any idea what you were talking about - “cyberspace”?

So I think something similar will happen with quantum computers and 5G, because they will bump us to a level comparable to the difference between the 60s and today, probably much further.


As someone on the “inside” (former senior developer in network appliances), I don’t really understand the fascination with 5G. It’s just more bandwidth, and it makes it possible to “cut some corners” that will reduce latency in some applications (basically, you talk to a local server near your current location), but it does NOT circumvent limitations such as the speed of light. This guaranteed 1 ms nonsense that some are talking about is a misunderstanding of the specification; it’s simply not physically possible - without quantum mechanics, that is!
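
The back-of-envelope numbers, for anyone curious (assuming a fiber refractive index of about 1.47):

```python
# Why "guaranteed 1 ms" can only mean "a server near you": light in optical
# fiber travels at roughly c / 1.47, before any routing or queuing delays.

C = 299_792.458            # km/s, speed of light in vacuum
FIBER = C / 1.47           # ~204,000 km/s in fiber

for km in (100, 1_000, 10_000, 20_000):   # 20,000 km ~ antipodal distance
    one_way_ms = km / FIBER * 1000
    print(f"{km:>6} km: {one_way_ms:6.2f} ms one-way, "
          f"{2 * one_way_ms:6.2f} ms round trip")
# Only the ~100 km case fits inside a 1 ms budget; across the planet you
# are looking at ~200 ms round trip before the network even does anything.
```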

So, yeah; I’m not saying it will never be possible to run low latency audio across the planet, but it’s certainly not going to happen any time soon, and until then, we absolutely, positively, need local processing for anything real time interactive in the “music” time scale.

As for quantum computing, it’s not really applicable to “raw” calculations like signal processing, but it’s great for narrowing jobs down for traditional computers to finish up, and I’m sure it has potential in big data, AI and the like - and THAT is certainly an interesting field when it comes to analysis, (re)synthesis, sound design etc! When it comes to phenomena that aren’t completely understood, the closest we can get to simulating them is by creating approximate models that mimic the behavior of the “real thing”, and the AI field has some great models for turning raw data into models like that.

And speaking of not fully understood phenomena; I’m not sure the interaction between instruments in a section is entirely a mystery, but given that we just barely have the CPU power to handle a few fairly basic physical modeled instruments, we’re just not at the point of trying to properly simulate that yet.

I think it might be interesting to approximate it a bit, like giving modeled instruments an audio input that allows you to feed some sound from the other instruments into the “body” resonance of the model.
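
As a sketch of what I mean - assuming the model exposes (or could expose) a body impulse response, the cross-feed could be as simple as this (the bleed amount is pure guesswork, and the IR in the usage example is a stand-in):

```python
# Toy approximation of section "cross-talk": bleed a neighbour's signal
# through this instrument's body impulse response.

import numpy as np
from scipy.signal import fftconvolve

def with_crosstalk(own_dry, neighbour_out, body_ir, bleed=0.05):
    """Mix a little of a neighbour's sound through our own body resonance."""
    sympathetic = fftconvolve(neighbour_out, body_ir)[: len(own_dry)]
    return own_dry + bleed * sympathetic

# Toy usage: decaying noise as a stand-in for a measured body IR.
rng = np.random.default_rng(0)
body_ir = rng.standard_normal(2048) * np.exp(-np.arange(2048) / 400)
own = rng.standard_normal(48_000)        # "own" signal
neighbour = rng.standard_normal(48_000)  # neighbour's output
mixed = with_crosstalk(own, neighbour, body_ir)
```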

However, the most obvious issue I hear in most examples so far is “simply” poor room/spatial positioning simulation. You just won’t get away with a bit of panning, EQ, and a reverb send here. Each instrument needs its own reverb response that accurately represents its exact position in the space - or at least gives a believable impression of doing so.
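
Even a crude per-source position model - per-instrument distance delay, 1/r level, constant-power pan - already goes beyond pan + EQ + a shared send, though a proper solution would convolve each source with an impulse response measured (or simulated) at its actual seat. A minimal sketch, with all of the simplifications that implies:

```python
# Crude per-source positioning: distance delay, 1/r attenuation, and
# constant-power panning from azimuth. No early reflections, no tail;
# a real solution would use a position-specific reverb IR per source.

import numpy as np

SPEED_OF_SOUND = 343.0   # m/s

def place(source, sr, distance_m, azimuth_rad):
    """Return a stereo (2, N) signal for a mono source at a given seat."""
    delay = int(round(distance_m / SPEED_OF_SOUND * sr))
    gain = 1.0 / max(distance_m, 1.0)
    pan = (azimuth_rad / np.pi + 1) / 2          # -pi..pi -> 0..1
    left = np.cos(pan * np.pi / 2) * gain
    right = np.sin(pan * np.pi / 2) * gain
    delayed = np.concatenate([np.zeros(delay), source])
    return np.stack([left * delayed, right * delayed])
```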