Tracking Down Audio Artifacts…

So the past several days have had me re-examining audio artifacts, after a customer said he thought the high-synth sound was a bit off. I tried to get a better understanding of what he meant, and I asked him for the track that he was listening to when those artifacts appeared.

After narrowing down the time interval where he said the artifacts appeared, I zoomed in and listened carefully. In my mind, there are no guarantees that my Crescendo is performing properly, so I’m always interested to hear when someone says they hear something not quite right.

Well, with my own hearing being so bad, I thought maybe I did hear something, but just to be sure, I cut out the bass region and sent the track through a pitch shifter to lower the frequency range by 2 octaves – down to where I could hear it better. And lo and behold, what was there was a fluttery, warbling pitch at a low volume level, wandering up and down in both amplitude and frequency. So how does something like that end up being heard as “audio artifacts”?
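The arithmetic behind that pitch-shift trick is simple: each octave down halves the frequency, so two octaves down divides every frequency by four. A minimal sketch in Python (the function name and the 8 kHz example value are assumptions for illustration, not taken from the track in question):

```python
def shifted_frequency(freq_hz, octaves):
    """Return the frequency after a pitch shift of `octaves` (negative = down)."""
    return freq_hz * 2.0 ** octaves


# Hypothetical example: a synth partial at 8 kHz, shifted down 2 octaves.
print(shifted_frequency(8000.0, -2))
```

So content that sits up where hearing loss is worst lands in a range that is much easier to inspect by ear.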

Well, let’s back up a minute and look at the settings for Crescendo. There the most important parameter is the vTuning level. Where do we get the value it should have?

There is a correct answer, and a best available answer. The correct answer requires a carefully calibrated audio system, so that Crescendo knows how loud something will be when it sees some digital level at its input. Next, we need a carefully measured hearing threshold across the spectrum for our ears. That takes extreme care and many repeated trials to arrive at a decent answer. But since nobody ever has such a thing, we start by guesstimating from the listener’s audiology report.

But a word of caution… your audiologist is not a trained scientist, and makes a rather quick attempt to measure your hearing in just one sitting, in order to get you in and out of the office so the next patient can be examined. A careful study would require all day, perhaps several. That isn’t going to happen.

Secondly, those audiology measurements are only accurate to +/- 5 dB (!?? well, maybe on a good day). But often they are even worse. I once flunked a hearing test because I was alone in the sound booth with my tinnitus ringing off the hook, and not knowing what kind of a sound I should be looking for during the test. So I was pressing the button like crazy. The result was that I was deemed to have profound loss (> 90 dB elevation) above 1 kHz. In other words, the audiologist proclaimed me profoundly deaf, without hope. Well, we know that isn’t correct. So what value of vTuning should be derived from that sitting? Nada!

So the real answer to the original question – where does the vTuning number come from? – is individual listening sessions, hopefully with a well calibrated audio system, in which the listener simply sweeps the vTuning control up and down until he likes what he hears. Sadly, without access to a lab like mine, and a trained observer like me, there is really little else that can be done. But that vTuning number is all important.

What happens when your vTuning is mis-adjusted is that you begin to hear subtle artifacts in the music at low sound levels. Crescendo attempts to pre-warp the sound spectrum so that it overcomes your hearing recruitment curves. But down near your threshold level the recruitment curve is nearly vertical, which means that any tiny variation in sound level around the threshold goes from being completely inaudible to becoming something quite a bit more than a mere threshold level sound. If your vTuning level is off, then we will be making incorrect adjustments to try to keep the sounds above your threshold by just the right amount. The sounds will end up exhibiting a “crunchy” sort of effect as they dither in amplitude above and below your actual threshold level. Artifacts!

So that fluttery warbling high frequency pitch from the track probably did sound incorrect to the listener, and that is very likely because of (A) an incorrectly calibrated audio playback system, coupled with (B) a wrong guess at the vTuning level in his Crescendo system. When you hear things like I just described, you might want to go back and diddle the vTuning level while listening to a bothersome track, until the artifacts disappear. But be sure to calibrate your audio system so that Crescendo really knows how loud the sounds will be on playback.

When I did that exercise for myself on his troublesome track, I found that I probably had my vTuning set too high, at 60 dB. I found the artifacts disappearing as I dropped the level down to around 53 dB. And the result is that many other tracks sound better to me as well.

Here is a graph showing the effects of deliberate overcorrection, i.e., setting vTuning too high, in Crescendo. Ideally, we want Crescendo to pre-warp the sound so that, combined with the recruitment of our hearing (shown in red), the result will be that straight diagonal line (shown in orange) that lies right on top of the ideal reference line. That diagonal line shows that you perceive what was fed in. You hear it like it is.

But when Crescendo has its vTuning set too high, like my 60 dB when it should have been 53 dB, you end up with that green curve that lies everywhere above the ideal reference line. It shows that very faint sounds will sound much louder than they should, especially at the very faintest levels. In fact, you can easily end up boosting the noise floor to audible levels, which also sounds cruddy. Every recording has noise unless it was completely synthetically generated. And that noise, added to real signal, will cause the signal to flutter.

Now look how steep that red recruitment curve is, near the threshold level. If we are wrong about how much Crescendo should boost your signal, even by a tiny amount, you will go from hearing nothing when it is barely below your threshold, to something quite a bit louder than a real threshold level sound when it moves just slightly above your threshold.
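To make the shape of that argument concrete, here is a toy model in Python. It is emphatically not Crescendo's algorithm. The square-root recruitment curve, the 100 dB ceiling, and the idea of treating vTuning as a single assumed threshold are all simplifying assumptions made purely for illustration. The 53 and 60 dB values are borrowed from the text above, and 46 dB is a made-up under-corrected setting.

```python
def perceived_db(level_db, true_threshold_db, ceiling_db=100.0, exponent=0.5):
    """Toy recruitment curve: silent below threshold, very steep just above
    it, and rejoining normal loudness at the ceiling level."""
    if level_db <= true_threshold_db:
        return 0.0
    span = ceiling_db - true_threshold_db
    x = min((level_db - true_threshold_db) / span, 1.0)
    return ceiling_db * x ** exponent


def prewarp_db(level_db, assumed_threshold_db, ceiling_db=100.0, exponent=0.5):
    """Inverse of the toy curve, computed for the threshold we *believe*
    the listener has (a stand-in for the vTuning guess)."""
    x = min(max(level_db, 0.0), ceiling_db) / ceiling_db
    span = ceiling_db - assumed_threshold_db
    return assumed_threshold_db + span * x ** (1.0 / exponent)


TRUE_THRESHOLD = 53.0  # hypothetical listener
for assumed in (53.0, 60.0, 46.0):  # correct, over-corrected, under-corrected
    row = [perceived_db(prewarp_db(lvl, assumed), TRUE_THRESHOLD)
           for lvl in (10.0, 20.0, 40.0, 80.0)]
    print(assumed, [round(v, 1) for v in row])
```

With the assumed threshold equal to the true one, the printed row reproduces the input levels (the diagonal line). With the assumed value too high, every entry sits above its input, most dramatically at the faintest levels. With it too low, the faintest inputs drop to zero, i.e. they never clear the real threshold at all.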

The purpose of the Dual-Engine mode in Crescendo is to pre-boost the lowest level sounds into a range where the recruitment slope is less steep. That gives us some degree of forgiveness when our vTuning is slightly in error. Sounds won’t just toggle from inaudible to substantially audible anymore. They will fluctuate as they will, and we will hear a smoother fluctuation in those sounds down near the lowest levels in the recording.

Once you reach sound levels below 40 dBA SPL, you are hearing the room noise. An empty auditorium has an ambient noise level of around 30 dBA SPL. As I sit here in my lab, the sound level meter just now shows around 30 dBA SPL during a particularly quiet period. Normally, with the A/C running it is more like 45-55 dBA SPL. It is doubtful that you really want to bring up that room noise to prominent levels where it could compete with your music.
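For a rough feel of how room noise and quiet music interact, uncorrelated sounds add as powers, not as dB values. The sketch below uses that standard power-sum rule. The helper name and the music levels fed into it are assumptions chosen for illustration, and it glosses over the difference between A-weighted and unweighted levels.

```python
import math


def combine_db(*levels_db):
    """Power-sum of uncorrelated sound sources whose levels are given in dB."""
    total_power = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_power)


# Hypothetical quiet passage at 30 dB in a 30 dBA room, then a 25 dB
# passage with the A/C running at 45 dBA.
print(round(combine_db(30.0, 30.0), 2))
print(round(combine_db(25.0, 45.0), 2))
```

When the music is at the same level as the room, the total only rises by about 3 dB, and when the room is 20 dB louder, the music adds practically nothing to the total.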

So that’s the story on getting your vTuning correct. Deliberate under-correction would show just the opposite effects, where nothing gets quite as loud as it needs to be. There is probably about a +/-3 dB range around the correct setting that is acceptable.

But… now we get into other complicating factors… I was just listening this morning to a track by Nacho Sotomayor where he had some percussion that seemed overly prominent to me throughout the entire piece. Surely the mastering engineer would have caught that, so it must be in my own ears.

I thought for a while… maybe I should revise the vTuning curve to level out above some high frequency, rather than continuing to grow as it now does. But dragging out the compiler and making code changes is a pretty serious endeavor. So before doing that, I tried a little experiment with an EQ on the input, to search for what frequency range was really causing all the trouble.

And to my total surprise, it was not at the highest frequencies where I originally thought I should change the code in the algorithm. The trouble frequencies were at the rather lower frequency range of 2 kHz !! Right down on top of my own band of hyper-recruitment. So glad I decided to try the pre-EQ before cutting into the code in the wrong place.

Remember that hyper-recruitment is the condition where sounds grow louder much more quickly with increasing sound level than they should, to the point where sounds can become uncomfortable at levels where everyone else is happy. If your audiology report shows a diminished maximum comfortable level, then it is a hint that you might have hyper-recruitment too. But sadly, there is no audiology test to be found at your local shop that can map out your hearing in such detail.

Happily, Crescendo can correct for hyper-recruitment. All it takes is a pre-EQ notching at those frequencies where your hyper-recruitment shows up. That is the correct way to handle this condition. But it takes trial and error to find that frequency band, and the proper depth of pre-EQ notching.
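As an illustration of what such a pre-EQ notch could look like, here is a sketch of a standard peaking-EQ biquad (the widely used "Audio EQ Cookbook" form) run with negative gain. This is one common way to build a notch-style cut, not necessarily the filter used in front of Crescendo. The 2 kHz center comes from the text above, while the 6 dB depth, the Q of 2, and the 48 kHz sample rate are assumed values that would have to be found by ear.

```python
import cmath
import math


def peaking_eq_coeffs(f0_hz, gain_db, q, fs_hz):
    """Biquad coefficients (b, a) for a peaking EQ; negative gain_db cuts."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    # Normalize so that a[0] == 1, the usual convention for filter routines.
    return [x / a[0] for x in b], [x / a[0] for x in a]


def gain_db_at(b, a, freq_hz, fs_hz):
    """Magnitude response of the biquad at one frequency, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


# Assumed settings: 6 dB cut at 2 kHz, Q = 2, 48 kHz sample rate.
b, a = peaking_eq_coeffs(2000.0, -6.0, 2.0, 48000.0)
for f in (500.0, 1000.0, 2000.0, 4000.0, 8000.0):
    print(f, round(gain_db_at(b, a, f, 48000.0), 2))
```

The trial-and-error part is then a matter of nudging the center frequency, depth, and Q while listening, and checking the printed response to see how wide the cut really is.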

So I mused for a while, after fixing things for myself on Nacho’s track. I decided to build a swept pink-noise source that would repeatedly sweep across a range of frequencies so that I could listen and tweak the EQ until things sounded mostly correct to me. I made the bandwidth of the noise just slightly narrower than a typical critical band of hearing, which is around 1/4 octave in the 1-4 kHz range. That’s when I decided I should share all this information about artifacts with you readers.
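A sweep like that can be planned as a series of narrow bands whose center frequencies step logarithmically across the range of interest. The sketch below only computes the band edges and does not generate any audio. The function name, the 500 Hz to 8 kHz range, the step count, and the 0.2-octave width (standing in for "slightly narrower than 1/4 octave") are all assumptions for illustration.

```python
def sweep_bands(f_start_hz, f_stop_hz, steps, width_octaves=0.2):
    """Return (low, center, high) band edges in Hz for a log-spaced sweep."""
    if steps < 2:
        raise ValueError("need at least two steps to define a sweep")
    half_width = 2.0 ** (width_octaves / 2.0)
    ratio = f_stop_hz / f_start_hz
    bands = []
    for i in range(steps):
        center = f_start_hz * ratio ** (i / (steps - 1))
        bands.append((center / half_width, center, center * half_width))
    return bands


# Hypothetical sweep from 500 Hz to 8 kHz in 9 steps (half-octave spacing).
for low, center, high in sweep_bands(500.0, 8000.0, 9):
    print(round(low), round(center), round(high))
```

Each band could then be filled with filtered noise and played in turn, which keeps the bandwidth constant in octaves as the sweep moves up in frequency.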

When I listen to this swept noise, in frequency regions below and above my 2 kHz trouble zone, the sound is a smooth hiss. But as it crosses my 2 kHz range, through Crescendo without any notching pre-EQ, it begins to sound very harsh and rough. Crackly, even. So I adjusted the pre-EQ until the notching made the sound much less troublesome. Not quite fully smooth hissing, but much much better than without the pre-EQ notch filter.