Suppress annoying background music and noises on TV

The classic TV and streaming situation: you're watching an exciting film and the finale is approaching. More and more action, music and sound effects pile on top of the actors' dialogue, and then it happens: the key scene plays, everything is explained, every mystery is solved except the most important one: what did the actors just say? Because the background music sits right on top of the dialogue, you are left on the outside. You heard plenty of sound, but nothing you could understand beyond the musical noise.

This is becoming increasingly annoying for TV and streaming viewers. They are at the mercy of this overlap of music and speech, and even turning the volume up with the remote control does not solve the problem. The dialogue does not become any easier to understand, and it is no good for the neighbors or sleeping children either: as soon as the next unexpected explosion hits, the speakers practically explode too, and you cannot turn the volume down fast enough to keep the peace.

But why does background music in films or documentaries so often feel too loud and drown out the speech? And why are old films, for example from the 1960s, much easier to understand than today's productions with all their modern technology?

The reason lies in the type of production and the subsequent distribution via different channels with different requirements, such as TV, streaming and media libraries via the Internet.

Looking back: in the past, for example in the 1960s, film sound was mixed in mono. Mono means that there was only one audio channel, and everything had to fit into it: speech, music and sound effects. The advantage of mono is that the level relationships in such a mix sound very similar on most speakers to what the producer heard in the studio. So if the speech in this one channel was mixed clearly louder than the music and effects, it was also clearly louder on most televisions and radios in people's living rooms. The focus on a single channel kept everything well organized. The range of playback devices was also easier to predict, since there were televisions and radios but no internet or streaming, and those devices were technically more similar to one another than most devices are today.

A second reason for the easier-to-understand TV sound was that sound production was very complex and expensive at the time, so less music and fewer sound effects were generally used than today. Speech therefore stood on its own more often, which of course maximized its intelligibility.

Today's sound productions, especially in the film sector, are designed for multi-channel playback. Film sound can now be reproduced via up to 64 output channels, i.e. loudspeakers. The aim of these many channels is to let the audience locate each sound as precisely as possible in the cinema. In the past, sound came only from the front, in mono from the center. Today it can be placed almost anywhere in the room, on both the horizontal and the vertical axis: front, back, at ear level or even overhead.

In the cinema, this can be impressive if you sit on the "Golden Seat", usually in the middle of the cinema. All other seats are compromises, but they still allow enough spatial positioning to give you the feeling of being surrounded by sound.

However, today there is a huge imbalance between speech, music and sound effects. Speech still comes from the front and center, with a few exceptions, for example when someone comes through a door from behind and says "hello" while off screen. Music and sound effects, on the other hand, can come from all directions: music is often placed equally at the front and back, while sound effects move freely through the listening space depending on which on-screen objects they belong to.

This imbalance between the speech, which sits in one output channel, and the music and sound effects, which share the remaining 63 channels, takes its toll at home. After all, who wants to live in a cinema with a correspondingly large sound system?

To keep things cozy, the speakers for TV and streaming in your living room or bedroom are much smaller, and their sound usually comes only from the front, as with soundbars. They also have far fewer output channels than a large cinema sound system. So how do the many channels from the cinema get through to the much smaller TV sound system at home?

This is where the so-called downmix comes into play. Many channels are reduced to fewer by adding them together in specific ratios. For example, 64 channels become two on stereo speaker systems or six on 5.1 systems. Speaker systems come in various quality classes, but even a more professional Dolby Atmos home cinema system in the 5.1.4 configuration, which already has 10 output channels, still has to fold up to 54 additional channels into those 10.
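The idea of adding channels together in specific ratios can be sketched for the simpler 5.1-to-stereo case. This is only a minimal illustration, not how any particular decoder folds down a 64-channel cinema mix: the -3 dB gains follow the commonly cited ITU-R BS.775 stereo downmix, the LFE channel is simply dropped, and the function name and sample values are made up for the example.

```python
import math

# Assumption: -3 dB (1/sqrt(2)) gain for center and surround channels,
# as in the commonly cited ITU-R BS.775 stereo downmix.
MINUS_3_DB = 1 / math.sqrt(2)


def downmix_51_to_stereo(frame):
    """Fold one 5.1 sample frame into a (left, right) stereo pair.

    frame is a dict with the keys L, R, C, LFE, Ls, Rs.
    The LFE channel is ignored in this sketch.
    """
    left = frame["L"] + MINUS_3_DB * frame["C"] + MINUS_3_DB * frame["Ls"]
    right = frame["R"] + MINUS_3_DB * frame["C"] + MINUS_3_DB * frame["Rs"]
    return left, right


# Hypothetical sample values: dialogue sits in the center channel (C),
# music and effects occupy the other channels.
dialogue_only = {"L": 0.0, "R": 0.0, "C": 0.5, "LFE": 0.0, "Ls": 0.0, "Rs": 0.0}
busy_scene = {"L": 0.4, "R": 0.4, "C": 0.5, "LFE": 0.3, "Ls": 0.4, "Rs": 0.4}

print(downmix_51_to_stereo(dialogue_only))
print(downmix_51_to_stereo(busy_scene))
```

In the quiet frame the stereo output consists of dialogue only. In the busy frame the same dialogue contributes only about a third of each output sample, and the sum exceeds 1.0, which is why real downmixes also need headroom or attenuation.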

You can guess what happens in the downmix: lots of effects tracks meet lots of music tracks, and both meet a single speech track. The ratio of speech to everything else is one to as many as 63. That can work well if the speech is relatively alone, with little background music and few sound effects. But as more and more playback channels carry music and effects, things get tight for the speech, and in the end you can no longer make out the words. This affects all home sound systems, but especially the small speakers most commonly used for television, such as soundbars. It is worst of all on the TV's built-in speakers, so adding a soundbar is a good idea in any case to improve the TV sound.
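How quickly the speech loses ground can be estimated with a little arithmetic. The sketch below assumes that the background channels are uncorrelated, so their powers add, and that each one plays at the same level relative to the speech channel. Both are simplifications, and the one-to-63 case is an upper bound rather than a typical mix; the function name and parameters are hypothetical.

```python
import math


def speech_to_background_db(n_background, level_db_rel_speech=0.0):
    """Speech-to-background power ratio in dB after summing the channels.

    Assumes n_background uncorrelated channels, each at
    level_db_rel_speech dB relative to the single speech channel.
    """
    if n_background < 1:
        raise ValueError("need at least one background channel")
    # Powers of uncorrelated signals add; speech power is the reference (1.0).
    background_power = n_background * 10 ** (level_db_rel_speech / 10)
    return -10 * math.log10(background_power)


for n in (1, 5, 9, 63):
    print(n, round(speech_to_background_db(n), 1), "dB")
```

With 63 equally loud background channels the speech ends up roughly 18 dB below the combined background. Even if each background channel is mixed well below the speech, the sum of many of them can still rise above it.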

The problem of incomprehensible dialogue is made worse by the ways in which the downmixed content is distributed. Streaming providers further limit the sound quality because of high transmission costs, and content from media libraries is often played back on the smallest speakers of all, namely laptop or PC speakers. Here too, an external sound system is recommended so that the sound is not degraded even further.

The problem of dialogue disappearing behind overly loud background music and sound effects is well known. Ideally, every film would be remixed specifically for playback on fewer channels, but this is not done for cost reasons. The good news is that there is finally a solution to the problem. It is called the HDSX TV Sound Optimizer.

The HDSX TV Sound Optimizer produces consistent volume and clear speech on all channels. It improves the listening experience for TV, streaming and gaming. The heart of the device, which is just the size of a palm, is a powerful DSP chip that contains the patented HDSX technology. Two algorithms play the main role here:

HDSX.volume intelligently adjusts the different volume levels within different film scenes in real time so that the sound remains exciting to hear, but the constant "too loud, too quiet" situations are completely eliminated. The remote control is no longer needed for this.
HDSX.speech lifts the dialogues out of the background music and at the same time frees them from any potentially overlapping noises, so that the intelligibility of words is drastically increased. Together with HDSX.volume, this is the ideal combination to finally hear clear speech again.
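To make the idea of automatic volume adjustment more concrete, here is a generic gain-leveling sketch. It is explicitly not the patented HDSX algorithm, whose details are not described here. It only shows the general principle of measuring the level of short audio blocks and steering them toward a target level; the function name, the parameters and their default values are all assumptions chosen for illustration.

```python
import math


def level_blocks(blocks, target_rms=0.1, max_gain=4.0, smoothing=0.5):
    """Generic block-wise leveler (illustrative only, not HDSX).

    blocks:     list of non-empty lists of samples
    target_rms: level each block is steered toward
    max_gain:   upper limit so near-silence is not boosted endlessly
    smoothing:  0.0 = jump to the new gain at once, values near 1.0 = change slowly
    """
    gain = 1.0
    leveled = []
    for block in blocks:
        rms = math.sqrt(sum(s * s for s in block) / len(block))
        # Keep the previous gain for silent blocks to avoid dividing by zero.
        desired = min(max_gain, target_rms / rms) if rms > 0 else gain
        gain = smoothing * gain + (1 - smoothing) * desired
        leveled.append([s * gain for s in block])
    return leveled


# Hypothetical input: a quiet dialogue block followed by a loud explosion block.
quiet = [0.01, -0.01, 0.01, -0.01]
loud = [0.5, -0.5, 0.5, -0.5]
print(level_blocks([quiet, loud], smoothing=0.0))
```

With smoothing set to 0.0, the quiet block is raised by the maximum gain and the loud block is pulled down to the target level. A larger smoothing value makes the gain change more gradually, which trades reaction speed for fewer audible jumps.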

The optimizer is simply inserted into the signal path between the television and an external sound system, such as a soundbar, hi-fi system or headphones, and ensures that the volume is even and speech is clear on all channels. There are also no longer any differences in volume between different sources, such as ARD, YouTube, Amazon Prime, Disney+ and Apple TV. All sources are brought to the same level.

The optimizer uses the uncompressed PCM 2.0 format as source material for the best sound quality and converts it into an immersive two-channel signal called HDSX.TV. This signal also drives downstream surround decoders, so that consistent volume and clear speech are achieved at all times. The miracle box works with all televisions and all TV boxes. The optimizer is available in two connection variants: HDMI ARC and TOSLINK, the most common digital audio connection using optical signal transmission. This means there is an optimizer model to fit every television and every TV speaker system.
