Welcome to the fourth episode in our look at technology advancements in immersive live-sound systems. In last month’s issue (FRONT of HOUSE, Dec. 2019, page 36) we covered how turnkey immersive, LCR (Left, Center, Right) and stereo live-sound systems differ in cost, with samples of sound coverage and max SPLs. Certainly, audio imaging can be improved with a greater number of immersive arrays across a stage. However, we wanted to investigate whether half-sized line arrays or point-source arrays can provide the same max SPL and long-throw coverage as large stereo line arrays, and whether surround loudspeakers are always required.
Based on last month’s discussions with leading immersive system vendors, there are several array and DSP options. But at this time, L-Acoustics’ Soundvision software is the primary sound modeling program that can simulate the multi-array coverage of an immersive design. Fig. 1 shows the L-Acoustics main arrays in use for an alt-J performance at London’s Royal Albert Hall; such a design (with seven large line arrays) seems rather typical of an L-ISA Immersive system. The Oct. 2019 issue of FRONT of HOUSE (page 42) showed how EAW’s ADAPTive series could provide a system design solution with shorter line arrays, offering fewer interference issues with visual elements. In the Dec. 2019 issue, we also covered how immersive live sound is at least as much about superior sound coverage, better source localization and time alignment across a venue as it is about surround sound. Lastly, we noted that immersive sound is very flexible — object-based rather than channel-based — providing better coverage and source localization with less than 180° of immersion (across a proscenium/stage) or up to a full 360° of surround immersion (overhead surround loudspeakers are less common), and allowing for stereo downmixing as well.
Further interactions with additional immersive vendors, and another great-sounding demo of a large-scale immersive system with EAW ADAPTive line arrays and Flux SPAT software (running on a Mac), have revealed that some immersive DSPs are dedicated hardware — much like a DriveRack DSP — while other immersive processors consist of an upgradable immersive (surround) sound application running on a high-powered PC/workstation. But be careful if you intend to use a generic PC, as some large (high channel-count) immersive live-sound systems require a very high-powered PC/DAW, including specialized cards and an industrial OS, for smooth, crash-free operation. Note that many of the currently available immersive programs are actually intended for the offline editing of live theater sound FX, film soundtracks (like Dolby Atmos), dance clubs and gaming/VR applications, where processing latency (delay time) does not matter and fault tolerance is not as critical. It’s a nuisance when an audio editing workstation crashes in a studio, but the same situation creates a crisis during a live show or worship service. Sound reinforcement applications require a reliable digital sound system with a high-power DSP, dedicated network, redundant cables and power supplies, redundant immersive live-sound-specific software, and a user interface that’s suitable for a live performance.
Real-time talent tracking is another option, for automatically tracking sound sources onstage. However, as with any emerging technology, designing a DIY immersive system and programming immersive productions involve a significant learning curve, so plan well in advance for any such projects. Fortunately, several makers of immersive systems, along with DIY components and software, are working to soon offer more optimized solutions for the live immersive markets.
Categories of Immersive Systems
In the evolution of immersive post-production and live-sound systems, there are many different types of systems and ways of sourcing them. We propose the following categories for the various immersive products/methods:
- A-type: Analog I/O (in/out) specialized immersive DSP (digital signal processor), bundled with third-party components.
- B-type: Branded single-source solutions, including everything (except control) from the major manufacturer.
- C-type: Collaborative or strategic partnerships between key hardware and software manufacturers.
- D-type: DIY (do-it-yourself) DAW (digital audio workstation) used in third-party immersive solutions.
- E-type: Electronic-Acoustic Systems are more for simulating classical concert halls.
- F-type: FX-surround systems that are played back in cinemas and live theater.
- I-type: In-Ear immersive DSP can be added to a live-sound system.
- M-type: Mixing consoles that can control third-party immersive DSP.
- N-type: Networked redundant switches for multi-media systems.
- P-type: Post-production software only for immersive playback.
- S-type: Small-format surround systems for themed playback.
- T-type: Talent tracking systems — optional accessories.
First, the A-type: specialized analog-I/O immersive DSPs are bundled — by a designer/integrator who is expert in immersive systems design and installation — with third-party components. Outboard’s TiMax DSP pioneered dynamic delay-matrix localization and immersive audio processing with object-based spatial audio. Dolby Labs has yet to announce whether it will expand Atmos beyond cinemas and clubs with a live-sound control and DSP variant.
B-type (branded) immersive solutions were covered last month. These immersive live-sound systems are essentially single-source (turnkey) solutions, including everything from the immersive processor to the routers, amplifiers and loudspeakers. They provide pre-tested and specialized components, software and tech support, with the inherent cost premium and product-choice limitations (much like how Apple tests Mac apps for stability on its computers). Branded turnkey systems are currently available from d&b audiotechnik and L-Acoustics, with Meyer Sound working to adapt its live-theater surround system for concert sound use.
C-type immersive solutions come from strategic partnerships — built between key hardware and software manufacturers — that collaborate on a packaged solution supported by the partnership, typically through specially trained dealers. Such specialized immersive processor makers include Astro Spatial Audio and Flux (SPAT), partnered with Alcons, Clair, EAW, Martin Audio and other amplifier and loudspeaker suppliers.
D-type are DIY immersive systems built around a third-party or “roll your own” DAW (digital audio workstation), with possible savings, many component and software (plug-in) options, and all the associated risks mentioned above. DIY immersive systems can be built from a variety of hardware and software combinations. (For software sources, see P-type and the quote from Dr. Caulkins to follow.)
E-type: Electronic-acoustic systems simulate classical concert halls; such acoustic/hall simulation systems (with acoustic gain and reverberation extension) are available from E-coustic Systems, Meyer Sound and SIAP.
F-type: FX-surround sound systems are for playback in cinemas, clubs and live theater; such surround components, software and systems are available from Dolby, Envelop, Figure 53, Meyer Sound and Outboard.
I-type: In-Ear immersive DSP from KLANG Technologies can be added to a live-sound stage monitoring system.
M-type: Mixing consoles that can control third-party immersive DSP are currently available from Avid, DiGiCo and Lawo. This integration is offered with Avid’s S6L console via plug-ins, or via generic OSC commands on consoles that offer OSC control, such as the DiGiCo SD series integration with KLANG. The power of OSC means that any third-party system with this capability can potentially be part of the immersive audio control (see the sketch following this list).
N-type: Network switches (with redundancy) for live-sound and media systems are available from Luminex.
P-type: Post-production software programs for immersive playback include: Ambisonics, Dolby Atmos, Envelop, Flux SPAT, Ina GRM and many more. Avid Pro Tools is a leader in DAW hardware/software for recording/post applications. For more on software programs for immersive sound see the following quotes and sidebar articles.
S-type: Small-format immersive systems (post control/source) for presentation-system or theme-park playback are available from ImmerGo (compact speakers with amplifier and digital network input).
T-type: Talent tracking systems by BlackTrax, Stagetracker and Zactrack, automatically track onstage sources.
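To show how simple OSC control of an immersive engine can be, here is a hypothetical Python sketch using the open-source python-osc library. The “/source/1/xy” address pattern and normalized coordinates are invented for illustration; every immersive processor publishes its own OSC namespace, so consult the vendor’s documentation for the actual messages.

```python
# Hypothetical sketch: panning an immersive "object" via OSC from any
# OSC-capable controller (console macro, show-control PC, etc.).
# Requires the python-osc package (pip install python-osc).
# The "/source/1/xy" address and normalized coordinates are invented;
# consult your immersive processor's OSC documentation for its actual
# namespace and argument ranges.
import time
from pythonosc.udp_client import SimpleUDPClient

PROCESSOR_IP = "192.168.1.50"   # example IP of the rendering engine
PROCESSOR_PORT = 9000           # example OSC listening port

client = SimpleUDPClient(PROCESSOR_IP, PROCESSOR_PORT)

# Place object 1 at a fixed position on a normalized XY stage grid.
client.send_message("/source/1/xy", [0.25, 0.80])

# Sweep the object across the stage over roughly two seconds.
for step in range(40):
    x = step / 39.0             # 0.0 (stage left) to 1.0 (stage right)
    client.send_message("/source/1/xy", [x, 0.80])
    time.sleep(0.05)
```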
For an updated list of the many manufacturers (and their web links) of immersive systems and DIY components, along with more on immersive live-sound systems, visit www.immersive-pa.com.
Last month, we focused on single-source immersive sound systems designed by d&b audiotechnik and L-Acoustics, so this time we spoke with a few of the leading DIY (third-party or à la carte) hardware, software and partnered immersive sound system vendors about challenges in deploying immersive systems.
Outboard TiMax
In terms of the TiMax DIY approach to immersive live audio, “TiMax originated dynamic delay-matrix localization and immersive audio with object-based spatial audio and mixing from the early 2000s, and the evolved TiMax SoundHub platform integrates a total package of immersive spatial rendering and system management with integral show control and playback,” explains the company’s Dave Haydon. “It is inherently speaker-type-agnostic and also less subject to rigid system geometries and topologies than some algorithm-based devices and derivatives. TiMax is much more of an immersive audio Swiss Army knife, which, as leading sound designers have observed and demanded, is often essential to creatively address the more challenging immersive demands across performance, presentation and experiential sectors.
“The TiMax HARDCore spatial rendering workflow is centered on the PanSpace tools,” Haydon continues, “which allow users to instantly auto-calculate 3D spatial mapping of an area from an imported drawing of the space with speakers and ImageDefinition objects dropped on it. Editable movement trajectories are then drawn on the PanSpace and managed in a linked TimeLine, and for live vocal localization and effects, source objects can be controlled via OSC by TiMax TrackerD4 Precision Stagetracking or BlackTrax. Unique to TiMax is the option to get involved under the hood to an infinitely variable degree with the individual level/delay parameters and adaptive coefficients, which render the Image Definition spatial objects. This creates a comfort zone for leading and experienced sound designers of the more challenging and complex immersive projects, for which closed ‘black box’ spatial devices can be less revealing and flexible about what’s going on and how to deal with it.” See Fig. 2.
Eastern Acoustic Works
“EAW believes there are strong leaders in immersive panning engines and software and we plan to support our third-party friends by improving and developing our ADAPTive line of speakers,” says EAW’s John Mills about the company’s DIY approach to immersive audio. “We believe our ADAPTive method of shaping our wave front to match the geometry of the room is a critical listener’s dream. Using our Resolution software package, you draw the 3D planes of your room and it calculates and deploys a custom wave front that specifically matches the audience planes of your space. It matches not only frequency and amplitude but also phase and impulse response at every seat from front to back of the audience area. With mechanically splayed line arrays, it is well known that anything over about 3 to 4 degrees falls apart in phase and impulse. So, the most important seats in the front of the venue are often tonally a bit different than at FOH, where the engineer is making all the most important decisions of the mix.
“ADAPTive technology overcomes this and delivers an identical sonic experience in every seat in the house; the full-range systems create their coverage flexibility by choosing where, in 3D space, the acoustic sources will create summation,” Mills continues. “Resources are simply not allocated to areas where coverage is not desired. This approach can only be achieved by the tightly spaced, high-resolution array of devices in ADAPTive products. For instance, a single ANYA cabinet has 22 drivers, DSP and amp channels which are all manipulated independently. By focusing on summation, rather than destructive interference, the natural impulse purity inherent to the acoustic design is maintained. This ensures a level of sonic consistency inside and outside of coverage that is unequaled by line source technology.”
Flux Audio
Hugo Larin explained his company’s approach: “Spat Revolution is a software engine running on standard computer hardware [see sidebar for more details] dedicated to offline show content creation workflows and real-time live applications. FLUX:: has been a software development partner of the French research institute IRCAM (www.ircam.fr) since 2008, and Spat Revolution is the result of decades of research and achievements,” he notes.
“Many of these technologies have been successfully deployed in live sound installations with products including Spat in Max/MSP, Panoramix, the legacy Spat audio plug-in and, most recently, Spat Revolution. The FLUX:: and IRCAM cooperation offers a variety of spatial audio techniques to users and designers, sharing a vision of open development. Behind these various spatialization and audio panning techniques is the desire to offer creativity, flexibility and the ability to adapt to each application and creative challenge, whether sweet-spot-centric, live performance or installation-based, and regardless of where the audience may be distributed.”
As an alternative, Larin adds, “For those not inclined to DIY system design, hardware can be specified (and provided) by integrators or via the FLUX:: Immersive Consulting Group, offering a range of services for system deployment and integration. Channel count, sample rate, audio distribution method and any required third-party control integration are basic parameters when defining proper system design elements. To support live production deployments, technical services are available from the FLUX:: Immersive Consulting Group, ranging from fully configured hardware/software system packages based on specific project requirements, to pre-design, guidelines, deployment, system tuning, commissioning and training.”
Astro Spatial Audio
Bjorn van Munster notes that Astro does not pursue a DIY approach, but rather a strategic collaboration business model. Fig. 3 shows how Astro Spatial Audio routes systems through their immersive processor (a.k.a. Rendering Engine). “Our aim is to reach as large an audience as possible to allow them to have the full Astro Spatial Audio experience for live performances. In order to achieve this, we deviate from the traditional business model and work with Solution Partners and Manufacturing Partners. A Solution Partner is a partnership with a company in a defined territory. This company is trained by us, can deal with all brands of equipment and invests in a demo system [with our ASA engine]. Additionally, we have manufacturing partnerships. This is a strategic partnership with pro audio manufacturers who can sell our product to the market directly via their distribution network as a packaged solution, not OEM. For example, Martin Audio markets our collaboration under ‘Sound Adventures,’ which is simply a packaged solution of Martin Audio speakers with ASA engine(s); we are also working like this with brands like Adamson, Clair, Alcons, etc.,” van Munster explains.
“The manufacturing partners create a large footprint for us and we make sure that our partners are well trained and informed about our equipment. Additionally, they all benefit from each other’s experiences. So instead of DIY, I would rather say it is a very keen and strategic collaboration between leading manufacturers in the industry, whereby we enhance our joint solution, but individually stay focused on our core business.” With reference to array coverage modeling software (discussed last month): “We are able to export the results and then in ASA software, we can evaluate and qualify the results, by making histograms.” Fig. 4 shows a small immersive presentation system from Astro Spatial Audio and Martin Audio.
Arup
Lastly, we recently had an extensive conversation about DIY immersive pro audio with Terence Caulkins, PhD, an acoustics and AV researcher, interactive technologist and all-around immersive sound designer extraordinaire, now an associate consultant with NYC-based Arup. As he has been on the development team of several immersive pro-audio products/projects, and has also produced immersive sound for several installations around the world over the past decade, I asked Caulkins to elaborate on the subject for our readers.
“DIY solutions for immersive audio design have existed for decades, though until now, they were mostly used by a niche group of researchers and sound artists around the world. With today’s large tech companies pushing VR and AR, there are many readily available DIY solutions for immersive audio today beyond the solutions already mentioned above. This includes (in no particular order): IRCAM Spat and Panoramix (basis for Spat Revolution), Matthias Kronlachner’s ambiX, Aalto SPARTA, WigWare, Blue Ripple Sound, Ambisonics Toolkit, Facebook 360 Spatial Workstation, IEM Graz plug-in suite and many more. Each tool is more or less easy to use and configure for non-expert users. Each rendering algorithm has its own specificity, sound quality and intended use. Up until recently, live immersive playback events have been a relatively niche occurrence — as such, most DIY software solutions have yet to be extensively road tested for large scale live sound applications.”
Caulkins notes that keeping focused on the basics is critical. “In a large venue / live sound context, your main priorities should be the quality of sound produced by the sound system (timbre and time alignment), SPL, evenness of coverage over the audience area, latency of the live sound, and stability of the system. If your immersive DIY system misses the mark on any of these points, you have a big problem. All the rules of typical system design apply, including loudspeaker selection, room acoustic design, and system calibration. As mentioned in a previous article [in this series], everything happening on stage should be audible everywhere in the room, which might require adding a mono mix-down bus to feed fills along the extreme sides of large, fan-shaped rooms,” he adds.
“From a stability standpoint, depending on one’s level of comfort with DIY tools, a safe route can be to combine traditional LR/LCR/turnkey systems for the live component with pre-recorded immersive cues that will move around and above the audience as another layer for the audio design. Immersive cues can be triggered by the performer via interactive OSC triggers or by the mix engineer. This was the approach I took for 3D sound design on the Hubble Cantata, a live virtual reality space opera that premiered at the BRIC Celebrate Brooklyn! festival in 2016. For this show, we transported 6,000 people into VR, supported by an immersive audio FX array surrounding the audience paired with a conventional line array setup above the stage for the live sound reinforcement (see Fig. 5). This type of approach follows the Theatrical Surround Effects Systems paradigm described in your previous article. If you are bravely inclined to go with a fully real-time DIY immersive audio workflow on PC, it is good practice to build in redundancy as mentioned earlier in this article, with a secondary identical machine running the show and receiving all the same audio inputs and control cues as your primary machine. Both machine outputs are connected to a mixer via MADI or Dante, and the secondary machine can be brought into play in case of a system crash. This workflow has been used successfully by IRCAM engineers for complex large-scale multichannel productions over many years.”
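For readers curious how that redundancy might be wired up, here is a conceptual sketch of a failover watchdog in Python. Both the OSC “/heartbeat” message from the primary machine and the OSC-controllable mixer input switch are assumptions made purely for illustration; real deployments rely on vendor-specific redundancy mechanisms.

```python
# Hypothetical watchdog for a primary/backup immersive engine pair.
# Assumes the primary rendering machine sends an OSC "/heartbeat"
# message every second, and that the mixer's A/B input selection is
# itself OSC-controllable; both assumptions are for illustration only.
# Requires python-osc (pip install python-osc).
import time
import threading

from pythonosc.dispatcher import Dispatcher
from pythonosc.osc_server import ThreadingOSCUDPServer
from pythonosc.udp_client import SimpleUDPClient

HEARTBEAT_TIMEOUT = 3.0          # seconds of silence before failing over
last_beat = time.monotonic()

def on_heartbeat(address, *args):
    # Record the arrival time of each heartbeat from the primary machine.
    global last_beat
    last_beat = time.monotonic()

dispatcher = Dispatcher()
dispatcher.map("/heartbeat", on_heartbeat)

# Listen for heartbeats in a background thread.
server = ThreadingOSCUDPServer(("0.0.0.0", 9001), dispatcher)
threading.Thread(target=server.serve_forever, daemon=True).start()

mixer = SimpleUDPClient("192.168.1.10", 9000)   # hypothetical mixer address

failed_over = False
while not failed_over:
    if time.monotonic() - last_beat > HEARTBEAT_TIMEOUT:
        # Primary is presumed crashed: switch the mixer to the backup
        # machine's MADI/Dante return (hypothetical OSC address).
        mixer.send_message("/input/select", "backup")
        failed_over = True
    time.sleep(0.25)
```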
Whether or not you are pre-rendering your immersive audio, the simplicity of the user interface is a key factor in the live sound world, says Caulkins. “There is a lot of room for improvement on this topic in the DIY world, where the user interface design of many systems is exceedingly complex, designed for PhD students and not touring audio engineers. The creation of simple, robust interfaces with easy-to-understand controls is a must if DIY tools are to be adopted on big-venue production jobs, where engineers are working to satisfy the needs of big-name artists and executives, with little time for experimentation and/or improvisation.”
According to Caulkins, “another key issue lies in the transcoding of content from one immersive audio system to another. While some touring artists can afford to transport multiple proscenium arrays, the cost and time constraints associated with moving a fully immersive sound system — above, behind and all around the audience — make it unlikely to be within reach for most productions anytime soon. Having the possibility to tap into a host venue’s immersive audio system as an added layer is a desirable prospect for the future of touring shows. This may mean transcoding from one system architecture to another. This work is underway, under the umbrella of MPEG-H 3D audio, as mentioned in a previous article. If some of the turnkey and DIY solution developers currently out there can work together and adopt this MPEG-H 3D format, it will pave the way for the standardization of immersive audio in the live-sound industry.”
Despite any challenges, the benefits are great. “The cost and complexity of producing fully immersive audio shows for live sound is high, but immersive provides an incomparable level of clarity, transparency, spaciousness and envelopment, compared to conventional stereophonic deployments. As many have already said in this article series, it’s safe to say that audiences will increasingly be demanding this level of experience as more venues and tours become equipped with these technologies.”
Conclusions
There are several sources of immersive sound programs, but most of the currently available DIY immersive software programs are actually intended for the offline editing of immersive mixes and are not yet proven for live sound. Fortunately, several makers of immersive systems are working to soon offer more optimized immersive mix software (with real-time editing) and intuitive control interfaces for the concert and church sound markets. On the hardware side, one option is to hire a sound designer and an integrator who are expert in immersive systems design and installation; alternatively, work with a manufacturer (turnkey or a collaboration) who can design, supply and support a complete immersive system. There is a wide range of immersive system components and system approaches. Clearly, there are many different ways of creating an immersive production and sourcing an immersive system, so, as I said earlier, plan well in advance for immersive projects. See the sidebars for further details on Mac/PC hardware specifications for DIY real-time live immersive applications and on immersive audio content creation with immersive mix software such as Spat Revolution. See you next month!
David K. Kennedy operates David Kennedy Associates, consulting on the design of architectural acoustics and live-sound systems, along with engineering and market research for speaker manufacturers. He has designed hundreds of auditorium sound systems for churches, schools, PACs and AV contractors. Visit him at immersive-pa.com.
___________
Software Spotlight: Creating Immersive Audio Content with Spat Revolution
By Hugo Larin
Spat Revolution from Flux:: Engineering can run on generic hardware, offering sound designers the opportunity to start an offline conception on a local computer without needing specific hardware. Spat Revolution seamlessly integrates with a variety of DAWs and playback systems, allowing for local inter-application audio transport and automation. Compatible third-party systems include Ableton Live, Nuendo, Pro Tools, Reaper, QLab and others.
There are three plug-ins (AU, AAX and VST) in the Spat Revolution production suite: Spat Send, Spat Return and Spat Room. Users can build automation network cues, write automation of an immersive mix to a timeline or to cues, preview the results (binaurally or on a smaller-scale speaker arrangement), and build a show that’s ready to move, from creation to delivery, to dedicated live playback systems and the Spat Revolution Live Computer engine. The creation phase can be managed on a single computer, and easily migrated to a network of show controllers, audio playback, mixing desks and immersive audio engines.
Spat Revolution addresses panning over multi-channel immersive audio systems without the requirement of a closed framework, adapting easily to speaker arrangements for various productions, with spatial audio techniques and panning methods to suit different applications (binaural, high-order Ambisonics/HOA, or 2D/3D channel-based with WFS, VBAP, DBAP, KNN, SPCAP and more). Multiple virtual spaces in the software engine, called “Rooms,” let users deliver to multiple diffusion systems (virtual spaces using all or part of a speaker arrangement) and offer extensive flexibility in creating custom 2D or 3D speaker arrangements. This supports unconventional stage setups where five, seven or more speaker hangs are spread across the stage with roughly equal separation, or multiple loudspeakers in arbitrary locations. These virtual spaces each include acoustic simulations (reverb engines) to generate early reflections localized with each source, along with a reverb tail that is diffused at the outputs, creating a sense of depth and reality.
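To make one of those panning methods concrete, here is a minimal sketch of DBAP (Distance-Based Amplitude Panning), which suits arbitrary layouts because it assumes no sweet-spot geometry: each speaker’s gain falls off with its distance from the virtual source, and the gain set is normalized to constant power. This is a textbook simplification for illustration, not Spat Revolution’s actual implementation.

```python
# Minimal 2D DBAP (Distance-Based Amplitude Panning) sketch.
# Gains fall off as 1/distance^a, where a ~= 1 corresponds to a 6 dB
# drop per doubling of distance; the gain set is then normalized to
# constant total power. A simplification for illustration only.
import math

def dbap_gains(source, speakers, rolloff_db=6.0, blur=0.1):
    a = rolloff_db / (20.0 * math.log10(2.0))     # ~1.0 for 6 dB rolloff
    dists = [
        math.sqrt((source[0] - sx) ** 2 + (source[1] - sy) ** 2 + blur ** 2)
        for sx, sy in speakers                    # blur avoids divide-by-zero
    ]
    raw = [1.0 / (d ** a) for d in dists]
    norm = math.sqrt(sum(g * g for g in raw))     # constant-power constraint
    return [g / norm for g in raw]

# Five speaker hangs across a 10-meter proscenium, source near stage left:
hangs = [(x, 0.0) for x in (0.0, 2.5, 5.0, 7.5, 10.0)]
print([round(g, 3) for g in dbap_gains((2.0, 1.0), hangs)])
```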
For multiple virtual rooms in Spat Revolution, the creation phase (a studio-style scenario) can deliver simultaneous output streams by recording multiple stream formats of the same immersive mix in the DAW. For example, a sound designer can create a binaural preview for headphones while on a plane and a surround arrangement for studio work, while simultaneously creating content for the multi-speaker arrangement of the actual show itself, which may have speakers by the dozen!
Spat Revolution also provides advanced virtual source parameters from the simple (basic radiation control of azimuth, effective 360-degree pan, and basic source distance) to very complex situations where multiple perceptual factors are being controlled. See Fig. 6.
To support live production, planned software options for 2020 include: show/config modes; a snapshot system with interpolation; remote and server “renderer” modes, where dedicated computers handle processing without a GUI while remote computers handle control; and a new Wave Field Synthesis spatialization option that can be applied to a frontal system, such as seven, eight or more frontal arrays. For more info, visit fluxaudio.com.
________________
PC Hardware Requirements for DIY Real-Time Live Immersive Applications
By Terence Caulkins, PhD
In my experience, most DIY spatial audio experiences can run on any current-spec Mac or PC running a multichannel audio interface. Audio producers/researchers that I have worked with tend to prefer Macs, possibly because key DIY immersive sound software packages were historically only available for Mac computers. It’s worth noting that it was possible to run very high channel-count, real-time immersive audio experiences on PC and Mac laptops 15 years ago using, for example, IRCAM Spat in Max/MSP, despite the very low processing speeds compared to today’s machines.
Machine spec requirements for DIY will depend heavily on the complexity of your spatial audio algorithms and the quantity of associated signal processing, like delay and equalization. If you are loading up on real-time FX and instruments in your DAW (reverbs, compressors, virtual synths), then you will quickly end up using all of the processing power available on a single machine. In this case, it’s best to have a separate machine with the DAW and FX, piping audio streams over, e.g., Dante/MADI and control signals over OSC into your spatial audio rendering engine. “Turnkey” and “collaboration” solutions like d&b, L-Acoustics, Astro Spatial Audio, etc., all follow this paradigm.
Given the relatively small cost of computers compared to loudspeaker/amplifier/installation costs for immersive audio, it’s a good idea to purchase the best available processor speeds to give yourself plenty of processing headroom, especially if you are working in live conditions, where it becomes critical to minimize audio buffer sizes (to minimize system latency). The audio environments that I work in recommend at least 8GB of RAM, though my understanding is that more RAM is not necessarily helpful in most cases; this may vary depending on the software package.
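As a rough worked example of that buffer-size/latency tradeoff (generic digital-audio arithmetic, not tied to any particular engine): a buffer of N samples cannot be processed until it has filled, so each buffer stage adds N divided by the sample rate.

```python
# Generic buffer-latency arithmetic: one buffer of N samples at sample
# rate fs takes N/fs seconds to fill before it can be processed.
def buffer_latency_ms(buffer_samples, sample_rate_hz):
    return 1000.0 * buffer_samples / sample_rate_hz

for n in (64, 128, 256, 512, 1024):
    print(f"{n:4d} samples @ 48 kHz = {buffer_latency_ms(n, 48000):5.2f} ms")
# An input-plus-output chain incurs at least two such buffers, so the
# practical round-trip penalty is roughly double these figures.
```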
It’s worth noting that Wave Field Synthesis (WFS) is a much more processor-intensive technique than, e.g., Ambisonics or surround sound, as it typically requires one real-time convolution filter running for every loudspeaker output channel. As such, high channel-count Wave Field Synthesis rendering may require multiple machines. As an example, EMPAC’s 558-channel WFS array running on IRCAM Spat in Max uses two Mac Pros, one for each half of the array.
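To illustrate that per-channel cost, the sketch below computes the simplest ingredients of a WFS driving function for a virtual point source: an individual delay and gain for every loudspeaker in the array. Real driving functions add per-channel filtering, which is where the one-convolution-per-output load comes from; this is a conceptual sketch, not IRCAM’s implementation.

```python
# Minimal WFS-style sketch for a virtual point source behind a linear
# array: each loudspeaker gets its own delay (distance / speed of sound)
# and a 1/sqrt(distance) amplitude taper. Real WFS driving functions add
# per-channel (pre-)filtering, hence one convolution per output channel.
import math

SPEED_OF_SOUND = 343.0  # m/s

def wfs_delays_gains(source, speaker_positions):
    delays, gains = [], []
    for sx, sy in speaker_positions:
        d = math.hypot(source[0] - sx, source[1] - sy)
        delays.append(d / SPEED_OF_SOUND)            # seconds
        gains.append(1.0 / math.sqrt(max(d, 0.1)))   # avoid blow-up near 0
    return delays, gains

# A 16-element array at 0.2 m spacing, virtual source 3 m behind it:
array = [(0.2 * i, 0.0) for i in range(16)]
delays, gains = wfs_delays_gains((1.5, -3.0), array)
print([round(t * 1000, 2) for t in delays])  # per-channel delays in ms
```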