I really wish it were easier to do high-quality audio calls these days.
It's technically feasible; apps that can do this have existed for years[1,2,3], but they're either non-free or kludgy and unintuitive as hell.
At this point, it's definitely a UX problem, not a "we don't have the tech to do this" problem.
Analog phones in the 80s sounded better than almost anything a typical consumer is likely to interact with these days[4]. Now it's all crappy 16 kHz Bluetooth headsets, bad noise / echo cancellation everywhere, and all of it encoded with some low-bitrate Opus.
Nobody seems to care about this very much. We now have devices that can push a few hundred Mb/s over Wi-Fi, yet Bluetooth hasn't changed much in 20 years, and the audio quality is basically what it was back then.
It still positively mystifies me why the only actually lossless codec for getting audio to and from a headset or earpiece wirelessly is the extremely underadopted and proprietary aptX Lossless. I just cannot for the life of me understand why it's so difficult to push ~2.3 megabits/sec (48 kHz, 16-bit stereo listen, plus the same in mono for the mic) wirelessly in the big 2025.
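The arithmetic behind that ~2.3 Mbit/s figure is easy to check (a quick sanity calculation, not from any codec spec):

```python
# Uncompressed PCM bitrate for the headset scenario above:
# 48 kHz / 16-bit stereo playback plus a 48 kHz / 16-bit mono mic.
sample_rate = 48_000   # Hz
bit_depth = 16         # bits per sample

stereo_playback = sample_rate * bit_depth * 2   # listen path, 2 channels
mono_mic = sample_rate * bit_depth * 1          # mic return path, 1 channel

total_bps = stereo_playback + mono_mic
print(f"{total_bps / 1e6:.3f} Mbit/s")  # 2.304 Mbit/s
```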
That's a lot of software setup for something that can easily be done in hardware. I've been over-engineering things for years now.
For my mic I have an SM57 feeding a dbx 286 hardware preamp and dynamics processor, which in turn feeds my audio interface; that interface also gives me a knob to mix my mic into my headphone signal, so I get zero-latency monitoring of all the gating and compression. For the output signal I have a separate audio interface that I use with all the web calling applications; it sends the audio of everyone else on the call to another compressor that levels out their volumes. That then goes through a TC Electronic multiband compressor unit to fix the really dull mics some people have, and is finally mixed back with the stereo output of all my other applications on the audio interface.
This way I have consistent audio no matter what OS I'm booted into at that moment.
I've tried several different mics but eventually settled on a wired headset and a Revelator io44 audio interface. The latter is a goofy brick, but it has a TRRS audio jack and built-in DSP, so I don't have to fiddle with loopbacks, DAWs, and VSTs.
And if I'm not able to lug that brick I can just plug the headset directly into my laptop.
Every actor making codecs is trying to pull off an MP3 so they can extract rents from everyone else via licensing. They carpet-bomb the field with patents to prevent free codecs from succeeding.
aptX is an example of a non-free codec made in this manner :)
Bluetooth has low bandwidth: Classic Bluetooth tops out at 1 Mbps, and Bluetooth LE can do 2 Mbps. LE Audio was introduced in Bluetooth 5.2 and is starting to show up in headphones. I think LE Audio supports high-quality bidirectional streams, so that should solve the poor headset problem.
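Putting those link rates next to the uncompressed requirement worked out earlier in the thread shows why every Bluetooth audio path ends up lossy. (These are raw over-the-air rates as the comment states them; usable throughput after protocol overhead is lower still.)

```python
# Compare raw Bluetooth link rates against the ~2.3 Mbit/s needed for
# uncompressed 48 kHz / 16-bit stereo playback plus a mono mic channel.
required = 48_000 * 16 * (2 + 1)   # bits per second

links = {
    "Bluetooth Classic": 1_000_000,
    "Bluetooth LE (2M PHY)": 2_000_000,
}

for name, rate in links.items():
    verdict = "fits" if rate >= required else "does not fit"
    print(f"{name}: {rate / 1e6:.1f} Mbit/s -> uncompressed audio {verdict}")
```

Even the fastest LE physical layer falls short of the raw requirement, which is why every codec on this path is lossy to some degree.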
i'm also an audio nerd and although i do everything in the box when i'm recording, i agree completely that it's way easier to use outboard stuff for this case. i had an analog channel strip but decided to try one of the very inexpensive behringer UV1 strips with an integrated usb interface and it's been great: the gate and compressor work well, and i have a rolls parametric eq in the effects loop for high-passing and de-essing.
since it's convenient to use the headphone out on the UV1 for the headset, i use a limiter plugin in Rogue Amoeba's SoundSource to compress the output from the conferencing software we use. it's nice being able to do that per-application, since i listen to music through the headset a lot and don't want to have to take the limiter in and out.
analog headsets are so much less annoying and more flexible, huge fan
Toward the beginning of the pandemic, a friend asked me how she could use an external vocal mic and a guitar with a pickup on Zoom calls. Sounds easy, right?
But to have the amount of control a musician really wants, it turned out to be a bit more involved. Plus, when working from home for a microphone company, it’s pretty common to use a decent mic in meetings.
This post explains the setup I’ve been using for my calls.
(Don’t let the speakers fool you. Use headphones or it’ll feed back when echo cancellation is turned off!)
Here's what the setup needs to do:

- Mix the mic or other inputs going into the USB interface in the DAW
- Be able to hear/monitor the mix
- Route the output of the DAW to a Zoom call
- Be able to hear the far end of the call through the same headphones as the mix monitoring
Get BlackHole
The key ingredient here is BlackHole, a virtual audio driver that acts as a passthrough from each input to the corresponding output[1]. This actually needs two instances of BlackHole because Zoom can only send and receive from the first two channels of any audio interface. Fortunately, they offer direct downloads (email required) of each (and have nice instructions for building from source). I have one called BlackHole 16ch and one called BlackHole 2ch, which — surprise — have 16 channels and 2 channels, respectively.
The 16-channel BlackHole device will function as the Zoom speaker; the 2-channel BlackHole will be the Zoom “microphone”.
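Conceptually, a loopback driver like BlackHole just hands every buffer an application writes to the device's "output" back out as the device's "input", on the matching channels. A toy model of that behavior (pure Python, no audio APIs, names are mine for illustration):

```python
from collections import deque

class LoopbackDevice:
    """Toy stand-in for a virtual passthrough audio device: frames
    written to the output side are read back, in order, on the input
    side. Real drivers do this with ring buffers at audio rate."""

    def __init__(self, channels):
        self.channels = channels
        self._buffer = deque()

    def write(self, frame):
        # frame: one sample per channel, e.g. what Zoom sends
        # to the device it thinks is a "speaker"
        assert len(frame) == self.channels
        self._buffer.append(frame)

    def read(self):
        # what a DAW track recording from this device receives
        return self._buffer.popleft()

# e.g. the 2-channel instance carrying the DAW's master bus into Zoom
dev = LoopbackDevice(channels=2)
dev.write((0.25, -0.25))
print(dev.read())  # (0.25, -0.25)
```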
Set up an aggregate audio device
Reaper will handle all of the audio routing, but since it doesn’t support having different input and output devices, the first thing to do is create an aggregate device in Audio MIDI Setup. This allows the system to treat multiple devices as a single device with all of the channels from the individual devices. It doesn’t really matter what order you add them to the aggregate device, but it should include both BlackHole devices and the audio interface. I have the USB interface set to be the clock source, with the two BlackHole instances set for drift correction.
Route and mix in the DAW
Once I got the audio devices set up, I had to route everything in Reaper. The general approach is:

- The master out goes to Zoom and to my ears for monitoring. In other words, (almost) every track in the DAW is "normal" in the sense that what I hear is what the far end of the call hears.
- The output of Zoom does not go to the master out, so it doesn't feed back.
But before doing that, make sure Reaper is set to use the aggregate device in the device preferences.
For every input I want to mix, I created a track. Selecting the input for that track feels almost like just using the regular USB audio interface, but with a whole bunch of other channels thanks to being aggregated with BlackHole.
By default, Reaper sends each track to the master out, but in order to hear live input, you have to arm the track and turn on record monitoring.
A fun bonus of routing through a DAW is that you can use plugins! I use a simple NR plugin to deal with HVAC noise, and some compression.
The master out needs to go two places:

- The USB interface, so you can hear in your headphones
- BlackHole, to get it into Zoom
So, from the master track’s routing window, add outputs to the USB interface and the two channels of the 2-channel BlackHole interface. The fader/mute button for the USB interface on the output routing of the master is how I adjust whether/how much of myself I want to monitor in my ears.
That’s it for everything I want to send to Zoom, but I still want to be able to hear the far end of the call. I could just tell Zoom to send out to the hardware interface, but I want it in the DAW, too. This is useful for recording a tape sync, and so you don’t have to mess up your monitoring volume to change the volume of the far end.
For that, I created a special track and set its input to the 16-channel BlackHole instance. When you set up Zoom to use a particular output device, it sends the audio to the first two channels, so I had to use channels 1 and 2. Here’s where the track becomes special: you have to make sure it doesn’t send to the master out (it’ll feed back if you do). Instead, send it directly to the USB interface’s out.
And that’s it for the DAW.
Set up Zoom
The basics of setting up Zoom are simple: it receives the master output of the DAW by setting its microphone to BlackHole 2ch, and setting the speaker to BlackHole 16ch sends the far end’s audio to the DAW on channels 1 and 2 of BlackHole 16ch. Since you can control the output level from the DAW, I maxed out Zoom’s output and input faders and turned off the automatic gain control.
That’s really all you need for the basics, but Zoom has a bunch of cool advanced audio settings. Under “Music and Professional Audio” you can tell Zoom to let you turn off all of its audio processing, sending “original sound”. This is great, because what’s the point of having a decent mic if Zoom is going to band-limit and compress it to death? You can also turn on stereo, but I only use that if I really need to, which is rare. (Keep in mind that in order to actually activate these settings, you have to press “Turn on original sound” in the upper left of a call.)
Bonus! Sharing system sound
Zoom can share system sound, but when using a setup like this, I don’t recommend it. Turning it on activates some sort of additional virtual audio device on the system, which can mess with things. Remember that Zoom can only send audio out on the first two channels of a device. Thankfully, the system isn’t so limited. To share system sound, I went back to Audio MIDI Setup and under Configure Speakers told it that for stereo out, BlackHole 16ch uses channels 3 and 4.
Now I can set my system output device to BlackHole 16ch, make a new track in the DAW, set its input to BlackHole 16ch, and system sound comes in there. So the far side of a Zoom call comes in on channels 1 and 2, and system sound on 3 and 4.
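With that configuration, the 16-channel device's layout can be written down as a simple channel map, and anything that picks specific streams off the device is just slicing a multichannel frame (a sketch of the layout described above; the function name is mine):

```python
# Channel layout on the BlackHole 16ch device in this setup:
#   channels 1-2: far end of the Zoom call
#   channels 3-4: system sound
#   channels 5-16: unused
def split_frame(frame):
    """Split one 16-channel frame into the streams used above."""
    assert len(frame) == 16
    zoom_far_end = frame[0:2]   # channels 1 and 2
    system_sound = frame[2:4]   # channels 3 and 4
    return zoom_far_end, system_sound

frame = [0.0] * 16
frame[0], frame[1] = 0.1, 0.1   # far-end audio
frame[2], frame[3] = 0.5, 0.5   # system sound
far, sys_snd = split_frame(frame)
print(far, sys_snd)  # [0.1, 0.1] [0.5, 0.5]
```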
And that’s it. Happy calling!
I used BlackHole because it’s free and did what I needed. You can achieve the same thing with a nice UI using Loopback from the excellent Rogue Amoeba. ↩︎