Illustration by Justin Harrell. Watch him draw the Skype-O-Saurus in this time-lapse video.
The Drupalize.Me Podcast (formerly the Lullabot podcast) has been running for many years now. During this time, not much has changed in what makes up the podcast itself. There is theme music, a host, guests, event updates, and now even sound effects. Even when it comes to how we record a podcast, not much is different in either the method or the technology. What can make or break a podcast, though, is the quality of the sound. I'm not talking about whether the podcast is HD or anything, but about the overall quality of a person's voice, the ability to reduce or eliminate background distractions, or even just being able to create a good mix of volumes. These are all things it is great to have some control over and to edit before putting the podcast out to the masses.
Drupalize.Me has always taken sound quality and the editing process into consideration when producing a podcast. Each host has a specialized mic for recording and we encourage our guests to do the same. We try to avoid recording via phones, earbuds, or even laptop microphones. The biggest hurdle when it comes to recording a podcast is the actual recording. You can get a great recording if the podcast is just you, but the second you add guests into the mix, things get out of your control. A guest may work from home and have children or dogs. A guest may prefer to record from a nearby coffee shop to get out of the office. Background noise can be a major issue in the quality of a recording.
One of the biggest tools for recording a podcast over the years has been Skype. It was one of the first widely used applications for chatting with people across the globe, and it was free for the most part. A huge advantage Skype has over lots of other options is the quality of its audio. As strange as that may sound, Skype is one of the best. Of course there is the occasional dropped call or robot voice you must deal with, but those issues aren't limited to Skype.
A Typical Recording Process
Whether you're using Skype or some other software, if you have remote guests, there have basically been two ways to record a podcast: record the Skype call and hope for the best, or have each person on the call record locally and distribute the files with a cloud file sharing service like Dropbox. At Drupalize.Me we use an application called Audio Hijack (recently Audio Hijack Pro) to record Skype calls for our podcast. One of the great things about this app is that it will record the host on a separate channel from the guests. This is especially useful if there is only one guest. We also have what we call a "backup recorder", which is another team member on the call who also records, just in case.
With a host and a single guest recording, the audio file is split into 2 channels, which allows the person editing (me) to isolate each person to a separate track and adjust volumes and effects independently. This is great if one person is talking and a dog barks in the background from the other person's mic. I can just edit that out. It is for this reason that some podcasters prefer the "everyone record" method. That way each person is an individual track to work with, and the audio quality is better as well. The biggest issue with this for Drupalize.Me is that it puts the burden of recording our podcasts on our guests. Our guests are typically spread around the world and use various operating systems. Asking them to record, and expecting that they know how or have the means to do so, is just not realistic. So over the years we have done our best (and a pretty good job if I do say so myself) to work with what we have.
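To make the two-channel idea concrete, here is a minimal sketch of how a stereo recording could be separated into one mono track per person. It is not how Audio Hijack or my editing software does it internally; it simply assumes an uncompressed 2-channel WAV file with the host on the left channel and the guest on the right. The function names and file paths are hypothetical.

```python
import wave


def split_stereo(frames: bytes, sample_width: int) -> tuple[bytes, bytes]:
    """Split interleaved 2-channel PCM data into (left, right) mono data."""
    frame_size = sample_width * 2  # one sample per channel in each frame
    if len(frames) % frame_size != 0:
        raise ValueError("data is not a whole number of stereo frames")
    left = bytearray()
    right = bytearray()
    for i in range(0, len(frames), frame_size):
        left += frames[i:i + sample_width]
        right += frames[i + sample_width:i + frame_size]
    return bytes(left), bytes(right)


def split_wav(src_path: str, host_path: str, guest_path: str) -> None:
    """Write the left channel to host_path and the right to guest_path.

    Assumption: host on the left channel, guest on the right.
    """
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a 2-channel recording")
        width = src.getsampwidth()
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    left, right = split_stereo(frames, width)
    for path, data in ((host_path, left), (guest_path, right)):
        with wave.open(path, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(width)
            out.setframerate(rate)
            out.writeframes(data)
```

Calling something like `split_wav("episode.wav", "host.wav", "guest.wav")` would give two files that can be loaded as separate tracks. With more than one guest sharing a channel, this stops being enough, because all the guests end up mixed together on the same track.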
Improving the Multiple Guest Recording
It wasn't until recently, while recording a podcast, that this process really got to me. I knew there was another way but never had the time or resources to investigate. A podcast I have listened to over the years, TWiT, had invested lots of time and money into solving this. They did so not only for quality's sake, but because they went live at some point, so transferring audio files was not even an option. I once read an article about how they built what they called a Skypesaurus. They used multiple computers, screens, and a hardware mixer (not to mention a budget of $1500) to make this happen. In the past I researched a way to do this with just one machine. I found bits and pieces but was never able to make it work. Just recently I decided to give it another go. I found an article which described what I was trying to do, using an app called Soundflower together with Ableton Live.
Soundflower is an app for the Mac that has been around for quite some time. It is open source and something I never really grasped until I used it. It basically turns your Mac into a virtual mixer. It allows you to send the audio of any device or app (that allows you to select its input/output audio sources) to any other. The reason I never took to Soundflower was that it basically comes preset with a 2 channel in/out and a 64 channel in/out. The odd part is that the 64 channel device is just labeled as that; it doesn't display 64 different channels. A channel to me would be either an input or an output. That article made it make sense, and made it more practical, by modifying Soundflower's plist file to add actual channel listings from a-i. I finally understood what I could use it for. The other piece of software used was Ableton Live, which is multi-track recording software with a hefty price tag. I realized I could do this with Apple's Logic Pro X for a fraction of the price.
The Skype-O-Saurus is Born
There was just one other thing I needed: a way to have multiple Skype conversations at the same time. I remembered that when I had attempted this before, I came across a piece of software called "Skypelauncher". This allowed you to launch multiple instances of Skype. With that I created three other Skype accounts (podcastbot 1 thru 3). Then between Soundflower, Logic Pro X, and Skypelauncher I was able to record a podcast while maintaining each guest and host as a separate audio channel. I was even able to bring in a music and soundboard mix just for kicks. Another huge advantage I have with this method is the ability to adjust what each person hears of everyone else. So if guest one says they can't hear guest two very well, I can adjust the audio for that person only.
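The per-person feed is easier to picture as a small calculation. The sketch below is only an illustration of the idea, not the actual Soundflower and Logic Pro X configuration: it assumes each participant's audio is a list of float samples, that nobody hears their own voice back, and that any gain not listed defaults to 1.0. The function name and the participant names are hypothetical.

```python
def build_feeds(sources, gains):
    """Return the mix each participant hears.

    sources: participant name -> list of float samples
    gains:   listener name -> {source name -> gain}; missing entries are 1.0
    Assumption: a listener's own voice is left out of their feed.
    """
    length = min(len(samples) for samples in sources.values())
    feeds = {}
    for listener in sources:
        mix = [0.0] * length
        for name, samples in sources.items():
            if name == listener:
                continue  # do not send a person their own audio
            gain = gains.get(listener, {}).get(name, 1.0)
            for i in range(length):
                mix[i] += gain * samples[i]
        feeds[listener] = mix
    return feeds


# Guest one can't hear guest two well, so boost guest two for guest one only.
sources = {
    "host": [1.0, 1.0],
    "guest1": [2.0, 2.0],
    "guest2": [4.0, 4.0],
}
gains = {"guest1": {"guest2": 2.0}}
feeds = build_feeds(sources, gains)
```

Here only the feed going to guest one changes; the host and guest two still hear everyone at the default level, and the recorded tracks themselves are untouched.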
Video: How I Built The Skype-O-Saurus
I could go into depth on all the configuration I did to make this happen, but with Drupalize.Me being an online video training site, I felt it would make more sense to show you in a video. Watch to see how I went about making this happen and how I configured each piece of software.