“We are all in the gutter, but some of us are looking at the stars.”
The great Irish playwright, poet and writer Oscar Wilde passed away 118 years ago today, so to mark the anniversary here’s a little 360 video called Wilde Flock.
The footage was captured earlier this year at the Oscar Wilde Memorial Sculpture, created by the sculptor Danny Osborne and erected in Merrion Square, Dublin, in 1997. Wilde actually grew up nearby at No. 1 Merrion Square, which can be seen in the video on the opposite side of the road from the statue. During my time as a postgraduate student in Trinity College, I walked home past this statue every day, and it is probably one of my favourite monuments in the city. I always thought the pose of the statue captured Wilde’s wit and flamboyance very well, especially when combined with the typically witty quotations displayed on the two pillars on either side.
For the piece I wanted to use these quotations, but rather than a simple voice-over narration I decided to process the recordings into a flock of sound that flies towards and eventually surrounds the listener. This was created using granulation, a technique that has been used by many electroacoustic composers (Curtis Roads and Barry Truax, for example) and something I have used extensively in my own music. Fundamentally, granulation involves the division of an audio file into many short segments or grains, which can then be recombined in different ways. The technique can also be used to time-stretch or pitch-shift an audio file, and a similar technique (using the PaulStretch application) was used to create those recordings of pop songs slowed down 1,000 times that you can find on YouTube (I’m not much of a fan of Justin Bieber, but when you slow it down by 800%… that I like!).
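To make the idea concrete, here is a minimal granulation sketch in Python with NumPy. This is not the actual processing used for the piece, just an illustration of the basic principle: slice a signal into short windowed grains, then re-scatter them over a longer timeline to time-stretch it. All names and parameter values here are my own.

```python
import numpy as np

def granulate(signal, grain_size=2048, stretch=2.0, density=0.5, seed=0):
    """Slice `signal` into Hann-windowed grains, then overlap-add them
    at stretched (and slightly jittered) positions to time-stretch it."""
    rng = np.random.default_rng(seed)
    window = np.hanning(grain_size)
    hop = int(grain_size * density)              # analysis hop between grains
    out = np.zeros(int(len(signal) * stretch) + grain_size)
    for start in range(0, len(signal) - grain_size, hop):
        grain = signal[start:start + grain_size] * window
        # place the grain at its stretched position, plus a little jitter
        pos = int(start * stretch) + rng.integers(0, hop)
        out[pos:pos + grain_size] += grain
    return out[:int(len(signal) * stretch)]

# a one-second 440 Hz test tone at 44.1 kHz, stretched to two seconds
sr = 44100
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
stretched = granulate(tone, stretch=2.0)
```

Because each grain is windowed before being overlap-added, the random re-positioning smears the material in time without the hard clicks you would get from raw, unwindowed slices.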
Granulation is an extremely useful technique for spatial music composition, as the individual grains of sound can be collapsed into a single point, or alternatively spread out in space in different ways. One technique I’ve used in the past is to combine granulation with a flocking algorithm, which replicates the complex emergent behaviour found in nature using a collection of individual agents following a simple set of rules, such as:
- steer away from nearby flockmates
- steer toward the average heading of the flock
- steer toward the center of the flock
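As a rough illustration of those three rules, here is one update step of a toy boids-style simulation in Python with NumPy. This is a sketch of the general idea, not the flocking algorithm actually used for the piece, and all weights and names are my own.

```python
import numpy as np

def flock_step(pos, vel, dt=0.1, sep_w=1.5, align_w=1.0, coh_w=1.0,
               near=1.0, max_speed=2.0):
    """One update of a toy boids flock: each agent applies the three
    steering rules against every other agent (O(n^2), fine for a demo)."""
    n = len(pos)
    new_vel = vel.copy()
    for i in range(n):
        offsets = pos - pos[i]                     # vectors to flockmates
        dists = np.linalg.norm(offsets, axis=1)
        close = (dists < near) & (dists > 0)
        # 1. separation: steer away from nearby flockmates
        if close.any():
            new_vel[i] -= sep_w * offsets[close].sum(axis=0) * dt
        others = np.arange(n) != i
        # 2. alignment: steer toward the average heading of the flock
        new_vel[i] += align_w * (vel[others].mean(axis=0) - vel[i]) * dt
        # 3. cohesion: steer toward the centre of the flock
        new_vel[i] += coh_w * (pos[others].mean(axis=0) - pos[i]) * dt
        speed = np.linalg.norm(new_vel[i])
        if speed > max_speed:                      # clamp the speed
            new_vel[i] *= max_speed / speed
    return pos + new_vel * dt, new_vel

rng = np.random.default_rng(1)
pos, vel = rng.normal(size=(20, 2)) * 5, rng.normal(size=(20, 2))
for _ in range(100):
    pos, vel = flock_step(pos, vel)
```

In a granular spatialiser, each agent's position would then drive the spatial placement of one stream of grains.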
It’s remarkable how effectively computer simulations of this algorithm resemble the complexity and beauty of nature, such as the murmuration of starlings shown in the clip below.
The original audio consisted of myself reading various Wilde quotations, along with some bird sounds, and a synthesized combination of the two. For the spatialization in Sound Particles, the flock starts off in the distance and gradually approaches the listener before transitioning to a non-spatialized, unprocessed monophonic recording of one or two quotes. This was implemented using a new feature on YouTube, which now supports both a spatial First Order Ambisonics track and a separate, standard stereo track which is head-locked, meaning it is not spatialized and doesn’t respond to head rotations.
While I’ve always been somewhat resistant to the use of head-locked audio in this medium, it is undoubtedly useful at times. In particular, the ability to create a mono audio track, which on headphones will be heard by the listener inside their head, can make for a nice contrast with the externalized, spatial audio we typically use for 360 videos and VR.
I’ve had a few requests for the specific ffmpeg commands used to add the spatial audio and head-locked stereo audio to the video file for uploading to YouTube, so here they are.
I used Adobe Premiere to edit the video file, the four-channel Ambisonics audio, and the stereo, head-locked audio. I then exported the video by itself, without audio, followed by the cut four-channel Ambisonics audio by itself, and finally the stereo, head-locked audio by itself (both audio files as full-resolution .wav files).
I then rendered the two audio files together into a single six-channel audio file using Reaper. The channel order should be FOA AmbiX (W, Y, Z, X) in channels 1-4, and the stereo, head-locked audio in channels 5-6.
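If you would rather script this step than use Reaper, something like the following Python sketch, using only the standard-library wave module, could interleave the two exported files into one six-channel .wav. The file names are placeholders matching the commands below, and it assumes both inputs are 16-bit PCM at the same sample rate.

```python
import wave

def merge_to_six_channels(foa_path, stereo_path, out_path):
    """Interleave a 4-channel AmbiX .wav (channels 1-4) with a stereo
    head-locked .wav (channels 5-6) into one 6-channel .wav file."""
    with wave.open(foa_path, "rb") as foa, wave.open(stereo_path, "rb") as st:
        assert foa.getframerate() == st.getframerate(), "sample rates differ"
        assert foa.getsampwidth() == st.getsampwidth() == 2, "expects 16-bit PCM"
        n = min(foa.getnframes(), st.getnframes())
        foa_data = foa.readframes(n)
        st_data = st.readframes(n)
        with wave.open(out_path, "wb") as out:
            out.setnchannels(6)
            out.setsampwidth(2)
            out.setframerate(foa.getframerate())
            frames = bytearray()
            for i in range(n):
                frames += foa_data[i * 8:(i + 1) * 8]  # 4 ch x 2 bytes: W, Y, Z, X
                frames += st_data[i * 4:(i + 1) * 4]   # 2 ch x 2 bytes: L, R
            out.writeframes(bytes(frames))

# tiny demo: one second of silence in each input (placeholder file names)
sr = 48000
for path, ch in (("foa.wav", 4), ("stereo.wav", 2)):
    with wave.open(path, "wb") as w:
        w.setnchannels(ch)
        w.setsampwidth(2)
        w.setframerate(sr)
        w.writeframes(b"\x00" * (sr * ch * 2))
merge_to_six_channels("foa.wav", "stereo.wav", "my360audio-6ch.wav")
```

Either way, the important part is the channel order: the four AmbiX channels must come first, with the head-locked stereo pair last.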
Then I used ffmpeg to convert my exported mp4 video file to a .mov so that we can attach the full-quality .wav audio, as follows:
ffmpeg -i my360video-noaudio.mp4 -vcodec copy -f mov my360video-noaudio.mov
Then I attached the six-channel audio file to the .mov video, as follows:
ffmpeg -i my360video-noaudio.mov -i my360audio-6ch.wav -channel_layout 6.0 -c:v copy -c:a copy my360video.mov
Alternatively (with thanks to Angelo Farina for pointing this out), you can implement both steps using a single command, as follows:
ffmpeg -i my360video-noaudio.mp4 -i my360audio-6ch.wav -channel_layout 6.0 -c:v copy -c:a copy my360video.mov
Finally, making sure you have the latest version of Google’s Spatial Media Metadata Injector, select your video file, tick the boxes, and click Inject Metadata. As you can see in the screenshot below, the latest version of the tool will recognise that your file contains both spatial audio and head-locked stereo in 6 channels.
Once that’s done, the tool will create a new version of the video file with the metadata added, which you can then upload to YouTube. Remember that the spatial audio may take a number of hours to process after upload, so give it some time before checking the result.