Understanding the ACX submission requirements for audio

If you’re new to the world of audiobook production, the audio submission requirements given by the Audiobook Creation Exchange (ACX) for such projects may seem confusing. Let’s demystify that, shall we?

ACX is a platform where book authors, publishers, and narrators come together to create audiobook projects. In the simplest terms, this is where you upload an audiobook to sell on Audible, Amazon, and iTunes. To successfully upload audio and get it approved for sale, the audio must meet certain requirements. ACX enforces these so that its audiobook library maintains a consistent minimum level of quality. To put it bluntly, they don’t want to sell shitty-sounding audiobooks.

So, how do we avoid recording and submitting audio that doesn’t make the cut? Simple! Follow these guidelines straight from ACX:

Your submitted audiobook must:

  • be consistent in overall sound and formatting

  • include opening and closing credits

  • be comprised of all mono or all stereo files

  • include a retail audio sample that is between one and five minutes long

Each uploaded audio file must:

  • contain only one chapter/section per file, with the section header read aloud

  • have a running time no longer than 120 minutes

  • have room tone at the beginning and end and be free of extraneous sounds

  • measure between -23dB and -18dB RMS and have -3dB peak values and a maximum -60dB noise floor

  • be a 192kbps or higher MP3, Constant Bit Rate (CBR) at 44.1 kHz

You might look at this list and go “what the hell am I even reading?” and I wouldn’t fault you. For the newly minted voice actor, who might lack the requisite audio engineering experience, these requirements might seem difficult or even impossible to meet. I’m here to tell you, though, that once you understand the basics, it’s actually quite easy to meet them. Furthermore, if your studio is properly configured, you’re likely already meeting these requirements to an extent.
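
If it helps to see the numbers in one place, the per-file limits from the list above can be written as a quick check. This is only a minimal Python sketch: `FileStats` and `acx_problems` are made-up names for illustration, not anything ACX provides, and it assumes you already have the measurements from your DAW or a metering plugin.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Measurements for one audio file (hypothetical container, not an ACX format)."""
    minutes: float           # running time
    rms_db: float            # overall RMS level
    peak_db: float           # highest peak value
    noise_floor_db: float    # RMS level of the room tone
    bitrate_kbps: int        # MP3 bit rate
    sample_rate_hz: int      # MP3 sample rate
    constant_bit_rate: bool  # True for CBR, False for VBR


def acx_problems(stats: FileStats) -> list[str]:
    """Return the numeric per-file requirements that this file fails."""
    problems = []
    if stats.minutes > 120:
        problems.append("running time over 120 minutes")
    if not -23 <= stats.rms_db <= -18:
        problems.append("RMS outside -23dB to -18dB")
    if stats.peak_db > -3:
        problems.append("peak above -3dB")
    if stats.noise_floor_db > -60:
        problems.append("noise floor above -60dB")
    if stats.bitrate_kbps < 192 or not stats.constant_bit_rate:
        problems.append("not a 192kbps or higher CBR MP3")
    if stats.sample_rate_hz != 44100:
        problems.append("sample rate is not 44.1 kHz")
    return problems


# Example values only; plug in the readings from your own file.
good = FileStats(42.5, -20.3, -3.5, -63.0, 192, 44100, True)
print(acx_problems(good))  # prints []
```

An empty list means the file clears every numeric limit. Anything else tells you which requirement to revisit. The non-numeric items, such as room tone and section headers, still need your ears.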

Let’s break these items down one by one and go over what they mean in layman’s terms.

  1. Consistency in overall sound and formatting is an easy one. You want an audiobook that sounds the same throughout, from one chapter to the next. If it doesn’t, it will be jarring for the listener. So, how do we do this? Well, once you have your settings dialed in with your DAW, audio interface, and other gear… leave them that way until the project is finished. You should also take note of where your gain is set, the type of configuration you used, etc., so that if you need to do some pickups later on, you can recall the sound you previously had.

  2. When it comes to opening and closing credits, consistency is your friend. I recommend having a template for this, wherein you just insert the name of the book, author’s name, and your name. Here’s an example:
    Opening Credits Script:
    “Book Name”… Written by “Author Name”… Narrated by “Your Name”

    Closing Credits Script:
    The end. Thank you for listening to “Book Name”… Written by “Author Name”… Narrated by “Your Name”

    This is how I personally handle opening and closing credits, and it works every time. Be sure that you are delivering these as separate files, as that’s how the rights holder will upload them into the ACX system. Oh, and if you’re narrating something that might be a bit too raunchy, political, or otherwise disagreeable to the masses, consider using a pseudonym instead of your real name. I do this sometimes, if I’m hired to narrate something that might not align with the values of my brand.

  3. Up next, we’re talking about all mono vs. all stereo files. Every file in the audiobook must be consistent in this regard, so pay attention. If you’re just recording your voice, then the choice is simple: mono is the way to go. There’s no stereo information when you only have one sound source, so you can choose mono and keep plugging away.

    That said, if you’re working on an audiobook that requires music beds, sound effects, or multiple performances for various characters, stereo would almost certainly be the way to go. Remember this when working on your next audiobook project.

  4. The retail audio sample is the audio that a potential customer will hear on Audible when previewing the audiobook. ACX requires that this file be between one and five minutes long. It’s common to use a portion of the first chapter for this, but it could technically be any part of the book. This sample should start with narration, not opening credits or music. This means that, if you’re using the introduction or first chapter, you should take out the announcement at the beginning (i.e., the phrase “chapter one” at the start of the audio file). The sample must also not include explicit material.

  5. As for file requirements, let’s start with chapters or sections. Each audio file must contain only one chapter or section. This is required because rights holders have to upload each chapter individually when they submit the audiobook for approval. Opening and closing credits will also be separate files. This helps listeners navigate the audiobook easily, as they will be able to skip around the audiobook by chapter to find their spot.

  6. Each file must have no more than 5 seconds of room tone at the beginning and end. Room tone, if you’re unfamiliar, is the sound of the recording environment when you’re not speaking. This space is required to ensure titles are successfully encoded in the many formats made available to customers. It also gives listeners an audio cue that they have reached the beginning or end of a section.

  7. Each file must contain the section header, if it’s in the book: for example, “Chapter 1” read aloud at the start of Chapter 1’s audio file. This, once again, helps listeners navigate the audiobook more easily. If a section header is found to be missing during the ACX QA review, you will be contacted to make revisions, which could delay the release of your title. Be consistent: a listener may think content is missing if most headers are read but some are not.

  8. Each file must measure between -23dB and -18dB RMS. This is where it can get tricky for novice narrators. RMS stands for Root Mean Square, and in very simple terms it measures the average loudness of the audio file over time. To get this value where it needs to be, you need to ensure that your audio signal is consistent. This helps prevent what we call “riding the volume knob,” where listeners have to constantly adjust the volume in order to hear everything properly. The advice straight from ACX is as follows:

    “Maintaining optimal RMS begins with controlling the signal level of your voice. We recommend tracking so that your RMS level is around -20dB RMS with peak levels around -7dB, but not exceeding -3dB. This level should provide a strong signal to noise ratio, while leaving headroom for gain increases as you fine-tune the recording in post-production. Giving an even performance and utilizing proper microphone technique will also factor in the consistency of your recording level, and this should be your focus at the recording stage.”

  9. Each file must have peak values no higher than -3dB. This is necessary because it helps avoid clipping and distortion in your audio. These things sound bad, so we want to avoid them when creating our audio. To further explain, the peak values are the loudest sounds in the audio file, where the volume peaks the highest. This goes back to maintaining a consistent volume and signal level; if you can do that, you can minimize the prevalence of large spikes in your audio when recording. Avoiding plosives, and backing off the mic when you need to project more for a role, can help as well. Finally, your effects chain comes into play here too. In the audio engineering world, we sometimes use compressors and limiters to even out dynamics and keep peak values in check.

  10. Each file must have a noise floor no higher than -60dB RMS. This really comes down to your recording environment. If you can hear the AC or furnace in your recording space, or perhaps computer fans, you’ve got a problem. We need to reduce (and eliminate where possible) the background noise to meet this requirement. You don’t necessarily need a vocal booth worth thousands of dollars to get this done, but one definitely makes short work of it.

    If you’re a novice, try finding the quietest room in your house to record in, and make sure there is ample acoustic treatment on the walls. If you’re really stretching your budget thin and can’t afford the acoustic treatment, grab all the blankets and pillows you can find and make a fort. We’ve all done it, myself included. These days I use my Whisper Room, though. Never forget your roots!

  11. Each file must be a 192kbps or higher 44.1kHz MP3, Constant Bit Rate (CBR). Here’s the good news: even the cheapest audio interfaces you can reasonably purchase these days can meet these requirements. You’ll likely already be recording at 44.1kHz, and exporting your files in MP3 format should be as trivial as checking off a few boxes in your DAW when it’s time to export. Just be sure the bit rate is 192kbps or higher, and make sure it’s a constant bit rate, NOT a variable bit rate. The audio will get rejected if you neglect those details.
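
For the curious, items 8 through 11 can be sketched in a few lines of Python. Treat this as a rough illustration, not a replacement for the meters in your DAW. It assumes your samples are floats between -1.0 and 1.0 (so 0dB means digital full scale), it treats the noise floor as the RMS of a stretch of room tone, and it assumes you have ffmpeg installed with the libmp3lame encoder, where setting only a bit rate with `-b:a` produces a constant bit rate file. The function names are my own, the test tones are synthetic stand-ins for real recordings, and `-ac 1` assumes a mono, single-voice project.

```python
import math


def to_db(amplitude: float) -> float:
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def rms_db(samples: list[float]) -> float:
    """Root Mean Square: square each sample, average them, take the square root."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square))


def peak_db(samples: list[float]) -> float:
    """Level of the single loudest sample in the file."""
    return to_db(max(abs(s) for s in samples))


def mp3_export_command(wav_path: str, mp3_path: str, kbps: int = 192) -> list[str]:
    """Build (but do not run) an ffmpeg command for a mono 44.1 kHz CBR MP3."""
    return ["ffmpeg", "-i", wav_path, "-codec:a", "libmp3lame",
            "-b:a", f"{kbps}k", "-ar", "44100", "-ac", "1", mp3_path]


# Synthetic one-second test tones standing in for real audio (assumption).
rate = 44100
voice = [0.14 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]
room_tone = [0.0005 * math.sin(2 * math.pi * 60 * n / rate) for n in range(rate)]

print(round(rms_db(voice), 1))       # item 8: should land between -23 and -18
print(round(peak_db(voice), 1))      # item 9: should be no higher than -3
print(round(rms_db(room_tone), 1))   # item 10: should be no higher than -60
print(" ".join(mp3_export_command("chapter01.wav", "chapter01.mp3")))  # item 11
```

Measuring the noise floor on room tone alone, not the whole file, is the design choice here: the narration would otherwise swamp the reading. To actually run the export, you could pass the returned list to `subprocess.run`.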

Conclusion

While recording an audiobook may seem like a difficult task, especially when faced with strict requirements for your audio, it’s actually quite simple once you get the hang of it. And hey, if you need some help dialing in your settings for recording, or would like me to lend an ear to listen to your audio and make suggestions, let me know. I’m available for consultations and coaching sessions over at www.votrainer.com. There, you’ll find lots of resources for the aspiring voice actor.

Trevor O’Hare

Trevor O’Hare is a professional American male voice talent, specializing in commercials, explainer video narrations, e-learning, telephony, and more. Contact Trevor today to book him for your next project.

https://www.trevorohare.com