Audio Terms: Every Word You Need to Know to Record Your Own Music

When you’re first starting to learn to record and mix your own music, one of the HARDEST parts is something that would seem the most simple…

“What in the world do all of these words mean?”

There’s a lot of vocabulary and jargon in audio. And because so much of it is based on acoustics and electrical engineering, it can get confusing FAST.

So I’ve made a collection of all the words that I struggled with when I was starting out.

This is meant to be a reference for you. Bookmark this guide – If you’re learning about something new and you come across a word you don’t understand, just come here and check it out.

I’ve also made a handy PDF version you can get for free. Grab that if you want access to this guide on-the-go.


 

How to Use This Guide

There are two versions of this glossary: one organized by subject (which is the one you’re on) and one organized alphabetically (which you can find right here).

Use the one that would be most helpful for you.

Here’s a tip to quickly find the word you’re looking for: use your computer’s “find” function.

Basically, hit “cmd + F” if you’re on a Mac or “ctrl + F” if you’re on a PC. Then type in the word you’re looking for.

If you don’t see it, let us know in the comments. We’ll be updating our list over time with words our readers don’t understand.

Good luck!

 

Recording

Bi-Directional – A microphone that picks up sound from the front and back, but not the sides.

Bit depth – The number of bits used to store each sample of digital audio. The higher the bit depth, the more accurately the level of each sample is captured and the lower the noise floor. For instance, a recording session running at 24 bits captures audio more accurately than one running at 16 bits.
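If you want a feel for the difference between 16 and 24 bits, here is a rough sketch that estimates the theoretical dynamic range of a bit depth. It uses the common "about 6 dB per bit" approximation, and the function name dynamic_range_db is just made up for this example.

```python
import math

def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range in dB for linear digital audio."""
    # Each extra bit doubles the number of available levels,
    # which adds roughly 6.02 dB of range.
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24):
    print(f"{bits}-bit: about {dynamic_range_db(bits):.0f} dB")
```

Real converters won’t reach these numbers exactly, but the gap between the two settings is the point.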

Buffer size – How much data a computer program can handle at a time. Lower buffer sizes have lower latency, but are more susceptible to crashing. Higher buffer sizes have greater latency, but are less susceptible to crashing. The rule of thumb is to set your buffer size as low as possible when recording and as high as possible when mixing. This setting can be found in your DAW’s preferences.
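To see why a lower buffer size means lower latency, here is a small sketch. It assumes the buffer size is measured in samples (which is how most DAWs show it) and only counts the time needed to fill one buffer. Real-world latency will be higher because of drivers and converters. The function name is hypothetical.

```python
def buffer_latency_ms(buffer_size: int, sample_rate: int) -> float:
    """Time it takes to fill one buffer, in milliseconds."""
    return buffer_size / sample_rate * 1000

# Compare a few common buffer sizes at a 44.1kHz sample rate.
for size in (64, 256, 1024):
    print(f"{size} samples: {buffer_latency_ms(size, 44100):.1f} ms")
```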

Cardioid (or Unidirectional) – A unidirectional microphone with a heart-shaped pickup pattern. In other words, a microphone that picks up sounds in front of it, but not behind it.

Channel – Similar to a bus, a pathway through an audio device. For example, sound mixers have multiple input channels and output channels.

Clipping (or Peaking) – Another word for distorting. “Clipping” is usually used when a channel on a DAW or mixing board has too much volume being sent into it. In general, you want to give a channel enough headroom so that clipping doesn’t occur.

Condenser mic – A microphone commonly found in studios with a large frequency range and high sensitivity. Known for being very accurate.

Doubling – Recording a part multiple times to get a “thicker” sound.

Dynamic mic – A microphone found in both studio and live settings with a limited frequency range and lower sensitivity. Known for handling more aggressive or loud instruments.

Headroom – The amount of volume a channel can take before distorting. The louder the sound, the less headroom it has. For example, if a sound is peaking at -5dB, it has 5dB of headroom. If it’s peaking at -1dB, it has 1dB of headroom.
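Here is a minimal sketch of the headroom math. It assumes digital audio where samples run from -1.0 to 1.0 and clipping happens at 0dB on the digital (full scale) meter. The function names are made up for illustration.

```python
import math

def peak_db(samples):
    """Peak level in dB relative to full scale (samples between -1.0 and 1.0)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # pure silence has no measurable peak
    return 20 * math.log10(peak)

def headroom_db(samples):
    """How far the loudest peak sits below the 0dB clipping point."""
    return 0.0 - peak_db(samples)

# A sound whose loudest sample is half of full scale.
print(f"{headroom_db([0.1, -0.5, 0.25]):.1f} dB of headroom")
```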

Latency – The amount of delay between the input and the output of a signal. Latency usually refers to the delay that occurs when someone tries to record something when there are too many plugins on the session. The input (the instrument) is delayed so that the output (the recording) is several milliseconds behind, causing a frustrating delay in a performer’s headphones.

Layering – Recording several copies of one musical part to be performed on top of each other. A more extreme version of doubling.

Mono – The opposite of stereo. A sound that has one source, rather than two.

Omnidirectional (or Omni) – A microphone that picks up sound from all directions.

Phantom power – +48 Volts. The power necessary to get a condenser mic to work. Most audio interfaces have a button that sends this power to a microphone that needs it.

Plosives – Sounds made from the mouth that blow quick bursts of air. Common examples are words with p’s, b’s, t’s, k’s, and d’s.

Polarity – The “direction” of a waveform. When you “flip the polarity” of a waveform, it turns the waveform upside down. Basically, the peaks are where the troughs once were, and vice versa. Polarity buttons (sometimes called phase buttons) are common on audio interfaces to keep stereo inputs in phase with each other.
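In digital terms, flipping the polarity just means multiplying every sample by -1. The sketch below uses a made-up list of sample values and shows what happens when a waveform is added to its flipped copy.

```python
def flip_polarity(samples):
    """Turn the waveform upside down: peaks become troughs and vice versa."""
    return [-s for s in samples]

# One cycle of a simple (hypothetical) waveform.
wave = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
flipped = flip_polarity(wave)

# Adding a sound to its flipped copy cancels it out completely.
summed = [a + b for a, b in zip(wave, flipped)]
print(flipped)
print(summed)
```

That total cancellation is the extreme case, and it is why two mics on one source can sound thin if one of them needs its polarity flipped.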

Pop filter – A foam rubber windscreen that is placed in front of a microphone to reduce the sound of plosives.

Preamp – An amplifier that boosts the incoming signal from a microphone. Without a preamp, the sound from a microphone would be too quiet to use.

Proximity effect – The closer you get to the microphone, the more low frequencies are recorded. This phenomenon is present with directional mics (such as cardioid and bi-directional mics), but not with omnidirectional mics.

Ribbon mic – A microphone commonly found in studios with a high frequency range and sensitivity. Known for “coloring” the sound it records.

Room tone – The tone of the reverb produced in a room. Also refers to how the room “colors” a sound.

Sensitivity – How much sound a microphone can pick up. When the level of the mic’s preamp is increased, the sensitivity is increased. The more sensitive the mic, the more detail and background noise will be recorded.

Sibilance – The sound of an “s” in a word. For example, sit, stay, masks, etc. Usually problematic in vocal recording, as microphones pick up the sibilance sounds more than the rest of the frequency spectrum. One of the most popular ways to fix sibilance is a de-esser.

Stereo – The opposite of mono. A sound that has two sources, rather than one. Creates the illusion of horizontal space in recordings.

Talkback – A mic in the control room of a studio that allows the engineer or producer to talk to the performers who are recording in the studio.

Tape – The medium that sound was recorded on before the transition to computers in the mid-80’s. While technically less accurate than digital recording, tape is sought after for the warmth and aggression it adds to the sounds recorded on it.

 

Mixing

Automation – “Telling” a program to do a certain thing at a certain time in a song. For example, automating a track to pan from left to right over the course of 4 bars. All major DAWs have the capability to automate different parameters and plugins.

Auxiliary track (or Aux track) – A track that has no audio on it, but has audio being sent to it for processing.

Buffer size – How much data a computer program can handle at a time. Lower buffer sizes have lower latency, but are more susceptible to crashing. Higher buffer sizes have greater latency, but are less susceptible to crashing. The rule of thumb is to set your buffer size as low as possible when recording and as high as possible when mixing. This setting can be found in your DAW’s preferences.

Bus (or Buss) – The pathway along which an electrical signal flows. For example, the output of a DAW is referred to as the mix bus or stereo bus. The term is also used to describe an aux track with several tracks of the same instrument flowing into it. For example, if I set the output of each of my drum tracks to a single bus, then the aux track with that bus as the input is referred to as the Drum Bus.

Channel – Similar to a bus, a pathway through an audio device. For example, sound mixers have multiple input channels and output channels.

Clipping (or Peaking) – Another word for distorting. “Clipping” is usually used when a channel on a DAW or mixing board has too much volume being sent into it. In general, you want to give a channel enough headroom so that clipping doesn’t occur.

Comping – Combining several different takes of an instrument into one. Basically, copying the best parts of each recording and pasting them onto a single track, so that the performance of that instrument is the best it can be.

Crossfade – A specific type of fade where one sound fades in as another sound fades out. These are used when editing audio so that the transition between the two audio clips is smooth, rather than jarring.
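Here is a rough sketch of what a crossfade does to the overlapping samples. It assumes a simple linear fade (your DAW may offer other curves, such as equal power), and the function name is hypothetical.

```python
def crossfade(outgoing, incoming):
    """Fade `outgoing` down while fading `incoming` up over the overlap.

    Both lists hold the samples of the overlapping region and must be
    the same length.
    """
    n = len(outgoing)
    if n != len(incoming):
        raise ValueError("both clips must overlap by the same number of samples")
    mixed = []
    for i in range(n):
        # t goes from 0.0 at the start of the overlap to 1.0 at the end.
        t = i / (n - 1) if n > 1 else 0.5
        mixed.append(outgoing[i] * (1 - t) + incoming[i] * t)
    return mixed

# Fading from a constant signal into silence over five samples.
print(crossfade([1.0] * 5, [0.0] * 5))
```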

Dithering – Adding low-level noise to a recording to reduce distortion when the recording is exported at a lower bit depth. Only used during the mastering process.
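To make the idea concrete, here is a minimal sketch of reducing audio to a lower bit depth with and without dither. It assumes samples between -1.0 and 1.0 and uses triangular (TPDF) noise, which is one common choice rather than the only one. The names and the 8-bit target are made up for the demo, and clipping at the extremes is ignored.

```python
import random

def quantize(sample, bit_depth, dither=False):
    """Round a sample to the nearest level available at the given bit depth."""
    step = 2.0 / (2 ** bit_depth)  # size of one quantization step
    if dither:
        # The difference of two uniform random numbers gives triangular
        # noise of up to one step in either direction.
        sample += (random.random() - random.random()) * step
    return round(sample / step) * step

# A very quiet signal, smaller than a single 8-bit step.
step_8bit = 2.0 / 2 ** 8
quiet = 0.3 * step_8bit

random.seed(0)  # fixed seed so the run is repeatable
plain = [quantize(quiet, 8) for _ in range(20000)]
dithered = [quantize(quiet, 8, dither=True) for _ in range(20000)]

# Average output level, measured in steps.
print(sum(plain) / len(plain) / step_8bit)
print(sum(dithered) / len(dithered) / step_8bit)
```

Without dither, the quiet signal rounds to silence every time. With dither, the output is noisier, but on average it stays close to the original level, so the quiet detail isn’t simply thrown away.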

Dry sound – An unprocessed sound; a sound without an effect on it. The opposite of a wet sound.

Fade – The increase and decrease of volume at the beginning and end of a sound or a song.

Gain – This is a synonym for volume, though it’s often used as another word for distortion.

Gain Staging – This refers to 1) the process of making sure a recording is the same volume after a plugin as it was before, and 2) the process of making sure all of the recordings in a session are relatively the same volume.

Headroom – The amount of volume a channel can take before distorting. The louder the sound, the less headroom it has. For example, if a sound is peaking at -5dB, it has 5dB of headroom. If it’s peaking at -1dB, it has 1dB of headroom.

Latency – The amount of delay between the input and the output of a signal. Latency usually refers to the delay that occurs when someone tries to record something when there are too many plugins on the session. The input (the instrument) is delayed so that the output (the recording) is several milliseconds behind, causing a frustrating delay in a performer’s headphones.

Mix bus (or Submix, Stereo output, Mix output, or Master output) – The channel that all of the audio of a session flows to.

Mono – The opposite of stereo. A sound that has one source, rather than two.

Pan pot (or Pan knob) – A control that places a sound in the left speaker, the right speaker, or somewhere in between.

Plug-in – A piece of software used within a DAW that processes the sound of a recording.

Processors (or Sound processors) – Any hardware or software that changes the pitch, speed, loudness, or tone of a sound.

Send – A routing function inside a DAW that allows you to send a copy of an audio file to an auxiliary track without affecting the sound of the original file.

Stereo – The opposite of mono. A sound that has two sources, rather than one. Creates the illusion of horizontal space in recordings.

Sweetening – The process of enhancing the sound of a recording.

Wet sound – A fully processed sound; a sound with only an effect on it. The opposite of a dry sound.

 

EQ

Bandwidth – The amount of space on the frequency spectrum that the sounds of an instrument are being produced at. For example, an average electric guitar has a bandwidth of 80Hz-5kHz, as the instrument cannot produce sounds above or below those frequencies.

Equalization (or EQ) – A sound processor that can boost or cut particular frequencies in a sound.

Filter – A feature of an EQ that cuts the sound of the low end or the high end of the frequency spectrum. These are known as a High Pass Filter (HPF) and a Low Pass Filter (LPF), respectively.

Hertz (or Hz) – The unit of measurement for frequencies. After 1,000Hz, the unit is measured in Kilohertz (or kHz).

Highs (or High end, Top end, Treble, or Air) – The section of the frequency spectrum above 8kHz.

Kilohertz (or kHz) – 1000x the unit of measurement for frequencies. 1 kHz = 1,000 Hz.

Lows (or Low end, Bass, or Sub-bass) – The section of the frequency spectrum between 60Hz-200Hz.

Mids – The section of the frequency spectrum between 600Hz-3kHz.

Q – The width of a band in an EQ. The higher the Q, the narrower the band.
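Here is a quick sketch of how Q relates to the width of a band. It assumes the common definition where Q is the center frequency divided by the bandwidth. Individual EQ plugins may scale their Q knob differently, so treat the numbers as a rough guide. The function name is hypothetical.

```python
def band_width_hz(center_hz: float, q: float) -> float:
    """Approximate width of an EQ band, assuming Q = center frequency / bandwidth."""
    return center_hz / q

# The same 1kHz band gets narrower as the Q goes up.
for q in (0.5, 2, 10):
    print(f"Q of {q}: about {band_width_hz(1000, q):.0f} Hz wide")
```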

Room resonances (or Standing waves) – Every room has frequencies that build up more than others. These frequencies can mask the pleasant elements of a sound. By finding these frequency build-ups and cutting them using an EQ, we can improve the sound of a recording.

Spectrum analyzer – A visual graph that shows what frequencies are being produced in real-time by a sound.

(See Acoustics for more terms on frequencies and sound)

 

Compression

Attack – This refers to 1) the very beginning of a sound, and 2) the amount of time it takes after a sound begins for a sound processor to begin working. Usually measured in milliseconds (ms).

Compression – Reducing a signal’s output volume in relation to its input volume to reduce its dynamic range. Basically, when a sound gets louder than a certain level, a compressor turns the sound down somewhat. This controls the dynamics of that sound to make it more consistent.

De-esser – A processor that turns down sibilance (the sound of an “s”) when it happens in a vocal track.

Knee – A control on a compressor that changes how variable the severity of compression is once the threshold has been passed. A “soft” knee makes the compression less obvious, whereas a “hard” knee makes the compressor more obvious.

Limiter – A compressor with a ratio of ∞:1, otherwise known as a “brick wall.” This means that when a sound reaches the threshold of a limiter, it doesn’t get any louder – it stays the exact same volume. This is used to prevent a track from peaking while at the same time increasing its perceived loudness.

Makeup gain – A parameter that allows you to increase the output volume of a sound processor that made the input sound quieter. For example, a compressor makes sounds softer, so makeup gain is needed to keep the sound at the same volume that it previously was.

Noise gate – A sound processor that cuts off the volume of a sound once it passes below a certain volume threshold.

Ratio – A parameter of a compressor that determines how hard the compressor clamps down on the volume of the audio. If a ratio is set to 2:1, then for every 2dB of audio that goes above the threshold, 1dB comes out. If the ratio is set to 4:1, then for every 4dB of audio that goes above the threshold, 1dB comes out. And so on.
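The ratio math can be sketched in a few lines. This is a simplified model that ignores attack, release, and knee, and the function name and example levels are made up for illustration.

```python
def compressed_level_db(input_db: float, threshold_db: float, ratio: float) -> float:
    """Output level of a basic compressor for a given input level."""
    if input_db <= threshold_db:
        return input_db  # below the threshold, nothing happens
    # Above the threshold, the overshoot is divided by the ratio.
    return threshold_db + (input_db - threshold_db) / ratio

# A sound that goes 10dB over a -20dB threshold.
for ratio in (2, 4, float("inf")):
    print(f"{ratio}:1 -> {compressed_level_db(-10, -20, ratio)} dB")
```

Setting the ratio to infinity pins the output at the threshold, which is the “brick wall” behavior of a limiter.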

Release – How long it takes a sound processor to cease processing the sound. Usually measured in milliseconds (ms). For example, if the release of a compressor is set to 100ms, then the compressor will take 100ms to stop processing the sound once the sound has dropped back below the threshold.

Sibilance – The sound of an “s” in a word. For example, sit, stay, masks, etc. Usually problematic in vocal recording, as microphones pick up the sibilance sounds more than the rest of the frequency spectrum. One of the most popular ways to fix sibilance is a de-esser.

Threshold – A parameter of a sound processor that tells the processor to not kick in until the volume of an incoming sound exceeds the set volume limit. For example, a compressor does not start to turn down audio until the instrument gets louder than the threshold set by the user.

Transient – The very beginning section of a sound. Also known as the sound’s attack. It’s the loudest and most percussive part of the sound.

 

 

Distortion

Clipping (or Peaking) – Another word for distorting. “Clipping” is usually used when a channel on a DAW or mixing board has too much volume being sent into it. In general, you want to give a channel enough headroom so that clipping doesn’t occur.

Distortion – The result of a sound source overloading an amplifier or sound processor. Basically, new frequencies are added where there were none before. This can be pleasing or very harsh. The nature of the distortion depends on the equipment that is being distorted.

Fuzz – A specific type of distortion that cuts the tops off of waveforms to produce a particular sound. Fuzz sounds exactly like its name – fuzzy. Mostly used with electric guitars.

Gain – This is a synonym for volume, though it’s often used as another word for distortion.

Overdrive (or Drive) – Usually refers to the distortion that occurs when an amplifier is overloaded. Commonly used to describe guitar amp distortion. Considered to be “creamier” than the harshness of digital distortion.

Saturation – Usually refers to the distortion that occurs when a piece of analog equipment is overloaded by a sound passing through it. Though overloading digital equipment tends to produce harsh sounds, saturation can make a sound “fat,” “round,” or “smooth.” Saturation is one of the most sought-after parts of analog equipment.

 

Reverb/Delay

Ambience – Background noise added to a musical recording to give the impression that it was recorded live. Often done using short room reverbs.

Decay – How fast a sound fades from a certain loudness.

Delay (or echo) – A processor that creates copies of a sound source that repeat over and over, fading slowly. Commonly used with vocals and electric guitar.

Feedback – When a signal is sent through an amplifier and into a microphone, which picks up the sound and sends it back through the amplifier, and so on. The loop of sound creates high pitched whines. Also refers to the parameter on a delay that adds more repetitions of the sound.
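Here is a bare-bones sketch of how a delay and its feedback parameter work together. It assumes the delay time is given in samples and the feedback is a number between 0 and 1. The function name and the "tail" argument (extra silence so the repeats have room to ring out) are made up for this example.

```python
def feedback_delay(samples, delay_samples, feedback, tail=0):
    """Add repeating, fading copies of a sound to itself."""
    out = list(samples) + [0.0] * tail
    for i in range(delay_samples, len(out)):
        # Each output sample also hears a quieter copy of the output
        # from one delay time earlier, so repeats keep feeding repeats.
        out[i] += feedback * out[i - delay_samples]
    return out

# A single click, repeated every 2 samples at half the volume each time.
print(feedback_delay([1.0], delay_samples=2, feedback=0.5, tail=6))
```

Turning the feedback up makes each repeat fade more slowly, which is why it sounds like more repetitions.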

Ping-Pong – A delay that alternates between the left and right speakers.

Predelay – A short delay between a sound and when an effect begins. Usually measured in milliseconds (ms). For example, a 50ms reverb pre-delay means that there is 50ms between the actual sound and when the reverberated sound starts.

Reverb – The sound of a room after a sound has been produced inside it. If more reverb is desired, it can be added to a recording digitally via a reverb plugin.

Slapback – A quick delay (30-200ms) with little to no repetitions.

 

Modulation Effects

Chorus – A sound processor that makes a sound seem doubled by creating several delayed copies of the original sound and slightly varying the pitch of each copy. Used to “thicken” a sound.

Decay – How fast a sound fades from a certain loudness.

Flanger – Uses the same process as a chorus, but with dramatically short delays. Rather than “thickening” a sound, a flanger is usually less subtle. It’s been described as sounding “like an airplane flying right over your head.”

Phaser – A sound processor that removes certain random frequencies by creating a copy of the soundwave and moving it back and forth, causing a “phasing” sound.

Pitch shifter – A sound processor that changes the pitch of a sound.

Sustain – How long a sound can hold before it begins to fade.

Threshold – A parameter of a sound processor that tells the processor to not kick in until the volume of an incoming sound exceeds the set volume limit. For example, a compressor does not start to turn down audio until the instrument gets louder than the threshold set by the user.

Tremolo – A sound processor that either quickly turns the volume of a sound up and down, or quickly pans it left to right.

 

 

Acoustics

Acoustic treatment – Panels made of fiberglass (among other things) that are hung from walls in order to deaden room reflections and balance the frequency response of a room. Treatment is very important when recording or when mixing using speakers.

Attack – This refers to 1) the very beginning of a sound, and 2) the amount of time it takes after a sound begins for a sound processor to begin working. Usually measured in milliseconds (ms).

Bandwidth – The amount of space on the frequency spectrum that the sounds of an instrument are being produced at. For example, an average electric guitar has a bandwidth of 80Hz-5kHz, as the instrument cannot produce sounds above or below those frequencies.

Decay – How fast a sound fades from a certain loudness.

Decibel (or dB) – The main unit of volume measurement. A dB is relative, as there are several different “scales” of dB that are used in audio (dB-FS being the most common, along with dB-VU, dB-RMS, and dB-LUFS). Each dB scale has a certain function in audio.
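Because a dB is relative, it really describes a ratio between two levels. Here is a small sketch of the standard conversion for signal amplitude. The function names are hypothetical.

```python
import math

def db_to_amplitude_ratio(db: float) -> float:
    """How many times bigger (or smaller) the signal gets for a change in dB."""
    return 10 ** (db / 20)

def amplitude_ratio_to_db(ratio: float) -> float:
    """The change in dB for a given change in signal amplitude."""
    return 20 * math.log10(ratio)

print(f"-6dB is about {db_to_amplitude_ratio(-6):.2f}x the amplitude")
print(f"Doubling the amplitude adds about {amplitude_ratio_to_db(2):.1f} dB")
```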

Dynamics (or dynamic range) – The loud and soft points of a sound over time. The higher the range, the more difference there is between the loudest point and the softest point.

Feedback – When a signal is sent through an amplifier and into a microphone, which picks up the sound and sends it back through the amplifier, and so on. The loop of sound creates high pitched whines. Also refers to the parameter on a delay that adds more repetitions of the sound.

Flat – A word used to describe a piece of gear that has no coloration to the sound; what comes in is what comes out. Most digital gear has a flat response, whereas most analog gear does not.

Fundamental – When a sound is produced by an instrument, a series of harmonics are created that determine the tone of that sound. The lowest (and loudest) of those frequencies is the fundamental. It is the primary harmonic of that sound.

Hertz (or Hz) – The unit of measurement for frequencies. After 1,000Hz, the unit is measured in Kilohertz (or kHz).

Kilohertz (or kHz) – 1000x the unit of measurement for frequencies. 1 kHz = 1,000 Hz.

Masking – The phenomenon when one’s perception of one sound is affected by the presence of another sound. Basically, if two sounds are present in the same frequency range, then it will be harder to distinguish between the two. You want to avoid masking in order to get your instruments to sit well in the mix.

Overtone – When a sound is produced by an instrument, a series of harmonics are created that determine the tone of that sound. All of the harmonics that aren’t the lowest (known as the fundamental) are known as overtones.

Phase – The nature of the location of two similar waveforms in relation to each other. If two similar waveforms are “in-phase,” then the peaks and troughs of the waves are lined up with each other. If the waveforms are “out-of-phase,” then the peaks are in line with the troughs. This causes low and low-mid frequencies to get lost. Ultimately, out-of-phase waveforms sound bad.

Plosives – Sounds made from the mouth that blow quick bursts of air. Common examples are words with p’s, b’s, t’s, k’s, and d’s.

Polarity – The “direction” of a waveform. When you “flip the polarity” of a waveform, it turns the waveform upside down. Basically, the peaks are where the troughs once were, and vice versa. Polarity buttons (sometimes called phase buttons) are common on audio interfaces to keep stereo inputs in phase with each other.

Room resonances (or Standing waves) – Every room has frequencies that build up more than others. These frequencies can mask the pleasant elements of a sound. By finding these frequency build-ups and cutting them using an EQ, we can improve the sound of a recording.

Sine Wave – A perfect soundwave with no harmonics or overtones. These cannot be produced in nature, but are the basis for many synthesizers and effects.

Timbre – Another word for tone.

Transient – The very beginning section of a sound. Also known as the sound’s attack. It’s the loudest and most percussive part of the sound.

Waveform – The shape of a sound wave.

Wavelength – How long a wave is. The shorter the wavelength, the higher the frequency of the wave.
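Here is a quick sketch of how wavelength and frequency relate. It assumes sound traveling through air at roughly room temperature, where the speed of sound is about 343 meters per second. The names are made up for this example.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate, in air at around 20 degrees C

def wavelength_m(frequency_hz: float) -> float:
    """Length of one cycle of a sound wave, in meters."""
    return SPEED_OF_SOUND_M_PER_S / frequency_hz

# Low notes have long waves, high notes have short ones.
for freq in (100, 1000, 10000):
    print(f"{freq} Hz: {wavelength_m(freq):.3f} m")
```

This is part of why low frequencies are harder to tame with acoustic treatment: a 100Hz wave is several meters long.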

 

Gear

Amplifier (or Amp) – A device that increases the volume of a signal. For instance, a guitar amp increases the volume of the signal picked up from the electric guitar, which is very quiet on its own.

Analog – Technology that does not use digital components. Often used to describe audio technology from before the mid-80’s. Analog gear tends to be sought after for the way that it “colors” sound.

Bit depth – The number of bits used to store each sample of digital audio. The higher the bit depth, the more accurately the level of each sample is captured and the lower the noise floor. For instance, a recording session running at 24 bits captures audio more accurately than one running at 16 bits.

Console (or Mixing desk, Mixer, or Sound board) – A device for recording, mixing, or live sound purposes that amplifies, balances, processes, and combines sounds. Basically, it’s the studio version of a DAW.

DAW – Digital Audio Workstation. The software that you record, edit, mix, and master in. Popular versions are Pro Tools, Logic Pro, GarageBand, Ableton Live, etc.

Fader – The part of the channel that controls the volume. Faders are always a straight line, in contrast to knobs/pots, which are always circular.

Flat – A word used to describe a piece of gear that has no coloration to the sound; what comes in is what comes out. Most digital gear has a flat response, whereas most analog gear does not.

Makeup gain – A parameter that allows you to increase the output volume of a sound processor that made the input sound quieter. For example, a compressor makes sounds softer, so makeup gain is needed to keep the sound at the same volume that it previously was.

Meter – A piece of software or hardware that analyzes certain data and visually shows you the results. For example, anything that shows the volume of a sound is a volume meter.

MIDI – Generally refers to the notes and other data recorded when using software instruments. It’s also used to refer to the software instruments themselves. For example, a software piano is also known as a MIDI piano, and the notes it records in your DAW are known as MIDI notes.

Monitors – This refers to 1) speakers that are used for mixing, or 2) the screen of a computer.

Pad – Something that can quickly reduce the input volume of a piece of hardware. Commonly found on microphones and preamps.

Phantom power – +48 Volts. The power necessary to get a condenser mic to work. Most audio interfaces have a button that sends this power to a microphone that needs it.

Plug-in – A piece of software used within a DAW that processes the sound of a recording.

Processors (or Sound processors) – Any hardware or software that changes the pitch, speed, loudness, or tone of a sound.

Quarter Inch – Cables with a 1/4-inch plug, which come in TS (unbalanced) and TRS (balanced or stereo) versions. TS cables are commonly used for instruments like guitars and basses. Thicker versions of this cable are used for speakers.

XLR – A cable with three prongs that is used by microphones.

 

Misc.

Audio Engineer (or Sound Engineer) – Someone who records, edits, mixes, or masters audio. Usually works in a studio or live concert setting.

Bounce – Another word for export. If you are “bouncing a track,” that means you’re just exporting a session into a listenable format, like an mp3 or wav file.

BPM – Beats Per Minute. It’s the tempo of the song.
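One handy use of BPM is working out how long a beat lasts, which helps when you want to set a delay time by hand so the repeats land in time with the song. A minimal sketch, with a hypothetical function name:

```python
def ms_per_beat(bpm: float) -> float:
    """Length of one beat in milliseconds at the given tempo."""
    return 60000 / bpm

print(ms_per_beat(120))      # one beat at 120 BPM
print(ms_per_beat(120) / 2)  # half a beat at the same tempo
```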

Listener fatigue – The natural degradation of the accuracy of the human ear over several hours of listening. The ear is like a muscle – when it is used a lot, it gets tired. When a mixer reaches the point of listener fatigue, he or she needs to rest their ears, or they will start to make poor mixing choices as their ears are no longer accurate.

Looping – Repeating a section of a song over and over again.

Mute – An action that stops the sound of a channel from playing.

Sample – This refers to 1) a short section of music taken from one recording and repurposed in another, or 2) the smallest unit of measurement in digital sound.

Sample rate (or resolution) – A setting that determines how many times per second audio is measured (sampled) as it is recorded into a DAW. The higher the sample rate, the more accurately the audio is encoded, especially in the high frequencies. However, it also leads to larger file sizes for the audio files.
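To see how the sample rate affects file size, here is a rough sketch for uncompressed audio (like a wav file). It ignores the small file header, and the function name and example settings are just for illustration.

```python
def uncompressed_size_bytes(sample_rate: int, bit_depth: int,
                            channels: int, seconds: float) -> float:
    """Approximate size of uncompressed audio, ignoring the file header."""
    bytes_per_sample = bit_depth / 8
    return sample_rate * bytes_per_sample * channels * seconds

# One minute of stereo audio at two common settings.
cd_quality = uncompressed_size_bytes(44100, 16, 2, 60)
high_res = uncompressed_size_bytes(96000, 24, 2, 60)
print(f"44.1kHz / 16-bit: {cd_quality / 1_000_000:.1f} MB")
print(f"96kHz / 24-bit: {high_res / 1_000_000:.1f} MB")
```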

Signal flow – Where a signal travels from the input of a system to the output. For example, the average signal flow of a sound would be the microphone, then the audio interface, then the DAW, then the performer’s headphones.

Solo – An action that temporarily mutes all sounds other than the one currently selected. Only the soloed sound is heard.

 

Don’t forget to grab the free PDF version of this guide to use for your reference. It’s an incredibly powerful tool to have while you’re learning to make pro-level music!

Songwriter and producer. Writer at Musician on a Mission. I’m here to help people make music that lasts.
