pydub Tutorial: Audio Manipulation in Python

In this comprehensive tutorial, we will explore the powerful pydub library, a Python package that simplifies the process of working with audio files. Whether you are a music enthusiast, a data scientist, or a developer looking to integrate audio processing into your applications, pydub has got you covered.

Getting Started with Pydub

Before we dive into the wonders of Pydub, let’s make sure you have it installed. Open your terminal or command prompt and type:

pip install pydub

Now that Pydub is installed, let’s import it into our Python script or Jupyter notebook:

from pydub import AudioSegment

Setup ffmpeg

If you’re working with audio files that use certain codecs (e.g. mp3), you might need to install additional dependencies, such as ffmpeg. Without this, there is a good chance you will not be able to even load any audio files, much less play them.

To install this dependency, visit the official ffmpeg website and download the build for your operating system. Unzip the folder, and you will find a bin folder containing several executables (.exe files on Windows). Copy the path to this bin folder (the path should include the bin folder itself).

You now have two options.

Add this path to your PATH environment variable.
Set the location in your code as an attribute on the AudioSegment class (imported in the previous step). In this case, point to the ffmpeg executable itself (ffmpeg.exe on Windows) rather than to the bin folder.

AudioSegment.converter = "/absolute/path/to/ffmpeg/bin/ffmpeg"
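
Before loading any files, it can help to confirm that ffmpeg is actually reachable. The following is a minimal sketch using only the standard library. Note that find_converter is a hypothetical helper name, not part of pydub, and it only checks the PATH environment variable.

```python
import shutil


def find_converter(name="ffmpeg"):
    """Return the full path of an executable found on PATH, or None.

    Hypothetical helper for illustration; pydub does not provide it.
    """
    return shutil.which(name)


location = find_converter()
if location is None:
    print("ffmpeg was not found on PATH; set its location in code instead")
else:
    print(f"ffmpeg found at: {location}")
```

If this reports that ffmpeg was not found, the second option above (setting the location in code) is the one to use.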

Loading and Playing Audio

The first step in audio manipulation is loading a sound file, and being able to play it. In order to load an audio file using pydub, we will use the AudioSegment class we imported in the previous step.

audio = AudioSegment.from_file("countdown.mp3", format="mp3")

Playing audio is a little trickier, as pydub is an audio manipulation library, not an audio player. Without a playback library installed, its play() function falls back to writing the audio to a temporary .wav file and playing that file through ffplay, which ships with ffmpeg. We have two options:

Rely on the .wav fallback.
Install a library such as simpleaudio or PyAudio, which are designed specifically for audio playback.

The second option is recommended, because the first commonly runs into issues on Windows systems. All you have to do is install either simpleaudio or PyAudio, and pydub will use it internally to play the audio. No changes to your code are needed.

pip install pyAudio
pip install simpleaudio
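
If you are unsure which playback library ended up installed, a quick check along these lines can tell you. This is a sketch under the assumption that the packages import as simpleaudio and pyaudio. The available_backends function is a hypothetical helper, not a pydub function.

```python
import importlib.util


def available_backends(candidates=("simpleaudio", "pyaudio")):
    """Return the candidate module names that can be imported here.

    The default names are assumptions about how the packages import.
    """
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


print("Playback libraries found:", available_backends() or "none")
```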

Once you have installed either one of the two libraries, run the following:

from pydub import AudioSegment
from pydub.playback import play

# Load an audio file
audio = AudioSegment.from_file("countdown.mp3", format="mp3")

# Play the audio
play(audio)

If you want an audio file of your own to practice on and follow along with this tutorial, here is a download link to the one that we are using:

Sample Audio: countdown.mp3 (410.29 KB)

Slicing, Concatenating, and Repeating

Pydub makes it easy to slice audio segments using Python's slice notation and to concatenate them with the + operator. Positions are measured in milliseconds, allowing you to be very specific about where you slice.

# Slice the audio from 1 to 3 seconds
audio1 = audio[1000:3000]

# Slice the audio from 6 to 9 seconds
audio2 = audio[6000:9000]

# Concatenate two audio segments
combined = audio1 + audio2

Every operation you perform returns a new audio segment, which means that the “segments” you sliced are also AudioSegment objects. It is also worth noting that AudioSegment objects are immutable: applying an operation never modifies the existing object.
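
Because positions are given in milliseconds, it may help to see how they relate to the underlying audio frames. The arithmetic below is an illustrative sketch, not pydub's internal code. The 44100 Hz frame rate and the helper names are assumptions.

```python
def ms_to_frame(ms, frame_rate):
    """Convert a position in milliseconds to a frame index."""
    return int(ms * frame_rate / 1000)


def slice_bounds(start_ms, end_ms, frame_rate=44100):
    """Return (first_frame, end_frame, frame_count) for a millisecond slice."""
    first = ms_to_frame(start_ms, frame_rate)
    end = ms_to_frame(end_ms, frame_rate)
    return first, end, end - first


# The slice audio[1000:3000] at an assumed 44100 Hz frame rate
print(slice_bounds(1000, 3000))  # (44100, 132300, 88200)
```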

You can check the duration of an audio segment by accessing the duration_seconds attribute.

combined.duration_seconds

You can also repeat the audio clip by using the multiplication operator:

# repeat the clip twice
repeated_audio = audio * 2

Applying Effects

Add creative effects to your audio with Pydub’s built-in effects:

# Apply a fade-in over the first 5 seconds
faded_in = audio.fade_in(5000)

# Apply a fade-out over the last 5 seconds
faded_out = audio.fade_out(5000)

# Increase the volume by 10 dB
louder = audio + 10

# Lower the volume by 10 dB
quieter = audio - 10
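
Decibels are logarithmic, so a 10 dB change does not add a fixed amount to the signal. As a rough illustration (plain math rather than pydub code, with assumed function names), a gain in dB corresponds to an amplitude ratio of 10^(dB/20):

```python
import math


def db_to_amplitude_ratio(db):
    """Amplitude multiplier that corresponds to a gain in decibels."""
    return 10 ** (db / 20)


def amplitude_ratio_to_db(ratio):
    """Gain in decibels that corresponds to an amplitude multiplier."""
    return 20 * math.log10(ratio)


print(round(db_to_amplitude_ratio(10), 3))   # 3.162
print(round(db_to_amplitude_ratio(-10), 3))  # 0.316
print(round(amplitude_ratio_to_db(2), 2))    # 6.02
```

In other words, +10 dB scales the sample amplitudes by roughly 3.16, and doubling the amplitude is a gain of about 6 dB.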

You can get the loudness of an audio clip through the dBFS property, which reports decibels relative to the maximum possible loudness (0 dBFS). Quieter audio therefore has a more negative value.

loudness = audio.dBFS

You can also check the loudness at a specific point. Indexing with a single number returns a one-millisecond slice starting at that position:

loudness = audio[1000].dBFS

This returns -infinity if that slice is completely silent.
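
To make the dBFS figure more concrete, here is a sketch of how such a value can be computed from raw samples: the RMS level divided by the largest possible amplitude, expressed in decibels. It assumes signed integer samples and a sample width given in bytes. It is an approximation for illustration and may differ in detail from pydub's own calculation.

```python
import math


def dbfs(samples, sample_width=2):
    """Approximate loudness of signed integer samples in dBFS.

    Illustrative only; sample_width is in bytes (2 bytes = 16 bits).
    """
    max_amplitude = 2 ** (8 * sample_width - 1)
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")  # complete silence
    return 20 * math.log10(rms / max_amplitude)


print(dbfs([0, 0, 0, 0]))                     # -inf
print(round(dbfs([16384, -16384] * 100), 2))  # -6.02
```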

Changing Audio Parameters

You can adjust various parameters of an audio file, such as the sample rate, channels, and bit depth:

# Change the sample rate to 44100 Hz
audio = audio.set_frame_rate(44100)

# Set the number of channels to mono
audio = audio.set_channels(1)

# Set the bit depth to 16 bits (the sample width is given in bytes, so 2)
audio = audio.set_sample_width(2)

You can also get the current frame rate (sample rate) of an audio clip using the following property:

audio.frame_rate
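
To see what these three parameters mean in an actual file, the sketch below writes a short silent WAV into memory with Python's standard wave module and reads the parameters back. It does not use pydub, and the helper names are assumptions made for this example.

```python
import io
import wave


def make_silent_wav(duration_ms, frame_rate=44100, channels=1, sample_width=2):
    """Build an in-memory WAV file of silence (zero bytes, 16-bit by default)."""
    n_frames = duration_ms * frame_rate // 1000
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # in bytes: 2 bytes = 16 bits
        wav.setframerate(frame_rate)
        wav.writeframes(b"\x00" * (n_frames * channels * sample_width))
    return buffer.getvalue()


def describe_wav(data):
    """Return (channels, sample_width_bytes, frame_rate, n_frames)."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return (wav.getnchannels(), wav.getsampwidth(),
                wav.getframerate(), wav.getnframes())


print(describe_wav(make_silent_wav(1000)))  # (1, 2, 44100, 44100)
```

One second of mono, 16-bit audio at 44100 Hz therefore holds 44100 frames of 2 bytes each.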

Filtering Audio Frequencies

Filtering allows you to manipulate the frequency content of your audio:

# Apply a low-pass filter with a cutoff frequency of 1000 Hz
low_pass = audio.low_pass_filter(1000)

# Apply a high-pass filter with a cutoff frequency of 1000 Hz
high_pass = audio.high_pass_filter(1000)
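
For intuition about what a low-pass filter does, the sketch below implements a simple first-order (RC-style) low-pass on a plain list of samples. This is a generic textbook filter written for illustration. Whether it matches pydub's internal implementation exactly is not something this tutorial establishes, and the function name is an assumption.

```python
import math


def one_pole_low_pass(samples, cutoff_hz, frame_rate):
    """Smooth a list of samples with a first-order low-pass filter."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / frame_rate
    alpha = dt / (rc + dt)  # closer to 0 means heavier smoothing
    output = []
    previous = samples[0] if samples else 0.0
    for value in samples:
        # Move only part of the way toward each new sample
        previous = previous + alpha * (value - previous)
        output.append(previous)
    return output


# A signal flipping between +1 and -1 on every frame is as high-pitched as it gets
rapid = [1.0 if i % 2 == 0 else -1.0 for i in range(200)]
smoothed = one_pole_low_pass(rapid, cutoff_hz=1000, frame_rate=44100)
print(max(abs(v) for v in smoothed[-50:]) < 0.1)  # True
```

A constant input passes through unchanged, while the rapidly alternating input is reduced to a small fraction of its original amplitude.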

Exporting Audio

After making changes to your audio, you might want to save the result. Pydub makes exporting audio just as simple:

# Export the modified audio to wav format
audio.export("output.wav", format="wav")

# Export the modified audio to FLAC format
audio.export("output.flac", format="flac")
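
When exporting many files, you may want to derive the format argument from the file name rather than typing it twice. The helper below is a hypothetical convenience, not part of pydub. It assumes the file extension matches the format name, which holds for wav and flac as used above but is not guaranteed for every format.

```python
from pathlib import Path


def format_from_path(path, default="wav"):
    """Guess an export format from a file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower().lstrip(".")
    return suffix or default


print(format_from_path("output.flac"))  # flac
print(format_from_path("mix.WAV"))      # wav
```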

Overlaying Audio with Pydub

Pydub provides a method called overlay() that combines two audio segments so that they play simultaneously. This method is available on any audio segment object and takes another audio segment as the one to be overlaid. The result always has the same length as the segment you call overlay() on, so if the overlaid segment is longer, its excess is cut off at the end.

from pydub import AudioSegment

# Load two audio files
audio1 = AudioSegment.from_file("audio1.wav")
audio2 = AudioSegment.from_file("audio2.wav")

# Overlay sounds to play simultaneously
combined_audio = audio1.overlay(audio2)

You can loop the overlaid sound to extend its duration. If audio2 is shorter than audio1, it will be repeated until audio1 ends:

# Repeat audio2 until audio1 ends
new_audio = audio1.overlay(audio2, loop=True)
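
The length and looping rules described above can be sketched on plain lists of samples. This is a simplified model for illustration, not pydub's implementation. The function name and the 16-bit clipping limits are assumptions.

```python
def overlay_samples(base, other, loop=False, max_value=32767, min_value=-32768):
    """Mix `other` onto `base`; the result always has the length of `base`."""
    mixed = list(base)  # copy, so the input is left untouched
    if not other:
        return mixed
    for i in range(len(base)):
        j = i % len(other) if loop else i
        if j >= len(other):
            break  # the overlay ran out and looping is off
        total = base[i] + other[j]
        mixed[i] = max(min_value, min(max_value, total))  # clip to 16-bit range
    return mixed


print(overlay_samples([1, 1, 1, 1], [10, 20]))             # [11, 21, 1, 1]
print(overlay_samples([1, 1, 1, 1], [10, 20], loop=True))  # [11, 21, 11, 21]
print(overlay_samples([1, 1], [5, 5, 5]))                  # [6, 6]
```

The third call shows the cut-off behaviour: the extra sample of the longer overlay is simply dropped.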

Conclusion

Congratulations! You’ve now embarked on a journey through the exciting landscape of audio manipulation in Python with Pydub. From basic tasks like loading and playing audio to advanced manipulations such as applying effects and filtering, Pydub empowers you to explore the creative and technical aspects of audio processing.

This tutorial only scratches the surface of Pydub and its capabilities. As you continue your exploration, don’t forget to check out the official Pydub documentation for more in-depth information and examples. Now, armed with this knowledge, go forth and create amazing audio projects with Python and Pydub!
