SoX – An Introduction

SoX, short for Sound eXchange, is a command-line utility that converts audio data between formats. It can also play, record, mix, and filter audio, and apply various effects. Refer to the development website for more details. In a previous post I used it to create a spectrogram of an audio file with this command:

sox audio-in.wav -n spectrogram

Let’s start with basic playback. SoX can play audio from the command line using the play command. Here is an example:

play audio-in.wav

It will display some of the audio file information as well as progress of the playing audio file. Here is an example of the output.


File Size: 22.6M Bit Rate: 706k
Encoding: Signed PCM
Channels: 1 @ 16-bit
Samplerate: 44100Hz
Replaygain: off
Duration: 00:05:00.15

In:50.0% 00:02:30.15 (00:02:30.00) Out:11.3M[ | ] Clip:0
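As a quick sanity check on that output, the Bit Rate figure can be reproduced from the other fields: for uncompressed PCM it is the sample rate times the bits per sample times the channel count. The sketch below is only illustrative, and `pcm_bit_rate` is a made-up helper name, not something SoX provides.

```shell
#!/bin/sh
# pcm_bit_rate is a hypothetical helper, not part of SoX.
# Uncompressed PCM bit rate = sample rate * bits per sample * channels.
pcm_bit_rate() {
    # $1 = sample rate in Hz, $2 = bits per sample, $3 = channel count
    echo $(( $1 * $2 * $3 ))
}

# Values taken from the play output above: 44100 Hz, 16-bit, 1 channel.
pcm_bit_rate 44100 16 1   # 705600 bits per second, shown by SoX as 706k
```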

This next command will record an audio clip of 5 minutes in duration.

rec audio-in.wav trim 0 5:00

This will also display some information with the progress of the recording.

Input File : 'default' (alsa)
Channels : 2
Sample Rate : 48000
Precision : 16-bit
Sample Encoding: 16-bit Signed Integer PCM

In:50.0% 00:02:30.15 (00:02:30.00) Out:11.3M[ | ] Clip:0

Wait a minute, three different commands have been used so far. Don’t worry, they are all part of SoX. Let’s save the help text of each one to a file so we can view the command options outside of the CLI.

rec --help > "/home/local/Desktop/ImageToSpectrograms/rec_help.txt"
play --help > "/home/local/Desktop/ImageToSpectrograms/play_help.txt"
sox --help > "/home/local/Desktop/ImageToSpectrograms/sox_help.txt"
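If you would rather not type the three commands by hand, the same idea can be wrapped in a small loop. This is a minimal sketch: `save_sox_help` is a hypothetical function name, and the loop assumes nothing more than that the tools, where installed, accept `--help` as shown above.

```shell
#!/bin/sh
# save_sox_help is a hypothetical helper name, not part of SoX.
# $1 = directory to write the help files into (created if missing).
save_sox_help() {
    dir="$1"
    mkdir -p "$dir" || return 1
    for tool in sox play rec; do
        if command -v "$tool" >/dev/null 2>&1; then
            # 2>&1 keeps the text even if a build prints usage on stderr.
            "$tool" --help > "$dir/${tool}_help.txt" 2>&1 || true
        else
            echo "$tool not found on PATH, skipping" >&2
        fi
    done
    return 0
}
```

For example, `save_sox_help "/home/local/Desktop/ImageToSpectrograms"` would write the same three files as the commands above.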

Let’s go back to the spectrogram function in SoX. Thanks to Kris Occhipinti’s channel, you can view his demonstration on the topic. You can also find more on Kris’s blog. Here are some of the commands used:

sox audio-in.wav -n spectrogram -o colorscale-out.png
sox audio-in.wav -n spectrogram -m -o grayscale-out.png
sox audio-in.wav -n spectrogram -m -l -o negative-out.png
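To render all three styles for any input file in one go, the commands above can be wrapped in a function. This is a rough sketch: `render_spectrograms` and the output naming scheme are my own assumptions, while the sox switches are exactly the ones shown above.

```shell
#!/bin/sh
# render_spectrograms is a hypothetical helper name, not part of SoX.
# $1 = input audio file; output names are derived from it (an assumption).
render_spectrograms() {
    in="$1"
    base="${in%.*}"    # strip the file extension, e.g. clip.wav -> clip
    sox "$in" -n spectrogram       -o "${base}-colorscale.png"   # default colours
    sox "$in" -n spectrogram -m    -o "${base}-grayscale.png"    # monochrome
    sox "$in" -n spectrogram -m -l -o "${base}-negative.png"     # monochrome, light background
}
```

Calling `render_spectrograms audio-in.wav` would then produce audio-in-colorscale.png, audio-in-grayscale.png, and audio-in-negative.png next to the input file.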

The command switches for the spectrogram filter are buried in the help file. I simply entered a deliberately malformed command to get SoX to display them:

sox audio-in.wav -n spectrogram -o colorscale-out.png#

Along with the error message, I got these tips from the command.

-x num   X-axis size in pixels; default derived or 800
-X num   X-axis pixels/second; default derived or 100
-y num   Y-axis size in pixels (per channel); slow if not 1 + 2^n
-Y num   Y-height total (i.e. not per channel); default 550
-z num   Z-axis range in dB; default 120
-Z num   Z-axis maximum in dBFS; default 0
-q num   Z-axis quantisation (0 - 249); default 249
-w name  Window: Hann (default), Hamming, Bartlett, Rectangular, Kaiser
-W num   Window adjust parameter (-10 - 10); applies only to Kaiser
-s       Slack overlap of windows
-a       Suppress axis lines
-r       Raw spectrogram; no axes or legends
-l       Light background
-m       Monochrome
-h       High colour
-p num   Permute colours (1 - 6); default 1
-A       Alternative, inferior, fixed colour-set (for compatibility only)
-t text  Title text
-c text  Comment text
-o text  Output file name; default `spectrogram.png'
-d time  Audio duration to fit to X-axis; e.g. 1:00, 48
-S time  Start the spectrogram at the given time through the input
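Here is one way a few of these switches might be combined. The note on -y says heights other than 1 + 2^n are slow, so the sketch first checks the height with a small helper. `is_fast_height` is a hypothetical name of mine, and the title text, time values, and output file name are only examples.

```shell
#!/bin/sh
# is_fast_height is a hypothetical helper, not part of SoX.
# Succeeds when the height equals 1 + 2^n, i.e. height - 1 is a power of two.
is_fast_height() {
    n=$(( $1 - 1 ))
    [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

height=513   # 1 + 2^9

# Only run sox if it is installed and the example input file exists.
if is_fast_height "$height" && command -v sox >/dev/null 2>&1 && [ -f audio-in.wav ]; then
    # -S 1:00 starts one minute in, -d 0:30 fits 30 seconds to the X-axis.
    sox audio-in.wav -n spectrogram -y "$height" -S 1:00 -d 0:30 \
        -t "audio-in.wav, 1:00 to 1:30" -o detail-out.png
fi
```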

My last post claimed that SoX offered few options for creating spectrograms. I stand corrected.

This post serves as an introduction to SoX, so I don’t want to cover too many functions of the command here. If you can’t find the switches using the --help option, form your command in a way that will nudge SoX into printing them.
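A less roundabout route may also be available. The SoX man page documents a `--help-effect NAME` option that prints the usage text of a single effect. I have not checked it against every build, so treat the sketch below as an assumption about your installed version. `effect_help` is a hypothetical wrapper name.

```shell
#!/bin/sh
# effect_help is a hypothetical wrapper, not part of SoX.
# $1 = effect name, e.g. spectrogram
effect_help() {
    if command -v sox >/dev/null 2>&1; then
        # Assumes this build of sox supports --help-effect.
        sox --help-effect "$1" 2>&1
    else
        echo "sox not found on PATH" >&2
        return 127
    fi
}
```

With that defined, `effect_help spectrogram` should list the same switches as the malformed-command trick.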

I hope you have enjoyed this post, and I look forward to covering more on the topic. A big thank you to Kris Occhipinti; his demos and explanations are a great help. Please support Kris if you find his instruction useful.
