{"id":3105,"date":"2017-09-18T00:00:19","date_gmt":"2017-09-18T07:00:19","guid":{"rendered":"http:\/\/192.168.3.4\/?p=3105"},"modified":"2025-09-12T04:59:06","modified_gmt":"2025-09-12T11:59:06","slug":"spectrograms-with-ffmpeg","status":"publish","type":"post","link":"https:\/\/www.cloudacm.com\/?p=3105","title":{"rendered":"Spectrograms with FFMpeg"},"content":{"rendered":"<p>In this post I&#8217;ll be covering how to create image files that represent the sound levels and frequencies of a media file.\u00a0 These images are known as spectrograms.\u00a0 They provide a way to visually locate moments in time.\u00a0 This can be useful for a number of reasons.<\/p>\n<p>FFMpeg has a feature that lets us create spectrograms with the showspectrumpic filter.\u00a0 Details about this filter can be found here, <a href=\"https:\/\/ffmpeg.org\/ffmpeg-filters.html#showspectrumpic\">https:\/\/ffmpeg.org\/ffmpeg-filters.html#showspectrumpic<\/a>.\u00a0 You can also get information about the filter by typing in this command.<\/p>\n<pre>ffmpeg -h filter=showspectrumpic<\/pre>\n<p>There are a few options for the filter I would like to point out.\u00a0 These are Size, Gain, Color, and Orientation.\u00a0 There are other options as well, but those are advanced concepts beyond the scope of this post; I may cover them in another post.<\/p>\n<p>The Size filter option lets you choose the dimensions of the spectrogram image.\u00a0 One thing to keep in mind is that the spectrogram function creates a padded border around the actual spectrogram.\u00a0 It follows a consistent pattern of 116 pixels on each side and 64 pixels above and below the spectrogram.\u00a0 This will be worth noting when I demonstrate some advanced uses of this topic.<\/p>\n<p>Let&#8217;s create a spectrogram with the default options.\u00a0 Use this command, replacing the input and output file names to suit your needs.<\/p>\n<pre>ffmpeg -i audio-in.wav -lavfi showspectrumpic image-out.png<\/pre>\n<p>This should create an image file fairly 
quickly with the default dimensions of 4328 x 2176.\u00a0 The actual spectrogram is 4096 x 2048 if we remove the scale padding that borders the spectrogram.\u00a0 The filter help output states this as (default &#8220;4096&#215;2048&#8221;).\u00a0 For my purposes, I prefer a smaller size so I can read the scale information.\u00a0 I use this command with my options to get my intended results.<\/p>\n<pre>ffmpeg -i audio-in.wav -lavfi showspectrumpic=s=960x540 image-out.png<\/pre>\n<p><a href=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_scale.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3116 size-large\" src=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_scale-1024x574.png\" alt=\"\" width=\"640\" height=\"359\" srcset=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_scale-1024x574.png 1024w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_scale-300x168.png 300w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_scale-768x430.png 768w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_scale-482x270.png 482w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_scale.png 1192w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>Now I can read the time and frequency scales more easily.\u00a0 Change the size around and try different options.<\/p>\n<p>Next I&#8217;m going to cover Orientation.\u00a0 You would use this option only when you want the frequency scale to run horizontally.\u00a0 It runs vertically by default.\u00a0 Use this command to change the orientation.<\/p>\n<pre>ffmpeg -i audio-in.wav -lavfi showspectrumpic=s=960x540:orientation=1 image-out.png<\/pre>\n<p><a href=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_orient.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3117 
size-large\" src=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_orient-1024x574.png\" alt=\"\" width=\"640\" height=\"359\" srcset=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_orient-1024x574.png 1024w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_orient-300x168.png 300w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_orient-768x430.png 768w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_orient-482x270.png 482w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_orient.png 1192w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>The start of the sound is at the top of the image, and the end of the file is at the bottom.\u00a0 You can also see a much higher range of 20 kHz in the frequency scale when we change the orientation.\u00a0 It appears to top out at 12 kHz on the default orientation, something to note.<\/p>\n<p>Now let&#8217;s change the color of our spectrogram that reflects sound levels at different frequencies.\u00a0 There are 9 options available; I won&#8217;t demonstrate them all here.\u00a0 So far we have used the default &#8220;intensity&#8221; color setting.\u00a0 Here is the command to change it to the &#8220;fiery&#8221; color setting.<\/p>\n<pre>ffmpeg -i audio-in.wav -lavfi showspectrumpic=s=960x540:color=6 image-out.png<\/pre>\n<p><a href=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_color.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3119 size-large\" src=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_color-1024x574.png\" alt=\"\" width=\"640\" height=\"359\" srcset=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_color-1024x574.png 1024w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_color-300x168.png 300w, 
https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_color-768x430.png 768w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_color-482x270.png 482w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_color.png 1192w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>Some of the details become clearer when color scales are changed.\u00a0 Try different color scales to see how your sound information changes in the spectrogram.\u00a0 Some higher or lower frequencies become easier to see.<\/p>\n<p>The last filter option I would like to cover is gain.\u00a0 The gain scale has a default value of 1 and can range from 0 to 128 with floating point values.\u00a0 Setting a gain of less than 1 decreases the sound level, whereas setting it higher increases it.\u00a0 Here are two commands to set higher and lower gain levels.<\/p>\n<pre>ffmpeg -i audio-in.wav -lavfi showspectrumpic=s=960x540:gain=5 image-out.png<\/pre>\n<p><a href=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3121 size-large\" src=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain5-1024x574.png\" alt=\"\" width=\"640\" height=\"359\" srcset=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain5-1024x574.png 1024w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain5-300x168.png 300w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain5-768x430.png 768w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain5-482x270.png 482w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain5.png 1192w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<pre>ffmpeg -i audio-in.wav -lavfi showspectrumpic=s=960x540:gain=.5 image-out.png<\/pre>\n<p><a 
href=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain.5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3122 size-large\" src=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain.5-1024x574.png\" alt=\"\" width=\"640\" height=\"359\" srcset=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain.5-1024x574.png 1024w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain.5-300x168.png 300w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain.5-768x430.png 768w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain.5-482x270.png 482w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram_gain.5.png 1192w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>This is just another method to draw attention to sound levels that would normally be hard to spot with the default gain values.<\/p>\n<p>There are other filter options, such as window function, which I won&#8217;t cover here.\u00a0 You can see that there are many possibilities available for rendering spectrographic images of sound media using FFMpeg.<\/p>\n<p>I would also like to demonstrate another tool called Sox that does a similar, but more limited, rendering of spectrograms.\u00a0 Install Sox, if you haven&#8217;t already, with this command.<\/p>\n<pre>sudo apt-get install sox<\/pre>\n<p>After the install, you can create a spectrogram using this command.<\/p>\n<pre>sox audio-in.wav -n spectrogram<\/pre>\n<p><a href=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3123 size-full\" src=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram.png\" alt=\"\" width=\"944\" height=\"593\" srcset=\"https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram.png 944w, 
https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram-300x188.png 300w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram-768x482.png 768w, https:\/\/www.cloudacm.com\/wp-content\/uploads\/2017\/09\/spectrogram-430x270.png 430w\" sizes=\"auto, (max-width: 944px) 100vw, 944px\" \/><\/a><\/p>\n<p>This will create an image file in the working directory where you ran the command.\u00a0 There aren&#8217;t as many options available to Sox as there are with FFMpeg, but you get the idea on some of the basic uses.<\/p>\n<p>Spectrograms are interesting ways to represent sound data.\u00a0 There is much more that can be done with spectrograms than I&#8217;ve covered here.\u00a0 I hope you have found this introduction useful.\u00a0 In the coming weeks I&#8217;ll be covering more audio-specific topics and we&#8217;ll return to spectrograms and their uses.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this post I&#8217;ll be covering how to create image files that represent the sound levels and frequencies of a media file.\u00a0 These images are known as spectrograms.\u00a0 They provide a way to visually locate moments in time.\u00a0 This can be useful for a number of reasons. 
FFMpeg has a feature that lets us create spectrograms with the showspectrumpic filter.\u00a0 Details about this filter can be found here, https:\/\/ffmpeg.org\/ffmpeg-filters.html#showspectrumpic.\u00a0 You can also get information about the filter by typing in&#8230;<\/p>\n<p class=\"read-more\"><a class=\"btn btn-default\" href=\"https:\/\/www.cloudacm.com\/?p=3105\"> Read More<span class=\"screen-reader-text\">  Read More<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9,6,3],"tags":[],"class_list":["post-3105","post","type-post","status-publish","format-standard","hentry","category-computer-vision","category-raspberry-pi","category-rd"],"_links":{"self":[{"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=\/wp\/v2\/posts\/3105","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3105"}],"version-history":[{"count":14,"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=\/wp\/v2\/posts\/3105\/revisions"}],"predecessor-version":[{"id":4993,"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=\/wp\/v2\/posts\/3105\/revisions\/4993"}],"wp:attachment":[{"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3105"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3105"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.cloudacm.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3105"}],"curies
":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}