npr.py

Description

npr.py is a Python script to download radio programs from NPR and other sources. If a particular NPR story interests you, this script can download it and convert it to an MP3 file, so you can listen on your MP3 player. It downloads programs from all "standard" NPR shows (shows that have pages on the main NPR site). This includes Morning Edition, All Things Considered, Fresh Air, Day to Day, etc. Shows that have their own sites (like Wait Wait...Don't Tell Me! or On the Media) aren't directly supported, but there are often other means to get these shows. Many of them offer MP3 downloads already.

In addition to NPR shows, it also downloads the Diane Rehm and Kojo Nnamdi shows from WAMU, To The Best of our Knowledge from Wisconsin Public Radio, and if all else fails, it handles generic rtsp:// links fairly well.

Requirements

You need these to run npr.py.

  • Python (I use version 2.4.2, probably works with other versions)
  • pyxml (if you use Gentoo, you need to emerge pyxml)

Options

These packages are only required in some situations, but are helpful to have.

  • pyid3lib is recommended for handling ID3 tags (if you use Gentoo, you need to emerge pyid3lib). If you don't have it, the script falls back to using lame's ID3 functionality.
  • mplayer is required for downloading non-NPR streams (I use 1.0pre8-3.4.4; it probably works with other versions). You will also need the win32 codecs (essential codecs).
  • lame is required for converting non-NPR streams into MP3s.
  • normalize is useful for normalizing the volume of non-NPR streams.
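
The pyid3lib-or-lame fallback described above could be sketched roughly as follows. This is an illustration in modern Python 3, not the script's actual code: the function names load_pyid3lib and lame_tag_args are hypothetical. The --tt, --ta, --tc and --ty flags are lame's standard ID3 options for title, artist, comment and year.

```python
def load_pyid3lib():
    """Return the pyid3lib module if it is installed, otherwise None."""
    try:
        import pyid3lib  # optional dependency
    except ImportError:
        return None
    return pyid3lib


def lame_tag_args(title=None, artist=None, comment=None, year=None):
    """Build lame ID3 arguments for the fallback path (hypothetical helper)."""
    args = []
    pairs = (("--tt", title), ("--ta", artist), ("--tc", comment), ("--ty", year))
    for flag, value in pairs:
        if value is not None:
            args += [flag, str(value)]
    return args
```

A caller would check load_pyid3lib() first and only pass lame_tag_args(...) to lame when it returns None.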

I have tested npr.py under Linux on an Intel machine, and under Windows with Cygwin's Python build.

Download

Download the script here.

What it does

For non-NPR shows, this script runs mplayer to act as a RealPlayer streaming client. It simulates a user listening to a stream and saves the audio to a WAV file. The speed is limited by the server you connect to. Some servers stream their programs at listening speed, so a 50-minute show will always take 50 minutes to download. Other sites (WAMU) will stream as fast as you can process, so you can download a 50-minute show in under a minute.

After downloading a streamed show, it can be normalized, which makes the volume consistent. That way, you don't constantly need to adjust the volume on your MP3 player as you try to run. Finally, it's converted to an MP3 using lame.
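
As a rough sketch of the download, normalize, convert sequence just described, the three external commands could be assembled like this. The exact arguments npr.py passes are not documented here, so treat these as assumptions: the sketch uses mplayer's -ao pcm:file= output (older mplayer builds may spell this differently), plain normalize on the WAV file, and lame with an input and output name. The function name build_commands is hypothetical.

```python
def build_commands(stream_url, basename, normalize=True):
    """Return the command lines for one streamed show, in run order."""
    wav = basename + ".wav"
    mp3 = basename + ".mp3"
    commands = [
        # Assumed invocation: play the stream, discard video, write audio as WAV.
        ["mplayer", "-vo", "null", "-ao", "pcm:file=" + wav, stream_url],
    ]
    if normalize:
        # normalize adjusts the volume of the WAV file in place.
        commands.append(["normalize", wav])
    # lame encodes the WAV file to MP3.
    commands.append(["lame", wav, mp3])
    return commands
```

Each list could then be handed to subprocess.run(cmd, check=True) in turn.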

For NPR shows, this script simply connects to NPR's site and downloads an MP3 of the show.

The program tries to be intelligent about the ID3 tags, too. It scrapes web pages and source files for data where it can, and tries to apply as much relevant ID3 data as possible. This is helpful if you want to do something else with the MP3 files, like make a podcast for your friends.

Basic usage

Usage is simple: run the script with a single argument, for example:

npr.py http://www.npr.org/templates/story/story.php?storyId=5566231

This downloads the program and leaves you with an MP3 file; streamed (non-NPR) shows are also normalized and converted along the way. Everything is done within the current directory, so make sure you have sufficient space.

Supported options

  • -n or --no-normalize skips running normalize on the output; not only does this eliminate the need for normalize, it allows you to use a FIFO instead of an intermediate .wav file (in other words, you don't need a lot of temporary space)
  • -f or --no-fifo prevents using a FIFO for intermediate output; a FIFO is never used on systems where it isn't available (e.g. Windows)
  • -d or --no-delete prevents deleting the intermediate .wav file used during the download process
  • -v or --verbose shows detailed messages at every step
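
For illustration, the options above could be parsed along these lines. This is a sketch in modern Python 3 using argparse, not the script's own parser (which was written for Python 2.4); build_parser and can_use_fifo are hypothetical names. The FIFO rule encodes what the list says: a FIFO is only an option when normalizing is skipped, -f was not given, and the platform supports FIFOs.

```python
import argparse
import os


def build_parser():
    """Command-line parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="npr.py")
    parser.add_argument("url", help="story page or stream link to download")
    parser.add_argument("-n", "--no-normalize", action="store_true")
    parser.add_argument("-f", "--no-fifo", action="store_true")
    parser.add_argument("-d", "--no-delete", action="store_true")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


def can_use_fifo(args, fifo_available=hasattr(os, "mkfifo")):
    """True when a FIFO may replace the intermediate .wav file."""
    return bool(args.no_normalize and not args.no_fifo and fifo_available)
```

For example, parsing ["-n", url] on Linux would allow a FIFO, while adding -f or running on Windows would not.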

Show-specific notes

NPR

Here's how to download most NPR shows:

  1. Go to the show's daily summary page. Here's the link to "All Things Considered" for July 18, 2006: http://www.npr.org/templates/rundowns/rundown.php?prgId=2&prgDate=07-18-2006&view=storyview
  2. Find the story you're interested in. I'll use "Gonzales Testifies on U.S. Surveillance" as my example.
  3. Right-click the link and select Copy Link Address in your browser (or Copy Link Location, Copy Shortcut, or whatever your browser uses)
  4. In a terminal, type npr.py and paste the link as the argument, like this:
    npr.py http://www.npr.org/templates/story/story.php?storyId=5566231
  5. After a few minutes, you will have an MP3 file called 20060718_atc_16.mp3.
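
If you visit rundown pages often, the link from step 1 can be built for any date. The sketch below assumes the URL pattern in step 1 holds for other dates and programs; only prgId=2 ("All Things Considered") is confirmed by the example, and rundown_url is a hypothetical helper, not part of npr.py.

```python
from datetime import date
from urllib.parse import urlencode

RUNDOWN_BASE = "http://www.npr.org/templates/rundowns/rundown.php"


def rundown_url(program_id, air_date):
    """Build a daily rundown link; the date is formatted as MM-DD-YYYY."""
    query = urlencode([
        ("prgId", program_id),
        ("prgDate", air_date.strftime("%m-%d-%Y")),
        ("view", "storyview"),
    ])
    return RUNDOWN_BASE + "?" + query


print(rundown_url(2, date(2006, 7, 18)))
```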

Diane Rehm / Kojo Nnamdi

Here's how to download one of these shows:

  1. Go to the show's daily summary page. Here's the link to "The Diane Rehm Show" for July 4, 2006: http://www.wamu.org/programs/dr/06/07/04.php
  2. Pick the segment you're interested in (the 10:00 or the 11:00 segment), and find the "Listen to this segment" caption for that segment (on the right side of the screen). I'll use the first segment for my example.
  3. Right-click the link and select Copy Link Address in your browser (or Copy Link Location, Copy Shortcut, or whatever your browser uses)
  4. In a terminal, type npr.py and paste the link as the argument, like this:
    npr.py http://www.wamu.org/audio/dr/06/07/r1060704-11395.ram
  5. After a few minutes, you will have an MP3 file called r1060704.mp3.
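
The output name in step 5 appears to be the .ram file name with everything from the hyphen onward dropped. The sketch below reproduces that for the one example shown; the rule is inferred from that single link, and wamu_output_name is a hypothetical helper rather than the script's own function.

```python
import posixpath
from urllib.parse import urlsplit


def wamu_output_name(ram_url):
    """Guess the MP3 name for a WAMU .ram link (rule inferred from one example)."""
    filename = posixpath.basename(urlsplit(ram_url).path)
    stem = posixpath.splitext(filename)[0]
    # Keep only the part before the first hyphen, e.g. "r1060704".
    return stem.split("-", 1)[0] + ".mp3"
```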

To The Best of our Knowledge

Maybe it's just me, but I find this website extremely confusing. It was redesigned recently and seems a little better, but it's still difficult to find a particular show.

If it's a recent show (the current week or last week), do this:

  1. Go to the show page; last week's show (as I write this) is http://www.wpr.org/book/lastweek.html
  2. Look for the blue "RealPlayer" icon next to the word "Listen!"
  3. Right-click the word "Listen!" and select Copy Link Address in your browser (or Copy Link Location, Copy Shortcut, or whatever your browser uses)
  4. In a terminal, type npr.py and paste the link as the argument, like this:
    npr.py http://broadcast.uwex.edu:8080/ramgen/wpr/bok/bok060723b.rm
  5. After a few minutes, you will have an MP3 file called 20060723_ttbook_b.mp3.

Sometimes the "Listen" link for older shows doesn't work; it takes you to a generic WPR archive page. If it's an older show, do this:

  1. Go to the Archives page:
    http://www.wpr.org/book/archives.html
  2. Select the year and month you're interested in; for this example, I'll use March 2006
  3. Find the show you're interested in; for this example I'll use Hour 2 of the 03/05/2006 show
  4. Sometimes the "Listen" link works; sometimes it doesn't. I generally click the title of the show to go to the show page. In this example, click "Failure".
  5. Look for the blue "RealPlayer" icon next to the word "Listen!"
  6. Right-click the word "Listen!" and select Copy Link Address in your browser (or Copy Link Location, Copy Shortcut, or whatever your browser uses)
  7. In a terminal, type npr.py and paste the link as the argument, like this:
    npr.py http://broadcast.uwex.edu:8080/ramgen/wpr/bok/bok060305b.rm
  8. After a few minutes, you will have an MP3 file called 20060305_ttbook_b.mp3.
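
Both examples above suggest how the output name relates to the stream link: bokYYMMDDx.rm becomes 20YYMMDD_ttbook_x.mp3. A small sketch of that mapping follows. The pattern is inferred from the two links shown, and ttbook_output_name is a hypothetical name, not the script's own function.

```python
import re

# Matches e.g. "bok060723b.rm": two-digit year, month, day, then a segment letter.
_BOK_PATTERN = re.compile(r"bok(\d{2})(\d{2})(\d{2})([a-z])\.rm$")


def ttbook_output_name(stream_url):
    """Derive the MP3 name from a TTBOOK stream link (inferred pattern)."""
    match = _BOK_PATTERN.search(stream_url)
    if match is None:
        raise ValueError("not a recognized TTBOOK link: %r" % stream_url)
    yy, mm, dd, segment = match.groups()
    # Assumes shows from the 2000s, as in both examples.
    return "20%s%s%s_ttbook_%s.mp3" % (yy, mm, dd, segment)
```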

Generic RTSP

I download "Wait Wait...Don't Tell Me!" once a week, and haven't gotten around to writing code specifically for it. This technique can be used for lots of other RTSP links, too.

  1. Go to the show's page, in my example: http://www.npr.org/programs/waitwait/
  2. Click the "Listen to this week's show" link
  3. You should be sent an .smil file (unless your browser handles this automatically)
  4. Find the audio link in the .smil file. It will look like this:
    <audio src="rtsp://real.npr.org:80/real.npr.na-central/waitwait/20060722_waitwait_full.rm?v1st=D62D6EC203B0507D&mt=7" />
  5. In a terminal, type npr.py and paste the link as the argument, like this:
    npr.py rtsp://real.npr.org:80/real.npr.na-central/waitwait/20060722_waitwait_full.rm
    (Note that the query string has been removed. Incidentally, the query parameters make the .smil file invalid XML.)
  6. After a few minutes, you will have an MP3 file called 20060722_waitwait_full.mp3.

The MP3 file won't have any ID3 information filled out, but for grabbing the occasional link, this method should work well.
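
Steps 4 and 5 can be automated with a few lines of Python. Because the query parameters make the .smil file invalid XML, this sketch uses a regular expression instead of an XML parser, then drops the query string. It is an illustration only; rtsp_url_from_smil is a hypothetical helper and not something npr.py provides.

```python
import re
from urllib.parse import urlsplit

# Find the src attribute of the first <audio> element.
_AUDIO_SRC = re.compile(r'<audio\s+src="([^"]+)"')


def rtsp_url_from_smil(smil_text):
    """Return the first audio link in a .smil file, without its query string."""
    match = _AUDIO_SRC.search(smil_text)
    if match is None:
        return None
    parts = urlsplit(match.group(1))
    return parts._replace(query="", fragment="").geturl()
```

The returned link can be passed to npr.py exactly as in step 5.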

ID3

As much as possible, the script tries to put useful values into the ID3 tag. This allows the tag data to be reused by other programs, or put into a podcast.

For NPR stories, the description (taken from the story page) is put into the ID3 COMM (comment) field. Other fields (title, artist, author) are pulled from the .smil metadata; the year and date are taken from the date the story ran, and the wwwaudiofile field is set to the show's story page.

This data isn't available for all show types; it is either omitted or hardcoded for those shows.
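
To make the field mapping concrete, here is a sketch that collects the values for an NPR story under ID3v2.3 frame IDs. The frame choices are assumptions, not taken from the script: COMM is the comment frame named above, TIT2 and TPE1 are the standard title and artist frames, TYER and TDAT hold the year and the day/month (DDMM), and WOAF is the frame id3lib calls wwwaudiofile. The author field is left out because its frame is not stated. npr_id3_frames is a hypothetical helper.

```python
from datetime import date


def npr_id3_frames(title, artist, description, air_date, story_url):
    """Map story metadata to ID3v2.3 frame IDs (assumed mapping)."""
    return {
        "TIT2": title,                        # title
        "TPE1": artist,                       # artist
        "COMM": description,                  # comment: story description
        "TYER": air_date.strftime("%Y"),      # year the story ran
        "TDAT": air_date.strftime("%d%m"),    # day and month, DDMM
        "WOAF": story_url,                    # "wwwaudiofile": the story page
    }
```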

Ongoing maintenance

These shows have been known to change their layout and the way they link to streams. I listen to these shows fairly regularly and plan to keep the script up to date, but if you notice a problem with a download, please e-mail me at mpicker0@yahoo.com.

Other

Special thanks to Ed for providing the details on the new NPR download links. It's a lot faster and easier to get MP3 files directly!

Please support your public radio station! I do!