Sunday, June 29, 2008

Ripping and Tagging Radio Paradise


Kjersti was out of town this weekend, so I spent some time tinkering with my MythTV. We moved a few weeks ago, and I hadn't had the time to figure out why it apparently couldn't pick up the new cable listings. I decided to just do a fresh install of the newest Mythbuntu, based on Ubuntu Hardy Heron. After the usual headaches I got it working.

I was listening to Radio Paradise -- my favorite internet radio station -- and was remembering iLog, this little program I wrote when I worked at Sizzling Platter back in Utah. Basically, I'd sit at my desk coding all day, usually listening to Radio Paradise, and I'd hear a song I liked. So I'd keep this little text file open all day and write down the artist and title so I could purchase the song later. I did this often enough that I thought it'd be cool to have a hotkey for it so I wouldn't have to alt-tab away from Visual Studio. iLog was a .NET/C# program I wrote to sit in the background as a Windows daemon. When I pressed Control-Shift-G, it would "grab" the song and record it. Then, at the end of the day, it would email me all the songs.

Anyway, I took this a step further this weekend. I went and used my $200 Apple Store gift card and got a LaCie 500GB external hard drive and hooked it up to my Myth box. There's enough space there for about a year's worth of straight 128 kbps Radio Paradise MP3, so purely as an exercise, I set up my Myth box to save Radio Paradise to LaCie for me.
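
As a rough sanity check on the "about a year" figure, here is a back-of-the-envelope sketch in Ruby. It assumes a constant 128 kbps stream running around the clock and a decimal 500 GB drive (500 x 10^9 bytes), and it ignores filesystem overhead; the constant and method names are just for illustration.

```ruby
# Back-of-the-envelope estimate: how long until a 500 GB drive fills up
# with a continuous 128 kbps MP3 stream? Assumes decimal units and no
# filesystem overhead.
BITRATE_BPS   = 128_000        # 128 kbps
DRIVE_BYTES   = 500 * 10**9    # 500 GB, counted the way drive makers do
SECONDS_A_DAY = 24 * 60 * 60

def days_until_full(drive_bytes, bitrate_bps)
  bytes_per_day = bitrate_bps / 8 * SECONDS_A_DAY
  drive_bytes.to_f / bytes_per_day
end

puts format('%.1f days', days_until_full(DRIVE_BYTES, BITRATE_BPS))
# prints "361.7 days"
```

That works out to roughly 1.4 GB per day, so the drive should last just under a year of nonstop ripping.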

Turns out, it's not too hard to rip an internet radio stream with StreamRipper, which has a nice Linux CLI client. It will automatically split up the files and put the artist and title in the file name. Basically, it buffers the current song to an "incomplete" folder and saves it out when it's done. As far as I can tell, it doesn't write any ID3 tags for you. It looks like you can set up rules and such in its configuration files, but I didn't want to dive into all that. Instead, I wrote a Ruby script that will:
  1. Parse out the artist and title from the file name and write it to the MP3 using the handy id3lib-ruby library.
  2. Scrape Amazon for the album name and cover art, and again add those tags to the file.
  3. Move it to LaCie (symlinked at /home/greg/music).
Here's the code (syntax highlighting done with Spotlight):
#!/usr/local/bin/ruby

require 'rubygems'
require 'id3lib'
require 'open-uri'
require 'fileutils' # needed for FileUtils.mv at the bottom

rip_dir = '/home/greg/ripstream/Radio_Paradise'
save_dir = '/home/greg/music'
global_genre = 'Eclectic Rock'

Dir.new(rip_dir).each do |file_name|
  if file_name =~ /\.mp3$/ then
    puts
    puts "Processing #{file_name}"

    file_path = rip_dir + '/' + file_name
    tag = ID3Lib::Tag.new(file_path)

    # File names look like "Artist - Title.mp3"; split on the first " - " only
    parts = File.basename(file_name, '.mp3').split(' - ', 2)
    if parts.length < 2 then
      puts "Skipping #{file_name}: expected 'Artist - Title.mp3'"
      next
    end

    tag.artist = parts[0]
    tag.title = parts[1]
    tag.genre = global_genre
    tag.update!

    puts "Artist: #{tag.artist}"
    puts "Title: #{tag.title}"
    puts "Genre: #{tag.genre}"

    # Search Amazon for "artist title" and take the first result as the album.
    # Calling open on a URL relies on open-uri (Ruby 3.0 and later need URI.open).
    search_term = "#{tag.artist} #{tag.title}".gsub(/[^\w ]+/, '').gsub(/[ ]+/, '+')
    results_page_url = "http://www.amazon.com/s/?field-keywords=#{search_term}"
    results_page = ''
    open(results_page_url) { |s| results_page = s.read }
    results_page_match_data = /<a href="(.+?)"><span class="srTitle">(.+?)<\/span><\/a>/.match(results_page) if results_page.length > 0
    if results_page_match_data then
      tag.album = results_page_match_data[2]
      tag.update!

      puts "Album: #{tag.album}"

      # Follow the first result to its product page to find the cover image
      product_page_url = results_page_match_data[1]
      product_page = ''
      open(product_page_url) { |s| product_page = s.read }

      product_page_match_data = /registerImage\(\"original_image\",.\"(.+?)"/.match(product_page) if product_page.length > 0
      if product_page_match_data then
        image_url = product_page_match_data[1]
        if /\.jpe?g$/.match(image_url) then
          # Embed the JPEG as an APIC frame (picture type 3 = front cover)
          tag << {
            :id => :APIC,
            :mimetype => 'image/jpeg',
            :picturetype => 3,
            :data => open(image_url).read
          }
          tag.update!

          puts "Image URL: #{image_url}"
        end
      end
    end

    FileUtils.mv(file_path, save_dir, :verbose => true)
  end
end
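
The two string-handling steps in the script (splitting the file name and building the search term) are easy to try on their own, without id3lib or a network connection. The sketch below pulls them into two helpers. `parse_rip_name` and `amazon_search_term` are hypothetical names that don't appear in the script, the sample file name is made up, and the "Artist - Title.mp3" pattern is assumed from the files StreamRipper produced for me. Splitting on only the first " - " is a choice made here so that titles containing a dash survive intact.

```ruby
# Hypothetical helpers for experimenting with the script's string handling.

# Split "Artist - Title.mp3" into [artist, title]. Returns nil when the
# name does not follow that pattern.
def parse_rip_name(file_name)
  return nil unless file_name =~ /\.mp3\z/
  artist, title = File.basename(file_name, '.mp3').split(' - ', 2)
  return nil if artist.nil? || title.nil? || artist.empty? || title.empty?
  [artist, title]
end

# Drop punctuation and join the words with '+', the same way the script
# prepares the term before putting it in the search URL.
def amazon_search_term(artist, title)
  "#{artist} #{title}".gsub(/[^\w ]+/, '').gsub(/ +/, '+')
end

artist, title = parse_rip_name('Some Artist - Some Title.mp3')
puts amazon_search_term(artist, title)
# prints "Some+Artist+Some+Title"
```

Returning nil for anything that doesn't match lets the caller skip stray files instead of crashing halfway through a batch.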


I run StreamRipper all the time in a screen session and crontab this script to tag and move files once a day. I'll build up a nice collection in no time. The only annoyance is that StreamRipper sometimes misses the song boundaries: many of the files start a few seconds into the song, and others run a few seconds into the next song that was played. However, this isn't that big of an issue because StreamRipper at least writes a sequence number to the "track number" frame of the MP3, so if you play them in order, it just melds together (and you can also appreciate Bill Goldsmith's artistic mixing). Also, there are probably some options in the StreamRipper configs to fix this.

Now I just need to get me one of them fancy new iPods that shows the album art, color and all.

3 comments:

Mark said...

Dude, you are a genious. I didn't understand any of that, but I wish I could rip songs that easy. You have talent

Cynthia said...

Huh?

SamYam said...

That code looks like porn for robots.
