Saturday, December 3, 2011

My Radio Broadcast Podcast

While I have an enthusiasm for boppy, bubble-gum music, particularly as a backdrop to coding, I have also had a passion for opera in the past. I've even had questions used in the Opera Quiz on the Metropolitan Opera radio broadcasts.

Unfortunately, that passion is hard to fit into my life these days. Opera tickets are expensive unless you want to stand, and performances take a long time. This is the nature of opera.

This used to be easier. On Saturday mornings, I'd turn on the radio broadcast and listen to the opera for several hours as I went about my morning. Then I got into food. And farmers markets, my favorite of which are on Saturday mornings. That then became my normal food shopping day, and the radio broadcast went by the wayside. I'm usually coming home from shopping right as the opera ends.

Sure, you can listen to albums, but the Met broadcasts are fun because they provide context. The host describes the costumes, experts provide backstory, and they do the quiz.

When podcasts became popular, I realized that a podcast of the Met's Saturday matinees would be perfect. I sent letters asking the Met to do it. When they called and asked for money, I'd mention it. I'd even tell them that I would pay for such a podcast. Imagine: paying for Internet content!

They've never done it. And so I've fallen behind on opera, enjoying it as much as possible with a ticket or two a year to the San Francisco Opera.

Earlier this year, I was ranting about this yet again when I realized that I could probably craft my own podcast based on the radio broadcasts. What I wanted was the ability to call up a podcast on my iPod and see the latest opera broadcast, already synced. So began a day or two, off and on, of work on a podcast-creation script that would use radio stations as its source material.

Rube Goldberg would like this one.

First, I needed to figure out how to capture the music. A friend suggested FStream, Mac OS X software that had two valuable features: I could open a wide range of streaming URLs with it, and it was scriptable. I like scriptable apps. And these days, one can even use Ruby to do that scripting.

What I ultimately wanted was to not even think about this. That meant that my script would need to know a schedule. It reads in a config file with YAML entries that contain the name of the item, the start time, end time, and streaming URL. When the script runs, it parses the file and checks once a minute to see if it should be recording. If it should (and it isn't), it starts up FStream, points it to the appropriate URL, and tells it to start recording. When it reaches the end time, it tells FStream to stop recording.
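As a rough sketch of that config file and the once-a-minute check: the entry below is made up for illustration (the show name, times, and URL are placeholders), and `to_time` and `active_entry` are hypothetical helper names. Only the keys `name`, `start_time`, `end_time`, and `from_url` match what the script reads.

```ruby
require 'yaml'

# Hypothetical schedule entry: the show name, times, and URL are placeholders.
# Only the key names match what the script reads.
SAMPLE_SCHEDULE = <<'EOS'
- name: Saturday Matinee
  start_time: "2011-12-03 10:00"
  end_time: "2011-12-03 14:00"
  from_url: http://example.com/stream.mp3
EOS

# Turn "YYYY-MM-DD HH:MM" into a local Time.
def to_time(text)
  Time.local(*text.scan(/\d+/).map { |n| n.to_i })
end

# Return the first entry whose window contains `now`, or nil if none does.
def active_entry(entries, now)
  entries.find do |entry|
    now.between?(to_time(entry['start_time']), to_time(entry['end_time']))
  end
end

entries = YAML.load(SAMPLE_SCHEDULE)
during = active_entry(entries, Time.local(2011, 12, 3, 11, 0))
after  = active_entry(entries, Time.local(2011, 12, 3, 15, 0))
puts during ? "Record #{during['name']} from #{during['from_url']}" : 'Idle'
puts after ? "Record #{after['name']}" : 'Idle'
```

The times are quoted so that YAML hands them back as plain strings, which is what the script's time parsing expects.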

Once the file is closed, the script uploads it to S3 and writes an RSS XML file that links to each of the recorded files. Voila: a podcast of streaming radio.

Though I did this originally with the Metropolitan Opera broadcasts in mind, it obviously works for any streaming radio. I've set up new entries for CapRadio's Friday Night at the Opera, and KDFC's San Francisco Opera broadcasts.

There are a couple of problems with this script. One is a bug that causes it to throw an exception when uploading the file. I'll fix that at some point. It just means I have to manually upload the files to S3. The other problem is logistical: Neither of our computers is on all the time. To add one more step to my baroque script, I set myself a reminder so that I know to set up my computer during the appropriate time period. I wonder if iCal is scriptable …

In addition to all the stuff this script is supposed to do, the script passes its config file through Ruby's ERB system. That means that I can actually set up my config file so that the start times are programmatically driven (e.g., 9:00am on the coming Saturday).
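Here is a minimal sketch of what an ERB-driven entry could look like. The helper `coming_saturday`, the template contents, and `render_schedule` are all assumptions made for illustration; the only thing taken from the script is that the file goes through `ERB.new(...).result(binding)` before `YAML::load`, with a `time` variable in scope. This version treats "coming Saturday" as today when today is already a Saturday.

```ruby
require 'date'
require 'erb'
require 'yaml'

# Hypothetical helper: the Saturday on or after `from`, at the given hour.
def coming_saturday(from, hour)
  days_ahead = (6 - from.wday) % 7 # Time#wday: Sunday is 0, Saturday is 6
  date = Date.new(from.year, from.month, from.day) + days_ahead
  Time.local(date.year, date.month, date.day, hour, 0)
end

# Placeholder template: name and URL are made up.
TEMPLATE = <<'EOS'
- name: Saturday Matinee
  start_time: "<%= coming_saturday(time, 9).strftime('%Y-%m-%d %H:%M') %>"
  end_time: "<%= coming_saturday(time, 13).strftime('%Y-%m-%d %H:%M') %>"
  from_url: http://example.com/stream.mp3
EOS

# Render the ERB first, then parse the result as YAML.
# `time` is visible to the template through `binding`.
def render_schedule(template, time = Time.now)
  YAML.load(ERB.new(template).result(binding))
end

entry = render_schedule(TEMPLATE, Time.local(2011, 11, 30, 12, 0)).first
puts entry['start_time'] # => 2011-12-03 09:00
```

Because the template is re-rendered every time the schedule is parsed, an entry like this rolls forward each week without anyone editing the file.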

I'd still like the Met to do a podcast of their own. I'd even still pay for it. But until they do, I not only have their broadcasts in a podcast, I have a wealth of others.

Here's the script, with various sensitive bits taken out. One thing I've found useful is to put my S3 connectivity information in a separate, included script so that I can distribute the main script and not accidentally include my S3 credentials. On the off chance you want to use this script, you'll need a file named awsinfo.rb in the same directory as this script which defines three constants: S3_BUCKET_NAME, S3_ACCESS_KEY, and S3_SECRET_ACCESS_KEY.
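For reference, that companion credentials file (the one loaded by the `require './awsinfo'` line at the top of the script) might look something like this. Every value is a placeholder, not a real bucket or key.

```ruby
# awsinfo.rb: keep this file out of anything you distribute.
# All three values below are placeholders, not real credentials.
S3_BUCKET_NAME       = 'my-podcast-bucket'
S3_ACCESS_KEY        = 'YOUR_ACCESS_KEY_ID'
S3_SECRET_ACCESS_KEY = 'YOUR_SECRET_ACCESS_KEY'
```

The main script itself follows.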

require './awsinfo'
require 'rubygems'

require 'appscript'
include Appscript

require 'fileutils'
require 'time'
require 'yaml'
require 'aws/s3'
require 'erb'
require 'rss/1.0'
require 'rss/2.0'
require 'rss/maker'

class File
  # bare file name (last path component) of this File's path
  def name
    File.basename(path)
  end
end

SLEEP_INTERVAL = 60
PODCAST_FILE = "podcast.xml"
PODCAST_URL = "https://s3.amazonaws.com/#{S3_BUCKET_NAME}"

$schedule_file = 'opera_schedule.yaml'
$is_recording = false
$current_schedule = nil

#constants are defined in awsinfo
def start_s3
  AWS::S3::Base.establish_connection!(
    :access_key_id => S3_ACCESS_KEY,
    :secret_access_key => S3_SECRET_ACCESS_KEY
  )
end

def stop_s3
  AWS::S3::Base.disconnect!
end

def parse_time(time_string)
  regex = /^(\d*?)-(\d*?)-(\d*?)\s*?(\d*?):(\d*)/
  year = time_string[regex,1].to_i
  month = time_string[regex,2].to_i
  day = time_string[regex,3].to_i
  hour = time_string[regex,4].to_i
  minute = time_string[regex,5].to_i
  Time.local(year,month,day,hour,minute)
end

#time is not used directly, but ERB templates can see it through binding
def parse_schedule_file(filename = $schedule_file,time=Time.new)
  File.open(filename,'r') {|file| YAML::load(ERB.new(file.read).result(binding))}
end

#side effect: sets $current_schedule if appropriate
def should_be_recording(time=Time.new)
  schedule = parse_schedule_file($schedule_file,time)
  schedule_flag = false
  schedule.each do |schedule_entry|
    start_time = parse_time(schedule_entry['start_time'])
    end_time = parse_time(schedule_entry['end_time'])
    # explicit comparison: Range#=== raises on Time ranges in some Ruby versions
    if time.between?(start_time,end_time) then
      schedule_flag = true
      $current_schedule = schedule_entry
      break
    end
  end
  schedule_flag
end

def next_scheduled_task(time=Time.new)
  schedule = parse_schedule_file($schedule_file,time)
  sorted_schedules = schedule.sort {|a,b| parse_time(a['start_time']) <=> parse_time(b['start_time'])}
  return_schedule = nil

  sorted_schedules.each do |entry|
    if parse_time(entry['start_time']) > time then
      return_schedule = entry
      break
    end
  end
  return_schedule
end

def add_file_to_podcast(file,schedule_info=$current_schedule)
  xml_file = file_from_pieces(fstreams_dir,PODCAST_FILE)
  rss = nil
  # create an empty feed the first time through
  if !File.exist?(xml_file) then
    rss = RSS::Maker.make("2.0") do |maker|
      maker.channel.title = "Derrick's Radio Podcast"
      # PODCAST_URL has no trailing slash, so add the separator here
      maker.channel.link = "#{PODCAST_URL}/#{PODCAST_FILE}"
      maker.channel.description = "Radio programs captured by script"
      maker.items.do_sort = true # sort items by date
    end
    File.open(xml_file,"w") {|file| file.write(rss)}
  end

  content = ""
  File.open(xml_file,"r") do |existing_file|
    content = existing_file.read
  end
  rss = RSS::Parser.parse(content,false)

  # append an item for the new recording
  item = RSS::Rss::Channel::Item.new
  item.title = schedule_info['name']
  item.date = File.mtime(file)
  item.link = "#{PODCAST_URL}/#{File.basename(file)}"
  item.pubDate = File.mtime(file)
  item.enclosure = RSS::Rss::Channel::Item::Enclosure.new(item.link, File.size(file), 'audio/mpeg')
  rss.items << item

  File.open(xml_file,"w") {|file| file.write(rss)}
end

#todo: use fstream.recording flag
def is_recording
  fstream = app('FStream')
  puts fstream.status
  $is_recording && fstream.status == 3
end

def fstreams_dir
  fstream_path = './fstreams'
  FileUtils.mkdir_p(fstream_path)
  Dir.new(fstream_path)
end

def file_from_pieces(dir,file)
  "#{dir.path}/#{file}"
end

def sync_dir
  dir = fstreams_dir
  start_s3
  dir.entries.each do |filename|
    next if filename =~ /^\..*/
    file_path = file_from_pieces(dir,filename)

    # if the file doesn't exist (or it's the podcast file), upload it
    if !AWS::S3::S3Object.exists?(S3_BUCKET_NAME,filename) || filename == PODCAST_FILE then
      AWS::S3::S3Object.store(filename,open(file_path),S3_BUCKET_NAME)
    end
  end
  stop_s3
end

def s3_safe_name(english_name)
  english_name.gsub(/\s/,'_').downcase
end

def start_recording
  $is_recording = true
  puts 'Starting to record'
  fstream = app('FStream')
  fstream.openStreamWithURL($current_schedule['from_url'])
  fstream.startRecording
end

def stop_recording
  puts 'Stopping recording'
  fstream = app('FStream')
  fstream.stopRecording
  fstream.stopPlaying

  # find the file most recently created, and rename it
  dir = fstreams_dir
  filepaths = []
  dir.entries.each do |filename|
    next if filename =~ /^\..*/
    filepaths << file_from_pieces(dir,filename)
  end
  # sort in place, newest first, so filepaths[0] is the latest recording
  filepaths.sort! {|a,b| File.mtime(b) <=> File.mtime(a)}
  filepath = file_from_pieces(dir,s3_safe_name($current_schedule['name'])+".mp3")

  FileUtils.mv(filepaths[0],filepath)

  # file can take a while to close
  while true
    begin
      sleep(10)
      File.mtime(filepath)
      break
    rescue StandardError
      puts "Waiting for #{filepath} to close"
    end
  end

  # update podcast file and S3
  add_file_to_podcast(filepath)
  sync_dir

  $is_recording = false
  $current_schedule = nil
end

def do_start_app
  while(true)
    if !is_recording && should_be_recording then
      start_recording
    elsif is_recording && !should_be_recording then
      stop_recording
    else
      next_sched = next_scheduled_task
      puts "#{Time.new} Next scheduled task is #{next_sched['name']} starting at #{next_sched['start_time']}" if next_sched
    end
    sleep(SLEEP_INTERVAL)
  end
end

$schedule_file = ARGV[0] if ARGV.length >= 1
do_start_app if !$testing_mode





3 comments:

  1. That seems awfully complex...

    First - you don't have a server somewhere!? Sign up for Amazon's free usage tier: http://aws.amazon.com/free/ After a year it'll be like $10 a month at most.

    Or better yet, go get a cheap ($200) box from Fry's/Best Buy and throw Ubuntu on it. It'll take you all of 20 minutes to set up, then you can throw it in a corner and forget about it.

    Then after that, use the Streamripper command line tool (apt-get install streamripper) and point it at an MP3 stream. (If you absolutely insist on Mac, there's an OS X version as well. But you should really just set up an always-on Linux box somewhere as it's very useful.)

    I found this stream:

    http://915.kuscstream.org:8000/kuscaudio96.mp3

    Streamripper reads the ICEcast format, so it'll save the metadata as well. This is what it looks like:

    russell@shiny:~/Desktop$ streamripper http://915.kuscstream.org:8000/kuscaudio96.mp3
    Connecting...
    stream: KUSC-FM
    server name: Icecast 2.3.2
    declared bitrate: 96
    meta interval: 16000

    [skipping... ] Peter Tchaikovsky - Sleeping Beauty Selections [ 4.52M]
    [ripping... ] - Classical KUSC [ 1.48M]
    [ripping... ] Ralph Vaughan Williams - Symphony #7 "Sinfonia An [ 260kb]
    ^C
    shutting down
    bye..


    So that just saves out the MP3 files by name into a folder named by the stream. Pretty easy.

    Then there's a thing called 'cron'... It was created in 1979 I think, so maybe you haven't heard of it... ;-)

    -Russ

  2. Well, hm, yes, that would be a better strategy :)

    I do have an EC2 server kicking around. I don't know why I didn't think of it. I'll check out streamripper and see if it'll work.

    The scheduling bit is a bit more complex than cron can easily do. At least without a lot of trial and error to make sure it's working properly. So I'll keep with my scheduling system for now.

    Good ideas! And easy enough to switch out the code of interest.

  3. Through interaction with curriculum developers and teaching professionals at all levels we have prepared equipment packages and supporting materials to assist with the creation and delivery of educational content.

    How to Build Radio Station
