Apress - Smart Home Automation with Linux (2010)- P44 pps

CHAPTER 6 ■ DATA SOURCES 198

(search for stations using the form at www.fcc.gov/mb/audio/fmq.html) or Ofcom in the United Kingdom. In the case of the latter, I was granted permission to take its closed-format Excel spreadsheet of radio frequencies (downloadable from www.ofcom.org.uk/radio/ifi/rbl/engineering/tech_parameters/TxParams.xls) and generate an open version (www.minervahome.net/pub/data/fmstations.xml) in RadioXML format. From here, you can use a simple XSLT sheet to extract a list of stations, which in turn can tune the radio and set the volume with a command like the following:

    fm 88.6 75%

When this information is not available, you need to search the FM range (usually 87.5 [6] to 108.0MHz) for usable stations. There is an automatic tool for this, fortunately, with an extra parameter indicating how strong the signal has to be for it to be considered “in tune”:

    fmscan -t 10 >fmstations

I have used 10 percent here, because my area is particularly bad for radio reception, with most stations appearing around 12.5 percent. You redirect this into a file because the fmscan process is quite lengthy, and you might want to reformat the data later. You can list the various stations and frequencies with the following:

    cat fmstations | tr ^M \\n\\r | perl -lane 'print $_ if /\d\:\s\d/'

or order them according to strength:

    cat fmstations | tr ^M \\n\\r | perl -lane 'print $_ if /\d\:\s\d/' | awk -F : '{ printf( "%s %s \n", $2, $1) }' | sort -r | head

In both cases, the ^M symbol is entered by pressing Ctrl+V followed by Ctrl+M.

You will notice that some stations appear several times in the list, at 88.4 and 88.6, for example. Simply pick one that sounds the cleanest, or check with the station call sign.

Having gotten the frequencies, you can begin the search for program guides online to seek out interesting shows. These must invariably be screen-scraped from a web page that’s found by searching for the station’s own site.
A term such as the following:

    radio 88.6 MHz uk

generally returns good results, provided you replace uk with your own country. You can find the main BBC stations, for example, at www.bbc.co.uk/programmes.

There are also some prerecorded news reports available as MP3, which can be downloaded or played with standard Linux tools. Here’s an example:

    mplayer http://skyscape.sky.com/skynewsradio/RADIO/news.mp3

[6] The Japanese band has a lower limit of 76MHz.

CD Data

When playing a CD, there are often two pieces of information you’d like to keep: the track name and a scan of the cover art. The former is more readily available and incorporated into most ripping software, while the latter isn’t (although a lot of new media-center-based software is including it).

To determine the track names, the start position and length of each song on the CD are read and used to compute a single “fingerprint” number by way of a hashing algorithm. Since every CD in production has a different number of songs and each song has a different length, this number should be unique. (In reality, it’s almost unique because some duplicates exist, but it’s close enough.) This number is then compared against a database of known albums [7] to retrieve the list of track names, which have been entered manually by human volunteers around the world. These track names and titles are then added to the ID tag of the MP3 or OGG file by the ripping software for later reference.

If you are using the CD itself, as opposed to a ripped version, then this information has to be retrieved manually each time you want to know what’s playing. A partial solution is the cdcd package, which allows you to retrieve the number of the disc, the name, its tracks, and their durations.

    cdcd tracks

The previous example will produce output that begins like this:

    Trying CDDB server http://www.freedb.org:80/cgi-bin/cddb.cgi
    Connection established.
    Retrieving information on 2f107813.
    CDDB query error: cannot parseAlbum name:
    Total tracks: 19    Disc length: 70:18

    Track   Length      Title
     1: > [ 3:52.70]
     2:   [ 3:48.53]
     3:   [ 3:02.07]
     4:   [ 4:09.60]
     5:   [ 3:55.00]

Although this lets you see the current track (indicated by the >), it is no more useful than what’s provided by any other media player. However, if you’ve installed the abcde ripper, you will have also already (and automagically) installed the cddb-tool components, which will perform the CD hashing function and the database queries for you. Consequently, you can determine the disc ID, its name, and the names of each track with a small amount of script code:

    ID=`cd-discid /dev/dvd`
    TITLE=`cddb-tool query http://freedb.freedb.org/~cddb/cddb.cgi 6 $(app) $(host) $ID`

The app and host parameters refer to the application name and the host name of the current machine. Treat them as placeholders and substitute real values, because a shell would otherwise read $(app) and $(host) as command substitutions. Although their contents are considered mandatory, they are not vital and are included only as a courtesy to the developers so they can track which applications are using the database. The magic number 6 refers to the protocol in use. From this string, you can extract the genre:

    GENRE=`echo $TITLE | cut -d ' ' -f 2`

and the disc’s ID and name:

    DISC_ID=`echo $TITLE | cut -d ' ' -f 3`
    DISC_TITLE=`echo $TITLE | cut -d ' ' -f 4-`

Using the disc ID and genre, you can determine a unique track listing (since the genre is used to distinguish between collisions in hash numbers) for the disc in question, which allows you to retrieve a parsable list of tracks with this:

    cddb-tool read http://freedb.freedb.org/~cddb/cddb.cgi 6 $(app) $(host) $GENRE $DISC_ID

The disc title, year, and true genre are also available from this output. [8]

[7] This was originally stored at CDDB but more recently at FreeDB.

A more complex form of data to retrieve is that of the album’s cover art.
This is something that rippers, especially text-based ones, don’t do and is something of a hit-and-miss affair in the open source world. This is, again, because of the lack of available data sources. Apple owns a music store, where the covers are used to sell the music and are downloaded with the purchase of the album. If you rip the music yourself, you have no such option.

One graphical tool that can help here is albumart. You can download this package from www.unrealvoodoo.org/hiteck/projects/albumart and install it with the following:

    dpkg -i albumart_1.6.6-1_all.deb

This uses the ID tags inside the MP3 file to perform a search on various web sites, such as Buy.com, Walmart.com, and Yahoo! The method is little more than screen scraping, but provided the files are reasonably well named, the results are good enough and include very few false positives. When it has a problem determining the correct image, however, it errs on the side of caution and assigns nothing, waiting for you to manually click Set as Cover, which can take some time to correct.

Once it has grabbed the art files, it names them folder.jpg in the appropriate directory, where they are picked up and used by most operating systems and media players. As a bonus, however, because the album art package uses the ID tags from the file, not the CD fingerprint, it can be used to find images for music that you’ve already ripped.

[8] There is one main unsolved problem with this approach. That is, if there are two discs with the same fingerprint or two database entries for the same disc, it is impossible to automatically pick the correct one. Consequently, a human needs to untangle the mess by selecting one of the options.

■ Note Unlike track listings, the cover art is still copyrighted material, so no independent developer has attempted to streamline this process with their own database.

Correctly finding album covers without any IDs or metadata can be incredibly hard work.
There is a two-stage process available should this occur. The first stage determines the tags by analyzing the audio properties of a song to identify the title and the artist. MusicBrainz is the major (free) contender in this field. Then, once you have an ID tag, you can retrieve the image as normal. These steps have been combined in software like Jaikoz, which also functions as a mass-metadata editing package that may be of use to those who have already ripped their music without such data.

News

Any data that changes is new, and therefore news, making it an ideal candidate for real-time access. Making a personalized news channel is something most aggregators are doing through the use of RSS feeds and custom widgets. iGoogle (www.google.com/ig), for example, also includes integration with its Google Mail and Calendar services, making this a disturbingly useful home page when viewed as a home page, but its enclosed nature makes it difficult to utilize this as a data input for a home. Instead, I’ll cover methods to retrieve typical news items as individual data elements, which can be incorporated in a manner befitting ourselves. This splits into two types: push and pull.

Reported Stories: Push

The introduction of push-based media can be traced either to 24-hour rolling news (by Arthur W Arundel in 1961) or to RSS [9] feeds, depending on your circumstances. Both formats appear to push the information in real time, as soon as it’s received, to the viewer. In reality, both work by having the viewer continually pull data from the stream, silently ignoring anything that hasn’t changed. In the case of TV, each pull consists of a new image and occurs several times a second. RSS happens significantly less frequently but is the one of interest here.

RSS is an XML-based file format for metadata. It describes a number of pieces of information that are updated frequently.
This might include the reference to a blog post, the next train to leave platform 9¾ from King’s Cross, the current stories on a news web site, and so on. In each case, every change is recorded in the RSS file, along with the all-important time stamp, enabling RSS readers to determine any updates to the data mentioned within it. The software that generates these RSS feeds may also remove references to previous stories once they become irrelevant or too old. However, old is defined by the author.

This de facto standard allows you to use common libraries to parse the RSS feeds and extract the information quite simply. One such library is the PHP-based MagpieRSS (http://magpierss.sourceforge.net), which also supports an alternative to RSS called Atom feeds and incorporates a data cache. This second feature makes your code simpler since you can request all the data from the RSS feed, without a concern for the most recent, because the library has cached the older stories automatically.

[9] RSS currently stands for Really Simple Syndication, but its long and interesting history means that it wasn’t always so simple.

You utilize MagpieRSS in PHP by beginning with the usual code:

    require_once 'rss_fetch.inc';

Then you request a feed from a given URL:

    $rss = fetch_rss($url);

Naturally, this URL must reference an RSS file (such as www.thebeercrate.com/rss_feed.xml) and not the page that it describes (which would be www.thebeercrate.com). It is usually indicated by an orange button with white radio waves or simply an icon stating “RSS-XML.” In all cases, the link to the RSS file appears on the page whose data you want to read. You can then process the stories with a simple loop such as the following:

    $maxItems = 10;
    $lastItem = count($rss->items);
    if ($lastItem > $maxItems) {
        $lastItem = $maxItems;
    }
    // Loop to $lastItem, not $maxItems, so a short feed is not overrun
    for ($i = 0; $i < $lastItem; ++$i) {
        /* process items here */
    }

As new stories are added, they do so at the beginning of the file.
Should you want to capture everything, it is consequently important to start at the end of the item list, since the oldest items will disappear soonest from the feed.

As mentioned earlier, the RSS contains only metadata, usually the title, description, and link to the full data. You can retrieve these from each item through the data members:

    $rss->items[$i]['link'];
    $rss->items[$i]['title'];
    $rss->items[$i]['description'];

They can then be used to build up the information in the manner you want. For example, to re-create the information on your own home page, you would write the following:

    $html .= "<a href='".$rss->items[$i]['link']."'>".$rss->items[$i]['title']."</a>";
    $html .= "<p>".$rss->items[$i]['description']."</p>";

Or you could use a speech synthesizer to read each title:

    system("say default ".escapeshellarg($rss->items[$i]['title']));

You can then use an Arduino that responds to sudden noises such as a clap or hand waving by a sensor (using a potential divider circuit from Chapter 2, with a microphone and LDR, respectively) to trigger the full story. You can also add further logic, so if the story’s title includes particular key words, such as NASA, you can send the information directly to your phone.
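
The keyword check on story titles can be sketched in shell, in the same spirit as the command-line examples earlier in the chapter. This is only an illustration under stated assumptions: the function names matches_keyword and route_title, the keyword list, and the sample titles are all hypothetical, and the echo lines stand in for whatever commands you actually use to speak a title or forward it to a phone.

```shell
#!/bin/sh
# Hypothetical keyword list, separated by | so it can be used as an
# extended regular expression.
KEYWORDS='NASA|Linux'

# matches_keyword TITLE
# Succeeds (exit status 0) when TITLE contains any keyword, ignoring case.
matches_keyword() {
    printf '%s\n' "$1" | grep -E -i -q -- "$KEYWORDS"
}

# route_title TITLE
# Prints "phone: TITLE" for stories worth forwarding and "speak: TITLE"
# for everything else. Both echo lines are placeholders for real
# delivery commands.
route_title() {
    if matches_keyword "$1"; then
        echo "phone: $1"
    else
        echo "speak: $1"
    fi
}

route_title "NASA schedules next launch"
route_title "Local bakery wins award"
```

Because the match is a plain case-insensitive substring test, a title containing a word such as "nasal" would also be forwarded, so a stricter pattern may be needed in practice.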
