Sat, 03 Mar 2007

Amazon, Screen Scraping and More Yahoo Pipes

When playing with Yahoo Pipes the other day, I mentioned that it would be nice if it had a screen-scraping module to build feeds for pages that don't have one. It turns out there are a handful of services that already do that. I used Feedity to create a feed of the Prudent Bear news since my script stopped working at some point.

This morning I decided to update my Amazon aStore since they now allow you to add more than just the nine items on the homepage. I wanted to add a bunch of stuff from my wishlist. So I decided to try to get a list of ASINs for the items in my wishlist using Yahoo Pipes. Amazon provides RSS feeds of wishlists, but they only contain the last 10 items, so I used Ponyfish to create a screen-scraped feed. (I triggered an application error trying to do so with Feedity.)

Then I used the Yahoo Pipes rename module to set the title of each item to its link, and the regex module to reduce that title to the ASIN by extracting it from the URL. This gives me a feed of ASINs. Admittedly, this is quite a convoluted solution to the actual problem I was trying to solve; curl and sed would have worked fine in this instance. It gave me a chance to play around with Yahoo Pipes, though.
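
For comparison, the regex step could look roughly like this outside of Pipes. This is only a sketch in Python, not the expression I used in the pipe: the function names are made up for illustration, and it assumes the ASIN is a 10-character alphanumeric token that follows /dp/, /gp/product/ or /ASIN/ in the item URL, which may not cover every link format Amazon uses.

```python
import re

# Assumption: an ASIN is 10 uppercase letters/digits and appears right after
# /dp/, /gp/product/ or /ASIN/ in the URL path. Other URL layouts are ignored.
ASIN_PATTERN = re.compile(r"/(?:dp|gp/product|ASIN)/([A-Z0-9]{10})(?:[/?]|$)")


def extract_asin(url):
    """Return the ASIN found in a product URL, or None if no pattern matches."""
    match = ASIN_PATTERN.search(url)
    return match.group(1) if match else None


def extract_asins(urls):
    """Return the distinct ASINs found in the given URLs, in first-seen order."""
    seen = set()
    asins = []
    for url in urls:
        asin = extract_asin(url)
        if asin is not None and asin not in seen:
            seen.add(asin)
            asins.append(asin)
    return asins


if __name__ == "__main__":
    # Hypothetical item links, shaped like the ones a scraped feed might contain.
    links = [
        "http://www.amazon.com/gp/product/0123456789/ref=wl_it_dp",
        "http://www.amazon.com/dp/B000EXAMPL",
        "http://www.amazon.com/gp/registry/wishlist/",
    ]
    for asin in extract_asins(links):
        print(asin)
```

Links that do not match any of the assumed patterns are skipped rather than raising an error, so a wishlist page with navigation links mixed in would still yield a clean list.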

