Monday, July 26, 2010

Game Data, AWS, and MySQL Challenges

Okay, here is my first report on the results of my hundred-minute hack experiment. The good news is that the database now holds video game data from Amazon, and I was able to export those results to an Excel file. Getting the data into a web page is next. The pressure is on to find work in case I can't make this idea profitable somehow. There's good news even in that regard, since I now have solid samples I can show to prospective employers. Now for the details...

A third of the time spent on this project went into getting that first information request through to Amazon. That involved grappling with encryption, URL parameter manipulation, Unicode issues, and the other things Amazon Web Services requires before it will accept a request. Eventually, I shifted strategies and turned to APIs for Amazon and MySQL. After some gotchas with versions, platforms, dependencies, and so forth, the first request from Amazon to Python to MySQL made it through.
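For anyone curious what that grappling looks like, here is a minimal sketch of signing a request by hand. It assumes the "encryption" step means the HMAC-SHA256 request signature that Amazon's Product Advertising API required at the time. The host, path, and function name are illustrative assumptions, not the code used in this project.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params, secret_key,
                 host="webservices.amazon.com", path="/onca/xml"):
    """Return a signed GET URL (sketch; host and path are assumed values)."""
    # Percent-encode as UTF-8, leaving only RFC 3986 unreserved characters.
    def encode(value):
        return quote(str(value), safe="-_.~")

    # Sort the encoded name/value pairs so the query string is canonical.
    pairs = sorted((encode(k), encode(v)) for k, v in params.items())
    canonical = "&".join(f"{k}={v}" for k, v in pairs)

    # The string to sign is the verb, host, path, and canonical query.
    string_to_sign = "\n".join(["GET", host, path, canonical])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")

    # The signature itself must also be percent-encoded (+, / and =).
    return f"https://{host}{path}?{canonical}&Signature={encode(signature)}"
```

Calling `sign_request({"Service": "AWSECommerceService", "Keywords": "Pokémon"}, "my-secret")` shows both headaches at once: the parameters come back sorted, and the accented character is sent as `%C3%A9`. A real request would also need an access key and a timestamp among the parameters.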

As far as the data was concerned, a "cheap game" was defined as a physical PS3 game costing anywhere between a penny and $49.99. Getting that data presented difficulties. Just asking for game data causes Amazon to spit back a whole lot of PlayStation console bundles, controllers, and so on along with the game titles.
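To make the definition concrete, here is a small sketch of the "cheap game" test in Python. The record layout (`title`, `platform`, `physical`, `price`) and the sample rows are hypothetical, made up for illustration rather than taken from the real dataset.

```python
from decimal import Decimal

MIN_PRICE = Decimal("0.01")   # a penny
MAX_PRICE = Decimal("49.99")


def is_cheap_game(item):
    """True for a physical PS3 item priced from a penny to 49.99."""
    price = Decimal(str(item["price"]))
    return (item["platform"] == "PlayStation 3"
            and item["physical"]
            and MIN_PRICE <= price <= MAX_PRICE)


# Hypothetical sample rows, shaped like a search result might be.
sample_items = [
    {"title": "Some Racing Game", "platform": "PlayStation 3",
     "physical": True, "price": "19.99"},
    {"title": "Console Bundle", "platform": "PlayStation 3",
     "physical": True, "price": "299.99"},
    {"title": "Wireless Controller", "platform": "PlayStation 3",
     "physical": True, "price": "39.99"},
    {"title": "Downloadable Game", "platform": "PlayStation 3",
     "physical": False, "price": "9.99"},
]

cheap_titles = [item["title"] for item in sample_items if is_cheap_game(item)]
```

In this sample the price ceiling screens out the console bundle, but the controller still slips through as a "cheap game". That is the kind of leftover a price filter alone can't catch.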

Building a PlayStation hardware dataset helped, even though it meant two more rounds of web service requests. One round was needed for the hardware and accessories, and a separate one was needed for game controllers. A MySQL trigger cleared unwanted hardware from the game dataset. The trigger approach made sense because it removed bad data from the game list as soon as it was discovered, rather than burdening the system needlessly with subqueries and table joins on every single request.
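Here is a rough sketch of the trigger idea. To keep it runnable anywhere, it uses Python's built-in sqlite3 module instead of MySQL, so the trigger syntax differs a little from the real thing. The table names, column names, and sample rows are all assumptions for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a cleanup trigger (sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE games (asin TEXT PRIMARY KEY, title TEXT, price REAL);
        CREATE TABLE hardware (asin TEXT PRIMARY KEY, title TEXT);

        -- When an item is recorded as hardware, drop it from the game list.
        CREATE TRIGGER purge_hardware_from_games
        AFTER INSERT ON hardware
        BEGIN
            DELETE FROM games WHERE asin = NEW.asin;
        END;
    """)
    return conn


conn = build_demo_db()

# First round: the game search returns a controller along with a game.
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?)",
    [("B000GAME01", "Some Racing Game", 19.99),
     ("B000CTRL01", "Wireless Controller", 39.99)],
)

# Later round: the controller turns up in the hardware results,
# and the trigger removes it from the games table.
conn.execute("INSERT INTO hardware VALUES (?, ?)",
             ("B000CTRL01", "Wireless Controller"))

remaining = [row[0] for row in conn.execute("SELECT title FROM games")]
```

After the hardware insert, `remaining` holds only the game title. The cleanup happens once, at the moment the bad row is identified, so later reads of the games table need no join against the hardware table.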

So there you have it. Several hundred web service requests, done in three parts, leading up to a collection of cheap games. All told, that amounted to somewhere between 500 and 750 web service requests, which at about ten records per request translates to 5,000 to 7,500 records processed. All this was done to amass a collection of 341 PlayStation 3 titles.

I have quite a bit to do on this project and I look forward to seeing where all this leads me.  I'll share as things progress.
