When I explain mashups to others, I typically use the example of the web site Housingmaps.com, a mashup of Craigslist and Google Maps. Housingmaps.com is useful in ways that are quick and easy to understand, which invites repeated usage. It also requires no software beyond a modern web browser. Moreover, Housingmaps.com takes two already well-known web applications to create something new.
Figure 1-1 shows Housingmaps.com displaying a specific rental listing. Note the photos of the apartment and the links to Craigslist. All the data is drawn from Craigslist and then displayed in a Google map.
Housingmaps.com takes the list of houses, apartments, and rooms that are for sale or rent from Craigslist and displays them on a Google map. It was invented by neither Google nor Craigslist but by an individual programmer, Paul Rademacher, who worked for neither company at the time and was later hired by Google.
Craigslist provides links to Google Maps and Yahoo! Maps for any individual real estate listing, but it does not map the listings collectively. The single listing per map on the Craigslist interface makes it a challenge to mentally track the location of all the properties. Moreover, when looking for real estate, you often want to look at a narrowly defined neighborhood or find houses with good access to transit. With Craigslist, you have to click many links and manually piece together a lot of maps to focus your search geographically.
Housingmaps.com addresses these challenges by letting you see on a Google map all the Craigslist apartments or houses in a specific area, not just an individual item. At Housingmaps.com, geographical location becomes the primary lens for looking for real estate, with a map as the central element of the user interface.
The remixing occurs on the server side on a web site (Housingmaps.com) that is distinct from both the source web site (Craigslist) and the destination application (Google Maps). Data is drawn from the source and transformed into a Google map, which is embedded in web pages at Housingmaps.com.
This question really breaks down into two questions:
How does Housingmaps.com obtain the housing and rental data from Craigslist?
How does Housingmaps.com create a Google map of that data?
A desirable, and increasingly common, method for mashups to obtain data from a web site is through a web site’s publicly available application programming interface (API). An API is designed specifically to facilitate communication between programs, often including the exchange of data. (You will be introduced in detail to APIs in Chapters 6 and 7.)
At this time, Craigslist does not provide a public API but does provide RSS feeds. As I will discuss in Chapter 4, RSS feeds are used to syndicate, or transport, information from a web site to a program that consumes this information. The RSS feeds, however, do not provide enough detail to precisely position the listings on a map.
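To make the limitation concrete, here is a minimal Python sketch of what a program typically gets out of an RSS 2.0 feed. The feed text, listing, and URL below are invented for illustration and are not Craigslist's actual feed; the point is only that a typical item carries a title, a link, and a description, but nothing like a latitude and longitude.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 snippet; the listing, URL, and wording are invented.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example housing listings</title>
    <item>
      <title>$1800 / 2br - Sunny apartment near transit</title>
      <link>http://example.org/listings/1001.html</link>
      <description>Two-bedroom apartment, hardwood floors.</description>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return one dict per <item> with its title, link, and description."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_feed(SAMPLE_FEED):
        # Each entry has text and a link, but no coordinates to plot.
        print(entry["title"], "->", entry["link"])
```

A mashup that needs a precise map position would have to follow each link and look for more detail on the listing page itself.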
Consequently, Housingmaps.com screen-scrapes (or crawls) Craigslist; that is, Housingmaps.com retrieves and parses the HTML pages of Craigslist to obtain detailed information about each listing. The crawling is performed carefully so as to minimize the use of bandwidth. When you access Housingmaps.com, you are accessing not real-time data from Craigslist but rather the data that has been screen-scraped by Housingmaps.com.
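The general idea of screen-scraping can be sketched in a few lines of Python using the standard library's HTML parser. This is not how Housingmaps.com is implemented; the page markup, class names, and address below are assumptions made up for the example, and a real site's HTML will differ and change over time.

```python
from html.parser import HTMLParser

# Hypothetical listing page; the class names and content are invented.
SAMPLE_PAGE = """
<html><body>
  <h2 class="title">$1800 / 2br - Sunny apartment</h2>
  <span class="address">123 Example St, Springfield</span>
  <img src="http://example.org/photos/1001.jpg">
</body></html>
"""


class ListingParser(HTMLParser):
    """Collect the text of elements marked 'title' or 'address', plus image URLs."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self.photos = []
        self._current = None  # name of the field whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.photos.append(attrs["src"])
        elif attrs.get("class") in ("title", "address"):
            self._current = attrs["class"]

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def scrape_listing(html_text):
    """Parse one listing page and return its fields and photo URLs."""
    parser = ListingParser()
    parser.feed(html_text)
    parser.close()
    return dict(parser.fields, photos=parser.photos)


if __name__ == "__main__":
    print(scrape_listing(SAMPLE_PAGE))
```

Storing the parsed results and serving the stored copy to visitors, rather than fetching the source page on every request, is what keeps the load on the data source low.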
Note: Public APIs and RSS feeds are generally preferable to screen-scraping web sites. Screen-scraping, when poorly implemented, can overtax the data source. Always check that you are complying with the terms of service of the data source in how you use the data.
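One small, mechanical part of crawling politely is honoring a site's robots.txt file. The following sketch uses Python's standard `urllib.robotparser` module against an invented robots.txt; the rules, the bot name `example-mashup-bot`, and the URLs are all hypothetical. Note that robots.txt is not a substitute for reading the terms of service, which may impose further restrictions.

```python
import urllib.robotparser

# Hypothetical robots.txt content; a real crawler would fetch the site's own file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""

BOT_NAME = "example-mashup-bot"  # hypothetical user-agent name


def build_policy(robots_text):
    """Parse robots.txt text into a RobotFileParser without touching the network."""
    policy = urllib.robotparser.RobotFileParser()
    policy.parse(robots_text.splitlines())
    return policy


def is_allowed(policy, url):
    """Return True if the rules permit our bot to fetch the given URL."""
    return policy.can_fetch(BOT_NAME, url)


if __name__ == "__main__":
    policy = build_policy(SAMPLE_ROBOTS)
    print(is_allowed(policy, "http://example.org/listings/1001.html"))
    print(is_allowed(policy, "http://example.org/private/admin.html"))
    # Seconds the site asks crawlers to wait between requests, if specified.
    print(policy.crawl_delay(BOT_NAME))
```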
To display the real estate information on a Google map, the current version of Housingmaps.com uses the Google Maps API,[9]
It’s interesting to go into a bit of history here to understand the emergence of the mashup phenomenon. When Housingmaps.com first showed up in April 2005, Rademacher was using Google Maps before it had any real API. He deciphered the original JavaScript of Google Maps and figured out how to incorporate Google Maps into Housingmaps.com. During the period between the release of Google Maps on February 8, 2005, and the publication of version 1 of the Google Maps API (on approximately June 29, 2005[10][11]
For this and other reasons we were thrilled to see “hackers” have a go at Google Maps almost immediately after we launched the site back in early February. Literally within days, their blogs described the inner workings of our maps more accurately than our own design documents did, and soon the most amazing “hacks” started to appear: Philip Lindsay’s Google Maps “stand-alone” mode, Paul Rademacher’s [Housingmaps.com], and Chris Smoak’s Busmonster, to mention a few.
Since the debut of Housingmaps.com, many other mashups (in fact, tens of thousands) have followed this pattern of recasting data to make geographical location the organizing principle. These mashups cover an incredible range of topics and interests.[12]
Many other mashups involve extracting geocoded data (location information, often latitude and longitude) from one source to then place it on an online map (such as a Google map or Yahoo! map). I name two prominent examples here:
Adrian Holovaty’s Chicago crime map (http://chicagocrime.org), which is a database of crimes reported in Chicago fronted by a Google Maps interface
Weather Bonk, which is a mashup of weather data on a Google map (http://www.weatherbonk.com/weather/about.jsp)
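The pattern these mashups share, taking records that carry a latitude and longitude and handing them to a map, can be sketched as a small data transformation. The example below converts geocoded records into GeoJSON, a format that many online mapping tools can read. This is an illustrative choice on my part, not the mechanism used by the mashups named above, and the records, coordinates, and URLs are invented.

```python
import json

# Hypothetical geocoded records, as a scraper or feed reader might produce them.
SAMPLE_RECORDS = [
    {"title": "Sunny apartment", "url": "http://example.org/listings/1001.html",
     "lat": 37.8716, "lon": -122.2727},
    {"title": "Quiet studio", "url": "http://example.org/listings/1002.html",
     "lat": 37.8044, "lon": -122.2712},
]


def to_geojson(records):
    """Build a GeoJSON FeatureCollection with one Point feature per record."""
    features = []
    for rec in records:
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point",
                         "coordinates": [rec["lon"], rec["lat"]]},
            "properties": {"title": rec["title"], "url": rec["url"]},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    print(json.dumps(to_geojson(SAMPLE_RECORDS), indent=2))
```

Keeping the title and link in each feature's properties lets the map show a marker that points back to the original listing, much as Housingmaps.com links back to Craigslist.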
[11] Google Maps Hacks by Rich Gibson and Erle Schuyler (O’Reilly Media, 2006)
[12] See http://googlemapsmania.blogspot.com/ for many new mashups based on Google Maps that appear every day.