Using Techniques That Do Not Depend on APIs

Chapter 12. Making Your Web Site Mashable

Use a Consistent and Rich URL Language

Chapter 2 analyzed the URL language of Flickr and showed how its highly addressable, granular, transparent, and persistent URLs open up many opportunities to mash up Flickr content merely by exploiting its URL structures. Flickr’s human-readable, transparent URLs let developers link deeply into the fabric of the web site, even in the absence of formal documentation. Because Flickr works hard to keep its URLs permanent, mashup creators can depend on those URLs to keep working. Granular URLs give mashups very fine-grained access to and control over resources at Flickr. You will learn in Chapter 14 how these same qualities make it possible to use a social bookmarking system such as del.icio.us to bookmark content from Flickr. Hence, developing your own web site with a rich URL language opens your content to similar mashup techniques.

Moreover, the discipline of creating a consistent and human-readable URL structure benefits you as a content producer. It forces you to abstract the interface of your application (for example, the URL structures) from your back-end implementation, thus making your web site more maintainable and flexible.
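
As a rough illustration of separating the URL interface from the back end, here is a minimal sketch in Python. The URL patterns are loosely modeled on Flickr-style paths, and the handler names (photo_detail, photos_by_tag, photostream) are hypothetical; they stand in for whatever your back end actually does.

```python
import re

# Hypothetical public URL patterns; the handler names are placeholders
# for back-end code that can change without changing the URLs.
ROUTES = [
    (re.compile(r"^/photos/(?P<user>[^/]+)/(?P<photo_id>\d+)/?$"), "photo_detail"),
    (re.compile(r"^/photos/(?P<user>[^/]+)/tags/(?P<tag>[^/]+)/?$"), "photos_by_tag"),
    (re.compile(r"^/photos/(?P<user>[^/]+)/?$"), "photostream"),
]


def resolve(path):
    """Return (handler_name, params) for a URL path, or None if nothing matches."""
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            return handler, match.groupdict()
    return None
```

Because the table of patterns is the only place where URLs are defined, you can reorganize the code behind a handler without breaking links that mashup creators already depend on.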

Use W3C Standards to Develop Your Web Site

The use of good standards helps bring clarity to your web design, especially standards that insist on separating concerns (such as content from design). For instance, moving formatting out of the markup and into CSS has the side benefit, for mashup creators, of producing content that is clearly structured. Even generating well-formed XHTML (instead of tag-soup HTML) would be a huge boon since it allows for more error-free scraping of data. All this makes your content easier to parse even in the absence of explicit XML feeds.

Pay Attention to Web Accessibility

An accessible site lets more people access your content. You might be required by law to make your web site accessible to people with disabilities (see http://section508.gov/). Even if you aren’t legally obliged to produce accessible content, following modern web design practices, such as producing valid (X)HTML, naturally improves accessibility. The by-products of working toward accessibility (for example, a clean separation of content from style) also make a site more mashable than an inaccessible one.

Consider Allowing Users to Tag Your Content

Tagging provides a lightweight way for users to interact with, label, and annotate content. As I demonstrated in Chapter 3, those tags can be the basis of simple mashups. There are some tricky issues to consider when you create a tagging system: for example, how to handle tags made up of multiple words and what to do about singular vs. plural tags. There is no universally accepted way to do this, so you need to weigh the possibilities (I covered some in Chapter 3). It also helps to have a strategy for multilingual tags (in other words, for handling Unicode).
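
To make these choices concrete, here is one possible sketch in Python. It assumes a space-separated tag field in which double quotes group a multiword tag, and it builds a comparison key by applying Unicode NFKC normalization, case folding, and the removal of spaces, hyphens, and underscores. The function names are hypothetical, and the sketch deliberately leaves singular and plural tags distinct; that is a policy decision you would have to make yourself.

```python
import re
import unicodedata

# A double-quoted run is one multiword tag; anything else splits on whitespace.
TAG_PATTERN = re.compile(r'"([^"]+)"|(\S+)')


def split_tags(field):
    """Split a user-entered tag field into the tags as the user typed them."""
    return [quoted or bare for quoted, bare in TAG_PATTERN.findall(field)]


def normalize_tag(raw):
    """Return a canonical key so that variant spellings of a tag compare equal."""
    # NFKC folds compatibility forms together (for example, a letter plus a
    # combining accent and the precomposed accented letter).
    text = unicodedata.normalize("NFKC", raw).casefold()
    # Drop whitespace, hyphens, and underscores so multiword variants collide.
    return "".join(ch for ch in text if not (ch.isspace() or ch in "-_"))
```

Storing both the tag as typed (for display) and the normalized key (for matching) keeps the user's wording intact while still letting "San Francisco" and "san-francisco" lead to the same content.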

Consider also whether you have built enough structure to allow the hacking of tags. Could a user of your site have jump-started geotagging, as happened on Flickr? Do you have something equivalent to machine tags?
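
If you want something like machine tags, the parsing side can be small. The following sketch assumes the namespace:predicate=value shape that Flickr uses for its machine tags; the MachineTag type and the parse_machine_tag function are hypothetical names, and the exact characters you allow in a namespace or predicate are your own design choice.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# Assumed rule: namespace and predicate start with a letter and contain only
# letters, digits, and underscores; the value is everything after the "=".
MACHINE_TAG = re.compile(
    r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.*)$", re.DOTALL
)


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Return the parts of a machine tag, or None for an ordinary tag."""
    match = MACHINE_TAG.match(tag)
    return MachineTag(*match.groups()) if match else None
```

A tag such as geo:lat=37.8721 then splits into a namespace, a predicate, and a value that other applications can query, while ordinary tags pass through untouched.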

Make Feeds Available

In Chapter 4, you learned about syndication feeds, their syntax, and how they can be used to represent your content in different formats to be exported to other applications. Feeds are becoming ubiquitous on the Web; they’re the closest thing to a lingua franca of data exchange. Users by and large are beginning to expect web sites to offer feeds. Users like syndication because they spend more time away from your site than on it. Feeds let people access data from your site in their preferred local context (such as a feed reader). Moreover, there is a whole ecosystem built around feeds. By producing feeds, you make your data part of that ecosystem.

Creating feeds for your web site should be very high on your priority list. In fact, depending on what systems you use to publish, you might already be generating them (as weblogs and many content management systems do). Likewise, when you push your content to Flickr, YouTube, and many other social sharing systems, you have the option of autogenerated feeds.

Feeds sound intimidating, but don’t worry. You can start small and grow them. You might have a single feed for the most recent content. See how that works for you. Then you can consider generating feeds throughout your system. (Remember that Flickr has an extensive selection of feeds.)

If you need to generate feeds programmatically, they are a good way to get started with generating XML. You might ask which feed type to generate. Ideally, you should generate many types, as Flickr does. That takes little effort if you have an abstract model of your data and write one template for each output format. If you don’t want to go through that effort, then Atom 1.0 is a good first choice. Atom 1.0 is now recognized by many feed aggregators. It’s also a good stepping-stone toward building an API. (The Atom Publishing Protocol, covered in Chapter 7, and GData would serve as good prior art.) Moreover, Atom feeds can flow into Yahoo! Pipes and the Google Mashup Editor (GME). RSS 2.0 wouldn’t be far behind on my priority list. Also, if you want to begin experimenting with RDF and the semantic Web, producing RSS 1.0 is a good way in.
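
Here is a minimal sketch of generating an Atom 1.0 feed from an abstract model of your data, using only the Python standard library. The build_atom_feed function and the dictionary keys (id, title, updated, url) are assumptions made for illustration, not part of any library. The sketch emits only the elements Atom requires plus a link per entry, and it expects timestamps that are already RFC 3339 strings.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_id, title, updated, author, entries):
    """Serialize a simple data model as an Atom 1.0 document (a string).

    `entries` is a list of dicts with the hypothetical keys
    "id", "title", "updated", and "url".
    """
    # Serialize Atom elements in the default namespace (no prefix).
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(f"{{{ATOM_NS}}}feed")

    def add(parent, tag, text=None, **attrs):
        element = ET.SubElement(parent, f"{{{ATOM_NS}}}{tag}", attrs)
        element.text = text
        return element

    add(feed, "id", feed_id)
    add(feed, "title", title)
    add(feed, "updated", updated)
    add(add(feed, "author"), "name", author)
    for item in entries:
        entry = add(feed, "entry")
        add(entry, "id", item["id"])
        add(entry, "title", item["title"])
        add(entry, "updated", item["updated"])
        add(entry, "link", rel="alternate", href=item["url"])
    return ET.tostring(feed, encoding="unicode")
```

Because the function takes plain values rather than database rows, a second function that writes RSS 2.0 from the same arguments would be the "one template per format" approach in miniature.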

Let’s return briefly to the issue of the feed ecosystem. As you have seen, Yahoo! Pipes and the GME use feeds natively. The Flickr API puts out many formats (as you saw in Chapter 6) but not RSS 2.0 or Atom, although Flickr offers many feeds outside the API. You saw in Chapter 11 that, despite the large number of Flickr feeds, I still had to convert the XML returned by the Flickr API to RSS 2.0, which I did with Yahoo! Pipes. That conversion made the data available to the GME.

Also, use feed autodiscovery (discussed in Chapter 4) to make it easier for users to find your feeds.
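
Autodiscovery comes down to a link element in the head of each page. As a sketch, the following hypothetical helper builds that element for an Atom or RSS feed; the function name and the example URL are assumptions, while the rel and type values are the conventional ones for feed autodiscovery.

```python
from html import escape

# Conventional MIME types for the two feed families.
FEED_TYPES = {"atom": "application/atom+xml", "rss": "application/rss+xml"}


def autodiscovery_link(href, title, kind="atom"):
    """Return a <link> element to place inside a page's <head>."""
    return '<link rel="alternate" type="{}" title="{}" href="{}" />'.format(
        FEED_TYPES[kind], escape(title, quote=True), escape(href, quote=True)
    )


print(autodiscovery_link("https://example.com/feeds/recent.atom", "Recent content"))
```

Browsers and feed readers that support autodiscovery look for this element, so users can subscribe from the page itself without hunting for a feed URL.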

Finally, be friendly to extensions to feeds. Remember that RSS 2.0, Atom 1.0, and RSS 1.0 are all extensible. Make use of this extensibility. If your system consumes feeds that have extensions, don’t strip them out.

Make It Easy to Post Your Content to Blogs and Other Web Sites

In Chapter 5, you learned how blogs can be integrated with web sites such as Flickr. Flickr’s Blog button allows users to post a photo to a weblog. Moreover, the Flickr All Sizes button makes it easy for users to embed a photo into a blog or other web site by providing HTML fragments that they can readily copy and paste elsewhere. In a similar fashion, YouTube provides HTML to embed a video, and Google provides HTML to embed its maps and calendars. You as a content producer can emulate the practice of making it easy to post your content to other sites while linking back to your own web site, where the content originates. In addition to facilitating the flow of content from your web site, you can track comments originating from other web sites through a variety of linkback mechanisms. (See Chapter 5 for more information.)
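
A copy-and-paste fragment of this kind is easy to generate on the server. The sketch below is one hypothetical way to do it for an image: embed_snippet is an invented name, the URLs are placeholders, and the markup is simply an image wrapped in a link back to the page where the content originates.

```python
from html import escape


def embed_snippet(page_url, image_url, title, width, height):
    """Return an HTML fragment that embeds an image and links back to its page."""
    return (
        '<a href="{page}" title="{title}">'
        '<img src="{image}" width="{width}" height="{height}" alt="{title}" />'
        "</a>"
    ).format(
        page=escape(page_url, quote=True),
        image=escape(image_url, quote=True),
        title=escape(title, quote=True),
        width=int(width),
        height=int(height),
    )


print(embed_snippet(
    "https://example.com/photos/alice/1/",
    "https://example.com/img/1_m.jpg",
    "Bay view",
    240,
    180,
))
```

Escaping the title and URLs matters here because the fragment will be pasted into pages you do not control.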

Encourage the Sharing of Content with Explicit Licenses

Licensing digital content clears away important barriers to creating mashups with that content. On your web site, you should allow users to explicitly set the license under which their content and data can be used, for instance, by choosing one of the Creative Commons licenses. Set defaults that encourage sharing, but always give your users the choice to change those defaults. Build functionality that lets users search and browse content by license.

As you learned in Chapter 2, Flickr is a good model here. Flickr has done a huge amount to promote open content, specifically content licensed through a Creative Commons license. That users can explicitly tie a Creative Commons license to a piece of content has been a tremendous enabler for remixing. If you don’t give your users a mechanism to assert a certain license, there might be too much ambiguity around the reuse of content. Even if you don’t offer granular control over the licensing of content on the site, it’s very helpful to have a global statement about intellectual property issues. That is, some content producers license an entire site in a certain way. For example, Wikipedia is licensed under the GFDL:

http://en.wikipedia.org/wiki/Wikipedia:Copyrights

Freebase is licensed under CC-By:

http://www.freebase.com/signin/licensing

In Chapter 2, we discussed the barriers to screen-scraping. If you don’t have an API but don’t mind your users accessing your data, consider creating some bot-friendly terms of service (ToS).

Develop Extensive Import and Export Options for User Content

The more ways you have to get data in and out of an application, the better. Ideally, you would support the protocols and data formats that your users already rely on. As a bonus, let your users embed their data hosted on your site somewhere else on the Web (for example, through a JavaScript badge). Super-flexible badges can themselves be used to access data for mashups and can hint at the existence of a feature-rich API.

Study How Users Remix Your Content and Make It Easier to Do So

Be prepared to be surprised by how people might use and reuse your content. See how people are using your content, and make it easier to do so. The primary example I have in mind here is when people started to hack the Google Maps API. Google, instead of stopping those people, actually formalized the API.

At the least, if you don’t want to develop an API, when you see people use your web site in unusual ways, you should think about what’s really going on and whether to make it easier to carry out this reuse.