Using Greasemonkey to Access New York Times Permalinks

Greasemonkey is an add-on for Firefox that allows you to change the behavior of web pages in your browser. That includes creating mashups. You already saw an example of a Greasemonkey script in Chapter 1: the Google Maps in Flickr Greasemonkey script. And here are some good references you can use to get the best out of Greasemonkey:

Links that show up on the New York Times online site typically expire after a week. That is, instead of going to the article, you are given an excerpt and a chance to purchase a copy of the article. However, in 2003, Dave Winer struck a deal with the New York Times to provide a mechanism to get weblog-safe permalinks to articles.[127] The New York Times link generator compiles those permalinks and makes them available for lookup via a web form or a JavaScript bookmarklet.[128] You give it the URL of a New York Times article, and it will return to you a more permanent link.

Let’s look at an example. Consider the following URL:

http://www.nytimes.com/2007/04/04/education/04colleges.html

The corresponding permalink is as follows:

http://www.nytimes.com/2007/04/04/education/04colleges.html?ex=1333339200&en=3b7aac16a1ce4512&ei=5090&partner=rssuserland&emc=rss
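To make the relationship between the two URLs concrete, here is a minimal sketch that compares them. It uses only the two example URLs quoted above. The helper name describePermalink is hypothetical, and the sketch makes no claim about what the individual query parameters mean. It assumes a modern JavaScript environment where the standard URL class is available.

```javascript
// Sketch: compare an article URL with its weblog-safe permalink.
// Both URLs are the examples quoted in the text.
const articleUrl = "http://www.nytimes.com/2007/04/04/education/04colleges.html";
const permalink = articleUrl +
  "?ex=1333339200&en=3b7aac16a1ce4512&ei=5090&partner=rssuserland&emc=rss";

// describePermalink is a hypothetical helper, not part of any library.
function describePermalink(article, perma) {
  const a = new URL(article);
  const p = new URL(perma);
  return {
    // true when both URLs point at the same host and path
    samePage: a.origin === p.origin && a.pathname === p.pathname,
    // names of the query parameters that the permalink adds
    extraParams: Array.from(p.searchParams.keys())
  };
}

const summary = describePermalink(articleUrl, permalink);
console.log(summary.samePage);
console.log(summary.extraParams.join(", "));
```

In this example the permalink is the original article address with five query parameters appended (ex, en, ei, partner, and emc).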

You can see this for yourself by going to the link generator:

http://nytimes.blogspace.com/genlink?q=http://www.nytimes.com/2007/04/04/education/04colleges.html
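As the example shows, a lookup URL is just the link generator's address followed by the article URL. A minimal sketch of that construction follows. The helper name genlinkUrl is an assumption made for illustration, and the article URL is appended unencoded because that is how the examples in this section are written.

```javascript
// Base address of the link generator, as used in the examples in the text.
const GENLINK_BASE = "http://nytimes.blogspace.com/genlink?q=";

// genlinkUrl is a hypothetical helper name.
// It appends the article URL as-is, mirroring the examples in the text.
function genlinkUrl(articleUrl) {
  return GENLINK_BASE + articleUrl;
}

console.log(genlinkUrl("http://www.nytimes.com/2007/04/04/education/04colleges.html"));
```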

When there is no permalink for an article, you will see a different type of output from the New York Times link generator. For example, consider the following:

http://www.nytimes.com/aponline/us/AP-Imus-Protests.html

This article doesn’t have a permalink, as the link generator’s output for it shows:

http://nytimes.blogspace.com/genlink?q=http://www.nytimes.com/aponline/us/AP-Imus-Protests.html

Where’s a good place to stick a UI element for the permanent link on the New York Times page? There are lots of choices, but a good one is a toolbar with such elements as e-mail/print/single-page/save/share.

The basic logic of the Greasemonkey script we want to write consists of the following:

  1. If you are on a New York Times article, send the link to the New York Times link generator.

  2. If there is a permalink (which you will know is true if the href attribute of the first <a> tag starts with http: and not genlink), insert a new <li> element at the end of the <ul id="toolsList">.
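The test in step 2 can be sketched in isolation as two small functions. The names firstHref and isPermalink are hypothetical, and the two HTML fragments are made-up stand-ins for the link generator's responses (its actual markup is not reproduced in this section), so treat this as an illustration of the check rather than a description of the service.

```javascript
// firstHref and isPermalink are hypothetical helper names.

// Return the href of the first <a> tag in an HTML string, or null if none.
function firstHref(html) {
  const match = /<a\s[^>]*href="([^"]*)"/i.exec(html);
  return match ? match[1] : null;
}

// A permalink was found when the first href starts with "http:"
// rather than with "genlink".
function isPermalink(href) {
  return href !== null && /^http:/.test(href);
}

// Made-up response fragments, for illustration only.
const found = '<p><a href="http://www.nytimes.com/2007/04/04/education/' +
  '04colleges.html?ex=1333339200&amp;en=3b7aac16a1ce4512">link</a></p>';
const missing = '<p><a href="genlink?q=http://www.nytimes.com/aponline/us/' +
  'AP-Imus-Protests.html">try again</a></p>';

console.log(isPermalink(firstHref(found)));
console.log(isPermalink(firstHref(missing)));
console.log(isPermalink(firstHref("no links here")));
```

Returning null when no link is present keeps the caller from failing on an unexpected response.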

Now let’s walk through the steps to get this functionality working in your own Firefox browser installation:

  1. Install the Greasemonkey extension if you don’t already have it installed.[129]

  2. Create a new script in Greasemonkey in one of two ways:

    a. Go to http://examples.mashupguide.net/ch08/newyorktimespermalinker.user.js and click Install.

    b. Select Tools → Greasemonkey → New User Script, fill in Name/Namespace/Description/Includes, and then enter the following code. (Note the use of GM_xmlhttpRequest to find out what the more permanent link is. You will see the logic behind XMLHttpRequest in Chapter 10.[130])

      // ==UserScript==
      // @name           New York Times Permalinker
      // @namespace      http://mashupguide.net
      // @description    Adds a link to a "permalink" or "weblog-safe" URL for the NY Times article, if such a link exists
      // @include        http://*.nytimes.com/*
      // ==/UserScript==
      
      function rd(){
      
        // the following code is based on the bookmarklet written by Aaron Swartz
        // at http://nytimes.blogspace.com/genlink
        
        var x,t,i,j;
        // change %3A -> : and %2F -> '/'
        t=location.href.replace(/[%]3A/ig,':').replace(/[%]2f/ig,'/');
        
        // get last occurrence of "http://"
        i=t.lastIndexOf('http://');
        
        // lop off stuff after '&'
        if(i>0){
          t=t.substring(i);
          j=t.indexOf('&');   
          if(j>0)t=t.substring(0,j)
        }
      
        var url = 'http://nytimes.blogspace.com/genlink?q='+t;
        
        // send the NY Times link to the nytimes.blogspace.com service. 
        // If there is a permalink, then the href attribute of the first tag 
        //  will start with 'http:' and not 'genlink'.  
        // if there is a permalink, then insert a new li element at the end of the 
        // <ul id="toolsList">.
        
        GM_xmlhttpRequest({
          method:"GET",
          url:url,
          headers:{
            "User-Agent":"monkeyagent",
            "Accept":"text/html"
          },
          onload:function(details) {
            var s = details.responseText;
            // capture up to the closing quote of the first href
            var m = s.match(/a href="([^"]*)"/);
            // bail out if the response contains no link at all
            if (!m) return;
            var plink = m[1];
            var tl = document.getElementById('toolsList');
            if ( plink.match(/^http:/) && tl ) {
              plink = plink + "&pagewanted=all";
              var plinkItem = document.createElement('li');
              plinkItem.innerHTML = '<a href="' + plink + '">PermaLink</a>';
              tl.appendChild(plinkItem);
            }
          }
        });
        
      }
      
      rd();
                        
  3. Go to http://www.nytimes.com/2007/04/04/education/04colleges.html with the script installed, and you will see a new PermaLink entry underneath the Share button. The link may take a few seconds to appear while the permalink is retrieved.

Note that this Greasemonkey script is sensitive to changes in the way New York Times articles are laid out: older articles have a different page structure and therefore need some other logic to put the permalink in the right place.
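One way to cope with more than one page layout is to try several candidate containers in order and use the first one that exists. The following is only a sketch of that idea. The helper names findToolbar and fakeDocument are hypothetical, and "articleTools" is a made-up placeholder id, not an id known to appear on older New York Times pages. Only "toolsList" comes from the script above. The stand-in document object exists so that the sketch runs outside a browser; in a user script you would pass the real document.

```javascript
// Candidate container ids, tried in order. Only "toolsList" comes from the
// script above; "articleTools" is a made-up placeholder for an older layout.
const CANDIDATE_IDS = ["toolsList", "articleTools"];

// findToolbar is a hypothetical helper; doc is anything with getElementById.
// It returns the first matching element, or null if none of the ids exist.
function findToolbar(doc, ids) {
  for (const id of ids) {
    const el = doc.getElementById(id);
    if (el) return el;
  }
  return null;
}

// Tiny stand-in for a document so the sketch runs outside a browser.
function fakeDocument(existingIds) {
  return {
    getElementById: (id) => (existingIds.includes(id) ? { id: id } : null)
  };
}

const olderLayout = findToolbar(fakeDocument(["articleTools"]), CANDIDATE_IDS);
const unknownLayout = findToolbar(fakeDocument([]), CANDIDATE_IDS);
console.log(olderLayout.id);
console.log(unknownLayout);
```

When no candidate matches, the script can simply skip inserting the link, just as the original script does when toolsList is absent.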