Google Spreadsheets API

Let’s figure out how to use the Google Spreadsheets API, focusing specifically on PHP and Python wrapper libraries. You can also directly manipulate the feed protocol:

There are also a number of API kits available for Google Spreadsheets. I will first demonstrate how to use the Python API kit by creating a mashup involving the web services. I’ll then show you a simple example of using the Zend PHP API kit to read the spreadsheets generated in the mashup.

Python API Kit

Google provides a Python GData library and sample code to access the Google spreadsheet. You can either download specific releases (from or access the SVN repository:

            svn checkout

Note the dependencies on other libraries, especially ElementTree, which was not part of the standard Python libraries until version 2.5.[302]

I highly recommend reading the documentation on the Google site specific to the Python library:

Once you have the Python GData library installed, you can try some code samples, using the Python interpreter, to teach yourself how it works. First here are the obligatory imports:

import gdata.spreadsheet.service

Let’s then declare some convenience functions and variables:

            GoogleUser = "[your Google email address]"
            GooglePW = "[your password]"

Define the following convenience function:

            def GSheetService(user,pwd):
                gd_client = gdata.spreadsheet.service.SpreadsheetsService()
       = user
                gd_client.password = pwd
                gd_client.source = ''
                return gd_client

Instantiate a Google data client for your spreadsheet:

            gs = GSheetService(GoogleUser,GooglePW)
            sheets = gs.GetSpreadsheetsFeed()

To get a list of the spreadsheets, their titles, and their IDs, use this:

            print map(lambda e: e.title.text + " : "  +'/', 1)[1],sheets.entry)

This yields something like the following (which is based on my own spreadsheets):

            ['My Amazon WishList : o06341737111865728099.3585145106901556666', 'Udell Mini-
            Symposium May 1, 2007 : o06341737111865728099.1877210150658854761', 'weight.journal
            : o06341737111865728099.6289501454054682788', 'Plan : o10640374522570553588.

Note the key for the spreadsheet “My Amazon WishList.” The spreadsheet that the code I write here will be reading from and writing to is as follows:


You will need to create your own Google spreadsheets to work with since you won’t be able to write to mine. Note the ID of your spreadsheet, which you will use here.

In the browser, if I’m logged in as the owner of the spreadsheet, I can access this:{GSheetID}

For example:


Otherwise, I get a 404 error. Now I need to get the ID of the one worksheet in the “My Amazon Wishlist” spreadsheet. First use this:


For example, I use this:


to return the URL whose last segment is a worksheet ID—that is, a URL of the following form:


For example:


(in which case the worksheet ID is od6).

Now you can get the worksheet ID:

gs.GetWorksheetsFeed(key="[GSheetID]").entry[0].id.text.rsplit('/', 1)[1]

For example:

            id.text.rsplit('/', 1)[1]

There are two ways to get at the data in a worksheet—either in a ­list-­based way that gets you rows or in a ­cell-­based way that gets you a range of cells. I will show the ­list-­based approach here, which depends on the assumption that the first row is the header row.

For testing purposes, I created a spreadsheet with the header row and one line of data that I entered, as shown in Table 17-1.

Table 17.1. Table 17-1. The Sample Spreadsheet
Date Quantity ?ASIN DetailPageURL Title Author Added Price Desired
1590598385 Smart and Gets Things Joel Spolsky 6/5/2007 13.25 1? gp/product/1590598385/ Done: Joel Spolsky’s ? Concise Guide to ? Finding the Best ? Technical Talent ? (Hardcover)

The following returns a feed for the rows (there’s only one):

lfeed = gs.GetListFeed(key="[GSheetID]",wksht_id="[worksheetID]")

For example:

            lfeed =

You can see the content of the row with the following:


This results in the following:

            'ASIN: 1590598385, DetailPageURL:
            ie=UTF8&coliid=I1A0WT8LH796DN&colid=1U5EXVPVS3WP5, Author: Joel Spolsky, Date Added:
            6/5/2007, Price: 13.25, Quantity Desired: 1'

The following holds the data that has been mapped from ­namespace-­extended elements in the entry (see html#listFeedExample):


Let’s see this in action:

map(lambda e: (e[0],e[1].text), lfeed.entry[0].custom.items())

Running this returns the following:

            [('asin', '1590598385'), ('dateadded', '6/5/2007'), ('detailpageurl',
            ie=UTF8&coliid=I1A0WT8LH796DN&colid=1U5EXVPVS3WP5'), ('author', 'Joel
            Spolsky'), ('quantitydesired', '1'), ('price', '13.25'), ('title', "Smart and Gets
            Things Done: Joel Spolsky's Concise Guide to Finding the Best Technical Talent
            (Hardcover) ")]

Now let’s look at adding another row of data. Let’s see whether you can just duplicate the row by creating a dictionary of the first row and stick it into the second row:

            h = {}
            for (key,value) in lfeed.entry[0].custom.iteritems():
              h[key] = value.text

h now is as follows:

            {'asin': '1590598385', 'dateadded': '6/5/2007', 'detailpageurl':
            ie=UTF8&coliid=I1A0WT8LH796DN&colid=1U5EXVPVS3WP5', 'author': 'Joel
            Spolsky', 'quantitydesired': '1', 'price': '13.25', 'title': "Smart and Gets Things
            Done: Joel Spolsky's Concise Guide to Finding the Best Technical Talent (Hardcover)

To add the new row, use this:


For example:


To clear the second row you just added, you need to get an update lfeed that reflects the current state of the spreadsheet/worksheet and then issue a delete command:

            lfeed = gs.GetListFeed(key="[GSheetID]",

 The Google Spreadsheets API is under active development and is still in the process of maturation.

Mashup: Amazon Wishlist and Google Spreadsheets Mashup

To demonstrate how to use Google Spreadsheets for a simple mashup, I will show you how to write code that will transfer the contents of an wishlist to a Google Spreadsheets spreadsheet. Why do that? I use my wishlist to keep track of books and other stuff that I find interesting. If the wishlist belonged to someone else, I might want to download it into a spreadsheet to make it easier to generate a ­hard-­copy shopping list I could use.

Accessing the Wishlist Through the ECS Web Service

First, a word about how you can use to help you formulate the right ECS query to get the information you want to find. I figured out that I wanted to use the List Lookup query by using this:

Furthermore, I am using a ListType=WishList and the ListID=1U5EXVPVS3WP5. The URL for web interface to an wishlist is this:{ListID}/

Substituting your own AccessKeyId, you can get information about the list as a whole:


To get a page of the individual items, use the following URL:


Python Code to Mash Up and Google Spreadsheets

Now you can stitch all of this together with the following code, called (Remember to substitute your own parameters into this code.)

               an example to copy over a public Amazon wishlist to a Google Spreadsheet
               owned by the user based on code at
               GoogleUser = "[GoogleUSER]"
               GooglePW = "[GooglePASSWORD]"
               GSheet_KEY = "[GSheetID]"
               # GSheet_KEY = "o06341737111865728099.3585145106901556666"
               GWrkSh_ID = "[worksheetID]"
               #GWrkSh_ID = "od6"
               # AMAZON_LIST_ID = "1U5EXVPVS3WP5"
               AMAZON_ACCESS_KEY_ID = "[AMAZON_KEY]"
               from xml.dom import minidom
               import gdata.spreadsheet.service
               def getText(nodelist):
                   convenience function to return all the text in an array of nodes
                   rc = ""
                   for node in nodelist:
                       if node.nodeType == node.TEXT_NODE:
                           rc = rc +
                   return rc
               # a sample row for testing the insertion of a row into the spreadsheet
               GS_Example_Row = {'asin': '1590598385', 'dateadded': '6/5/2007', 'detailpageurl':
               ie=UTF8&coliid=I1A0WT8LH796DN&colid=1U5EXVPVS3WP5', 'author': 'Joel Spolsky',
               'quantitydesired': '1', 'price': '13.25', 'title': "Smart and Gets Things Done: Joel
               Spolsky's Concise Guide to Finding the Best Technical Talent (Hardcover) "}
               GS_HEADER = ['ASIN', 'DetailPageURL', 'Title', 'Author', 'Date Added', 'Price',
               'Quantity Desired']
               GS_KEYS = ['asin', 'detailpageurl', 'title', 'author', 'dateadded', 'price',
               class GSheetForAmazonList:
                   def __init__(self,user=GoogleUser,pwd=GooglePW):
                       gd_client = gdata.spreadsheet.service.SpreadsheetsService()
              = user
                       gd_client.password = pwd
                       gd_client.source = ''
                       self.gd_client = gd_client
                   def setKey(self,key):
                       self.key = key
                   def setWkshtId(self,wksht_id):
                       self.wksht_id = wksht_id
                   def listSpreadsheets(self):
                       return a list with information about the spreadsheets available to the user
                       sheets = self.gd_client.GetSpreadsheetsFeed()
                       return map(lambda e: (e.title.text ,'/', 1)[1]), 
                   def listWorkSheets(self):
                       wks = self.gd_client.GetWorksheetsFeed(key=self.key)
                       return map(lambda e: (e.title.text ,'/', 1)[1]),wks.entry)
                   def getRows(self):
                       return self.gd_client.GetListFeed(key=self.key,wksht_id=self.wksht_id).entry
                   def insertRow(self,row_data):
                       return self.gd_client.InsertRow(row_data,key=self.key,wksht_id=self.wksht_id)
                   def deleteRow(self,entry):
                       return self.gd_client.DeleteRow(entry)
                   def deleteAllRows(self):
                       entrylist = self.getRows()
                       i = 0
                       for entry in entrylist:
                           i += 1
                           print "deleted row ", i
               class amazonWishList:
               # we can use Python and WSDL
               # I've been wondering how to introspect using WSDL -- Mark Pilgrim has some answers:
               # well -- the introspection of the input parameters doesn't seem to yield the useful
               # stuff. I was hoping for more info
                   def __init__(self,listID=AMAZON_LIST_ID,amazonAccessKeyId=AMAZON_ACCESS_KEY_ID):
                       self.listID = listID
                       self.amazonAccessKeyId = amazonAccessKeyId
                   def getListInfo(self):
                       aws_url = "
               =%s" % (self.amazonAccessKeyId, self.listID)
                       import urllib
                       f = urllib.urlopen(aws_url)
                       dom = minidom.parse(f)
                       self.title = getText(dom.getElementsByTagName('ListName')[0].childNodes)
                       self.listLength = int(getText(dom.getElementsByTagName('TotalItems')[0].
                       self.TotalPages = int(getText(dom.getElementsByTagName('TotalPages')[0].
                       return(self.title, self.listLength, self.TotalPages)
                   def ListItems(self):
                       a generator for the items on the Amazon list
                       import itertools
                       for pageNum in xrange(1,self.TotalPages):
                           aws_url = "
               List&ListId=%s&ResponseGroup=ListItems,Medium&ProductPage=%s" % (
                           import urllib
                           f = urllib.urlopen(aws_url)
                           dom = minidom.parse(f)
                           items = dom.getElementsByTagName('ListItem')
                           for c in xrange(0,10):
                               yield items[c]
                   def parseListItem(self,item):
                       from string import join
                       from decimal import Decimal
                       itemDict = {}
                       itemDict['asin'] = getText(item.getElementsByTagName('ASIN')[0].childNodes)
                       itemDict['dateadded'] = getText(item.getElementsByTagName('DateAdded')[0].
                       itemDict['detailpageurl'] = getText(item.getElementsByTagName(
                       # join the text of all the author nodes, if they exist
                       authorNodes = item.getElementsByTagName('Author')
                       # blank not allowed
                       itemDict['author'] = join(map(lambda e: getText(e.childNodes), 
               authorNodes), ", ") or ' '
                       itemDict['quantitydesired'] = getText(item.getElementsByTagName(
                       titleNodes = item.getElementsByTagName('Title')
                       # blank title not allowed
                       itemDict['title'] = join(map(lambda e: getText(e.childNodes), 
               titleNodes), ", ") or ' '
                       # to fix -- not all things have a LowestNewPrice
                       itemDict['price'] = str(Decimal(getText(item.getElementsByTagName(
               'LowestNewPrice')[0].getElementsByTagName('Amount')[0].childNodes))/100) or ' '
                       return itemDict
               def main():
                   gs = GSheetForAmazonList(user=GoogleUser,pwd=GooglePW)
                   aWishList = amazonWishList(listID=AMAZON_LIST_ID,amazonAccessKeyId=
                   items = aWishList.ListItems()
                   print "deleting all rows..."
                   for item in items:
                           h = aWishList.parseListItem(item)
                           print h['asin']
                       except Exception, e:
                           print "Error %s parsing %s" % (e, item.toprettyxml("  "))
                       except Exception, e:
                           print "Error %s inserting %s" % (e, h['asin'])
               if __name__ == '__main__':

Here are some things to note about this code:

  • The GSheetForAmazonList class provides convenience methods for the Google GData library.

  • The error handling is essential since not all wishlist items necessarily have all the pieces of data requested. It’s important for the code to keep moving even if data is missing.

  • At least in the Python GData interface to Google Spreadsheets, you can’t insert blank cells.

  • amazonWishList.ListItems is a Python generator, which creates an iterator to parcel out the Amazon items one at a time. See pycon.html?t=gr,lnxw16=PyIntro for a tutorial on Python generators.

  • You can speed up the operation of this code through batch operations (http://code.?google.?com/?apis/gdata/batch.html), which are currently supported in the GData interface and in the Java API kit (but not Python).

Zend PHP API Kit for Google Spreadsheets

In this section, I’ll show you how to use the PHP API kit for Google Spreadsheets to read the contents of the Google Spreadsheets that we’ll generate in the previous mashup. You can download the Zend Framework from here:

and read about how to use the Zend Framework to access Google Spreadsheet here:

and here:

The following code first lists your spreadsheets and then the rows and named columns of the spreadsheet containing items from your wishlist:

            # user and password for google spreadsheet
            $user = "[GoogleUSER]";
            $pass = "[GooglePASSWORD]";
            # set parameters for your version of "My Amazon WishList" Google Spreadsheet
            $GSheetID = "[GSheetID]";
            #$GSheetID = "o06341737111865728099.3585145106901556666";
            # list entries from a spreadsheet
            $service = Zend_Gdata_Spreadsheets::AUTH_SERVICE_NAME;
            $client = Zend_Gdata_ClientLogin::getHttpClient($user, $pass, $service);
            $spreadsheetService = new Zend_Gdata_Spreadsheets($client);
            #  the following printFeed shows how to parse various types of feeds
            #  coming from Google Spreadsheets API
            #  function is extracted from
            function printFeed($feed)
              $i = 0;
              foreach($feed->entries as $entry) {
                if ($entry instanceof Zend_Gdata_Spreadsheets_CellEntry) {
                  print $entry->title->text .' '. $entry->content->text . "\n";
                } else if ($entry instanceof Zend_Gdata_Spreadsheets_ListEntry) {
                  print $i .' '. $entry->title->text .' '. $entry->content->text . "\n";
                } else {
                  print $i .' '. $entry->title->text . "\n";
            # figuring out how to print rows
            function printWorksheetFeed($feed)
              $i = 2;  # the first row of content is row 2
              foreach($feed->entries as $row) {
                print "Row " . $i .' '. $row->title->text . "\t";
                $rowData = $row->getCustom();
                foreach($rowData as $customEntry) {
                  print $customEntry->getColumnName() . " = " . $customEntry->getText(). "\t";
                print "\n";
            # first print a list of your Google Spreadsheets
            $feed = $spreadsheetService->getSpreadsheetFeed();
            # Print the content of a specific Spreadsheet/Worksheet
            # set a query to return a worksheet and print the contents of the worksheet
            $query = new Zend_Gdata_Spreadsheets_ListQuery();
            $listFeed = $spreadsheetService->getListFeed($query);

A Final Variation: Amazon Wishlist to Microsoft Excel via COM

With code to access the wishlist in hand, you can use COM programming to generate an Excel spreadsheet with the same information. To learn more about the details about how to do so, consult Chapter 12 of Python Programming on Win32.[303]

            from amazonListToGSheet import GS_HEADER, amazonWishList, AMAZON_LIST_ID, 
            from win32com.client import Dispatch
            # fire up the Excel application
            xlApp = Dispatch("Excel.Application")
            xlApp.Visible = 1
            # write the headers
            col = 1
            def insertRow(sheet,row,data,keys):
                col = 1
                for k in keys:
                    sheet.Cells(row,col).Value = data[k]
                    col += 1
            for h in GS_HEADER:
                xlApp.ActiveSheet.Cells(1,col).Value = h
                col +=1
            # now loop through the amazon wishlist
            aWishList =
            items = aWishList.ListItems()
            row = 2
            for item in items:
                    p = aWishList.parseListItem(item)
                    print p['asin']
                except Exception, e:
                    print "Error %s parsing %s" % (e, item.toprettyxml("  "))
                    row += 1
                except Exception, e:
                    print "Error %s inserting %s" % (e, p['asin'])