SOAP is a complicated topic of which I readily admit to having only a limited understanding. SOAP and the layers of technologies built on top of SOAP—WSDL, UDDI, and the various WS-* specifications (http://en.wikipedia.org/wiki/WS-%2A
)—are clearly getting lots of attention, especially in enterprise computing, which deals with needs addressed by this technology stack. I cover SOAP and WSDL (and leave out the other specifications) in this book because some of the APIs you may want to use in creating mashups are expressed in terms of SOAP and WSDL. My goal is to provide practical guidance as to how to consume such services, primarily from the perspective of a PHP and Python programmer.
As with XML-RPC, SOAP and WSDL are supposed to make your life as a programmer easier by abstracting away the underlying HTTP and XML exchanges so that calling a web service looks a lot like making a local procedure call. I’ll start with simple examples, using tools that make SOAP and WSDL pretty easy to use, in order to highlight the benefits of SOAP and WSDL, and then I’ll move to more complicated examples that show some of the challenges. Specifically, I’ll first show you how to use a relatively straightforward SOAP service (geocoder.us), then proceed to a more complicated service (Amazon.com’s ECS AWS), and finally discuss one that turns out to be unexpectedly complicated (the Flickr SOAP interface).
As you learned in Chapter 6, the process of using the Flickr REST interface generally involves the following steps:
Finding the right Flickr method to use
Figuring out what parameters to pass in and how to package up the values
Parsing the XML payload
Although these steps are not conceptually difficult, they do tend to require a fair amount of manual inspection of the Flickr documentation by any developer working directly with the Flickr API. A Flickr API kit in the language of your choice might make it easier because it makes Flickr look like an object in that language. Accordingly, you might then be able to use the facilities of the language itself to tell you what Flickr methods are available and what parameters they take and be able to get access to the results without having to directly parse XML yourself.
You might be happy as a user of the third-party kit, but the author of any third-party kit for Flickr must still deal with the original problem of manually translating the logic and semantics of the Flickr documentation and API into code to abstract it away for the user of the API kit. It’s a potentially tedious and error-prone process. In Chapter 6, I showed you how you could use the flickr.reflection
methods to automatically list the available API methods and their parameters. Assuming that Flickr keeps the information coming out of those methods up-to-date, there is plenty of potential to exploit with the reflection methods.
However, flickr.reflection.getMethodInfo
does not currently give us information about the formal data typing of the parameters or the XML payload. For instance, http://www.flickr.com/services/api/flickr.photos.search.html
tells us the following about the per_page
argument: “Number of photos to return per page. If this argument is omitted, it defaults to 100. The maximum allowed value is 500.” Although this information enables a human interpreter to properly formulate the per_page
argument, it would be difficult to write a program that takes advantage of this fact about per_page
. In fact, it would be useful even if flickr.reflection.getMethodInfo
could tell us that the argument is an integer without letting us know about its range.
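To see why even modest type information would help, here is a minimal sketch of what a program could do if it had a machine-readable description of per_page. The PER_PAGE_SPEC dictionary and its field names are invented for illustration; flickr.reflection.getMethodInfo returns nothing of the sort. The sketch is written for Python 3 and uses only the standard library.

```python
# Hypothetical machine-readable description of one Flickr argument.
# The field names ("type", "default", "max") are assumptions made for this
# sketch; the values come from the per_page documentation quoted above.
PER_PAGE_SPEC = {"name": "per_page", "type": int, "default": 100, "max": 500}


def coerce_argument(spec, raw_value=None):
    """Convert raw_value to the declared type and enforce the declared maximum."""
    if raw_value is None:
        # An omitted argument falls back to the documented default.
        return spec["default"]
    value = spec["type"](raw_value)  # raises ValueError for input such as "lots"
    if value > spec["max"]:
        raise ValueError("%s may not exceed %d" % (spec["name"], spec["max"]))
    return value


print(coerce_argument(PER_PAGE_SPEC))         # argument omitted
print(coerce_argument(PER_PAGE_SPEC, "250"))  # string input converted to an int
```

With only the integer type declared (and no range), the conversion step would still work; the range check is the part that needs the extra metadata.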
That’s where Web Services Description Language (WSDL) comes in as a potential solution, along with its typical companion, SOAP. There are currently two noteworthy versions of WSDL. Although WSDL 2.0 (documented at http://www.w3.org/TR/2007/REC-wsdl20-20070626/
) is a W3C recommendation, it seems to me that WSDL 1.1, which never became a de jure standard, will remain the dominant version of WSDL for some time (both in the WSDL documents you come across and in the tools to which you will have easy access). WSDL 1.1 is documented at http://www.w3.org/TR/wsdl
.
A WSDL document specifies the methods (or, in WSDL-speak, operations) that are available to you, their associated messages, and how they are turned into concrete calls you can make, typically through SOAP. (There is support in WSDL 2.0 for invoking calls using HTTP without using SOAP.) Let me first show you concretely how to use WSDL, and I’ll then discuss some details of its structure that you might want to know even if you choose never to look in depth at how it works.
Consider the geocoder.us service (http://geocoder.us/
) that offers both free noncommercial and for-pay commercial geocoding for U.S. addresses. You can turn to the API documentation (http://geocoder.us/help/
) to learn how to use its free REST-RDF, XML-RPC, and SOAP interfaces. There are three methods supported by geocoder.us:
geocode
: Takes a U.S. address or intersection and returns a list of results
geocode_address
: Works just like geocode
except that it accepts only an address
geocode_intersection
: Works just like geocode
except that it accepts only an intersection
Let’s first use the interface that is most familiar to you, which is its REST-RDF interface, and consider the geocode
method specifically. To find the latitude and longitude of an address, you make an HTTP GET
request of the following form:
http://geocoder.us/service/rest/geocode?address={address}
For example, applying the method to the address of Apress:
http://geocoder.us/service/rest/geocode?address=2855+Telegraph+Ave%2C+Berkeley%2C+CA
gets you this:
    <?xml version="1.0"?>
    <rdf:RDF
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    >
      <geo:Point rdf:nodeID="aid78384162">
        <dc:description>2855 Telegraph Ave, Berkeley CA 94705</dc:description>
        <geo:long>-122.260070</geo:long>
        <geo:lat>37.858276</geo:lat>
      </geo:Point>
    </rdf:RDF>
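Before moving on to SOAP, it may help to see how little code the REST-RDF response needs. The following is a minimal sketch, written for Python 3 and the standard library only, that parses the response shown above. The response is pasted in as a string so that the example does not depend on the service being reachable; the function name parse_points is my own.

```python
import xml.etree.ElementTree as ET

# The geocoder.us response from the text, embedded so no network is needed.
RDF_RESPONSE = """<?xml version="1.0"?>
<rdf:RDF
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <geo:Point rdf:nodeID="aid78384162">
    <dc:description>2855 Telegraph Ave, Berkeley CA 94705</dc:description>
    <geo:long>-122.260070</geo:long>
    <geo:lat>37.858276</geo:lat>
  </geo:Point>
</rdf:RDF>"""

# Prefix-to-namespace map used in the ElementTree path expressions below.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def parse_points(rdf_text):
    """Return a list of (description, latitude, longitude) tuples."""
    root = ET.fromstring(rdf_text)
    points = []
    for point in root.findall("geo:Point", NS):
        description = point.findtext("dc:description", namespaces=NS)
        latitude = float(point.findtext("geo:lat", namespaces=NS))
        longitude = float(point.findtext("geo:long", namespaces=NS))
        points.append((description, latitude, longitude))
    return points


points = parse_points(RDF_RESPONSE)
for description, latitude, longitude in points:
    print("%s: %s, %s" % (description, latitude, longitude))
```

Note that the code has to know in advance that geo:lat and geo:long hold floating-point numbers; nothing in the response says so.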
Now let’s make the same call using the SOAP interface. Instead of making the SOAP call directly to the geocode
method, let’s use the WSDL document for the service:
http://geocoder.us/dist/eg/clients/GeoCoderPHP.wsdl
![]() | Note |
---|---|
Because the first WSDL document ( |
I will use the WSDL document in a variety of ways to teach you the ideal usage pattern for WSDL, which involves the following steps:
A SOAP/WSDL tool/library takes a given WSDL document and makes transparent the operations that are available to you.
For a given operation, the SOAP/WSDL tool makes it easy for you to understand the possible input parameters and formulate the appropriate request message.
The SOAP/WSDL tool then returns the response to you in some easy-to-parse format and handles any faults that come up in the course of the operation.
My favorite way of testing a WSDL file and issuing SOAP calls is to use a visual IDE such as oXygen (http://www.oxygenxml.com/
). Among the plethora of XML-related technologies supported by oXygen is the WSDL SOAP Analyser. I describe how you can use it to invoke the geocoder.us geocode operation to illustrate a core workflow.
![]() | Note |
---|---|
oXygen is a commercial product. You can evaluate it for 30 days free of charge. XML Spy ( |
When you start the WSDL SOAP Analyser, you are prompted for the URL of a WSDL file. You enter the URL for the geocoder.us WSDL (listed earlier), and oXygen reads the WSDL file and displays a panel with four subpanels. (Figure 7-1 shows the setup of this panel.) The first subpanel contains three drop-down menus for three types of entities defined in the WSDL file:
Services
Ports
Operations
The geocoder.us WSDL file follows a pattern typical for many WSDL files: it has one service (GeoCode_Service
) tied to one port (GeoCode_Port
), which is tied, through a specific binding, to one or more operations. It’s this list of operations that is the heart of the matter if you want to use any of the SOAP services. The panel shows three operations (geocode
, geocode_address
, and geocode_intersection
) corresponding to the three methods available from geocoder.us.
The values shown in the three other subpanels depend on the operation you select. The four subpanels list the parameters described in Table 7-1.
| Panel | Parameter | Explanation |
|---|---|---|
| WSDL | Services | Drop-down menu of services (for example, GeoCode_Service) |
| WSDL | Ports | Drop-down menu of ports (for example, GeoCode_Port) |
| WSDL | Operations | Drop-down menu of operations (for example, geocode) |
| Actions | URL | For example, http://rpc.geocoder.us/service/soap/ |
| Actions | SOAP action | For example, http://rpc.geocoder.us/Geo/Coder/US#geocode |
| Request | | The body of the request (you fill in the parameters) |
| Response | | The body of the response (this is the result of the operation) |
As someone interested in just using the geocode operation (rather than understanding the underlying mechanics), you would jump immediately to the sample request that oXygen generates:
    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
      <SOAP-ENV:Header/>
      <SOAP-ENV:Body>
        <oxy:geocode xmlns:oxy="http://rpc.geocoder.us/Geo/Coder/US/"
            SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
          <location>STRING</location>
        </oxy:geocode>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
To look up the address of Apress, you would replace this:
<location>STRING</location>
with the following:
<location>2855 Telegraph Ave, Berkeley CA 94705</location>
and hit the Send button on the Request subpanel to get the following to show up in the Response subpanel:
    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
        xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <SOAP-ENV:Body>
        <namesp6:geocodeResponse xmlns:namesp6="http://rpc.geocoder.us/Geo/Coder/US/">
          <geo:s-gensym23 xsi:type="SOAP-ENC:Array"
              xmlns:geo="http://rpc.geocoder.us/Geo/Coder/US/"
              SOAP-ENC:arrayType="geo:GeocoderAddressResult[1]">
            <item xsi:type="geo:GeocoderAddressResult">
              <number xsi:type="xsd:int">2855</number>
              <lat xsi:type="xsd:float">37.858276</lat>
              <street xsi:type="xsd:string">Telegraph</street>
              <state xsi:type="xsd:string">CA</state>
              <zip xsi:type="xsd:int">94705</zip>
              <city xsi:type="xsd:string">Berkeley</city>
              <suffix xsi:type="xsd:string"/>
              <long xsi:type="xsd:float">-122.260070</long>
              <type xsi:type="xsd:string">Ave</type>
              <prefix xsi:type="xsd:string"/>
            </item>
          </geo:s-gensym23>
        </namesp6:geocodeResponse>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
There you have it. Let’s review what oXygen and a WSDL document could accomplish for you:
You can get a list of operations available for the services and ports defined in the WSDL (not atypically one service and port combination).
You are given a template for the body of the request with an indication of the data type of what you need to fill in.
oXygen packages up the request, issues the HTTP request, handles the response, and presents you with the results.
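If you want to handle such a response yourself rather than in oXygen, the xsi:type annotations give a program enough to convert each field without guessing. Here is a minimal Python 3 sketch using only the standard library. It works on an abridged copy of the response shown above; the CONVERTERS table and the function name are my own, and matching on the literal xsd: prefix is a simplification that assumes the response always uses that prefix.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Abridged copy of the geocode response from the text (some fields dropped).
SOAP_RESPONSE = """<SOAP-ENV:Envelope
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <SOAP-ENV:Body>
    <namesp6:geocodeResponse xmlns:namesp6="http://rpc.geocoder.us/Geo/Coder/US/">
      <geo:s-gensym23 xsi:type="SOAP-ENC:Array"
          xmlns:geo="http://rpc.geocoder.us/Geo/Coder/US/"
          SOAP-ENC:arrayType="geo:GeocoderAddressResult[1]">
        <item xsi:type="geo:GeocoderAddressResult">
          <number xsi:type="xsd:int">2855</number>
          <lat xsi:type="xsd:float">37.858276</lat>
          <street xsi:type="xsd:string">Telegraph</street>
          <zip xsi:type="xsd:int">94705</zip>
          <long xsi:type="xsd:float">-122.260070</long>
        </item>
      </geo:s-gensym23>
    </namesp6:geocodeResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

# Map xsi:type values onto Python converters. Assumes the "xsd" prefix is
# bound to the XML Schema namespace, as it is in the response above.
CONVERTERS = {"xsd:int": int, "xsd:float": float, "xsd:string": str}


def parse_geocode_response(soap_text):
    """Return one dict per <item>, with values converted according to xsi:type."""
    root = ET.fromstring(soap_text)
    body = root.find("{%s}Body" % SOAP_ENV)
    results = []
    for item in body.iter("item"):
        record = {}
        for field in item:
            convert = CONVERTERS.get(field.get("{%s}type" % XSI), str)
            record[field.tag] = convert(field.text or "")
        results.append(record)
    return results


results = parse_geocode_response(SOAP_RESPONSE)
print(results[0]["lat"], results[0]["long"])
```

Compare this with the REST-RDF case, where the knowledge that a latitude is a float has to live in the client code.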
To confirm that you understand the nuances of the geocode
SOAP call, you can rewrite the SOAP request as a curl
invocation—once you notice the role played by the two parameters that oXygen does pick up from the WSDL document:
The SOAP action of http://rpc.geocoder.us/Geo/Coder/US#geocode
. In SOAP 1.1, the version of SOAP used for geocoder.us, the SOAP action is transmitted as a SOAPAction
HTTP request header.
The URL (or location) to target the SOAP call: http://rpc.geocoder.us/service/soap/
.
You can now replicate this call with curl
:
    curl -v -X POST \
      -H "SOAPAction: http://rpc.geocoder.us/Geo/Coder/US#geocode" \
      --data-binary "<SOAP-ENV:Envelope xmlns:SOAP-ENV='http://schemas.xmlsoap.org/soap/envelope/'><SOAP-ENV:Header/><SOAP-ENV:Body><oxy:geocode xmlns:oxy='http://rpc.geocoder.us/Geo/Coder/US/' SOAP-ENV:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/'><location>2855 Telegraph Ave, Berkeley, CA</location></oxy:geocode></SOAP-ENV:Body></SOAP-ENV:Envelope>" \
      http://rpc.geocoder.us/service/soap/
Note that you need to know the SOAPAction
header and URL of the SOAP call only if you are trying to understand all the details of the HTTP request and response. oXygen was just being helpful in pointing out those parameters. They, however, were not needed to fill out an address or interpret the latitude or longitude contained in the response.
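The same raw request can be assembled in Python without any SOAP library. The sketch below, for Python 3 and the standard library, only builds the request object and prints it; the line that would actually send it is left commented out because it needs network access and a live service. The Content-Type header is my addition (text/xml is what the SOAP 1.1 HTTP binding calls for) and does not appear in the curl command above; the function name is also my own.

```python
import urllib.request
from xml.sax.saxutils import escape

# Both values come from the WSDL document, as reported by oXygen.
ENDPOINT = "http://rpc.geocoder.us/service/soap/"
SOAP_ACTION = "http://rpc.geocoder.us/Geo/Coder/US#geocode"

ENVELOPE_TEMPLATE = (
    "<SOAP-ENV:Envelope"
    " xmlns:SOAP-ENV='http://schemas.xmlsoap.org/soap/envelope/'>"
    "<SOAP-ENV:Header/><SOAP-ENV:Body>"
    "<oxy:geocode xmlns:oxy='http://rpc.geocoder.us/Geo/Coder/US/'"
    " SOAP-ENV:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/'>"
    "<location>%s</location>"
    "</oxy:geocode></SOAP-ENV:Body></SOAP-ENV:Envelope>"
)


def build_geocode_request(address):
    """Build (but do not send) the HTTP POST that carries the SOAP envelope."""
    # escape() protects characters such as & and < inside the address.
    body = (ENVELOPE_TEMPLATE % escape(address)).encode("utf-8")
    headers = {
        "SOAPAction": SOAP_ACTION,
        # Assumption: not in the curl example, added per the SOAP 1.1 HTTP binding.
        "Content-Type": "text/xml; charset=utf-8",
    }
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")


request = build_geocode_request("2855 Telegraph Ave, Berkeley, CA")
print(request.get_method(), request.full_url)
# urllib stores header names capitalized ("Soapaction"); HTTP header names
# are case-insensitive, so this is equivalent to SOAPAction.
for name, value in request.header_items():
    print("%s: %s" % (name, value))

# To send it (requires network access and a running service):
# response_xml = urllib.request.urlopen(request).read()
```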
Note: If you’re wondering why I’m not using Flickr for my concrete example, Flickr does not offer a WSDL document even though it does present a SOAP interface. I’ll return to discussing Flickr in the later section called “The Flickr API via SOAP.”
Even without access to oXygen or the Eclipse Web Services Explorer, you can use Tomi Vanek’s WSDL XSLT-based viewer (http://tomi.vanek.sk/index.php?page=wsdl-viewer
) to make sense of a WSDL document. For example, take a look at the results for the geocoder.us WSDL document:
http://www.w3.org/2000/06/webdata/xslt?xslfile=http://tomi.vanek.sk/xml/wsdl-viewer.xsl&xmlfile=http://geocoder.us/dist/eg/clients/GeoCoderPHP.wsdl&transform=Submit
Let’s take a look at how to use the geocoder.us WSDL using the SOAPpy
library in Python.
![]() | Note |
---|---|
You can download |
The following piece of Python code shows the process of creating a WSDL proxy, asking for the methods (or operations) that are defined in the WSDL document, and then calling the geocode
method and parsing the results:
    from SOAPpy import WSDL

    wsdl_url = r'http://geocoder.us/dist/eg/clients/GeoCoderPHP.wsdl'
    server = WSDL.Proxy(wsdl_url)

    # let's see what operations are supported
    server.show_methods()

    # geocode the Apress address
    address = "2855 Telegraph Ave, Berkeley, CA"
    result = server.geocode(location=address)
    print "latitude and longitude: %s, %s" % (result[0]['lat'], result[0]['long'])
This produces the following output (edited for clarity):
    Method Name: geocode_intersection
       In #0: intersection ((u'http://www.w3.org/2001/XMLSchema', u'string'))
       Out #0: results ((u'http://rpc.geocoder.us/Geo/Coder/US/', u'ArrayOfGeocoderIntersectionResult'))

    Method Name: geocode_address
       In #0: address ((u'http://www.w3.org/2001/XMLSchema', u'string'))
       Out #0: results ((u'http://rpc.geocoder.us/Geo/Coder/US/', u'ArrayOfGeocoderAddressResult'))

    Method Name: geocode
       In #0: location ((u'http://www.w3.org/2001/XMLSchema', u'string'))
       Out #0: results ((u'http://rpc.geocoder.us/Geo/Coder/US/', u'ArrayOfGeocoderResult'))

    latitude and longitude: 37.858276, -122.26007
Notice the reference to XML schema types in describing the location
parameter for geocode. The type definitions come, as one expects, from the WSDL document.
The concision of this code shows WSDL and SOAP in a good light.
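To make concrete where those type definitions come from, here is a minimal sketch of reading operation signatures out of a WSDL 1.1 document with the Python 3 standard library. The SAMPLE_WSDL fragment is hand-written for this illustration and is not the actual geocoder.us file: the part names and types mirror the SOAPpy output above, but the message and portType names (geocodeRequest, geocodeResponse, GeoCode_PortType) are assumptions.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hand-written, heavily abridged WSDL 1.1 fragment (hypothetical names).
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:tns="http://rpc.geocoder.us/Geo/Coder/US/"
    targetNamespace="http://rpc.geocoder.us/Geo/Coder/US/">
  <message name="geocodeRequest">
    <part name="location" type="xsd:string"/>
  </message>
  <message name="geocodeResponse">
    <part name="results" type="tns:ArrayOfGeocoderResult"/>
  </message>
  <portType name="GeoCode_PortType">
    <operation name="geocode">
      <input message="tns:geocodeRequest"/>
      <output message="tns:geocodeResponse"/>
    </operation>
  </portType>
</definitions>"""


def local_name(qname):
    """Strip the namespace prefix from a qualified name such as tns:geocodeRequest."""
    return qname.split(":")[-1]


def describe_operations(wsdl_text):
    """Map each operation name to the (part name, type) pairs of its input and output."""
    root = ET.fromstring(wsdl_text)
    # First index every message by name, since operations refer to messages.
    messages = {}
    for message in root.findall("wsdl:message", WSDL_NS):
        messages[message.get("name")] = [
            (part.get("name"), part.get("type"))
            for part in message.findall("wsdl:part", WSDL_NS)
        ]
    operations = {}
    for operation in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        signature = {}
        for direction in ("input", "output"):
            reference = operation.find("wsdl:" + direction, WSDL_NS).get("message")
            signature[direction] = messages[local_name(reference)]
        operations[operation.get("name")] = signature
    return operations


operations = describe_operations(SAMPLE_WSDL)
for name, signature in operations.items():
    print(name, signature["input"], signature["output"])
```

This is roughly the lookup that a WSDL-aware library performs on your behalf before it can tell you that geocode takes a string named location.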
Let’s do the straight-ahead PHP PEAR::SOAP
invocation of geocoder.us
. You’ll see the same pattern of loading the WSDL document using a SOAP/WSDL library, packaging up a named parameter (location
) in the request, and then parsing the results.
    <?php
    # example using PEAR::SOAP + Geocoder SOAP search
    require 'SOAP/Client.php';

    # let's look up Apress
    $address = '2855 Telegraph Avenue, Berkeley, CA 94705';

    $wsdl_url = "http://geocoder.us/dist/eg/clients/GeoCoderPHP.wsdl";

    # true to indicate that it is a WSDL url.
    $soap = new SOAP_Client($wsdl_url, true);

    # the geocode operation takes a single named parameter
    $params = array(
        'location' => $address
    );

    $results = $soap->call('geocode', $params);

    # include some fault handling code
    if (PEAR::isError($results)) {
        $fault = $results->getFault();
        print "Error number " . $fault->faultcode . " occurred\n";
        print " " . $fault->faultstring . "\n";
    } else {
        print "The latitude and longitude for address is: {$results[0]->lat}, {$results[0]->long}";
    }
    ?>
![]() | Note |
---|---|
I have not been able to figure out how to use |
Now that you have studied the geocoder.us
service, which has three SOAP methods, each with a single input parameter, let’s turn to a more complicated example, the Amazon E-Commerce Service (ECS):
http://www.amazon.com/E-Commerce-Service-AWS-home-page/b?ie=UTF8&node=12738641
See the “Setting Up an Amazon ECS Account” sidebar to learn about how to set up an Amazon ECS account.
Although I focus here on the SOAP interface, ECS also has a REST interface. The WSDL for AWS-ECS is found at
http://webservices.amazon.com/AWSECommerceService/AWSECommerceService.wsdl?
Using one of the SOAP/WSDL toolkits I presented in the previous section (for example, oXygen, the Eclipse Web Services Explorer, or Vanek’s WSDL viewer), you can easily determine the 20 operations that are currently defined by the WSDL document. Here I show you how to use the ItemSearch
operation.
If you use oXygen to formulate a template for a SOAP request, you’ll get the following:
    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
      <SOAP-ENV:Header/>
      <SOAP-ENV:Body>
        <ItemSearch xmlns="http://webservices.amazon.com/AWSECommerceService/2007-07-16">
          <AWSAccessKeyId>STRING</AWSAccessKeyId>
          [5 tags]
          <Shared>
            [40 tags]
          </Shared>
          <Request>
            [40 tags]
          </Request>
        </ItemSearch>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
Let’s say you wanted to look for books with the keyword flower
. To create the proper request, you’ll need to figure out which of the many tags you must keep and what values to supply for them. Through reading the documentation for ItemSearch
(http://docs.amazonwebservices.com/AWSECommerceService/2007-07-16/DG/ItemSearch.html
) and trial and error, you can boil down the request template to the following:
    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
      <SOAP-ENV:Header/>
      <SOAP-ENV:Body>
        <ItemSearch xmlns="http://webservices.amazon.com/AWSECommerceService/2007-07-16">
          <AWSAccessKeyId>STRING</AWSAccessKeyId>
          <Request>
            <Keywords>STRING</Keywords>
            <SearchIndex>STRING</SearchIndex>
          </Request>
        </ItemSearch>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
You can pull together a full request by filling out your Amazon key and entering flower
and Books
for the <Keywords>
and <SearchIndex>
into a curl
invocation:
    curl -H "SOAPAction: http://soap.amazon.com" \
      -d "<?xml version='1.0' encoding='UTF-8'?><SOAP-ENV:Envelope xmlns:SOAP-ENV='http://schemas.xmlsoap.org/soap/envelope/'><SOAP-ENV:Header/><SOAP-ENV:Body><ItemSearch xmlns='http://webservices.amazon.com/AWSECommerceService/2007-07-16'><AWSAccessKeyId>[AMAZON-KEY]</AWSAccessKeyId><Request><Keywords>flower</Keywords><SearchIndex>Books</SearchIndex></Request></ItemSearch></SOAP-ENV:Body></SOAP-ENV:Envelope>" \
      "http://soap.amazon.com/onca/soap?Service=AWSECommerceService"
to which you get something like this:
    <?xml version="1.0" encoding="UTF-8"?>
    <SOAP-ENV:Envelope
        xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
        xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <SOAP-ENV:Body>
        <ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2007-07-16">
          <OperationRequest>
            [....]
          </OperationRequest>
          <Items>
            <Request>
              <IsValid>True</IsValid>
              <ItemSearchRequest>
                <Keywords>flower</Keywords>
                <SearchIndex>Books</SearchIndex>
              </ItemSearchRequest>
            </Request>
            <TotalResults>34489</TotalResults>
            <TotalPages>3449</TotalPages>
            <Item>
              <ASIN>0812968069</ASIN>
              <DetailPageURL>http://www.amazon.com/gp/redirect.html%3FASIN=0812968069%26tag=ws%26lcode=sp1%26cID=2025%26ccmID=165953%26location=/o/ASIN/0812968069%253FSubscriptionId=0Z8Z8FYGP01Q00KF5802</DetailPageURL>
              <ItemAttributes>
                <Author>Lisa See</Author>
                <Manufacturer>Random House Trade Paperbacks</Manufacturer>
                <ProductGroup>Book</ProductGroup>
                <Title>Snow Flower and the Secret Fan: A Novel</Title>
              </ItemAttributes>
            </Item>
            [...]
          </Items>
        </ItemSearchResponse>
      </SOAP-ENV:Body>
    </SOAP-ENV:Envelope>
Notice what makes this example more complicated than geocoder.us
:
There are many more operations.
There are many more parameters, and it’s not obvious what is mandatory without reading the documentation and experimenting.
The XML in the request and response involves complex types. Notice that <Keywords>
and <SearchIndex>
are wrapped within <Request>
. This representation means you have to understand how to get your favorite SOAP library to package up the request and handle the response.
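To illustrate what "packaging up" a complex type involves, here is a minimal sketch that turns a nested Python dictionary into the nested ItemSearch request by hand, using the Python 3 standard library. The helper names add_children and build_item_search are my own, and this is one possible approach rather than what any particular SOAP library does internally.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
AWS_NS = "http://webservices.amazon.com/AWSECommerceService/2007-07-16"

# Ask ElementTree to serialize the envelope namespace with a familiar prefix.
ET.register_namespace("SOAP-ENV", SOAP_ENV)


def add_children(parent, values, namespace):
    """Turn a dict into child elements; a nested dict becomes a nested element."""
    for name, value in values.items():
        child = ET.SubElement(parent, "{%s}%s" % (namespace, name))
        if isinstance(value, dict):
            add_children(child, value, namespace)
        else:
            child.text = str(value)


def build_item_search(access_key, keywords, search_index):
    """Return the SOAP envelope for an ItemSearch call as a string."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    ET.SubElement(envelope, "{%s}Header" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    item_search = ET.SubElement(body, "{%s}ItemSearch" % AWS_NS)
    parameters = {
        "AWSAccessKeyId": access_key,
        # Keywords and SearchIndex must sit inside <Request>.
        "Request": {"Keywords": keywords, "SearchIndex": search_index},
    }
    add_children(item_search, parameters, AWS_NS)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_item_search("[AMAZON-KEY]", "flower", "Books")
print(request_xml)
```

The serialized prefixes may differ from those in the template generated by oXygen, but the namespaces and nesting are the same, which is what matters to the service.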
Using the Python SOAPpy
library, you perform the same SOAP call with the following:
    # amazon search using WSDL

    KEY = "[AMAZON-KEY]"

    from SOAPpy import WSDL

    class amazon_ecs(object):
        def __init__(self, key):
            AMAZON_WSDL = "http://webservices.amazon.com/AWSECommerceService/AWSECommerceService.wsdl?"
            self.key = key
            self.server = WSDL.Proxy(AMAZON_WSDL)

        def ItemSearch(self, Keywords, SearchIndex):
            return self.server.ItemSearch(AWSAccessKeyId=self.key,
                                          Request={'Keywords': Keywords, 'SearchIndex': SearchIndex})

    if __name__ == "__main__":
        aws = amazon_ecs(KEY)
        results = aws.ItemSearch('flower', 'Books')
        print results.Items.TotalPages, results.Items.TotalResults
        for item in results.Items.Item:
            print item.ASIN, item.DetailPageURL, item.ItemAttributes.Author
Notice in particular how to represent the nested parameters in this:
self.server.ItemSearch(AWSAccessKeyId=self.key,Request={'Keywords':Keywords,'SearchIndex':SearchIndex})
Also notice how to read off the nested elements in the XML response:
    print results.Items.TotalPages, results.Items.TotalResults
    for item in results.Items.Item:
        print item.ASIN, item.DetailPageURL, item.ItemAttributes.Author
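For comparison, here is a minimal sketch of walking the same nested response without SOAPpy, using only the Python 3 standard library. It runs against an abridged copy of the ItemSearch response shown earlier; the function name summarize and the keys of the dictionary it returns are my own choices.

```python
import xml.etree.ElementTree as ET

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "aws": "http://webservices.amazon.com/AWSECommerceService/2007-07-16",
}

# Abridged copy of the ItemSearch response from the text.
RESPONSE = """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2007-07-16">
      <Items>
        <TotalResults>34489</TotalResults>
        <TotalPages>3449</TotalPages>
        <Item>
          <ASIN>0812968069</ASIN>
          <ItemAttributes>
            <Author>Lisa See</Author>
            <Title>Snow Flower and the Secret Fan: A Novel</Title>
          </ItemAttributes>
        </Item>
      </Items>
    </ItemSearchResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""


def summarize(response_text):
    """Pull the totals and a few per-item fields out of an ItemSearch response."""
    root = ET.fromstring(response_text)
    items = root.find("soap:Body/aws:ItemSearchResponse/aws:Items", NS)
    summary = {
        "total_pages": int(items.findtext("aws:TotalPages", namespaces=NS)),
        "total_results": int(items.findtext("aws:TotalResults", namespaces=NS)),
        "books": [],
    }
    for item in items.findall("aws:Item", NS):
        summary["books"].append({
            "asin": item.findtext("aws:ASIN", namespaces=NS),
            "author": item.findtext("aws:ItemAttributes/aws:Author", namespaces=NS),
            "title": item.findtext("aws:ItemAttributes/aws:Title", namespaces=NS),
        })
    return summary


summary = summarize(RESPONSE)
print(summary["total_pages"], summary["total_results"])
for book in summary["books"]:
    print(book["asin"], book["author"], book["title"])
```

Each path expression here spells out by hand the nesting that SOAPpy exposes as attribute access (results.Items.Item and so on).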
When you look at this Python code and my description of how to use oXygen to interface with Amazon ECS via WSDL and SOAP, you might think to yourself that doing so doesn’t look that hard. The combination of WSDL and SOAP does indeed bring some undeniable conveniences, such as the automated discovery of what methods are available to you as a programmer. However, my experience of SOAP and WSDL is that they are still a long way from plug-and-go technology, at least in the world of scripting languages such as PHP and Python. It took me a great deal of trial and error, reverse engineering, reading source code, and hunting around to even get to the point of distilling for you the various examples of how to use SOAP and WSDL that you see here. I would have liked to reduce using SOAP and WSDL to foolproof recipes that hid from you what was happening underneath.
For instance, returning to the example, I was not able to craft a satisfactory working example of using PEAR::SOAP
to call ItemSearch
. Some of the issues I struggled with included how to pass in parameters with complex types to a SOAP call, how to parse the results, and how to debug the entire process. I’d be willing to bet that there is in fact a way to make this call work with PEAR::SOAP
or in some other PHP toolkit. However, if I had wanted to call this SOAP service only for a mashup, I would likely have given up even earlier on figuring out how to make it work.
Note: It might be true that if you use Java or .NET, programming environments for which there is deep support for SOAP and WSDL, you might have an easier time using this technology. Don’t let me discourage you from trying those tools. I hope to find out for myself whether libraries such as Axis from the Apache Project (
The Flickr SOAP request and response formats are documented here:
http://www.flickr.com/services/api/request.soap.html
http://www.flickr.com/services/api/response.soap.html
The first thing to notice about the Flickr SOAP interface is that Flickr provides no WSDL document to tell us how to use it. Hence, if you want to use Flickr SOAP, you need to figure out how to call it directly yourself. But why bother? Flickr has a wonderfully supported REST interface that you already know how to use. If you go down the road of using the SOAP interface, you’ll have to deal with many challenges, some of which I have already discussed.