
RESTful Webservices - Representational State Transfer


Preface

It may seem strange to claim that the Web’s potential for distributed programming has been overlooked. After all, this book competes for shelf space with any number of other books about web services. The problem is, most of today’s “web services” have nothing to do with the Web. In opposition to the Web’s simplicity, they espouse a heavyweight architecture for distributed object access, similar to COM or CORBA. Today’s “web service” architectures reinvent or ignore every feature that makes the Web successful.

 

It’s time to put the “web” back into “web services.”

Our goal throughout is to show the power (and, where appropriate, the limitations) of the basic web technologies:

 

Technology    Description
HTTP          Application protocol
URI           Naming standard
XML           Markup language

 

Our topic is the set of principles underlying the Web: Representational State Transfer, or REST. For the first time, we set down best practices for “RESTful” web services. We cut through the confusion and guesswork, replacing folklore and implicit knowledge with concrete advice.

 

We introduce the Resource-Oriented Architecture (ROA), a commonsense set of rules for designing RESTful web services. We also show you the view from the client side: how you can write programs to consume RESTful services.

 

Our examples include real-world RESTful services like Amazon’s Simple Storage Service (S3), the various incarnations of the Atom Publishing Protocol, and Google Maps. We also take popular services that fall short of RESTfulness, like the del.icio.us social bookmarking API, and rehabilitate them.

 


The Story of the REST

 

REST is simple, but it’s well defined and not an excuse for implementing web services as half-assed web sites because “they’re the same.” Unfortunately, until now the main REST reference was chapter five of Roy Fielding’s 2000 Ph.D. dissertation, which is a good read for a Ph.D. dissertation, but leaves most of the real-world questions unanswered.

 

That’s because it presents REST not as an architecture but as a way of judging architectures. The term “RESTful” is like the term “object-oriented.” A language, a  framework, or an application may be designed in an object-oriented way, but that doesn’t make its architecture the object-oriented architecture.

 

We wrote the Resource-Oriented Architecture (ROA) to bring the best practices of web service design out of the realm of folklore. What we’ve written is a suggested baseline. If you’ve tried to figure out REST in the past, we hope our architecture gives you confidence that what you’re doing is “really” REST. We also hope the ROA will help the community as a whole make faster progress in coming up with and codifying best practices. We want to make it easy for programmers to create distributed web applications that are elegant, that do the job they’re designed for, and that participate in the Web instead of merely living on top of it.

 

We’ve positioned the ROA as a simple alternative to the RPC-style architecture used by today’s SOAP+WSDL services. The RPC architecture exposes internal algorithms through a complex programming-language-like interface that’s different for every service. The ROA exposes internal data through a simple document-processing interface that’s always the same. In Chapter 10, we compare the two architectures and show how to argue for the ROA.

 



Reuniting the Webs

 

Our ultimate goal in this book is to reunite the programmable web with the human web. We envision a single interconnected network: a World Wide Web that runs on one set of servers, uses one set of protocols, and obeys one set of design principles. A network that you can use whether you’re serving data to human beings or computer programs.

 

Chapter

Summary

Chapter 1, The Programmable Web and Its Inhabitants

In this chapter we introduce web services in general: programs that go over the Web and ask a foreign server to provide data or run an algorithm. We demonstrate the three common web service architectures:

  1. RESTful
  2. RPC-style
  3. REST-RPC hybrid

We show sample HTTP requests and responses for each architecture, along with typical client code.

Chapter 2, Writing Web Service Clients

In this chapter we show you how to write clients for existing web services, using an HTTP library and an XML parser. We introduce a popular REST-RPC service (the web service for the social bookmarking site del.icio.us) and demonstrate clients written in Ruby, Python, Java, C#, and PHP. We also give technology recommendations for several other languages, without actually showing code. JavaScript and Ajax are covered separately in Chapter 11.

Chapter 3, What Makes RESTful Services Different?

We take the lessons of Chapter 2 and apply them to a purely RESTful service: Amazon’s Simple Storage Service (S3). While building an S3 client we illustrate some important principles of REST: resources, representations, and the uniform interface.

Chapter 4, The Resource-Oriented Architecture

A formal introduction to REST, not in its abstract form but in the context of a specific architecture for web services. Our architecture is based on four important REST concepts:

  • Resources
  • their names
  • their representations
  • the links between them

Its services should be judged by four RESTful properties:

  • Addressability
  • Statelessness
  • Connectedness
  • the uniform interface

Chapter 5, Designing Read-Only Resource-Oriented Services

We present a procedure for turning an idea or a set of requirements into a set of RESTful resources. These resources are read-only: clients can get data from your service but they can’t send any data of their own. We illustrate the procedure by designing a web service for serving navigable maps, inspired by the Google Maps web application.

Chapter 6, Designing Read/Write Resource-Oriented Services

We extend the procedure from the previous chapter so that clients can create, modify, and delete resources. We demonstrate by adding two new kinds of resource to the map service: user accounts and user-defined places.

Chapter 7, A Service Implementation

We remodel a REST-RPC hybrid service (the del.icio.us API we wrote clients for back in Chapter 2) as a purely RESTful service. Then we implement that service as a Ruby on Rails application. Fun for the whole family!

Chapter 8, REST and ROA Best Practices

In this chapter we collect our earlier suggestions for service design into one place, and add new suggestions. We show how standard features of HTTP can help you with common problems and optimizations. We also give resource-oriented designs for tough features like transactions, which you may have thought were impossible to do in RESTful web services.

Chapter 9, The Building Blocks of Services

Here we describe extra technologies that work on top of REST’s big three of HTTP, URI, and XML. Some of these technologies are file formats for conveying state, like XHTML and its microformats. Some are hypermedia formats for showing clients the levers of state, like WADL. Some are sets of rules for building RESTful web services, like the Atom Publishing Protocol.

Chapter 10, The Resource-Oriented Architecture Versus Big Web Services

We compare our architecture, and REST in general, to another leading brand. We think that RESTful web services are simpler, more scalable, easier to use, better attuned to the philosophy of the Web, and better able to handle a wide variety of clients than are services based on SOAP, WSDL, and the WS-* stack.

Chapter 11, Ajax Applications as REST Clients

Here we explain the Ajax architecture for web applications in terms of web services: an Ajax application is just a web service client that runs inside your web browser.  That makes this chapter an extension of Chapter 2. We show how to write clients for RESTful web services using XMLHttpRequest and the standard JavaScript library.

Chapter 12, Frameworks for RESTful Services

In the final chapter we cover three popular frameworks that make it easy to implement RESTful web services: Ruby on Rails, Restlet (for Java), and Django (for Python).

Appendix A, Some Resources for REST and Some RESTful Resources

The first part lists interesting standards, tutorials, and communities related to RESTful web services. The second part lists some existing, public RESTful web services that you can use and learn from.

Appendix B, The HTTP Response Code Top 42

Describes every standard HTTP response code (plus one extension), and explains when you’d use each one in a RESTful web service.

Appendix C, The HTTP Header Top Infinity

Does the same thing for HTTP headers. It covers every standard HTTP header, and a few extension headers that are useful for web services.

 

 

 

 



Chapter 1 - The Programmable Web and Its Inhabitants

 

The common term for the address of something on the Web is “URL.” I say “URI” throughout this book because that’s what the HTTP standard says. Every URI on the Web is also a URL, so you can substitute “URL” wherever I say “URI” with no loss of meaning.

 

Kinds of Things on the Programmable Web

 

The programmable web is based on HTTP and XML. Some parts of it serve HTML, JavaScript Object Notation (JSON), plain text, or binary documents, but most parts use XML. And it’s all based on HTTP: if you don’t use HTTP, you’re not on the web.

 

Beyond that small island of agreement there is little but controversy. The terminology isn’t set, and different people use common terms (like “REST,” the topic of this book) in ways that combine into a vague and confusing mess. What’s missing is a coherent way of classifying the programmable web. With that in place, the meanings of individual terms will become clear.

 

There are two analogous ways of classifying the services that inhabit the programmable web.

 

Web Classification                                  Examples
Technologies                                        URIs, SOAP, XML-RPC, and so on
Underlying architectures and design philosophies

 

 

When it comes to classifying the programmable web, most of today’s terminology sorts services by their superficial appearances: the technologies they use. I’m going to present a taxonomy based on architecture, which shows how technology choices follow from underlying design principles. I’m exposing divisions I’ll come back to throughout the book, but my main purpose is to zoom in on the parts of the programmable web that can reasonably be associated with the term “REST.”

 

HTTP: Documents in Envelopes

 

To classify the programmable web, I’d like to start off with an overview of HTTP, the protocol that all web services have in common.

 

HTTP is a document-based protocol, in which the client puts a document in an envelope and sends it to the server. The server returns the favor by putting a response document in an envelope and sending it to the client. HTTP has strict standards for what the envelopes should look like, but it doesn’t much care what goes inside.
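To make the envelope metaphor concrete, here is a minimal Python sketch that splits a raw HTTP request into its envelope (request line and headers) and the document inside (the entity-body). The sample request, the host name www.example.com, and the function name split_envelope are assumptions made for illustration only.

```python
def split_envelope(raw_request: str):
    """Split a raw HTTP/1.1 request into (request_line, headers, document)."""
    # The envelope and the document are separated by one blank line.
    head, _, document = raw_request.partition("\r\n\r\n")
    lines = head.split("\r\n")
    request_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, document


# A made-up request: the envelope follows strict rules, the document does not.
SAMPLE = (
    "PUT /notes/today HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 11\r\n"
    "\r\n"
    "hello world"
)

request_line, headers, document = split_envelope(SAMPLE)
print(request_line)             # PUT /notes/today HTTP/1.1
print(headers["content-type"])  # text/plain
print(document)                 # hello world
```

The server treats a response the same way: a status line and headers on the outside, and any document it likes on the inside.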

 

Method Information & Scoping Information

 

HTTP is the one thing all components of the programmable web have in common. Now I’ll show you how web services distinguish themselves from each other. There are two big questions that today’s web services answer differently. If you know how a web service answers these questions, you’ll have a good idea of how well it works with the Web.

 

Question

Description

Alternatives

How can the client convey its intentions to the server?

How does the server know a certain request is a request to retrieve some data, instead of a request to delete that same data or to overwrite it with different data?

Why should the server do this instead of doing that?

I call the information about what to do with the data the method information. One way to convey method information in a web service is to put it in the HTTP method. Since this is how RESTful web services do it, I’ll have a lot more to say about this later. For now, note that the five most common HTTP methods are GET, HEAD, PUT, DELETE, and POST. This is enough to distinguish between “retrieve some data” (GET), “delete that same data” (DELETE), and “overwrite it with different data” (PUT).

  1. Keeping the method information in the URI: consider this request to the web service for Flickr, Yahoo!’s online photo-sharing application:

    http://www.flickr.com/services/rest?method=flickr.photos.search&api_key=xxx&tags=penguin

    How does the server know what the client is trying to do? Well, the method name is pretty clearly flickr.photos.search. The Flickr API supports many methods, not just “get”-type methods such as flickr.photos.search and flickr.people.findByEmail, but also methods like flickr.photos.addTags, flickr.photos.comments.deleteComment, and so on. All of them are invoked with an HTTP GET request, regardless of whether or not they “get” any data. Flickr is sticking the method information in the method query variable, and expecting the client to ignore what the HTTP method says.

  2. By contrast, a typical SOAP service keeps its method information in the entity-body and in an HTTP header.
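The difference between the first two placements can be sketched in a few lines of Python. This is only an illustration, assuming a server that treats a method query variable (the Flickr convention) as overriding the HTTP method; the function name method_information is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def method_information(http_method: str, uri: str) -> str:
    """Return the operation a request really asks for.

    If the URI carries a 'method' query variable, that is taken as the
    method information; otherwise the HTTP method itself is.
    """
    query = parse_qs(urlsplit(uri).query)
    if "method" in query:
        return query["method"][0]
    return http_method.upper()


flickr = ("http://www.flickr.com/services/rest"
          "?method=flickr.photos.search&api_key=xxx&tags=penguin")
google = "http://www.google.com/search?q=REST"

print(method_information("GET", flickr))  # flickr.photos.search
print(method_information("GET", google))  # GET
```

Both requests use HTTP GET, but only for the second one does "GET" tell you what the client wants.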

How does the client tell the server which part of the data set to operate on?

Given that the server understands that the client wants to (say) delete some data, how can it know which data the client wants to delete?

Why should the server operate on this data instead of that data?

 

I call this information the scoping information. One obvious place to put it is in the URI path. That’s what most web sites do. Think once again about a search engine URI like http://www.google.com/search?q=REST. There, the method information is “GET,” and the scoping information is “/search?q=REST.” The client is trying to GET a list of search results about REST, as opposed to trying to GET something else: say, a list of search results about jellyfish (the scoping information for that would be “/search?q=jellyfish”), or the Google home page (that would be “/”).

  1. Many web services put scoping information in the path. Flickr’s is one: most of the query variables in a Flickr API URI are scoping information. tags=penguin scopes the flickr.photos.search method so it only searches for photos tagged with “penguin.” In a service where the method information defines a method in the programming language sense, the scoping information can be seen as a set of arguments to that method. You could reasonably expect to see flickr.photos.search(tags=penguin) as a line of code in some programming language.
  2. The alternative is to put the scoping information into the entity-body. A typical SOAP web service does it this way. Example 1-10 contains a q tag whose contents are the string “REST.” That’s the scoping information, nestled conveniently inside the doGoogleSearch tag that provides the method information.
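The idea that scoping information acts like a set of arguments can be sketched in Python. The helper below is hypothetical, and it assumes that api_key identifies the caller rather than scoping the search, so it is left out of the argument list.

```python
from urllib.parse import urlsplit, parse_qsl


def as_function_call(uri: str) -> str:
    """Render a Flickr-style URI as a pseudo function call."""
    pairs = parse_qsl(urlsplit(uri).query)
    # The 'method' query variable carries the method information.
    method = next(value for name, value in pairs if name == "method")
    # Every other variable scopes the call (api_key is skipped by assumption).
    args = [f"{name}={value}" for name, value in pairs
            if name not in ("method", "api_key")]
    return f"{method}({', '.join(args)})"


uri = ("http://www.flickr.com/services/rest"
       "?api_key=xxx&method=flickr.photos.search&tags=penguin")
print(as_function_call(uri))  # flickr.photos.search(tags=penguin)
```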

 

Note:  While I was writing this book, Google announced that it was deprecating its SOAP search service in favor of a RESTful, resource-oriented service (which, unfortunately, is encumbered by legal restrictions on use in a way the SOAP service isn’t). I haven’t changed the example because Google’s SOAP service still makes the best example I know of, and because I don’t expect you to actually run this program. I just want you to look at the code, and the SOAP and WSDL documents the code relies on.

 

The Competing Architectures

 

I can group web services by their answers to the questions. In my studies I’ve identified three common web service architectures: RESTful resource-oriented, RPC-style, and REST-RPC hybrid.

 

Architecture

Description

RESTful, Resource-Oriented Architectures

In RESTful architectures, the method information goes into the HTTP method. In Resource-Oriented Architectures, the scoping information goes into the URI. The combination is powerful.

 

Given the first line of an HTTP request to a resource-oriented RESTful web service (“GET /reports/open-bugs HTTP/1.1”), you should understand basically what the client wants to do. The rest of the request is just details; indeed, you can make many requests using only one line of HTTP.

 

If the HTTP method doesn’t match the method information, the service isn’t RESTful. If the scoping information isn’t in the URI, the service isn’t resource-oriented. These aren’t the only requirements, but they’re good rules of thumb.
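As a small illustration of how much the first line of a request reveals, the following Python sketch reads the method information and the scoping information straight off the request line. The class and function names are made up for this example.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method_information: str   # the HTTP method
    scoping_information: str  # the URI (path plus any query string)
    http_version: str


def parse_request_line(line: str) -> RequestLine:
    """Split 'METHOD URI VERSION' into its three parts."""
    method, uri, version = line.split(" ")
    return RequestLine(method, uri, version)


parsed = parse_request_line("GET /reports/open-bugs HTTP/1.1")
print(parsed.method_information)   # GET
print(parsed.scoping_information)  # /reports/open-bugs
```

For a resource-oriented RESTful service nothing else needs to be consulted to know what the client is asking for; for the other architectures this line alone is not enough.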

 

A few well-known examples of RESTful, resource-oriented web services include:

• Services that expose the Atom Publishing Protocol (http://www.ietf.org/html.charters/atompub-charter.html) and its variants such as GData (http://code.google.com/apis/gdata/)

• Amazon’s Simple Storage Service (S3) (http://aws.amazon.com/s3)

• Most of Yahoo!’s web services (http://developer.yahoo.com/)

• Most other read-only web services that don’t use SOAP

• Static web sites

• Many web applications, especially read-only ones like search engines

 

RPC-Style Architectures

An RPC-style web service accepts an envelope full of data from its client, and sends a similar envelope back. The method and the scoping information are kept inside the envelope, or on stickers applied to the envelope. What kind of envelope is not important to my classification, but HTTP is a popular envelope format, since any web service worthy of the name must use HTTP anyway. SOAP is another popular envelope format (transmitting a SOAP document over HTTP puts the SOAP envelope inside an HTTP envelope). Every RPC-style service defines a brand new vocabulary. Computer programs work this way as well: every time you write a program, you define functions with different names. By contrast, all RESTful web services share a standard vocabulary of HTTP methods. Every object in a RESTful service responds to the same basic interface.

The XML-RPC protocol for web services is the most obvious example of the RPC architecture. XML-RPC is mostly a legacy protocol these days, but I’m going to start off with it because it’s relatively simple and easy to explain. Example 1-11 shows a Ruby client for an XML-RPC service that lets you look up anything with a Universal Product Code.

 

The XML document changes depending on which method you’re calling, but the HTTP envelope is always the same. No matter what you do with the UPC database service, the URI is always http://www.upcdatabase.com/rpc and the HTTP method is always POST. Simply put, an XML-RPC service ignores most features of HTTP. It exposes only one URI (the “endpoint”), and supports only one method on that URI (POST). Where a RESTful service would expose different URIs for different values of the scoping information, an RPC-style service typically exposes a URI for each “document processor”: something that can open the envelopes and transform them into software commands.
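A short Python sketch shows the same thing from the client side, using the standard library’s xmlrpc.client module to build (but not send) two calls. The method names lookupUPC and help and the sample UPC are illustrative assumptions, not a documented description of the service.

```python
import xmlrpc.client

ENDPOINT = "http://www.upcdatabase.com/rpc"  # the single URI named in the text


def build_call(method_name: str, *args) -> dict:
    """Describe the HTTP request an XML-RPC client would send (nothing is sent)."""
    body = xmlrpc.client.dumps(args, methodname=method_name)
    return {
        "http_method": "POST",  # always POST
        "uri": ENDPOINT,        # always the endpoint
        "headers": {"Content-Type": "text/xml"},
        "body": body,           # method and scoping information live in here
    }


# Hypothetical method names and a made-up UPC, for illustration only.
lookup = build_call("lookupUPC", "001441000055")
other = build_call("help")

print(lookup["http_method"], lookup["uri"])
print(other["http_method"], other["uri"])
print(lookup["body"])
```

The two requests differ only in their XML documents; the HTTP method and URI are identical.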

 

A few well-known examples of RPC-style web services:

• All services that use XML-RPC

• Just about every SOAP service (see the “Technologies of the Programmable Web” section later in this chapter for a defense of this controversial statement)

• A few web applications (generally poorly designed ones)

 

REST-RPC Hybrid Architectures

This is a term I made up for describing web services that fit somewhere in between the RESTful web services and the purely RPC-style services. These services are often created by programmers who know a lot about real-world web applications, but not much about the theory of REST.
 

Take another look at this URI used by the Flickr web service: http://www.flickr.com/services/rest?api_key=xxx&method=flickr.photos.search&tags=penguin. Despite the “rest” in the URI, this was clearly designed as an RPC-style service, one that uses HTTP as its envelope format. It’s got the scoping information (“photos tagged ‘penguin’”) in the URI, just like a RESTful resource-oriented service. But the method information (“search for photos”) also goes in the URI. In a RESTful service, the method information would go into the HTTP method (GET), and whatever was left over would become scoping information. As it is, this service is simply using HTTP as an envelope format, sticking the method and scoping information wherever it pleases. This is an RPC-style service. Case closed.

 

Except that the requests themselves can look RESTful. This optical illusion happens when an RPC-style service uses plain old HTTP as its envelope format, and when both the method and the scoping information happen to live in the URI portion of the HTTP request. If the HTTP method is GET, and the point of the web service request is to “get” information, it’s hard to tell whether the method information is in the HTTP method or in the URI. Look at the HTTP requests that go across the wire and you see the requests you’d see for a RESTful web service. They may contain elements like “method=flickr.photos.search” but that could be interpreted as scoping information, the way “photos/” and “search/” are scoping information. These RPC-style services have elements of RESTful web services, more or less by accident. They’re only using HTTP as a convenient envelope format, but they’re using it in a way that overlaps with what a RESTful service might do.

 

 

A few well-known examples of REST-RPC hybrid services include:

• The del.icio.us API

• The “REST” Flickr web API

• Many other allegedly RESTful web services

• Most web applications

 

 

 

Technologies of the Programmable Web

 

Technology

Description

HTTP

All web services use HTTP, but they use it in different ways.

 

Architecture

HTTP Strategy

RESTful web services

Put the method information in the HTTP method and the scoping information in the URI.

RPC-style web services

Tend to ignore the HTTP method, looking for method and scoping information in the URI, HTTP headers, or entity-body. Some use HTTP as an envelope containing a document, and others only use it as an unlabelled envelope containing another envelope.

URI

Again, all web services use URIs, but in different ways. What I’m about to say is a generalization, but a fairly accurate one.

 

Architecture

URI strategy

RESTful resource-oriented service

One URI for every piece of data (resource) the client might want to operate on.

 

Note: if a website controls 1 million resources, of whatever kind, it exposes 1 million different URIs.

REST-RPC hybrid

One URI for every operation the client might perform: one URI to fetch a piece of data, a different URI to delete that same data

 

Note: with 1 million resources and, say, four operations on each, there will be 4 million different URIs.

RPC-style service

Exposes one URI for every process capable of handling Remote Procedure Calls (RPC). There’s usually only one such URI: the service “endpoint.”

Note: the resource ID and the method are both enclosed in the envelope, so the URI reveals nothing about what a request does.
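The three URI strategies can be compared with a small counting exercise in Python. Everything here is hypothetical: the base URI, the three sample resources, the four operation names, and the URI layouts chosen for each style.

```python
RESOURCES = ["bugs/1", "bugs/2", "bugs/3"]            # hypothetical resources
OPERATIONS = ["create", "fetch", "update", "delete"]  # hypothetical operations
BASE = "http://www.example.com"


def restful_uris(resources):
    # One URI per resource; the operation travels in the HTTP method.
    return {f"{BASE}/{r}" for r in resources}


def hybrid_uris(resources, operations):
    # One URI per (operation, resource) pair; the operation is in the URI.
    return {f"{BASE}/{op}?item={r}" for r in resources for op in operations}


def rpc_uris(resources, operations):
    # One endpoint; operation and resource ID are both inside the envelope.
    return {f"{BASE}/rpc"}


print(len(restful_uris(RESOURCES)))             # 3
print(len(hybrid_uris(RESOURCES, OPERATIONS)))  # 12
print(len(rpc_uris(RESOURCES, OPERATIONS)))     # 1
```

Scaled up, the same arithmetic gives one URI per resource for the RESTful design, operations times resources for the hybrid, and a single endpoint for the RPC style.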

XML-RPC

A few, mostly legacy, web services use XML-RPC on top of HTTP. XML-RPC is a data structure format for representing function calls and their return values. As the name implies, it’s explicitly designed to use an RPC style.

SOAP

Lots of web services use SOAP on top of HTTP. SOAP is an envelope format, like HTTP, but it’s an XML-based envelope format.

 

Now I’m going to say something controversial. To a first approximation, every current web service that uses SOAP also has an RPC architecture. This is controversial because many SOAP programmers think the RPC architecture is déclassé and prefer to call their services “message-oriented” or “document-oriented” services. Well, all web services are message-oriented, because HTTP itself is message-oriented.

 

An HTTP request is just a message: an envelope with a document inside. The question is what that document says. SOAP-based services ask the client to stick a second envelope (a SOAP document) inside the HTTP envelope. Again, the real question is what it says inside the envelope. A SOAP envelope can contain any XML data, just as an HTTP envelope can contain any data in its entity-body. But in every existing SOAP service, the SOAP envelope contains a description of an RPC call in a format similar to that of XML-RPC.

 

There are various ways of shuffling this RPC description around and giving it different labels—“document/literal” or “wrapped/literal”—but any way you slice it, you have a service with a large vocabulary of method information, a service that looks for scoping information inside the document rather than on the envelope. These are defining features of the RPC architecture.

 

I emphasize that this is not a fact about SOAP, just a fact about how it’s currently used. SOAP, like HTTP, is just a way of putting data in an envelope. Right now, though, the only data that ever gets put in that envelope is XML-RPC-esque data about how to call a remote function, or what’s the return value from such a function. I argue this point in more detail in Chapter 10.
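The envelope-inside-an-envelope arrangement can be sketched with Python’s standard XML library. This is an illustration under assumptions: the service namespace urn:GoogleSearch, the endpoint URI, and the helper names are chosen for the example, and the call carries only the q argument mentioned earlier.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:GoogleSearch"  # assumed service namespace, for illustration


def build_soap_call(method: str, **arguments) -> bytes:
    """Wrap an RPC-style call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")  # method information
    for name, value in arguments.items():
        ET.SubElement(call, name).text = str(value)          # scoping information
    return ET.tostring(envelope, encoding="utf-8")


def read_soap_call(document: bytes):
    """Recover (method, arguments) from the envelope, as a server would."""
    body = ET.fromstring(document).find(f"{{{SOAP_NS}}}Body")
    call = body[0]
    method = call.tag.split("}")[-1]
    return method, {child.tag: child.text for child in call}


soap_document = build_soap_call("doGoogleSearch", q="REST")

# The outer HTTP envelope never changes: POST to one (hypothetical) endpoint.
http_request = {"method": "POST",
                "uri": "http://api.example.com/search",
                "body": soap_document}

print(read_soap_call(http_request["body"]))  # ('doGoogleSearch', {'q': 'REST'})
```

Nothing on the HTTP envelope says what the request does; the server has to open the SOAP document to find both the method and the scoping information.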

WS-*

These standards define special XML “stickers” for the SOAP envelope. The stickers are analogous to HTTP headers.

WSDL

The Web Service Description Language (WSDL) is an XML vocabulary used to describe SOAP-based web services. A client can load a WSDL file and know exactly which RPC-style methods it can call, what arguments those methods expect, and which data types they return. Nearly every SOAP service in existence exposes a WSDL file, and most SOAP services would be very difficult to use without their WSDL files serving as guides. As I discuss in Chapter 10, WSDL bears more responsibility than any other technology for maintaining SOAP’s association with the RPC style.

WADL

The Web Application Description Language (WADL) is an XML vocabulary used to describe RESTful web services. As with WSDL, a generic client can load a WADL file and be immediately equipped to access the full functionality of the corresponding web service. I discuss WADL in Chapter 9. Since RESTful services have simpler interfaces, WADL is not nearly as necessary to these services as WSDL is to RPC-style SOAP services. This is a good thing, since as of the time of writing there are few real web services providing official WADL files. Yahoo!’s web search service is one that does.

 


Chapter 2 - Writing Web Service Clients

 

Web Services Are Web Sites

 

In this chapter I show how to write clients for RESTful and hybrid architecture services, in a variety of programming languages.

 

There is no magic dust that makes an HTTP request a web service request. You can make requests to a RESTful or hybrid web service using nothing but your programming language’s HTTP client library. You can process the results with a standard XML parser. Every web service request involves the same three steps:

  1. Come up with the data that will go into the HTTP request: the HTTP method, the URI, any HTTP headers, and (for requests using the PUT or POST method) any document that needs to go in the request’s entity-body.
  2. Format the data as an HTTP request, and send it to the appropriate HTTP server.
  3. Parse the response data—the response code, any headers, and any entity-body—into the data structures the rest of your program needs.
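The three steps above can be sketched with nothing but the Python 3 standard library. The credentials, the User-Agent string, and the sample response document are placeholders I made up; the element and attribute names in the sample are an assumption about what the service returns, and the sending step is defined but not called so the sketch runs without a network connection.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET


def build_request(uri, username, password, method="GET", body=None):
    """Step 1: gather the method, URI, headers, and optional entity-body."""
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {credentials}",
               "User-Agent": "example-client/0.1"}
    return urllib.request.Request(uri, data=body, headers=headers, method=method)


def send(request):
    """Step 2: format the request, send it, and return (code, headers, body)."""
    with urllib.request.urlopen(request) as response:
        return response.status, dict(response.headers), response.read()


def parse_posts(entity_body):
    """Step 3: turn the XML document into plain Python data."""
    root = ET.fromstring(entity_body)
    return [(post.get("href"), post.get("description"))
            for post in root.iter("post")]


request = build_request("https://api.del.icio.us/v1/posts/recent",
                        "username", "password")

# A stand-in response body, so the parsing step can run offline.
SAMPLE_BODY = b"""<posts user="username" tag="">
  <post href="http://www.example.com/" description="An example bookmark"/>
</posts>"""

print(request.get_method(), request.full_url)
print(parse_posts(SAMPLE_BODY))
```

Calling send(request) would perform the second step for real; note that urlopen raises an exception for error response codes, so a complete client would also catch urllib.error.HTTPError.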

 

Wrappers, WADL, and ActiveResource

 

We need a language with a vocabulary that can describe the variety of RESTful and hybrid services.

 

A document written in this language could script a generic web service client, making it act like a custom-written wrapper. The SOAP RPC community has united around WSDL as its service description language. The REST community has yet to unite around a description language, so in this book I do my bit to promote WADL as a resource-oriented alternative to WSDL. I think it’s the simplest and most elegant solution that solves the whole problem. I show a simple WADL client in this chapter and cover WADL in detail in the “WADL” section of Chapter 9. There’s also a generic client called ActiveResource, still in development. ActiveResource makes it easy to write clients for many kinds of web services written with the Ruby on Rails framework. I cover ActiveResource at the end of Chapter 3.

 

del.icio.us: The Sample Application

 

The del.icio.us web service gives you programmatic access to your bookmarks. You can write programs that bookmark URIs, convert your browser bookmarks to del.icio.us bookmarks, or fetch the URIs you’ve bookmarked in the past. The best way to visualize the del.icio.us web service is to use the human-oriented web site for a while. There’s no fundamental difference between the del.icio.us web site and the del.icio.us web service, but there are variations.

 

Item                        Web site                              Web service
Root                        http://del.icio.us/                   https://api.del.icio.us/v1/
Client communication        HTTP                                  HTTPS
URI structure               http://del.icio.us/{your-username}    https://api.del.icio.us/v1/posts/recent
Returned information type   HTML                                  XML
Authentication              Not required                          Necessary for every request
Social features             Yes                                   No

 

The web service is a stripped-down web site that uses HTTPS and serves funny-looking documents. (You can flip this around and look at the web site as a more functional web service, though the del.icio.us administrators discourage this viewpoint.) This is a theme I’m coming back to again and again: web services should work under the same rules as web sites.

 

The del.icio.us web service does not have a very RESTful design. The programmers have laid out the service URIs in a way that suggests an RPC-style rather than a resource-oriented design. All requests to the del.icio.us web service use the HTTP GET method: the real method information goes into the URI and might conflict with “GET”.
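To see how the method information in a URI can conflict with “GET,” consider this rough Python sketch. The table mapping operation names to HTTP methods is my own guess at what a RESTful design would use, and the posts/delete URI is an illustrative example rather than a documented one.

```python
from urllib.parse import urlsplit

# Illustrative guess: the HTTP method a RESTful design would use for each
# operation name found at the end of the URI path.
RESTFUL_EQUIVALENT = {"recent": "GET", "get": "GET",
                      "add": "POST", "delete": "DELETE"}


def check_request(http_method: str, uri: str) -> str:
    """Compare the HTTP method with the operation named in the URI path."""
    operation = urlsplit(uri).path.rstrip("/").split("/")[-1]
    expected = RESTFUL_EQUIVALENT.get(operation)
    if expected is None:
        return "unknown operation"
    if expected == http_method.upper():
        return "consistent"
    return f"conflict: looks like {expected}"


print(check_request("GET", "https://api.del.icio.us/v1/posts/recent"))
# consistent
print(check_request(
    "GET", "https://api.del.icio.us/v1/posts/delete?url=http://www.example.com/"))
# conflict: looks like DELETE
```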

 

Making the Request: HTTP Libraries

 

Every modern programming language has one or more libraries for making HTTP requests. Not all of these libraries are equally useful, though. To build a fully general web service client you need an HTTP library with these features:

 

Feature  (*Mandatory)

Description

*HTTPS and SSL certificate validation

 It must support HTTPS and SSL certificate validation. Web services, like web sites, use HTTPS to secure communication with their clients. Many web services (del.icio.us is one example) won’t accept plain HTTP requests at all. A library’s HTTPS support often depends on the presence of an external SSL library written in C.

*Support for basic HTTP methods

It must support at least the five main HTTP methods: GET, HEAD, POST, PUT, and DELETE. Some libraries support only GET and POST. Others are designed for simplicity and support only GET.

 

You can get pretty far with a client that only supports GET and POST: HTML forms support only those two methods, so the entire human web is open to you.

 

You can even do all right with just GET, because many web services (among them del.icio.us and Flickr) use GET even where they shouldn’t. But if you’re choosing a library for all your web service clients, or writing a general client like a WADL client, you need a library that supports all five methods. Additional methods like OPTIONS and TRACE, and WebDAV extensions like MOVE, are a bonus.

*Request Editing

It must allow the programmer to customize the data sent as the entity-body of a PUT or POST request.

*Header Editing

It must allow the programmer to customize a request’s HTTP headers.

*Response Access

It must give the programmer access to the response code and headers of an HTTP response; not just access to the entity-body.

*HTTP Proxy Support

It must be able to communicate through an HTTP proxy. The average programmer may not think about this, but many HTTP clients in corporate environments can only work through a proxy. Intermediaries like HTTP proxies are also a standard part of the REST meta-architecture, though not one I’ll be covering in much detail.

Data Compression

An HTTP library should automatically request data in compressed form to save bandwidth, and transparently decompress the data it receives. The HTTP request header here is Accept-Encoding, and the response header is Content-Encoding. I discuss these in more detail in Chapter 8.

Caching

It should automatically cache the responses to your requests. The second time you request a URI, it should return an item from the cache if the object on the server hasn’t changed. The HTTP headers here are If-None-Match and If-Modified-Since for the request, and ETag and Last-Modified for the response. These, too, I discuss in Chapter 8.
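The validation step behind this kind of caching can be sketched without touching the network. This is a minimal illustration, not how any particular library does it: the helper names `conditional_headers` and `resolve_response` and the shape of the `cached` dict are assumptions made for the example.

```python
def conditional_headers(cached):
    """Build validation headers for a GET from a cached response's metadata.

    `cached` is a hypothetical dict with optional 'etag' and 'last_modified' keys.
    """
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def resolve_response(status, body, cached):
    """Return the body to use: the cached copy on 304, else the fresh one."""
    if status == 304:  # "Not Modified": the cached representation is still good
        return cached["body"]
    return body


cached = {
    "etag": '"abc123"',
    "last_modified": "Tue, 24 Oct 2006 10:00:00 GMT",
    "body": "<jellyfish/>",
}
print(conditional_headers(cached))
print(resolve_response(304, "", cached))
```

A library with transparent caching would presumably do the equivalent of both steps on the programmer's behalf.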

Support for basic HTTP authentication

It should transparently support the most common forms of HTTP authentication: Basic, Digest, and WSSE. It’s useful to support custom, company-specific authentication methods such as Amazon’s, or to have plug-ins that support them. 

The request header is Authorization and the response header (the one that demands authentication) is WWW-Authenticate. I cover the standard HTTP authentication methods, plus WSSE, in Chapter 8. I cover Amazon’s custom authentication method in Chapter 3.

HTTP Redirects

It should be able to transparently follow HTTP redirects, while avoiding infinite redirects and redirect loops. This should be an optional convenience for the user, rather than something that happens on every single redirect. A web service may reasonably send a status code of 303 (“See Other”) without implying that the client should go fetch that other URI right now!

Cookie Support

It should be able to parse and create HTTP cookie strings, rather than forcing the programmer to manually set the Cookie header. This is not very important for RESTful services, which shun cookies, but it’s very important if you want to use the human web.
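Three of the required features above (choice of HTTP method, a custom entity-body, and custom headers) can be sketched with `urllib.request`, the Python 3 successor to urllib2. The URI and payload are made up for illustration, and nothing is sent over the network.

```python
import urllib.request

# Hypothetical URI and payload, for illustration only.
uri = "http://www.example.com/weblog/2006/10/24/0"
body = "<entry><title>Jellyfish</title></entry>".encode("utf-8")

request = urllib.request.Request(
    uri,
    data=body,                                    # custom entity-body
    headers={"Content-Type": "application/xml"},  # custom request header
    method="PUT",                                 # any HTTP method, not just GET/POST
)

print(request.get_method())
print(request.get_header("Content-type"))  # urllib stores header names capitalized this way
print(request.data)
```

Passing this object to `urllib.request.urlopen` would send the request and return a response object that exposes the status code and headers alongside the entity-body.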

 

 

Python: httplib2

 

The Python standard library comes with two HTTP clients: urllib2, which has a file-like interface like Ruby’s open-uri; and httplib, which works more like Ruby’s Net::HTTP. Both offer transparent support for HTTPS, assuming your copy of Python was compiled with SSL support. There’s also an excellent third-party library, Joe Gregorio’s httplib2 (http://bitworking.org/projects/httplib2/), which is the one I recommend in general. httplib2 supports nearly every feature on my wish list, most notably transparent caching.

 

 

Feature | urllib2 | httplib | httplib2
HTTPS | Yes (assuming Python was compiled with SSL support) | Yes | Yes
HTTP verbs | GET, POST | All | All
Custom data | Yes | Yes | Yes
Custom headers | Yes | Yes | Yes
Proxies | Yes | No | No
Compression | No | No | Yes
Caching | No | No | Yes
Auth methods | Basic, Digest | None | Basic, Digest, WSSE, Google
Cookies | Yes (use urllib2.build_opener(HTTPCookieProcessor)) | No | No
Redirects | Yes | No | Yes

 

Processing the Response: XML Parsers

 

There are three kinds of XML parsers, and the differences run deeper than features or interface style. They follow two basic parsing strategies: the document-based strategy of DOM and other tree-style parsers, and the event-based strategy of SAX and “pull” parsers. You can get a tree-style or a SAX parser for any programming language, and a pull parser for almost any language.

 

Parser

Description

DOM "Tree-style"

The document-based, tree-style strategy is the simplest of the three models. A tree-style parser models an XML document as a nested data structure. Once you’ve got this data structure, you can search and process it with XPath queries, CSS selectors, or custom navigation functions: whatever your parser supports. A DOM parser is a tree-style parser that implements a specific interface defined by the W3C.

 

The tree-style strategy is easy to use, and it’s the one I use the most. With a tree-style parser, the document is just an object like the other objects in your program. The big shortcoming is that you have to deal with the document as a whole. You can’t start working on the document until you’ve processed the whole thing into a tree, and you can’t avoid loading the whole document into memory. For documents that are simple but very large, this is inefficient. It would be a lot better to handle tags as they’re parsed.

SAX "pull-parser"

Instead of a data structure, a SAX-style or pull parser turns a document into a stream of events. Starting and closing tags, XML comments, and entity declarations are all events.

 

A pull parser is useful when you need to handle almost every event. A pull parser lets you handle one event at a time, “pulling” the next one from the stream as needed. You can take action in response to individual events as they come in, or build up a data structure for later use, presumably a smaller data structure than the one a tree-style parser would build. You can stop parsing the document at any time and come back to it later by pulling the next event from the stream.

 

A SAX parser is more complex, but useful when you only care about a few of the many events that will be streaming in. You drive a SAX parser by registering callback methods with it. Once you’re done defining callbacks, you set the parser loose on a document.

The parser turns the document into a series of events, and processes every event in the document without stopping. When an event comes along that matches one of your callbacks, the parser triggers that callback, and your custom code runs. Once the callback completes, the SAX parser goes back to processing events without stopping.

 

Python: ElementTree

 

The world is full of XML parsers for Python. There are seven different XML interfaces in the Python 2.5 standard library alone. For full details, see the Python library reference (http://docs.python.org/lib/markup.html).

 

Python Parser Type

Description

Tree-Style

For tree-style parsing, the best library is ElementTree (http://effbot.org/zone/elementindex.htm). It’s fast, it has a sensible interface, and as of Python 2.5 you don’t have to install anything because it’s in the standard library. On the downside, its support for XPath is limited to simple expressions—of course, nothing else in the standard library supports XPath at all. If you need full XPath support, try 4Suite (http://4suite.org/).

 

Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/) is a slower tree-style parser that is very forgiving of invalid XML, and offers a programmatic interface to a document. It also handles most character set conversions automatically, letting you work with Unicode data.

SAX

For SAX-style parsing, the best choice is the xml.sax module in the standard library.

The PyXML (http://pyxml.sourceforge.net/) suite includes a pull parser.
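The difference between the tree-style and SAX strategies is easier to see side by side. The sketch below uses two standard-library modules, `xml.etree.ElementTree` and `xml.sax`, to pull the same data out of a small document; the document and the `PostHandler` class are invented for the example.

```python
import xml.sax
import xml.etree.ElementTree as ET

# A hypothetical bookmark list, for illustration only.
DOCUMENT = """<posts user="alice">
  <post href="http://www.example.com/wiki/Jellyfish" description="Jellyfish"/>
  <post href="http://www.example.com/wiki/Mice" description="Mice"/>
</posts>"""

# Tree-style: parse the whole document, then navigate it like any other object.
root = ET.fromstring(DOCUMENT)
tree_hrefs = [post.get("href") for post in root.findall("post")]


# SAX: register a callback and let the parser push events at it.
class PostHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def startElement(self, name, attrs):
        if name == "post":  # ignore every event except opening <post> tags
            self.hrefs.append(attrs.getValue("href"))


handler = PostHandler()
xml.sax.parseString(DOCUMENT.encode("utf-8"), handler)

print(tree_hrefs)
print(handler.hrefs)
```

Both approaches yield the same list of URIs. The tree-style version holds the whole document in memory, while the SAX version only ever keeps the values its callback chose to save.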

 

JSON Parsers: Handling Serialized Data

 

Most web services return XML documents, but a growing number return simple data structures (numbers, arrays, hashes, and so on), serialized as JSON-formatted strings. JSON is a simple, language-independent way of formatting such data structures as strings. It is usually produced by services that expect to be consumed by the client half of an Ajax application.

 

JSON

XML

[3, "three"]

<value>

  <array>

    <data>

      <value><i4>3</i4></value>

      <value><string>three</string></value>

    </data>

  </array>

</value>
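In Python, the JSON side of this comparison needs very little code. A minimal sketch using the standard-library `json` module (part of the standard library from Python 2.6 onward), where the dictionary keys are made up for illustration:

```python
import json

# Parse the JSON document from the comparison above into native Python data.
data = json.loads('[3, "three"]')
print(data)  # a list holding an int and a str
print(type(data[0]).__name__, type(data[1]).__name__)

# Serialize a Python data structure back into a JSON string.
print(json.dumps({"count": data[0], "name": data[1]}, sort_keys=True))
```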

 

 

Chapter 3 - What Makes RESTful Services Different?

 

 

 

 

Chapter 4 - The Resource-Oriented Architecture

 

REST is not an architecture: it’s a set of design criteria. You can say that one architecture meets those criteria better than another, but there is no one “REST architecture.”

 

The traditional definition of REST leaves a lot of open space, which practitioners have seeded with folklore. I deliberately go further than Roy Fielding in his dissertation, or the W3C in their standards: I want to clear some of that open space so that the folklore has room to grow into a well-defined set of best practices. Even if REST were an architecture, it wouldn’t be fair to call my architecture by the same name. I’d be tying my empirical observations and suggestions to the more general thoughts of those who built the Web.

 

What’s a Resource?

 

A resource is anything that’s important enough to be referenced as a thing in itself. If your users might “want to create a hypertext link to it, make or refute assertions about it, retrieve or cache a representation of it, include all or part of it by reference into another representation, annotate it, or perform other operations on it”, then you should make it a resource.

 

Usually, a resource is something that can be stored on a computer and represented as a stream of bits: a document, a row in a database, or the result of running an algorithm. A resource may be a physical object like an apple, or an abstract concept like courage, but (as we’ll see later) the representations of such resources are bound to be disappointing.

 

URIs

 

What makes a resource a resource? It has to have at least one URI. The URI is the name and address of a resource. If a piece of information doesn’t have a URI, it’s not a resource and it’s not really on the Web, except as a bit of data describing some other resource.

 

URIs Should Be Descriptive

 

Here’s the first point where the ROA builds upon the sparse recommendations of the REST thesis and the W3C recommendations. I propose that a resource and its URI ought to have an intuitive correspondence. Here are some good URIs for resources such as a software release, a weblog entry, a road map, and a wiki page:

http://www.example.com/software/releases/1.0.3.tar.gz

http://www.example.com/software/releases/latest.tar.gz

http://www.example.com/weblog/2006/10/24/0

http://www.example.com/map/roads/USA/AR/Little_Rock

http://www.example.com/wiki/Jellyfish

 

URIs should have a structure. They should vary in predictable ways: you should not go to /search/Jellyfish for jellyfish and /i-want-to-know-about/Mice for mice. If a client knows the structure of the service’s URIs, it can create its own entry points into the service. This makes it easy for clients to use your service in ways you didn’t think of.
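When URIs vary predictably, a client can construct them instead of discovering each one. Here is a small sketch based on the example URIs above; the helper names and the rule of replacing spaces with underscores are assumptions for illustration, not part of any real service.

```python
from urllib.parse import quote

BASE = "http://www.example.com"  # hypothetical service root


def wiki_uri(topic):
    """Build the URI for a wiki topic, following the /wiki/{topic} pattern."""
    return f"{BASE}/wiki/{quote(topic.replace(' ', '_'))}"


def map_uri(country, state, city):
    """Build the URI for a road map, following /map/roads/{country}/{state}/{city}."""
    parts = [quote(p.replace(" ", "_")) for p in (country, state, city)]
    return f"{BASE}/map/roads/" + "/".join(parts)


print(wiki_uri("Mice"))
print(map_uri("USA", "AR", "Little Rock"))
```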

 

 

The Relationship Between URIs and Resources

 

Let’s consider some edge cases. Can two resources be the same? Can two URIs designate the same resource? Can a single URI designate two resources?

 

By definition, no two resources can be the same. If they were the same, you’d only have one resource. However, at some moment in time two different resources may point to the same data. If the current software release is 1.0.3, then http://www.example.com/software/releases/1.0.3.tar.gz and http://www.example.com/software/releases/latest.tar.gz will refer to the same file for a while. But the ideas behind those two URIs are different: one of them always points to a particular version, and the other points to whatever version is newest at the time the client accesses it. That’s two concepts and two resources. You wouldn’t link to latest when reporting a bug in version 1.0.3.

 

A resource may have one URI or many. The sales numbers available at http://www.example.com/sales/2004/Q4 might also be available at http://www.example.com/sales/Q42004. If a resource has multiple URIs, it’s easier for clients to refer to the resource. The downside is that each additional URI dilutes the value of all the others. Some clients use one URI, some use another, and there’s no automatic way to verify that all the URIs refer to the same resource.

 

How do you get around the drawbacks of multiple URIs for the same resource? One way is to expose multiple URIs, but have one of them be the “canonical” URI for that resource. When a client requests the canonical URI, the server sends the appropriate data along with a response code of 200 (“OK”). When a client requests one of the other URIs, the server sends a response code of 303 (“See Other”) along with the canonical URI. The client can’t see whether two URIs point to the same resource, but it can make two HEAD requests and see if one URI redirects to the other or if they both redirect to a third URI.

 

Another way is to serve all the URIs as though they were the same, but give the “canonical” URI in the Content-Location response header whenever someone requests a non-canonical URI.
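From the client’s side, both strategies come down to reading one response. The sketch below is a hedged illustration: `canonical_uri` is a made-up helper, the status and headers are passed in directly rather than fetched, and header names are assumed to arrive with exactly this capitalization.

```python
def canonical_uri(requested_uri, status, headers):
    """Work out a resource's canonical URI from the response to one request.

    `status` is the HTTP status code and `headers` a dict of response headers.
    """
    if status == 303 and "Location" in headers:
        # Redirect strategy: the server points us at the canonical URI.
        return headers["Location"]
    if status == 200:
        # Content-Location strategy: data is served, canonical URI is in a header.
        return headers.get("Content-Location", requested_uri)
    return None  # some other response; nothing can be concluded


canonical = "http://www.example.com/sales/2004/Q4"
alias = "http://www.example.com/sales/Q42004"

via_redirect = canonical_uri(alias, 303, {"Location": canonical})
via_header = canonical_uri(alias, 200, {"Content-Location": canonical})
direct = canonical_uri(canonical, 200, {})
print(via_redirect == via_header == direct)
```

If two URIs resolve to the same canonical URI this way, a client can treat them as names for the same resource.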

 

Addressability

 

Now that I’ve introduced resources and their URIs, I can go in depth into two of the features of the ROA: addressability and statelessness.

 

An application is addressable if it exposes the interesting aspects of its data set as resources. Since resources are exposed through URIs, an addressable application exposes a URI for every piece of information it might conceivably serve. This is usually an infinite number of URIs.

 

Addressability is what lets you share a page by sending someone its URI. If HTTP weren’t addressable, you’d have to download the whole page and send the HTML file as an attachment.

 

To save bandwidth, you can set up an HTTP proxy cache on your local network. The first time someone requests http://www.google.com/search?q=jellyfish, the cache will save a local copy of the document. The next time someone hits that URI, the cache might serve the saved copy instead of downloading it again. These things are possible only if every page has a unique identifying string: an address.

 

The filesystem on your home computer is another addressable system. Command-line applications can take a path to a file and do strange things to it. The cells in a spreadsheet are also addressable; you can plug the name of a cell into a formula, and the formula will use whatever value is currently in that cell. URIs are the file paths and cell addresses of the Web.

 

Addressability is one of the best things about web applications. It makes it easy for clients to use web sites in ways the original designers never imagined. Following this one rule gives you and your users many of the benefits of REST. This is why REST-RPC services are so common: they combine addressability with the procedure-call programming model. It’s why I gave resources top billing in the name of the Resource-Oriented Architecture: because resources are the kind of thing that’s addressable.

 

This seems natural, the way the Web should work. Unfortunately, many web applications don’t work this way. This is especially true of Ajax applications. As I show in Chapter 11, most Ajax applications are just clients for RESTful or hybrid web services. But when you use these clients as though they are web sites, you notice that they don’t feel like web sites.

 

No need to pick on the little guys; let’s continue our tour of the Google properties by considering the Gmail online email service. From the end-user perspective, there is only one Gmail URI: https://mail.google.com/. Whatever you do, whatever pieces of information you retrieve from or upload to Gmail, you’ll never see a different URI. The resource “email messages about jellyfish” isn’t addressable, the way Google’s “web pages about jellyfish” is. Yet behind the scenes, as I show in Chapter 11, is a web site that is addressable. The list of email messages about jellyfish does have a URI: it’s https://mail.google.com/mail/?q=jellyfish&search=query&view=tl. The problem is, you’re not the consumer of that web site. The web site is really a web service, and the real consumer is a JavaScript program running inside your web browser. The Gmail web service is addressable, but the Gmail web application that uses it is not.

 

Statelessness

 

Statelessness means that every HTTP request happens in complete isolation. When the client makes an HTTP request, it includes all information necessary for the server to fulfill that request. The server never relies on information from previous requests. If that information was important, the client would have sent it again in this request.

 

More practically, consider statelessness in terms of addressability. Addressability says that every interesting piece of information the server can provide should be exposed as a resource, and given its own URI. Statelessness says that the possible states of the server are also resources, and should be given their own URIs. The client should not have to coax the server into a certain state to make it receptive to a certain request.

 

Statelessness also brings new features. It’s easier to distribute a stateless application across load-balanced servers. Since no two requests depend on each other, they can be handled by two different servers that never coordinate with each other. Scaling up is as simple as plugging more servers into the load balancer. A stateless application is also easy to cache: a piece of software can decide whether or not to cache the result of an HTTP request just by looking at that one request. There’s no nagging uncertainty that state from a previous request might affect the cacheability of this one.

 

To make your service addressable you have to put in some work and dissect your application’s data into sets of resources. Statelessness is different: HTTP is an intrinsically stateless protocol, so when you write web services, you get statelessness by default. You have to do something to break it.

 

The most common way to break statelessness is to use your framework’s version of HTTP sessions. There’s nothing unRESTful about stateful URIs: that’s how the server communicates possible next states to the client. However, there is something unRESTful about cookies, as I discuss in “The Trouble with Cookies.” To use a web browser analogy, cookies break a web service client’s back button.

 

But those URIs need to contain the state, not just provide a key to state stored on the server. start=10 means something on its own, and PHPSESSID=27314962133 doesn’t. RESTfulness requires that the state stay on the client side, and be transmitted to the server for every request that needs it. The server can nudge the client toward new states, by sending stateful links for the client to follow, but it can’t keep any state of its own.
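The contrast between start=10 and a session key can be sketched as a request handler that reads everything it needs from the URI. The data set, page size, and function name below are assumptions for illustration.

```python
from urllib.parse import urlparse, parse_qs, urlencode

RESULTS = [f"result {n}" for n in range(1, 36)]  # hypothetical data set
PAGE_SIZE = 10


def handle_search(uri):
    """Serve one page of results using only the information in the URI."""
    query = parse_qs(urlparse(uri).query)
    start = int(query.get("start", ["0"])[0])
    page = RESULTS[start:start + PAGE_SIZE]
    next_uri = None
    if start + PAGE_SIZE < len(RESULTS):
        # A stateful link: it carries the next state instead of a session key.
        params = urlencode({"q": query["q"][0], "start": start + PAGE_SIZE})
        next_uri = f"http://www.example.com/search?{params}"
    return page, next_uri


page, next_uri = handle_search("http://www.example.com/search?q=jellyfish&start=10")
print(page[0], page[-1])
print(next_uri)
```

Because the handler keeps nothing between calls, any server behind a load balancer could answer any of these requests.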

 

Application State Versus Resource State

 

So far, the discussion of statelessness has implied that there’s only one kind of state and that the server should go without it. Actually, there are two kinds of state.

 

Type of State

Description

Example

Application State

ought to live on the client

When you use a search engine, your current query and your current page are bits of application state.

You might be on page 3 of the search results for “jellyfish,” and I might be on page 1 of the search results for “mice.”  Our respective clients store different bits of application state.

Resource State

ought to live on the server

 

 

A web service only needs to care about your application state when you’re actually making a request. The rest of the time, it doesn’t even know you exist. This means that whenever a client makes a request, it must include all the application state the server will need to process it. The server might send back a page with links, telling the client about other requests it might want to make in the future, but then it can forget all about the client until the next request. That’s what I mean when I say a web service should be “stateless.” The client should be in charge of managing its own path through the application.

 

Representations

 

A resource is a source of representations, and a representation is just some data about the current state of a resource. Most resources are themselves items of data (like a list of bugs), so an obvious representation of a resource is the data itself. The server might present a list of open bugs as an XML document, a web page, or as comma-separated text. The sales numbers for the last quarter of 2004 might be represented numerically or as a graphical chart. Lots of news sites make their articles available in an ad-laden format, and in a stripped-down “printer-friendly” format. These are all different representations of the same resources.

 

Deciding Between Representations

 

If a server provides multiple representations of a resource, how does it figure out which one the client is asking for? For instance, a press release might be put out in both English and Spanish. Which one does a given client want?

 

RESTful Representation Technique

Description

Distinct URI
 

(Recommended)

Give a distinct URI to each representation of a resource. http://www.example.com/releases/104.en could designate the English representation of the press release, and http://www.example.com/releases/104.es could designate the Spanish representation.

Content Negotiation

In this scenario the only exposed URI is the Platonic form URI, http://www.example.com/releases/104. When a client makes a request for that URI, it provides special HTTP request headers that signal what kind of representations the client is willing to accept.
 

Your web browser has a setting for language preferences: which languages you’d prefer to get web pages in. The browser submits this information with every HTTP request, in the Accept-Language header. The server usually ignores this information because most web pages are available in only one language. But it fits with what we’re trying to do here: expose different-language representations of the same resource. When a client requests http://www.example.com/releases/104, the server can decide whether to serve the English or the Spanish representation based on the client’s Accept-Language header.

 

It’s RESTful to keep this information in the HTTP headers, and it’s RESTful to put it in the URI. I recommend keeping as much of this information as possible in the URI, and as little as possible in request metadata. I think URIs are more useful than metadata. URIs get passed around from person to person, and from program to program. The request metadata almost always gets lost in transition.
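A server doing content negotiation on language could pick a representation along these lines. This is a simplified sketch under stated assumptions: the function names are invented, wildcards such as `*` are not handled, and a malformed q value would raise an error rather than being ignored.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, most preferred first."""
    weighted = []
    for item in header.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                quality = float(value)
        if tag:
            weighted.append((quality, tag))
    # sorted() is stable, so equal weights keep their header order.
    return [tag for quality, tag in sorted(weighted, key=lambda w: -w[0])]


def choose_representation(header, available, default="en"):
    """Pick the URI suffix of the best available language for this client."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]  # treat "es-MX" as a request for "es"
        if primary in available:
            return primary
    return default


base = "http://www.example.com/releases/104"
language = choose_representation("es-MX, es;q=0.9, en;q=0.5", {"en", "es"})
print(f"{base}.{language}")
```

Note that the result is expressed as one of the distinct URIs from the first technique, so the two approaches can coexist: negotiation picks a representation, and the representation still has a URI of its own.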

 

Links and Connectedness

 

If you’ve read about REST before, you might have encountered an axiom from the Fielding dissertation: “Hypermedia as the engine of application state.” This is what that axiom means: the current state of an HTTP “session” is not stored on the server as a resource state, but tracked by the client as an application state, and created by the path the client takes through the Web. The server guides the client’s path by serving “hypermedia”: links and forms inside hypertext representations.

 

The server sends the client guidelines about which states are near the current one. The “next” link on http://www.google.com/search?q=jellyfish is a lever of state: it shows you how to get from the current state to a related one. This is very powerful. A document that contains a URI points to another possible state of the application: “page two,” or “related to this URI,” or “a cached version of this URI.” Or it may be pointing to a possible state of a totally different application.

 

I’m calling the quality of having links connectedness. A web service is connected to the extent that you can put the service in different states just by following links and filling out forms. I’m calling this “connectedness” because “hypermedia as the engine of application state” makes the concept sound more difficult than it is. All I’m saying is that resources should link to each other in their representations.
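A connected service lets a client move between states knowing only a starting URI and how to recognize a link. The sketch below fakes the service with an in-memory dictionary; the XML format, the rel="next" convention, and the URIs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical in-memory "service": each URI maps to an XML representation.
SERVICE = {
    "/search?q=jellyfish": '<results><item>A</item><link rel="next" href="/search?q=jellyfish&amp;start=10"/></results>',
    "/search?q=jellyfish&start=10": '<results><item>B</item><link rel="next" href="/search?q=jellyfish&amp;start=20"/></results>',
    "/search?q=jellyfish&start=20": "<results><item>C</item></results>",
}


def follow_next_links(start_uri):
    """Walk the service by following rel="next" links; return the URIs visited."""
    visited = []
    uri = start_uri
    while uri is not None:
        visited.append(uri)
        root = ET.fromstring(SERVICE[uri])  # stands in for an HTTP GET
        link = root.find("link[@rel='next']")
        uri = link.get("href") if link is not None else None
    return visited


print(follow_next_links("/search?q=jellyfish"))
```

The client never constructs a URI itself. Each representation tells it which state is reachable next.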

 

In the following table, all three services expose the same functionality, but their usability increases toward the right.

Service Type

Connectedness Description

 

Service A is a typical RPC-style service, exposing everything through a single URI. It’s neither addressable nor connected.

 

Service B is addressable but not connected: there are no indications of the relationships between resources. This might be a REST-RPC hybrid service, or a RESTful service like Amazon S3.

 

Service C is addressable and well-connected: resources are linked to each other in ways that (presumably) make sense. This could be a fully RESTful service.

 

 

The Uniform Interface

 

All across the Web, there are only a few basic things you can do to a resource. HTTP provides four basic methods for the four most common operations (GET, PUT, POST, and DELETE), plus two utility methods (HEAD and OPTIONS).

 

Operation

Description

Notes

GET

Retrieve a representation of a resource

 

PUT

  1. Create a new resource (PUT to a new URI)
  2. Modify an existing resource (PUT to an existing URI)

 

POST

Create a new resource (POST to an existing URI)

    This is the most misunderstood of HTTP methods. This method essentially has two purposes:

  1. One that fits in with the constraints of REST
  2. One that goes outside REST and introduces an element of the RPC style

    According to RFC 2616, POST is designed to allow a uniform method to cover the following functions:

    • Annotation of existing resources;

    • Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles;

    • Providing a block of data, such as the result of submitting a form, to a data-handling process;

    • Extending a database through an append operation.

     

    Creating subordinate resources - In a RESTful design, POST is commonly used to create subordinate resources: resources that exist in relation to some other “parent” resource.  A web-enabled database may expose a table as a resource, and the individual database rows as its subordinate resources. To create a database record, you POST to the parent: the database table.

     

    Difference between PUT and POST - the client uses PUT when it’s in charge of deciding which URI the new resource should have. The client uses POST when the server is in charge of deciding which URI the new resource should have.

     

    Factory Resource - The POST method is a way of creating a new resource without the client having to know its exact URI. In most cases the client only needs to know the URI of a “parent” or “factory” resource. The server takes the representation from the entity-body and uses it to create a new resource “underneath” the “parent” resource (the meaning of “underneath” depends on context). The response to this sort of POST request usually has an HTTP status code of 201 (“Created”). Its Location header contains the URI of the newly created subordinate resource. Now that the resource actually exists and the client knows its URI, future requests can use the PUT method to modify that resource, GET to fetch a representation of it, and DELETE to delete it.

     

    Appending to the resource state - POST can also add information to an existing resource without creating a new one, as with a log exposed as a single resource. The POST method works here, just as it would if each log entry were exposed as a separate resource. The semantics of POST are the same in both cases: the client adds subordinate information to an existing resource. The only difference is that in the case of a weblog and its entries, the subordinate information shows up as a new resource. Here, the subordinate information shows up as new data in the parent’s representation.

     

    Overloaded POST: The not-so-uniform interface - One of the functions listed in RFC 2616 is providing a block of data, such as the result of submitting a form, to a data-handling process. Using POST this way turns a resource into a tiny message processor that acts like an XML-RPC server. The resource accepts POST requests, examines the request, and decides to do... something. Then it decides to serve to the client... some data. I call this use of POST overloaded POST, by analogy to operator overloading in a programming language. When the method information isn’t found in the HTTP method, the interface stops being uniform. The real method information might be anything. As a REST partisan I don’t like this very much, but occasionally it’s unavoidable. By Chapter 9 you’ll have seen how just about any scenario you can think of can be exposed through HTTP’s uniform interface, but sometimes the RPC style is the easiest way to express complex operations that span multiple resources. Overloaded POST should not be used to cover up poor resource design. Remember, a resource can be anything. It’s usually possible to shuffle your resource design so that the uniform interface applies, rather than introduce the RPC style into your service.

     

DELETE

Delete an existing resource

 

HEAD

Retrieve meta-data only

An S3 client uses HEAD to fetch metadata about a resource without downloading the possibly enormous entity-body. That’s what HEAD is for. A client can use HEAD to check whether a resource exists, or find out other information about the resource, without fetching its entire representation. HEAD gives you exactly what a GET request would give you, but without the entity-body.

OPTIONS

Check which HTTP methods a particular resource supports

The OPTIONS method lets the client discover what it’s allowed to do to a resource. The response to an OPTIONS request contains the HTTP Allow header, which lays out the subset of the uniform interface this resource supports. Here’s a sample Allow header:

 

Allow: GET, HEAD

 

That particular header means the client can expect the server to act reasonably to a GET or HEAD request for this resource, but that the resource doesn’t support any other HTTP methods. Effectively, this resource is read-only.

 

NOTE: In theory, the server can send additional information in response to an OPTIONS request, and the client can send OPTIONS requests that ask very specific questions about the server’s capabilities. Very nice, except there are no accepted standards for what a client might ask in an OPTIONS request. Apart from the Allow header there are no accepted standards for what a server might send in response. Most web servers and frameworks feature very poor support for OPTIONS. So far, OPTIONS is a promising idea that nobody uses.
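The difference between PUT and POST described in the table above can be sketched with a toy in-memory service. The class name, the URI scheme, and the choice to return a status code with a headers dict are all assumptions made for illustration.

```python
import itertools


class WeblogService:
    """A hypothetical in-memory service with a 'factory' resource at /weblog."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def post(self, parent_uri, representation):
        """POST: the server picks the new resource's URI."""
        uri = f"{parent_uri}/{next(self._ids)}"
        self.resources[uri] = representation
        return 201, {"Location": uri}  # "Created", plus where to find it

    def put(self, uri, representation):
        """PUT: the client picks the URI; create or overwrite the resource there."""
        created = uri not in self.resources
        self.resources[uri] = representation
        return (201 if created else 200), {}


service = WeblogService()
status, headers = service.post("/weblog", "First entry")
print(status, headers["Location"])
print(service.put(headers["Location"], "First entry, edited")[0])
print(service.put("/weblog/about", "About this weblog")[0])
```

After the POST, the client knows the new resource’s URI from the Location header and can switch to PUT for later modifications.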

 

 

Safety and Idempotence

 

Safety

A GET or HEAD request is a request to read some data, not a request to change any server state. The client can make a GET or HEAD request 10 times and it’s the same as making it once, or never making it at all. When you GET http://www.google.com/search?q=jellyfish, you aren’t changing anything about the directory of jellyfish resources.

 

You’re just retrieving a representation of it. A client should be able to send a GET or HEAD request to an unknown URI and feel safe that nothing disastrous will happen. This is not to say that GET and HEAD requests can’t have side effects. Some resources are hit counters that increment every time a client GETs them. Most web servers log every incoming request to a log file. These are side effects: the server state, and even the resource state, is changing in response to a GET request. But the client didn’t ask for the side effects, and it’s not responsible for them.

 

IMPORTANT: A client should never make a GET or HEAD request just for the side effects, and the side effects should never be so big that the client might wish it hadn’t made the request.

Idempotence

An operation on a resource is idempotent if making one request is the same as making a series of identical requests. The second and subsequent requests leave the resource in exactly the same state as the first request did. PUT and DELETE operations are idempotent. If you DELETE a resource, it’s gone. If you DELETE it again, it’s still gone. If you create a new resource with PUT, and then resend the PUT request, the resource is still there and it has the same properties you gave it when you created it. If you use PUT to change the state of a resource, you can resend the PUT request and the resource state won’t change again.

The practical upshot of this is that you shouldn’t allow your clients to PUT representations that change a resource’s state in relative terms. If a resource keeps a numeric value as part of its resource state, a client might use PUT to set that value to 4, or 0, or −50, but not to increment that value by 1. If the initial value is 0, sending two PUT requests that say “set the value to 4” leaves the value at 4. If the initial value is 0, sending two PUT requests that say “increment the value by 1” leaves the value not at 1, but at 2. That’s not idempotent.
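The numeric example above translates directly into code. A minimal sketch, with a made-up `Counter` class standing in for the resource:

```python
class Counter:
    """A hypothetical resource whose state is a single numeric value."""

    def __init__(self, value=0):
        self.value = value

    def put_absolute(self, new_value):
        """Idempotent: the representation states the value outright."""
        self.value = new_value

    def put_relative(self, delta):
        """Not idempotent: each repeat changes the state again."""
        self.value += delta


absolute = Counter()
absolute.put_absolute(4)
absolute.put_absolute(4)  # resending changes nothing
print(absolute.value)

relative = Counter()
relative.put_relative(1)
relative.put_relative(1)  # resending changes the state a second time
print(relative.value)
```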

 

Why safety and idempotence matter

 

Safety and idempotence let a client make reliable HTTP requests over an unreliable network.

 

If you make a GET request and never get a response, just make another one. It’s safe: even if your earlier request went through, it didn’t have any real effect on the server. If you make a PUT request and never get a response, just make another one.

 

POST is neither safe nor idempotent. Making two identical POST requests to a “factory” resource will probably result in two subordinate resources containing the same information. With overloaded POST, all bets are off.
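A minimal sketch of how a client might use these guarantees, assuming a hypothetical transport callable named send that raises TimeoutError when no response arrives. The function names and the example URI are assumptions made for illustration, not part of any real library.

```python
RETRYABLE_METHODS = {"GET", "HEAD", "PUT", "DELETE"}  # safe or idempotent


def send_with_retry(send, method, uri, body=None, attempts=3):
    """Call send(method, uri, body); on a timeout, retry only when resending is harmless."""
    last_error = None
    for _ in range(attempts):
        try:
            return send(method, uri, body)
        except TimeoutError as error:
            last_error = error
            if method.upper() not in RETRYABLE_METHODS:
                break  # a resent POST might create a second subordinate resource
    raise last_error


# A fake transport that loses the first response, for demonstration only.
calls = []


def flaky_send(method, uri, body):
    calls.append(method)
    if len(calls) == 1:
        raise TimeoutError("no response")
    return 200


status = send_with_retry(flaky_send, "PUT", "/Earth/example-place", b"...")
```

Here the PUT is sent twice and the second attempt succeeds. A POST that timed out would be reported to the caller instead of being resent.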

 

IMPORTANT: The most common misuse of the uniform interface is to expose unsafe operations through GET. The del.icio.us and Flickr APIs both do this. When you GET https://api.del.icio.us/posts/delete, you’re not fetching a representation: you’re modifying the del.icio.us data set.

 

Why is this bad?

 

Well, here’s a story. In 2005 Google released a client-side caching tool called Web Accelerator. It runs in conjunction with your web browser and “pre-fetches” the pages linked to from whatever page you’re viewing. If you happen to click one of those links, the page on the other side will load faster, because your computer has already fetched it.

 

Web Accelerator was a disaster. Not because of any problem in the software itself, but because the Web is full of applications that misuse GET. Web Accelerator assumed that GET operations were safe, that clients could make them ahead of time just in case a human being wanted to see the corresponding representations. But when it made those GET requests to real URIs, it changed the data sets. People lost data. There’s plenty of blame to go around: programmers shouldn’t expose unsafe actions through GET, and Google shouldn’t have released a real-world tool that didn’t work  with the real-world web.

 

The current version of Web Accelerator ignores all URIs that contain query variables. This solves part of the problem, but it also prevents many resources that are safe to use through GET (such as Google web searches) from being pre-fetched.
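The rule described above can be approximated as follows. This is only an illustration of the idea, not Google’s actual implementation; the function name and the example.com URI are hypothetical.

```python
from urllib.parse import urlsplit


def may_prefetch(uri):
    """Return True when the URI carries no query variables."""
    return urlsplit(uri).query == ""


# A safe search is skipped because it has a query string...
print(may_prefetch("http://google.com/search?q=jellyfish"))
# ...while an unsafe link without query variables would still be fetched.
print(may_prefetch("http://example.com/items/42/delete"))
```

The two calls show both halves of the trade-off: a harmless resource is excluded, and a misused GET without query variables still slips through.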

 

Multiply the examples if you like. Many web services and web applications use URIs as input, and the first thing they do is send a GET request to fetch a representation of a resource. These clients don’t mean to trigger catastrophic side effects, but it’s not up to them. It’s up to the service that receives the GET request to handle it in a way that complies with the HTTP standard.

 

Why the Uniform Interface Matters

 

The important thing about REST is not that you use the specific uniform interface that HTTP defines. REST specifies a uniform interface, but it doesn’t say which uniform interface. GET, PUT, and the rest are not a perfect interface for all time. What’s important is the uniformity: that every service use HTTP’s interface the same way.

 

What this book is really about is a common ontology for the Web, and what the author is saying makes sense. If, and only if, all the resources on the Web are accessible through a REST interface, it will be possible to bring the machine forward.

 

Without the uniform interface, you’d have to learn how each service expected to receive and send information. The rules might even be different for different resources within a single service.  Without the uniform interface, you get a multiplicity of methods taking the place of GET: doSearch and getPage and nextPrime. Every service speaks a different language. This is also the reason I don’t like overloaded POST very much: it turns the simple Esperanto of the uniform interface into a Babel of one-off sublanguages.

 

Some applications extend HTTP’s uniform interface. The most obvious case is WebDAV, which adds eight new HTTP methods including MOVE, COPY, and SEARCH. Using these methods in a web service would not violate any precept of REST, because REST doesn’t say what the uniform interface should look like. Using them would violate my Resource-Oriented Architecture (I’ve explicitly tied the ROA to the standard HTTP methods), but your service could still be resource-oriented in a general sense. There are web services like Subversion that use the WebDAV methods, so your service wouldn’t be all alone. But it would be part of a much smaller web. This is why making up your own HTTP methods is a very, very bad idea: your custom vocabulary puts you in a community of one. You might as well be using XML-RPC.

 

That’s It!

 

That’s the Resource-Oriented Architecture.

 

Concepts | Properties
Resources | Addressability
Resource names and locations (URIs) | Statelessness
Their representations | Connectedness
The links between them | Uniform interface

 

 

Chapter 5 - Designing Read-Only Resource-Oriented Services

 

 

Resource Design

 

Design Pattern | Design Approach
OO (object-oriented) | Break a system down into its moving parts: its nouns. An object is something. Each noun (“Reader,” “Column,” “Story,” “Comment”) gets its own class, and behavior for interacting with the other nouns.
RPC | Break the system into its motions: its verbs. A procedure does something (“Subscribe to,” “Read,” “Comment on”).
RO (resource-oriented) | A resource is something, so I take an object-oriented approach to designing resources. In fact, the resource-oriented design strategy could be called “extreme object-oriented.”
 

A class in a programming language can expose any number of methods and give them any names, but an HTTP resource exposes a uniform interface of at most six HTTP methods. These methods allow only the most basic operations: create (PUT or POST), modify (PUT), read (GET), and delete (DELETE). If necessary, you can extend this interface by overloading POST, turning a resource into a small RPC-style message processor, but you shouldn’t need to do that very often.


The uniform interface means that a resource-oriented design must treat as objects what an object-oriented design might consider verbs. In the ROA, a Reader can’t subscribe to a regularly appearing Column, because “subscribe to” is not part of the uniform interface. There must be a third object, Subscription, representing that relationship between a Reader and a Column. This relationship object is subject to the uniform interface: it can be created, fetched (perhaps as a syndication feed), and deleted. “Subscription” might not show up as a first-class object in an object-oriented analysis, but it probably would appear as a table in an underlying database model. In a resource-oriented analysis, all object manipulation happens through resources that respect the uniform interface. Whenever I’m tempted to add a new method to one of my resource “classes,” I’ll resolve the problem by defining a new kind of resource.
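As a rough sketch of the Subscription idea, the toy store below replaces a “subscribe” method with a resource that is created, fetched, and deleted. The class, the URI layout, the reader and column names, and the choice of status codes are all assumptions made for this illustration.

```python
class SubscriptionStore:
    """Toy in-memory server: each subscription is a resource with its own URI."""

    def __init__(self):
        self._resources = {}

    @staticmethod
    def uri(reader, column):
        # Hypothetical URI layout, chosen only for this sketch.
        return f"/readers/{reader}/subscriptions/{column}"

    def put(self, uri, representation):
        """Create or replace the subscription at this URI."""
        created = uri not in self._resources
        self._resources[uri] = representation
        return 201 if created else 200

    def get(self, uri):
        """Fetch the subscription, if it exists."""
        if uri in self._resources:
            return 200, self._resources[uri]
        return 404, None

    def delete(self, uri):
        """Remove the subscription; repeating the request leaves it removed."""
        if self._resources.pop(uri, None) is None:
            return 404
        return 204


store = SubscriptionStore()
uri = store.uri("alice", "weekly-science")
first = store.put(uri, {"reader": "alice", "column": "weekly-science"})
again = store.put(uri, {"reader": "alice", "column": "weekly-science"})
fetched = store.get(uri)
removed = store.delete(uri)
after = store.get(uri)
```

“Subscribing” becomes a PUT to the subscription URI and “unsubscribing” becomes a DELETE, so no new method name is needed.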

 

Turning Requirements Into Read-Only Resources

 

I’ve come up with a procedure to follow once you have an idea of what you want your program to do. (This procedure has a lot in common with Joe Gregorio’s “How to create a REST Protocol”: http://www.xml.com/pub/a/2004/12/01/restful-web.html.) It produces a set of resources that respond to a read-only subset of HTTP’s uniform interface: GET and possibly HEAD. Once you get to the end of this procedure, you should be ready to implement your resources in whatever language and framework you like.

 

Step

Name

Description

1

Figure out the data set

A web service starts with a data set, or at least an idea for one.  An example service can find a place on a planet, given its name, type, or description. It can show the place on any appropriate maps, and it can find places nearby. Given a street address, the example service can locate the corresponding point on the planet Earth, and show it on a road map. Given the name of a country, it can locate the corresponding place on the planet (as a representative point), and show it on a political map.

 

General Lessons

This is a standard first step in any analysis. Sometimes you get to choose your data set, and sometimes you’re trying to expose data you’ve already got. You may come back to this step as you see how best to expose your data set as resources. I went through the design process two or three times before I figured out that points on a planet needed to be considered distinct from points on any particular map. Even now, the data set is chaotic, just a bundle of ideas. I’ll give it shape when I divide it into resources.

 

2

Split the data set into resources

    This step determines how to expose the data as HTTP resources. Remember that a resource is anything interesting enough to be the target of a hypertext link.

     

    Is a process also a resource? How can a process be represented in a RESTful fashion?

     

    There are three types of resources.

     

    Resource Type

    Description

    Predefined one-off resources for especially important aspects of the application.

    This includes top-level directories of other available resources. Most services expose few or no one-off resources.

     

    Example: A web site’s homepage. It’s a one-of-a-kind resource, at a well-known URI, which acts as a portal to other resources. The root URI of Amazon’s S3 service (https://s3.amazonaws.com/) serves a list of your S3 buckets. There’s only one resource of this type on S3. You can GET this resource, but you can’t DELETE it, and you can’t modify it directly: it’s modified only by operating on its buckets. It’s a predefined resource that acts as a directory of child resources (the buckets).

    A resource for every object exposed through the service.

    One service may expose many kinds of objects, each with its own resource set. Most services expose a large or infinite number of these resources.


    Example: Every S3 bucket you create is exposed as a resource. You can create up to 100 buckets, and they can have just about any names you want (it’s just that your names can’t conflict with anyone else’s). You can GET and DELETE these resources, but once you’ve created them you can’t modify them directly: they’re modified only by operating on the objects they contain.

    Every S3 object you create is also exposed as a resource. A bucket has room for any number of objects. You can GET, PUT, and DELETE these resources as you see fit.

    Resources representing the results of algorithms applied to the data set.

    This includes collection resources, which are usually the results of queries. Most services either expose an infinite number of algorithmic resources, or they don’t expose any.

     

    Example: A search engine exposes an infinite number of algorithmic resources. There’s one for every search request you might possibly make. The Google search engine exposes one resource at http://google.com/search?q=jellyfish (that’d be “a directory of resources about jellyfish”) and another at http://google.com/search?q=chocolate (“a directory of resources about chocolate”). Neither of these resources was explicitly defined ahead of time: Google translates any URI of the form http://google.com/search?q={query} into an algorithmic resource “a directory of resources about {query}.”
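A minimal sketch of an algorithmic resource, assuming a tiny hypothetical document set and a hypothetical /search path. No resource is stored per query; the representation is computed from whatever value of q arrives.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical data set, invented for this sketch.
DOCUMENTS = {
    "/pages/1": "jellyfish drift with the current",
    "/pages/2": "chocolate comes from cacao",
    "/pages/3": "a jellyfish has no brain",
}


def get(uri):
    """Return (status, links) for the algorithmic resource at /search?q={query}."""
    parts = urlsplit(uri)
    if parts.path != "/search":
        return 404, []
    terms = parse_qs(parts.query).get("q", [])
    if not terms:
        return 400, []
    query = terms[0].lower()
    matches = sorted(link for link, text in DOCUMENTS.items() if query in text)
    return 200, matches


print(get("/search?q=jellyfish"))
print(get("/search?q=chocolate"))
```

Every distinct query string names a distinct resource, “a directory of resources about {query},” even though none of them was defined ahead of time.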

     

    Let’s apply these categories to my fantasy map service. I need one special resource that lists the planets, just as S3 has a top-level resource that lists the buckets. It’s reasonable to link to “the list of planets.” Every planet is a resource: it’s reasonable to link to “Venus.” Every map of a planet is also a resource: it’s reasonable to link to “the radar map of Venus.” The list of planets is a resource of the first type, since there’s only one of them. The planets and maps are also one-off resources: my service will serve a small number of maps for a small number of planets.

    Here are some of the resources so far:

    • The list of planets

    • Mars

    • Earth

    • The satellite map of Mars

    • The radar map of Venus

    • The topographic map of Earth

    • The political map of Earth

     

    But I can’t just serve entire maps and let our clients figure out the rest. Then I’d just be running a hosting service for huge static map files: a RESTful service to be sure, but not a very interesting one. I must also serve parts of maps, oriented on specific points and places.

     

    Every point on a planet is potentially interesting, and so should be a resource. A point might represent a house, a mountain, or the current location of a ship. These are resources of the second type, because there are an infinite number of points on any planet. For every point on a planet there’s a corresponding point on one or more maps. This is why I limited myself to addressable maps. When the map can be addressed by latitude and longitude, it’s easy to turn a point on the planet into a point on a map.

     

    Here are some more of the resources so far:

    • 24.9195N 17.821E on Earth

    • 24.9195N 17.821E on the political map of Earth

    • 24.9195N 17.821E on Mars

    • 44N 0W on the geologic map of Earth

     

    I’ll also serve places: points on a planet identified by name rather than by coordinates. My fantasy database contains a large but finite number of places. Each place has a type, a latitude and longitude, and each might also have additional associated data. For instance, an area of high pollution should “know” what pollutant is there and what the concentration is. As with points identified by latitude and longitude, the client should be able to move from a place on the planet to the corresponding point on any map.

     

     

    General Lessons

     

    A RESTful web service exposes both its data and its algorithms through resources.

    There’s usually a hierarchy that starts out small and branches out into infinitely many leaf nodes. The list of planets contains the planets, which contain points and places, which contain maps. The S3 bucket list contains the individual buckets, which contain the objects. It takes a while to get the hang of exposing an algorithm as a set of resources. Instead of thinking in terms of actions (“do a search for places on the map”), you need to think in terms of the results of that action (“the list of places on the map matching a search criterion”). You may find yourself coming back to this step if you find that your design doesn’t fit HTTP’s uniform interface.

For each resource

 

 

3

Name the resources with URIs

    In a resource-oriented service the URI contains all the scoping information.

     

    In the example of a map, the URIs need to answer questions like: “Why should the server operate on this map instead of that map?” and “Why should the server operate on this place instead of that place?”

     

    In the example being considered, the web service will be rooted at http://maps.example.com/. For brevity’s sake I sometimes use relative URIs in this chapter and the next; understand that they’re relative to http://maps.example.com/. If I say /Earth/political, what I mean is http://maps.example.com/Earth/political.

     

    In the case of the hsn - the root can be www.211now.com and the entities will be www.211now.com/beneficiaries and www.211now.com/donors

     

    There are three basic rules for URI design, born of collective experience.

     

    Description

    Example

    Use path variables to encode hierarchy:

     /parent/child

     

    Let’s make URIs for the second class of resource: planets and places on planets. There’s one piece of scoping information here: what planet are we looking at? (Earth? Venus? Ganymede?) This scoping information fits naturally into a hierarchy: the list of planets is at the top, and underneath it is every particular planet. Here are the URIs to some of my planets. I show hierarchy by using the slash character to separate pieces of scoping information.

     

    http://maps.example.com/Venus

    http://maps.example.com/Earth

    http://maps.example.com/Mars

     

    To identify geographical places by name I’ll just extend the hierarchy to the right. You’ll know you’ve got a good URI design when it’s easy to extend hierarchies by tacking on additional path variables. Here are some URIs to various places on planets:

     

    http://maps.example.com/Venus

    http://maps.example.com/Venus/Cleopatra

    http://maps.example.com/Earth/France/Paris

     

    Put punctuation characters in path variables to avoid implying hierarchy where none exists:

     /parent/child1;child2

      The next resources I need to name are geographic points on the globe, represented by latitude and longitude. Latitude and longitude are tied together, so a hierarchy isn’t appropriate. A URI like /Earth/24.9195/17.821 doesn’t make sense. The slash makes it look like longitude is a subordinate concept to latitude,  the way /Earth/Chicago signals that Chicago is part of Earth.

      Instead of using the slash to put two pieces of scoping information into a hierarchy, I recommend combining them on the same level of a hierarchy with a punctuation character: usually the semicolon or the comma.

       

      http://maps.example.com/Earth/24.9195,17.821

      http://maps.example.com/Venus/3,-80

       

      Recommendation: use commas when the order of the scoping information is important, and semicolons when the order doesn’t matter. In this case the order matters: if you switch latitude and longitude, you get a different point on the planet. So I used commas to separate the two numbers. It doesn’t hurt that people already use commas in written language to separate latitude and longitude: URIs should use our existing conventions when possible.
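The comma convention can be sketched as a pair of helper functions. The function names are hypothetical; the base URI is the example root used in these notes.

```python
BASE = "http://maps.example.com"


def point_uri(planet, latitude, longitude):
    """Build the URI of a point; order matters, so the pair is comma-separated."""
    return f"{BASE}/{planet}/{latitude},{longitude}"


def parse_point(path_segment):
    """Split a 'latitude,longitude' path segment back into two floats."""
    latitude, longitude = path_segment.split(",")
    return float(latitude), float(longitude)


print(point_uri("Earth", 24.9195, 17.821))
print(point_uri("Venus", 3, -80))
print(parse_point("3,-80"))
```

Because latitude and longitude share one path segment, nothing in the URI suggests that one is subordinate to the other.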

       

      Note: URIs can become very long, especially when there’s no limit to how deep you can nest the path variables. My web service might let clients name a place using a lot of explicit scoping information: /Earth/North%20America/USA/California/Northern%20California/San%20Francisco%20Bay%20Area/Sebastopol/...

      The HTTP standard doesn’t impose any restrictions on URI length, but real web servers and clients do. For instance, Microsoft Internet Explorer can’t handle URIs longer than 2,083 characters, and Apache won’t respond to requests for URIs longer than 8 KB. If some of your resources are only addressable given a great deal of scoping information, you may have to accept some of it in HTTP headers, or use overloaded POST and put scoping information in the entity-body.

       

    Use query variables to imply inputs into an algorithm, for example:

     /search?q=jellyfish&start=20

     

      Algorithmic Resource? Use Query Variables


      Path variables look like you’re traversing a hierarchy, and query variables look like you’re passing arguments into an algorithm. “Search” sounds like an algorithm. For example, http://www.google.com/directory/jellyfish might work better than /search/jellyfish.

       

      This perception of query variables is reinforced whenever we use the Web. When you fill out an HTML form in a web browser, the data you input is turned into query variables.

      There’s no way to type “jellyfish” into a form and then be sent to http://www.google.com/search/jellyfish. The destination of an HTML form is hard-coded to http://www.google.com/search/, and when you fill out that form you end up at http://www.google.com/search?q=jellyfish. Your browser knows how to tack query variables onto a base URI. It doesn’t know how to substitute variables into a generic URI like http://www.google.com/search/{q}.

       

     

4

Expose a subset of the uniform interface

    Encode Hierarchy into Path Variables

     

     

    http://maps.example.com/Earth/Little%20Rock,AR

    http://maps.example.com/Earth/USA/Mount%20Rushmore

    http://maps.example.com/Earth/1005%20Gravenstein%20Highway%20North,%20Sebastopol,%20CA%2095472

     

    Sending a GET to one of these URIs invokes a remote operation that takes a variable number of arguments, and can locate a place on a planet to any desired degree of precision. But the URIs themselves look like normal web site URIs you can bookmark, cache, put on billboards, and pass to other services as input—because that’s what they are. Path variables are the best way to organize scoping information that can be arranged hierarchically.

     

    Note: The same structure you see in a filesystem, or on a static web site, can correspond to an arbitrarily long list of path variables.

     

    I’ve managed to avoid query variables so far: every planet, every point on a planet, and every corresponding map is addressable without them. I don’t really like the way query variables look in a URI, and including them in a URI is a good way to make sure that URI gets ignored by tools like proxies, caches, and web crawlers.

     

    The only query variable I’ll add is show, which lets the client specify in natural language what feature(s) they’re searching for. The server will parse the client’s values for show and figure out what places should be in the list of search results. In “Split the Data Set into Resources” earlier in this chapter, I gave a whole lot of sample search resources: “places on Earth called Springfield,” and so on. Here’s how a client might use show to construct URIs for some of those resources.

     

    http://maps.example.com/Earth?show=Springfield

    http://maps.example.com/Mars?show=craters+bigger+than+1km

    http://maps.example.com/Earth/Indonesia?show=oil+tankers&show=container+ships

    http://maps.example.com/Earth/USA/Mount%20Rushmore?show=diners

    http://maps.example.com/Earth/24.9195,17.821?show=arsenic

     

    Note that all of these URIs are searching the planet, not any particular map.
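A client could assemble such URIs along these lines. The helper name is hypothetical; it percent-encodes each path variable and repeats the show query variable once per requested feature.

```python
from urllib.parse import quote, urlencode

BASE = "http://maps.example.com"


def search_near(path_segments, *features):
    """Build a planet/place URI with one `show` query variable per feature."""
    # Keep commas unescaped so a 'latitude,longitude' segment stays readable.
    path = "/".join(quote(segment, safe=",") for segment in path_segments)
    query = urlencode([("show", feature) for feature in features])
    return f"{BASE}/{path}?{query}" if query else f"{BASE}/{path}"


print(search_near(["Earth", "USA", "Mount Rushmore"], "diners"))
print(search_near(["Earth", "Indonesia"], "oil tankers", "container ships"))
```

The path carries the hierarchical scoping information and the query string carries the input to the search, matching the split described earlier.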

     

    URI Recap

     

    After all, this is the first place where my fantasy resources come into contact with the real world of HTTP. Even so, my service only supports three basic kinds of URI. To recap, here they are:

    • The list of planets: /.

    • A planet or a place on a planet: /{planet}/[{scoping-information}/][{place-name}]. The value of the optional variable {scoping-information} will be a hierarchy of place names like /USA/New%20England/Maine/ or it will be a latitude/longitude pair. The value of the optional variable {place-name} will be the name of the place. This type of URI can have values for show tacked onto its query string, to search for places near the given place.

    • A map of a planet, or a point on a map: /{map-type}{scale}/{planet}/[{scoping-information}]. The value of the optional variable {scoping-information} will always be a latitude/longitude pair. The value of the optional variable {scale} will be a dot and a number.

5

Design the representation(s) accepted from the client

Decide what data to send when a client requests a resource, and what data format to use. Here, I have a specific service in mind, and I need to decide on a format (or a set of formats) that can meet the goals of any RESTful representation: to convey the current state of the resource, and to link to possible new application and resource states.

 

The Representation Talks About the State of the Resource

 

The main purpose of any representation is to convey the state of the resource. Remember that “resource state” is just any information about the underlying resource. In this case, the state is going to answer questions like: what does this part of the world look like, graphically? Where exactly is that meteor crater, in latitude and longitude? Where are the nearby restaurants and what are their names? Where are the container ships right now? Representations of different resources will represent different items of state.

 

The Representation Links to Other States

 

The other job of the representation is to provide levers of state. A resource’s representation ought to link to nearby resources (whatever “nearby” means in context): possible new application states. The goal here is connectedness: the ability to get from one resource to another by following links. This is how web sites work. You don’t surf the Web by typing in URIs one after the other. You might type in one URI to get to a site’s home page, but then you surf by following links and filling out forms. One web page (a “state” of the web site) contains links to other, related web pages (nearby “states”).

 

Representing the List of Planets

 

Representation

Example

Notes

Plain text

http://maps.example.com/Earth Earth

http://maps.example.com/Venus Venus

Plain text is not a hypermedia format

JSON

[{"url": "http://maps.example.com/Earth", "description": "Earth"},

{"url": "http://maps.example.com/Venus", "description": "Venus"},

...]

JSON is not generally considered a “hypermedia” format.

Custom XML format with or without schema

<?xml version="1.0" standalone='yes'?>

<planets>

  <planet href="http://maps.example.com/Earth" name="Earth" />

  <planet href="http://maps.example.com/Venus" name="Venus" />

  ...

</planets>

The basic problems have already been solved, and most of the time you can reuse an existing XML vocabulary. As it happens, there’s already an XML vocabulary for communicating lists of links called Atom.

Atom 0.3

 

Atom will work to represent the list of planets, but it’s not a very good fit. Atom is designed for lists of published texts, and most of its elements don’t make sense in this context: what does it mean to know the “author” of a planet, or the date it was last modified?

XHTML

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"

"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">

  <head>

    <title>Planet List</title>

  </head>

  <body>

    <ul class="planets">

      <li><a href="/Earth">Earth</a></li>

      <li><a href="/Venus">Venus</a></li>

    </ul>

  </body>

</html>

 

Fortunately, there’s another good XML language for displaying lists of links: XHTML.
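To illustrate why XHTML works as a hypermedia format for programs as well as browsers, here is a sketch of a client that pulls the planet links out of the representation using Python’s standard library. The function name is hypothetical, and the DOCTYPE line is left out of the sample document for brevity.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# The planet list from the example above, without the DOCTYPE line.
DOCUMENT = """<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head><title>Planet List</title></head>
  <body>
    <ul class="planets">
      <li><a href="/Earth">Earth</a></li>
      <li><a href="/Venus">Venus</a></li>
    </ul>
  </body>
</html>"""


def planet_links(xhtml, base_uri):
    """Return (name, absolute URI) pairs for each link in the 'planets' list."""
    root = ET.fromstring(xhtml)
    links = []
    for ul in root.iter(f"{XHTML_NS}ul"):
        if ul.get("class") != "planets":
            continue
        for anchor in ul.iter(f"{XHTML_NS}a"):
            # Resolve relative hrefs such as /Earth against the service root.
            links.append((anchor.text, urljoin(base_uri, anchor.get("href"))))
    return links


print(planet_links(DOCUMENT, "http://maps.example.com/"))
```

The class attribute on the list gives the client a hook for finding the links it cares about, while a human can read the same document in a browser.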

 

6

Design the representation(s) served to the client

    What about the maps themselves? What do I serve if someone asks for a satellite map of the Moon? The obvious thing to send is an image, either in a traditional graphics format like PNG or as an SVG vector graphic. Except for the largest-scale maps, these images will be huge. Is this OK? It depends on the audience for my web service.

     

    If I’m serving clients with ultra-high bandwidth who expect to process huge chunks of map data, then huge files are exactly what they want. But it’s more likely my clients will be like the users of existing map applications like Google and Yahoo! Maps: clients who want smaller-sized maps for human browsing.

     

    A representation conveys the state of its resource, but it doesn’t have to convey the entire state of the resource. It just has to convey some state. The representation of “Earth” isn’t the actual planet Earth, and the representation of “a road map of the Earth” can reference just a simple image tile. But this representation does more than that: the XHTML file links this arbitrarily chosen point on the map to other nearby points on the part of the map directly to the north of this tile, directly to the east, and so on. The client can follow these links to other resources and piece together a larger picture. The map is made of a bunch of connected resources, and you can get as many graphical tiles as you need by following the links. So in a sense, this representation does convey all the state there is about the road map of Earth. You can get as much of that state as you want by following its links to other resources.

     

     

     

7

Integrate this resource into existing resources, using hypermedia links and forms

    I want to make it possible for a client to get from /Earth/USA/Mount%20Rushmore to /Earth/USA/Mount%20Rushmore?show=diners. But it does no good to link to “diners” specifically: that’s just one of infinitely many things a client might search for. I can’t put infinitely many links in the representation of /Earth/USA/Mount%20Rushmore just in case someone decides to search for pet stores or meteor craters near Mount Rushmore. HTML solves this problem with forms. By sending an appropriate form in a representation, I can tell the client how to plug variables into a query string. The form represents infinitely many URIs, all of which follow a certain pattern. I’m going to extend my representations of places by including an HTML form.

     

     

8

Consider the typical course of events: what’s supposed to happen?

    Where are we? I’m almost done with this design. I know:

  1. what data I’m serving
  2. which HTTP requests clients will send when asking for the data
  3. how the data will be represented as I serve it

    I still have to consider the HTTP response itself. I know what my possible representations will look like, and that’s what’s going in the entity-body, but I haven’t yet considered the possible response codes or the HTTP headers I’ll send. I also need to think about possible error conditions: cases where my response signals an error instead of delivering a representation.

     

    Whenever a server serves a representation, it should include a time value in the Last-Modified HTTP header: the last time the data underlying the representation was changed. For “a road map of the United States,” the Last-Modified value is likely to be the time the map imagery was first imported into the service. For “restaurants in New York City,” the Last-Modified value may only be a few days old: whenever a restaurant was last added to the database of places. For “container ships near San Francisco,” the value of Last-Modified may be only a few minutes prior.

     

    When a client requests the same representation a second time, it can send the Last-Modified value it received earlier back to the server in the If-Modified-Since request header. If the underlying data changed between the two requests, the server sends a response code of 200 (“OK”) and provides the new representation in the entity-body. That’s the same thing that happens during a normal HTTP request. But if the underlying data has not changed, the server sends a response code of 304 (“Not Modified”), and omits any entity-body. Then the client knows it’s okay to reuse its cached representation: the underlying data hasn’t changed since the first request.
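The server side of this exchange might look roughly like the sketch below. It assumes the client echoes the earlier Last-Modified value in the standard If-Modified-Since request header; the respond function and its parameters are hypothetical names chosen for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified, representation, if_modified_since=None):
    """Return (status, headers, body) for a GET, honoring If-Modified-Since.

    last_modified must be a timezone-aware UTC datetime.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            cached_at = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            cached_at = None  # unparseable header: fall through to a full response
        if cached_at is not None:
            if cached_at.tzinfo is None:
                cached_at = cached_at.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= cached_at:
                return 304, headers, b""
    return 200, headers, representation


changed = datetime(2007, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
first_status, first_headers, _ = respond(changed, b"<html/>")
second_status, _, second_body = respond(changed, b"<html/>", first_headers["Last-Modified"])
```

The first request gets the full representation; the second, which echoes the Last-Modified value, gets 304 and an empty body, so the client reuses its cached copy.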

9

Consider error conditions: what might go wrong?

I also need to plan for requests I can’t fulfill. When I hit an error condition I’ll send a response code in the 3xx, 4xx, or 5xx range, and I may provide supplementary data in HTTP headers. If I provide an entity-body, it’ll be a document describing an error condition, not a representation of the requested resource (which, after all, couldn’t be served).
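As a sketch of what this planning might produce, the function below maps two plausible error conditions to response codes. The specific choices (404 for an unknown planet, 400 for malformed or out-of-range coordinates), the planet set, and the function name are assumptions made for illustration.

```python
PLANETS = {"Earth", "Mars", "Venus"}  # hypothetical data set


def get_point(planet, coordinates):
    """Return (status, body) for a request like /{planet}/{latitude},{longitude}."""
    if planet not in PLANETS:
        return 404, f"No planet named {planet!r}."
    try:
        latitude_text, longitude_text = coordinates.split(",")
        latitude, longitude = float(latitude_text), float(longitude_text)
    except ValueError:
        return 400, "Coordinates must look like 'latitude,longitude'."
    if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
        return 400, "Latitude or longitude is out of range."
    return 200, f"{latitude},{longitude} on {planet}"


print(get_point("Pluto", "0,0"))
print(get_point("Earth", "north"))
print(get_point("Earth", "24.9195,17.821"))
```

In each error case the body describes the problem rather than representing the requested resource.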

 

Conclusion

 

I’ve now got a design for a map web service that’s simple enough for a client to use without a lot of up-front investment, and useful enough to be the driver for any number of useful programs. It’s so closely attuned to the philosophy of the Web that you can use it with a web browser. It’s RESTful and resource-oriented. It’s addressable, stateless, and well connected.

 

It’s also read-only. It assumes that my clients have nothing to offer but their insatiable appetites for my data. Lots of existing web services work this way, but read-only web services are only half the story. In the next chapter I’ll show you how clients can use HTTP’s uniform interface to create new resources of their own.

 

Chapter 6 - Designing Read/Write Resource-Oriented Services

 

User Accounts as Resources

 

If I’m going to let anyone with a web service client annotate our worlds, I need some way of distinguishing his custom places from the standard places in my database. I’ll also need a way to distinguish one user’s places from everyone else’s places. Basically, I need user accounts.

 

Why Should User Accounts Be Resources?

 

Reason

Description

#1

If I tried to cover the typical method of getting a user account, I’d end up skimming the details as not relevant to web services. Treating user accounts as read/write resources means I can demonstrate the new resource-oriented design procedure on a data structure you’re probably familiar with.

#2

I want to show that new possibilities open up when you treat everyday data structures as resources, subject to the uniform interface. Consider an Internet-connected GPS device that ties into my map service. Every hour or so, it annotates Earth (as exposed through my web service) with its current position, creating a record of where the GPS device is over time. There will be thousands of these devices, and each one should only be able to see its own annotations. The person in charge of programming the device should not be limited to creating a single user account for personal use. Nor should everyone who buys the device have to go to my web site and fill out a form before they can use the device they bought. Since user accounts are resources, every one of these devices can have its own account on my web service (possibly with a username based on the serial number), and these accounts can be created automatically. They might be created in batches as the devices are manufactured, or each one may create an account for itself when its owner first turns it on. The end users may never know that they’re using a web service, and they’ll never have to sign up for a key. The device programmer does need to know how our web service works, and needs to write software that can create user accounts. If user accounts are resources, it’s obvious how the device programmer can do this. HTTP’s uniform interface gives most of the answers ahead of time.

 

Authentication, Authorization, Privacy, and Trust

 

Activity

Description

Authentication

is the problem of tying a request to a user. If you want to name a new place on Mars, I need some way of knowing that the new place should be associated with your user account instead of someone else’s.

Authorization

is the problem of determining which requests to let through for a given user. There are some HTTP requests I’d accept from user A but reject from user B: requests like “DELETE user A” or “GET all of user A’s private places.” In my service, if you authenticate as user A, you’re allowed to manipulate user A’s account, but not anyone else’s.

Trust

 

 

Basics:  When a web service client makes an HTTP request, it may include some credentials in the Authorization HTTP header. The service examines the credentials, and decides whether they correctly identify the client as a particular user (authentication), and whether that user is actually allowed to do what the client is trying to do (authorization). If both conditions are met, the server carries out the request. If the credentials are missing, invalid, or not good enough to provide authorization, then the server sends a response code of 401 (“Unauthorized”). It sets the WWW-Authenticate response header with instructions about how to send correct credentials in the future.
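A minimal Ruby sketch of that exchange, under assumptions of my own: the ACCOUNTS table, the realm name "maps", and both helper names are hypothetical, and a real service would store password hashes rather than plain-text passwords. It follows the text above in answering every failure with 401.

```ruby
# Hypothetical in-memory account table: username => password.
ACCOUNTS = { "leonardr" => "secret" }.freeze

# Client side: build the value of the Authorization header for HTTP Basic.
# pack("m0") is Base64 encoding without line breaks.
def basic_authorization(username, password)
  "Basic " + ["#{username}:#{password}"].pack("m0")
end

# Server side: authenticate the request, then authorize it against the
# account it targets. Returns [status, headers].
def check_request(authorization_header, target_user)
  challenge = { "WWW-Authenticate" => 'Basic realm="maps"' }
  header = authorization_header.to_s
  return [401, challenge] unless header.start_with?("Basic ")

  username, password = header.sub("Basic ", "").unpack1("m").split(":", 2)

  # Authentication: do the credentials identify a known user?
  return [401, challenge] unless ACCOUNTS[username] && ACCOUNTS[username] == password
  # Authorization: may this user act on the target account?
  return [401, challenge] unless username == target_user

  [200, {}]
end
```

Sent over plain HTTP, the header built by basic_authorization can be decoded by anyone who sees it, which is why the notes below pair HTTP Basic with HTTPS.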

 

Authentication

Description

HTTP Basic

Using HTTPS instead of HTTP prevents other computers from eavesdropping on the conversation between client and server. This is especially important when using HTTP Basic authentication, since that authentication mechanism involves the client sending its credentials in plain text.

HTTP Digest

 

WSSE

 

Other

Amazon S3 implements authentication with a sophisticated request signing mechanism

 

The difference between the human web versus the machine web in terms of security

 

Human Web

Machine Web

Whenever you log in to a web site, you’re trusting your web browser to send your username and password to that web site, and nowhere else. Why do you trust your browser with that information? How do you know your browser doesn’t have a secret backdoor that broadcasts everything you type to some seedy IRC channel? There are several possible answers. You might be using an open source browser like Firefox, which has good source control and a lot of people looking at the source code. You might say there’s safety in numbers: that millions of people use your brand of browser and there haven’t been any problems traceable to the browser itself. You might monitor your network traffic to make sure your browser is only sending the data you tell it to send. But most people just take it on faith that their web browser is trustworthy.

I send you a cool new web service client for managing your del.icio.us bookmarks. Do you trust that client with your del.icio.us username and password? Do you trust it as much as you trust your web browser with the same information? Hopefully not! No web service client is as popular as a web browser, and no web service client has as many eyes on the source code. On the human web, we usually ignore the problem by taking a leap of faith and trusting our web browsers. On the programmable web the problem is more obvious. We don’t necessarily trust our own clients with our authentication credentials.

 

This is a problem. There’s nothing in the HTTP standard to deal with it, because it’s a problem between the end user and the client, while HTTP lives between the client and the server. Solving it requires forgoing all the standard ways of sending authentication information: Basic, Digest, and WSSE don’t work because they require the client to know the credentials. (You can solve it with Digest or WSSE by having a tiny, trusted account manager send encrypted authentication strings to the actual, untrusted client. I don’t know of any web service clients that use this architecture.)

 

Note:  Google, Amazon, eBay, and Flickr have come up with ways for a client to make web service requests without knowing the actual authentication credentials. You saw a hint of this in Chapter 3: I showed how to sign an Amazon S3 request and give a special URI to someone else, which they could use without knowing your password.
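To make the idea of a signed URI concrete, here is a generic sketch in Ruby. It is not Amazon S3's actual signing algorithm: the string being signed, the query parameter names (expires, signature), and the function names are all assumptions chosen for illustration. The point is only that whoever holds the URI can use it until it expires without ever seeing the secret.

```ruby
require 'openssl'
require 'cgi'

# Sign "METHOD\nPATH\nEXPIRES" with a shared secret using HMAC-SHA256.
# (Hypothetical string-to-sign; real services define their own.)
def signature_for(secret, http_method, path, expires)
  data = "#{http_method}\n#{path}\n#{expires}"
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, data)
end

# Owner side: build a URI that can be handed to someone else.
def signed_uri(base, path, secret, expires)
  sig = signature_for(secret, "GET", path, expires)
  "#{base}#{path}?expires=#{expires}&signature=#{CGI.escape(sig)}"
end

# Server side: recompute the signature and compare it with the one received.
# A production service would use a constant-time comparison here.
def valid_signature?(secret, http_method, path, expires, signature, now = Time.now.to_i)
  return false if expires.to_i < now
  signature_for(secret, http_method, path, expires) == signature
end
```

The untrusted client only ever holds the signature, which is useless for any other path or after the expiry time.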

 

Turning Requirements into Read/Write Resources  (User/Password)

 

Step

Name

Description

1

Figure Out the Data Set

Most sites with user accounts try to associate personal information with your account, like your name or email address. I don’t care about any of that. In my map service, there are only two pieces of information associated with a user account:

• The name of the account

• A password used to access the account

Each user account also has some subordinate resources (custom places on planets) associated with it, but I’ll figure that part out later. All I need for now is a way of identifying specific user accounts (a username), and a way for a client to present credentials that tie them to a certain user account (a password). Since I don’t track any personal information, there’s no reason apart from tradition to even call this a “user account.” I could call it a “password-protected set of annotations.” But I’ll stick to the traditional terminology. This makes it easier to visualize the service, and easier for you to come up with your own enhancements to the user account system.

2

Split the Data Set into Resources

Here the data set is fairly constrained: “user accounts.” I’ll expose each user account as a resource. In terms of the Chapter 5 terminology, these new resources are resources of the second type. They’re the portals through which my service exposes its underlying user objects. Another site might also expose the list of user accounts itself as a one-off resource, or expose algorithmic resources that let a client search the list of users. I won’t bother.

 

Name the Resources with URIs

I’ll expose a user account with a URI of the following form: https://maps.example.com/user/{user-name}.

 

Expose a Subset of the Uniform Interface

By definition, read-only resources are the ones that expose no more than the HTTP methods GET, HEAD, and OPTIONS. Now that I’m designing resources that can be created and modified at runtime, I also have PUT, POST, and DELETE to consider.

NOTE:  If you find yourself wishing there were more HTTP methods, the first thing to do is go back to step two, and try to split up your data set so you have more kinds of resources. Only if this fails should you consider introducing an element of the RPC style by making a particular resource support overloaded POST.

Example:  if you have resources for “readers,” and resources for “published columns,” and you start thinking “it sure would be nice if there was a SUBSCRIBE method in HTTP,” the best thing to do is to create a new kind of resource: the “subscription.”

I decide which bits of the uniform interface to expose by asking questions about intended usage:
 

Question

Answer

Implication

Will clients be creating new resources of this type?

Of course they will. There’s no other way for users to get on the system.

 

When the client creates a new resource of this type, who’s in charge of determining the new resource’s URI?  Is it the client or the server?

The client is in charge, since the URI is made up entirely of constant strings (https://maps.example.com/user/) and variables under the client’s control ({user-name}).

a client will send a PUT request to the account’s URI

Will clients be modifying resources of this type?

Yes. It’s questionable whether or not a user should be allowed to change his username (I’m not going to allow it, for simplicity’s sake), but a user should always be allowed to change his password.

 

Will clients be deleting resources of this type?

Yes. You can delete an account when you’re done with it.

 

Will clients be fetching representations of resources of this type?

Right now there’s not much information associated with a user account: only the username, which is part of the URI, and the password, which I won’t be giving out.

I’m going to say yes, which means I will be exposing GET and HEAD on user account resources. If nothing else, clients will want to see whether or not their desired username already exists. And once I allow users to define custom places, clients will want to look at the public places defined by specific users.
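The answers in the table above add up to a small, fixed set of methods for a user account resource: PUT to create or modify, DELETE, and GET/HEAD. A sketch of how a server might enforce that subset follows. The constant and function names are hypothetical, and answering 405 ("Method Not Allowed") with an Allow header is standard HTTP behavior rather than something these notes spell out.

```ruby
# The subset of the uniform interface a user account resource exposes.
USER_ACCOUNT_METHODS = %w[GET HEAD PUT DELETE].freeze

# Returns [status, headers]: 200 as a placeholder when the method is
# exposed, otherwise 405 plus the list of methods the resource does support.
def check_method(http_method)
  if USER_ACCOUNT_METHODS.include?(http_method.to_s.upcase)
    [200, {}]
  else
    [405, { "Allow" => USER_ACCOUNT_METHODS.join(", ") }]
  end
end
```

POST is rejected here because the client, not the server, chooses the new account's URI, so creation goes through PUT.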

 

 

 

Design the Representation(s) Accepted from the Client

    My data set comes with no built-in user accounts: every one is created by some client. The obvious next step in this design is to specify how the client is supposed to create a user account.

     

    Example - AWS S3:

  1. Creating a bucket - A client creates an S3 bucket by sending an empty PUT request to the URI of the bucket.
  2. Create an object inside a bucket - To create an S3 object inside a bucket takes a little more work. An S3 object has two bits of state: name and value. The name goes into the URI, the destination of the PUT request. But the value needs to go into the entity-body of the PUT request. S3 will accept any data at all in this entity-body, because the whole point is that the value of an S3 object can be anything, but there needs to be something there: you can’t have an empty object.

    What format makes it easiest for the client to convey a password to the server?

    When the state is complex, it’s helpful for the server to accept the same representation format it sends. The client can request a representation with GET, modify the representation, and then PUT it back, committing its changes to the underlying resource state. As we’ll see in Chapter 9, the Atom Publishing Protocol uses this technique effectively.

    And, of course, S3 serves the representation of an object byte for byte the way it was when the client first PUT it into the system. S3 doesn’t even pretend to know anything about the meaning of the representations it serves. Here, I’ve only got one item of state (the password), and it’s not one that the server will ever send to the client. Now’s a good time to introduce a representation format for simple cases like these.

     

    Form-encoding

     

    This representation doesn’t have an official name beyond its media type (application/x-www-form-urlencoded), but you’ve probably seen it before. It’s sometimes called “CGI escaping.” When you submit an HTML form in your web browser, this is the format the browser uses to marshal the form data into something that can go into an HTTP request.

    Example (simple HTML form):

    <form action="http://www.example.com/color-pairs" method="POST">

    <input name="color1" type="text"/>

    <input name="color2" type="text"/>

    </form>

     

    Assumption:  If the user enters the values “blue” and “green” in the text fields color1 and color2, a form-encoded representation of that data would be color1=blue&color2=green.

     

     

    Action

    HTTP

    POST

    When the form is submitted with method="POST", the browser makes a POST request to http://www.example.com/color-pairs, and sends color1=blue&color2=green in the entity-body: that’s a representation.

    GET

    If the form’s method attribute were GET instead, then when the user submitted the form the browser would make a GET request to http://www.example.com/color-pairs?color1=blue&color2=green. That’s got the same data in the same format, but there the data is scoping information that identifies a resource, not a representation.

     

    When an object’s state can be represented as key-value pairs, form-encoding is the simplest representation format. Almost every programming language has built-in facilities for doing form-encoding and -unencoding: they’re usually located in the language’s HTTP or CGI library.
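    In Ruby, for instance, the standard uri library can do the encoding and decoding. This short sketch uses only stdlib calls; the variable names are mine, and the URI is the one from the form example above.

```ruby
require 'uri'

# Encode key-value pairs as application/x-www-form-urlencoded.
body = URI.encode_www_form("color1" => "blue", "color2" => "green")
# body is now "color1=blue&color2=green"

# Decode the same string back into a hash.
pairs = URI.decode_www_form(body).to_h

# The identical string used as scoping information in a GET request URI.
get_uri = "http://www.example.com/color-pairs?" + body

# Reserved characters in a value are percent-encoded, so a password
# containing "&" or "=" cannot be mistaken for a separator.
tricky = URI.encode_www_form("password" => "p&ss=word")
```

    The last line matters for the user account service: only the value is escaped, while the "=" between key and value stays literal.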

     

    Example 6-2. Hypothetical map client to create a user account

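    require 'rubygems'
    require 'rest-open-uri'
    require 'cgi'
    require 'erb'

    # Create (or update) a user account by PUTting a form-encoded password
    # to the account's URI.
    def make_user(username, password)
      # Percent-encode the username for use as a path segment
      # (URI.escape was removed in Ruby 3.0).
      open("https://maps.example.com/user/#{ERB::Util.url_encode(username)}",
           # Escape only the value, so the "=" separator stays literal.
           :data => "password=#{CGI.escape(password)}", :method => :put)
    end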

     

     



 

Design the Representation(s) to Be Served to the Client

(couldn't yet understand)

 

Link This Resource to Existing Resources

(couldn't yet understand)

 

What’s Supposed to Happen?

    Let’s consider what might happen when a client sends a PUT request to /user/leonardr. As is usual with HTTP, the server reads this request, takes some action behind the scenes, and serves a response. I need to decide which numeric response code the response will have, and what HTTP headers and/or entity-body will be provided. I also need to decide how the request will affect resource state: that is, what real-world effects it will have.

     

    What happens

    Return Code

    No user “leonardr”

    The service creates one with the specified password. The response code is 201 (“Created”), and the Location header contains the URI of the newly created user.

    Account already exists

    the resource state is modified to bring it in line with the client’s proposed new representation. That is, the account’s password is modified. In this case the response code may be 200 (“OK”), and the response entity-body may contain a representation of the user account. Or, since the password change never affects the representation, the response code may be 205 (“Reset Content”) and the response entity-body may be omitted altogether.

     

    IMPORTANT NOTE: PUT requests are the only complicated ones, because they’re the only ones that include a representation. GET and DELETE requests work exactly according to the uniform interface. A successful GET request has a response code of 200 (“OK”) and a representation in the entity-body. A successful DELETE request also has a response code of 200 (“OK”). The server can send an entity-body in response to a successful DELETE, but it would probably contain just a status message: there’s no longer a resource to send a representation of.
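    The success paths described above can be sketched with an in-memory store. This is an illustration, not the book's implementation: the AccountStore class, the "user: name" representation, and the 404 for a missing account are my assumptions, and authentication is left out to keep the response codes in focus.

```ruby
# Hypothetical in-memory store of user accounts.
class AccountStore
  BASE = "https://maps.example.com/user/"

  def initialize
    @passwords = {}
  end

  # PUT: create the account (201 + Location) or change its password (200).
  # Each method returns [status, headers, body].
  def put(username, password)
    created = !@passwords.key?(username)
    @passwords[username] = password
    if created
      [201, { "Location" => BASE + username }, ""]
    else
      [200, {}, "user: #{username}"]
    end
  end

  # GET: serve a representation; the password is never part of it.
  def get(username)
    return [404, {}, ""] unless @passwords.key?(username)
    [200, {}, "user: #{username}"]
  end

  # DELETE: remove the account and send back a short status message.
  def delete(username)
    return [404, {}, ""] unless @passwords.delete(username)
    [200, {}, "deleted"]
  end
end
```

    Sending the same PUT twice leaves the account in the same state, which is what makes PUT safe to retry.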

 

What Might Go Wrong?

A request that creates, modifies, or deletes a resource has more failure conditions than one that just retrieves a representation. Here are a few of the error conditions for this new resource.

 

Error

Description

415

The client’s representation might be unintelligible to the server. My server expects a representation in form-encoded format; the client might send an XML document instead. The status code in this case is 415 (“Unsupported Media Type”).

400

The client might not have provided a representation at all. Or it might have provided a form-encoded representation that’s ill-formed or full of nonsense data. The status code in this case is 400 (“Bad Request”).

400 or 409

Maybe the representation makes sense but it tells the server to put the resource into an inconsistent or impossible state. Perhaps the representation is “password=”, and I don’t allow accounts with empty passwords. The exact status code depends on the error; in the case of the empty password it would probably be 400 (“Bad Request”). In another situation it might be 409 (“Conflict”).

401

Maybe the client sends the wrong credentials, sends authorization credentials for a totally different user account, or doesn’t send the Authorization header at all. A client can only modify or delete a user if it provides that user’s credentials. The response code in this case is 401 (“Unauthorized”), and I’ll set the WWW-Authenticate header with instructions to the client, giving a clue about how to format the Authorization header according to the rules of HTTP Basic authentication.
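The error conditions in the table can be sketched as a chain of checks on an incoming PUT. The function name and its parameters are hypothetical, the order of the checks simply follows the table (a real service might well check credentials first), and credentials are only demanded when the account already exists, since creating an account requires none.

```ruby
require 'uri'

FORM_TYPE = "application/x-www-form-urlencoded"

# Returns the status code for a PUT to a user account resource.
def put_status(content_type, body, account_exists, credentials_ok)
  # 415: the representation is in a format the server doesn't understand.
  media_type = content_type.to_s.split(";").first.to_s.strip
  return 415 unless media_type == FORM_TYPE

  # 400: no representation at all, or one that cannot be parsed.
  return 400 if body.nil? || body.empty?
  begin
    fields = URI.decode_www_form(body).to_h
  rescue ArgumentError
    return 400
  end

  # 400: well-formed, but it asks for an impossible state (empty password).
  return 400 if fields["password"].to_s.empty?

  # 401: modifying an existing account requires that account's credentials.
  return 401 if account_exists && !credentials_ok

  account_exists ? 200 : 201
end
```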

 

 

 

Note:  the author then moves on to designing read/write resources for users’ custom places.

 

CHAPTER 7 - A Service Implementation

 

Chapter Goal

Description

how to make a RESTful, resource-oriented service out of an existing RPC-style service

 

sort of tradeoffs you might need to make to get a design that works within your chosen framework

 

the complete code to a nontrivial web service, without boring you with page after page of implementation details

 

 

Analysis of the del.icio.us REST-RPC Hybrid

 

Entity

Analysis

User Account

Unlike an S3 bucket, or a user account on my map service, a del.icio.us user account is not just a named list of subordinate resources. It’s got state of its own. A del.icio.us account has a username and password, but it’s supposed to correspond to a particular person, and it also tracks that person’s full name and email address. A user account also has a list of subordinate resources: the user’s bookmarks. All this state can be fetched and manipulated through HTTP.

Bookmark

    A bookmark belongs to a user and has six pieces of state:

  1. a URI
  2. a short description
  3. a long description
  4. a timestamp
  5. a collection of tags
  6. a flag that says whether or not it’s public

    Note: the previous chapter’s “custom place” resource has a similar flag. The client is in charge of specifying all of this information for each bookmark, though the URI and the short description are the only required pieces of state.

 

Very important note:   The URIs in users’ bookmarks are the most interesting part of the data set. When you put a bunch of people’s bookmarks together, you find that the URIs have emergent properties. On del.icio.us these properties include:

• “Newness,” a measure of how recently someone bookmarked a particular URI

• “Popularity,” a measure of how many people have bookmarked that URI

• The “tag cloud,” a generated vocabulary for the URI, based on which tags people tend to use to describe the URI

The del.icio.us web site also exposes a recommendation engine that relates URIs to each other, using a secret algorithm.

 

Resource Design

 

Important Note: In Chapters 5 and 6 I had a lot of leeway in turning my imaginary data set into resources. The idea for my map service came from the Google Maps application with its image tiles, but I took it off in another direction. I added user accounts, custom places, and other features not found in any existing map service. This chapter works differently. I’m focusing on translating the ideas of del.icio.us into the Resource-Oriented Architecture.

 

The del.icio.us web service is a REST-RPC hybrid service, described in English prose at http://del.icio.us/help/api/. The web service itself is rooted at https://api.del.icio.us/v1/.

 

The service exposes three RPC-style APIs, rooted at the relative URIs posts/, tags/, and bundles/. Beneath these URIs the web service exposes a total of twelve RPC functions that can be invoked through HTTP GET. I need to define RESTful resources that can expose at least the functionality of these three APIs:

 

API

Description

RPC

Description

posts/

Lets the user fetch and manage her bookmark posts to del.icio.us

posts/get

Search your posts by tag or date, or search for a specific bookmarked URI.

 

 

posts/recent

Fetch the n most recent posts by the authenticated user. The client may apply a tag filter: “fetch the n most recent posts that the authenticated user tagged with tag t”

 

 

posts/dates

Fetch the number of posts by the authenticated user for each day: perhaps five posts on the 12th, two on the 15th, and so on. The client may apply a tag filter here, too.

 

 

posts/all

Fetch all posts for the authenticated user, ever. The client may apply a tag filter.

 

 

posts/update

Check when the authenticated user last posted a bookmark. Clients are supposed to check this before deciding to call the expensive posts/all.

 

 

posts/add

Create a bookmark for a URI. The client must specify a short description. It may choose to specify a long description, a set of tags, and a timestamp. A bookmark may be public or private (the default is public). A client may not bookmark the same URI more than once: calling posts/add again overwrites the old post with new information.

 

 

posts/delete

Deletes a user’s post for a particular URI.

tags/

Lets the authenticated user manage her tags separately from the bookmarks that use the tags

tags/get

Fetch a list of tags used by the authenticated user.

 

 

tags/rename

Rename one of the authenticated user’s tags. All posts tagged with the old name will now be tagged with the new name instead.

bundles/

Lets the authenticated user group similar tags together

tags/bundles/all

Fetch the user’s bundles. The resulting document lists the bundles, and each bundle lists the tags it contains.

 

 

tags/bundles/set

Group several tags together into a (possibly new) bundle.

 

 

tags/bundles/delete

Delete a bundle.

 

 

Now that I know what the service has to do, arranging the features into resources is like working a logic puzzle. I want to expose as few kinds of resources as possible. But one kind of resource can only convey one concept, so sometimes I need to split a single feature across two kinds of resource. On the other hand, sometimes I can combine multiple RPC functions into one kind of resource, a resource that responds to several methods of HTTP’s uniform interface.

 

REST in Rails

 

I’m not designing these resources in a vacuum: I’m going to implement them in a Rails application. It’s worth taking a brief look at how RESTful applications work in Rails. Unlike some other frameworks, Rails doesn’t let you define your resources directly. Instead, it divides up an application’s functionality into controllers: it’s the controllers that expose the resources.

 

Important Note: In Rails, two patterns show up all the time: a list, and an item in the list. Every database table is a list that contains items. Anything that can be represented as an RSS or Atom feed is a list that contains items. Rails defines a RESTful architecture that makes a simplifying assumption: every resource you expose can be made to fit one of these two patterns. This makes things easy most of the time, but the cost is aggravation when you try to use Rails controllers to expose resources that don’t fit this simple model.

 

The User Controller

 

Now I’m going to go back to that big list of RPC functions I found in the del.icio.us API and web site, and try to tease some Rails controllers out of it. One obvious controller is one that exposes information about user accounts. In Rails, this would be a class called UsersController. As soon as I say that, a lot of decisions are made for me. Rails sets up a path of least resistance that looks like this:

 

The user controller exposes a one-off “user list” resource, at the URI /users. It also exposes a resource for every user on the system, at a URI that incorporates the user’s database ID: /users/52 and the like. These resources expose some subset of HTTP’s uniform interface. Which subset? Rails defines this with a programming-language interface in the superclass of all controller classes: ActionController::Base.

 

Operation

HTTP action

Rails method

List the users

GET /users

UsersController#index

Create a user

POST /users

UsersController#create

View a user

GET /users/52

UsersController#show

Modify a user

PUT /users/52

UsersController#update

Delete a user

DELETE /users/52

UsersController#destroy
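The table above is essentially a lookup from (HTTP method, URI pattern) to a controller method. The following plain-Ruby sketch imitates that lookup; it is not how Rails implements routing, and the ROUTES constant and route function are hypothetical names used only to show the shape of the mapping.

```ruby
# Each entry: HTTP method, URI pattern, controller action.
ROUTES = [
  ["GET",    %r{\A/users\z},     "UsersController#index"],
  ["POST",   %r{\A/users\z},     "UsersController#create"],
  ["GET",    %r{\A/users/\d+\z}, "UsersController#show"],
  ["PUT",    %r{\A/users/\d+\z}, "UsersController#update"],
  ["DELETE", %r{\A/users/\d+\z}, "UsersController#destroy"]
].freeze

# Returns the action name for a request, or nil when nothing matches.
def route(http_method, path)
  entry = ROUTES.find do |verb, pattern, _action|
    verb == http_method && pattern.match?(path)
  end
  entry && entry[2]
end
```

Two URI patterns (the list and an item in the list) combined with four HTTP methods cover all five operations, which is the simplifying assumption described earlier.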

 

Important note about Rails:  the URIs like /users/52 look ugly. They certainly don’t look like http://del.icio.us/leonardr, the URI to my corresponding page on del.icio.us. This URI format is the Rails default because every object in a Rails application’s database can be uniquely identified by its table (“users”) and its ID (“52”). This URI might go away (if user 52 DELETEs her account), but it will never change, because database unique IDs don’t change.

 

The Bookmarks Controller

 

Each user account has a number of subordinate resources associated with it: the user’s bookmarks. I’m going to expose these resources through a second controller class, rooted beneath the “user account” resource.

 

The base URI of this controller will be /users/{username}/bookmarks. Like the users controller, the bookmarks controller exposes two types of resource: a one-off resource for the list of a user’s bookmarks, and one resource for each individual bookmark.

 

Rails wants to expose an individual bookmark under the URI /users/{username}/bookmarks/{database-id}. I don’t like this any more than I like /users/{database-id}. I’d like the URI to a bookmark to have some visible relationship to the URI that got bookmarked. My original plan was to incorporate the target URI in the URI to the bookmark. That way if I bookmarked http://www.oreilly.com/, the bookmark resource would be available at /v1/users/leonardr/bookmarks/http://www.oreilly.com/. Lots of services work this way, including the W3C’s HTML validator (http://validator.w3.org/). Looking at one of these URIs you can easily tell who bookmarked what. Rails didn’t like this URI format, though, and after trying some hacks I decided to get back on Rails’s path of least resistance. Instead of embedding external URIs in my resource URIs, I’m going to put the URI through a one-way hash function and embed the hashed string instead.

 

The User Tags Controller

 

Bookmarks aren’t the only type of resource that conceptually fits “beneath” a user account. There’s also the user’s tag vocabulary. I’m not talking about tags in general here: I’m asking questions about which tags a particular user likes to use. These questions are handled by the user tags controller.

This controller is rooted at /users/{username}/tags. That’s the “user tag list” resource.

 

It’s an algorithmic resource, generated from the tags a user uses to talk about her bookmarks. This resource corresponds roughly to the del.icio.us tags/get function. It’s a read-only resource: a user can’t modify her vocabulary directly, only by changing the way she uses tags in bookmarks.

 

The resources at /users/{username}/tags/{tag} talk about the user’s use of a specific tag. My representation will show which bookmarks a user has filed under a particular tag. This class of resource corresponds to the /{username}/{tag} “function” from the web site. It also incorporates some features of the del.icio.us API functions posts/get, posts/recent, and posts/all.

 

The “tag” resources are also algorithmic, but they’re not strictly read-only. A user can’t delete a tag except by removing it from all of her bookmarks, but I do want to let users rename tags. (Tag deletion is a plausible feature, but I’m not implementing it because, again, del.icio.us doesn’t have it.) So each user-tag resource will expose PUT for clients who want to rename that tag.

 

Instead of PUT, I could have used overloaded POST to define a one-off “rename” method like the del.icio.us API’s tags/rename. I didn’t, because that’s RPC-style thinking. The PUT method suffices to convey any state change, whether it’s a rename or something else. There’s a subtle difference between renaming the tag and changing its state so the name is different, but it’s the difference between an RPC-style interface and a uniform, RESTful one. It’s less work to program a computer to understand a generic “change the state” than to program it to understand “rename a tag.”
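Here is roughly what a rename-by-PUT request could look like from a Ruby client, built with the standard net/http library. The host bookmarks.example.com and the form field "name" are assumptions of mine (these notes don't specify the representation), and the sketch assumes usernames and tags that need no escaping. The function only builds the request; it doesn't send it.

```ruby
require 'net/http'
require 'uri'

# Build a PUT request whose representation carries the tag's new name.
def rename_tag_request(username, password, old_tag, new_tag)
  uri = URI("https://bookmarks.example.com/users/#{username}/tags/#{old_tag}")
  request = Net::HTTP::Put.new(uri)
  request.basic_auth(username, password)    # sets the Authorization header
  request.set_form_data("name" => new_tag)  # form-encoded entity-body
  request
end
```

To actually send it, something like Net::HTTP.start(host, 443, use_ssl: true) { |http| http.request(request) } would do. Nothing in the URI or the method says "rename": the request just proposes a new state for the tag resource.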

 

The Calendar Controller

 

A user’s posting history—her calendar— is handled by one more controller that lives “underneath” a user account resource. The posting history is another algorithmically generated, read-only resource: you can’t change your posting history except by posting. The controller’s root URI is /users/{username}/calendar, and it corresponds to the del.icio.us API’s posts/dates function.

 

I’ll also expose a variety of subresources, one for each tag in a user’s vocabulary. These resources will give a user’s posting history when only one tag is considered. These resources correspond to the del.icio.us API’s posts/dates function with a tag filter applied. Both kinds of resource, posting history and filtered posting history, will expose only GET.
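Because the calendar is an algorithmic resource, its state can be computed from the bookmarks on demand. A sketch of that computation follows, assuming (hypothetically) that each bookmark is a hash with :posted (a Date) and :tags (an array of strings); the function name is mine.

```ruby
require 'date'

# Count a user's posts per day, optionally considering only one tag.
# Returns a hash such as { "2007-05-12" => 5, "2007-05-15" => 2 }.
def posting_history(bookmarks, tag = nil)
  selected = tag ? bookmarks.select { |b| b[:tags].include?(tag) } : bookmarks
  selected.each_with_object(Hash.new(0)) do |bookmark, counts|
    counts[bookmark[:posted].to_s] += 1
  end
end
```

Calling posting_history(bookmarks) would back /users/{username}/calendar, and posting_history(bookmarks, "ruby") would back the filtered subresource for the tag "ruby".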

 

The URI Controller

 

I mentioned earlier that URIs in a social bookmarking system have emergent properties. The URI controller gives access to some of those properties. It’s rooted at /uris/, and it exposes URIs as resources independent from the users who bookmark them. I’m not exposing this controller’s root URI as a resource, though I could. The logical thing to put there would be a huge list of all URIs known to the application. But again, the site I’m taking for my model doesn’t have any feature like that. Instead, I’m exposing a series of resources at /uris/{URI-MD5}: one resource for each URI known to the application.

 

The URI format is the same as /users/{username}/bookmarks/{URI-MD5} in the user bookmark controller: calculate the MD5 hash of the target URI and stick it onto the end of the controller’s base URI.
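In Ruby the hashing step is a single call to the standard digest library. The helper names below are hypothetical; the URI layouts are the ones given above.

```ruby
require 'digest/md5'

# URI path of one user's bookmark for a target URI.
def bookmark_path(username, target_uri)
  "/users/#{username}/bookmarks/#{Digest::MD5.hexdigest(target_uri)}"
end

# URI path of the application-wide resource for a target URI.
def uri_path(target_uri)
  "/uris/#{Digest::MD5.hexdigest(target_uri)}"
end
```

Both helpers produce the same 32-character hexadecimal suffix for the same target URI, so a client that knows what it bookmarked can compute either path without asking the server.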

 

These resources expose the application’s knowledge about a specific URI, such as which users have bookmarked it. This corresponds to the /url/{URI-MD5} “function” on the del.icio.us web site.

 

user accounts

 

Operation

RESTful

Del.icio.us

Create a user account

POST /users

POST /register (via web site)

View a user account

GET /users/{username}

GET /users/{username} (via web site)

Modify a user account

PUT /users/{username}

 Various, via web site

Delete a user account

DELETE /users/{username}

 POST /settings/{username}/profile/delete (via web site)

 

Bookmark Management

 

Operation

RESTful

Del.icio.us

Post a bookmark

 POST /users/{username}/bookmarks

GET /posts/add

Fetch a bookmark

GET /users/{username}/bookmarks/{URI-MD5}

GET /posts/get

Modify a bookmark

PUT /users/{username}/bookmarks/{URI-MD5}

GET /posts/add

Delete a bookmark

DELETE /users/{username}/bookmarks/{URI-MD5}

GET /posts/delete

See when the user last posted a bookmark

Use conditional HTTP GET

GET /posts/update

Fetch a user’s posting History

GET /users/{username}/calendar

GET /posts/dates (your history only)

Fetch a user’s posting history, filtered by tag

GET /users/{username}/calendar/{tag}

 GET /posts/dates with query string (your history only)
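The "Use conditional HTTP GET" row replaces the posts/update function with standard HTTP machinery: the server reports when the bookmark list last changed, and the client sends that date back on its next request. A server-side sketch, with a hypothetical function name and a placeholder body:

```ruby
require 'time'

# last_posted_at: Time of the user's most recent bookmark.
# if_modified_since: value of the If-Modified-Since request header, or nil.
# Returns [status, headers, body].
def conditional_get(last_posted_at, if_modified_since)
  headers = { "Last-Modified" => last_posted_at.httpdate }
  if if_modified_since
    since = Time.httpdate(if_modified_since)
    # Nothing new since the client's copy: send no body.
    return [304, headers, ""] if last_posted_at.to_i <= since.to_i
  end
  [200, headers, "(full bookmark list goes here)"]
end
```

A client that gets 304 ("Not Modified") knows it can skip the expensive fetch, which is exactly what posts/update was for.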

 

Finding bookmarks

 

Operation

RESTful

Del.icio.us

Fetch a user’s recent bookmarks

GET /users/{username}/bookmarks with query string

GET /posts/recent (your bookmarks only)

Fetch all of a user’s bookmarks

GET /users/{username}/bookmarks

GET /posts/all (your bookmarks only)

Search a user’s bookmarks by date

GET /users/{username}/bookmarks with query string

GET /posts/get with query string (your bookmarks only)

Fetch a user’s bookmarks tagged with a certain tag

GET /users/{username}/bookmarks/{tag}

GET /posts/get with query string (your bookmarks only)

 

social features

 

Operation

RESTful

Del.icio.us

See recently posted bookmarks

GET /recent

GET /recent (via web site)

See recently posted bookmarks for a certain tag

GET /recent/{tag}

 GET /tag/{tag} (via web site)

See which users have bookmarked a certain URI

GET /uris/{URI-MD5}

GET /url/{URI-MD5} (via web site)

 

tags and tag bundles

 

Operation

RESTful

Del.icio.us

Fetch a user’s tag vocabulary

GET /users/{username}/tags

GET /tags/get (your tags only)

Rename a tag

PUT /users/{username}/tags/{tag}

GET /tags/rename

Fetch the list of a user’s tag bundles

GET /users/{username}/bundles

GET /tags/bundles/all (your bundles only)

Group tags into a bundle

POST /users/{username}/bundles

GET /tags/bundles/set

Fetch a bundle

GET /users/{username}/bundles/{bundle}

N/A

Modify a bundle

PUT /users/{username}/bundles/{bundle}

GET /tags/bundles/set

Delete a bundle

DELETE /users/{username}/bundles/{bundle}

GET /tags/bundles/delete

 

I think you’ll agree that the RESTful service is more self-consistent, even accounting for the fact that some of the del.icio.us features come from the web service and some from the web site. The tables above are the best for a straight-up comparison. There you can distinctly see the main advantage of my RESTful service: its use of the HTTP method to remove the operation name from the URI.

 

This lets the URI identify an object in the object-oriented sense. By varying the HTTP method you can perform different operations on the object. Instead of having to understand some number of arbitrarily named functions, you can understand a single class (in the object-oriented sense) whose instances expose a standardized interface.

 

My service also lifts various restrictions found in the del.icio.us web service. Most notably, you can see other people’s public bookmarks. Now, sometimes restrictions are the accidental consequences of bad design, but sometimes they exist for a reason. If I were deploying this service commercially it might turn out that I want to add those limits back in. I might not want user A to have unlimited access to user B’s bookmark list.

I don’t have to change my design to add these limits. I just have to change the authorization component of my service. I make it so that authenticating as userA doesn’t authorize you to fetch userB’s public bookmarks, any more than it authorizes you to delete userB’s account. Or if bandwidth is the problem, I might limit how often any user can perform certain operations. I haven’t changed my resources at all: I’ve just added additional rules about when operations on those resources will succeed.

 

 

Design the Representation(s) Accepted from the Client

 

When a client wants to modify a user account or post a bookmark, how should it convey the resource state to the server? Rails transparently supports two incoming representation formats: form-encoded key-value pairs and the ActiveRecord XML serialization format.

 

Format for conveying resource state to the server

Description

Form-encoded key-value pairs

This format is everywhere in web applications. It’s the q=jellyfish and color1=blue&color2=green you see in query strings on the human web. When a client makes a request that includes the query string color1=blue&color2=green, Rails gives the controller a hash that looks like this: {"color1" => "blue", "color2" => "green"}

ActiveRecord XML serialization format

ActiveRecord is Rails’s object-relational library. It gives a native Ruby interface to the tables and rows in a relational database. In a Rails application, most exposed resources correspond to these ActiveRecord tables and rows. That’s the case for my service: all my users and bookmarks are database rows managed through ActiveRecord.
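The step from a form-encoded string to a hash can be tried outside Rails. This is only a sketch using Ruby's standard `uri` library, not the parser Rails itself uses; `parse_form_encoded` is a name made up for this note.

```ruby
require 'uri'

# Decode a form-encoded string into the kind of hash a controller would see.
# URI.decode_www_form returns [key, value] pairs; to_h turns them into a hash.
def parse_form_encoded(query)
  URI.decode_www_form(query).to_h
end

params = parse_form_encoded("color1=blue&color2=green")
puts params["color1"]
puts params["color2"]
```

Percent-escapes and "+" signs are decoded along the way, so a value such as jelly+fish comes back with a space in it.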

 

Design the Representation(s) Served to the Client

 

Rails makes it easy to serve any number of representation formats, but the simplest to use is the XML representation you get when you call to_xml on an ActiveRecord object.

 

This is a very convenient format to serve from Rails, but it’s got a big problem: it’s not a hypermedia format. A client that gets the user representation in Example 7-4 knows enough to reconstruct the underlying row in the users table (minus the password). But that document says nothing about the relationship between that resource and other resources: the user’s bookmarks, tag vocabulary, or calendar. It doesn’t connect the “user” resource to any other resources. A service that serves only ActiveRecord XML documents isn’t well-connected.

 

Connect Resources to Each Other

 

There are many, many relationships between my resources. Think about the relationship between a user and her bookmarks, between a bookmark and the tags it was posted under, or between a URI and the users who’ve bookmarked it. But a to_xml representation of a resource never links to the URI of another resource, so I can’t show those relationships in my representations. On the other hand, an Atom feed can contain links, and can capture relationships between resources.

 

What Is Supposed to Happen?

Rails exposes every database-backed application using only two resource patterns: lists (the database tables) and list items (the rows in a table). All list resources work pretty much the same way, as do all list item resources. Every “creation” operation follows the same rules and has similar failure conditions, whether the database row being created is a user, a bookmark, or something else. I can consider these rules as a sort of generic control flow, a set of general guidelines for implementing the HTTP interface for list and list item resources. I’ll start defining that control flow here, and pick it up again in Chapter 9.

 

Action

Return Code

Resource is created

201 (“Created”). The Location header should point the way to the resource’s location.

Resource is modified

200 (“OK”). If the resource state changes in a way that changes the URI to the resource (for instance, a user account is renamed), the response code is 301 (“Moved Permanently”) and the Location header should provide the new URI.

Resource is deleted

200 (“OK”)
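The success rules above are mechanical enough to write down as one function. The sketch below is an illustration only: `success_response` and its `action`, `old_uri`, and `new_uri` arguments are invented here and are not part of Rails or of the service's actual code.

```ruby
# Choose the status code and headers for a successful write.
# action is :create, :modify, or :delete; old_uri and new_uri are the
# resource's URI before and after the request (hypothetical parameters).
def success_response(action, old_uri: nil, new_uri: nil)
  case action
  when :create
    # The client needs to learn where the new resource lives.
    [201, { "Location" => new_uri }]
  when :modify
    if new_uri && new_uri != old_uri
      # The change moved the resource (for example, a renamed user account).
      [301, { "Location" => new_uri }]
    else
      [200, {}]
    end
  when :delete
    [200, {}]
  else
    raise ArgumentError, "unknown action: #{action.inspect}"
  end
end

status, headers = success_response(:modify,
                                   old_uri: "/v1/users/leonardr",
                                   new_uri: "/v1/users/leonard")
puts status
puts headers["Location"]
```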

 

 

 

What Might Go Wrong?

 

Action

Return Code

Unauthorized access

401 (“Unauthorized”). I can use this response code any time the client tries to do something (edit a user’s account, rename a tag for a user) without providing the proper Authorization header.

Client tries to create a user account that already exists

From the point of view of the service, this looks like an attempt to modify the existing account without providing any authorization. The response code of 401 (“Unauthorized”) is appropriate, but it might be confusing to the client.

Client tries to rename a user account to a name that already exists

409 (“Conflict”). The requested name conflicts with an existing resource.

Incorrect range values

400 (“Bad Request”)

Client tries to create or modify a resource, but doesn’t provide a valid representation

400 (“Bad Request”)

Client tries to retrieve information about a nonexistent user

404 (“Not Found”)

Client tries to get information about a URI that no one has bookmarked

404 (“Not Found”)
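For quick reference, the failure cases can be collected into a lookup table. This is just one way to organize them; the symbol names and the `status_line` helper are invented for this sketch and do not appear in the service's code.

```ruby
# Illustrative mapping from failure conditions to HTTP status codes.
ERROR_CODES = {
  missing_or_bad_credentials: 401,
  account_already_exists:     401,
  rename_to_taken_name:       409,
  bad_range_values:           400,
  invalid_representation:     400,
  no_such_user:               404,
  uri_never_bookmarked:       404
}.freeze

REASONS = {
  400 => "Bad Request",
  401 => "Unauthorized",
  404 => "Not Found",
  409 => "Conflict"
}.freeze

# Build the text of a status line for a named failure condition.
def status_line(condition)
  code = ERROR_CODES.fetch(condition)
  "#{code} #{REASONS.fetch(code)}"
end

puts status_line(:rename_to_taken_name)
```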

 

 

 

Code

 

Controller Code

 

The controller code converts incoming HTTP requests into specific actions on the database. I’m going to define a base class called ApplicationController, which contains common code, including almost all of the tricky code. Then I’ll define the six controller classes.

 

Note: What Rails doesn’t do

There’s one feature I want for my service that isn’t built into Rails or plugins, and there’s another that goes against Rails’s path of least resistance. I’m going to be implementing these features myself. These two items account for much of the tricky code in the service.

  1. Conditional GET - Wherever possible, a web service should send the response headers Last-Modified and ETag along with a representation. If the client makes future requests for the same resource, it can make its requests conditional on the representation having changed since the last GET. This can save time and bandwidth; see “Conditional GET” in Chapter 8 for more on this topic. There are third-party Rails controllers that let the programmer provide values for Last-Modified and ETag. Core Rails doesn’t do this, and I don’t want to bring in the additional complexity of a third-party controller. I implement a fairly reusable solution for Last-Modified in Example 7-9.
  2. params[:id] for things that aren’t IDs - Rails assumes that resources map to ActiveRecord objects. Specifically, it assumes that the URI to a “list item” resource identifies a row in a database table by ID. For instance, it assumes the client will request the URI /v1/users/4 instead of the more readable URI /v1/users/leonardr.
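The Last-Modified half of conditional GET comes down to one date comparison. The sketch below is not the book's Example 7-9; it is a framework-free illustration, and `conditional_get` is a hypothetical helper that returns a status code and headers.

```ruby
require 'time'

# Decide how to answer a GET, given the resource's modification time and the
# value of the client's If-Modified-Since header (nil if it sent none).
def conditional_get(last_modified, if_modified_since)
  headers = { "Last-Modified" => last_modified.httpdate }
  if if_modified_since
    begin
      since = Time.httpdate(if_modified_since)
      # HTTP dates have one-second resolution, so compare whole seconds.
      return [304, headers] if last_modified.to_i <= since.to_i
    rescue ArgumentError
      # Unparseable header: ignore it and serve the full representation.
    end
  end
  [200, headers]
end

modified = Time.utc(2007, 4, 14, 20, 0, 39)
first_status, first_headers = conditional_get(modified, nil)
second_status, _ = conditional_get(modified, first_headers["Last-Modified"])
puts first_status
puts second_status
```

The first request gets the full representation; the second, which echoes back the Last-Modified value, gets 304 (“Not Modified”) and no entity-body.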

 

The Lesser Controllers

 

Every other controller in my application is read-only. This means it implements at most index and show. Hopefully by now you get the idea behind the controllers and their action methods, so I’ll cover the rest of the controllers briefly.

 

Model Code

 

I’ve also got three “model” classes, corresponding to my three main database tables: User, Bookmark, and Tag. The Tag class is defined entirely through the acts_as_taggable Rails plugin, so I’ve only got to define User and Bookmark.

 

The model classes define validation rules for the database fields. If a client sends bad data (such as trying to create a user without specifying a name), the appropriate validation rule is triggered and the controller method sends the client a response code of 400 (“Bad Request”). The same model classes could be used in a conventional web application, or a GUI application. The validation errors would be displayed differently, but the same rules would always apply. The model classes also define a few methods which work against the database. These methods are used by the controllers.

 

Model

Description

User

This is the simpler of the two models (see Example 7-24). It has some validation rules, a one-to-many relationship with Bookmark objects, and a few methods (called by the controllers) for validating passwords.

Bookmark

This is a more complicated model (see Example 7-25). It defines the relationships between Bookmark and the other model classes, along with some validation rules and a rule for generating the MD5 hash of a URI. We have to store the hash alongside the URI because the MD5 calculation only works in one direction. If a client requests /v1/uris/55020a5384313579a5f11e75c1818b89, we can’t reverse the MD5 calculation. We need to be able to look up a URI by its MD5 hash.
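A minimal sketch of that lookup, assuming an in-memory hash in place of the bookmarks table. `uri_hash` and `uri_by_hash` are names invented here, and the example URIs are placeholders.

```ruby
require 'digest/md5'

# Compute the hex MD5 digest used in /v1/uris/{URI-MD5}.
def uri_hash(uri)
  Digest::MD5.hexdigest(uri)
end

# Stand-in for the database: store each hash next to its URI at posting time,
# because the digest cannot be turned back into the URI later.
uri_by_hash = {}
["http://www.example.com/", "http://www.example.org/feed"].each do |uri|
  uri_by_hash[uri_hash(uri)] = uri
end

requested = uri_hash("http://www.example.com/")
puts uri_by_hash.fetch(requested, "404 Not Found")
```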

 

 

 

What Does the Client Need to Know?

 

The only existing general-purpose web service client is the web browser, and I haven’t provided any HTML forms for creating users or posting bookmarks. Even if I did, that would only take care of situations where the client is under the direct control of a human being.

 

 

Description

Natural-Language Service Description

Publish an English description of the service’s layout. If someone wants to use my service they can study my description and write custom HTTP client code.

Description Through Standardization

If all services exposed the same representation formats, and mapped URIs to resources in the same way... well, we can’t get rid of client programming altogether, but clients could work on a higher level than HTTP.

Note: Conventions are powerful tools: in fact, they’re the same tools that REST uses. Every RESTful resource-oriented web service uses URIs to designate resources, and expresses operations in terms of HTTP’s uniform interface. The idea here is to apply higher-level conventions than REST’s, so that the client programmer doesn’t have to write as much code.

Note: The Rails architecture is an example. Rails is good at gently imposing its design preferences on the programmer. The result is that most RESTful Rails services do the same kind of thing in the same way. At bottom, the job of almost every Rails service is to send and accept representations of ActiveRecord objects. These services all map URIs to Rails controllers, Rails controllers to resources, resources to ActiveRecord objects, and ActiveRecord objects to rows in the database. The representation formats are also standardized: either XML documents or form-encoded key-value pairs. They’re not the best representation formats, because it’s difficult to make connected services out of them, but they’re OK.


 

 

 

CHAPTER 8 - REST and ROA Best Practices

 

Resource-Oriented Basics

 

 

 

Resource

Every interesting thing your application manages should be exposed as a resource. A resource can be anything a client might want to link to: a work of art, a piece of information, a physical object, a concept, or a grouping of references to other resources.

A URI is the name of a resource. Every resource must have at least one name. A resource should have as few names as possible, and every name should be meaningful.

Client

The client cannot access resources directly. A web service serves representations of a resource: documents in specific data formats that contain information about the resource. The difference between a resource and its representation is somewhat academic for static web sites, where the resources are just files on disk that are sent verbatim to clients. The distinction takes on greater importance when the resource is a row in a database, a physical object, an abstract concept, or a real-world event in progress.

 

 

 

State and Statelessness

 

There are two types of state in a RESTful service.

 

State

Definition

Where it is stored

Resource state

Information about resources

Resource state stays on the server and is only sent to the client in the form of representations.

Application state

Information about the path the client has taken through the application

Application state stays on the client until it can be used to create, modify, or delete a resource. Then it’s sent to the server as part of a POST, PUT, or DELETE request, and becomes resource state.

 

Important note: A RESTful service is “stateless” if the server never stores any application state. In a stateless application, the server considers each client request in isolation and in terms of the current resource state. If the client wants any application state to be taken into consideration, the client must submit it as part of the request. This includes things like authentication credentials, which are submitted with every request.

The client manipulates resource state by sending a representation as part of a PUT or POST request. (DELETE requests work the same way, but there’s no representation.) The server manipulates client state by sending representations in response to the client’s GET requests. This is where the name “Representational State Transfer” comes from.

 

 

Connectedness

 

The server can guide the client from one application state to another by sending links and forms in its representations. I call this connectedness because the links and forms connect the resources to each other. The Fielding thesis calls this “hypermedia as the engine of application state.”

In a well-connected service, the client can make a path through the application by following links and filling out forms. In a service that’s not connected, the client must use predefined rules to construct every URI it wants to visit. Right now the human web is very well-connected, because most pages on a web site can be reached by following links from the main page. Right now the programmable web is not very well-connected.

The server can also guide the client from one resource state to another by sending forms in its representations. Forms guide the client through the process of modifying resource state with a PUT or POST request, by giving hints about what representations are acceptable. Links and forms reveal the levers of state: requests the client might make in the future to change application or resource state. Of course, the levers of state can be exposed only when the representation format supports links or forms. A hypermedia format like XHTML is good for this; so is an XML format that can have XHTML or WADL embedded in it.

 

The Uniform Interface

 

All interaction between clients and resources is mediated through a few basic HTTP methods. Any resource will expose some or all of these methods, and a method does the same thing on every resource that supports it.

 

Request

Description

Additional Notes

GET

A GET request is a request for information about a resource. The information is delivered as a set of headers and a representation. The client never sends a representation along with a GET request.

 

HEAD

A HEAD request is the same as a GET request, except that only the headers are sent in response. The representation is omitted.

 

PUT

A PUT request is an assertion about the state of a resource. The client usually sends a representation along with a PUT request, and the server tries to create or change the resource so that its state matches what the representation says. A PUT request with no representation is just an assertion that a resource should exist at a certain URI.

 

DELETE

A DELETE request is an assertion that a resource should no longer exist. The client never sends a representation along with a DELETE request.

 

POST

A POST request is an attempt to create a new resource from an existing one. The existing resource may be the parent of the new one in a data-structure sense, the way the root of a tree is the parent of all its leaf nodes. Or the existing resource may be a special “factory” resource whose only purpose is to generate other resources. The representation sent along with a POST request describes the initial state of the new resource. As with PUT, a POST request doesn’t need to include a representation at all. A POST request may also be used to append to the state of an existing resource, without creating a whole new resource.

 

OPTIONS

An OPTIONS request is an attempt to discover the levers of state: to find out which subset of the uniform interface a resource supports. It’s rarely used. Today’s services specify the levers of state up front, either in human-readable documentation or in hypermedia documents like XHTML and WADL files.

 

 

Very important:

 

Issue

Workaround

Wanting to add another method or additional features to HTTP

  1. You can overload POST (see “Overloading POST”), but
  2. you probably need to add another kind of resource instead.

Wanting to add transactional support to HTTP

You should probably expose transactions as resources that can be created, updated, and deleted. See “Resource Design” later in this chapter for more on this technique.

 

 

 

Safety and Idempotence

 

Safety - A method is safe if a request that uses it is not meant to make any changes to server state. Making any number of safe requests should have the same effect as making no request at all.

Important note: The server might decide on its own to change state (maybe by logging the request or incrementing a hit counter), but it should not hold the client responsible for those changes.

Idempotence - Making any number of identical requests to a certain URI should have the same practical effect as making one request.

 

Request | Safe | Idempotent

GET | yes | yes

HEAD | yes | yes

PUT | no | yes

DELETE | no | yes

POST | no | no
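The two definitions can be checked against a toy in-memory service. Everything here (`ToyService`, its methods, the paths) is made up for illustration; the point is only that repeating PUT or DELETE leaves the same resource state, while repeating POST does not.

```ruby
# A toy "server": resource state is a hash from URI path to representation.
class ToyService
  attr_reader :resources

  def initialize
    @resources = {}
    @next_id = 1
  end

  # Safe: reading never changes resource state.
  def get(path)
    @resources[path]
  end

  # Idempotent: repeating the same PUT leaves the same state.
  def put(path, body)
    @resources[path] = body
  end

  # Idempotent: deleting twice has the same effect as deleting once.
  def delete(path)
    @resources.delete(path)
    nil
  end

  # Neither safe nor idempotent: every call creates another resource.
  def post(path, body)
    new_path = "#{path}/#{@next_id}"
    @next_id += 1
    @resources[new_path] = body
    new_path
  end
end

service = ToyService.new
3.times { service.put("/users/leonardr", "name=Leonard") }
puts service.resources.size   # still one resource after three PUTs
2.times { service.post("/bookmarks", "uri=http://www.example.com/") }
puts service.resources.size   # each POST added a resource
```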

 

New Resources: POST versus PUT

 

You can expose the creation of new resources through PUT, POST, or both.

 

PUT

A client can only use PUT to create resources when it can calculate the final URI of the new resource. In Amazon’s S3 service, the URI path to a bucket is /{bucket-name}. Since the client chooses the bucket name, a client can create a bucket by constructing the corresponding URI and sending a PUT request to it.

POST

A client will use POST when it cannot calculate the URI for the resource being created. The URI to a resource in a typical Rails web service looks like /{database-table-name}/{database-ID}. The name of the database table is known in advance, but the ID of the new resource won’t be known until the corresponding record is saved to the database. To create a resource, the client must POST to a “factory” resource, located at /{database-table-name}. The server chooses a URI for the new resource.
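The difference is who picks the URI. Here is a small sketch with a plain hash standing in for the server's storage; `create_with_put` and `create_with_post` are hypothetical names, and the ID scheme is a simplification (a real service would take the ID from the database).

```ruby
# PUT: the client has already worked out the URI, as with an S3 bucket name.
def create_with_put(store, path, representation)
  status = store.key?(path) ? 200 : 201   # 201 only when something new appears
  store[path] = representation
  [status, {}]
end

# POST: the client addresses a factory resource such as /bookmarks, and the
# server picks the ID and reports the new URI in the Location header.
def create_with_post(store, factory_path, representation)
  existing = store.keys.count { |k| k.start_with?("#{factory_path}/") }
  new_path = "#{factory_path}/#{existing + 1}"
  store[new_path] = representation
  [201, { "Location" => new_path }]
end

store = {}
put_status, _ = create_with_put(store, "/my-bucket", "")
post_status, post_headers = create_with_post(store, "/bookmarks", "uri=http://www.example.com/")
puts put_status
puts post_headers["Location"]
```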

 

Overloading POST

 

POST isn’t limited to creating resources and appending to them. You can also use it to turn a resource into a tiny RPC-style message processor. A resource that receives an overloaded POST request can scan the incoming representation for additional method information, and carry out any task whatsoever. This gives the resource a wider vocabulary than one that supports only the uniform interface. This is how most web applications work. XML-RPC and SOAP/WSDL web services also run over overloaded POST. This ruins the uniform interface.

 

If you’re tempted to expose complex objects or processes through overloaded POST, try giving the objects or processes their own URIs, and exposing them as resources. I show several examples of this in “Resource Design” later in this chapter.

 

There are two noncontroversial uses for overloaded POST.

  • The first is to simulate HTTP’s uniform interface for clients like web browsers that don’t support PUT or DELETE.
  • The second is to work around limits on the maximum length of a URI. The HTTP standard specifies no limit on how long a URI can get, but many clients and servers impose their own limits: Apache won’t respond to requests for URIs longer than 8 KB. If a client can’t make a GET request to http://www.example.com/numbers/1111111 because of URI length restrictions (imagine a million more ones there if you like), it can make a POST request to http://www.example.com/numbers?_method=GET and put “1111111” in the entity-body.
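Both uses rely on the server reading the real method out of the request. A rough sketch of that step, using the _method query variable from the example URI above; `effective_method` and the list of allowed overrides are assumptions of this sketch, not a standard.

```ruby
require 'uri'

ALLOWED_OVERRIDES = %w[GET PUT DELETE].freeze

# Work out which uniform-interface method a request really means.
# Only POST requests may be overloaded; anything else is taken at face value.
def effective_method(http_method, request_uri)
  return http_method unless http_method == "POST"

  query = URI.parse(request_uri).query
  return "POST" if query.nil?

  override = URI.decode_www_form(query).to_h["_method"]
  return "POST" if override.nil?

  override = override.upcase
  ALLOWED_OVERRIDES.include?(override) ? override : "POST"
end

puts effective_method("POST", "http://www.example.com/numbers?_method=GET")
puts effective_method("POST", "http://www.example.com/numbers")
```

Refusing to honor _method on anything but POST keeps a stray query variable from turning a safe GET into a DELETE.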

 

A rule of thumb: if you’re using overloaded POST, and you never expose GET and POST on the same URI, you’re probably not exposing resources at all. You’ve probably got an RPC-style service.

 

Why does it matter?

 

Rule

Why does it matter?

Addressability

Addressability means that every interesting aspect of your service is immediately accessible from outside. Every interesting aspect of your service has a URI: a unique identifier in a format that’s familiar to every computer-literate person. This identifier can be bookmarked, passed around between applications, and used as a stand-in for the actual resource. Addressability makes it possible for others to make mashups of your service: to use it in ways you never imagined.

Statelessness

Statelessness is the simplifying assumption to beat all simplifying assumptions. Each of a client’s requests contains all the application state necessary to understand that request. None of this information is kept on the server, and none of it is implied by previous requests. Every request is handled in isolation and evaluated against the current resource state.

Uniform interface

The power of the uniform interface is not in the specific methods exposed. The human web has a different uniform interface (it uses GET for safe operations, and POST for everything else) and it does just fine. The power is the uniformity: everyone uses the same methods for everything. If you deviate from the ROA’s uniform interface (say, by adopting the human web’s uniform interface, or WebDAV’s uniform interface), you switch communities: you gain compatibility with certain web services at the expense of others.

Connectedness

 

 

 

Chapter 9 - The Building Blocks of Services

 

 

 

Representation Formats

 

The goal is to help you pick a format that says something about the semantics of your data, so you don’t find yourself devising yet another one-off XML vocabulary that no one else will use.

 

Format

Description

 

XHTML

Media type: application/xhtml+xml

 

My number-one representation recommendation is the format I’ve been using in my own services throughout this book, and the one you’re probably most familiar with. HTML drives the human web, and XHTML can drive the programmable web. The XHTML standard (http://www.w3.org/TR/xhtml1/) relies on the HTML standard to do most of the heavy lifting (http://www.w3.org/TR/html401/). XHTML is HTML under a few restrictions that make every XHTML document also valid XML. If you know HTML, you know most of what there is to know about XHTML, but there are some syntactic differences, like how to present self-closing tags.

 

XHTML with Microformats

Media type: application/xhtml+xml

  1. XFN - XHTML Friends Network.
  2. XMDP - XHTML Meta Data Profiles. A way of describing your custom values for XHTML attributes, using the XHTML tags for definition lists: DL, DD, and DT. This is a kind of meta-microformat: a microformat like rel-tag could itself be described with an XMDP document.
  3. XOXO - Stands (sort of) for Extensible Open XHTML Outlines. Uses XHTML’s list tags to represent outlines. There’s nothing in XOXO that’s not already in the XHTML standard, but declaring a document (or a list in a document) to be XOXO signals that a list is an outline, not just a random list.
  4. hResume - A way of representing résumés.
  5. hReview - A way of representing reviews, such as product reviews or restaurant reviews.
  6. xFolk - A way of representing bookmarks. This would make an excellent representation format for the social bookmarking application in Chapter 7. I chose to use Atom instead because it was less code to show you.

     

 

Atom

Media type: application/atom+xml
 

Atom is an XML vocabulary for describing lists of timestamped entries. The entries can be anything, but they usually contain pieces of human-authored text like you’d see on a weblog or a news site. Why should you use an Atom list instead of a regular XHTML list? Because Atom provides special tags for conveying the semantics of publishing:

  • Authors
  • Contributors
  • Languages
  • Copyright information
  • Titles
  • Categories
  • and so on.

(Of course, as I mentioned earlier, there’s a microformat called hAtom that brings all of these semantics into XHTML.)

 

Atom is a useful XML vocabulary because so many web services are, in the broad sense, ways of publishing information. What’s more, there are a lot of web service clients that understand the semantics of Atom documents. If your web service is addressable and your resources expose Atom representations, you’ve immediately got a huge audience. Atom lists are called feeds, and the items in the lists are called entries.



 

If your application almost fits in with the Atom schema, but needs an extra tag or two, there’s no problem. You can embed XML tags from other namespaces in an Atom feed.

You can even define a custom namespace and embed its tags in your Atom feeds. This is the Atom equivalent of XHTML microformats: your Atom feeds can use conventions not defined in Atom, without becoming invalid. Clients that don’t understand your tag will see a normal Atom feed with some extra mysterious data in it.




 

 

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>RESTful News</title>
  <link rel="alternate" href="http://example.com/RestfulNews" />
  <updated>2007-04-14T20:00:39Z</updated>
  <author><name>Leonard Richardson</name></author>
  <contributor><name>Sam Ruby</name></contributor>
  <id>urn:1c6627a0-8e3f-0129-b1a6-003065546f18</id>
  <entry>
    <title>New Resource Will Respond to PUT, City Says</title>
    <link rel="edit" href="http://example.com/RestfulNews/104" />
    <id>urn:239b2f40-8e3f-0129-b1a6-003065546f18</id>
    <updated>2007-04-14T20:00:39Z</updated>
    <summary>
      After long negotiations, city officials say the new resource
      being built in the town square will respond to PUT. Earlier
      criticism of the proposal focused on the city's plan to modify
      the resource through overloaded POST.
    </summary>
    <category scheme="http://www.example.com/categories/RestfulNews"
              term="local" label="Local news" />
  </entry>
</feed>

OpenSearch

 

 

OpenSearch (http://www.opensearch.org/) is one XML vocabulary that’s commonly embedded in Atom documents. It’s designed for representing lists of search results.

The idea is that a service returns the results of a query as an Atom feed, with the individual results represented as Atom entries. But some aspects of a list of search results can’t be represented in a stock Atom feed: the total number of results, for instance. So OpenSearch defines three new elements, in the opensearch namespace:

  • totalResults - The total number of results that matched the query.
  • itemsPerPage - How many items are returned in a single “page” of search results.
  • startIndex - If all the search results are numbered from zero to totalResults, then the first result in this feed document is entry number startIndex. When combined with itemsPerPage you can use this to figure out what “page” of results you’re on.
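The page arithmetic is short enough to spell out. This sketch assumes zero-based numbering, as described above; `page_info` and its keyword arguments are names chosen for this note, not part of OpenSearch.

```ruby
# Given the three OpenSearch values, work out which page a feed document is
# and how many pages the whole result set spans.
def page_info(total_results:, items_per_page:, start_index:)
  raise ArgumentError, "items_per_page must be positive" unless items_per_page > 0

  {
    # Zero-based page number of this feed document.
    page: start_index / items_per_page,
    # Ceiling division: a partly filled last page still counts as a page.
    pages: (total_results + items_per_page - 1) / items_per_page
  }
end

info = page_info(total_results: 45, items_per_page: 10, start_index: 20)
puts info[:page]
puts info[:pages]
```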

 

SVG

Media type: image/svg+xml

 

Scalable Vector Graphics is an XML vocabulary that makes it possible for programs to understand and manipulate graphics. It describes graphics in terms of primitives like shapes, text, colors, and effects.

 

Form-Encoded Key-Value Pairs

Media type: application/x-www-form-urlencoded

 

This format is mainly used in representations the client sends to the server. A filled-out HTML form is represented in this format by default, and it’s an easy format for an Ajax application to construct. But a service can also use this format in the representations it sends. If you’re thinking of serving comma-separated values or RFC 822-style key-value pairs, try form-encoded values instead.

Form-encoding takes care of the tricky cases, and your clients are more likely to have a library that can decode the document.
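As an illustration of the “tricky cases,” here is a record whose values contain commas, ampersands, and equals signs, round-tripped through Ruby’s standard `uri` library. The record itself is invented for the example.

```ruby
require 'uri'

# Values that would need special quoting rules in a homemade CSV or
# key-value format.
record = {
  "title" => "Tom & Jerry, Inc.",
  "note"  => "a=b; 100% done"
}

# Serve: every delimiter inside a value is percent-escaped.
encoded = URI.encode_www_form(record)

# Consume: a stock library call recovers the original values.
decoded = URI.decode_www_form(encoded).to_h

puts encoded
puts decoded == record
```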

 

 

JSON

Media type: application/json

 

JavaScript Object Notation is a serialization format for general data structures. It’s much more lightweight and readable than an equivalent XML document, so I recommend it for most cases when you’re transporting a serialized data structure rather than a hypermedia document.

 

{"a":["b","c"], "1":[2,3]}
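Reading that document back takes one call to Ruby’s bundled `json` library. A small sketch (the variable names are arbitrary):

```ruby
require 'json'

# Parse the sample document into plain Ruby hashes and arrays.
data = JSON.parse('{"a":["b","c"], "1":[2,3]}')

letters = data["a"]   # an array of strings
numbers = data["1"]   # JSON object keys are always strings, even "1"

# Serializing again gives an equivalent document.
round_trip = JSON.parse(JSON.generate(data))

puts letters.length
puts numbers.sum
puts round_trip == data
```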

RDF and RDFa

 

<span about="isbn:9780596529260" property="dc:title">

RESTful Web Services

</span>

Framework-Specific Serialization Formats

Media type: application/xml

 

XML vocabularies used by frameworks like Ruby’s ActiveRecord and Python’s Django to serialize database objects as XML. I gave an example back in Example 7-4. It’s a simple data structure: a hash or a list of hashes.

These representation formats are very convenient if you happen to be writing a service in a framework that gives you access to one. In Rails, you can just call to_xml on an ActiveRecord object or a list of such objects. The Rails serialization format is also useful if you’re not using Rails, but you want your service to be usable by ActiveResource clients. Otherwise, I don’t really recommend these formats, unless you’re just trying to get something up and running quickly (as I am in Chapters 7 and 12). The major downside of these formats is that they look like documents, but they’re really just serialized data structures. They never contain hypermedia links or forms.

 

 

 

 

 

 

 

Encoding Issues

 

Every text file you’ve ever created has some character encoding, even though you probably never made a decision about which encoding to use (it’s usually a system property). In the United States the encoding is usually UTF-8, US-ASCII, or Windows-1252. In western Europe it might also be ISO 8859-1. The default for HTML on the web is ISO 8859-1, which is almost but not quite the same as Windows-1252. Japanese documents are commonly encoded with EUC-JP, Shift_JIS, or UTF-8.

 

 

XML and HTTP: Battle of the encodings

 

An XML document can and should define a character encoding in its first line, so that the client will know how to interpret the document. An HTTP response can and should specify a value for the Content-Type response header, so that the client knows it’s being given an XML document and not some other kind. But the Content-Type can also specify a document character encoding with “charset,” and this encoding might conflict with what it actually says in the document.

Content-Type: application/xml; charset="ebcdic-fr-297+euro"

<?xml version="1.0" encoding="UTF-8"?>

Who wins? Surprisingly, HTTP’s character encoding takes precedence over the encoding in the document itself. If the document says “UTF-8” and Content-Type says “ebcdic-fr-297+euro,” then extended French EBCDIC it is. Almost no one expects this kind of surprise, and most programmers write code first and check the RFCs later. The result is that the character encoding, as specified in Content-Type, tends to be unreliable. Some servers claim everything they serve is UTF-8, even though the actual documents say otherwise.
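A client that wants to follow the precedence rule can apply it in two steps. The sketch below is a simplification built on regular expressions rather than a real header or XML parser; `effective_xml_encoding` is a made-up name, and the UTF-8 fallback is this sketch's assumption for documents that declare nothing.

```ruby
# Pick the encoding a client should assume for an XML response: the charset
# in Content-Type wins, then the XML declaration, then a default.
def effective_xml_encoding(content_type, body)
  if content_type && (m = content_type.match(/charset\s*=\s*"?([^";\s]+)"?/i))
    return m[1]
  end
  if (m = body.match(/\A<\?xml[^>]*\bencoding\s*=\s*["']([^"']+)["']/))
    return m[1]
  end
  "UTF-8" # assumption of this sketch when nothing is declared
end

header = 'application/xml; charset="ebcdic-fr-297+euro"'
document = '<?xml version="1.0" encoding="UTF-8"?><doc/>'
puts effective_xml_encoding(header, document)
puts effective_xml_encoding("application/xml", document)
```

Given how unreliable the header tends to be, a defensive client might log a warning whenever the two sources disagree.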

 

The character encoding of a JSON document

 

RFC 4627 states that a JSON file must contain Unicode characters, encoded in one of the UTF-* encodings. Practically, this means either UTF-8, or UTF-16 with a byte-order mark. Plain US-ASCII will also work, since ASCII text happens to be valid UTF-8. Given this restriction, a client can determine the character encoding of a JSON document by looking at the first four bytes (the details are in RFC 4627), and there’s no need to specify an explicit encoding. You should follow this convention whenever you serve plain text, not just JSON.
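The four-byte check works because a JSON text, per RFC 4627, starts with two ASCII characters, so the positions of the zero bytes give away the encoding. A sketch of that table (`json_encoding` is a name invented here; inputs shorter than four bytes are simply treated as UTF-8):

```ruby
# Guess the Unicode encoding of a JSON document from the pattern of zero
# bytes in its first four bytes, following the table in RFC 4627.
def json_encoding(bytes)
  first = bytes.byteslice(0, 4).to_s.bytes
  return "UTF-8" if first.length < 4

  case first.map(&:zero?)
  when [true, true, true, false]  then "UTF-32BE"  # 00 00 00 xx
  when [true, false, true, false] then "UTF-16BE"  # 00 xx 00 xx
  when [false, true, true, true]  then "UTF-32LE"  # xx 00 00 00
  when [false, true, false, true] then "UTF-16LE"  # xx 00 xx 00
  else "UTF-8"                                     # xx xx xx xx
  end
end

sample = '{"a":1}'
puts json_encoding(sample)
puts json_encoding(sample.encode("UTF-16LE"))
```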

 

Prepackaged Control Flows

 

Though resources can be anything at all, they usually fall into a few broad categories: database tables and their rows, publications and the articles they publish, and so on. When you know what sort of resource a service exposes, you can often anticipate the possible responses to an HTTP request without knowing too much about the resource.

In one sense the standard HTTP response codes (see Appendix B) are just a suggested control flow: a set of instructions about what to do when you get certain kinds of requests. But that’s pretty vague advice, and we can do better. Here I present several prepackaged control flows: patterns that bring together advice about resource design, representation formats, and response codes to help you design real-world services.

 

General Rules

 

HTTP Response

Description

401

If the client tries to do something without providing the correct authorization, send a

response code of 401 (“Unauthorized”) along with instructions for correctly formatting

the Authorization header.

404

If the client tries to access a URI that doesn’t correspond to any existing resource, send

a response code of 404 (“Not Found”). The only possible exception is when the client

is trying to PUT a new resource to that URI.

405

If the client tries to use a part of the uniform interface that a resource doesn’t support,

send a response code of 405 (“Method Not Allowed”). This is the proper response

when the client tries to DELETE a read-only resource.

 

 

Method

Description

GET

If the resource can be identified, send a representation along with a response code of

200 (“OK”). Be sure to support conditional GET!

PUT

If the resource already exists, parse the representation and turn it into a series of changes

to the state of this resource. If the changes would leave the resource in an incomplete

or inconsistent state, send a response code of 400 (“Bad Request”).

If the changes would cause the resource state to conflict with some other resource, send

a response code of 409 (“Conflict”). My social bookmarking service sends a response

code of 409 if you try to change your username to a name that’s already taken.

If there are no problems with the proposed changes, apply them to the existing resource.

If the changes in resource state mean that the resource is now available at a different

URI, send a response code of 301 (“Moved Permanently”) and include the new URI in

the Location header. Otherwise, send a response code of 200 (“OK”). Requests to the

old URI should now result in a response code of 301 (“Moved Permanently”), 404

(“Not Found”), or 410 (“Gone”).

There are two ways to handle a PUT request to a URI that doesn’t correspond to any

resource. You can return a status code of 404 (“Not Found”), or you can create a

resource at that URI. If you want to create a new resource, parse the representation and

use it to form the initial resource state. Send a response code of 201 (“Created”). If

there’s not enough information to create a new resource, send a response code of 400

(“Bad Request”).

POST for creating a new resource

Parse the representation, pick an appropriate URI, and create a new resource there.

Send a response code of 201 (“Created”) and include the URI of the new resource in

the Location header. If there’s not enough information provided to create the resource,

send a response code of 400 (“Bad Request”). If the provided resource state would

conflict with some existing resource, send a response code of 409 (“Conflict”), and

include a Location header that points to the problematic resource.

POST for appending to a resource

 

DELETE

Send a response code of 200 (“OK”).
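The PUT rules above can be sketched as one function over an in-memory store. Everything specific here is an assumption for illustration: the `/users/{name}` URI layout, the required fields, and the rule that a representation used to create a resource must match the URI it is PUT to.

```python
REQUIRED_FIELDS = {"name", "email"}  # hypothetical schema for a user resource


def handle_put(store, uri, representation):
    """Return (status_code, headers) for a PUT, following the rules above."""
    if not REQUIRED_FIELDS.issubset(representation):
        return 400, {}                      # incomplete state: Bad Request
    new_uri = "/users/" + representation["name"]
    if new_uri != uri and new_uri in store:
        return 409, {}                      # name already taken: Conflict
    if uri not in store:
        if new_uri != uri:
            return 400, {}                  # representation doesn't fit this URI
        store[uri] = dict(representation)
        return 201, {}                      # new resource: Created
    del store[uri]
    store[new_uri] = dict(representation)
    if new_uri != uri:
        return 301, {"Location": new_uri}   # resource moved to a new URI
    return 200, {}
```

After a rename the old URI is simply gone from the store, so a later GET on it would fall through to the 404 case.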

 

The Atom Publishing Protocol

 

The Atom Publishing Protocol (APP) takes HTTP’s uniform interface and puts a higher-level uniform interface on top of it. Many kinds of applications can conform to the APP, and a generic APP client should be able to access all of them. Specific applications can extend the APP by exposing additional resources, or making the APP resources expose more of HTTP’s uniform interface, but they should all support the minimal features mentioned in the APP standard.

 

The ultimate end of the APP is to serve Atom documents to the end user. Of course, the Atom documents are just the representations of underlying resources. The APP defines what those resources are. It defines two resources that correspond to Atom documents, and two that help the client find and modify APP resources.

 

Resource

Description

example

Collections

An APP collection is a resource whose representation is an Atom feed. The document in Example 9-2 has everything it takes to be a representation of an Atom collection. There’s no necessary difference between an Atom feed you subscribe to in your feed reader, and an Atom feed that you manipulate with an APP client. A collection is just a list or grouping of pieces of data: what the APP calls members. The APP is heavily oriented toward manipulating “collection” type resources. The APP defines a collection’s response to GET and POST requests.

 

 

GET

returns a representation: the Atom feed.

POST

adds a new member to the collection, which (usually) shows up as a new entry in the feed.

 

 Maybe you can also DELETE a collection, or modify its settings with a PUT request. The APP doesn’t cover that part: it’s up to your application.

 

 

Members

An APP collection is a collection of members. A member corresponds roughly to an entry in an Atom feed: a weblog entry, a news article, or a bookmark. But a member can also be a picture, song, movie, or Word document: a binary format that can’t be represented in XML as part of an Atom document.

A client creates a member inside a collection by POSTing a representation of the member to the collection URI. This pattern should be familiar to you by now: the member is created as a subordinate resource of the collection. The server assigns the new member a URI. The response to the POST request has a response code of 201 (“Created”), and a Location header that lets the client know where to find the new resource.

<?xml version="1.0" encoding="utf-8"?>

<entry>

<title>New Resource Will Respond to PUT, City Says</title>

<summary>

After long negotiations, city officials say the new resource

being built in the town square will respond to PUT. Earlier

criticism of the proposal focused on the city's plan to modify the

resource through overloaded POST.

</summary>

<category scheme="http://www.example.com/categories/RestfulNews"

term="local" label="Local news" />

</entry>

Service document

This vaguely-named type of resource is just a grouping of collections. A typical move is to serve a single service document, listing all of your collections, as your service’s “home page.” A service document is an XML document written using a particular vocabulary, and its media type is application/atomserv+xml (see Example 9-6).

<?xml version="1.0" encoding='utf-8'?>

<service xmlns="http://purl.org/atom/app#"

xmlns:atom="http://www.w3.org/2005/Atom">

<workspace>

<atom:title>Weblogs</atom:title>

<collection href="http://www.example.com/RestfulNews">

<atom:title>RESTful News</atom:title>

<categories href="http://www.example.com/categories/RestfulNews" />

</collection>

</workspace>

<workspace>

<atom:title>Photo galleries</atom:title>

<collection

href="http://www.example.com/samruby/photos" >

<atom:title>Sam's photos</atom:title>

<accept>image/*</accept>

<categories href="http://www.example.com/categories/samruby-photo" />

</collection>

<collection

href="http://www.example.com/leonardr/photos" >

<atom:title>Leonard's photos</atom:title>

<accept>image/*</accept>

<categories href="http://www.example.com/categories/leonardr-photo" />

</collection>

</workspace>

</service>

Category documents

 

<?xml version="1.0" ?>

<app:categories

xmlns:app="http://purl.org/atom/app#"

xmlns="http://www.w3.org/2005/Atom"

scheme="http://www.example.com/categories/RestfulNews"

fixed="no">

<category term="local" label="Local news"/>

<category term="international" label="International news"/>

<category term="lighterside" label="The lighter side of REST"/>

</app:categories>
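To show how a generic client might use a service document as a “home page,” here is a small sketch that lists the collections in a shortened copy of the service document above. It uses the draft namespace that appears in these notes; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"app": "http://purl.org/atom/app#", "atom": "http://www.w3.org/2005/Atom"}

SERVICE_DOC = """<?xml version="1.0" encoding="utf-8"?>
<service xmlns="http://purl.org/atom/app#"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Weblogs</atom:title>
    <collection href="http://www.example.com/RestfulNews">
      <atom:title>RESTful News</atom:title>
    </collection>
  </workspace>
  <workspace>
    <atom:title>Photo galleries</atom:title>
    <collection href="http://www.example.com/samruby/photos">
      <atom:title>Sam's photos</atom:title>
      <accept>image/*</accept>
    </collection>
  </workspace>
</service>"""


def list_collections(service_xml: bytes):
    """Return (workspace title, collection URI, accepted media type or None)."""
    root = ET.fromstring(service_xml)
    found = []
    for workspace in root.findall("app:workspace", NS):
        title = workspace.findtext("atom:title", namespaces=NS)
        for coll in workspace.findall("app:collection", NS):
            accept = coll.findtext("app:accept", namespaces=NS)
            found.append((title, coll.get("href"), accept))
    return found
```

Each collection URI found this way is a place the client can GET a feed from or POST a new member to.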

 

 

Resource | GET | POST | PUT | DELETE

Service document | Return a representation (XML) | Undefined | Undefined | Undefined

Category document | Return a representation (XML) | Undefined | Undefined | Undefined

Collection | Return a representation (Atom feed) | Create a new member | Undefined | Undefined

Member | Return the representation identified by this URI. (This is usually an Atom entry document, but it might be a binary file.) | Undefined | Update the representation identified by this URI | Delete the member

 

 

GData

 

I said earlier that the Atom Publishing Protocol defines only a few resources and only a few operations on those resources. It leaves a lot of space open for extension. One extension is Google’s GData (http://code.google.com/apis/gdata), which adds a new kind of resource and some extras like an authorization mechanism.

 

Querying collections

 

POST Once Exactly

 

A POST request can do anything at all, and sending a POST request twice will probably have a different effect from sending it once. Of course, if a service

committed to accepting only POST requests whose actions were safe or idempotent, it would be easy to make reliable HTTP requests to that service.

 

POST Once Exactly (POE) is a way of making HTTP POST idempotent, like PUT and DELETE. If a resource supports Post Once Exactly, then it will only respond successfully to POST once over its entire lifetime. All subsequent POST requests will give a response code of 405 (“Method Not Allowed”). A POE resource is a one-off resource exposed for the purpose of handling a single POST request.

 

Example:

 

How would we change this weblog design so that no resource responds to POST more than once?

 

Clearly the weblog can’t expose POST anymore, or there could only ever be one weblog entry.

 

How does POE do it?

 

Step

HTTP Header

The client sends a GET or HEAD request to the “weblog” resource, and the request includes the special POE header

HEAD /weblogs/myweblog HTTP/1.1

Host: www.example.com

POE: 1

The response contains the URI to a POE resource that hasn’t yet been POSTed to. This URI is nothing more than a unique ID for a future POST request. It probably doesn’t even exist on the server. Remember that GET is a safe operation, so the original GET request couldn’t have changed any server state.

200 OK

POE-Links: /weblogs/myweblog/entry-factory-104a4ed

At this point the client can POST a representation of its new weblog entry to /weblogs/myweblog/entry-factory-104a4ed.

 

After the POST goes through, that URI will start responding to POST with a response code of 405 (“Method Not Allowed”).

 

If the client isn’t sure whether or not the POST request went through, it can safely resend. There’s no possibility that the second POST will create a second weblog entry. POST has been rendered idempotent.

 

 

POE and POE-Links are custom HTTP headers defined by the POE draft.

 

POE Header

Description

POE

POE just tells the server that the client is expecting a link to a POE resource.

POE-Links:

gives one or more links to POE resources.
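The POE exchange can be simulated in memory to see why a resent POST is harmless. This is a sketch under assumptions, not the draft's reference behavior: the `Weblog` class, the way link IDs are generated, and the 404 for unrecognized URIs are all invented for illustration.

```python
import uuid


class Weblog:
    def __init__(self, path):
        self.path = path
        self.entries = []
        self.used_links = set()

    def head(self, headers):
        """Answer GET/HEAD; if the client sent POE: 1, hand out a POE link."""
        response_headers = {}
        if headers.get("POE") == "1":
            # The link is only an ID. Nothing is stored, so the request stays safe.
            link = f"{self.path}/entry-factory-{uuid.uuid4().hex[:7]}"
            response_headers["POE-Links"] = link
        return 200, response_headers

    def post(self, poe_uri, entry):
        """Accept one POST per POE URI; answer 405 to every later attempt."""
        if not poe_uri.startswith(self.path + "/entry-factory-"):
            return 404
        if poe_uri in self.used_links:
            return 405
        self.used_links.add(poe_uri)
        self.entries.append(entry)
        return 201
```

A client that gets no response to its first POST can send the same request again: it receives either 201 (the first attempt was lost) or 405 (the first attempt succeeded), and in both cases exactly one entry exists.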

 

IMPORTANT:  An alternative to making POST idempotent is to get rid of POST altogether. Remember, POST is only necessary when the client doesn’t know which URI it should PUT to. POE works by generating a unique ID for each of the client’s POST operations. If you allow clients to generate their own unique IDs, they can use PUT instead. You can get the benefits of POE without exposing POST at all. You just need to make sure that two clients will never generate the same ID.
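A minimal sketch of the PUT alternative, assuming clients pick UUIDs as their unique IDs. The `/entries/` path segment and both function names are hypothetical.

```python
import uuid


def new_entry_uri(weblog_path):
    """Client side: choose a URI that no other client should ever generate."""
    return f"{weblog_path}/entries/{uuid.uuid4()}"


def put_entry(store, uri, entry):
    """Server side: create or overwrite; repeating the request changes nothing."""
    created = uri not in store
    store[uri] = entry
    return 201 if created else 200
```

Sending the same PUT twice leaves a single entry in the store, which is the same guarantee POE gives without exposing POST.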

 

Hypermedia Technologies

 

By “hypermedia format,” I mean a format with some kind of structured support for links and forms.

 

Form Type

Description

Example

Application

  1. They show the client how to manipulate application state.
  2. An application form is a way of handling resources whose names follow a pattern: it basically acts as a link with more than one destination.
  3. The application form lets one resource link to an infinite number of others, without requiring an infinitely large representation.

A search engine doesn’t link to every search you might possibly make: it gives you a form with a space for you to type in your search query. When you submit the form, your browser constructs a URI from what you typed into the form (say, http://www.google.com/search?q=jellyfish), and makes a GET request to that URI.

Resource

  1. They show the client how to format a representation that modifies the state of a resource

 

 

Links and application forms implement what I call connectedness, and what the Fielding thesis calls “hypermedia as the engine of application state.”

 

The client is in charge of the application state, but the server can send links and forms that suggest possible next states.

 

By contrast, a resource form is a guide to changing the resource state, which is ultimately kept on the server.

 

IMPORTANT AUTHOR'S NOTE:  As of the time of writing, XHTML 4 is the only hypermedia technology in active use. But this is a time of rapid change,

thanks in part to growing awareness of RESTful web services. XHTML 5 is certain to be widely used once it’s finally released. My guess is that URI Templates will also catch on, whether or not they’re incorporated into XHTML 5. WADL may catch on, or it may be supplanted by a combination of XHTML 5 and microformats.

 

 

URI Templates

 

URI Templates (currently an Internet Draft: http://www.ietf.org/internet-drafts/draft-gregorio-uritemplate-00.txt) are a technology that makes simple resource forms look like links.

 

 

XHTML 4

 

HTML is the most successful hypermedia format of all time, but its success on the human web has typecast it as sloppy, and sent practitioners running for the more

structured XML.

 

Links

A link tag shows up in the document’s head, and connects the document to some resource.

An a tag shows up in the document’s body. It can contain text and other tags, and it links its contents (not the document as

a whole) to another resource

 

Forms

A form tag has a method attribute, which names the HTTP method the client should use when submitting the form.

 

 

Application form

 

<form method="GET" action="http://search.example.com/search">

<input name="query" type="text" />

<input type="submit" />

</form>
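What a client does when it submits this application form can be sketched with the standard library. The function name is made up for this example; the action URI and field name come from the form above.

```python
from urllib.parse import urlencode


def submit_get_form(action, fields):
    """Build the URI a client would GET when submitting a method="GET" form."""
    return f"{action}?{urlencode(fields)}"


uri = submit_get_form("http://search.example.com/search", {"query": "jellyfish"})
```

One form stands in for every URI of the shape `http://search.example.com/search?query=...`, which is how a finite representation links to an unbounded set of resources.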

Resource form

A resource form in HTML 4 identifies one particular resource, and it specifies a method of POST.

<form method="POST" action="http://files.example.com/dir/subdir/"

enctype="multipart/form-data">

<input type="text" name="description" />

<input type="file" name="newfile" />

</form>

 

Shortcomings of XHTML 4

 

HTML 4’s hypermedia features are obviously good enough to give us the human web we enjoy today, but they’re not good enough for web services.

 

Shortcoming

Description

Application forms are limited in the URIs they can express.

There’s no way to use a form to tell a client to send a DELETE request, or to show a client what the

representation of a PUT request should look like.

Resource forms in HTML 4 are limited to using HTTP POST.

The human web, which runs on HTML forms, has a different uniform interface from web services as a whole. It uses GET for safe operations, and overloaded POST for everything else. If you want to get HTTP’s uniform interface with HTML 4 forms, you’ll need to simulate PUT and DELETE with overloaded POST.

There’s no way to use an HTML form to describe the HTTP headers a client should

send along with its request.

 

You can’t use an HTML form to specify a representation more complicated than

a set of key-value pairs.

The HTML standard defines two content types for form representations: application/x-www-form-urlencoded, which is for key-value pairs (I covered it in “Form-encoding” in Chapter 6); and multipart/form-data, which is for a combination of key-value pairs and uploaded files.

You can’t define a repeating field in an HTML form.

 

 

XHTML 5

 

HTML 5 forms support all four basic methods of HTTP’s uniform interface: GET, POST, PUT, and DELETE.

 

HTML 5 defines two new ways of serializing key-value pairs into representations:

  • as plain text
  • using a newly defined XML vocabulary. The content type for the latter is application/x-www-form+xml

 

This is not as big an advance as you might think. Form entities like input are still ways of getting data in the form of key-value pairs. These new serialization formats are just new ways of representing those key-value pairs. There’s still no way to show the client how to format a more complicated representation, unless the client can figure out the format from just the content type.

 

WADL

 

You can provide a WADL file that describes every resource exposed by your service. This corresponds roughly to a WSDL file in a SOAP/WSDL service, and to the “site map” pages you see on the human web. Alternatively, you can embed a snippet of WADL in an XML representation of a particular resource, the way you might embed an HTML form in an HTML representation. The WADL snippet tells you how to manipulate the state of the resource

 

 

WADL does better than HTML 5 as a hypermedia format.

  • It supports URI Templates and every HTTP method there is
  • A WADL file can also tell the client to populate certain HTTP headers when it makes a request.
  • WADL can describe representation formats that aren’t just key-value pairs.
  • You can specify the format of an XML representation by pointing to a schema definition.

 

 

 

CHAPTER 10 - The Resource-Oriented Architecture Versus Big Web Services

 

 

Big Web Services don’t get the benefits of resource-oriented web services. They’re not addressable, cacheable, or well connected, and they don’t respect any uniform interface. (Many of them are stateless, though.)  They’re opaque, and understanding one doesn’t help you understand the next one. In practice, they also tend to have interoperability problems when serving a variety of clients.

 

What Problems Are Big Web Services Trying to Solve?

 

The resource-oriented approach I advocate in this book is Turing-complete. It can model any application, even a complex one like a travel broker. If I implemented this travel broker as a set of resource-oriented services, I’d expose resources like “a five-minute hold on seat 24C.” This would work, but there’s probably little value in that kind of resource. I don’t pretend to know what emergent properties might show up in a resource-oriented system like this, but it’s NOT LIKELY that someone would want to bookmark that resource’s URI and pass it around.

 

The main problem that Big Web Services are trying to solve is the design of process-oriented, brokered distributed services. For whatever reason, this kind of application tends to be more prevalent in businesses and government applications, and less prevalent in technical and academic areas.

 

SOAP

 

SOAP is the foundation on which the plethora of WS-* specifications is built. Despite the hype and antihype it’s been subjected to, there’s amazingly little to this specification. You can take any XML document (so long as it doesn’t have a DOCTYPE or processing instructions), wrap it in two little XML elements, and you have a valid SOAP document. For best results, though, the document’s root element should be in a namespace. Here’s an XML document:

 

<hello-world xmlns="http://example.com"/>

 

Here’s the same document, wrapped in a SOAP envelope:

 

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">

<soap:Body>

<hello-world xmlns="http://example.com"/>

</soap:Body>

</soap:Envelope>
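The “wrap it in two little XML elements” step can be sketched with Python's standard ElementTree module. The helper name is hypothetical, and the sketch assumes the payload is a well-formed, namespaced XML document like the one above.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)  # serialize with the familiar soap: prefix


def wrap_in_soap(payload_xml: str) -> str:
    """Wrap an XML document in a SOAP Envelope and Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(ET.fromstring(payload_xml))
    return ET.tostring(envelope, encoding="unicode")


wrapped = wrap_in_soap('<hello-world xmlns="http://example.com"/>')
```

The serializer may choose its own prefix for the payload's namespace, so the output can differ textually from the hand-written envelope while describing the same document.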

 

Here’s a version of a SOAP document you might submit to Google’s web search service.

 

This document describes a call to the remote procedure gs:doGoogleSearch. All of the query parameters are neatly tucked into named elements. This example is fully functional, though if you POST it to Google you’ll get back a fault document saying that the key is not valid.

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">

<soap:Body>

<gs:doGoogleSearch xmlns:gs="urn:GoogleSearch">

<key>00000000000000000000000000000000</key>

<q>REST book</q>

<start>0</start>

<maxResults>10</maxResults>

<filter>true</filter>

<restrict/>

<safeSearch>false</safeSearch>

<lr/>

<ie>latin1</ie>

<oe>latin1</oe>

</gs:doGoogleSearch>

</soap:Body>

</soap:Envelope>

 

Given this fairly simple protocol, what’s the basis for the hype and controversy? SOAP is mainly infamous for the technologies built on top of it. It does have one alleged benefit of its own: transport independence. The headers are inside the message, which means they’re independent of the protocol used to transport the message. You don’t have to send a SOAP envelope inside an HTTP envelope. You can send it over email, instant messaging, raw TCP, or any other protocol.

 

The Resource-Oriented Alternative

 

SOAP is almost always sent over HTTP, but SOAP toolkits make little use of HTTP status codes, and tend to coerce all operations into POST methods.

 

Action

Description

Split your service into resources

The single most important change you can make is to split your service into resources:

Try to make objects of different types respond to method calls with the same name.

Implement these standard features of HTTP, make your representations cacheable, and you make your application more scalable. That has direct and tangible economic benefits.

 

WSDL

 

The weblogs.com interface exposes a single RPC-style function called ping.

 

The function

takes two arguments, both strings, and returns a pingResult structure.

 

This custom

structure contains two elements: flerror, a Boolean, and message, a string. Strings and

Booleans are standard primitive data types, but to use a pingResult I need to define it as an XML Schema complexType.

 

MESSAGE

 

I’ll move on to defining the messages that can

be sent between client and server. There are two messages here: the ping request and

the ping response.
 

PORT

 

A port is simply a collection of operations. NOTE: a programming

language would refer to a PORT as a library, a module, or a class.

 

BINDING

 

Binding the ping portType to a SOAP/HTTP implementation

 

SERVICE

 

The final piece to the puzzle is to define a service, which connects

a portType with a binding and (since this is SOAP over HTTP) with an endpoint URI

 

<?xml version="1.0" encoding="utf-8"?>

<!--WSDL-->

 

<definitions

xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:s="http://www.w3.org/2001/XMLSchema"

xmlns:tns="uri:weblogscom"

targetNamespace="uri:weblogscom"

xmlns="http://schemas.xmlsoap.org/wsdl/">

 

<types>

<s:schema targetNamespace="uri:weblogscom">

<s:complexType name="pingResult">

<s:sequence>

<s:element minOccurs="1" maxOccurs="1" name="flerror" type="s:boolean"/>

<s:element minOccurs="1" maxOccurs="1" name="message" type="s:string" />

</s:sequence>

</s:complexType>

</s:schema>

</types>

 

<!--Defining the MESSAGES that can be passed-->

 

<message name="pingRequest">

<part name="weblogname" type="s:string"/>

<part name="weblogurl" type="s:string"/>

</message>

 

<message name="pingResponse">

<part name="result" type="tns:pingResult"/>

</message>

 

<!--Defining the PORTS and operations of the webservice-->

 

<portType name="pingPort">

<operation name="ping">

<input message="tns:pingRequest"/>

<output message="tns:pingResponse"/>

</operation>

</portType>

 

<!--Defining the BINDINGS for the PORT-->

 

 

<binding name="pingSoap" type="tns:pingPort">

<soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>

<operation name="ping">

<soap:operation soapAction="/weblogUpdates" style="rpc"/>

<input>

<soap:body use="encoded" namespace="uri:weblogscom" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>

</input>

<output>

<soap:body use="encoded" namespace="uri:weblogscom" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>

</output>

</operation>

</binding>

 

<!--Defining the SERVICE-->

 

<service name="weblogscom">

<documentation> For a complete description of this service, go to the following URL: http://www.soapware.org/weblogsCom

</documentation>

<port name="pingPort" binding="tns:pingSoap">

<soap:address location="http://rpc.weblogs.com:80/"/>

</port>

</service>

</definitions>

 

That’s a lot of work for a single operation that accepts two string parameters and returns a Boolean and a string. I had to do all this work because WSDL makes no simplifying assumptions. For this simple case, creating the WSDL by hand is possible (I just did it) but difficult. That’s why most WSDL is generated by automated tools. For simple services you can start from a generated WSDL file and tweak it slightly, but beyond that you’re at the mercy of your tools.

 

The tools then become the real story. They abstract away the service, the binding, the portType, the messages, the schema, and even the network itself. If you are coding in a statically typed language, like C# or Java, you can have all this WSDL generated for you at the push of a button. Generally all you have to do is select which methods in which classes you want exposed as a web service. Almost all WSDL today is generated by tools and can only be understood by tools. After some setup, the client’s tools can call your methods through a web service and it looks like they’re calling native-language methods.

 

What’s not to like? How is this different from a compiler, which turns high-level concepts into machine code? What ought to concern you is that you’re moving further and further away from the Web.

 

Generated SOAP/WSDL interfaces also tend to be brittle. Different Big Web Services stacks interpret the standards differently, generate slightly different WSDL files, and can’t understand each other’s messages. The result is that clients are tightly bound to servers that use the same stack. Web services ought to be loosely coupled and resilient: they’re being exposed across a network to clients who might be using a totally different set of software tools. The web has already proven its ability to meet this goal.

 

The Resource-Oriented Alternative

 

WSDL serves two main purposes in real web services:

  1. It describes which interface (which RPC-style functions) the service exposes.
  2. It describes the representation formats: the schemas for the XML documents the service accepts and sends.

 

In resource-oriented services, these functions are often unnecessary or can be handled with much simpler standards.

 

WSDL: Encourages service designers to group many custom operations into a single “endpoint” that doesn’t respond to any uniform interface. Since all this functionality is accessible through overloaded POST on a single endpoint URI, the resulting service isn’t addressable.

WADL: An alternative service description language that’s more in line with the Web. Rather than describing RPC-style function calls, it describes resources that respond to HTTP’s uniform interface.

WSDL: Has no provisions for defining hypertext links, beyond the anyURI data type built into XML Schema. SOAP services aren’t well connected. How could they be, when an entire service is hidden behind a single address?

WADL: Solves this problem, describing how one resource links to another.

 

If all you’re doing is serializing a data structure for transport across the wire (as happens in the weblogs.com ping service), consider JSON as your representation format. You can represent fairly complex data structures in JSON without defining a schema; you don’t even need to use XML.
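For comparison with the WSDL above, here is how the pingResult structure might travel as JSON. The field names come from the schema in the WSDL; the function name and the message text are made up for illustration.

```python
import json


def ping_result_to_json(flerror: bool, message: str) -> str:
    """Serialize a pingResult (a Boolean plus a string) with no schema at all."""
    return json.dumps({"flerror": flerror, "message": message})


encoded = ping_result_to_json(False, "Thanks for the ping.")
decoded = json.loads(encoded)
```

The client gets back an ordinary dictionary with a Boolean and a string, with no complexType definition required.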

 

UDDI

 

A full description of UDDI is way beyond the scope of this book. Think of it as a yellow pages for WSDL, a way for clients to look up a service that fits their needs. UDDI is even more complex than WSDL. The UDDI specification defines a four-tier hierarchical XML schema that provides metadata about web service descriptions. The data structure types you’ll find in a UDDI registry are a businessEntity, a businessService, a bindingTemplate, and a tModel.

 

The Resource-Oriented Alternative

 

The closest RESTful equivalents to UDDI are the search engines, like Google, Yahoo!, and MSN. These help (human) clients find the resources they’re looking for. They take advantage of the uniform interface and common data formats promoted by REST. Even this isn’t perfect: spammers try to game the search engines, and sometimes they succeed. But think of the value of search engines and you’ll see the promise of UDDI, even if its complexity turns you off.

 

As RESTful web services grow in popularity and become better-connected (both internally and to the Web at large), something like today’s search engines may fulfill the promise of the public UDDI registry. Instead of searching for services that expose certain APIs, we’ll search for resources that accept or serve representations with certain semantics. Again, this is speculation. Right now, the public directories of web services (I list a few in Appendix A) are oriented toward use by human beings.

 

Security

 

But Big Web Services security involves more than the WS-Security standard. Two examples:

  • Signatures can enable nonrepudiation. It’s possible to prove who the originator of a given message was long after it was sent, and that the message was not modified after it was received. These concepts are important in contracts and checks.
  • Federation enables a third party to broker trust of identities. This would allow a travel broker to verify that a given person works for one of the travel broker’s customers: this might affect billing and discounts.

 

Suffice it to say that security concepts are much better specified and deployed in SOAP-based protocols than in native HTTP protocols. That doesn’t mean that this gap can’t be closed, that SOAP stickers can’t be ported to HTTP stickers, or that one-off solutions aren’t possible without SOAP. Right now, though, SOAP has many security-related stickers that HTTP doesn’t have, and these stickers are useful when implementing applications like the travel broker.

 

The Resource-Oriented Alternative

 

When all is said and done, your best protection may be the fact that resource-oriented architectures promote simplicity and uniformity. When you’re trying to build a secure application, neither complexity nor a large number of interfaces turn out to be advantages.

 

Reliable Messaging

 

The WS-ReliableMessaging standard tries to provide assurances to an application that a sequence of messages will be delivered AtMostOnce, AtLeastOnce, ExactlyOnce, or InOrder. It defines some new headers (that is, stickers on the envelope) that track sequence identifiers and message numbers, and some retry logic.

 

The Resource-Oriented Alternative

 

Again, these are areas where the specification and implementation for SOAP-based protocols are further advanced than those for native HTTP.

 

If a GET, HEAD, PUT, or DELETE operation doesn’t go through, or you don’t know whether or not it went through, the appropriate course of action is to just retry the request.

 

The only nonidempotent method is POST, the one that SOAP uses. SOAP solves the reliable delivery problem from scratch, by defining extra stickers. In a RESTful application, if you want reliable messaging for all operations, I recommend implementing POST Once Exactly (covered back in Chapter 9) or getting rid of POST altogether.
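The retry advice can be sketched as a small wrapper. This is an illustration only: `send` stands for any zero-argument callable that performs the request, and treating `ConnectionError` as the sign of a lost request is an assumption.

```python
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE"}


def send_with_retry(method, send, attempts=3):
    """Call send(), retrying on ConnectionError only for idempotent methods."""
    tries = attempts if method.upper() in IDEMPOTENT_METHODS else 1
    for attempt in range(tries):
        try:
            return send()
        except ConnectionError:
            if attempt == tries - 1:
                raise  # out of attempts, or the method is not safe to repeat
```

A POST gets exactly one attempt here, which is the gap that POST Once Exactly or a PUT-only design closes.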

 

Transactions

 

Big Service Transactions

Description

WS-AtomicTransaction

The WS-AtomicTransaction standard specifies a common algorithm called a two-phase commit. In general, this is only wise between parties that trust one another, but it’s the easiest to implement, it falls within the scope of existing products, and therefore it’s the one that is most widely deployed.

WS-BusinessActivity

WS-BusinessActivity more closely follows how businesses actually work. If you deposit a check from a foreign bank, your bank may put a hold on it and seek confirmation from the foreign bank. If it hears about a problem before the hold expires, it rolls back the transaction. Otherwise, it accepts the check. If it happens to hear about a problem after it’s committed the transaction, it creates a compensating transaction to undo the deposit. The focus is on undoing mistakes in an auditable way, not just preventing them from happening.

 

The Resource-Oriented Alternative

 

This machinery is usually not necessary at all. In Chapter 8 I implemented a transaction system by exposing the transactions as resources.

 

To go back to the example of the check from a foreign bank: your bank might create a “job” resource on the foreign bank’s web service, asking if the check is valid. After a week with no updates to that resource, your bank might provisionally accept the check. If two days later the foreign bank updates the “job” resource saying that the check is bad, your bank can create a compensating transaction, possibly triggering an overdraft and other alerts. You probably won’t need to create a complex scenario like this, but you can see how patterns I’ve already demonstrated can be used to implement these new ideas.

 

BPEL, ESB, and SOA

 

 

 

 

 

CHAPTER 11 -  Ajax Applications as REST Clients

 

 

 

From AJAX to Ajax

 

Every introduction to Ajax will tell you that it used to be AJAX, an acronym for Asynchronous JavaScript And XML. The acronym has been decommissioned and now Ajax is just a word. It’s worth spending a little time exploring why this happened. Programmers didn’t suddenly lose interest in acronyms. AJAX had to be abandoned because what it says isn’t necessarily true. Ajax is an architectural style that doesn’t need to involve JavaScript or XML.

 

JavaScript

The JavaScript in AJAX actually means whatever browser-side language is making the HTTP requests. This is usually JavaScript, but it can be any language the browser knows how to interpret. Other possibilities are ActionScript (running within a Flash application), Java (running within an applet), and browser-specific languages like Internet Explorer’s VBScript.

XML

XML actually means whatever representation format the web service is sending. This can be any format, so long as the browser side can understand it. Again, this is usually XML, because it’s easy for browsers to parse, and because web services tend to serve XML representations. But JSON is also very common, and it can also be HTML, plain text, or image files: anything the browser can handle or the browser-side script can parse.

 

The Ajax Architecture

 

 

  1. A user, controlling a browser, makes a request for the main URI of an application.
  2. The server serves a web page that contains an embedded script.
  3. The browser renders the web page and either runs the script, or waits for the user to trigger one of the script’s actions with a keyboard or mouse operation.
  4. The script makes an asynchronous HTTP request to some URI on the server. The user can do other things while the request is being made, and is probably not even aware that the request is happening.
  5. The script parses the HTTP response and uses the data to modify the user’s view. This might mean using DOM methods to change the tag structure of the original HTML page. It might mean modifying what’s displayed inside a Flash application or Java applet.
  6. From the user’s point of view, it looks like the GUI just modified itself.

 

 

A standard web application has the same GUI elements but a simpler event loop. Every click or form submission causes a refresh of the entire view. The browser gets a new HTML page and constructs a whole new set of GUI elements. In an Ajax application, the GUI can change a little bit at a time. This saves bandwidth and reduces the psychological effects on the end user. The application appears to change incrementally instead of in sudden jerks.

 

The downside is that every application state has the same URI: the first one the end user visited. Addressability and statelessness are destroyed. The underlying web service may be addressable and stateless, but the end user can no longer bookmark a particular state, and the browser’s “Back” button stops working the way it should. The application is no longer on the Web, any more than a SOAP+WSDL web service that only exposes a single URI is on the Web. I discuss what to do about this next.

 

The Advantages of Ajax

 

This client also never explicitly parses the XML response from the del.icio.us web service. A web browser has an XML parser built in, and XMLHttpRequest automatically parses into a DOM object any XML document that comes in on a web service response. You access the DOM object through the XMLHttpRequest.responseXML member. The DOM standard for web browsers defines the API for this object: you can iterate over its children, search it with methods like getElementsByTagName, or hit it with XPath expressions.
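The same DOM calls exist outside the browser, which makes them easy to try out. The sketch below uses Python's standard xml.dom.minidom module. The sample document is hypothetical, shaped only loosely like a list of bookmarks, and the function name is invented for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical response body; the element and attribute names are assumptions.
SAMPLE = (
    '<posts user="alice">'
    '<post href="http://www.example.com/" description="Example"/>'
    '<post href="http://www.example.org/" description="Another"/>'
    '</posts>'
)


def bookmark_uris(xml_text):
    """Return the href attribute of every <post> element in the document."""
    document = parseString(xml_text)
    return [post.getAttribute("href") for post in document.getElementsByTagName("post")]
```

In a browser the parsing step is already done by the time the script sees responseXML, so only the getElementsByTagName part would remain.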

 

The Disadvantages of Ajax

 

So why don’t we see these problems all the time in Ajax applications? Because right now, most Ajax applications are served from the same domain names as the web services they access. This is the fundamental difference between JavaScript web service clients and clients written in other languages: the client and the server are usually written by the same people and served from the same domain.

 

 

 

 

REST Goes Better

 

 

 

Making the Request

 

To build an HTTP request you need to create an XMLHttpRequest object. This seemingly simple task is actually one of the major points of difference between the web browsers. This simple constructor works in Mozilla-family browsers like Firefox:

 

request = new XMLHttpRequest();

 

Handling the Response

 

Eventually the request will complete and the browser will call your handler function for the last time. At this point your XMLHttpRequest instance gains some new and interesting abilities:

• The status property contains the numeric status code for the request.

• The responseXML property contains a preparsed DOM object representing the response document, assuming it was served as XML and the browser can parse it. HTML, even XHTML, will not be parsed into responseXML, unless the document was served as an XML media type like application/xml or application/xhtml+xml.

• The responseText property contains the response document as a raw string, which is useful when it’s JSON or some other non-XML format.

• Passing the name of an HTTP header into the getResponseHeader method looks up the value of that header.

 

JSON

 

 

 

 

 

Don’t Bogart the Benefits of REST

 

The underlying cause is the same thing that gives Ajax applications their polished look. Ajax applications disconnect the end user from the HTTP request-response cycle. When you visit the URI of an Ajax application, you leave the Web. From that point on you’re using a GUI application that makes HTTP requests for you, behind the scenes, and folds the data back into the GUI. The GUI application just happens to be running in the same piece of software you use to browse the Web. But even an Ajax application can give its users the benefits of REST, by incorporating them into the user interface.

 

But Google Maps also uses Ajax to maintain a “permalink” for whatever point on the globe you’re currently at. This URI is kept not in your browser’s address bar but in an a tag in the HTML document (see Figure 11-1). It represents all the information Google Maps needs to identify a section of the globe: latitude, longitude, and map scale. It’s a new entry point into the Ajax application. This link is the Google Maps equivalent of your browser’s address bar.
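The permalink idea amounts to encoding the application state in a URI and being able to read it back. Here is a minimal sketch of that round trip. The base URI and the parameter names lat, lon, and zoom are hypothetical, and they are not the ones Google Maps uses.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URI and parameter names, chosen only for illustration.
BASE = "http://maps.example.com/"


def make_permalink(lat, lon, zoom):
    """Encode the three pieces of application state into one addressable URI."""
    return BASE + "?" + urlencode({"lat": lat, "lon": lon, "zoom": zoom})


def read_permalink(uri):
    """Recover (lat, lon, zoom) from a permalink built by make_permalink."""
    query = parse_qs(urlsplit(uri).query)
    return float(query["lat"][0]), float(query["lon"][0]), int(query["zoom"][0])
```

An Ajax application would refresh such a link every time the view changes, so the user always has something to bookmark or share.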

 

 

Addressability, destroyed by Ajax but added back by good application design, has allowed communities like Google Sightseeing (http://googlesightseeing.com/) to grow up around the Google Maps application. Your Ajax applications can give statelessness back by reproducing the functionality of the browser’s back and forward buttons. You don’t have to reproduce the browser’s behavior slavishly. The point is to let the end user move back and forth in his application state, instead of having to start from the beginning of a complex operation if he makes a mistake or gets lost.

 

Cross-Browser Issues and Ajax Libraries

 

 

 

 

Prototype

 

Prototype (http://prototype.conio.net/) introduces three classes for making HTTP requests:

• Ajax.Request: a wrapper around XMLHttpRequest that takes care of cross-browser issues and can call different JavaScript functions on the request’s success or failure. The actual XMLHttpRequest object is available as the transport member of the Request object, so responseXML is reached through request.transport.responseXML.

• Ajax.Updater: a subclass of Request that makes an HTTP request and inserts the response document into a specified element of the DOM.

• Ajax.PeriodicalUpdater, which makes the same HTTP request at intervals, refreshing a DOM element each time.

 

Dojo

 

 

 

Subverting the Browser Security Model

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

CHAPTER 12 -  Frameworks for RESTful Services

 

Django by Jacob Kaplan-Moss

 

You can apply the generic ROA design procedure to turn a dataset into a set of RESTful resources and implement those resources directly in Django.

I’ll show you how I implemented a social bookmarking service in Django, along the lines of the Rails implementation in Chapter 7. Since this book isn’t intended to be a Django tutorial, I’m leaving out most of the intermediary steps of Django development so I can focus on the parts that specifically apply to RESTful web services.

 

Define Resources and Give Them URIs

 

Django makes you design your URIs from scratch. Django’s philosophy is that the URI is an important part of a web application’s user interface, and should not be automatically generated. This fits in with the ROA philosophy, since a resource’s only interface elements are its URI and the uniform interface of HTTP.

 

 

Implement Resources as Django Views

 

 

 

The bookmark list view

 

Django can also serialize database rows into a JSON data structure or an ActiveRecord-like XML representation: switching to the XML representation would be as easy as changing the serializer type in the third line of the view, and the mimetype in the last line. Django’s default JSON output is relatively straightforward. Example 12-13 shows what it does to a row from my bookmarks table.

 

 

 

APPENDIX A -  Some Resources for REST and Some RESTful Resources

 

 

Standards and Guides

 

HTTP and URI

• The HTTP standard (RFC 2616) (http://www.w3.org/Protocols/rfc2616/rfc2616.html).

• The URI standard (RFC 3986) (http://www.ietf.org/rfc/rfc3986.txt).

• The WebDAV standard (RFC 2518) (http://www.webdav.org/specs/rfc2518.html), if you’re interested in extensions to HTTP’s uniform interface.

• The Architecture of the World Wide Web introduces concepts like resources, representations, and the idea of naming resources with URIs (http://www.w3.org/2001/tag/webarch/).

• Universal Resource Identifiers -- Axioms of Web Architecture (http://www.w3.org/DesignIssues/Axioms).

RESTful Architectures

  1. The Fielding dissertation: Architectural Styles and the Design of Network-Based Software Architectures (http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm).
  2. The very active rest-discuss mailing list (http://tech.groups.yahoo.com/group/rest-discuss/).
  3. The RESTwiki (http://rest.blueoxen.net/).
  4. Joe Gregorio’s “REST and WS (http://bitworking.org/news/125/REST-and-WS)” compares the technologies of REST to those of the WS-* stack while showing how to create a RESTful web service interface; if you don’t care about the comparison part, try “How to create a REST Protocol (http://bitworking.org/news/How_to_create_a_REST_Protocol)” by the same author.
  5. Joe Gregorio has also written a series of articles on REST for XML.com (http://www.xml.com/pub/au/225).
  6. Duncan Cragg’s The REST Dialogues (http://duncan-cragg.org/blog/post/gettingdata-rest-dialogues/): This series of weblog entries is a thought experiment that reenvisions an RPC-style application as a web of interconnected resources (though it doesn’t use those terms).
  7. Paul Prescod’s Common REST Mistakes (http://www.prescod.net/rest/mistakes/).

Hypermedia Formats

 

Frameworks for RESTful Development

 

Weblogs on REST

The REST community is full of eloquent practitioners who argue for and explain RESTful architectures on their weblogs. I’ll give out links to just a few. You can find more, in true REST fashion, by following links.

• Mark Baker (http://www.markbaker.ca/blog/)

• Benjamin Carlyle (http://www.soundadvice.id.au/blog/)

• Joe Gregorio (http://bitworking.org/news/)

• Pete Lacey (http://wanderingbarque.com/nonintersecting/)

• Mark Nottingham (http://www.mnot.net/blog/)

• RESTful Web Services co-author Sam Ruby (http://www.intertwingly.net/blog/)

Services to use

 

 

 

APPENDIX B -  The HTTP Response Code Top 42

 

 

Many web services use HTTP status codes incorrectly. The human web hardly uses them at all. Human beings discover what a document means by reading it, not by looking at an attached numeric code. You’ll see “404” in an HTML page that talks about a missing document, but your attention is on the phrase “missing document,” not on the number. And even the “404” you see is part of the HTML page, put there for human convenience: your browser doesn’t show you the underlying 404 response code.

 

The problem is that there are 41 official response codes, and standards like WebDAV add even more. Many of the codes are rarely used, two of them are never used, and some are only distinguishable from one another by careful hairsplitting. To someone used to the human web (that’s all of us), the variety of response codes can be bewildering.

 

 

Three to Seven Status Codes: The Bare Minimum

 

200 (“OK”)

Everything’s fine. The document in the entity-body, if any, is a representation of some resource.

400 (“Bad Request”)

There’s a problem on the client side. The document in the entity-body, if any, is an error message. Hopefully the client can understand the error message and use it to fix the problem.

500 (“Internal Server Error”)

There’s a problem on the server side. The document in the entity-body, if any, is an error message. The error message probably won’t do much good, since the client can’t fix a server problem.

301 (“Moved Permanently”)

Sent when the client triggers some action that causes the URI of a resource to change. Also sent if a client requests the old URI.

404 (“Not Found”) and 410 (“Gone”)

Sent when the client requests a URI that doesn’t map to any resource. 404 is used when the server has no clue what the client is asking for. 410 is used when the server knows there used to be a resource there, but there isn’t anymore.

409 (“Conflict”)

Sent when the client tries to perform an operation that would leave one or more resources in an inconsistent state.
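A client that knows only this bare minimum can still react sensibly to any status code by falling back on the code's hundreds digit. The sketch below is one way to write that down. The function name and the wording of the outcomes are assumptions made for the example.

```python
def interpret(status):
    """Map a status code onto the handful of outcomes described above."""
    specific = {
        301: "resource moved; use the URI in Location",
        404: "no resource at this URI",
        409: "request would leave a resource inconsistent",
        410: "resource used to exist but is gone",
    }
    if status in specific:
        return specific[status]
    # Anything else is handled like the generic code for its class.
    if 200 <= status < 300:
        return "ok"
    if 400 <= status < 500:
        return "client error; fix the request"
    if 500 <= status < 600:
        return "server error; nothing the client can fix"
    return "unhandled"
```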

 

IMPORTANT NOTE:  SOAP web services use only the status codes 200 (“OK”) and 500 (“Internal Server Error”).

 

<<LIST OF ALL http RESPONSE CODES AND HOW THEY CAN BE USED TO MAKE REST WEBSERVICES EASIER AND RICHER)

 

APPENDIX C -  The HTTP Header Top Infinity

 

 

Custom headers are the most common way of extending HTTP. So long as client and server agree on what the headers mean, you can send any information you like along with a request or response. The guidelines are: don’t reinvent an existing header, don’t put things in headers that belong in the entity-body, and follow the naming convention.

 

The names of custom headers should start with the string “X-,” meaning “extension.” The convention makes it clear that your headers are extension headers, and avoids any conflict with future official HTTP headers.

 

Amazon’s S3, covered in Chapter 3, is a good example of a service that defines custom headers. Not only does it define headers like X-amz-acl and X-amz-date, it specifies that S3 clients can send any header whose name begins with “X-amz-meta-.” The header name and value are associated with an object as a key-value pair, letting you store arbitrary metadata with your buckets and objects. This is a naming convention inside a naming convention.
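On the receiving side, a prefix convention like this reduces to a filter over the header names. The following sketch shows the idea with a plain dictionary standing in for the request headers. It is an illustration of the convention and not S3's actual code, and the function name is invented.

```python
META_PREFIX = "x-amz-meta-"


def extract_metadata(headers):
    """Collect key-value metadata from headers whose names start with the prefix."""
    metadata = {}
    for name, value in headers.items():
        if name.lower().startswith(META_PREFIX):  # header names are case-insensitive
            metadata[name[len(META_PREFIX):].lower()] = value
    return metadata
```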

 

Standard Headers

 

Header

Type

Importance

Description

Accept

Request

Medium

The client sends an Accept header to tell the server what data formats it would prefer the server use in its representations. One client might want a JSON representation; another might want an RDF representation of the same data.

Hiding this information inside the HTTP headers is a good idea for web browsers, but it shouldn’t be the only solution for web service clients. I recommend exposing different representations using different URIs. This doesn’t mean you have to impose crude rules like appending .html to the URI for an HTML representation (though that’s what Rails does). But I think the information should be in the URI somehow. If you want to support Accept on top of this, that’s great (Rails does this too).
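For a server that does want to honor Accept, the core of the work is ranking the client's media types by their q values and picking the best one it can serve. This is a deliberately simplified sketch: it assumes well-formed input and ignores wildcards such as */*, and the function name is invented for the example.

```python
def preferred_media_type(accept_header, available):
    """Pick the available media type the client ranks highest, or None."""
    ranked = []
    for position, item in enumerate(accept_header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        quality = 1.0  # the default when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                quality = float(param[2:])
        # Sort by descending quality, then by the order the client listed them.
        ranked.append((-quality, position, parts[0]))
    for neg_quality, _, media_type in sorted(ranked):
        if neg_quality < 0 and media_type in available:  # q=0 means "never send this"
            return media_type
    return None
```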

Accept-Charset

Request

Low

The client sends an Accept-Charset header to tell the server what character set it would like the server to use in its representations. One client might want the representation of a resource containing Japanese text to be encoded in UTF-8; another might want a Shift-JIS encoding of the same data.

As I said in Chapter 8, your headaches will be fewer if you pick a Unicode encoding (either UTF-8 or UTF-16) and stick with it. Any modern client should be able to handle these encodings.

Accept-Encoding

Request

Medium to high

The client sends an Accept-Encoding header to tell the server that it can save some bandwidth by compressing the response entity-body with a well-known algorithm like compress or gzip. Despite the name, this has nothing to do with character set encoding; that’s Accept-Charset.

Technically, Accept-Encoding could be used to apply some other kind of transform to the entity-body: applying rot13 encryption to all of its text, maybe. In practice, it’s only used to compress data.
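Server-side support for compression can be quite small. The sketch below compresses the body only when the client listed gzip, and reports which encoding was applied so the caller can set the Content-Encoding response header. It ignores q values, and the function name is an assumption for the example.

```python
import gzip


def encode_body(body, accept_encoding):
    """Return (body, content_encoding); compress only if the client listed gzip."""
    offered = [token.split(";")[0].strip().lower() for token in accept_encoding.split(",")]
    if "gzip" in offered:
        return gzip.compress(body), "gzip"
    return body, None  # send the entity-body unchanged, with no Content-Encoding
```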

Accept-Language

Request

Low

The client sends an Accept-Language header to tell the server what human language it would like the server to use in its representations. For an example, see Chapter 4 and its discussion of a press release that’s available in both English and Spanish.

As with media types, I think that a web service should expose different-language representations of a given resource with different URIs. Supporting Accept-Language on top of this is a bonus.

Accept-Ranges

Response

Low to medium

The server sends this header to indicate that it supports partial HTTP GET (see Chapter 8) for the requested URI. A client can make a HEAD request to a URI, parse the value of this response header, and then send a GET request to the same URI, providing an appropriate Range header.

Age

Response

Low

If the response entity-body does not come fresh from the server, the Age header is a measure of how long ago it left the server. This header is usually set by HTTP caches, so that the client knows it might be getting an old copy of a representation.

Allow

Response

Potentially high, currently low

I discuss this header in “HEAD and OPTIONS”, in Chapter 4. It’s sent in response to an OPTIONS request and tells the client which subset of the uniform interface a particular URI exposes. This header will become much more important if people ever start using OPTIONS.

Authorization

Request

Very high

This request header contains authorization credentials, such as a username and password, which the client has encoded according to some agreed-upon scheme. The server decodes the credentials and decides whether or not to carry out the request.

In theory, this is the only authorization header anyone should ever need (except for Proxy-Authorization, which works on a different level), because it’s extensible. The most common schemes are HTTP Basic and HTTP Digest, but the scheme can be anything, so long as both client and server understand it. In practice, HTTP itself has been extended, with unofficial request headers like X-WSSE that work on top of Authorization. See the X-WSSE entry below for the reason why.

Cache-Control

Request and response header

Medium

This header contains a directive to any caches between the client and the server (including any caches on the client or server themselves). It spells out the rules for how the data should be cached and when it should be dumped. I cover some simple caching rules and recipes in “Caching” in Chapter 8.

Connection

Response

Low

Most of an HTTP response is a communication from the server to the client. Intermediaries like proxies can look at the response, but nothing in there is aimed at them. But a server can insert extra headers that are aimed at a proxy, and one proxy can insert headers that are aimed at the next proxy in a chain. When this happens, the special headers are named in the Connection header. These headers apply to the TCP connection between one machine and another, not to the HTTP connection between server and client. Before passing on the response, the proxy is supposed to remove the special headers and the Connection header itself. Of course, it may add its own special communications, and a new Connection header, if it wants.

Here’s a quick example, since this isn’t terribly relevant to this book. The server might send these three HTTP headers in a response that goes through a proxy:

Content-Type: text/plain

X-Proxy-Directive: Deliver this as fast as you can!

Connection: X-Proxy-Directive

The proxy would remove X-Proxy-Directive and Connection, and send the one remaining header to the client:

Content-Type: text/plain

If you’re writing a client and not using proxies, the only value you’re likely to see for Connection is “close.” That just says that the server will close the TCP connection after completing this request, which is probably what you expected anyway.
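The proxy's side of this arrangement can be sketched as a small filter. This is only an illustration with a plain dictionary standing in for the response headers, and the function name is invented for the example.

```python
def strip_hop_by_hop(headers):
    """Drop Connection and every header it names before forwarding a response."""
    connection = next((v for k, v in headers.items() if k.lower() == "connection"), "")
    named = {token.strip().lower() for token in connection.split(",") if token.strip()}
    named.add("connection")  # the Connection header itself is never forwarded
    return {k: v for k, v in headers.items() if k.lower() not in named}
```

Applied to the three headers in the example, it leaves only Content-Type for the client.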

Content-Encoding

Response

Medium to high

This response header is the counterpart to the request header Accept-Encoding. The request header asks the server to compress the entity-body using a certain algorithm. This header tells the client which algorithm, if any, the server actually used.

Content-Language

Response

Medium

This response header is the counterpart to the Accept-Language request header, or to a corresponding variable set in a resource’s URI. It specifies the natural language a human must understand to get meaning out of the entity-body.

There may be multiple languages listed here. If the entity-body is a movie in Mandarin with Japanese subtitles, the value for Content-Language might be “zh-guoyu,ja.” If one English phrase shows up in the movie, “en” would probably not show up in the Content-Language header.

Content-Length

Response

High

This response header gives the size of the entity-body in bytes. This is important for two reasons: first, a client can read this and prepare for a small entity-body or a large one. Second, a client can make a HEAD request to find out how large the entity-body is, without actually requesting it. The value of Content-Length might affect the client’s decision to fetch the entire entity-body, fetch part of it with Range, or not fetch it at all.

Content-Location

Response

Low

This header tells the client the canonical URI of the resource it requested. Unlike with the value of the Location header, this is purely informative. The client is not expected to start using the new URI.

This is mainly useful for services that assign different URIs to different representations of the same resource. If the client wants to link to the generic version of the resource, independent of any particular representation, it can use the URI given in Content-Location. So if you request /releases/104.html.en, specifying a data format and a language, you might get back a response that includes /releases/104 as the value for Content-Location.

Content-MD5

Response

Low to medium

This is a cryptographic checksum of the entity-body. The client can use this to check whether or not the entity-body was corrupted in transit. An attacker (such as a man-in-the-middle) can change the entity-body and change the Content-MD5 header to match, so it’s no good for security, just error detection.
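Computing and checking the header value takes only the standard library. The header carries the Base64 encoding of the 16-byte MD5 digest of the entity-body. The two function names below are assumptions made for this sketch.

```python
import base64
import hashlib


def content_md5(body):
    """Base64 of the 16-byte MD5 digest, the form the Content-MD5 header carries."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def body_intact(body, header_value):
    """True if the received body still matches the checksum the server sent."""
    return content_md5(body) == header_value
```

As the entry says, a match only shows the body was not damaged by accident; it proves nothing about who sent it.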

Content-Range

Response

Low to medium

When the client makes a partial GET request with the Range request header, this response header says what part of the representation the client is getting.
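The relationship between Range and Content-Range is easiest to see on a tiny body. The sketch below handles only the simplest form of the request header, a single bytes=first-last range, and the function name is invented for the example.

```python
def partial_response(body, range_header):
    """Serve one 'bytes=first-last' range; return (status, chunk, content_range)."""
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only byte ranges are handled in this sketch")
    first_text, _, last_text = spec.partition("-")
    first = int(first_text)
    last = min(int(last_text), len(body) - 1)  # clamp to the end of the body
    chunk = body[first:last + 1]               # byte ranges are inclusive
    return 206, chunk, f"bytes {first}-{last}/{len(body)}"
```

The third element of the result is the value a server would send in Content-Range: the range delivered, then the total size of the representation.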

Content-Type

Response

Very high

Definitely the most famous response header. This header tells the client what kind of thing the entity-body is. On the human web, a web browser uses this to decide if it can display the entity-body inline, and which external program it must run if not. On the programmable web, a web service client usually uses this to decide which parser to apply to the entity-body.
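A web service client's use of Content-Type can be sketched as a small dispatch on the media type. The choice of parsers here is an assumption for the example: JSON goes to the standard json module, XML media types go to xml.dom.minidom, and anything else is returned as the raw text.

```python
import json
from xml.dom.minidom import parseString


def parse_entity_body(content_type, body):
    """Choose a parser from the media type; fall back to the raw text."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body)
    if media_type.endswith("/xml") or media_type.endswith("+xml"):
        return parseString(body)
    return body
```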

Date

Request and response

High for request, required for response

As a request header, this represents the time on the client at the time the request was sent. As a response header, it represents the time on the server at the time the request was fulfilled. As a response header, Date is used by caches.

ETag

Response

Very high

The value of ETag is an opaque string designating a specific version of a representation. Whenever the representation changes, the ETag should also change. Whenever possible, this header ought to be sent in response to GET requests. Clients can use the value of ETag in future conditional GET requests, as the value of If-None-Match. If the representation hasn’t changed, the ETag hasn’t changed either, and the server can save time and bandwidth by not sending the representation again.

The main driver of conditional GET requests is the simpler Last-Modified response header, and its request counterpart If-Modified-Since. The main purpose of ETag is to provide a second line of defense. If a representation changes twice in one second, it will take on only one value for Last-Modified, but two different values for ETag.

Expect

Request

Medium, but rarely used (as of time of writing).

This header is used to signal a look-before-you-leap (LBYL) request (covered in Chapter 8). The server will send the response code 100 (“Continue”) if the client should “leap” ahead and make the real request. It will send the response code 417 (“Expectation Failed”) if the client should not “leap.”

Expires

Response

Medium

This header tells the client, or a proxy between the server and client, that it may cache the response (not just the entity-body!) until a certain time. Even a conditional HTTP GET makes an HTTP connection and takes time and resources. By paying attention to Expires, a client can avoid the need to make any HTTP requests at all, at least for a while. I cover caching briefly in Chapter 8.

The client should take the value of Expires as a rough guide, not as a promise that the entity-body won’t change until that time.
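On the client side, honoring Expires comes down to one date comparison before deciding whether to go to the network. This sketch parses the header with the standard email.utils module. The function name is invented, and treating an unparseable value as already expired is a choice made for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(expires_header, now):
    """True if a cached response may still be used without contacting the server."""
    try:
        return now < parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return False  # an unparseable date is treated as already expired


noon = datetime(2007, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
usable = is_fresh("Tue, 01 May 2007 13:00:00 GMT", noon)
```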

From

Request

Very low

This header works just like the From header in an email message. It gives an email address associated with the person making the request. This is never used on the human web because of privacy concerns, and it’s used even less on the programmable web, where the clients aren’t under the control of human beings. You might want to use it as an extension to User-Agent.

Host

Request

Required

This header contains the domain name part of the URI. If a client makes a GET request for http://www.example.com/page.html, then the URI path is /page.html and the value of the Host header is “www.example.com” or “www.example.com:80.”

From the client’s point of view, this may seem like a strange header to require. It’s required because an HTTP 1.1 server can host any number of domains on a single IP address. This feature is called “name-based virtual hosting,” and it saves someone who owns multiple domain names from having to buy a separate computer and/or network card for each one. The problem is that an HTTP client sends requests to an IP address, not to a domain name. Without the Host header, the server has no idea which of its virtual hosts is the target of the client’s request.
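The split between the path and the Host header can be shown with the standard urllib.parse module. This is a minimal sketch of what an HTTP 1.1 client does with a URI before sending a request, and the function name is invented for the example.

```python
from urllib.parse import urlsplit


def request_line_and_host(method, uri):
    """Split a URI into the request line and the Host header a client sends."""
    parts = urlsplit(uri)
    path = parts.path or "/"          # an empty path is requested as "/"
    if parts.query:
        path += "?" + parts.query
    return f"{method} {path} HTTP/1.1", f"Host: {parts.netloc}"
```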

If-Match

Request

Medium

This header is best described in terms of other headers. It’s used like If-Unmodified-Since (described next), to make HTTP actions other than GET conditional. But where If-Unmodified-Since takes a time as its value, this header takes an ETag as its value. Tersely, this header is to If-None-Match and ETag as If-Unmodified-Since is to If-Modified-Since and Last-Modified.

If-Modified-Since

Request

Very high

This request header is the backbone of conditional HTTP GET. Its value is a previous value of the Last-Modified response header, obtained from a previous request to this URI. If the resource has changed since that last request, its new Last-Modified date is more recent than the one the client sent. That means that the condition If-Modified-Since is met, and the server sends the new entity-body. If the resource has not changed, the Last-Modified date is the same as it was, and the condition If-Modified-Since fails. The server sends a response code of 304 (“Not Modified”) and no entity-body. That is, conditional HTTP GET succeeds if this condition fails.

Since Last-Modified is only accurate to within one second, conditional HTTP GET can occasionally give the wrong result if it relies only on If-Modified-Since. This is the main reason why we also use ETag and If-None-Match.

If-None-Match

Request

Very high

This header is also used in conditional HTTP GET. Its value is a previous value of the ETag response header, obtained from a previous request to this URI. If the ETag has changed since that last request, the condition If-None-Match succeeds and the server sends the new entity-body. If the ETag is the same as before, the condition fails, and the server sends a response code of 304 (“Not Modified”) with no entity-body.
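The server's half of this exchange fits in a few lines. The sketch below compares a single ETag and ignores the list and wildcard forms that If-None-Match also allows. The function name and the tuple it returns are assumptions made for the example.

```python
def conditional_get(request_headers, current_etag, body):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    if request_headers.get("If-None-Match") == current_etag:
        return 304, {"ETag": current_etag}, b""   # the client's copy is still good
    return 200, {"ETag": current_etag}, body      # send the fresh representation
```

A client stores the ETag from the 200 response and sends it back as If-None-Match the next time it asks for the same URI.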

If-Range

Request

Low

This header is used to make a conditional partial GET request. Its value is the ETag or Last-Modified value from an earlier response for the same URI, and it is sent together with a Range header. If the representation has not changed since then, the server sends just the requested range, with a response code of 206 (“Partial Content”). If it has changed, the server ignores the Range header and sends the entire new representation, with a response code of 200 (“OK”).

Conditional partial GET is not used very often. Its main use is resuming an interrupted download of a large representation without stitching together pieces of two different versions.

If-Unmodified-Since

Request

Medium

Normally a client uses the value of the response header Last-Modified as the value of the request header If-Modified-Since to perform a conditional GET request. This header also takes the value of Last-Modified, but it’s usually used for making HTTP actions other than GET into conditional actions.

Let’s say you and many other people are interested in modifying a particular resource. You fetch a representation, modify it, and send it back with a PUT request. But someone else has modified it in the meantime, and you either get a response code of 409 (“Conflict”), or you put the resource into a state you didn’t intend.

If you make your PUT request conditional on If-Unmodified-Since, then if someone else has changed the resource your request will always get a response code of 412 (“Precondition Failed”). You can refetch the representation and decide what to do with the new version that someone else modified.

This header can be used with GET, too; see the Range header for an example.
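The server-side check behind a conditional PUT is a single date comparison. In this sketch the function name is invented, the resource's modification time is passed in directly, and a failed precondition is reported as 412 (“Precondition Failed”), the code HTTP defines for that case.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_put(if_unmodified_since, last_modified):
    """Return 412 if the resource changed after the date the client named, else 200."""
    client_date = parsedate_to_datetime(if_unmodified_since)
    if last_modified > client_date:
        return 412  # Precondition Failed: someone else got there first
    return 200      # safe to apply the client's new representation


# The resource was modified one second after the client's copy was fetched.
changed_at = datetime(2007, 5, 1, 12, 0, 1, tzinfo=timezone.utc)
status = conditional_put("Tue, 01 May 2007 12:00:00 GMT", changed_at)
```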

Last-Modified

Response

Very high

This header makes conditional HTTP GET possible. It tells the client the last time the representation changed. The client can keep track of this date and use it in the If-Modified-Since header of a future request.

In web applications, Last-Modified is usually the current time, which makes conditional HTTP GET useless. Web services should try to do a little better, since web service clients often besiege their servers with requests for the same URIs over and over again. See “Conditional GET” in Chapter 8 for ideas.

Location

Response

Very high

This is a versatile header with many related functions. It’s heavily associated with the

3xx (“Redirection”) response codes, and much of the confusion surrounding HTTP

redirects has to do with how this header should be interpreted.

This header usually tells the client which URI it should be using to access a resource;

presumably the client doesn’t already know. This might be because the client’s request

created the resource—response code 201 (“Created”)—or caused the resource to

change URIs—301 (“Moved Permanently”). It may also be because the client used a

URI that’s not quite right, though not so wrong that the server didn’t recognize it. In

that case the response code might be 301 again, or 307 (“Temporary Redirect”) or 302

(“Found”).

Sometimes the value of Location is just a default URI: one of many possible resolutions

to an ambiguous request, e.g., 300 (“Multiple Choices”). Sometimes the value of Location

points not to the resource the client tried to access, but to some other resource

that provides supplemental information, e.g., 303 (“See Other”).

As you can see, this header can only be understood in the context of a particular HTTP

response code. Refer to the appropriate section of Appendix B for more details.
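
One way a client might keep these cases straight is a lookup keyed on the response code. This is only a sketch: the table wording is my own summary of the cases above, and interpret_location is a hypothetical helper. It also resolves a relative Location against the request URI, which is an assumption about lenient client behavior.

```python
from urllib.parse import urljoin

# Rough meaning of the Location header for each response code discussed above.
LOCATION_MEANING = {
    201: "URI of the resource the request just created",
    300: "default choice among several possible resources",
    301: "new permanent URI of the resource",
    302: "URI to use for this request only",
    303: "a different resource with supplemental information",
    307: "URI to use for this request only",
}


def interpret_location(status, location, request_uri):
    """Return (absolute_uri, meaning), or None if Location has no defined role."""
    meaning = LOCATION_MEANING.get(status)
    if meaning is None or not location:
        return None
    return urljoin(request_uri, location), meaning
```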

Max-Forwards

Request

Very low

This header is mainly used with the TRACE method, which is used to track the proxies

that handle a client’s HTTP request. I don’t cover TRACE in this book, but as part of

a TRACE request, Max-Forwards is used to limit how many proxies the request can be

sent through.

Pragma

Request or response

Very low

The Pragma header is a spot for special directives between the client, server, and intermediaries

such as proxies. The only official pragma is “no-cache,” which is obsolete in HTTP 1.1: it’s the same as sending a value of “no-cache” for the Cache-Control header.

You may define your own HTTP pragmas, but it’s better to define your own HTTP headers instead. See, for instance, the X-Proxy-Directive header I made up while explaining the Connection header.

Proxy-Authenticate

Response

Low to medium

Some clients (especially in corporate environments) can only get HTTP access through

a proxy server. Some proxy servers require authentication. This header is a proxy’s way

of demanding authentication. It’s sent along with a response code of 407 (“Proxy Authentication

Required”), and it works just like WWW-Authenticate, except it tells the

client how to authenticate with the proxy, not with the web server on the other end.

While the response to a WWW-Authenticate challenge goes into Authorization, the response

to a Proxy-Authenticate challenge goes into Proxy-Authorization (see below).

A single request may need to include both Authorization and Proxy-Authorization

headers: one to authenticate with the web service, the other to authenticate with the

proxy.

Since most web services don’t include proxies in their architecture, this header is not

terribly relevant to the kinds of services covered in this book. But it may be relevant to

a client, if there’s a proxy between the client and the rest of the web.

Proxy-Authorization

Request

Low to medium

This header is an attempt to get a request through a proxy that demands authentication.

It works similarly to Authorization. Its format depends on the scheme defined in Proxy-

Authenticate, just as the format of Authorization depends on the scheme defined in

WWW-Authenticate.

Range

Request

Medium

This header signifies the client’s attempt to request only part of a resource’s representation

(see “Partial GET” in Chapter 8). A client typically sends this header because it

tried earlier to download a large representation and got cut off. Now it’s back for the

rest of the representation. Because of this, this header is usually coupled with

If-Unmodified-Since. If the representation has changed since your last request, you probably

need to GET it from the beginning.
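
A sketch of both sides of a resumed download follows. The client pairs Range with If-Unmodified-Since (the HTTP 1.1 header for this precondition), and the server parses the simplest form of a byte range. Both function names are hypothetical, and only a single "first-last" or "first-" range is handled.

```python
def resume_headers(bytes_received, last_modified):
    """Request headers for picking up a download where it was cut off."""
    return {
        "Range": f"bytes={bytes_received}-",
        # Only send the remainder if the representation hasn't changed.
        "If-Unmodified-Since": last_modified,
    }


def parse_byte_range(value, total_length):
    """Parse 'bytes=first-last' or 'bytes=first-' into an inclusive (first, last).

    Returns None for anything this sketch doesn't handle (other units,
    multiple ranges, suffix ranges) or for a range outside the entity-body.
    """
    unit, _, spec = value.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None
    first, _, last = spec.strip().partition("-")
    if not first.isdigit() or (last and not last.isdigit()):
        return None
    start = int(first)
    end = min(int(last), total_length - 1) if last else total_length - 1
    if start > end:
        return None
    return start, end
```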

Referer

Request

Low

When you click a link in your web browser, the browser sends an HTTP request in

which the value of the Referer header is the URI of the page you were just on. That’s

the URI that “refered” your client to the URI you’re now requesting. Yes, it’s misspelled.

Though common on the human web, this header is rarely found on the programmable

web. It can be used to convey a bit of application state (the client’s recent path through

the service) to the server.

Retry-After

Response

Low to medium

This header usually comes with a response code that denotes failure: either 413 (“Request

Entity Too Large”), or one of the 5xx series (“Server-side error”). It tells the client

that while the server couldn’t fulfill the request right now, it might be able to fulfill the

same request at a later time. The value of the header is the time when the client should

try again, or the number of seconds it should wait.

If a server chooses every client’s Retry-After value using the same rules, that just guarantees

the same clients will make the same requests in the same order a little later,

possibly causing the problem all over again. The server should use some randomization

technique to vary Retry-After, similar to Ethernet’s backoff period.
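
Here is one way the randomization might look on the server, along with a client-side reader for both forms of the header value. The function names and the 50% spread are assumptions chosen for illustration.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def jittered_retry_after(base_seconds, spread=0.5, rng=random):
    """Server side: pick a Retry-After value scattered around base_seconds."""
    low = base_seconds * (1 - spread)
    high = base_seconds * (1 + spread)
    return str(max(1, round(rng.uniform(low, high))))


def seconds_to_wait(retry_after, now=None):
    """Client side: turn a Retry-After value into a number of seconds."""
    value = retry_after.strip()
    if value.isdigit():
        return int(value)  # delay given directly in seconds
    when = parsedate_to_datetime(value)  # otherwise an HTTP date
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0, int((when - now).total_seconds()))
```

Because each response gets a different delay, clients that failed together are less likely to come back together.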

TE

Request

Low

This is another “Accept”-type header, one that lets the client specify which transfer

encodings it will accept (see Transfer-Encoding below for an explanation of transfer

encodings). HTTP: The Definitive Guide by Brian Totty and David Gourley (O’Reilly)

points out that a better name would have been “Accept-Transfer-Encoding.”

In practice, the value of TE only conveys whether or not the client understands chunked

encoding and HTTP trailers, two topics I don’t really cover in this book.

Trailer

Response

Low

When a server sends an entity-body using chunked transfer encoding, it may choose

to put certain HTTP headers at the end of the entity-body rather than before it (see

below for details). This turns them from headers into trailers. The server signals that

it’s going to send a header as a trailer by putting its name as the value of the header

called Trailer. Here’s one possible value for Trailer:

Trailer: Content-Length

The server will be providing a value for Content-Length once it’s served the entity-body

and it knows how many bytes it served.

Transfer-Encoding

Response

Low

Sometimes a server needs to send an entity-body without knowing important facts like

how large it is. Rather than omitting HTTP headers like Content-Length and Content-

MD5, the server may decide to send the entity-body in chunks, and put Content-Length

and the like after the entity-body rather than before. The idea is that by the

time all the chunks have been sent, the server knows the things it didn’t know before,

and it can send Content-Length and Content-MD5 as “trailers” instead of “headers.”

It’s an HTTP 1.1 requirement that clients support chunked transfer-encoding, but I

don’t know of any programmable clients (as opposed to web browsers) that do.
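
To make the wire format concrete, here is a minimal sketch of a chunked encoder. Each chunk is preceded by its size in hexadecimal, a zero-length chunk ends the body, and any trailers follow it. The function name chunked_body is hypothetical, and the trailer in the usage note is only an illustrative value.

```python
def chunked_body(chunks, trailers=None):
    """Encode byte strings as an HTTP/1.1 chunked entity-body."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body early
        out += b"%X\r\n" % len(chunk)  # chunk size in hex
        out += chunk + b"\r\n"
    out += b"0\r\n"  # last-chunk marker
    for name, value in (trailers or {}).items():
        out += f"{name}: {value}\r\n".encode("ascii")
    out += b"\r\n"  # blank line ends the trailers and the message
    return bytes(out)
```

For example, chunked_body([b"Hello, ", b"world"], {"Content-MD5": "abc"}) produces two chunks of sizes 7 and 5, the terminating zero chunk, and then the Content-MD5 trailer.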

Upgrade

Request

Very low

If you’d rather be using some protocol other than HTTP, you can tell the server that

by sending an Upgrade header. If the server happens to speak the protocol you’d rather

be using, it will send back a response code of 101 (“Switching Protocols”) and immediately

begin speaking the new protocol.

There is no standard format for the list of protocols in this header’s value, but the sample Upgrade header from RFC 2616

shows what the designers of HTTP had in mind:

Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11

User-Agent

Request

High

This header lets the server know what kind of software is making the HTTP request.

On the human web this is a string that identifies the brand of web browser. On the

programmable web it usually identifies the HTTP library or client library that was used

to write the client. It may identify a specific client program instead.

Soon after the human web became popular, servers started sniffing User-Agent to determine

what kind of browser was on the other end. They then sent different representations

based on the value of User-Agent. Elsewhere in this book I’ve voiced my

opinion that it’s not a great idea to have request headers like Accept-Language be the

only way a client can distinguish between different representations of the same resource.

Sending different representations based on the value of User-Agent is an even

worse idea. Not only has User-Agent sniffing perpetuated incompatibilities between

web browsers, it’s led to an arms race inside the User-Agent header itself.

Almost every browser these days pretends to be Mozilla, because that was the internal

code-name of the first web browser to become popular (Netscape Navigator). A browser

that doesn’t pretend to be Mozilla may not get the representation it needs. Some

pretend to be both Mozilla and MSIE, so they can trigger code for the current most

popular web browser (Internet Explorer). A few browsers even allow the user to select

the User-Agent for every request, to trick servers into sending the right representations.

Don’t let this happen to the programmable web. A web service should only use User-

Agent to gather statistics and to deny access to poorly-programmed clients. It should

not use User-Agent to tailor its representations to specific clients.

Vary

Response

Low to medium

The Vary header tells the client which request headers it can vary to get different representations

of a resource. Here’s a sample value:

Vary: Accept, Accept-Language

That value tells the client that it can ask for the representation in a different file format,

by setting or changing the Accept header. It can ask for the representation in a different

language, by setting or changing Accept-Language.

That value also tells a cache to cache (say) the Japanese representation of the resource

separately from the English representation. The Japanese representation isn’t a brand

new byte stream that invalidates the cached English version. The two requests sent

different values for a header that varies (Accept-Language), so the responses should be

cached separately. If the value of Vary is “*”, that means that the response should not

be cached.
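
A cache might implement this by building its lookup key from the URI plus the values of whichever request headers Vary names. The sketch below assumes the comma-separated form HTTP uses for list-valued headers; cache_key is a hypothetical helper.

```python
def cache_key(uri, request_headers, vary):
    """Build a cache key from the URI and the request headers named in Vary.

    Returns None when Vary is "*", meaning the response should not be cached.
    """
    if vary.strip() == "*":
        return None
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (uri,) + tuple((name, lowered.get(name, "")) for name in names)
```

Two requests that differ only in Accept-Language get different keys, so the Japanese and English representations sit side by side in the cache. Requests that differ only in a header Vary doesn't mention share one entry.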

 

Via

Request and response

Low

When an HTTP request goes directly from the client to the server, or a response goes

directly from server to client, there is no Via header. When there are intermediaries (like

proxies) in the way, each one slaps a Via header on the request or response message.

The recipient of the message can look at the Via headers to see the path the HTTP

message took through the intermediaries.

Warning

Response (can technically be used with requests)

Low

The Warning header is a supplement to the HTTP response code. It’s usually inserted

by an intermediary like a caching proxy, to tell the user about possible problems that

aren’t obvious from looking at the response.

Like response codes, each HTTP warning has a three-digit numeric value: a “warn-code.”

Most warnings have to do with cache behavior. This Warning says that the

caching proxy at localhost:9090 sent a cached response even though it knew the response

to be stale:

Warning: 110 localhost:9090 "Response is stale"

The warn-code 110 means “Response is stale” as surely as the HTTP response code

404 means “Not Found.” The HTTP standard defines seven warn-codes, which I won’t

go into here.
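
A client that wants to act on a warning could pull the three parts out of the header value like this. The sketch assumes the RFC 2616 form, in which the warning text is a quoted string, and it ignores the optional trailing date. parse_warning is a hypothetical name.

```python
import re

# warn-code, warn-agent, then the quoted warn-text.
_WARNING = re.compile(r'(\d{3}) (\S+) "((?:[^"\\]|\\.)*)"')


def parse_warning(value):
    """Return (warn_code, agent, text) or None if the value doesn't match."""
    match = _WARNING.match(value.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)
```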

WWW-Authenticate

Response

Very high

This header accompanies a response code of 401 (“Unauthorized”). It’s the server’s

demand that the client send some authentication next time it requests the URI. It also

tells the client what kind of authentication the server expects. This may be HTTP Basic

auth, HTTP Digest auth, or something more exotic like WSSE.
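
For the simplest case, HTTP Basic auth, a client's answer to the challenge can be sketched as follows. Both helper names are hypothetical; Digest and WSSE need more work than this and are not shown.

```python
import base64


def challenge_scheme(www_authenticate):
    """Return the auth scheme named at the start of a WWW-Authenticate value."""
    parts = www_authenticate.strip().split(None, 1)
    return parts[0] if parts else None


def basic_authorization(username, password):
    """Build the Authorization header value for HTTP Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")
```

On receiving a 401 whose challenge_scheme is "Basic", the client would repeat the request with the Authorization header set to the value from basic_authorization.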

 

Nonstandard Headers

 

Many, many new HTTP headers have been created over the years, most using the “X-” prefix. These have not gone through the process to be made official parts of HTTP, but in many cases they have gone through other standardization processes. I’m going to present just a few of the nonstandard headers that are most important to web services.

 

Header

Type

Importance

Description

Cookie

Request

High on the human web, low on the programmable web

This is probably the second-most-famous HTTP header, after Content-Type, but it’s

not in the HTTP standard; it’s a Netscape extension.

A cookie is an agreement between the client and the server where the server gets to store some semipersistent state on the client side using the Set-Cookie header (see below).

Once the client gets a cookie, it’s expected to return it with every subsequent

HTTP request to that server, by setting the Cookie header once for each of its cookies.

Since the data is sent invisibly in the HTTP headers with every request, it looks like the

client and server are sharing state.

Cookies have a bad reputation in REST circles for two reasons. First, the “state” they

contain is often just a session ID: a short alphanumeric key that ties into a much larger

data structure on the server. This destroys the principle of statelessness. More subtly,

once a client accepts a cookie it’s supposed to submit it with all subsequent requests

for a certain time. The server is telling the client that it can no longer make the requests

it made before it accepted the cookie. This also violates the principle of statelessness.

If you must use cookies, make sure you store all the state on the client side. Otherwise

you’ll lose a lot of the scalability benefits of REST.
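
As a sketch of what keeping the state on the client side can look like, the server below reads everything it needs from the Cookie header itself instead of using a session ID to look up server-side data. The cookie names in the usage note are made up for illustration.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(value):
    """Return the name/value pairs carried in a Cookie request header."""
    jar = SimpleCookie()
    jar.load(value)
    return {name: morsel.value for name, morsel in jar.items()}
```

Given the header value "theme=dark; lang=en", this returns {"theme": "dark", "lang": "en"}: the whole state travels with the request, so any server in a cluster can handle it.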

POE

 

 

 

POE-Links

 

 

 

Set-Cookie

Response

High on the human web, low on the programmable web

This is an attempt on the server’s part to set some semipersistent state in a cookie on the client side. The client is supposed to send an appropriate Cookie header with all future requests, until the cookie’s expiration date. The client may ignore this header (and on the human web, that’s often a good idea), but there’s no guarantee that future requests will get a good response unless they provide the Cookie header. This violates the principle of statelessness.

Slug

 

 

 

X-HTTP-Method-Override

 

 

 

X-WSSE