Wednesday, February 26, 2014

The Gaps in Resource Oriented Architectures

... where I'll go over "the one that doesn't quite work yet" type of service :)

This is part of a series of posts.

As I said in my previous post, we have collectively found a series of problems with Service Oriented Architectures (e.g. WSDL/SOAP). The alternative, however, still has a long way to go to get where it needs to be.

I'll start by going over the main gaps in ROA. While at it, I'll give examples of poor design in one of the APIs that I wrote myself at Google (if I'm going to pick on someone, I'm going to start with my own work :)).

Stay tuned for my follow-up post, where I'll go over how I propose to fix this.

None of these arguments are necessarily mine. In fact, I'm borrowing most of them from Leonard Richardson, Sam Ruby and Mike Amundsen, the authors of "RESTful Web Services" and "RESTful Web APIs". I'm going to call them LSM throughout this post, for short.

The Application Semantic Gap

"Wrappers make service programming easy, because the API of a wrapper library is tailored to one particular service. You don't have to think about HTTP at all. The downside is that each wrapper is slightly different: learning one wrapper doesn't prepare you for the next one.

This is a little disappointing. After all, these services are just variations on the three-step algorithm for making HTTP requests. Shouldn't there be some way of abstracting out the differences between services, some library that can act as a wrapper for the entire space of RESTful and hybrid services?

This is the problem of service descriptions. We need a language with a vocabulary that can describe the variety of RESTful and hybrid services. A document written in this language could script a generic web service client, making it act like a custom-written wrapper. The SOAP RPC community has united around WSDL as its service description language. The REST community has yet to unite around a description language [...] " -- LSM
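
To make the idea of a description "scripting" a generic client a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the description format, the two services, their URLs and field names, and the build_request helper. It is not WSDL, WADL or any real description language, just one way the idea could look.

```python
import json
from urllib.request import Request

# Hypothetical descriptions of two made-up services that record the same
# kind of fact ("someone did something to an object") in different shapes.
DESCRIPTIONS = {
    "service-a": {
        "record_activity": {
            "method": "POST",
            "url": "https://a.example/activities",
            "fields": {"verb": "type", "object": "target"},
        }
    },
    "service-b": {
        "record_activity": {
            "method": "POST",
            "url": "https://b.example/me/actions",
            "fields": {"verb": "action", "object": "item"},
        }
    },
}


def build_request(service, operation, **values):
    """Build (but do not send) an HTTP request from a service description."""
    op = DESCRIPTIONS[service][operation]
    # Rename the generic field names to whatever this service calls them.
    body = {op["fields"][name]: value for name, value in values.items()}
    return Request(
        op["url"],
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=op["method"],
    )
```

The client code stays the same for both services; only the description changes. That is roughly what a custom-written wrapper gives you today, except that here the "wrapper" is data rather than code.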

Let me show you how this appears in practice.

Let me introduce you to one of the APIs I* wrote a while ago.

https://developers.google.com/+/api/latest/moments

* By "I" I mean that you can blame me individually for its faults and poor choices. I'll leave the credits and the good decisions to the number of folks that were involved in its design and implementation at Google.

What this API does is, with the user's explicit consent, allow developers to record users' sentence-like activities on Google servers. Things like "I watched The Matrix", "I listened to The Beatles" or "I ran 5 miles with my friends".

Now you are probably asking yourself: "huh, this looks a lot like Facebook's Open Graph API".

And you'd be right.

https://developers.facebook.com/docs/opengraph/action-guides

Here are the problems I find with my own API:

Gap #1: Hypermedia

The first gap in ROA is that if you (a human) wanted to integrate with Google's and Facebook's APIs, you would have to ask me (another human) what our RESTful resources look like.

"One alternative to explaining everything is to make your service like other services. If all services exposed the same representation formats, and mapped URIs to resources in the same way ... well, we can't get rid of client programming altogether, but clients could work on a higher level than HTTP.

[...]

What we need is a general framework, a way for each individual service to tell the client about its resource design, its representation formats, and the links it provides between resources. That will give us some of the benefits of standardized conventions, without forcing all web services to comply with more than a few minimal requirements". -- LSM

This is what it looks like to record the fact that you've listened to a song on Google and this is what it looks like on Facebook. There are no (technical) reasons for these representations to be inconsistent.

Gap #2: A WSDL-like language for ROA

The second gap in the programmable web is that I (a human) had to make you (another human) read those APIs' documentation and understand it.

The problem is that this obviously doesn't scale. How do you automate it? Suppose there were 100,000* of these APIs: how would you integrate with all of them?

* I agree that there isn't a realistic chance that we'll see 100,000 APIs to record users' activities, but in another post I'll show you a few examples where that's realistic.

What we need is a way to connect the "human web" and the "programmable web". That is, how do you connect the entities found on the "human web" to the capabilities of the "programmable web"?

Browsers get away with this problem because their users are humans. They can display a form tag to a user and that's sufficient to allow their users to complete the form. That's a scalable approach, because virtually any developer can write a form tag and any user can understand it.

But computers are not that smart. For the programmable web, we need something more specific and better defined. And HTTP alone isn't sufficient.

"Collectively, these* methods define the protocol semantics of HTTP. Just by looking at the method used in an HTTP request, you can understand approximately what the client wants: whether it is trying to get a representation, delete a resource, or connect two resources together."

* referring to the HTTP methods: GET, POST, PUT, DELETE, PATCH, HEAD and OPTIONS.

"You can't understand exactly what's going on, because a resource can be anything at all. A GET request sent to a 'blog post' resource looks like a GET request sent to a 'stock symbol' resource. Those two requests have identical protocol semantics, but different application semantics. HTTP is HTTP, but a blogging API is not a stock quote API.

We can't meet the semantic challenge just by using HTTP correctly, because the HTTP protocol doesn't define any application semantics. But your application semantics should always be consistent with HTTP's protocol semantics. 'Get a blog post' and 'get a stock quote' both fall under 'get a representation of this resource,', so both requests should use HTTP GET." -- LSM
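
Here is a small Python sketch of that point. The two URLs and the protocol_view helper are hypothetical, and nothing is sent over the network; the requests are only built so we can look at them the way a generic HTTP client or intermediary would.

```python
from urllib.request import Request

# Hypothetical resources: a blog post and a stock quote.
blog_post = Request("https://blog.example/posts/42", method="GET")
stock_quote = Request("https://quotes.example/symbols/GOOG", method="GET")

# Methods that HTTP defines as safe (they should not change resource state).
SAFE_METHODS = ("GET", "HEAD", "OPTIONS")


def protocol_view(request):
    """What can be said about a request from its HTTP method alone."""
    method = request.get_method()
    return {"method": method, "safe": method in SAFE_METHODS}


# Same protocol semantics for both requests...
same_protocol = protocol_view(blog_post) == protocol_view(stock_quote)

# ...but nothing at this level says that one is about blogging and the other
# is about stock quotes. That is the application semantics, and it lives
# outside of HTTP.
```

As far as HTTP is concerned, both requests mean "get a representation of this resource". A client that wants to do something useful with the response still needs to learn, from somewhere else, what a blog post or a stock quote is.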

Here are a couple of alternatives that address some of these issues:

On understanding entities: "Rails 1.2 does an excellent job of merging the human web and the programmable web. [...] If you use a browser to access the resources, you're served HTML representations of the database objects and HTML forms for manipulating them.". -- LSM

On understanding actions: "In theory, the server can send additional information in response to an OPTIONS request, and the client can send OPTIONS requests that asks very specific questions about the server's capabilities. Very nice, except that there are no accepted standards for what a client might ask in an OPTIONS request. Apart from the Allow header, there are no accepted standards for what a server might send in response. Most web servers and frameworks feature very poor support for OPTIONS. So far, OPTIONS is a promising idea that nobody uses.". -- LSM
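
To give an idea of how little there is to work with, here is a sketch of what a client can do with OPTIONS today: send the request and read the Allow header, if the server bothers to send one. The function names parse_allow and discover_methods are mine, and the host you pass in is up to you; only the Python standard library is used.

```python
import http.client


def parse_allow(header_value):
    """Split an Allow header value such as 'GET, HEAD' into a set of methods."""
    if not header_value:
        return set()
    return {m.strip().upper() for m in header_value.split(",") if m.strip()}


def discover_methods(host, path="/"):
    """Send an OPTIONS request over HTTPS and return the advertised methods.

    Returns an empty set when the server does not send an Allow header,
    which, as the quote above suggests, is a common outcome.
    """
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("OPTIONS", path)
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return parse_allow(response.getheader("Allow"))
    finally:
        conn.close()
```

Even in the best case, all the client learns is a list of method names. It still doesn't know what a POST to that resource means, or what it should send along with it.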

Here is a really good read on the subject:

http://bitworking.org/news/193/Do-we-need-WADL

Gap #3: A UDDI-like system for ROA

The third gap in ROA is that I (a human) had to send you (another human) two links pointing you to Google's and Facebook's APIs.

"There is no magic bullet here. Any automated system that helps people find hotels has a built-in economic incentive to game the system. This doesn't mean that computers can't assist in the process, but it does mean that a human needs to make the ultimate decision.

The closest RESTful equivalents to UDDI are the search engines, like Google, Yahoo!, and MSN. These help (human) clients find the resources they're looking for. They take advantage of the uniform interface and common data formats promoted by REST. [...] But think of the value of search engines and you'll see the promise of UDDI, even if its complexity turns you off." -- LSM

Almost there

Here is an approach that *almost* fills all the gaps. It is a discussion about how to model a Blog API using a well-known vocabulary (schema.org), a well-defined set of verbs (activitystrea.ms) and a well-defined serialization format (HAL) that can connect the "human web" with the "programmable web".

"There is a schema.org microdata item called BlogPost (http://schema.org/BlogPost), which defines semantic descriptors called articleBody and dateCreated. That takes care of 'message', 'text' and 'publication date'. A collection of schema.org BlogPost is a Blog. That takes care of 'message list'.

I'll name my unsafe state transitions post. I took that name from the ActivityStreams standard, where it means 'The act of authoring an object and then publishing it online'. Nobody ever intended schema.org microdata and ActivityStreams verbs to work together, but ALPS lets me combine their application semantics.

[...]

Am I creating the world's 58th microblogging API? In a sense, yes. But I didn't define anything new. I took everything from the IANA, schema.org and ActivityStreams. A client that already understands these semantic descriptors and link relations will understand my API. It is not very likely that such a client exists, but it is more likely that part of that client exists than it would be if I'd redesigned these basic concepts for the 58th time.

[...]

Just for the sake of variety, I'm going to choose HAL. A HAL+XML representation of the message list might look like this:

[...]

<resource href="/">

  <link rel="profile" href="http://alps.io/schema.org/Blog" />
  <link rel="profile" href="http://alps.io/schema.org/BlogPost" />
  <link rel="profile" href="http://alps.io/activitystrea-ms/verbs" />
  <link rel="about" href="/about-this-site" />

  <Blog>

    <link rel="post" href="/messages" />

    <resource href="/messages/2" rel="item">
      <BlogPost>
        <articleBody>This is message #2.</articleBody>
        <dateCreated>2013-04-24</dateCreated>
      </BlogPost>
    </resource>

    <resource href="/messages/1" rel="item">
      <BlogPost>
        <articleBody>This is message #1.</articleBody>
        <dateCreated>2013-04-22</dateCreated>
      </BlogPost>
    </resource>

  </Blog>
</resource>

This conveys all the necessary resource state (descriptions of the two messages in the message list) and includes all the necessary hypermedia links (with the link relations profile, about, item and post).

There is a problem with the "post" link: it is not clear that it is an unsafe state transition that should be triggered with a POST, and it's not clear what entity-body the client should send along with the POST request. [...]" -- LSM

There are certainly plenty of reasons why this isn't quite there yet: most notably its reliance on HAL and ALPS, and the fact that no specification says that "post" means sending a POST request to the collection, or what the blog post sent along with it should look like.
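
Here is a rough Python sketch of what a generic client can and cannot get out of a document like the one quoted above. It uses a trimmed copy of that HAL+XML and the standard library's ElementTree; the variable names are mine, and this is only an illustration, not a HAL client.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the HAL+XML representation quoted above.
HAL_XML = """
<resource href="/">
  <link rel="profile" href="http://alps.io/schema.org/Blog" />
  <link rel="about" href="/about-this-site" />
  <Blog>
    <link rel="post" href="/messages" />
    <resource href="/messages/2" rel="item">
      <BlogPost>
        <articleBody>This is message #2.</articleBody>
        <dateCreated>2013-04-24</dateCreated>
      </BlogPost>
    </resource>
  </Blog>
</resource>
"""

root = ET.fromstring(HAL_XML)

# Every hypermedia link in the document, as (rel, href) pairs.
links = [(link.get("rel"), link.get("href")) for link in root.iter("link")]

# Every embedded item, as (href, articleBody) pairs.
items = [
    (res.get("href"), res.findtext("BlogPost/articleBody"))
    for res in root.iter("resource")
    if res.get("rel") == "item"
]

# The "post" link carries a rel and an href, and nothing else: no HTTP
# method and no description of the entity-body the client should send.
post_link = next(link for link in root.iter("link") if link.get("rel") == "post")
post_link_attributes = sorted(post_link.attrib)
```

The links and the items come out easily. What doesn't come out is the part the client needs in order to act: the "post" link says where, but not how or with what.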

What's up with the rambling?

Not much :) Rambling is just a way to organize my thoughts and give you context for my next post :)

Stay tuned!