Saturday, February 22, 2014

The W stands for Web

Back in 2009 we deprecated the SOAP Search API, and that reminded me of a saying at Google that goes "We have two kinds of internal services: the ones that are deprecated and the ones that don't quite work yet" :)

Jokes aside*, that's partially how I feel while reading about WSDL/SOAP: it is a neat idea that isn't supported anymore, and the alternative doesn't quite work yet.

* I'll use examples of things that I've authored at Google as much as possible to discuss the technology rather than the folks involved.

In this post, I'll go over my thoughts on the former and I'll follow up with another post on the latter.

This is part of a series of posts.

Lots of folks have been asking my opinion on the subject, so I recently picked up "RESTful Web Services" and "RESTful Web APIs", as well as "Building Web Services with Java: Making Sense of XML, SOAP, WSDL, and UDDI", plus a few follow-up blog posts (The S stands for Simple, Do We Need WADL?, REST and WS-*).

They were certainly interesting reads, and if you are working in this area, they are great starting points.

While this is certainly a long debate, here I'll go over some of the points that speak to me the most.

NOTE: none of these ideas/arguments are necessarily mine; these are just the ones that I've read/heard and can express in my own words.

The W in WS-* stands for Web

The "W" part of the WS-* acronym stands for "Web", but WS-* doesn't really take advantage of the web or of its most prominent architectural style (i.e. REST).

This is a long debate, and I'm not even close to being an expert in the area, but here are my main pain points with SOAP/WS-*:

  1. Uniform Interface (or lack of, aka poor usage of HTTP methods)

    SOAP messages are all sent to services via an overloaded HTTP POST request, which means that SOAP doesn't comply with the uniform HTTP interface: GET/DELETE/PUT don't make as much sense in SOAP as they do to normal HTTP users (e.g. browsers). This means that it doesn't use 3 out of 4 extremely rich and successful mechanisms that are critical to the success of the Web.
  2. Addressability (aka endpoints vs resources)

    The problem with the RPC style is that you don't get addressable resources. Instead, you get one endpoint where all operations are performed. This is equivalent to building web pages like POST http://cnn.com/ body: article=1234 rather than bookmarkable web pages like http://cnn.com/articles/1234. For example, if I wanted to send someone the former, I'd have to tell them "hey, please open your browser and submit a POST request to cnn.com with article=1234" rather than "hey, go to http://cnn.com/articles/1234".
  3. The envelope inside the envelope (aka SOAP)

    SOAP embeds all messages inside HTTP via an XML SOAP envelope. I don't quite understand why that's needed: SOAPAction seems to be a specialization of the HTTP methods and the SOAP body seems like a duplication of the HTTP body.

Other less-relevant-but-still-applicable ramblings:
  • Usage of XML versus more modern serialization formats like JSON
  • The S Stands for Simple: Overall complexity of WSDL and the need to auto-generate those descriptions from code.
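
To make points 1 to 3 a bit more concrete, here is a minimal sketch, assuming Java 11+ and its java.net.http.HttpRequest builder. The host example.com, the /service endpoint, the GetArticle operation and its XML namespace are all hypothetical placeholders that mirror the cnn.com example above. Nothing is sent over the network; the requests are only built and printed.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical names throughout: example.com, /service, GetArticle and its
// namespace are placeholders for illustration, not a real service.
final class SoapVersusRest {

  // SOAP/RPC style: one endpoint, always POST. The operation and its
  // argument travel inside an XML envelope in the HTTP body.
  static HttpRequest soapGetArticle(String id) {
    String envelope =
        "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
            + "<soap:Body>"
            + "<GetArticle xmlns=\"http://example.com/articles\">"
            + "<id>" + id + "</id>"
            + "</GetArticle>"
            + "</soap:Body>"
            + "</soap:Envelope>";
    return HttpRequest.newBuilder(URI.create("http://example.com/service"))
        .header("Content-Type", "text/xml; charset=utf-8")
        .header("SOAPAction", "\"http://example.com/articles/GetArticle\"")
        .POST(HttpRequest.BodyPublishers.ofString(envelope))
        .build();
  }

  // Resource style: the article has its own URL, and the HTTP method
  // alone says what to do with it.
  static HttpRequest restGetArticle(String id) {
    return HttpRequest.newBuilder(URI.create("http://example.com/articles/" + id))
        .GET()
        .build();
  }

  static HttpRequest restDeleteArticle(String id) {
    return HttpRequest.newBuilder(URI.create("http://example.com/articles/" + id))
        .DELETE()
        .build();
  }

  public static void main(String[] args) {
    HttpRequest[] requests = {
      soapGetArticle("1234"), restGetArticle("1234"), restDeleteArticle("1234")
    };
    for (HttpRequest request : requests) {
      // Print only the method and URL, i.e. what an intermediary sees first.
      System.out.println(request.method() + " " + request.uri());
    }
  }
}
```

Looking only at the method and URL, the SOAP request says nothing about which article is involved or what is being done to it, while the resource-style requests can be bookmarked, cached or shared as they are.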

Resource Oriented Architectures vs Service Oriented Architectures

The most significant consequence of these design choices is that, in practice, SOAP/WS-* is not entirely aligned with the web (the standard itself allows you to expose more resources, but that rarely happens).

The web wants addressable resources that support a uniform set of methods, rather than global resources that expose a set of static and custom methods.

So, rather than having 1 endpoint that deals with *all* operations, the web pushes you towards multiple "endpoints", called resources.

Others can explain this better than I can, but I think the simplest metaphor I can map to my skill set is the difference between structured programming and object-oriented programming.

So, while you'd find something like the following in WSDL/SOAP:

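// This is what a Services Oriented Architecture looks like:
// one global class exposing custom, static operations (bodies omitted).
public class MyService {
  public static String CreateBlogPost(String text) { throw new UnsupportedOperationException(); }
  public static void DeleteBlogPost(String id) { throw new UnsupportedOperationException(); }
  public static String OverrideBlogPost(String id, String text) { throw new UnsupportedOperationException(); }
}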

This is what's more aligned with the Web:

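// This is what a Resources Oriented Architecture looks like:
import java.util.List;

public interface Blog {
  public Post POST(String body);  // create a new post under this blog
  public List<Post> GET();        // list the posts of this blog
}

public interface Post {
  public String GET();            // read this post
  public void DELETE();           // remove this post
  public void PUT(String body);   // replace this post's content
}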

Here the Java "this" reference plays the role of the URL (e.g. http://example.com/blog).
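
As a rough illustration of that idea, here is a minimal in-memory sketch in which each object carries its own URL and exposes only the uniform methods. InMemoryBlog, StoredPost, BlogDemo and the id scheme (/1, /2, ...) are hypothetical names and choices made up for this example; a map stands in for the server and no HTTP is involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

interface Post {
  String GET();
  void DELETE();
  void PUT(String body);
}

interface Blog {
  Post POST(String body);
  List<Post> GET();
}

// Hypothetical stand-in for a server: maps each post URL to its body.
final class InMemoryBlog implements Blog {
  private final String url; // plays the role of "this" for the blog
  private final Map<String, String> store = new LinkedHashMap<>();
  private int nextId = 1;

  InMemoryBlog(String url) {
    this.url = url;
  }

  // Creating a post yields a new addressable resource under the blog's URL.
  @Override
  public Post POST(String body) {
    String postUrl = url + "/" + nextId++;
    store.put(postUrl, body);
    return new StoredPost(postUrl);
  }

  @Override
  public List<Post> GET() {
    List<Post> posts = new ArrayList<>();
    for (String postUrl : store.keySet()) {
      posts.add(new StoredPost(postUrl));
    }
    return posts;
  }

  // A post is nothing more than its URL plus the uniform methods.
  private final class StoredPost implements Post {
    private final String url;

    StoredPost(String url) {
      this.url = url;
    }

    @Override
    public String GET() {
      return store.get(url);
    }

    @Override
    public void DELETE() {
      store.remove(url);
    }

    @Override
    public void PUT(String body) {
      store.put(url, body);
    }

    @Override
    public String toString() {
      return url;
    }
  }
}

final class BlogDemo {
  public static void main(String[] args) {
    Blog blog = new InMemoryBlog("http://example.com/blog");
    Post post = blog.POST("hello");
    post.PUT("hello, web");
    System.out.println(post + " -> " + post.GET());
    post.DELETE();
    System.out.println("posts left: " + blog.GET().size());
  }
}
```

Note that the caller never passes an id around: once it holds a Post, the address is part of the object, which is the same property that makes http://cnn.com/articles/1234 easy to share.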

Now, if you are a Java programmer, you should easily be able to spot which of the two is the better interface design.

The one that doesn't work yet

OK, so now that I've gone over my ramblings on WSDL/SOAP, what's the alternative?

I'll give you a hint: it exists, but it doesn't quite work yet :)

That's the subject of my next post (stay tuned!).