Friday, September 5, 2014

WS-REST 2014 Keynote



This year I had the honor to be invited by an old friend to give a keynote at WS-REST at WWW in Seoul.

Apart from meeting lots of great people, it was a great opportunity to get introduced to so many exciting ideas.

I figure that it could be useful to post my presentation online, as well as some text about it in case anyone that wasn't there needed a reference. So here it goes:



* Reading note: I have written about some parts of the talk before, so instead of repeating myself, I embedded links to them. You'll want to dereference them inline as you go along.

It went more or less like the following...

"I think we are living in exciting times.

You see, there is this interesting idea that most technologies go through a hype cycle: a technology trigger happens, then they go through a peak of inflated expectations, a trough of disillusionment and a slope of enlightenment, and eventually reach a plateau of productivity.

And it feels like Hypermedia APIs are going through all that but it is early enough that we can feel that we can influence where it goes. Sort of like being at Sun when Java was being designed or Bell Labs when UNIX was being sorted out.

Sort of like the person that proposed <a>s to be added to HTML: it feels obvious now, but oh boy, did that person enable billion-dollar businesses to be created!

But even before I talk about the technology triggers, let me first tell you where I come from.

My Rock Bottom

A few years ago I was working on gmail with the task of providing better visualizations of links on emails. We used to call that "mail intelligence" or something like that.



So, as any engineer would probably do, I gathered a set of examples (YouTube links, Flickr links, lala.com links, etc.) and I went on writing regular expressions to detect them, as well as code to talk to their individual APIs.

That launched and it was really cool.

The week it launched, my product manager came to my desk and said "wow, that was cool, can we do vimeo.com urls too?". And I said "yes, of course, let me just add that real quick".

A few days later, requests for three new URL types came along. Ultimately, I had personally written over 13 regexes and API calls. I thought to myself: "I don't think this is going to scale" ... and my co-worker sitting next to me, noticing how busy I was, replied "do you think so?" :)

On my next project, I got smarter.

I was doing link posting for our social products.



When I was tasked with parsing video URLs I said: this is the common interface between Google and video services. This is how you describe its affordances. We'll implement it once and onboard any video website.

That launched and it was really cool.

But then we needed to do photo websites too. And articles. And reviews. And all of the things that you could do with them. I had personally implemented over 25 of these individually before I stopped and said to myself again "I don't think this is going to scale" ... "do you think so?", said my co-worker, smiling at me again :)

We didn't get to where we are because we are smart.

We got to where we are because we tried everything else and failed.

When the idea of crawling these links and parsing schema.org markup was brought to my attention, it felt like a breath of fresh air (well, in reality, I ignored it at first. it took me a few rounds for it to sink in).

But the idea was powerful: delegate. Instead of writing specific APIs, give webmasters the tools they needed to express what was on the link and find a way to degrade gracefully.

Folks started exploring what would be involved in adding verbs to schema.org to allow the representation of affordances. We asked questions like: what's the difference between purchase and buy? And how do you express what's involved in fulfilling an action?

But something was missing ... our foundation wasn't feeling as solid as we would have liked.

The real measurement of code quality: WTF/min

On my way to the airport for a long trip, I picked up a book at a local bookstore. It was called RESTful Web Services.

By the time I landed, I had read it 3 times. For most of the time, I was cursing. At the right WTF/min rate.


I felt like I had finally found someone in the world that understood what challenges I was facing. We were coming from different ends (I was coming from the "web for humans i.e. webpages", and they were coming from the "web for computers i.e. APIs") but we could almost touch hands in the middle.

I went back excited to tell my co-workers about it. I told them:

- "Guys, our APIs suck, we are an L2 API provider at best!"
- "How could that be?", some asked. "We pride ourselves on providing solid APIs". "Well, how many levels are there even?".

Ah, that's so cool! Show me what your APIs look like!

... and I said: well, hummm, that's the issue, I can't. You know, I'm embarrassed about my APIs too :(

Here are the challenges that they are facing.

Lots of challenges, but these are exciting times."

Sam Goto, April 7th 2014 WWW/WS-REST @ Seoul.

That's, hum, cool. But what happened after that?

I've been busy getting stuff done :) Check this out.



Thursday, September 4, 2014

schema.org actions implementations

... a random collection of screenshots/snippets of actions in practice.

Since it was announced earlier this year, I often get asked how and where schema.org/Actions are used in practice: what real-life products they enable.

I was preparing a presentation for a talk to a local workshop, and I was asked for that exact same content, so I figure I'd just copy-paste it here and update it as I find more (most of the content is linked from here).

A caveat: some products launched before the final specification went out (i.e. the official documentation still contains older markup), so I updated the code snippets to the current state (for consistency).

So, here it goes (in no particular order).

Google Search Sitelinks Search Box

Official docs:

Here is what you write:
<script type="application/ld+json">
{
   "@context": "http://schema.org",
   "@type": "WebSite",
   "url": "https://www.example-petstore.com/",
   "potentialAction": {
     "@type": "SearchAction",
     "target": "https://host.example-petstore.com/search?q={search_term}",
     "query-input": "required name=search_term"
   }
}
</script>
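For a feel of what a consumer might do with that markup, here is a minimal Python sketch that fills the `target` template with whatever the user typed. The function name `build_search_url` is mine, and the handling of `query-input` is deliberately naive (it only looks for `name=`), so treat it as an illustration under those assumptions rather than how Google actually does it.

```python
import json
from urllib.parse import quote

MARKUP = """
{
  "@context": "http://schema.org",
  "@type": "WebSite",
  "url": "https://www.example-petstore.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://host.example-petstore.com/search?q={search_term}",
    "query-input": "required name=search_term"
  }
}
"""

def build_search_url(markup: str, query: str) -> str:
    """Fill the SearchAction URL template with a user-typed query."""
    action = json.loads(markup)["potentialAction"]
    # "required name=search_term" -> {"name": "search_term"} (naive parsing)
    spec = dict(
        part.split("=", 1)
        for part in action["query-input"].split()
        if "=" in part
    )
    placeholder = "{" + spec["name"] + "}"
    # Percent-encode the query so it is safe inside the URL
    return action["target"].replace(placeholder, quote(query, safe=""))

print(build_search_url(MARKUP, "dog food"))
```

The point is that the client never hard-codes the petstore's search URL: it reads the template and the parameter name from the page itself.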

This is what you get:



Google Knowledge Graph

Here is what you write:

http://insidesearch.blogspot.com/2014/06/find-music-on-google-and-start-playing.html
http://blog.sgo.to/2014/09/listen-action-in-practice.html

<div itemscope itemtype="http://schema.org/MusicGroup">
  <meta itemprop="description" content="Lady Gaga, an artist on Spotify">
  <meta itemprop="url"
      content="http://open.spotify.com/artist/1HY2Jd0NmPuamShAr6KMms" />
  <div itemprop="potentialAction" itemscope
      itemtype="http://schema.org/ListenAction">
    <meta itemprop="target"
        content="http://open.spotify.com/artist/1HY2Jd0NmPuamShAr6KMms" />
  </div>
</div>

And this is what you get:


Maps


Here is what you write:

https://developers.google.com/search/docs/data-types/local-businesses
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Restaurant",
  "@id": "http://davescafe.example.com/",
  "name": "Dave's Cafe",
  "image": "http://davescafe.example.com/image.jpg",
  "address" :{
    "@type": "PostalAddress",
    "streetAddress": "123 William St",
    "addressLocality": "New York",
    "addressRegion": "NY",
    "postalCode": "10038",
    "addressCountry": "US"
  },
  "geo":{
    "@type": "GeoCoordinates",
    "latitude": 40.709312,
    "longitude": -74.007136
  },
  "telephone": "+19172423826",
  "potentialAction": {
    "@type": "OrderAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://www.example.com/daves-cafe-new-york",
      "inLanguage": "en-US",
      "actionPlatform": [
        "http://schema.org/DesktopWebPlatform",
        "http://schema.org/IOSPlatform",
        "http://schema.org/AndroidPlatform"
      ]
    },
    "deliveryMethod": [
      "http://purl.org/goodrelations/v1#DeliveryModePickUp",
      "http://purl.org/goodrelations/v1#DeliveryModeOwnFleet"
    ]
  }
}
</script>



And here is what you get:




Gmail

Here is what you write:

https://developers.google.com/gmail/actions/reference/rsvp-action

<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Event",
  "name": "Taco Night",
  "startDate": "2015-04-18T15:30:00Z",
  "endDate": "2015-04-18T16:30:00Z",
  "location": {
    "@type": "Place",
    "address": {
      "@type": "PostalAddress",
      "name": "Google",
      "streetAddress": "24 Willie Mays Plaza",
      "addressLocality": "San Francisco",
      "addressRegion": "CA",
      "postalCode": "94107",
      "addressCountry": "USA"
    }
  },
  "potentialAction": {
    "@type": "RsvpAction",
    "target": "http://mysite.com/rsvp?eventId=123",
    "attendance-input": "required"
  }
}</script>

And here is what you get:




Here is one example of an implementer:

https://github.com/blog/1891-view-issue-pull-request-buttons-for-gmail


Google Search App Indexing

Here is what you write for servers:


<script type="application/ld+json">
{
  "@context": "http://schema.org", 
  "@type": "WebPage", 
  "@id": "http://example.com/gizmos", 
  "potentialAction": {
    "@type": "ViewAction", 
    "target": "android-app://com.example.android/http/example.com/gizmos"
  }
}
</script>
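A quick aside on that `target`: the `android-app://` URI packs three things together, namely the package that should open the link, the scheme the app handles, and the host plus path inside the app. Here is a small Python sketch that pulls them apart. The names `AppLink` and `parse_android_app_uri` are made up for illustration, and the sketch assumes the `android-app://{package}/{scheme}/{host_path}` layout used in the example above.

```python
from typing import NamedTuple

class AppLink(NamedTuple):
    package: str    # Android package that should open the link
    scheme: str     # scheme the app handles (http here)
    host_path: str  # host plus path inside the app

def parse_android_app_uri(uri: str) -> AppLink:
    """Split android-app://{package}/{scheme}/{host_path} into its parts."""
    prefix = "android-app://"
    if not uri.startswith(prefix):
        raise ValueError("not an android-app URI: " + uri)
    # Only split on the first two slashes; the rest is the host and path
    package, scheme, host_path = uri[len(prefix):].split("/", 2)
    return AppLink(package, scheme, host_path)

link = parse_android_app_uri(
    "android-app://com.example.android/http/example.com/gizmos"
)
# Rebuild the plain web URL the deep link corresponds to
web_url = link.scheme + "://" + link.host_path
print(link, web_url)
```

Notice that the rebuilt web URL is the same as the page's `@id`, which is what ties the app deep link back to the web page.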

And what you'd write for native android clients:


    // Define a title for your current page, shown in autocompletion UI
    final String TITLE = "App Indexing API Title";

    // Call the App Indexing API view method
    AppIndex.AppIndexApi.view(mClient, this, APP_URI, TITLE, WEB_URL, null);

And this is what you get:



Google Search Social App Activities

Here is what you write:


<script type="application/ld+json">
{
  "@type": "http://schema.org/ListenAction",
  "object": {
    "@type": "MusicRecording",
    "url": "https://developers.google.com/+/web/snippet/examples/song",
    "name": "When Johnny Comes Marching Home"
  }
}
</script>

And this is what you get:


Yandex Islands

This is what you write:

https://help.yandex.com/webmaster/interactive-answers/buttons-description.xml
https://github.com/bobuk/islands/blob/master/interactive-answers-eng.md

<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "CheckInAction",
  "object": {
    "@type": "Flight"
  },
  "target" : "http://www.example.com/check_in"
}
</script>

And here is what you get:



Microsoft's App Linking

Here is what you write:

http://msdn.microsoft.com/en-us/library/dn614166.aspx

<span itemscope itemtype="http://schema.org/WebPage"> 
  <span itemprop="potentialAction" itemscope
      itemtype="http://schema.org/ViewAction">
    <span itemprop="target" itemscope
        itemtype="http://schema.org/WindowsActionHandler"> 
    </span>
    <span itemprop="target" itemscope
        itemtype="http://schema.org/WindowsPhoneActionHandler">
    </span>
  </span>
</span>

And this is what you get:




What's next?

Holograms :) JK :) But I promise it will be exciting, stay tuned!

In the meantime, you might want to read what I'm up to (and my quest to make APIs suck less) if you are bored and care about the geeky details :)

I'll try to keep this page updated as I find more usages. If you know of cool ways your company/partners/webmasters are using http://schema.org/Action, drop me a line in the comments below and I'll make sure to add them here!


Listen Action in practice

A step towards distributed affordances.

While I was reading about distributed affordances on my way to WWW2014, I kept saying to myself: oh goodness, if only I could talk about what I've been up to :) Well, I guess now that I got this out of the door I can :)

The basic idea about distributed affordances is simple: presenting to the user affordances from external sources in a serendipitous way.

For example, for a specific book, what's the infrastructure needed to assist users with external services where the book can be "bought", "borrowed" or "discovered"? That is, how do we enable this sort of experience (granted, not a pretty mock, it was in a PhD thesis, just for context :)):

Baby steps

One small step towards that goal was to solve the first part: describing affordances externally and aggregating them.

Today, if you search for your favorite artist on Google, you get a set of links to external applications where you can listen to them.


How it works isn't that complicated: affordances exist out there, webmasters expose them, Google crawls them and assists users.

Take for example the affordances of this page about Katy Perry on rdio.com. There is a clear call to action to listen to her songs.

The tricky part was: machines couldn't read that.


Enter ListenAction

The delta needed was the ability to organize affordances as well as attach them to resources. With those mechanisms handy, developers/webmasters have the language to write:

<div itemscope itemtype="http://schema.org/MusicGroup">
  <meta itemprop="description" content="Lady Gaga, an artist on Spotify">
  <meta itemprop="url"
      content="http://open.spotify.com/artist/1HY2Jd0NmPuamShAr6KMms" />
  <div itemprop="potentialAction" itemscope
      itemtype="http://schema.org/ListenAction">
    <meta itemprop="target"
        content="http://open.spotify.com/artist/1HY2Jd0NmPuamShAr6KMms" />
  </div>
</div>

And that becomes machine readable to consumers of this page. Consumers know this page is about an artist which affords being listened to.
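To show what "machine readable" buys you, here is a rough Python sketch of a consumer that reads the markup and collects every action target it finds. It uses only the standard library `html.parser`; the class name `ActionFinder` is hypothetical and the microdata handling is far from complete (it only tracks `itemscope`/`itemtype` nesting and `target` properties carried by `<meta>` tags).

```python
from html.parser import HTMLParser

PAGE = """
<div itemscope itemtype="http://schema.org/MusicGroup">
  <meta itemprop="url"
      content="http://open.spotify.com/artist/1HY2Jd0NmPuamShAr6KMms" />
  <div itemprop="potentialAction" itemscope
      itemtype="http://schema.org/ListenAction">
    <meta itemprop="target"
        content="http://open.spotify.com/artist/1HY2Jd0NmPuamShAr6KMms" />
  </div>
</div>
"""

VOID = {"meta", "link", "img", "br", "hr", "input"}  # tags without a close tag

class ActionFinder(HTMLParser):
    """Collects (item type, target URL) pairs from schema.org microdata."""

    def __init__(self):
        super().__init__()
        self.stack = []    # itemtype (or None) for every open element
        self.actions = []  # (itemtype, target) pairs found so far

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in VOID:
            # Void elements never go on the stack; check for a target here
            scope = next((t for t in reversed(self.stack) if t), None)
            if a.get("itemprop") == "target" and scope:
                self.actions.append((scope, a.get("content")))
            return
        # itemscope is a boolean attribute; remember the type it opens
        self.stack.append(a.get("itemtype") if "itemscope" in a else None)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()

finder = ActionFinder()
finder.feed(PAGE)
print(finder.actions)
```

A real consumer would use a proper microdata parser, but even this toy version gets from a page to "this thing can be listened to at this URL" without a single provider-specific regex.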

Come again?

OK, that was probably a bit complicated to follow if you are new to the subject. Here is a much simpler way to put it:


More actions?

Now sure, this isn't entirely serendipitous because I cheated on the second step: I hard code ListenAction.

Sure, I'll take that. One step at a time.

Here [1, 2] are some other folks who imagined the same world.


(image from http://webofdata.wordpress.com/2012/07/08/schema-org-webintents-awesomeness/)

What's next?

Not sure yet, but it will be exciting! Stay tuned!


Saturday, April 5, 2014

What all APIs still look like as WebPages

... or how far we are from the human web.

This is part of a series. You want to read this before carrying on.

In my last post, I tried to show you what your web API looks like to a web developer. In this post, I'll go over what your L3 API STILL looks like to a web developer.

Here are a few things we take for granted on the web that make our APIs feel like they were designed back in 1995.

Websites are always at HEAD: The Versionless API challenge

Now, whether you are an L3 API or not, we still live in a world where APIs are versioned and moving from one version to another is done out of band.

There is a premise that hypermedia APIs make transitioning from one version to another a lot easier, mostly due to the fact that you are in control of the links.

But still, there are plenty of problems to be resolved (e.g. changing the cardinality of your properties, changing your schema in a non-backward-compatible manner, etc.), and this is one of those promises where we'll have to let time be the judge.

Show me a respectable API, with millions of users, that has transitioned through 3-4 different versions over more than 2-3 years, and you can call this problem done.

This is how absurd your APIs currently look to a user of the human web:

http://example.org/index.html

Hi, welcome to our website!

Our website has multiple versions, click here to continue:

My website v1
My website v2
My website v3

The Discovery of APIs challenge

One of the biggest challenges about existing APIs is that they are non-discoverable. There is no "crawler" equivalent for APIs. Part of the problem is that they are entirely disconnected from the human web.

What that means is that, in order for me to use your API, I actually have to go to your website and read your documentation in English, rather than finding it programmatically.

What we need is a google.com-like crawler for APIs (and hopefully we've learned from UDDI and the register-vs-crawl battle).

Your APIs look like those websites that are entirely irrelevant to search engines, those where you have to copy-paste the URL to get to them.

http://example.org/index.html
I have no idea how you got here, but welcome to our website!

The Vocabulary Fragmentation challenge

The third biggest challenge with your API right now is that it is in Klingon. Yeah, that's right, I don't understand a word of what's written over there, because we are not using the same vocabulary.

We need a language that everybody speaks, that everybody can understand. Something like Latin.

http://example.org/index.html


The Verb Taxonomy Fragmentation challenge

The fourth challenge is that your API is trying to give me affordances that I don't understand. Stop trying to make fetch happen, it is not going to happen.

We need to agree on a set of affordances that we all understand. Sort of like what GET/POST/DELETE/PATCH are for HTTP, but at the application semantic level.

http://example.org/index.html

Welcome to my website! Fill the following form to "fetch" our products!

The Hypermedia media type fragmentation challenge

Fragmentation is certainly an important step that needs to be taken before unification. We need diversity and we need exploration. I love seeing efforts like Hydra, UBER, HAL and Siren pop up!

But at some point in time, we need to converge into a unified media type. Like text/html is for the human web.

http://example.org/index.html




The API Management challenge

Finally, managing APIs is done completely differently from managing web sites. That's insane. You have to sign up for an API key before you even use an API? When was the last time that you've seen that on a website?

http://example.org/index.html

Welcome to our website! We have detected that you are coming from a corporate IP!

Unfortunately, you can't use our website until your company signs some paperwork up front.

Could you please ask your company to get their "API KEYS" so that we can apply quotas and throttle?

So, where do we go from here?

As usual, follow your nose :) Be part of the dialogue! If you have ideas on how to address these things, let's talk, I'd like to learn about them!

This is part of a series of posts. You can read more about that here.

Wednesday, April 2, 2014

What your API would look like as a WebPage

... or, do you really want to write your API in .txt?

Lately, I have been in many discussions where I had to make a case for hypermedia. I have found that the easiest way to introduce folks to hypermedia is to just claim that their APIs are at a "lower maturity level" (usually L2) and point them to Richardson's Maturity Model.




What I have also found is that, while the general concept of hypermedia makes sense for some people, it is hard for them to understand what that means in practice *.

* I don't blame them. There is very little on the subject in concrete terms (e.g. existing robust hypermedia APIs working in production code in a reasonably large corporation).

I thought that, if I showed them what their APIs would look like as if they were web pages, they'd get insulted enough to re-consider :) So, here it goes :)

Level 0: The Swamp of POX

I very rarely come across an L0 API in my day job, but I understand there is a bunch of enterprise code that uses SOAP. At this level, your API calls go through one endpoint and you are only using one HTTP method.

POST / HTTP/1.1
Host: www.example.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 299
SOAPAction: "http://www.w3.org/2003/05/soap-envelope"

<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <m:ScheduleAppointment xmlns:m="http://www.example.org/doctors">
      <m:DoctorId>123</m:DoctorId>
    </m:ScheduleAppointment>
  </soap:Body>
</soap:Envelope>

This is what your API looks like, seen from the perspective of the web for humans:

http://example.org/index.txt
Welcome to my website!

Here are some doctors available:

Mr. Jones and his ID is 123. He has the following available slots:

- Tuesday, 05/02
- Wednesday, 05/03

If you'd like to make an appointment, please, take your browser window and append act=create and doc=123 and slot=05/02 to our web site!

K THX BYE XOXO

As you could've imagined, all interactions with this website are via its root page. There is just index.txt and that HTTP servlet does it all.

It is also notable that the entire website is in a .txt file, as opposed to a hypermedia media type like .html. What that means is that the website is instructing you to manually construct their URLs to carry on. I know, right!

http://example.org/index.txt?act=create&doc=123&slot=05/02
Appointment slot created!

Please, don't reload this page, or else we'll create another appointment! Just go away :)



Thanks for your business!

Because you are not using HTTP in its entirety (e.g. you are creating a reservation slot with a GET), you don't get the benefits that an HTTP client gives you (e.g. warning a user whether they want to re-submit the data when refreshing a page after a POST).
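Here is roughly the rule a client can apply when the server respects HTTP method semantics. HTTP defines GET, HEAD, OPTIONS and TRACE as safe, and PUT and DELETE as idempotent on top of those; the function name `can_repeat_silently` is just mine for this sketch.

```python
# Safe methods are read-only by definition
SAFE = {"GET", "HEAD", "OPTIONS", "TRACE"}
# Idempotent methods can be repeated with the same end result
IDEMPOTENT = SAFE | {"PUT", "DELETE"}

def can_repeat_silently(method: str) -> bool:
    """True if a client may reload or retry without asking the user first."""
    return method.upper() in IDEMPOTENT

for method in ("GET", "POST"):
    verdict = "just reload" if can_repeat_silently(method) else "ask the user first"
    print(method, "->", verdict)
```

If you create appointments with a GET, this rule tells the client that reloading is harmless, and the client happily books the slot twice.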

Level 1: Resources

This is a slightly more common case, but I still very rarely find this in my day job.

GET /doc/123/slots/05/02.txt?action=create HTTP/1.0
Host: example.com

At this level, you are breaking up your API into multiple resources but still using just one HTTP method.

http://example.org/index.txt
Hi Welcome to our website! Here is our directory of doctors:
- Mr. Jones and his ID is 123.
If you'd like to make an appointment, please, take your browser window and append doc/ and the doctor ID to our web site!






Because you are not using hypermedia (e.g. links), you are still making your users construct their URLs by hand. They get IDs from you, look at your documentation and construct the next step.
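In client code, that usually ends up looking something like the sketch below: URL templates copied by hand out of the documentation and filled in with IDs. The template and the function name `booking_url` are assumptions made up to match the example pages in this post.

```python
# URL template copied by hand from the (hypothetical) API documentation.
# If the server ever changes its URL layout, this client silently breaks.
SLOT_URL = "http://example.org/doc/{doctor_id}/slots/{date}.txt?action=create"

def booking_url(doctor_id: str, date: str) -> str:
    """Build the 'create appointment' URL the way the docs told us to."""
    return SLOT_URL.format(doctor_id=doctor_id, date=date)

print(booking_url("123", "05/02"))
```

Everything the client knows about where to go next lives in that string constant, not in the responses it gets back.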

http://example.org/doc/123.txt

Hi! Welcome to doctor Jones appointment page! Here are the appointments available:

- Tuesday, 05/02
- Wednesday, 05/03

If you'd like to make an appointment, append /slots/ and the date you'd like to schedule!

I know, seriously, right!

http://example.org/doc/123/slots/05/02.txt

Hi! Welcome to doctor Jones appointments for 05/02!

If you'd like to make an appointment, append a ?action=create to this page to create!

Really? OK, I guess ...

http://example.org/doc/123/slots/05/02.txt?action=create
Appointment slot created! Thanks!

Please, don't reload this page, or else we'll create another appointment! Just go away :)






This is a step forward, albeit quite a small one. Breaking down your API into multiple resources gives you things like addressability (e.g. you can forward links to people now!) and a more scalable load balancing approach.

Level 2: HTTP Verbs

Now, this is where I'm almost sure you are at if you are reading this. You pay your taxes, you eat your vegetables and you went to an Ivy League school. You are trying your best to be RESTful and you have an open mind.

You spent countless hours breaking up your resources into a layout that makes sense, each resource supports very specific verbs ("even PATCH and DELETE" you say, proudly!) and yet ... I'm comparing your API to a website circa 1990.

GET / HTTP/1.0
Host: example.com

HTTP/1.0 200 OK
Content-Type: application/json

{
  "doctors": [{
    "name": "Dr. Jones",
    "id": "123"
  }]
}

The problem is ... you are using a .txt file, like JSON or XML.

This is what your API looks like:

http://example.org/doc/123/slots/05/02.txt
If you'd like to make an appointment, send a POST request to this page to create! As parameters, send a slot=7pm or 8pm which are available!



The issue here is that you are using out-of-band information to tell users what your API affords. You are literally asking them to "read your page" and "construct a POST request" in English, rather than something that the browser can understand (e.g. a <form>).

This applies to links (<a> tags) too, which you are making your clients construct by reading English.

http://example.org/doc/123/slots/05/02.txt
Appointment slot created! Thanks! Feel free to reload this page, your browser will know what to do and ask you if you want to re-submit the data!


You don't run into double-submission problems anymore (as the browser is now aware that a POST is different from a GET, and hence will ask the user whether they really want to re-submit a non-safe operation).

But your API still looks like it was built by a 1990-webmaster.

Level 3: Hypermedia

Welcome to the new world :) Or not so new, if you count the fact that this has been known since the invention of the web :)

Let me introduce you to two breakthrough technologies: <a> tags and <form>s!

What hypermedia media types (e.g. HTML) do is intermingle data and control in the same message, providing a non-linear form of reading.

Now, instead of you having to explain to your users where to go next in English, you can create <a> tags to point them to the right place.

You can also use <form>s to allow write-transitions.

There is no manual your users need to use to browse your API. There is no out-of-band information: the message has both data and control to help you transition to the next place.

GET / HTTP/1.0
Host: example.com

HTTP/1.0 200 OK
Content-Type: application/ld+json

{
  "@context": "http://schema.org",
  "@type": "Clinic",
  "@id": "http://example.com/",
  "doctors": [{
    "@type": "Doctor",
    "@id": "http://example.com/doctors/123",
    "name": "Dr. Jones",
    "appointments": {
      "@type": "AppointmentBook",
      "@id": "http://example.com/doctors/123/slots",
      "action": {
        "@id": "http://example.com/doctors/123/slots",
        "@type":"ScheduleAction"
      }
    }
  }]
}

This is what your API would look like at this stage:

http://example.org/index.html
Hi! Welcome to our website!

Here is a list of our doctors:

- Doctor Jones, appointments



Your users are now able to click on links to go to the next step, because browsers know how to interpret anchor tags.
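The API equivalent of "clicking a link" can be sketched in a few lines of Python. The client below walks a response shaped like the L3 example and asks for "whatever lets me schedule something" instead of building a URL from IDs. The function name `find_actions` is hypothetical, and a real client would use a JSON-LD processor rather than comparing `@type` strings.

```python
import json

RESPONSE = json.loads("""
{
  "@context": "http://schema.org",
  "@type": "Clinic",
  "@id": "http://example.com/",
  "doctors": [{
    "@type": "Doctor",
    "@id": "http://example.com/doctors/123",
    "name": "Dr. Jones",
    "appointments": {
      "@type": "AppointmentBook",
      "@id": "http://example.com/doctors/123/slots",
      "action": {
        "@id": "http://example.com/doctors/123/slots",
        "@type": "ScheduleAction"
      }
    }
  }]
}
""")

def find_actions(node, action_type):
    """Walk the whole document and yield the @id of every matching action."""
    if isinstance(node, dict):
        if node.get("@type") == action_type:
            yield node["@id"]
        for value in node.values():
            yield from find_actions(value, action_type)
    elif isinstance(node, list):
        for item in node:
            yield from find_actions(item, action_type)

print(list(find_actions(RESPONSE, "ScheduleAction")))
```

The only thing hard-coded in the client is the vocabulary ("ScheduleAction"). The URL comes from the message, so the server is free to move it around.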

http://example.org/doc/123/slots/05/02
Welcome to Doctor Jones appointments list! If you'd like to create an appointment, fill the following form:
Patient's name:  Date:


Your users are now able to schedule appointments by filling a form, since your browser understands forms.

OK, but where do I go from here?

Just follow your nose :) There are plenty of trade-offs to be made here but at least you hopefully learned something new.

And, honestly, L3 APIs still have a long way to go, so join us and come be part of the discussion!

You can find out more about this here:

(Ha! You thought I would give you instructions on how to construct those URLs by hand didn't you?)

Thursday, March 13, 2014

ROWS and IDLs

This is part of a series of posts. You can find the context of this exploration here.

This week I was looking into different approaches for dealing with input constraints and requirements and I found this great talk by Mike:





That prompted me into digging into each of these approaches and trying to understand what they look like in real life. I wanted examples, because reading specifications can be quite confusing while comparing/contrasting.

I'm 100% sure that there are plenty of errors/mistakes/typos in these examples. If/when you notice them, just let me know and I'll correct them (I tried as much as possible to take them from the official specifications, but I edited them sometimes when space was a constraint).

Just a side note: apart from ALPS, most of the approaches start from API documentations or API entry points. If you know how they could be embedded into the human web (e.g. HTML), I'd love to add those examples here too!

So, let's get started: show me the code!

Hydra (200?)

Hydra is the specification I probably relate to the most. That's probably because Markus is super engaged with our design and has even kindly created a proposal of his own.

It just makes sense to me. It is a bit verbose and I agree it can be simplified, but all the information that's needed is there. The semantics of the request parameters are kept.

{
  "@id": "/hydra/api-demo/vocab#EntryPoint/issues",
  "@type": "http://purl.org/hydra/core#Link",
  "description": "The collection of all issues",
  "label": "issues",
  "operation": [{
    "@type": "CreateResourceOperation",
    "label": "Creates a new Issue entity",
    "method": "POST",
    "expects": {
      "@id": "/hydra/api-demo/vocab#Issue",
      "@type": "hydra:Class",
      "description": "An Issue tracked by the system.",
      "label": "Issue",
      "supportedProperties": [{
          "property": "/hydra/api-demo/vocab#Issue/title",
          "readonly": false,
          "writeonly": false
        }, {
          "property": "/hydra/api-demo/vocab#Issue/description",
          "readonly": false,
          "writeonly": false
         }, {
          "property": "/hydra/api-demo/vocab#isOpen",
          "readonly": false,
          "writeonly": false
         }, {
          "property": "/hydra/api-demo/vocab#Issue/raisedBy",
          "readonly": true,
          "writeonly": false
         }]
    }
  }]
}
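To check my own understanding of what a client gets out of this, here is a small Python sketch that reads an operation shaped like the one above and lists the properties a client is allowed to send. The dictionary is a trimmed copy of the example, and the function name `writable_properties` is mine, not part of Hydra.

```python
from typing import List

OPERATION = {
    "@type": "CreateResourceOperation",
    "method": "POST",
    "expects": {
        "@id": "/hydra/api-demo/vocab#Issue",
        "supportedProperties": [
            {"property": "/hydra/api-demo/vocab#Issue/title",
             "readonly": False, "writeonly": False},
            {"property": "/hydra/api-demo/vocab#Issue/description",
             "readonly": False, "writeonly": False},
            {"property": "/hydra/api-demo/vocab#Issue/raisedBy",
             "readonly": True, "writeonly": False},
        ],
    },
}

def writable_properties(operation: dict) -> List[str]:
    """Return the property IRIs a client may include in the request body."""
    supported = operation["expects"]["supportedProperties"]
    return [p["property"] for p in supported if not p["readonly"]]

print(OPERATION["method"], writable_properties(OPERATION))
```

Because the properties are identified by IRIs rather than bare names, the client still knows what each field means, which is the part I care about.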

Siren (2012)

Second to Hydra is Siren. Actions are taken on resources, but both the semantics of the actions (e.g. "Add Item") as well as the parameter names (e.g. "orderNumber") are lost.

{
  "class": [ "order" ],
  "properties": { 
      "orderNumber": 42, 
      "itemCount": 3,
      "status": "pending"
  },
  "actions": [{
    "name": "add-item",
    "title": "Add Item",
    "method": "POST",
    "href": "http://api.x.io/orders/42/items",
    "type": "application/x-www-form-urlencoded",
    "fields": [{
        "name": "orderNumber", "type": "hidden", "value": "42"
      }, {
        "name": "productCode", "type": "text"
      }, {
        "name": "quantity", "type": "number"
    }]
  }]
}
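For comparison, this is about all a generic client can do with that action: treat it like an HTML form, pre-fill the hidden fields and encode the rest. A minimal Python sketch, where `build_request` is a made-up helper and the action is copied from the example above:

```python
from urllib.parse import urlencode

ACTION = {
    "name": "add-item",
    "method": "POST",
    "href": "http://api.x.io/orders/42/items",
    "type": "application/x-www-form-urlencoded",
    "fields": [
        {"name": "orderNumber", "type": "hidden", "value": "42"},
        {"name": "productCode", "type": "text"},
        {"name": "quantity", "type": "number"},
    ],
}

def build_request(action: dict, user_input: dict):
    """Return (method, url, headers, body) for a form-encoded Siren action."""
    values = {}
    for field in action["fields"]:
        name = field["name"]
        if field["type"] == "hidden" and "value" in field:
            values[name] = field["value"]   # provided by the server
        elif name in user_input:
            values[name] = user_input[name]  # provided by the user
    headers = {"Content-Type": action["type"]}
    return action["method"], action["href"], headers, urlencode(values)

print(build_request(ACTION, {"productCode": "ABC-1", "quantity": 2}))
```

It works, but note that the client has no idea what "productCode" means: it can render a form for a human, not decide what to put in it.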

Swagger (2009)

Very close to Siren comes Swagger, which also has the notion of the operations taken on the resource, but with the semantics of the parameters being lost too.

"apis":[
  {
    "path": "/pet.{format}/{petId}",
    "description": "Operations about pets",
    "operations": [
      {
        "parameters":[
          {
            "paramType": "path",
            "name": "petId",
            "description": "ID of pet that needs to be fetched",
            "dataType": "integer",
            "format": "int64",
            "required": true,
            "minimum": 0,
            "maximum": 10
          }
        ],
        ...
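What a client can do with such a description is mostly validation and URL building. Here is a minimal Python sketch under the assumption that the parameter looks like the one above; `expand_path` and the default `json` format are mine, not something Swagger prescribes.

```python
PATH = "/pet.{format}/{petId}"
PET_ID = {
    "paramType": "path",
    "name": "petId",
    "dataType": "integer",
    "required": True,
    "minimum": 0,
    "maximum": 10,
}

def expand_path(path: str, param: dict, value: int, fmt: str = "json") -> str:
    """Validate the value against the description, then fill the path."""
    if not isinstance(value, int):
        raise TypeError(param["name"] + " must be an integer")
    if not param["minimum"] <= value <= param["maximum"]:
        raise ValueError(param["name"] + " is out of range")
    filled = path.replace("{format}", fmt)
    return filled.replace("{" + param["name"] + "}", str(value))

print(expand_path(PATH, PET_ID, 7))
```

The constraints are machine readable, but "petId" is still just a name: nothing tells the client that it identifies the same kind of thing as a pet ID in somebody else's API.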

RSDL (2013)

RSDL, like WADL, feels a bit awkward to embed in a JSON payload. In addition to that, it seems like the application-level semantics of the PUT are lost. It is important to note that the semantics of the parameters are kept, though.

  <link rel="update" href="/api/clusters/{cluster:id}">
  <request>
    <http_method>PUT</http_method>
    <headers>
      <header required="true">
        <name>Content-Type</name>
        <value>application/xml|json</value>
      </header>
    </headers>
    <body>
      <type>Cluster</type>
      <parameters_set>
        <parameter type="xs:string" required="false">
          <name>cluster.name</name>
        </parameter>
        <parameter type="xs:string" required="false">
          <name>cluster.description</name>
        </parameter>
        <parameter type="xs:string" required="false">
          <name>cluster.cpu.id</name>
        </parameter>
        <parameter type="xs:boolean" required="false">
          <name>cluster.gluster_service</name>
        </parameter>
        <parameter type="xs:boolean" required="false">
          <name>cluster.threads_as_cores</name>
        </parameter>
      </parameters_set>
    </body>
  </request>
  <response>
    <type>Cluster</type>
  </response>
  </link>

WADL (2009)

WADL is a bit confusing to me because I'm not entirely sure how to embed it in my messages. It feels like this is a document I'd find that would globally describe my API, rather than control that I'd find at the message level. Perhaps someone can point me to a better example?

<?xml version="1.0"?> 
 <application> 
   <grammars> 
     <include href="NewsSearchResponse.xsd"/> 
     <include href="Error.xsd"/> 
   </grammars> 
    <resources base="http://api.search.yahoo.com/NewsSearchService/V1/"> 
     <resource path="newsSearch"> 
       <method name="GET" id="search"> 
         <request> 
           <param name="query" type="xsd:string" 
             style="query" required="true"/> 
           <param name="type" style="query" default="all"> 
             <option value="all"/> 
             <option value="any"/> 
             <option value="phrase"/> 
           </param> 
           <param name="language" style="query" type="xsd:string"/> 
         </request> 
         <response status="200"> 
           <representation mediaType="application/xml" 
             element="yn:ResultSet"/> 
         </response> 
       </method> 
     </resource> 
   </resources> 
  </application>

JSON Home (2013)

JSON Home looks a lot like a "pre-computed OPTIONS" language, giving you "hints" of which HTTP operations you can perform. Cool, but not exactly at the level that I'm looking for.

   "http://example.org/rel/widget": {
     "href-template": "/widgets/{widget_id}",
     "href-vars": {
       "widget_id": "http://example.org/param/widget"
     },
     "hints": {
       "allow": ["GET", "PUT", "DELETE", "PATCH"],
       "representations": ["application/json"],
       "accept-patch": ["application/json-patch"],
       "accept-post": ["application/xml"],
       "accept-ranges": ["bytes"]
     }
   }
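
To make the "pre-computed OPTIONS" idea a bit more concrete, here is a minimal Python sketch of how a client might read the entry above. The `expand` and `can` helpers are hypothetical names of mine, and `expand` only handles the simple `{name}` form of a template, not the full URI Template syntax.

```python
import re

# The JSON Home entry from the example above, as a Python dict.
resource = {
    "href-template": "/widgets/{widget_id}",
    "href-vars": {"widget_id": "http://example.org/param/widget"},
    "hints": {
        "allow": ["GET", "PUT", "DELETE", "PATCH"],
        "representations": ["application/json"],
    },
}


def expand(template, values):
    """Fill simple {name} placeholders (hypothetical helper, not full RFC 6570)."""
    return re.sub(r"\{(\w+)\}", lambda m: str(values[m.group(1)]), template)


def can(resource, method):
    """True if the hints advertise the given HTTP method."""
    return method in resource["hints"]["allow"]


url = expand(resource["href-template"], {"widget_id": 42})
print(url, can(resource, "PATCH"), can(resource, "POST"))
```

The client learns which methods are allowed on the expanded URL, but nothing about what those methods mean at the application level.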

ALPS

What I love about ALPS is that their example starts with an HTML page! With a form! Now we are talking!

If I understand this correctly, rel="profile" is a magic keyword that informs you that elements with class="search" will be tied to a specific application level semantic.

<!-- sample HTML document -->                    
<html>
    <head>
        <link rel="profile" href="http://alps.io/documents/search" />
    </head>
    <body>
        <form class="search" action="..." method="get">
            <input type="text" name="search" value="..." />
            <select name="resultType">
                <option value="summary" />
                <option value="detailed" />
            </select>
            <input type="submit" />
        </form>
    </body>
</html>                           

You'd find the following at this URL, http://alps.io/documents/search, which would tell you that the form lets you search via parameters like "search".

<?xml version="1.0"?>
<alps version="1.0">
    <doc href="http://example.org/samples/full/doc.html" />

    <descriptor id="search" 
        type="safe">
        <doc format="text">
            A search form with two inputs.
        </doc>
        <descriptor href="#resultType" />
        <descriptor id="value"
            name="search" 
            type="semantic">
            <doc>input for search</doc>
        </descriptor>
    </descriptor>

    <descriptor id="resultType"
        type="semantic">
        <doc>results format</doc>
        <ext 
            href="http://alps.io/ext/range" 
            value="summary,detail" />
    </descriptor>
</alps>

Atom Service Documents (2007)

Atom service documents work like JSON Home, in the sense that you can express your affordances via the HTTP methods that you accept as well as the Content-Types that you'd expect.

  <service xmlns="http://www.w3.org/2007/app"
           xmlns:atom="http://www.w3.org/2005/Atom">
    <workspace>
      <atom:title>Main Site</atom:title>
      <collection
          href="http://example.org/blog/pic" >
        <atom:title>Pictures</atom:title>
        <accept>image/png</accept>
        <accept>image/jpeg</accept>
        <accept>image/gif</accept>
      </collection>
    </workspace>
  </service>

The content of an "app:accept" element value is a media range as defined in [RFC2616]. The media range specifies a type of representation that can be POSTed to a Collection.

      HTTP/1.1 201 Created
      ...
      <entry xmlns="http://www.w3.org/2005/Atom">
        <title>Atom-Powered Robots Run Amok</title>
        <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
        <content>Some text.</content>
        <link rel="edit"
            href="http://example.org/edit/first-post.atom"/>
      </entry>

JSON Hyperschema (2013)

While JSON Hyperschema doesn't express exactly the "affordances", it does express the "restrictions", via a "required" property.

{
    "title": "Written Article",
    "type": "object",
    "properties": {
        "id": {
            "title": "Article Identifier",
            "type": "number"
        },
        "title": {
            "title": "Article Title",
            "type": "string"
        },
        "authorId": {
            "type": "integer"
        },
        "imgData": {
            "title": "Article Illustration (small)",
            "type": "string",
            "media": {
                "binaryEncoding": "base64",
                "type": "image/png"
            }
        }
    },
    "required" : ["id", "title", "authorId"]
}
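
As a rough illustration of what "expressing the restrictions" buys a client, here is a small Python sketch that checks an instance against the `required` list and the basic types of a trimmed copy of the schema above. The `violations` helper and the `PY_TYPES` mapping are my own assumptions for this example; a real validator covers far more of the specification.

```python
# A trimmed copy of the schema above: only the parts this sketch checks.
schema = {
    "properties": {
        "id": {"type": "number"},
        "title": {"type": "string"},
        "authorId": {"type": "integer"},
    },
    "required": ["id", "title", "authorId"],
}

# Mapping from JSON Schema type names to Python types (small subset).
PY_TYPES = {"number": (int, float), "integer": int, "string": str}


def violations(instance, schema):
    """Return a list of human-readable problems; an empty list means it passes."""
    problems = [
        f"missing required property: {name}"
        for name in schema.get("required", [])
        if name not in instance
    ]
    for name, value in instance.items():
        expected = schema["properties"].get(name, {}).get("type")
        if expected in PY_TYPES and not isinstance(value, PY_TYPES[expected]):
            problems.append(f"{name} should be of type {expected}")
    return problems


print(violations({"id": 1, "title": "Hello"}, schema))
```

Note that this tells a client what a valid article looks like, but still not what it can do with one.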

WSDL (2001)

WSDL describes a SOA service by describing the messages, the parameters and the methods. Not great.

<definitions name="HelloService"
   targetNamespace="http://www.examples.com/wsdl/HelloService.wsdl"
   xmlns="http://schemas.xmlsoap.org/wsdl/"
   xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
   xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 
   <message name="SayHelloRequest">
      <part name="firstName" type="xsd:string"/>
   </message>
   <message name="SayHelloResponse">
      <part name="greeting" type="xsd:string"/>
   </message>

   <portType name="Hello_PortType">
      <operation name="sayHello">
         <input message="tns:SayHelloRequest"/>
         <output message="tns:SayHelloResponse"/>
      </operation>
   </portType>

   <binding name="Hello_Binding" type="tns:Hello_PortType">
   <soap:binding style="rpc"
      transport="http://schemas.xmlsoap.org/soap/http"/>
   <operation name="sayHello">
      <soap:operation soapAction="sayHello"/>
      <input>
         <soap:body
            encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
            namespace="urn:examples:helloservice"
            use="encoded"/>
      </input>
      <output>
         <soap:body
            encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
            namespace="urn:examples:helloservice"
            use="encoded"/>
      </output>
   </operation>
   </binding>

   <service name="Hello_Service">
      <documentation>WSDL File for HelloService</documentation>
      <port binding="tns:Hello_Binding" name="Hello_Port">
         <soap:address
            location="http://www.examples.com/SayHello/"/>
      </port>
   </service>
</definitions>

RESTDESC (2010)

I don't think I follow RESTDESC entirely, but I think it defines the requests that can be made (via the _:request property).

Example from here and here.

@prefix book: <http://example.org/book#>.
{
  ?book book:reviewForm ?reviewForm.
}
=>
{
  _:request http:methodName "POST";
  http:requestURI ?reviewForm;
  http:body [tmpl:formData ("text=" ?text)];
  http:resp [tmpl:represents ?review].
  ?book book:review ?review.
  ?review book:reviewText ?text.
}.

API Blueprint (2012)

I don't think I follow API Blueprint either, but if I had to guess, you'd write a document in the format below and that would generate an API description that a client could bind to?

If so, I'm not entirely sure how that's going to be embedded in the message.

## Star [/gists/{id}/star{?access_token}]
Star resource represents a Gist starred status. 
The Star resource has the following attribute:
- starred
+ Parameters
    + id (string) ... ID of the gist in the form of a hash
    + access_token (string, optional) ... Gist Fox API access token.    
+ Model (application/hal+json)
    HAL+JSON representation of Star Resource.
    + Headers
            Link: <http://api.gistfox.com/gists/42/star>;rel="self"
    + Body
            {
                "_links": {
                    "self": { "href": "/gists/42/star" }
                },
                "starred": true
            }

Related work

There are a few other ones for which I couldn't find good examples to embed (searching the web or reading their specs). If you are familiar with one of these and want to have it included here, feel free to send me an example:
Here are some of the IDLs that I didn't mention because I was already quite familiar with them :) But if you are not, they are worth a read:

How do they relate to one another?

This is probably an arbitrary classification, but it helps me organize them in classes that are important to me (they probably won't make that much sense to an arbitrary reader).

The RO column is easy to understand.

The IN column is easy too, but there is a lot of data that may be missing/incorrect (if you know better, feel free to drop me a line and I'll get that corrected). It refers to the notion of whether the control information is embedded with the message or is found in a global master document.

The AS column and PS column are slightly harder to explain, I think.

The AS column refers to whether the language "explains natively" what action is being taken at an "application level" (as opposed to at an "HTTP protocol" level). For example, while an HTML form doesn't explain to a machine what is being done, Hydra's ReplaceResourceOperation does.

The PS column refers to whether the language "enforces natively" what the parameters should look like (e.g. the types and properties that are required).

Having said that, let's move on to the table:

YR = Year
RO = Resource Oriented
IN = Action control inlined with the message *
AS = Application level semantics pre-defined **
PS = Parameter class/property semantics/requirements ***

*   as opposed to a global description of the entire API.
**  as opposed to HTTP-level semantics of the actions. A pre-defined set of actions explains the "what" (semantics) in addition to the "how" (http methods).
*** as opposed to the usage of mime types or opaque parameters.

? are for fields that I'm still collecting

+-------------------------------------------------+
| NAME                 |  YR  | RO | IN | AS | PS |
+-------------------------------------------------+
| Hydra                | 200? | Y  | YN | Y  | Y? |
| ATOM                 | 2007 | Y  | Y  | Y  | Y  |
+-------------------------------------------------+
|     Realm of IDLs with user/opaque actions      |
+-------------------------------------------------+
| ALPS?                | 201? | Y  | Y? | ?  | Y? |
| Siren                | 2012 | Y  | Y  | ?  | Y? |
+-------------------------------------------------+
|   Realm of IDLs with user/opaque parameters     |
+-------------------------------------------------+
| Swagger              | 2009 | Y  | Y  | N  | N  |
| JSON-HOME            | 2013 | Y  | Y  | N  | N  |
+-------------------------------------------------+
|        Realm of IDLs that are not inlined       |
+-------------------------------------------------+
| RSDL                 | 2013 | Y  | N  | N  | Y  |
| WADL                 | 2009 | Y  | N  | N  | Y  |
| BLUEPRINT            | 2012 | Y  | N  | N  | Y  |
| RESTDESC?            | 2010 | Y  | ?  | N  | Y  |
+-------------------------------------------------+
|         Realm of IDLs that are not RO           |
+-------------------------------------------------+
| WSDL                 | 2001 | N  | N  | N  | N  |
+-------------------------------------------------+

Where do we go from here?

Stay tuned. More to follow!

Friday, February 28, 2014

ROWS

Resources-Oriented Web Services (ROWS) is a set of technologies that enable the programmatic discovery, description and invocation of actions on resources.

It connects the human web with the programmable web. It is an alternative to Service-Oriented Architectures that is more aligned with how the web works (URLs and A Uniform Interface: REST. The "R" in URL is the key part.).

This is part of a series of posts. You want to read this and this before carrying on.

This is a human-readable walkthrough of a slightly more technical specification (build).

What problems are we set to solve again?

Our goal is to connect the "human web" to the "programmable web". Our challenge is to automate what we currently do manually as humans.

We, as a community, haven't yet converged on the following:
  1. A general framework, a way for each individual service to tell clients about its resource design, its representation formats and the links it provides between resources.
  2. A language with a vocabulary that can describe the variety of RESTful and hybrid services. A document written in this language could script a generic web service client, making it act like a custom-written wrapper. More specifically, we'll need to tell clients:
    • What semantic operations are available to be performed
    • Which HTTP method to use
    • What the expected entity-body looks like
    • What to expect to get back after you invoke

Let's look at what our starting point looks like

The human web starts by using a browser to send a GET request to a resource via a URL. Let's say you were looking to book a cab; here is what you'd do under the hood:

GET /mountain-view HTTP/1.1
Host: www.yellowcab.com

And the server responds:

HTTP/1.1 200 OK
Content-Type: text/html

<html>
<body>

<span>Welcome to Yellow Cab Mountain View!</span>

<a href="/mountain-view/reservations">
Click here to book a cab!
</a>

</body>
</html>

Now, that's great for humans to consume, but a computer can't tell the difference between this and a web page about frogs.

Enter JSON-LD, microdata and schema.org

The first step to help computers make sense of this page is to tell it explicitly what this is about.

There are a few good methods for transporting linked data in HTML, but my favorites are JSON-LD and microdata*.

* I think that JSON-LD is a more scalable approach overall for large/complex instances, but microdata is easier to grasp in simpler examples. So I'm going to use microdata here in my examples, but bear in mind that I actually much prefer JSON-LD in practice.

That alone isn't sufficient. You need a machine readable description of a taxi stand. Something that a computer could understand. This is where schema.org comes in: it provides a vocabulary that describes things in the universe in a manner that computers can digest.

This is what this web page would look like:

HTTP/1.1 200 OK
Content-Type: text/html

<html>
<body itemscope
    itemid="/mountain-view"
    itemtype="http://schema.org/TaxiStand" >

<span itemprop="description">
Welcome to Yellow Cab Mountain View!
</span>

<a itemprop="reservations"
  href="/mountain-view/reservations"
  itemscope itemtype="http://schema.org/ItemList">
Click here to book a cab!
</a>

</body>
</html>

Now, this basically addresses problem #1 I raised above. It gives you a general framework (JSON-LD/microdata + schema.org) to describe things in a manner that computers can understand.

A computer now knows:
  • This resource is a TaxiStand
  • This TaxiStand has a description
  • This TaxiStand has an ItemList of reservations
But it doesn't yet know what it can do with it.
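
The three facts listed above can be recovered mechanically. As a minimal sketch, assuming Python's standard `html.parser` and a hypothetical `MicrodataScanner` class of my own (it only collects attributes and is not a full microdata parser), a program could read them off the page like this:

```python
from html.parser import HTMLParser

# A condensed copy of the annotated page above.
PAGE = """
<html>
<body itemscope itemid="/mountain-view"
    itemtype="http://schema.org/TaxiStand">
<span itemprop="description">Welcome to Yellow Cab Mountain View!</span>
<a itemprop="reservations" href="/mountain-view/reservations"
  itemscope itemtype="http://schema.org/ItemList">Click here to book a cab!</a>
</body>
</html>
"""


class MicrodataScanner(HTMLParser):
    """Collects itemtype and itemprop attributes in document order."""

    def __init__(self):
        super().__init__()
        self.types = []
        self.props = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.types.append(attrs["itemtype"])
        if "itemprop" in attrs:
            self.props.append(attrs["itemprop"])


scanner = MicrodataScanner()
scanner.feed(PAGE)
print(scanner.types)
print(scanner.props)
```

The scanner finds a TaxiStand with a description and an ItemList of reservations, and nothing that says what can be done with either.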

The link to the programmable web

The programmable web exists in APIs, but it can't be easily found by computers. So, let's add a link between this specific taxi stand and where it can be found in the yellow cab APIs:

<body itemscope
    itemid="/mountain-view"
    itemtype="http://schema.org/TaxiStand" >
  <meta itemprop="alternate" itemscope 
    itemtype="http://schema.org/ApiUrl"
    content="http://api.yellowcab.com/mountain-view"/>

Now we know that this Taxi Service is linked to a specific API.

If you GET-ed that URL you'd get something like the following:

GET /mountain-view HTTP/1.1
Host: api.yellowcab.com

And the server responds:

HTTP/1.1 200 OK
Content-Type: application/ld+json

{
  "@context": "http://schema.org",
  "@type": "TaxiStand",
  "@id": "/mountain-view",
  "description": "Welcome to Yellow Cab!",
  "reservations": {
    "@type": "ItemList",
    "@id": "/mountain-view/reservations"
  }
}

That's much more like what computers can understand. There is a Content-Type header that tells computers how to parse it and inside the hypermedia there is machine-readable information about the resource.
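
To give a feel for how little code it takes to consume this, here is a short Python sketch that parses a copy of the response body and resolves the relative "@id" of the reservations list against the API URL. The `BASE` constant is my assumption about the URL the document was fetched from (it is the post's example host, not a real service).

```python
import json
from urllib.parse import urljoin

# Assumed URL the representation was fetched from (example host from the post).
BASE = "http://api.yellowcab.com/mountain-view"

body = """
{
  "@context": "http://schema.org",
  "@type": "TaxiStand",
  "@id": "/mountain-view",
  "description": "Welcome to Yellow Cab!",
  "reservations": {
    "@type": "ItemList",
    "@id": "/mountain-view/reservations"
  }
}
"""

stand = json.loads(body)
# Follow the link: turn the relative @id into an absolute URL.
reservations_url = urljoin(BASE, stand["reservations"]["@id"])
print(stand["@type"], reservations_url)
```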

But, can a computer tell what to *do* with these resources?

If you wanted to create a reservation, how far would HTTP take you?

The closest to an API discovery mechanism in HTTP is the OPTIONS request. 

OPTIONS /mountain-view/reservations HTTP/1.1
Host: api.yellowcab.com

And it could respond:


HTTP/1.1 200 OK
Allow: OPTIONS, GET, HEAD, POST

POST requests can take you a long way, but as LSM pointed out earlier, they are not sufficient. How would you know:
  1. What entity-body the POST request takes?
  2. Whether the POST is an overloaded POST (e.g. RPC-style POST)?
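
Here is a tiny Python sketch of everything a client can extract from that OPTIONS response. The `allowed_methods` helper is a hypothetical name of mine.

```python
def allowed_methods(allow_header):
    """Split an Allow header value into a set of upper-case method names."""
    return {m.strip().upper() for m in allow_header.split(",") if m.strip()}


methods = allowed_methods("OPTIONS, GET, HEAD, POST")

# The client learns that POST is permitted on the resource...
print("POST" in methods)
# ...but the header carries no information about the expected entity-body
# or about what the POST means at the application level.
```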

Introducing Actions

Actions give you the vocabulary to describe what can be done with resources. They don't require you to make an extra request; rather, they are attached inline with the resources.

It consists of three core mechanisms:
  1. A mechanism to link Actions with Things
  2. A taxonomy of Actions with well defined semantics and invocation constraints
  3. A vocabulary to specify what your parameters look like
Here is what the JSON-LD response could look like:

HTTP/1.1 200 OK
Content-Type: application/ld+json

{
  "@context": "http://schema.org",
  "@type": "TaxiStand",
  "@id": "/mountain-view",
  "description": "Welcome to Yellow Cab!",
  "reservations": {
    "@type": "ItemList",
    "@id": "/mountain-view/reservations",
    "operation": {
      "@type": "CreateAction",
      "expects": {
        "@type": "SupportedClass",
        "subClassOf": "http://schema.org/TaxiReservation"
      }
    }
  }
}
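
Because the operations are inlined, a generic client can discover them by walking the document. A minimal Python sketch, assuming the property names used in the response above ("operation", "expects", "subClassOf") and a hypothetical `find_operations` helper:

```python
doc = {
    "@context": "http://schema.org",
    "@type": "TaxiStand",
    "@id": "/mountain-view",
    "description": "Welcome to Yellow Cab!",
    "reservations": {
        "@type": "ItemList",
        "@id": "/mountain-view/reservations",
        "operation": {
            "@type": "CreateAction",
            "expects": {
                "@type": "SupportedClass",
                "subClassOf": "http://schema.org/TaxiReservation",
            },
        },
    },
}


def find_operations(node):
    """Return (target @id, action @type, expected class) for every operation."""
    found = []
    if isinstance(node, dict):
        op = node.get("operation")
        if isinstance(op, dict):
            expects = op.get("expects", {}).get("subClassOf")
            found.append((node.get("@id"), op.get("@type"), expects))
        # Recurse into the other properties to find nested resources.
        for key, value in node.items():
            if key != "operation":
                found.extend(find_operations(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_operations(item))
    return found


print(find_operations(doc))
```

No extra round trip is needed: the same response that describes the taxi stand also says which action its reservations list accepts and what that action expects.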

You could have also equally found that information in the HTML markup *:

<a itemprop="reservations"
  href="/mountain-view/reservations"
  itemscope itemtype="http://schema.org/ItemList">
  <meta itemprop="alternate" itemscope 
    itemtype="http://schema.org/ApiAppUrl"
    content="http://api.yellowcab.com/mountain-view/reservations" />
  <div itemprop="operation" itemscope 
    itemtype="http://schema.org/CreateAction">
    <meta itemprop="expects" itemscope 
      itemtype="http://schema.org/SupportedClass"
      content="http://schema.org/TaxiReservation"/>
  </div>
Click here to book a cab!
</a>

* I agree that this looks a bit verbose, but I'll show you later how to make this more concise. Hint: it has to do with linked data.

Now, that tells you *everything* a computer needs:

  • This is a TaxiStand.
  • There is an API entry point here
    http://api.yellowcab.com/mountain-view/
  • This TaxiStand has an ItemList of reservations.
  • The reservations ItemList takes CreateAction operations, which have *very* specific semantics (as well as a specification of what it means to invoke them, e.g. a CreateAction is tied to a POST request because it was defined that way).
  • To create a reservation, you pass an instance of a TaxiReservation. 

Which means that from this information a computer can send a request like the following with confidence:

POST /mountain-view/reservations HTTP/1.1
Host: api.yellowcab.com
Content-Type: application/ld+json;charset=utf-8
Content-Length:207

{
  "@context": "http://schema.org/",
  "@type": "TaxiReservation",
  "pickUpLocation":
    "1600 Amphitheatre Parkway, Mountain View, CA",
  "pickUpTime": "2pm",
  "numberOfPassengers": "1"
}
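
As a rough sketch of how a generic client could get from the action description to that request, here is some Python that prepares (but does not send) it with the standard `urllib.request.Request` class. The `METHOD_FOR_ACTION` table and the `build_request` helper are my own assumptions; the table simply encodes the mapping described in this post (CreateAction is tied to POST, CancelAction to PATCH), and the host is the post's example, not a real service.

```python
import json
from urllib.request import Request

# Assumption taken from this post: each action type is tied to one HTTP method.
METHOD_FOR_ACTION = {"CreateAction": "POST", "CancelAction": "PATCH"}


def build_request(target_url, action_type, payload):
    """Prepare (but do not send) the HTTP request that invokes an action."""
    data = json.dumps(payload).encode("utf-8")
    return Request(
        target_url,
        data=data,
        method=METHOD_FOR_ACTION[action_type],
        # application/ld+json is the registered JSON-LD media type.
        headers={"Content-Type": "application/ld+json"},
    )


reservation = {
    "@context": "http://schema.org/",
    "@type": "TaxiReservation",
    "pickUpLocation": "1600 Amphitheatre Parkway, Mountain View, CA",
    "pickUpTime": "2pm",
    "numberOfPassengers": "1",
}

req = build_request(
    "http://api.yellowcab.com/mountain-view/reservations",
    "CreateAction",
    reservation,
)
print(req.get_method(), req.full_url)
```

The point of the design is that nothing in `build_request` is specific to taxis: the method comes from the action type and the body shape comes from the class the action expects.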

And the server should respond with something like the following:

HTTP/1.1 201 Created
Location: http://api.yellowcab.com/mountain-view/reservations/32523325225

This is where hypermedia comes in again.

Hypermedia is an extremely powerful concept. It allows you to hop from one resource to another following links. That's quite powerful.

You just got a resource created, let's take a peek at what it can do:

OPTIONS /mountain-view/reservations/32523325225 HTTP/1.1
Host: api.yellowcab.com

And it could respond:

HTTP/1.1 200 OK
Allow: OPTIONS, GET, HEAD, PATCH
Accept-Patch: application/ld+json

And you'd be quite excited knowing that this is a mutable resource because it takes a PATCH HTTP method.

It tells you additionally that it takes a JSON-LD patch document, which is quite informative too.

But, just as with POST, you wouldn't know the application semantics of the PATCH operation (e.g. what does it mean to "patch"? Is it to update the pick-up time? The drop-off location?).


Enter actions again. Let's GET this resource:

GET /mountain-view/reservations/32523325225 HTTP/1.1
Host: api.yellowcab.com

And now the server can respond to you:

HTTP/1.1 200 OK
Content-Type: application/ld+json
Accept: application/ld+json

{
  "@context": "http://schema.org",
  "@type": "TaxiReservation",
  "@id": "/mountain-view/reservations/32523325225",
  "reservationStatus": "CONFIRMED",
  "operation": {
    "@type": "CancelAction"
  }
}

And because there is a very specific application semantic for CancelAction ("The act of asserting that a future event/action is no longer going to happen.") and a very specific definition of what it means to "cancel" a resource (in HTTP terms it is a PATCH request, according to the definition of "canceling"), it is well defined for a computer to send a PATCH request like the following:

PATCH /mountain-view/reservations/32523325225 HTTP/1.1
Host: api.yellowcab.com
Content-Type: application/ld+json;charset=utf-8
Content-Length:100
Accept: application/ld+json

{
  "@context": "http://schema.org/",
  "@type": "CancelAction"
}

Which now sets the state of the reservation to cancelled:

HTTP/1.1 200 OK
Content-Type: application/ld+json

{
  "@context": "http://schema.org",
  "@type": "TaxiReservation",
  "@id": "/mountain-view/reservations/32523325225",
  "reservationStatus": "CANCELLED"
}

Note too that, once you cancelled, the "CancelAction" operation goes away, because that operation is no longer applicable to a CANCELLED reservation.
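
A client can lean on that behaviour instead of hard-coding a state machine: it only offers what the current representation advertises. A minimal Python sketch, where `available_actions` is a hypothetical helper of mine and the two dicts mirror the CONFIRMED and CANCELLED responses above:

```python
def available_actions(resource):
    """Return the set of action @types a representation currently advertises."""
    ops = resource.get("operation", [])
    if isinstance(ops, dict):  # a single operation is given as a bare object
        ops = [ops]
    return {op["@type"] for op in ops}


confirmed = {
    "@type": "TaxiReservation",
    "@id": "/mountain-view/reservations/32523325225",
    "reservationStatus": "CONFIRMED",
    "operation": {"@type": "CancelAction"},
}
cancelled = {
    "@type": "TaxiReservation",
    "@id": "/mountain-view/reservations/32523325225",
    "reservationStatus": "CANCELLED",
}

print("CancelAction" in available_actions(confirmed))
print("CancelAction" in available_actions(cancelled))
```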

To Wrap Things Up

Phew, that was a lot of information. Here is where I think things fit:

+------------------------+
|         Actions        | <- missing gap #2
+------------------------+
|         Things         | <- missing gap #1
+------------------------+
|         JSON-LD        | <- hypermedia
|        microdata       |
+------------------------+
|          REST          | <- ROA vs SOA
+------------------------+
|          HTTP          | <- URIs, methods
+------------------------+

OK, that was an interesting read. But ...

... there is so much more to talk about.

The devil is in the details, and we haven't yet gotten to what things like collections, authentication, different transport mechanisms (e.g. mobile applications, email messages) and gap #3 should look like.

Stay tuned. More to follow.