Monday, July 27, 2015

application, structure and protocol

... de-coupling common patterns in API design: the "what", "where" and "how".

Some things take a while to digest, and this message from Mike is taking me some time. Like a good wine, it takes experience and experimentation to appreciate, and this one is still not going down super smoothly for me (and honestly, as a community, I still believe we have to agree on terminology here).

What I'm learning the hard way is that you want to de-couple your API design into three independent layers of specificity:
  1. the "what": application
    e.g. events, people, photos, videos, etc
  2. the "where": structure
    e.g. text/html, application/json, text/xml, json-ld, etc
  3. the "how": protocol
    e.g. http://, tel:, mailto:, android-app://, etc
As you move down the list, the message becomes more generic and opaque: each layer knows less about the content it carries. You want each of these components to be used, replaced or switched independently of the others.

Here is the metaphor that I use to explain to my co-workers.

When you start writing a letter you first figure out "what" to write ...

... A love letter, a thank you note, a note, etc. You use "domain-specific" language between you and your recipient, like slang, nicknames and emotions. You use recipient-specific common notions on how to organize the world (taxonomies) as well as belief systems (ontologies).


...  you also decide what's the most convenient paper format to convey your message ...

You pick the paper color, shape, size and layout. If the content of the message is short, you pick a small paper. If the content of the message has structure, you pick a more structured layout. If you are writing free-form, you pick a blank paper.


... and you also have to decide how the letter is going to be delivered ...

You pick different trade-offs between cost and delivery guarantees: re-try policies, tracking numbers, latency, etc.

But more importantly, you want to allow yourself to use any combination of these.

You want to have the ability to write about a variety of things, on a variety of paper layouts, across a variety of delivery methods.

And this translates to designing your hypermedia API

You want to allow your domain-specific vocabularies (e.g. schema.org, ogp.me, microformats.org or your own custom one) to be conveyed in a variety of formats (e.g. web pages, email messages, android XML UIs, etc) and to be transported using different mechanisms (e.g. http, smtp, android intents, etc).
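
To make the combination idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and made up for illustration: the event, the two renderers (`render_json`, `render_html`) and the two senders (`send_memory`, `send_stdout`, standing in for real transports like http or smtp). The only point is that any "where" can be paired with any "how" without either one knowing about the "what".

```python
import json
from html import escape
from typing import Callable, Dict, List

# "what": a domain object using a shared vocabulary (schema.org-style keys).
event = {"@type": "Event", "name": "API meetup", "startDate": "2015-07-27"}

# "where": interchangeable structures for the same content.
def render_json(thing: Dict[str, str]) -> str:
    return json.dumps(thing, sort_keys=True)

def render_html(thing: Dict[str, str]) -> str:
    rows = "".join(
        f"<dt>{escape(key)}</dt><dd>{escape(value)}</dd>"
        for key, value in sorted(thing.items())
    )
    return f"<dl>{rows}</dl>"

# "how": interchangeable delivery mechanisms (hypothetical stand-ins).
outbox: List[str] = []

def send_memory(payload: str) -> None:
    outbox.append(payload)

def send_stdout(payload: str) -> None:
    print(payload)

def deliver(
    thing: Dict[str, str],
    render: Callable[[Dict[str, str]], str],
    send: Callable[[str], None],
) -> None:
    # Any renderer can be paired with any sender; neither knows about the other.
    send(render(thing))

# Every structure crossed with every delivery mechanism.
for render in (render_json, render_html):
    for send in (send_memory, send_stdout):
        deliver(event, render, send)
```

Adding a third structure or a third transport means adding one function, not touching the other two layers.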

So, where do we go from here?

Above and beyond of course!

While I think there is a lot of work to be done at all of these levels, having already spent a lot of time understanding the top layer, I'm particularly interested in the middle layer (structure/layout), starting from where it currently falls short.

Stay tuned!

Sunday, July 26, 2015

hypermedia api controls missing


... a collection of affordances you find for the human-readable web that aren't yet available for the machine-readable web.

This is part of a series of posts where I try to identify things that are missing in hypermedia API design. You may want to read this as a starting point for context and this as a starting point for why it matters to me.

Here is a list of things that we still* have to parse as humans from human-readable documentation:
  1. basic <form> controls
  2. client-side navigation
  3. client-side dependencies
  4. client-side cardinality and grouping of fields
  5. client-side data loading
  6. client-side validation
* I'm unaware of any hypermedia type that I looked at (with the exception of HTML -- with the addition of javascript) that is able to express all of these. I'd love to be proven wrong and educate myself, just drop me a line in the comments with examples and I'll correct myself.

I'll go over what each of these things looks like for the human-readable web.

Basic controls

You want to get some of the basic structural elements HTML forms provide. That includes things like:


  1. enumerations (e.g. <select name='foo'><option value='bar'>bar</option></select>)
  2. default values (e.g. <input value="foobar">)
  3. readonly values (e.g. <input readonly>, or <input type="hidden"> for values the user never sees)
  4. semantic auto-completing for things like first names, phone numbers and credit card numbers (e.g. <input type="tel">, <input type="text" autocomplete="given-name">)
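
As a rough sketch of what a machine-readable equivalent of these controls could look like, here is some Python. The `Field` class and the `fill` function are hypothetical names of mine, not part of any existing hypermedia type; `given-name` is the HTML autocomplete token for a first name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Field:
    name: str
    options: Optional[List[str]] = None   # enumeration, like <select>
    default: Optional[str] = None         # like <input value="...">
    readonly: bool = False                # like <input readonly>
    autocomplete: Optional[str] = None    # semantic hint, like autocomplete="given-name"

def fill(fields: List[Field], answers: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Build a submission from client answers, honoring each field's controls."""
    result: Dict[str, Optional[str]] = {}
    for field in fields:
        # Readonly fields ignore client input; unanswered fields use the default.
        if field.readonly or field.name not in answers:
            value = field.default
        else:
            value = answers[field.name]
        # Enumerated fields only accept one of the listed options.
        if field.options is not None and value is not None and value not in field.options:
            raise ValueError(f"{field.name}: {value!r} is not one of {field.options}")
        result[field.name] = value
    return result

fields = [
    Field("size", options=["small", "large"], default="small"),
    Field("token", default="abc123", readonly=True),
    Field("first_name", autocomplete="given-name"),
]
submission = fill(fields, {"token": "tampered", "first_name": "Ada"})
```

Here the client never answered `size`, so it gets the default, and its attempt to overwrite the readonly `token` is ignored.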

Navigation

Of all the affordances of HTML forms, I'd like to point out one that stands out to me: navigation. Looking at APIs right now, you have to "read English documentation" to figure out "where to start" (e.g. which field to fill first), even when there is obviously an "intended order" to be followed.

We need something like "tabindexes for programmers".
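
A minimal sketch of the idea, assuming a made-up form description where a hypothetical `order` key plays the role of tabindex. As with positive tabindex values in HTML, fields with an explicit order come first, and the rest follow in document order.

```python
from typing import Any, Dict, List

# Hypothetical form description; "order" plays the role of tabindex.
form = [
    {"name": "card_number", "order": 3},
    {"name": "email", "order": 1},
    {"name": "shipping_address", "order": 2},
    {"name": "notes"},  # no explicit order: visited last, in document order
]

def fill_order(fields: List[Dict[str, Any]]) -> List[str]:
    """Return field names in the order a client is expected to fill them."""
    ordered = sorted(
        (f for f in fields if "order" in f), key=lambda f: f["order"]
    )
    unordered = [f for f in fields if "order" not in f]
    return [f["name"] for f in ordered + unordered]

steps = fill_order(form)
```

A client that reads `steps` knows where to start without anyone having to write a paragraph of documentation about it.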

Dependencies

In the same realm as "navigation", there are occasions when you want to direct your clients to fill things (or not) based on their previous choices.

For example, how often have you said in your "human-readable documentation": "if you set this property, please also add details to this other object"?
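
That sentence can be turned into data. The sketch below assumes a hypothetical rule format (`field`, `equals`, `then_require` are names I made up) and a small checker that reports which dependent fields are still missing.

```python
from typing import Any, Dict, List

# Hypothetical rule format: when `field` equals `equals`,
# every name in `then_require` must also be present.
rules = [
    {"field": "delivery", "equals": "courier", "then_require": ["address"]},
    {"field": "gift", "equals": True, "then_require": ["gift_message"]},
]

def missing_dependencies(
    payload: Dict[str, Any], rules: List[Dict[str, Any]]
) -> List[str]:
    """List the fields that previous choices made mandatory but are absent."""
    missing: List[str] = []
    for rule in rules:
        if payload.get(rule["field"]) == rule["equals"]:
            missing.extend(
                name for name in rule["then_require"] if name not in payload
            )
    return missing
```

Calling `missing_dependencies({"delivery": "courier"}, rules)` reports that `address` is now expected, while a payload with `"delivery": "pickup"` triggers nothing.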


Grouping

This is an area where HTML forms fall short: nested inputs. Today, you have to live with flat key/value pairs. You can accomplish nesting with javascript, but you probably want this to be done declaratively.

Cardinality

You want to give your clients the ability to "add/remove more-of-the-same" for repeated nested structures. You also want them to be able to tell the "minimum and maximum" number of items needed.
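
Grouping and cardinality could be declared together. In this sketch the schema layout and the `min_items` / `max_items` keys are hypothetical; the function only checks how many times a repeated group appears in a payload.

```python
from typing import Any, Dict, List

# Hypothetical schema: a group nests fields, and
# min_items / max_items bound how often it may repeat.
schema = {
    "name": "order",
    "fields": ["customer"],
    "groups": [
        {"name": "items", "fields": ["sku", "quantity"], "min_items": 1, "max_items": 3},
    ],
}

def cardinality_errors(payload: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Check each repeated group against its declared minimum and maximum."""
    errors: List[str] = []
    for group in schema["groups"]:
        count = len(payload.get(group["name"], []))
        if count < group["min_items"]:
            errors.append(
                f"{group['name']}: at least {group['min_items']} required, got {count}"
            )
        elif count > group["max_items"]:
            errors.append(
                f"{group['name']}: at most {group['max_items']} allowed, got {count}"
            )
    return errors
```

With this in place, a client can decide on its own whether an "add another item" action is still allowed.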


Data loading

There are some cases where your data won't fit into a single payload, and you have to load it dynamically. Supporting auto-complete in APIs is a common example of that.
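
One way to picture this, as a sketch only: the field names its option source instead of embedding every option, and the client asks that source for matches as the user types. The `options_source` key, the `sources` registry and `search_cities` (an in-memory stand-in for a remote endpoint) are all assumptions of mine.

```python
from typing import Any, Callable, Dict, List

CITIES = ["Lisbon", "Lima", "Liverpool", "London", "Los Angeles", "Lyon"]

def search_cities(prefix: str, limit: int) -> List[str]:
    """Stand-in for a remote endpoint that returns matching options."""
    matches = [c for c in CITIES if c.lower().startswith(prefix.lower())]
    return matches[:limit]

# The field description points at a source rather than listing all options.
field = {"name": "city", "options_source": "cities", "limit": 2}
sources: Dict[str, Callable[[str, int], List[str]]] = {"cities": search_cities}

def suggest(field: Dict[str, Any], typed: str) -> List[str]:
    """Load only the options that match what has been typed so far."""
    loader = sources[field["options_source"]]
    return loader(typed, field["limit"])
```

In a real API the loader would be a link the client follows, but the shape of the declaration stays the same.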

Validation

I think client-side validation still needs to evolve a lot, both in how it is done and in how expressive it is, including the ability to do remote validation. Some of the things you want to express:


  • whether the field is required or not (e.g. <input required>)
  • min/max values (e.g. <input min=2 max=10>) 
  • max/min length
  • pattern matching (e.g. <input pattern="[1-9]">)
  • identical/different fields (e.g. email-matching)
  • remote validation
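
Most of this list can be expressed as plain data. Below is a minimal sketch with a hypothetical rule format (the key names are mine) and a small validator. Remote validation is left out, since it needs a network call. Like the HTML pattern attribute, the pattern here has to match the whole value.

```python
import re
from typing import Any, Dict, List

# Hypothetical declarative rules, one entry per field.
rules = {
    "email": {"required": True, "pattern": r"[^@\s]+@[^@\s]+"},
    "email_confirm": {"required": True, "same_as": "email"},
    "age": {"min": 2, "max": 10},
    "nickname": {"min_length": 3, "max_length": 8},
}

def validate(payload: Dict[str, Any], rules: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable errors; empty means the payload is valid."""
    errors: List[str] = []
    for name, rule in rules.items():
        value = payload.get(name)
        if value is None:
            if rule.get("required"):
                errors.append(f"{name}: required")
            continue  # nothing else to check on an absent value
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: below minimum")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{name}: above maximum")
        if "min_length" in rule and len(value) < rule["min_length"]:
            errors.append(f"{name}: too short")
        if "max_length" in rule and len(value) > rule["max_length"]:
            errors.append(f"{name}: too long")
        # The whole value must match, as with the HTML pattern attribute.
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"{name}: does not match pattern")
        if "same_as" in rule and value != payload.get(rule["same_as"]):
            errors.append(f"{name}: must match {rule['same_as']}")
    return errors
```

Because the rules are data rather than code, the same description could be shipped to any client, which is the part that unstructured scripts can't give you.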

Having said that ...

I won't make a case that HTML forms are perfect either: plenty of what I mentioned here is done with unstructured javascript (as opposed to declaratively), but that's a subject for a post of its own. Stay tuned!