Literate API Development

There are a lot of fascinating REST APIs out there.

  • TODO: Name a few very weird and interesting APIs.

Combining APIs is a great way to build a product, and maybe even start a business. Want to connect your window blinds to the weather.com API? Sure. Want to do something else with some other thing? Go for it.

Thanks to technologies like OAuth and OpenID Connect, it’s easier than ever to turn that toy into a robust and secure product, and also to share the same client code.

In a previous life, my team and I used to be really into Postman for the initial steps of writing applications. It was good for firing off a few requests and getting a feel for how an API worked. Over time, Postman collections found their way into version control repositories, zip files in shared drives, Slack… kind of everywhere. We were happy customers.

However, like many applications over the last few years, Postman went through a period of excitement over Electron and ballooned into a whopping 800+ MB mountain of JavaScript. It would hang in mysterious ways, for example when selecting an environment, for up to 30 seconds at a time. (This probably had something to do with my team abusing it with a common account and dozens of environments, but still.) It felt like a flow-destroying laptop warmer.

So we went back to cURL. It’s amazingly portable. It’s usually a one-liner, so it’s possible to copy & paste a command and have it work. Postman exports cURL commands directly from the UI, so we stored those in a file and shared the file.

Things start to break in particular ways with cURL, though. Consider these contrived but common scenarios:

  1. Multi-step flows, where the response from one API goes into a second request.
  2. When you are POSTing large payloads to JSON APIs.
  3. When you, erm, have to remember what you’ve done before.

I’ll share my current workflow based around Emacs Org mode and a package called Restclient. I’m assuming you have little or no Emacs knowledge, and have probably never heard of Org mode.

It’s happily replaced both Postman and cURL in the majority of scenarios. I hope to leave you with an idea about how it fits together as a simple, effective system for hacking away at APIs right in a text editor with minimal division between code and prose. You could call it “literate API development”.

cURL: the c is for crying

We mentioned OAuth earlier. To do anything useful with an OAuth API, you need to log on and then send a request:

  1. Authenticate and get a bearer token.
  2. Send that bearer token in a header and get a resource.

This is how it might work in the terminal. First, you send a cURL command with a username and password,

curl -u "someapplication:password" \
    -X GET "http://httpbin.org/basic-auth/someapplication/password" \
    -H "accept: application/json"

which returns a token

{
    "token" : "123"
}

In the terminal, you copy this token and put it in a variable:

TOKEN=123

Then you use the token in each authenticated call:

curl -i \
    -H "Authorization: Bearer $TOKEN" \
    -H "accept: application/json" \
    -XGET https://httpbin.org/bearer

Does it work? Yes. It is still fairly easy to share, since everybody has a shell. You just have to send a few cURL commands. Except they’re run in sequence and there’s that env var in the middle, so maybe you share a shell script. (You should start getting a sinking feeling right about now, wondering how you got to be a professional programmer and yet are exchanging shell scripts.)
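For comparison, here is a minimal sketch of the same two-step flow in Python, using only the standard library. It is an illustration rather than part of the workflow described here: the helper names `basic_auth_request` and `bearer_request` are made up, the requests are only built (not sent), and the token `123` is the same placeholder used above.

```python
import base64
import urllib.request


def basic_auth_request(url, user, password):
    """Build (but do not send) a GET request carrying HTTP Basic credentials."""
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Basic {credentials}",
            "Accept": "application/json",
        },
        method="GET",
    )


def bearer_request(url, token):
    """Build (but do not send) a GET request carrying a bearer token."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


# Step 1: the login request (same URL and credentials as the cURL call above).
login = basic_auth_request(
    "http://httpbin.org/basic-auth/someapplication/password",
    "someapplication",
    "password",
)

# Step 2: the authenticated request. The token is the placeholder from the
# text; a real script would read it from the login response instead.
token = "123"
resource = bearer_request("https://httpbin.org/bearer", token)

print(login.get_method(), login.full_url)
print(resource.get_header("Authorization"))
```

Passing either object to `urllib.request.urlopen` would actually send it. The token still has to be carried by hand from one step to the next, which is the same bookkeeping problem as with the shell variable.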

Oh, and how about this one: the most interesting requests are POST or PUT. You will likely end up with two little steps for every one of these calls.

First, for the request body, you will want to create a small file called request.json and put your POST body in it:

{
    "jql": "project = HSP",
    "startAt": 0,
    "maxResults": 15,
    "fields": [
        "summary",
        "status",
        "assignee"
    ]
}

Second, you will craft a cURL call to send that off as a request body with @:

curl -i -H "Content-Type: application/json" -XPOST http://httpbin.org/post \
    -d "@request.json"

Now you may have three things to keep track of: a token, a small JSON snippet, and a command. Not so convenient now, is it?
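As a point of comparison, a short Python sketch can keep the body and the call in one place, with no separate request.json file. This is only an illustration under assumptions: the helper name `json_post_request` is hypothetical, and the request is built but not sent.

```python
import json
import urllib.request

# The same body that went into request.json above.
payload = {
    "jql": "project = HSP",
    "startAt": 0,
    "maxResults": 15,
    "fields": ["summary", "status", "assignee"],
}


def json_post_request(url, body):
    """Build (but do not send) a POST request with BODY serialized as JSON."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = json_post_request("http://httpbin.org/post", payload)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The payload and the call now live together, but this is a program rather than a document: there is still no place to keep the response next to the request that produced it.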

Both of these can be solved with Postman, which has scripting capabilities and sharing. Pretty soon you’re gonna want to version your code, though; also your code is tied to a proprietary client.

Now it’s obvious we need to go even older than cURL. To Emacs and Emacs Lisp. An elegant weapon from a more civilized time.

In this article, we will see how using Emacs + jq as an embedded REST client lets you

  • Play around with REST APIs
  • Use jq to explore the shape of data online and offline
  • Export client code to use for further exploration.

A couple of modifications allow you to log into APIs which give you a token, by storing this information directly in Emacs Org mode. In this way, your programming interface becomes your scripting. The scripting itself kind of dissolves into the document.

Other alternatives exist. For VS Code, a similar alternative is X. I still use VS Code from time to time as a debugger. For simpler text processing, though, I prefer Emacs.

Org-mode

Next to Donald Knuth, Carsten Dominik should go down in history as one of the greatest yak-shavers of all time. Having started as a note-taking tool, Org mode is now used for agendas and GTD systems, writing Ph.D. theses, and publishing blogs like this one.

Superficially, Org mode is an outline system à la Markdown:

* Heading 1
Blah blah
** Heading 2
Bleh bleh

You have markup for font styles like bold and italic, formatted code snippets, and can export to HTML. So far a lot like Markdown.

The differences start showing up when editing code blocks. In Markdown, syntax for code blocks is fairly basic. Some Markdown additions exist which take it a lot further, such as [a] and [b], but they have more limited use. A source code block looks like this:

{{code}}

It is one of the more viable solutions for doing literate programming, where code exists inside prose, rather than prose being inside the code as comments.

Here’s a short snippet that shows how it works:

#+NAME: hi
echo "hi"

cowsay

We have two source code blocks, and one is getting the output from the other. They are both running shell scripts. When we run them, we see this:

echo "hi"
cowsay
 ____
< hi >
 ----
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

So it’s a little like Markdown, but it’s also like Jupyter. I execute a “cell” by executing a source block with a keystroke, and that outputs a results section which is not meant to be edited by hand, since it’s regenerated every time.

There’s a lot more to Org mode, but I hope this is enough for the curious to get started.

Restclient

The same cURL call that took 2 steps, in restclient [link] for Emacs, looks like this:

POST http://httpbin.org/post
Content-Type: application/json

{
    "jql": "project = HSP",
    "startAt": 0,
    "maxResults": 15,
    "fields": [
        "summary",
        "status",
        "assignee"
    ]
}

That’s the first source code block. REST client uses a shortened description of the call that turns out to be an actual RFC [link]. In this printed version (which was just generated from the Emacs source) you won’t see the header line demarcating the source block. It identifies the block as being of “restclient” type:

...

Following the Org example above, we will introduce a second code block, which reads the output of the first and manipulates it (similar to where we had cowsay in the previous example).

jq '.'
{
  "args": {},
  "data": "{\n    \"jql\": \"project = HSP\",\n    \"startAt\": 0,\n    \"maxResults\": 15,\n    \"fields\": [\n        \"summary\",\n        \"status\",\n        \"assignee\"\n    ]\n}",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip",
    "Content-Length": "149",
    "Content-Type": "application/json",
    "Extension": "Security/Digest Security/SSL",
    "Host": "httpbin.org",
    "Mime-Version": "1.0",
    "X-Amzn-Trace-Id": "Root=1-60d12829-4c670c4707294e1e1cce5439"
  },
  "json": {
    "fields": [
      "summary",
      "status",
      "assignee"
    ],
    "jql": "project = HSP",
    "maxResults": 15,
    "startAt": 0
  },
  "origin": "190.140.132.62",
  "url": "http://httpbin.org/post"
}

This is jq, in this example just printing the output. Now let’s write another jq block, this time extracting the contents of the data entry, which contains the original payload:

jq -r .data
{
    "jql": "project = HSP",
    "startAt": 0,
    "maxResults": 15,
    "fields": [
        "summary",
        "status",
        "assignee"
    ]
}

The payload has been unescaped by jq. Pretty neat. Something that jq doesn’t do? No problem, let’s pipe that through a short Python function.

import json
data = json.loads(jsondoc)
print(json.dumps(data, indent=4))
{
    "jql": "project = HSP",
    "startAt": 0,
    "maxResults": 15,
    "fields": [
        "summary",
        "status",
        "assignee"
    ]
}
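To make the double encoding explicit, here is a small self-contained sketch. It does not call httpbin: `response_text` is a hand-built stand-in for the response above, trimmed to the `data` and `json` fields, and the variable names are illustrative only.

```python
import json

# The request body as it was sent: pretty-printed JSON text.
raw_body = '{\n    "jql": "project = HSP",\n    "startAt": 0,\n    "maxResults": 15\n}'

# Stand-in for the httpbin response: "data" holds the body as a string,
# while "json" holds the same body already decoded.
response_text = json.dumps({"data": raw_body, "json": json.loads(raw_body)})

response = json.loads(response_text)      # first decode: the response itself
payload = json.loads(response["data"])    # second decode: the embedded string

print(type(response["data"]).__name__)    # the "data" field is a plain string
print(payload == response["json"])        # both routes give the same object
print(json.dumps(payload, indent=4))
```

The first `json.loads` plays the role of jq reading the response, and the second corresponds to what happens after `jq -r .data` hands the raw string to the Python block.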

It may not be obvious from reading this how automated it actually is. None of the output blocks have been manually edited; they can all be regenerated from the input, like in a Jupyter notebook. One difference between this format and the IPython format is that there isn’t really a content vs. container format. It’s all in one big text file with some markup that still looks OK when you read it in plain text.

If you were to load it in Org mode, you would see nice syntax highlighting for the code in the blocks, and be able to jump into each snippet in a separate window to edit with indentation, linting, and documentation.

The whole thing is completely tied into Emacs: the way it looks, the way it evaluates, etc. There are some Org mode implementations for other editors like Vim and Visual Studio Code, but they typically handle only a fraction of the API, which is quite extensive. There’s no context switch between the editor and the document, or between the document and the embedded code snippets. If you invest time to get it to work, its rewards are many.

If you are interested in looking at the source of this article to see how the entire thing was laid out, click here. Turns out GitHub understands Org mode just fine.

Caching

Let’s keep going with something that would have made the cURL or Postman workflow really strain. We will make requests against an API that returns a large response. (Large as in lines of text, not as in MB.)

With such a large payload, we want to see if we can capture it once, and then slice and dice it several ways. This is done by enabling caching. The resulting header will be like this:


We’re going to want to use this cache in later steps. One caveat: the results drawer gets in the way of the results parsing. This can be overcome through scripting, but the workaround is just to shift the output by 1 line. (More in notes section, along with the workaround.)

GET https://pokeapi.co/api/v2/pokemon-species/25/
Accept: application/json

The top of the results drawer looks like this:

:results:
#+RESULTS[af90d0f88686fd6432b6b25e0f7764fad2413481]: pokemon-cached
{
  "base_happiness": 70,
  "capture_rate": 190,
  "color": {
    "name": "yellow",
    "url": "https://pokeapi.co/api/v2/pokemon-color/10/"
  },
  ... ~ 4,000 more lines ...

But when I collapse it with Emacs, it’s unintrusive:

[Screenshot: the results drawer collapsed in Emacs]

It’s gone from line 192 to line 4246. That’s a pretty big JSON payload. How big?

wc
    4051   10049  133247

Pretty hard to grok the overall structure and contents.

Now let’s have a look at the keys:

jq 'keys'
[
  "base_happiness",
  "capture_rate",
  "color",
  "egg_groups",
  "evolution_chain",
  "evolves_from_species",
  "flavor_text_entries",
  "form_descriptions",
  "forms_switchable",
  "gender_rate",
  "genera",
  "generation",
  "growth_rate",
  "habitat",
  "has_gender_differences",
  "hatch_counter",
  "id",
  "is_baby",
  "is_legendary",
  "is_mythical",
  "name",
  "names",
  "order",
  "pal_park_encounters",
  "pokedex_numbers",
  "shape",
  "varieties"
]

This API has some scalars like name, and some lists like names:

jq '.name'
"pikachu"
jq '.names | length'
11
jq '.names[0]'
{
  "language": {
    "name": "ja-Hrkt",
    "url": "https://pokeapi.co/api/v2/language/1/"
  },
  "name": "ピカチュウ"
}
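The scalar-versus-list distinction can also be summarized mechanically. The sketch below is a Python illustration, not part of the Org document: `json_type` is a hypothetical helper that names each value using jq’s type names, and `sample` is a hand-trimmed excerpt of the response that uses only values shown above.

```python
import json


def json_type(value):
    """Name the JSON type of a decoded value, using jq's type names."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


# Trimmed excerpt of the species response shown above.
sample = json.loads("""
{
  "base_happiness": 70,
  "capture_rate": 190,
  "color": {"name": "yellow", "url": "https://pokeapi.co/api/v2/pokemon-color/10/"},
  "name": "pikachu",
  "names": [
    {"language": {"name": "ja-Hrkt", "url": "https://pokeapi.co/api/v2/language/1/"},
     "name": "ピカチュウ"}
  ]
}
""")

# Map every top-level key to the type of its value, in sorted key order.
shape = {key: json_type(value) for key, value in sorted(sample.items())}
for key, kind in shape.items():
    print(f"{key}: {kind}")
```

A one-screen summary like this is a reasonable first look at a payload that runs to thousands of lines.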

Now let’s try to output all 11 translations as a table.

jq -r '.names[] | [.name, (.language | .name, .url)] | @csv'
| ピカチュウ | ja-Hrkt | https://pokeapi.co/api/v2/language/1/  |
| Pikachu    | roomaji | https://pokeapi.co/api/v2/language/2/  |
| 피카츄     | ko      | https://pokeapi.co/api/v2/language/3/  |
| 皮卡丘     | zh-Hant | https://pokeapi.co/api/v2/language/4/  |
| Pikachu    | fr      | https://pokeapi.co/api/v2/language/5/  |
| Pikachu    | de      | https://pokeapi.co/api/v2/language/6/  |
| Pikachu    | es      | https://pokeapi.co/api/v2/language/7/  |
| Pikachu    | it      | https://pokeapi.co/api/v2/language/8/  |
| Pikachu    | en      | https://pokeapi.co/api/v2/language/9/  |
| ピカチュウ | ja      | https://pokeapi.co/api/v2/language/11/ |
| 皮卡丘     | zh-Hans | https://pokeapi.co/api/v2/language/12/ |
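If this filter were later exported to client code, the same projection might look like the following Python sketch. It is an assumption-laden illustration: `names_to_csv` is a made-up helper, and `names` is a two-entry excerpt of the list above rather than the full cached response.

```python
import csv
import io

# Two entries copied from the table above, in the shape the API returns them.
names = [
    {"language": {"name": "ja-Hrkt", "url": "https://pokeapi.co/api/v2/language/1/"},
     "name": "ピカチュウ"},
    {"language": {"name": "en", "url": "https://pokeapi.co/api/v2/language/9/"},
     "name": "Pikachu"},
]


def names_to_csv(entries):
    """Emit one CSV row per entry: name, language name, language URL."""
    buffer = io.StringIO()
    # QUOTE_ALL puts every field in double quotes, as jq's @csv does for strings.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for entry in entries:
        language = entry["language"]
        writer.writerow([entry["name"], language["name"], language["url"]])
    return buffer.getvalue()


print(names_to_csv(names), end="")
```

The jq version stays a one-liner, which is a large part of why it is pleasant for exploration; the Python version is what you might keep once the shape of the data is settled.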

We can also put in headings. Again we’re using the cached copy.

jq -r '["Name", "Language name", "Language URL"], (.names[] | [.name, (.language | .name, .url)]) | @csv'

This table has been created directly from the API response. I never had to leave the editor, copy & paste data or anything. Similarly, we can have a look at the other two collections in the API, for Pokemon varieties and Pokedex numbers (whatever those are).

jq -r '.varieties[] | [ .pokemon | .name, .url ] | @csv'
| pikachu              | https://pokeapi.co/api/v2/pokemon/25/    |
| pikachu-rock-star    | https://pokeapi.co/api/v2/pokemon/10080/ |
| pikachu-belle        | https://pokeapi.co/api/v2/pokemon/10081/ |
| pikachu-pop-star     | https://pokeapi.co/api/v2/pokemon/10082/ |
| pikachu-phd          | https://pokeapi.co/api/v2/pokemon/10083/ |
| pikachu-libre        | https://pokeapi.co/api/v2/pokemon/10084/ |
| pikachu-cosplay      | https://pokeapi.co/api/v2/pokemon/10085/ |
| pikachu-original-cap | https://pokeapi.co/api/v2/pokemon/10094/ |
| pikachu-hoenn-cap    | https://pokeapi.co/api/v2/pokemon/10095/ |
| pikachu-sinnoh-cap   | https://pokeapi.co/api/v2/pokemon/10096/ |
| pikachu-unova-cap    | https://pokeapi.co/api/v2/pokemon/10097/ |
| pikachu-kalos-cap    | https://pokeapi.co/api/v2/pokemon/10098/ |
| pikachu-alola-cap    | https://pokeapi.co/api/v2/pokemon/10099/ |
| pikachu-partner-cap  | https://pokeapi.co/api/v2/pokemon/10148/ |
| pikachu-gmax         | https://pokeapi.co/api/v2/pokemon/10190/ |
jq -r '.pokedex_numbers[] | [.entry_number, (.pokedex | .name, .url)] | @csv'

Org mode hackery

This is all pretty much out-of-the-box Emacs. Things get a bit more interesting when we teach Emacs a few new tricks. Since Babel has support for Emacs Lisp as a language, we can define functions in the document, which when evaluated will add to or modify the document. Let’s see for example a code block which prints the current heading.

(org-entry-get nil "ITEM")
Org mode hackery

(As a disclaimer, my elisp-fu is pretty weak, so for sure all of these things can be done in a much better way.)

Multi-step forms


Let’s use this technique to script an API call with two steps. Here’s what we’ll do:

First, we’ll write a cURL command which does basic authentication against an API and returns a token. Then we’ll use elisp to store that token in a hidden field in the document. Finally, in a second call, we will use the stored token and send it using Restclient to another API.

This is the elisp for the save-login function, which saves the login token to a header argument:

(org-set-property "header-args+" (concat ":var token=" (json-encode (chomp token))))

We will explain how to get the token variable in a bit. The “chomp” function is from the Emacs Lisp cookbook. Named after Perl’s chomp, which strips a trailing newline, this version removes leading and trailing whitespace. More in the notes section.

The “save-login” function gets a token and saves it in a “header argument” with this format:

:header-args+: :var token="3af41..."

If you’re not into Emacs (yet), the details won’t make much sense. Let’s just say that this header argument is what will let you recall the token later on.

Now we need the actual API calls to do the login:

curl -u "someapplication:password" \
    -X GET "http://httpbin.org/basic-auth/someapplication/password" \
    -H "accept: application/json"
{
  "authenticated": true,
  "user": "someapplication"
}

For now, let’s assume we got a token back from the API. We call

jq '.user' | base64
#+RESULTS: token
: InNvbWVhcHBsaWNhdGlvbiIK
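It helps to see exactly what this stand-in token encodes. The Python sketch below reproduces the pipeline by hand; it assumes only that jq prints the string value as JSON text (quotes included) followed by a newline, which is its default behaviour without `-r`.

```python
import base64
import json

user = "someapplication"

# jq '.user' prints the value as JSON text, quotes included, plus a newline.
jq_output = json.dumps(user) + "\n"

# base64 then encodes those bytes, quotes and newline included.
token = base64.b64encode(jq_output.encode("utf-8")).decode("ascii")
print(token)  # InNvbWVhcHBsaWNhdGlvbiIK

# Decoding the token gives back the quoted user name.
print(base64.b64decode(token).decode("utf-8"), end="")
```

So the “token” is just the quoted user name in disguise, which is fine for a demonstration against httpbin but is not how a real API would issue one.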

Now we call

#+call: save-login :var token=token

The header argument of this article (notebook?) now contains a property.

** Multi-step forms
:PROPERTIES:
:header-args+: :var token="InNvbWVhcHBsaWNhdGlvbiIK"
:END:

What does that give us? Well, anything else in this part of the document will be able to use the token by just referring to it. Let’s pretend we’re giving it to another API call via REST Client, by placing it in an HTTP header.

GET https://httpbin.org/bearer
accept: application/json
Authorization: Bearer :token
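Conceptually, the stored header argument and the `:token` placeholder meet in two steps: read the variable out of the property, then substitute it into the request. The Python sketch below is a toy model of that idea, not how Org Babel or Restclient actually implement it; `parse_var` and `substitute_variables` are hypothetical names.

```python
import json
import re


def parse_var(header_args):
    """Extract (name, value) from a ':var NAME="value"' header argument."""
    match = re.search(r':var\s+(\w+)=("(?:[^"\\]|\\.)*")', header_args)
    if match is None:
        raise ValueError("no :var assignment found")
    # The value was written with json-encode, so decode it as JSON.
    return match.group(1), json.loads(match.group(2))


def substitute_variables(template, variables):
    """Replace each :name placeholder that has a known value; keep the rest."""
    def replace(match):
        return variables.get(match.group(1), match.group(0))
    return re.sub(r":([A-Za-z_][A-Za-z0-9_-]*)", replace, template)


name, value = parse_var(':var token="InNvbWVhcHBsaWNhdGlvbiIK"')

request_template = (
    "GET https://httpbin.org/bearer\n"
    "accept: application/json\n"
    "Authorization: Bearer :token"
)
print(substitute_variables(request_template, {name: value}))
```

The placeholder pattern requires a letter or underscore right after the colon, so the `://` in the URL and the `: ` after each header name are left alone.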

#+name: curl
(org-with-wide-buffer
 (org-babel-goto-named-src-block "bearer-token")
 (restclient-copy-curl-command))
(org-babel-goto-named-result "curl")
(org-insert-structure-template "src")
(insert "sh :results output :var token=token\n")
(insert (car kill-ring-yank-pointer))

It inserted this, which I modified only slightly, replacing token with $token, and then ran:

It’s easy to get carried away, hacking APIs, responses, and the editor all at the same time and in the same document. Starting in this document, the Lisp and other code can take their separate paths: Lisp to go into a function bound to a key, so that I can use it in any document I like; the Restclient and jq code could be exported to Python or another programming language, so that I could put them in production.

Would I recommend it in terms of strict productivity? Perhaps not; it takes a little while to get started. Is it hella fun and kind of addictive? Yes.

Next steps

There are several ways this could be improved:

  • We could have elisp understand the jq syntax, so that table headings are automatically inserted by reading the filter.
  • We could ensure all steps are idempotent, including cleaning up cached data. They mostly are right now, but there are a few exceptions.
  • We could add an export function to other languages like Python or Go. This export function could even take into account the jq filters in subsequent blocks, so that we have both the calling and filtering code available.

Conclusions

There are a few interesting qualities about this approach for me. The first is homoiconicity. This is the often-touted property of Lisp languages, and is supposed to refer to the symmetry between the program’s code and the data that it operates on: in Lisps, the code is a bunch of lists, and the data structures are also a bunch of lists. This is supposed to unleash some real economies of scale when it comes to code re-use. Or something like that.

From the point of view of these exercises, you might say there’s a certain more pedestrian form of homoiconicity at play. The document itself is the source, the intermediate data store, and the target. The document contains the code that can further modify the document. Maybe not so pedestrian after all.

Given the pretty excellent support for domain-specific languages like Restclient, and also really good integration for subshells, it’s possible to create arbitrarily complex workflows right in one document that do real work in the outside world (gasp - outside the document!) all in the flow of text. This is why this is super-squarely in the world of literate programming. Donald Knuth would be proud.

Lisp is supposed to be one of those programming languages that you learn in order to become better at other, real, productive programming languages. In Emacs, it is just a very practical and very well documented construct that can be chained up however and wherever. It doesn’t produce the fastest code, but it’s endlessly flexible. It was not evident how much this was the case until working on these exercises.

I hope this article gave you a taster of what is possible if you choose to brave the world of many parens and build that “fits-like-a-glove” environment.

References

  • restclient: https://github.com/pashky/restclient.el
  • org-babel: https://orgmode.org/worg/org-contrib/babel/
  • elisp-cookbook: https://www.emacswiki.org/emacs/ElispCookbook
  • auth0 examples: https://auth0.com/docs/connections/generic-oauth2-connection-examples
  • ceo-emacs: https://www.fugue.co/blog/2015-11-11-guide-to-emacs.html

Notes

Chomp function

(defun chomp (str)
    "Chomp leading and trailing whitespace from STR."
    (while (string-match "\\`\n+\\|^\\s-+\\|\\s-+$\\|\n+\\'"
                         str)
      (setq str (replace-match "" t t str)))
    str)