Skip to content

Latest commit

 

History

History
161 lines (94 loc) · 4.76 KB

notes.adoc

File metadata and controls

161 lines (94 loc) · 4.76 KB

Dexy and Docker for Reproducible Research

Hello I’m @ananelson

Docker allows you to package an application with all of its dependencies into a standardized unit for software development.
— docker.com/whatisdocker
Dexy is a multi-purpose project automation tool with lots of features designed for working with documents.
— dexy.it/docs/what-is-dexy.html

Why Automate?

  • Make sure your docs/research are correct.

  • Automation saves you time (after initial investment).

  • Detect inefficiencies and annoyances in your install process and workflow.

  • Avoid implicit knowledge, easier to collaborate and disseminate.

Project Model

Add a Dockerfile and a dexy.yaml to the root of your project repository. Maybe a run-docker.sh for convenience.

Docker

{{ d['Dockerfile|idio|asciisyn']['from'] }}

The ubuntu image is minimal, so we have to install some basics and do some configuration that is usually taken care of for us.

Create a locale:

{{ d['Dockerfile|idio|asciisyn']['localedef'] }}

Set up defaults for unattended apt installs (so you don’t have to type -y all the time):

{{ d['Dockerfile|idio|asciisyn']['apt-defaults'] }}

Install some system utilities and convenient developer tools:

{{ d['Dockerfile|idio|asciisyn']['utils'] }}

Install python and pip:

{{ d['Dockerfile|idio|asciisyn']['python'] }}

Install asciidoctor and its dependencies (including Ruby):

{{ d['Dockerfile|idio|asciisyn']['asciidoctor'] }}

Install dexy:

{{ d['Dockerfile|idio|asciisyn']['dexy'] }}

Install the software we need for our research:

{{ d['Dockerfile|idio|asciisyn']['research'] }}

Finally, create and activate a user so we can work interactively within our image:

{{ d['Dockerfile|idio|asciisyn']['create-user'] }}

{{ d['Dockerfile|idio|asciisyn']['activate-user'] }}

Here’s how we use this Dockerfile to build an image:

{{ d['run-docker.sh|idio|asciisyn']['build'] }}

And here is how the image is run interactively as a container:

{{ d['run-docker.sh|idio|asciisyn']['interactive'] }}

This example is set up for interactive use, can also be set to generate docs automatically when you’ve finished developing.

{{ d['run-docker.sh|idio|asciisyn']['hands-off'] }} {{ d['Dockerfile|idio|asciisyn']['run'] }}

Simpy Example

{{ d['renege.py|idio|asciisyn'] }}

{{ d['renege.py|py'] | indent }}

Change random seed in renege.py and rerun dexy…​

Dexified Example

We were able to run the example as-is using Dexy, but by making some changes we can make this example more flexible. A few ideas:

  • Put data in separate input file.

  • Save results to an output file.

  • Write prose in documents, not code files.

Moviegoer

A moviegoer tries to by a number of tickets (num_tickets) for a certain movie in a theater.

If the movie becomes sold out, she leaves the theater. If she gets to the counter, she tries to buy a number of tickets. If not enough tickets are left, she argues with the teller and leaves.

If at most one ticket is left after the moviegoer bought her tickets, the sold out event for this movie is triggered causing all remaining moviegoers to leave.

{{ d['renege-dexy.py|idio|asciisyn']['moviegoer'] }}

Customer Arrivals

Create new moviegoers until the sim time reaches 120

{{ d['renege-dexy.py|idio|asciisyn']['customer-arrivals'] }}

Simulation

{{ d['renege-dexy.py|idio|asciisyn']['run'] }}

Results

This simulation was run with random seed {{ d['settings.yaml'].from_yaml()['random-seed'] }}.

{% for result in d['results.json'].from_json() %} ==== {{ result['name'] }}

{% if result['is-sold-out'] %} '{{ result['name'] }}' sold out in {{ result['sold-out-in'] }} minutes, and {{ result['queue-leavers'] }} left the queue. {% endif %}

{% endfor %}

Why Separate Data from Code?

  • decoupling == good

  • dexy manages data files for you

  • easier to access data from document templates

  • document templates are more flexible than print statements

  • present data in different ways and in different documents

  • easier to put data through additional pipelines (even using different languages), e.g. visualization

{{ d['plot-results.py|idio|asciisyn']['load-results'] }}

{{ d['plot-results.py|py'] | indent }}
sold out in

Source

{{ d['dexy.yaml|asciisyn'] }}

{{ d['notes.adoc|asciisyn'] }}