I wanted to show a quick glimpse of the programming dream (or my own,
at least) - immediate feedback while programming, validated against a
live server. Power, simplicity, reliability, and convenience - I want it all!
Here's an example video with narration on what it's like working with
ReasonML + GraphQL in emacs:
A few benefits of this approach:
Full-stack safety: The server presents its entire known schema, so
my Reason app won't even compile if it's trying to access a
non-existent field (or trying to use it in a type-incorrect way,
e.g. mistaking a string for an integer)
Long-term safety: Because fields are never removed from a GraphQL
schema (only deprecated), I never have to worry about shipping a client
that might be broken by future server changes. This goes a long way
towards evergreen clients.
No forgotten edge cases - this one kills me continually outside
of Reason. I forget to check if the response is still loading, or
if it errored, or I try to access data on the wrong field. I can
easily add a catch-all to throw an error and ignore all the edge
cases if I'm prototyping, but once I have my happy path, I want to
make sure things are battened down tightly (see the sketch after this list).
In-editor completion: When accessing fields on the response, the
editor can suggest the fields that actually exist (and their types).
Editor guidance: Along the same lines, with Reason the data
structures guide me to handling each case and field access
gracefully as I explore the response structure. As soon as I hit
save, I'll know if I have a typo, or if I accessed a nullable field
without checking, or if I used the type incorrectly.
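To make the edge-case point concrete, here's a rough sketch of the kind of exhaustive match the compiler pushes you toward (written in plain OCaml syntax with made-up constructor names, not the actual generated client types):

(* Hypothetical result type; the generated GraphQL client code
   exposes something analogous. *)
type 'a query_state =
  | Loading
  | Error of string
  | Data of 'a

let render_greeting = function
  | Loading -> "Loading..."
  | Error message -> "Something went wrong: " ^ message
  | Data name -> "Hello, " ^ name
  (* Dropping any of these cases is a compile-time warning/error,
     so the loading and error paths can't be silently forgotten. *)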
Some drawbacks:
The only drawback I can think of is that I can't quite see a way to get
auto-completion while writing the GraphQL query in the PPX. I'd ideally like
to have a GraphiQL-like experience with the fields auto-completing,
and being able to read docs/types inline. Currently I tend to write
the bulk of my queries in our fork of GraphiQL, then paste in the
result. It's minor, but it would be really nice if there were a way to do
this (I know there's a way in Atom, for example, but emacs may not make
this easy).
Closing notes:
This example is in emacs, but the experience should be the same
(or better!) in vim, Atom, and especially vscode, thanks to the great
Reason editor integrations there.
I've switched my personal site (riseos.com)
over to Docusaurus from a Mirage unikernel
for a few reasons. First, I was putting off writing blog posts because
of the amount of yak shaving I was doing. Second, the dependency
situation never really got to the point where I felt it was worth the
effort. And third, some projects I've been working on have pushed me
to get a lot more familiar with frontend topics, especially
static-sites that are rendered with React.
I've deployed a few sites with Gatsby,
and was looking for something significantly simpler and more
reliable. At the recommendation of the
ReasonML team, I gave docusaurus a shot
on another site, and it worked out nicely. I appreciate that it's
limited enough to encourage you not to yak-shave too much (which is
good from time to time, but not for my personal site at this time).
Anyway, I certainly recommend giving
Docusaurus +
Netlify a shot; it worked like a charm for
me.
Trying to blow the buzzword meter with that title...
Note of Caution!
This never made it quite 100% of the way; it was blocked largely on
account of me not being able to get the correct versions of the
dependencies to install in CI. Bits and pieces of this may still be
useful for others, though, so I'm putting this up in case it helps
someone out.
Also, I really like the PIC bug - it tickles me how far down the stack
that one ended up being. It may be the closest I ever come to being
vaguely involved (as in having stumbled across, not having
diagnosed/fixed) in something as interesting as
Dave Baggett's
hardest bug ever.
Feel free to ping me on the
OCaml discourse, though I'll likely just
point you at the more experienced and talented people who helped me
put this all together (in particular
Martin Lucina, an absurdly intelligent and
capable OG hacker and a driving force behind
Solo5).
Topics
What are unikernels?
What's MirageOS?
Public hosting for unikernels
AWS
GCE
DeferPanic
Why GCE?
Problems
Xen -> KVM (testing kernel output via QEMU)
Bootable disk image
Virtio problems
DHCP lease
TCP/IP stack
Crashes
Deployment
Compiling an artifact
Initial deploy script
Zero-downtime instance updates
Scaling based on CPU usage (how cool are the GCE suggestions to downsize an under-used image?)
Continuously Deploying Mirage Unikernels to Google Compute Engine using CircleCI
Or "Launch your unikernel-as-a-site with a zero-downtime rolling
updates, health-check monitors that'll restart an instance if it
crashes every 30 seconds, and a load balancer that'll auto-scale based
on CPU usage with every git push"
This post talks about achieving a production-like deploy pipeline for
a publicly-available service built using Mirage, specifically using
the fairly amazing Google Compute Engine infrastructure. I'll talk a
bit about the progression to the current setup, and some future
platforms that might be usable soon.
What are unikernels?
Unikernels are specialised, single-address-space machine images
constructed by using library operating systems.
Easy! ...right?
The short, high-level idea is that unikernels are the equivalent of
opt-in operating systems, rather than
opt-out-if-you-can-possibly-figure-out-how.
For example, when we build a virtual machine using a unikernel, we
only include the code necessary for our specific application. Don't
use a block-storage device for your Heroku-like application? The code
to interact with block-devices won't be run at all in your app - in
fact, it won't even be included in the final virtual machine image.
And when your app is running, it's the only thing running. No other
processes vying for resources, threatening to push your server over in
the middle of the night even though you didn't know a service was
configured to run by default.
There are a few immediately obvious advantages to this approach:
Size: Unikernels are typically microscopic as deployable
artifacts
Efficiency: When running, unikernels only use the bare minimum
of what your code needs. Nothing else.
Security: Removing millions of lines of code and eliminating
the inter-process protection model from your app drastically
reduces attack surface
Simplicity: Knowing exactly what's in your application, and how
it's all running considerably simplifies the mental model for both
performance and correctness
MirageOS is a library operating system that constructs unikernels
for secure, high-performance network applications across a variety
of cloud computing and mobile platforms
Mirage (which is a very clever name once you get it) is a library to
build clean-slate unikernels using OCaml. That means to build a Mirage
unikernel, you need to write your entire app (more or less) in
OCaml. I've talked quite a bit now about why
OCaml is pretty solid,
but I understand if some of you run away screaming now. No worries,
there are other approaches to unikernels that may work better for
you[2]. But as for me and my house, we will use Mirage.
There are some great talks that go over some of the cool aspects of
Mirage in much more detail [3][4], but it's less clear whether
unikernels are actually usable in any major way yet. There are even
companies that take out ads against unikernels, highlighting many of
the ways in which they're (currently) unsuitable for production:
Bit weird, that.
But I suspect that bit by bit this will change, assuming sufficient
elbow grease and determination on our parts. So with that said, let's
roll up our sleeves and figure out one of the biggest hurdles to using
unikernels in production today: deploying them!
Public hosting for unikernels
Having written our app as a unikernel, how do we get it up and running
in a production-like setting? I've used AWS fairly heavily in the
past, so it was my initial go-to for this site.
AWS runs on the Xen hypervisor, which is the main non-unix target
Mirage was developed for. In theory, it should be the smoothest
option. Sadly, the primitives and API that AWS exposes just don't match
this use case well. The process is something like
this:
Download the AWS command line tools
Start an instance
Create, attach, and partition an EBS volume (we'll turn this into
an AMI once we get our unikernel on it)
Copy the Xen unikernel over to the volume
Create the GRUB entries... blablabla
Create a snapshot of the volume ohmygod
Register your AMI using the pv-grub kernel id what was I doing again
Start a new instance from the AMI
Unfortunately, step #3 means that we need a build machine that's
on the AWS network so that we can attach the volume, and we need to
SSH into that machine to do the heavy lifting. Also, we end up with a
lot of leftover detritus - the volume, the snapshot, and the AMI. It
could all be scripted at some point, though.
GCE to the rescue!
GCE is Google's public computing
offering, and I currently can't recommend it highly enough. The
per-minute pricing model is a much better match for instances that
boot in less than 100ms, the interface is considerably nicer and
offers the equivalent REST API call for most actions you take, and the
primitives exposed in the API mean we can much more easily deploy a
unikernel. Win, win, win!
GCE Challenges
Xen -> KVM
There is a big potential show-stopper though: GCE uses the KVM
hypervisor instead of Xen, which is much, much nicer, but not
supported by Mirage as of the beginning of this year. Luckily, some
fairly crazy heroes (Dan Williams,
Ricardo Koller, and
Martin Lucina, specifically) stepped up and made it
happen with Solo5!
Solo5 Unikernel implements a unikernel base, or the lowest layer of
code inside a unikernel, which interacts with the hardware
abstraction exposed by the hypervisor and forms a platform for
building language runtimes and applications. Solo5 currently
interfaces with the MirageOS ecosystem, enabling Mirage unikernels
to run on either Linux KVM/QEMU
I highly recommend checking out a replay of the great webinar the
authors gave on the topic:
https://developer.ibm.com/open/solo5-unikernel/ - it'll give you a sense
of how much room for optimization and cleanup there is as our hosting
infrastructure evolves.
Now that we have KVM kernels, we can test them locally fairly easily
using QEMU, which shortened the iterations while we dealt with teething
problems on the new platform.
Bootable disk image
This was just on the other side of my experience/abilities,
personally. Constructing a disk image that would boot a custom
(non-Linux) kernel isn't something I've done before, and I struggled
to remember how the pieces fit together. Once again, @mato came to the
rescue with a
lovely little script
that does exactly what we need, no muss, no fuss.
Virtio driver
Initially we had booting unikernels that printed to the serial console
just fine, but didn't seem to get any DHCP lease. The unikernel was
sending
DHCP discover broadcasts,
but not getting anything in return, poor lil' fella. I then tried with
a hard-coded IP literally configured at compile time, and booted an
instance on GCE with a matching IP, and still nothing. Nearly the
entire Mirage stack is in plain OCaml though, including the
TCP/IP stack, so I was able
to add in plenty of debug log statements and see
what was happening. I finally
tracked everything down to problems with the Virtio implementation,
quoting @ricarkol:
The issue was that the vring sizes were hardcoded (not the buffer
length as I mentioned above). The issue with the vring sizes is kind
of interesting, the thing is that the virtio spec allows for
different sizes, but every single qemu we tried uses the same 256
len. The QEMU in GCE must be patched as it uses 4096 as the size,
which is pretty big, I guess they do that for performance reasons. -
@ricarkol
I tried out the fixes, and we had a booting, publicly accessible
unikernel! However, it was extremely slow, with no obvious reason
why. Looking at the logs however, I saw that I had forgotten to remove
a ton of
logging per-frame. Careful
what you wish for with accessibility, I guess!
Position-independent Code
This was a deep rabbit hole. The
bug manifested as Fatal error: exception (Invalid_argument "equal: abstract value"), which
seemed strange since the site worked on Unix and Xen backends, so
there shouldn't have been anything logically wrong with the OCaml
types, despite what the exception message hinted at. Read
this comment
for the full, thrilling detective work and explanation, but a
simplified version seems to be that portions of the OCaml/Solo5 code
were placed in between the bootloader and the entry point of the
program, and the bootloader zero'd all the memory in-between (as it
should) before handing control over to our program. So eventually our
program did some comparison of values, and a portion of the value had
at compile/link time been relocated and destroyed, and OCaml threw the
above error.
Crashes
Finally, we have a booting, non-slow, publicly-accessible Mirage
instance running on GCE! Great! However, every ~50 http requests, it
panics and dies:
Oh no! However, being a bit of a kludgy hacker desperate to get a
stable unikernel I could show to some friends, I figured out a terrible
workaround: GCE offers fantastic health-check monitors that'll restart
an instance if it crashes because of a virtio (or whatever) failure
every 30 seconds. Problem solved, right? At least I don't have to
restart the instance personally...
And that was an acceptable temporary fix until @ricarkol was once
again able to track down and fix the cause of the crashes, which
turned out to be a GCE/Virtio IO buffer-descriptor wrinkle:
The second issue is that Virtio allows for dividing IO requests in
multiple buffer descriptors. For some reason the QEMU in GCE didn't
like that. While cleaning up stuff I simplified our Virtio layer to
send a single buffer descriptor, and GCE liked it and let our IOs go
through - @ricarkol
So now Solo5 unikernels seem fairly stable on GCE as well! Looks like
it's time to wrap everything up into a nice deploy pipeline.
Deployment
With the help of the GCE support staff and the Solo5 authors, we're
now able to run Mirage apps on GCE. The process in this case looks
like this:
Compile our unikernel
Create a tar'd and gzipped bootable disk image locally with our unikernel
Upload said disk image (should be ~1-10MB, depending on our contents. Right now this site is ~6.6MB)
Create an image from the disk image
Trigger a rolling update
Importantly, because we can simply upload bootable disk images, we
don't need any specialized build machine, and the entire process can
be automated!
One time setup
We'll create two abstract pieces that'll let us continually deploy and
scale: an instance group and a load balancer.
Creating the template and instance group
First, two quick definitions...
Managed instance groups:
A managed instance group uses an instance template to create
identical instances. You control a managed instance group as a
single entity. If you wanted to make changes to instances that are
part of a managed instance group, you would apply the change to the
whole instance group.
And templates:
Instance templates define the machine type, image, zone, and other
instance properties for the instances in a managed instance group.
We'll create a template with
FINISH THIS SECTION(FIN)
Setting up the load balancer
Honestly there's not much to say here, GCE makes this trivial. We
simply say what class of instances we want (vCPU, RAM, etc.), what the
trigger/threshold to scale is (CPU usage or request amount), and the
image we want to boot as we scale out.
In this case, I'm using a fairly small instance with the instance
group we just created, and I want another instance whenever we
sustain CPU usage over 60% for more than 30 seconds:
`PUT THE BASH CODE TO CREATE THAT HERE`(FIN)
Subsequent deploys
The actual CLI invocation to do everything looks like this:
mirage configure -t virtio --dhcp=true \
--show_errors=true --report_errors=true \
--mailgun_api_key="<>" \
--error_report_emails=sean@bushi.do
make clean
make
bin/unikernel-mkimage.sh tmp/disk.raw mir-riseos.virtio
cd tmp/
tar -czvf mir-riseos-01.tar.gz disk.raw
cd ..
# Upload the file to Google Cloud Storage
# as the original filename
gsutil cp tmp/mir-riseos-01.tar.gz gs://mir-riseos
# Copy/Alias it as *-latest
gsutil cp gs://mir-riseos/mir-riseos-01.tar.gz \
gs://mir-riseos/mir-riseos-latest.tar.gz
# Delete the image if it exists
yes | gcloud compute images delete mir-riseos-latest
# Create an image from the new latest file
gcloud compute images create mir-riseos-latest \
--source-uri gs://mir-riseos/mir-riseos-latest.tar.gz
# Updating the mir-riseos-latest *image* in place will mutate the
# *instance-template* that points to it. To then update all of
# our instances with zero downtime, we now just have to ask gcloud
# to do a rolling update to a group using said
# *instance-template*.
gcloud alpha compute rolling-updates start \
--group mir-riseos-group \
--template mir-riseos-1 \
--zone us-west1-a
Not too shabby to - once again - launch your unikernel-as-a-site
with zero-downtime rolling updates, health-check monitors that'll
restart any crashed instance every 30 seconds, and a load balancer
that auto-scales based on CPU usage. The next step is to hook up
CircleCI so we have continuous deploy of our
unikernels on every push to master.
CircleCI
The biggest blocker here, and one I haven't been able to solve yet, is
the OPAM switch setup. My current docker image has (apparently) a
hand-selected list of packages and pins that is nearly impossible to
duplicate elsewhere.
I'm not sure if it's because the powers-that-be in the OCaml world are simply uninterested in the domain, or if it's looked down upon as "not-real development" by established/current OCaml devs, but it's a pretty dire situation. There's some movement in the right direction between Opium and Ocaml WebMachine, but both are 1.) extremely raw and 2.) pretty much completely incompatible. There's no middleware standard (Rack, Connect, or the one I'm most familiar with, Ring), so it's not easy to layer in orthogonal-but-important pieces like session-management, authentication, authorization, logging, and - relevant for today's post - error reporting.
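For readers who haven't used Rack, Connect, or Ring, the core idea is just "a function from handler to handler". A rough sketch of those types in OCaml, using the Cohttp types already in play on this site (the with_logging example is hypothetical, not from any existing library):

(* A handler maps a request to a (deferred) response; a middleware
   wraps one handler to produce another. *)
type handler = Cohttp.Request.t -> (Cohttp.Response.t * Cohttp_lwt_body.t) Lwt.t
type middleware = handler -> handler

(* Hypothetical example: log the request URI, then delegate. *)
let with_logging : middleware = fun handler req ->
  Printf.printf "Handling %s\n%!" (Uri.to_string (Cohttp.Request.uri req));
  handler req

With a shared shape like this, pieces such as session management or error reporting could be layered onto any compliant handler; today each OCaml web library does its own thing.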
I've worked over the past few years on ever-increasingly useful error reporting, in part because it was so terrible before, especially compared to error reports from the server-side. A few years ago, you probably wouldn't even know if your users had an error. If you worked hard, you'd get a rollbar notification that "main.js:0:0: undefined is not a function". How do you repro this case? What did the user do? What path through a (for a human) virtually unbounded state-space led to this error? Well friend, get ready to play computer in your head, because you're on your own. I wanted to make it better, and so I worked on it in various ways, including improved source-map support in the language I was using at the time (ClojureScript), user session replay in development, predictive testing, automated repro cases, etc., until it was so nice that getting server-side errors was a terrible drag because they didn't have any of the pleasantries that I had come to be used to on the frontend.
Fast forward to this week in OCaml, when I was poking around my site and hit a "Not found" error. The url was correct; I had just previously made a top-level error handler in my Mirage code return "Not found" on any error, because I was very new to OCaml in general and that seemed to work to the extent I needed that day. But today I wanted to know what was going on - why did this happen? Googling a bit for "reporting OCaml errors in production" brought back that familiar frustration of working in an environment where devs just don't care (let's assume they're capable). Not much for the web, to say the least.
So I figured I would cobble together a quick solution. I didn't want to pull in an SMTP library (1. the namespacing in OCaml is fucking crazy, and 2. discovering only when compiling for a non-Unix backend - after developing a full feature - that some OPAM packages don't work with Mirage has led me to be very cautious about any dependency) - but no worries, the ever-excellent Mailgun offers a great service to send emails via HTTP POSTs. Sadly, Cohttp can't handle multipart (e.g. form) posts (another sign of the weakness of OCaml's infrastructure compared to the excellent clj-http), so I had to do that on my own. I ended up copying the curl examples from Mailgun's docs, but directing the url to an http requestbin, so I could see exactly what the post looked like. Then it was just a matter of building up the equivalent request with Cohttp in utop, bit by bit, until I was able to match the exact data sent over by the curl example. From there, the last bit was to generate a random boundary to make sure there would never be a collision between form values. It's been a while since I had to work at that level (I definitely prefer to just focus on my app and not constantly be sucked down into implementing this kind of thing), but luckily it still proved possible, if unpleasant. Here's the full module in all its glory currently:
(* Renamed from http://www.codecodex.com/wiki/Generate_a_random_password_or_random_string#OCaml *)
let gen_boundary length =
  let gen () = match Random.int (26 + 26 + 10) with
      n when n < 26 -> int_of_char 'a' + n
    | n when n < 26 + 26 -> int_of_char 'A' + n - 26
    | n -> int_of_char '0' + n - 26 - 26 in
  let gen _ = String.make 1 (char_of_int (gen ())) in
  String.concat "" (Array.to_list (Array.init length gen))

let helper boundary key value =
  Printf.sprintf "%s\r\nContent-Disposition: form-data; name=\"%s\"\r\n\r\n%s\r\n" boundary key value

let send ~domain ~api_key params =
  let authorization = "Basic " ^ (B64.encode ("api:" ^ api_key)) in
  let _boundary = gen_boundary 24 in
  let header_boundary = "------------------------" ^ _boundary in
  let boundary = "--------------------------" ^ _boundary in
  let content_type = "multipart/form-data; boundary=" ^ header_boundary in
  let form_value =
    List.fold_left (fun run (key, value) -> run ^ helper boundary key value) "" params in
  let headers = Cohttp.Header.of_list [
    ("Content-Type", content_type);
    ("Authorization", authorization)
  ] in
  let uri = Printf.sprintf "https://api.mailgun.net/v3/%s/messages" domain in
  let body = Cohttp_lwt_body.of_string (Printf.sprintf "%s\r\n%s--" form_value boundary) in
  Cohttp_mirage.Client.post ~headers ~body (Uri.of_string uri)
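For reference, calling it looks roughly like this - a sketch only, assuming the module above is named Mailgun (as the dispatcher code below does); the recipient and subject are placeholders, and Key_gen.mailgun_api_key is the boot-time key described later:

(* Hypothetical call site: send a plain-text notification. *)
let send_test_report () =
  let params = [
    ("from", "RiseOS (OCaml) <errors@riseos.com>");
    ("to", "someone@example.com");
    ("subject", "Test error report");
    ("text", "Something went wrong")
  ] in
  Mailgun.send ~domain:"riseos.com" ~api_key:(Key_gen.mailgun_api_key ()) params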
Perhaps I should expand it a bit so that it could become an OPAM package?
From there, I changed the error-handler for the site dispatcher to catch the error and send me the top-level message. A bit more work, and I had a stack trace. It still wasn't quite right though, because to debug an error like this, you often need to know the context. With some help from @das_cube, I was able to serialize the request, with info like the headers, URI, etc., and send it along with the error report. The final step was to use @Drup's bootvar work (or is it Functoria? I'm not sure where the line is) to make all of the keys configurable, so that I only send emails in production, and to a comma-separated list of emails supplied either at compile- or boot-time:
let report_error exn request =
let error = Printexc.to_string exn in
let trace = Printexc.get_backtrace () in
let body = String.concat "\n" [error; trace] in
let req_text = Format.asprintf "%a@." Cohttp.Request.pp_hum request in
ignore(
let emails = Str.split (Str.regexp ",") (Key_gen.error_report_emails ())
|> List.map (fun email -> ("to", email)) in
let params = List.append emails [
("from", "RiseOS (OCaml) <errors@riseos.com>");
("subject", (Printf.sprintf "[%s] Exception: %s" site_title error));
("text", (Printf.sprintf "%s\n\nRequest:\n\n%s" body req_text))
]
in
(* TODO: Figure out how to capture context (via
middleware?) and send as context with error email *)
ignore(Mailgun.send ~domain:"riseos.com" ~api_key:(Key_gen.mailgun_api_key ()) params))
let dispatcher fs c request uri =
let open Lwt.Infix in
Lwt.catch
(fun () ->
let (lwt_body, content_type) = get_content c fs request uri in
lwt_body >>= fun body ->
S.respond_string
~status:`OK
~headers: (Cohttp.Header.of_list [("Content-Type", content_type)]) ~body ())
(fun exn ->
let status = `Internal_server_error in
let error = Printexc.to_string exn in
let trace = Printexc.get_backtrace () in
let body = String.concat "\n" [error; trace] in
ignore(match (Key_gen.report_errors ()) with
| true -> report_error exn request
| false -> ());
match (Key_gen.show_errors ()) with
| true -> S.respond_error ~status ~body ()
(* If we're not showing a stacktrace, then show a nice html
page *)
| false -> read_fs fs "error.html" >>=
fun body ->
S.respond_string
~headers:(Cohttp.Header.of_list [("Content-Type", Magic_mime.lookup "error.html")])
~status
~body ())
It's still not anywhere near what you get for free in Rails, Clojure, etc. - and definitely not close to session-replay, predictive testing, etc. - but it's a huge step up from before!
As part of due diligence before introducing OCaml to our company, I've been building this site and exploring what OCaml has to offer on a lot of fronts. Now that I have a basic (sometimes terribly painful) flow in place, I've wanted to move on to slimming it down quite a bit. Especially the Mirage build + deploy process. Right now it looks like this:
Dev on OSX (for minutes, hours, days, weeks) until happy with the changes
Git push everything to master
Start up VirtualBox, ssh in
Type history to find the previous incantation
Build Xen artifacts
scp artifacts to an EC2 build machine
ssh into build machine.
Run a deploy script to turn the Xen artifacts into a running server
Clean up left over EC2 resources
As nice as the idea is that I can "just develop" Mirage apps on OSX, it's actually not quite true. Particularly as a beginner, it's easy to add a package as a dependency, and get stuck in a loop between steps 1 (which could be a long time depending on what I'm hacking on) and 3, as you find out that - aha! - the package isn't compatible with the Mirage stack (usually because of the dreaded unix transitive dependency).
Not only that, but I have quite a few pinned packages at this point, and I build everything in step 3 in a carefully hand-crafted virtualbox machine. The idea of manually keeping my own dev envs in sync (much less coworkers!) sounded tedious in the extreme.
At a friend's insistence I've tried out Docker for OSX. I'm very dubious about this idea, but so far it seems like it could help a bit for providing a stable dev environment for a team.
To that end, I updated to Version 1.10.3-beta5 (build: 5049), and went to work trying random commands. It didn't take too long, thanks to a great overview by Amir Chaudry that saved a ton of guesswork (thanks Amir!). I started with a Mirage Docker image, unikernel/mirage, exported the opam switch config from my virtualbox side, imported it in the docker image, installed some system dependencies (openssl, dbm, etc.), and then committed the image. Seems to work like a charm, and I'm relatively happy with sharing the file system across Docker/OSX (it eliminates step 2 in the dev iteration process). I may consider just running the server on the docker instance at this point, though that sadly loses some of the appeal of the Mirage workflow.
Another problem with this workflow is that mirage configure --xen screws up the same makefile I use for OSX-side dev (due to the shared filesystem). So flipping back and forth isn't as seamless as I want.
So now the process is a bit shorter:
Dev on OSX/Docker until happy with the changes
Build Xen artifacts
scp artifacts to an EC2 build machine
ssh into build machine.
Run a deploy script to turn the Xen artifacts into a running server
Clean up left over EC2 resources
Already slimmed down! I'm in the process of converting the EC2 deploy script from bash to OCaml (via the previous post, Install OCaml AWS and dbm on OSX), so soon I'd like it to look like:
Dev on OSX/Docker until happy with the changes
git commit code, push
CI system picks up the new code + artifact commit, tests that it boots and binds to a port, then runs the EC2 deploy script.
I'll be pretty close to happy once that's the loop, and the last step can happen within ~20 seconds.
Early this morning I was able to get some very, very simple OCaml code running on my physical iPhone 6+, which was pretty exciting for me.
I had been excited about the idea since seeing a post on Hacker News. Reading through, I actually expected the whole process to be beyond-terrible, difficult, and buggy - to the point where I didn't even want to start on it. Luckily, Edgar Aroutiounian went well beyond the normal open-source author's limits and actually sat down with me and guided me through the process. Being in-person and able to quickly ask questions, explore ideas, and clear up confusion is so strikingly different from chatting over IRC/Slack. I'll write a bit more about the process later, but here's an example of the entire dev flow right now: edit OCaml (upper left), recompile and copy the object file, and hit play in Xcode.
The next goal is to incorporate the code into this site's codebase, to build a native iOS app for this site as an example (open source) iOS client with a unikernel backend. I'm very eager to try to use ReactNative, for:
The fantastic state models available (just missing a pure-OCaml version of DataScript)
Code sharing between the ReactJS and ReactNative portions
Tons of great packages, like ReactMotion that just seem like a blast to play with
Acknowledgements
I'd really like to thank Edgar Aroutiounian and Gina Maini for helping me out, and for being so thoughtful about what's necessary to smooth out the rough (or dangerously sharp) edges in the OCaml world. Given that tooling is a multiplicative force to make devs more productive, I often complain about the lack of thoughtful, long-term investment in it. Edgar (not me!) is stepping up to the challenge and actually making very impressive progress on that front, both in terms of code and in documenting/blogging.
As a side note, he even has an example native OSX app built using OCaml, tallgeese.
I'm toying with the idea of rewriting the deploy script I cribbed from @yomimono for this blog from bash to OCaml (there are some features I'd like to make more robust so that the full deploy is automated and resources are cleaned up), and came across the OCaml AWS library. Unfortunately, installing it was a bit frustrating on OSX; I kept hitting:
NDBM not found, the "camldbm" library cannot be built.
After a bit of googling around, the fix was fairly simple: simply install the Command Line Tools (typically via xcode-select --install), and you should have the right header files etc. so that opam install aws or opam install dbm should work. Hope that helps someone who runs into a similar problem!
I used Let's Encrypt (LE) to get a nice SSL cert for www.riseos.com (and riseos.com, though I really would like that to simply redirect to www. Someday I'll wrap up all the loose ends).
Going through the process wasn't too bad, but unfortunately it was a bit tedious with the current flow. To pass the automated LE checks, you're supposed to place a random string at a random URL (thus demonstrating that you have control over the domain and are therefore the likely owner). I thought I would do this by responding to the url in my existing OCaml app, but
The deploy feedback cycle is just too long
The SSL cert generated by make secrets doesn't work for the check.
In the end I simply switched the DNS records to point to my local machine, opened up my router, and copy/pasted the example python code. Because I use Route53, it was instantaneous. Then after a bit of mucking about with permissions, I copied fullchain1.pem -> secrets/server.pem, and privkey.pem -> secrets/server.key, fixed the dns records, redeployed (now a single script on a local vm + a single script on an EC2 vm), et voila, a working SSL site!
There are some problems with the Let's Encrypt certificate, however. The JVM SSL libraries will throw an error when trying to connect to it, saying something like, "unable to find valid certification path to requested target". That transitively affects Apache HttpClient, and therefore clj-http. In the end, I had to pull the cert and insert it into the keystore.
As a side note, the deploy cycle is still too long, and still too involved, but it's hugely better than just a week or two ago. I expect to soon be able to remove the EC2 vm entirely, and to be able to run a full, unattended deploy from my VM - or even better, from CircleCI after every push to master. After those sets of paper cuts are healed, I want to do a full deploy on a fresh account, and get the time from initial example-mirage git checkout to running publicly-accessible server (possibly with a valid https cert) to under three minutes, on EC2, Prgmr, or Google Cloud (or Linode/Digital Ocean if anyone knows how to get xen images booting there).
This site has been a very incremental process - lots and lots of hard-coding where you'd expect more data-oriented, generalized systems. For example, the post title, recent posts, etc. are all produced in OCaml, rather than liquid. I'd like to change that, and bit by bit I'm getting closer.
In fact there's a whole list of things I'd like to change:
Routing is hard-coded. I want to bring in Opium to be able to use the nice routing syntax, and middleware for auth, etc. However, its dependency on unix means that it can't be used with the Mirage backend. Definitely keeping an eye on the open PRs here.
Every page is fully re-rendered on each request - reading the index.html (template file), searching through it for the targets to replace, reading the markdown files, rendering them into html and inserting them into the html, and finally serving the page. For production, this should be memoized (a rough sketch of what that could look like follows this list).
Posts can't specify their template file - everything is just inserted into index.html. Should be trivial to change.
The liquid parser mangles input html to the point where it significantly changes index.html. It needs to be fixed up.
Similarly, I want to move more (e.g. some) logic into the liquid templates, for things like conditionals, loops, etc.
Along those lines, the ReactJS bindings are very primitive, I need to come up with a small app in this site (perhaps logging in) to start exercising and building them out (with ppx extensions at some points, etc.)
An application I'm considering is to first expose an API to update posts in dev-mode, then building a ReactJS-based editor on the frontend (draft.js is obviously a very cool tool that could be used). That way editing is a live, in-app experience, and then rendering is memoized in production. Production could even have a flag to load the dev tools given the right credentials, and allow for a GitHub PR to be created off of the changes.
Possibly use Irmin as a storage interface for the posts.
Plenty of other things as well. I'll update this as I remember them.
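For the memoization item above, a minimal sketch of what a render cache could look like (render_page here is a stand-in for the existing markdown-to-html pipeline, not an actual function in this codebase):

(* Hypothetical cache of rendered pages, keyed by request path. *)
let page_cache : (string, string) Hashtbl.t = Hashtbl.create 16

(* Look up a rendered page, rendering and caching it on a miss.
   render_page is a placeholder for the current template + markdown code. *)
let rendered_page ~render_page path =
  try Hashtbl.find page_cache path
  with Not_found ->
    let html = render_page path in
    Hashtbl.add page_cache path html;
    html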
Mirage is going to have a ton of growing pains as it's used for real-world applications. I suspect that most of that will be spent on polish and glue (which is desperately missing right now), because the core is relatively solid (especially compared to e.g. one year ago).
Still, I have tons of Mirage questions, and would like answers/guides to them, or even better - code to completely obsolete them. I'll keep a list here, and update it with links as answers come in.
How to express pinned dependencies in the mirage config.ml. Apparently this isn't possible right now, which means others are going to have a hard time using my example repository.
Seamless, continuous, one-click deploy from any platform to AWS, GCE, Linode, Digital Ocean, and prgmr
How to get stack traces from crashes in the unikernel in production (ideally we'd be able to combine this with e.g. bugsnag at some point)
How to build a xen unikernel image from OSX (likely to be a big requirement)
If the above isn't feasible, how to tie into e.g. CircleCI to build the xen artifacts and upload them somewhere.
How to parameterize the ports for development (where I don't want to use sudo to start my binary) and for production (where I don't mind it, of course). Also applies to other things besides just ports (ssl certs, etc.). A rough sketch of one key-based approach follows below.
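One plausible answer for the ports question is the same Functoria key mechanism used for the error-reporting keys above. A minimal sketch of the config.ml side (the key name and default here are illustrative, not this site's actual configuration):

(* A configurable port, following the standard Functoria key pattern. *)
let port =
  let doc = Key.Arg.info ~doc:"Port for the HTTP server to listen on." ["port"] in
  Key.(create "port" Arg.(opt int 8080 doc))

(* Passed to the job, e.g. register "riseos" ~keys:[Key.abstract port] [main],
   the value shows up in the unikernel as Key_gen.port () and can be
   overridden at configure or boot time, depending on the key's stage. *)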