Writing WebAssembly Code by Hand

Photo by Quino Al on Unsplash

If you know anything about WebAssembly, you might have seen the title of this article and thought "OMG why would I write WebAssembly code by hand?!" And for almost all purposes, you'd be absolutely correct to be skeptical about that idea! WebAssembly—or Wasm—isn't intended to be written by hand by most users.

Use cases for hand-written Wasm

Then under what circumstances might writing Wasm by hand make sense? Here are a few of those cases:

  • You're writing Wasm to run on a microcontroller, where even the inherently small modules produced by standard compilers might not be small enough
  • You're writing a compiler that targets WebAssembly
  • You enjoy writing assembly code from scratch occasionally to understand it deeply

If any of these resonate with you, read on as we write a classic "Hello world!" program in Wasm!

Prerequisites

This demo will be easiest to follow for those familiar with the concept of assembly languages.

Set-up steps

There are lots of ways to do this, but for this example, we'll keep it simple and use our Subo CLI and Sat, our WebAssembly function server.

  • If you've previously installed Subo, make sure to upgrade to the latest version; if you need to install Subo, check out our Subo installation instructions
  • Install Go if you don't already have it
  • Make sure your GOBIN is on your $PATH
  • Clone the Sat repo, cd into it, and build Sat by running:
    make sat/install
    
  • Install Docker if you don't already have it, and start it
  • Use Subo to create a Runnable:

      subo create runnable hello-world --lang wat
    
    • The --lang wat flag tells Subo we'll be writing in WebAssembly Text, or wat
  • cd into the new hello-world directory, and run:

      subo build
    

Our hello-world directory now contains two files: hello-world.wasm and lib.wat, and we'll open lib.wat in our favorite editor. In lib.wat we'll see something like: (module ... (bunch of code)), and we'll delete everything except (module) so we have a clean slate. (module) is the smallest possible Wasm module: its .wasm file consists of just 8 bytes!
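
Those 8 bytes are simply the header that every Wasm binary starts with: the 4-byte magic string \0asm followed by a 4-byte little-endian version number (currently 1). As a quick illustration in Python (the helper name empty_module is ours, not part of Subo or any other tool), we can assemble them by hand:

```python
import struct

# Every .wasm file opens with the magic bytes "\0asm" ...
WASM_MAGIC = b"\x00asm"
# ... followed by the binary format version (1) as a little-endian u32.
WASM_VERSION = struct.pack("<I", 1)


def empty_module() -> bytes:
    """Return the binary encoding of the empty module, (module)."""
    return WASM_MAGIC + WASM_VERSION


if __name__ == "__main__":
    data = empty_module()
    print(len(data), data.hex(" "))
```

Everything we add to the module from here on becomes extra sections after this header.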

A bit of context

The Wasm text format has a few properties to be aware of when writing it by hand:

  • wat is written as though it's running on a virtual stack machine
  • wat is typed
  • wat is written in S-expression format, which looks like this: (module (label_1) (label_2) ... (label_n)) and we'll see in the next few steps what the labels can refer to
  • The order in which variables and functions are written doesn't matter for wat itself, but does matter for wat2wasm, the tool Subo uses to convert Wasm written in wat format to wasm, Wasm's binary format

Sat requires Wasm modules to expose the following interface (we'll discuss each of these in detail below!):

  • run_e: (i32, i32, i32) -> ()
    • The function we’re going to run in Sat
  • allocate: (i32) -> (i32)
    • A function to allocate memory
  • deallocate: (i32) -> ()
    • A function to deallocate memory
  • memory
    • Our module's WebAssembly memory

Let's write some code

"Deallocating" memory

Writing the function to deallocate memory is the easiest, so we’ll start there, but in fact we're just going to write a dummy function, since we're not really worried about memory leaks for this demo.

We'll name this function deallocate, and declare it like this:

(func $deallocate (param $ptr i32)
    nop
)

Let's walk through this:

  • deallocate takes one parameter—ptr i32—which is a pointer to the region of memory we want to deallocate
    • i32 refers to a 32-bit integer (the initial version of Wasm is 32-bit, so all pointers are 32-bit)
  • nop means “no operation”: we're doing nothing here, since this is a dummy function

Next, we need to export this function so Sat can use it:

(export "deallocate" (func $deallocate))
  • What's up with these extremely similar names?
    • deallocate is the name by which the function will be identified externally, that is, by the host
    • $deallocate identifies the internal WebAssembly function inside our module that is being exported

So at the moment our module looks like this:

(module
 (func $deallocate (param $ptr i32)
  nop
 )

 (export "deallocate" (func $deallocate))
)

Allocating memory

We need to take care of memory allocation to coordinate between Sat, our Wasm host, and our module, so that they don't overwrite each other's memory regions.

First, at the top of our module we'll specify some memory to write into:

(memory $memory 1)

Let's walk through this:

  • memory needs either one or two parameters:
    • The first value is the initial memory size and represents a number of WebAssembly pages, each of which uses 64KiB of memory, so one page will do us for this project
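    • The second argument specifies the maximum number of pages to be used. Omitting it doesn't give us unlimited memory, but it does leave the memory free to grow up to whatever limit the runtime imposes (at most 65,536 pages, or 4 GiB, with 32-bit addressing)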
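
To make the page arithmetic concrete, here is a small Python sketch. The constant and function names are ours. The 65,536-page ceiling follows from 32-bit addressing and is only an upper bound: a given runtime may well allow less.

```python
PAGE_SIZE = 64 * 1024        # one WebAssembly page is 64 KiB
MAX_PAGES_32BIT = 65536      # 65536 pages * 64 KiB = 4 GiB, the full 32-bit address space


def pages_to_bytes(pages: int) -> int:
    """Size in bytes of a linear memory that is `pages` pages long."""
    if not 0 <= pages <= MAX_PAGES_32BIT:
        raise ValueError("page count out of range for a 32-bit memory")
    return pages * PAGE_SIZE


if __name__ == "__main__":
    # The single page declared by (memory $memory 1)
    print(pages_to_bytes(1))
```

One page is far more than the 12 bytes our string needs, which is why we never have to grow the memory in this demo.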

Next, we’ll need a global variable that points to the address to which we’re currently allocating memory. We’ll name that variable heap and declare it above our deallocate function like this:

(global $heap (mut i32) (i32.const 1024))
  • We want this value to be mutable, so we’ll give it the type (mut i32)
  • In Wasm global variables need to be assigned a value at the time they’re declared, so we’ll declare a constant via i32.const and give it a value of 1024 (0 is a valid pointer as far as Wasm is concerned but is a null pointer in most other languages, and 1024 is a nice power of 2 so we’ll go with that)

Now we can write our allocate function, and we’ll put this inside our module, above our deallocate function. We'll step through this function declaration in several parts, beginning with the signature:

(func $allocate (param $size i32) (result i32)
)
  • allocate will take one parameter, and we’ll call it “size” because it refers to the amount of memory we want to allocate
  • This function will return a value, and the way we signal that we want it to return a value is by appending (result i32) to the end of the declaration

We’ll need to access our heap value by running global.get $heap twice (we’ll see why we need it twice a few steps down!):

 global.get $heap
 global.get $heap
  • At the moment, global.get $heap will always return the value 1024 because that’s how we declared it, but it’s mutable and will change as we go along

Next we’ll need to add the amount of memory we want to allocate to our heap value. We use local.get $size to push the value of the size parameter we passed to allocate onto the stack:

 global.get $heap
 global.get $heap
 local.get $size

And then we’ll add together the values of local.get $size and global.get $heap with the i32.add instruction:

 global.get $heap
 global.get $heap
 local.get $size
 i32.add
  • i32.add will pop local.get $size from the stack, followed by global.get $heap and add those two values together
  • So when this runs, our stack will have two values on it: the original global.get $heap value at the bottom, and above it the sum of local.get $size and global.get $heap

But we want to actually do something with that, so we update our heap value with global.set $heap:

(func $allocate (param $size i32) (result i32)
 global.get $heap
 global.get $heap
 local.get $size
 i32.add
 global.set $heap
)
  • Updating our heap value will do two things:
    1. Pop the sum left by i32.add off of the stack and store it as the new heap value, so the heap pointer moves past the region we just handed out
    2. Leave the original value of global.get $heap as the only item on the stack, which is what the function returns as the pointer to the newly allocated region
  • This is why we needed global.get $heap twice! The second copy is consumed by i32.add and only the first remains
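
If the stack juggling is hard to picture, it can help to replay it outside of Wasm. The Python sketch below models the five instructions of allocate on an explicit list used as the operand stack. It is only an illustration (trace_allocate is a name of ours), not how a Wasm runtime is implemented.

```python
def trace_allocate(heap: int, size: int):
    """Model allocate's five instructions on an explicit operand stack.

    Returns (returned_pointer, new_heap, snapshots), where snapshots holds
    the stack contents after each instruction.
    """
    stack = []
    snapshots = []

    def record(instr: str) -> None:
        snapshots.append((instr, list(stack)))

    stack.append(heap)                  # global.get $heap
    record("global.get $heap")
    stack.append(heap)                  # global.get $heap
    record("global.get $heap")
    stack.append(size)                  # local.get $size
    record("local.get $size")
    rhs = stack.pop()                   # i32.add pops two operands ...
    lhs = stack.pop()
    stack.append((lhs + rhs) % 2**32)   # ... and pushes their sum, wrapping at 32 bits
    record("i32.add")
    heap = stack.pop()                  # global.set $heap stores the sum
    record("global.set $heap")
    returned = stack.pop()              # the one remaining value is the function result
    return returned, heap, snapshots


if __name__ == "__main__":
    pointer, new_heap, steps = trace_allocate(heap=1024, size=12)
    for instr, contents in steps:
        print(f"{instr:<18} {contents}")
    print("returned", pointer, "heap is now", new_heap)
```

Calling it a second time with the updated heap value shows the next allocation starting where the previous one ended, which is all a simple "bump" allocator like this one does.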

Then we’ll export our allocate function:

(export "allocate" (func $allocate))

Our module now looks like this:

(module
  (memory $memory 1)
  (global $heap (mut i32) (i32.const 1024))

  (func $allocate (param $size i32) (result i32)
    global.get $heap
    global.get $heap
    local.get $size
    i32.add
    global.set $heap
  )

  (func $deallocate (param $ptr i32)
    nop
  )
  (export "allocate" (func $allocate))
  (export "deallocate" (func $deallocate))
)

Writing a function to run in Sat

We'll declare a run_e function above our allocate function:

(func $run_e (param $payload_ptr i32) (param $payload_size i32) 
  (param $ident i32) (local $data_ptr i32)
)
  • Sat will provide the following parameters to run_e:
    • A pointer payload_ptr to the payload we're passing to Sat
    • payload_size, which is—surprise!—the size of the payload
    • An ident value, which Sat uses for security purposes

And of course we'll need our "Hello world!" string! We can declare it near the bottom of our module, above our function exports:

(data "Hello world!")

But at the moment this is a passive data segment that only exists in our WebAssembly file: we need to allocate some memory for it so it can be used when the runtime loads our module, and we'll declare that within our run_e function:

(local.set $data_ptr
  (call $allocate (i32.const 12))
 )
  • We can assign our data_ptr the value returned by calling the allocate function we wrote previously
    • allocate takes one parameter, which is the size of our data
    • In this case, our data is the string "Hello world!", which is 12 characters (so 12 bytes), so we'll pass i32.const 12
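
The 12 we pass here is a byte count, not a character count. The two happen to match for our string because every character in it is ASCII, which is easy to confirm in Python (a quick check of ours, not part of the Subo workflow):

```python
message = "Hello world!"

# Wasm memory holds bytes, so what matters is the length of the encoded string.
encoded = message.encode("utf-8")
byte_length = len(encoded)

if __name__ == "__main__":
    print(len(message), byte_length)
```

A string containing non-ASCII characters would need more bytes than it has characters, so it's worth re-checking this number whenever the message changes.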

Below that, we'll load our data into the region that we just specified:

(memory.init 0
  (local.get $data_ptr)
  (i32.const 0)
  (i32.const 12)
 )
  • The 0 after memory.init is the index of the data segment to copy from. Segments are zero-indexed, so 0 refers to our first (and only) data segment
  • local.get $data_ptr is where we want to load the data into
  • We don't need to offset our starting point in the data, so we pass i32.const 0
  • And we want to write all 12 bytes of the data

Now we need to import Sat's return_result host function at the top of our module:

(import "env" "return_result" (func $return_result (param i32 i32 i32)))
  • return_result takes three parameters, all of type i32
    • The first will be the value of the pointer to our data
    • The second will be the size of the data
    • And the third will be our ident

Finally, we need to call return_result below our memory.init instruction to hand Sat the pointer we got from calling allocate, along with the size of our data and our ident:

(call $return_result
  (local.get $data_ptr)
  (i32.const 12)
  (local.get $ident)
)

Our complete run_e function looks like this:

(func $run_e (param $payload_ptr i32) (param $payload_size i32)
    (param $ident i32) (local $data_ptr i32)

        (local.set $data_ptr   
            (call $allocate (i32.const 12))
        )

        (memory.init 0
            (local.get $data_ptr)
            (i32.const 0)
            (i32.const 12)
        )

        (call $return_result
            (local.get $data_ptr)
            (i32.const 12)
            (local.get $ident)
        ) 
    )

Nearly there! We just need to export our run_e function and our memory:

(export "run_e" (func $run_e))
(export "memory" (memory $memory))

So our complete module looks like this:

(module
    (import "env" "return_result" (func $return_result (param i32 i32 i32)))
    (memory $memory 1)
    (global $heap (mut i32) (i32.const 1024))


    (func $run_e (param $payload_ptr i32) (param $payload_size i32)
        (param $ident i32) (local $data_ptr i32)

        (local.set $data_ptr   
            (call $allocate (i32.const 12))
        )

        (memory.init 0
            (local.get $data_ptr)
            (i32.const 0)
            (i32.const 12)
        )

        (call $return_result
            (local.get $data_ptr)
            (i32.const 12)
            (local.get $ident)
        ) 
    )    

    (func $allocate (param $size i32) (result i32)
        global.get $heap
        global.get $heap
        local.get $size
        i32.add
        global.set $heap
    )

    (func $deallocate (param $ptr i32)
        nop
    )

    (data "Hello world!")

    (export "allocate" (func $allocate))
    (export "deallocate" (func $deallocate))
    (export "run_e" (func $run_e))
    (export "memory" (memory $memory))
)
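
Before building, it can help to trace what this module does from the host's point of view. The Python sketch below is a toy model of the module above, not of Sat itself: HelloWorldModule and run_once are hypothetical names of ours, the ident value 7 is arbitrary, and we are assuming that the host copies the payload in via allocate before calling run_e, as the exported interface suggests.

```python
PAGE_SIZE = 64 * 1024


class HelloWorldModule:
    """Toy Python model of the hand-written module (not a real Wasm runtime)."""

    def __init__(self, return_result):
        self.return_result = return_result      # the imported host function
        self.memory = bytearray(PAGE_SIZE)      # (memory $memory 1)
        self.heap = 1024                        # (global $heap (mut i32) (i32.const 1024))
        self.data = [b"Hello world!"]           # (data "Hello world!")

    def allocate(self, size: int) -> int:
        pointer = self.heap                     # the value left on the stack
        self.heap += size                       # global.set $heap
        return pointer

    def deallocate(self, ptr: int) -> None:
        pass                                    # nop

    def run_e(self, payload_ptr: int, payload_size: int, ident: int) -> None:
        data_ptr = self.allocate(12)
        self.memory[data_ptr:data_ptr + 12] = self.data[0][0:12]   # memory.init 0
        self.return_result(data_ptr, 12, ident)


def run_once(payload: bytes, ident: int = 7) -> bytes:
    """Play the host: pass a payload in, collect whatever the module returns."""
    results = {}

    def return_result(ptr: int, size: int, result_ident: int) -> None:
        # The host reads the result straight out of the module's memory.
        results[result_ident] = bytes(module.memory[ptr:ptr + size])

    module = HelloWorldModule(return_result)
    payload_ptr = module.allocate(len(payload))
    module.memory[payload_ptr:payload_ptr + len(payload)] = payload
    module.run_e(payload_ptr, len(payload), ident)
    module.deallocate(payload_ptr)
    return results[ident]


if __name__ == "__main__":
    print(run_once(b"\n"))
```

Note that run_e ignores the payload entirely, so the module answers with the same 12 bytes no matter what we send it.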

Testing

Now we get to test our code!

  • We’ll ask Subo to build it:

      subo build
    
  • And test it in Sat:

      echo "" | sat hello-world.wasm --stdin
    

Woohoo, we did it! If you have a use case for writing WebAssembly by hand, let us know in the comments. Or even better, post your thoughts or projects on Wasm Builders, a new blogging platform for the Wasm community!
