Announcing Potential: x86-64 assembler as a Haskell EDSL

Over the years there have been many projects which seek to use advanced types to provide better static-guarantees in low level languages. There are many examples of this in the literature; here are just a few:

  • Cyclone, perhaps the most-cited example of using types to protect memory in low-level settings.
  • Habit, a proposed Haskell dialect which uses a viable form of dependent types to model low-level data structures and their memory management which I recently learned about on reddit, coincidentally in a comment on an earlier post of mine.
  • Some work due to Oleg Kiselyov and Chung-chieh Shan showing that Haskell is a viable setting for embedding a low-level language.

(If you know of others, put them in the comments!)

In December of 2009 I became interested in typed assembly languages and began working on my own, quickly deciding to embed the language into Haskell. At present there are many facets of the language which work well, but there is still a good deal of work to be done.

When I started this, I was unaware of much of the work that had already been done in this direction (I especially wish I had been aware of the Kiselyov-Shan proof of concept). In the course of the project, I’ve learned a bunch of great tricks, and also have learned of a good deal of excellent work done by others that I hadn’t seen promoted elsewhere. Over the next few posts, my goals are threefold:

  • To describe my own project: where it is, what the challenges have been, where I hope for it to go. And to release some source (on github), such as it is.
  • To describe some of the general lessons I’ve learned working on an EDSL in Haskell, to evangelize this approach to problem solving, and to describe some caveats to conventional Haskell wisdom in this setting.
  • To promote some of the work others have done in this area, in the aim of showing just how far along this idea is.


Optional class constraints in Haskell

Work on my Haskell EDSL is moving ever onward. Today I want to talk about a trick I found while working on it. (Along the way I’ll make some allusions to the EDSL, but I want to forestall announcing the EDSL for another week or so, in the interest of ensuring it’s fully baked.)


Polymorphic first class labels

(This post uses GHC 6.12.1)

The expression “first class labels” refers to the idea that, for record data types, one should be able to pass around the labels just as they would any other type. For instance, if I have a record like

data Foo a b = { biz :: a, baz :: b }

the value biz shouldn’t just denote the function biz :: Foo a b -> a, but should also be usable as a way of updating records, that is, a function like biz' :: Foo a b -> a' -> Foo a' b.

The Mythical Haskell’ includes some proposals for updating the records system with features aimed at supporting this idea, but for the time being, many people prefer to use fclabels, which achieves much of this magic using Template Haskell.

Recently, while working on an EDSL, I found myself wishing I had first class labels. I ran into a problem, though, which (along with solution) I’ll now describe.

Consider the following code:

module Label where

data Foo a b = Foo a b deriving Show

updatea :: a' -> Foo a b -> Foo a' b
updatea a (Foo _ b) = Foo a b

updateb :: b' -> Foo a b -> Foo a b'
updateb b (Foo a _) = Foo a b

worksFine foo0 = let foo1 = updatea 'a' foo0
                     foo2 = updatea "a" foo1
                 in foo2

Here I’ve defined a data structure Foo with two fields, along with a pair of functions for updating these fields. Then I defined a function worksFine which uses updatea to modify a Foo.

Obviously I could also write the following function:

worksFine' foo0 = let foo1 = updateb 'a' foo0
                      foo2 = updateb "a" foo1
                  in foo2

which is exactly the same, except that it uses updateb instead, thereby modifying the other field in Foo.

So now we have an obvious place to generalize: instead of having both worksFine and worksFine', why not have a single function which takes the updater as a parameter?

If we try it out, the first attempt looks like this:

trouble u foo0 = let foo1 = u 'a' foo0
                     foo2 = u "a" foo1
                 in foo2

Only trouble is that this fails to type check:

    Couldn't match expected type `[Char]' against inferred type `Char'
      Expected type: [Char] -> t1 -> t
      Inferred type: Char -> t2 -> t1
    In the expression: u "a" foo1
    In the definition of `foo2': foo2 = u "a" foo1

The problem is that trouble doesn’t believe that the argument u is polymorphic enough. This is a typical rank-2 issue: we don’t want trouble to bind the type variables in the signature for u.

Using rank-2 types, we can get very close to a solution. We can write the functions

alsoWorksFinea :: (forall a a' . a' -> Foo a b -> Foo a' b)
               -> Foo a b -> Foo [Char] b
alsoWorksFinea u foo0 = let foo1 = u 'a' foo0
                            foo2 = u "a" foo1
                        in foo2

alsoWorksFineb :: (forall b b' . b' -> Foo a b -> Foo a b')
               -> Foo a b -> Foo a [Char]
alsoWorksFineb u foo0 = let foo1 = u 'a' foo0
                            foo2 = u "a" foo1
                        in foo2

but neither is able to accept both of our update functions, even though each function has exactly the same body. Worse, I wasn’t able to find a sufficiently general type signature that would allow me to have one function which would be able to accept both update functions.

Luckily, where rank-2 types have failed me, type families have saved me. Any time you need more flexibility in your type signatures than the syntax will allow, you might be in a box where type families are the way to go. Here’s what it looked like in my case.

First I defined some typed to represent the two fields of my Foo structure:

data A = A
data B = B

Obviously I can pass around these values in a first-class manner, no trouble at all.

Then I defined a class for describing updating and getting:

class Field f x y where
  type Updated f x y
  update :: f -> x -> y -> Updated f x y
  type Gotten f y
  get :: f -> y -> Gotten f y

Now there are two instances to give, one for each field in Foo:

instance Field A a' (Foo a b) where
  type Updated A a' (Foo a b) = Foo a' b
  update A a' (Foo a b) = Foo a' b
  type Gotten A (Foo a b) = a
  get A (Foo a b) = a

instance Field B b' (Foo a b) where
  type Updated B b' (Foo a b) = Foo a b'
  update B b' (Foo a b) = Foo a b'
  type Gotten B (Foo a b) = b
  get B (Foo a b) = b 

And we’re basically done. We can now write the function that started this whole mess:

shouldWorkFine f foo0 = let foo1 = update f 'a' foo0
                            foo2 = update f "a" foo1
                        in foo2

GHCi is able to give us a very promising type signature for it:

> :t shouldWorkFine 
  :: (Field f [Char] (Updated f Char y), Field f Char y) =>
     f -> y -> Updated f [Char] (Updated f Char y)

While this technique introduces type classes and type families into our program (something which can make typing troublesome in other areas), it delivers something I don’t know how to otherwise get: polymorphic first class labels.

Clearly the next step is to implement a library like fclabels which uses Template Haskell to define instances of the Field class.

Some weird interactions between Monomorphism Restriction and Template Haskell

Today I’m going to look at a weird issue I encountered this past weekend while working on a DSL in Haskell.

I’ll start with the code. As this example uses Template Haskell, we need to have the source broken up into two files:


        TemplateHaskell #-}
module Testa where

import Language.Haskell.TH

someth = [| () |]

unit = ()

mrIssue :: (Monad m) => b -> m b
mrIssue = return


        TemplateHaskell #-}
module Testb where

import Testa

g = $(someth)
foo1 = mrIssue $(someth)

f = ()
foo2 = mrIssue ()

h = unit
foo3 = mrIssue unit

Now, if we fire up ghci Testb.hs, we get the following:

$ ghci Testb.hs 
GHCi, version 6.10.4:  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
[1 of 2] Compiling Testa            ( Testa.hs, interpreted )
[2 of 2] Compiling Testb            ( Testb.hs, interpreted )
Loading package syb ... linking ... done.
Loading package array- ... linking ... done.
Loading package packedstring- ... linking ... done.
Loading package containers- ... linking ... done.
Loading package pretty- ... linking ... done.
Loading package template-haskell ... linking ... done.
Ok, modules loaded: Testb, Testa.
*Testb> :t foo1
foo1 :: (Monad m) => m ()
*Testb> :t g

    Ambiguous type variable `m' in the constraint:
      `Monad m' arising from a use of `g' at :1:0
    Probable fix: add a type signature that fixes these type variable(s)

Notice that, while we’re able to get the type for foo1, for some reason g is ill-typed with an ambiguous type variable in the constraint Monad m. The only problem, of course, is that when we look at our source we don’t see any way in which the type for g should have this constraint!

So maybe the problem is that g needs a type signature. But if we go in and modify Testb.hs, giving it

g :: ()
g = $(someth)

then we get the following from ghci:

    Could not deduce (Monad m) from the context ()
      arising from a use of `mrIssue' at Testb.hs:10:7-23
    Possible fix:
      add (Monad m) to the context of the type signature for `g'
    In the expression: mrIssue ($someth)
    In the definition of `foo1': foo1 = mrIssue ($someth)
Failed, modules loaded: Testa.

So now it's upset about the type for foo1. Fine. Let's give it a signature as well:

g :: ()
g = $(someth)
foo1 :: (Monad m) => m ()
foo1 = mrIssue $(someth)

Now we get a new error from ghci:

    Contexts differ in length
      (Use -XRelaxedPolyRec to allow this)
    When matching the contexts of the signatures for
      g :: ()
      foo1 :: forall (m :: * -> *). (Monad m) => m ()
    The signature contexts in a mutually recursive group should all be identical
    When generalising the type(s) for g, foo1
Failed, modules loaded: Testa.

If we add RelaxedPolyRec to our list of LANGUAGE extensions, the problem does, indeed, go away. In this case, we can even remove our type signature for g or for foo1, but not both -- we need to have at least one of them present.

Lastly, if we go back to our original source given above, but replace the signature for mrIssue with

mrIssue :: b -> IO b
mrIssue = return

then we can remove the NoMonomorphismRestriction, and everything works just fine (we can :t g and :t foo1 without any problems).

I'm entirely unsure of what's going on here. Any theories?