Work on my Haskell EDSL is moving ever onward. Today I want to talk about a trick I found while working on it. (Along the way I’ll make some allusions to the EDSL, but I want to forestall announcing the EDSL for another week or so, in the interest of ensuring it’s fully baked.)
My EDSL is essentially an attempt at grafting the Haskell static type system onto a dynamically typed language. The idea is that a programmer writes their code in my EDSL — that is, he’s really written his code in an indexed monad I’ve set up in Haskell — and when he executes his EDSL code, the output is a bunch of code written in the target language, which can then be compiled using whatever toolchain he likes. We’ve basically embedded the target language into Haskell, which means that the Haskell type system can now be used to model concepts in the target language. The target language has a very weak dynamic type system which, if I’ve done my modeling correctly, allows a proper superset of the programs that my EDSL will allow.
Anyway, with this design, there are two operations that the programmer needs to be able to perform with their code:
- Make sure that the code type-checks. Obviously the Haskell compiler will do this on its own.
- If the code type-checks, render the code from the EDSL into the target language.
As alluded to above, the EDSL resides within an indexed monad, say
PState assumes returns a, which encodes the Hoare types of code blocks. Consequently, there is a render function, appropriately typed
render :: PState assumes returns () -> TargetAST, where
TargetAST is some algebraic data type that I can pretty print into the target language’s syntax.
Obviously I’m leaving out a good deal of detail about the EDSL, which I’ll talk about in a later post; today I want to talk about a problem that arises with this style.
It is a common theme that the true zen of Haskell code is modeling your program’s behavior in the type system. As a basic example, a program that takes user input and puts it into a syntax-sensitive structure can use types to ensure that non-escaped strings never make it from the user to the structure; as a more complex example, configuration flags can be managed at the type level, if one is sufficiently clever. These are all ideas that would be wonderful to have in other languages, but which are typically absent due to the limitations of the type systems in question.
Of course, when one wishes to encode program behavior into types, one often works into a box where type classes need to be introduced. I recently found myself: my target language has a notion of “operand size,” and some functions in the language only work with certain sizes of data; it was thus necessary to have a representation for this in the EDSL. Since operand size is a property of a type, rather than a type itself, this introduced type classes.
All of this is fine, of course, when it comes to type checking: I was able to employ a common trick to encode sizes in types, then have a two-parameter class
class HasSZ d size
whose only job is to describe a size predicate on a type. But now I have EDSL code with a type signature predicated on a class constraint, as in
someEDSLFunction :: HasSZ d (S (S Z))
=> PState (some complex expression using d) returns ()
Since the implementation of
render is essentially (using
render :: PState assumes returns () -> TargetAST
render f = getTargetAST $ runPState f (undefined :: assumes)
we now have a problem: if
f has a class constraint like that found in
someEDSLFunction, we get a type error:
> render someEDSLFunction :1:7: No instance for (HasSZ d (S (S Z))) arising from a use of `someEDSLFunction' at :1:7-22 Possible fix: add an instance declaration for (HasSZ d (S (S Z))) In the first argument of `render', namely `someEDSLFunction' In the expression: render someEDSLFunction In the definition of `it': it = render someEDSLFunction
This is particularly frustrating because (though I haven’t yet explained this) we’re using
undefined for the sole reason that we don’t want the code which is generated to depend on the input — that is, the class constraints on the input are only a formality, there to make sure that function calls within the EDSL are legal in the context of the target language!
So that’s the problem. Here’s the fix: optional class constraints. The central issue is that, during the type-checking phase, we want the class constraints to be there — but during the rendering phase, we wish they weren’t. We can achieve this by introducing a new type class, along with some suspicious looking instances.
In addition to the
HasSZ class, we’ll introduce the following class (and instances):
class MaybeHasSZ d size c
instance (HasSZ d size) => MaybeHasSZ d size ConstraintsOn
instance MaybeHasSZ d size ConstraintsOff
We then take our indexed monad,
PState, and add an extra type parameter to track whether or not we want class constraints enabled or disabled. Functions in our EDSL now have type
PState c assumes returns a.
We now give the function
someEDSLFunction a signature like
someEDSLFunction :: MaybeHasSZ d (S (S Z)) c =>
PState c (some complex expression using d) returns ()
By coupling the class constraint with the
c variable in our
PState, we’ve given ourselves a way to “flip a switch” to turn the
HasSZ constraint on and off: during type checking, the EDSL binds
ConstraintsOn, so that the constraint
MaybeHasSZ d (S (S Z)) c reduces to the constraint
HasSZ d (S (S Z)). To fix rendering, we just give
render the signature
render :: PState ConstraintsOff assumes returns () -> TargetAST
so that the expression
render someEDSLFunction collapses the constraint
MaybeHasSZ d (S (S Z)) c down to nothing, allowing the
render function to do its job.