Haskell features I’d like to see in other languages

When I read Ben Hutchison’s OO/Imperative programmers: ‘Study Functional Programming or Be Ignorant’, I knew I had too much to say for the comments, so I figured I’d put in my 2 cents here.

Haskell is my go-to language, both for scripting, and for getting work done. This is not because of any particular allegiance to the language. Haskell and I have an open relationship, and the moment I find a language that out-Haskells Haskell, you can be sure I’ll move on.

Here I want to describe my favorite things about Haskell. You’ll note that they are all about the type-system. I don’t feel too strongly one way or the other about laziness, or about monads (though I won’t give them up without first finding something to take their place). I don’t even particularly care that it’s a functional language, inasmuch as I can have these features in a non-functional environment.

Some of these features are already available elsewhere. This is wonderful! If you know of any examples of this, please tell me in the comments.

This is a list of my favorite things:

Separation of class and data definitions.

Haskell’s notion of classes is more like Java’s notion of interfaces. A class is a list of function prototypes, and any data type for which such functions can be defined is an instance of that class. One does not inherit a parent class, but rather, one implements a class. It’s a weird distinction if you haven’t seen it before, but after I learned how to use it, I must say I prefer it.

The first example most people see is the Show class. Here is how it’s defined (to get this listing, I just asked ghci — the interactive GHC prompt — to give me the definition):

Prelude> :info Show
class Show a where
  showsPrec :: Int -> a -> ShowS
  show :: a -> String
  showList :: [a] -> ShowS
  	-- Defined in GHC.Show


This says that any data type a which is an instance of Show provides functions with these signatures. (Edited: The first of these functions is used for implementing a Haskell idiom for fast string construction, while the last is related to a restriction in the unmodified Haskell 98 standard.)
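
As a tiny illustration of that idiom (my own example, not part of the Show class itself): ShowS is just String -> String, and building output as a chain of such functions avoids repeatedly appending with ++.

greet :: ShowS
greet = showString "pi is about " . shows (3.14 :: Double) . showString "!"

-- greet "" evaluates to "pi is about 3.14!"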

When I define a new datatype, I can either ask Haskell to derive a Show instance for me automatically, or I can specify one myself:

data Car = Car { make :: String, year :: Int } deriving Show
data Pet = Pet { name :: String, animal :: String, age :: Int }
instance Show Pet where
  show p = "My pet is named " ++ name p ++
           " and he is a " ++ animal p ++
           " and he is " ++ show (age p) ++ " years old."


I understand that Google’s Go language has this notion of class (they call it interface), and that Scala provides this as well (they call it trait).

Typed side-effects.

In Haskell, if a function wants to communicate with the environment, then the function’s type signature will document this fact. Want to print to the console? Open a socket? Read a file? Any of these actions will put your function into the IO monad, which is a red-flag to other programmers that the function communicates with the environment. When your application works with library code (and whose doesn’t?) this is a handy feature.
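
As a minimal illustration (the names here are made up for this post), compare a pure function with one that touches the file system; only the second one mentions IO in its type:

import Data.Char (toUpper)

-- A pure function: nothing in its type mentions IO, so it cannot
-- interact with the outside world.
shout :: String -> String
shout = map toUpper

-- Reading a file is an effect, and the type says so.
slurp :: FilePath -> IO String
slurp path = readFile path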

Haskell uses the monad design pattern as the underpinning of how it types side-effects. I don’t particularly care that it’s monads per se, I just like that there is something which statically documents which functions communicate with the environment.

Why is this useful? One huge answer is concurrency. If your function has side-effects, it is not obviously thread-safe. If it has no side-effects, it is thread-safe. The monad design pattern provides a way to define application-specific notions of side-effects, which allows you to dial in the granularity on this as much as is appropriate for your application.

With respect to typed side-effects, a common Haskell idiom is to break up your program into different layers of state. For instance, in a web framework, you might have a “user-input” layer which is read-only, and on top of that a “logging” layer, and on top of that your application-specific stuff. (Each of these layers is a monad, or more precisely, a monad transformer.) Haskell allows you to statically track which functions rely on which layers, which is a useful thing if you want to call a function and be certain that it won’t modify some data out from under you.
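
Here is a minimal sketch of such a layered stack built with the mtl library; the layer names and types are hypothetical rather than taken from any particular framework:

import Control.Monad.Reader (ReaderT, ask)
import Control.Monad.State  (StateT, modify)
import Control.Monad.Writer (WriterT, tell)

data UserInput = UserInput { queryString :: String }  -- read-only layer
data AppState  = AppState  { hitCount :: Int }        -- mutable layer

type App = StateT AppState (WriterT [String] (ReaderT UserInput IO))

handleRequest :: App ()
handleRequest = do
  input <- ask                                    -- consult the user-input layer
  tell ["handling " ++ queryString input]         -- write to the logging layer
  modify (\s -> s { hitCount = hitCount s + 1 })  -- update the application state

A helper whose type mentions only the lower layers (say, ReaderT UserInput IO a) is statically guaranteed not to write to the log or to the application state.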

If you’re new to Haskell and monads, in my humble opinion this idiom is the real reason to give a damn about monads. But that’s just my perspective.

(And it’s certainly not obvious from the beginning, but a lot of bugs can be eliminated this way.)

Type-safe macros.

No language is completely free from the occasional boilerplate. One way around this is to use macros.

In C, macros can be very tricky. The preprocessor takes every instance of a macro, replaces it with the corresponding text, and then hands the result off to the compiler. If it turns out that you used the macro incorrectly, the compiler isn’t really there to help you out: after all, C macros are just find-and-replace.

Haskell’s macro system is called Template Haskell. Macros written in Template Haskell are actually written in Haskell syntax. The compiler then takes this code, compiles it like it would any other Haskell, and then uses it to expand your usage of the macros. Everything is typed the whole way through, and if there are errors, the compiler can tell you where they are and why (with its usual level of precision, for better or worse).

When I recently ran into a scenario where (for some very long-winded reason) I had to define 20 essentially-identical datatypes, then give them all essentially the same class instances, I was able to quickly whip up some Template Haskell to do all the lifting for me. When I realized I needed to modify those class instances, it was as simple as modifying the Template Haskell that was generating them.
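
I can’t reproduce that code here, but as a stripped-down sketch of what this kind of boilerplate generation looks like (the module and names below are invented for this post), here is a splice that generates twenty near-identical top-level definitions:

{-# LANGUAGE TemplateHaskell #-}
module Gen where

import Control.Monad (forM)
import Language.Haskell.TH

-- Generates constant0 = 0, constant1 = 1, ..., constant19 = 19.
-- Everything is built from typed values (Name, Exp, Dec), so a mistake
-- here is a type error in this module, not garbage in the expansion.
mkConstants :: Q [Dec]
mkConstants = forM [0 .. 19 :: Int] $ \i -> do
  body <- [| i |]
  return (FunD (mkName ("constant" ++ show i)) [Clause [] (NormalB body) []])

Splicing $(mkConstants) into another module expands to the twenty definitions, and changing all of them later means editing mkConstants in one place.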

This is how macro-ing should be. Instead of a deal with the devil, it should be safe enough to be accepted practice.

Quasi-quoting.

This is one of the many fine ways to embed a language in Haskell. Here’s a typical use case: you’re writing a library and the most natural way for a developer to specify some options is in a simple configuration language. You could implement a function String -> MyLibOptions, but if they have any typos in their configuration string, you won’t be able to catch them until run-time. If the configuration isn’t known until run-time that’s fine, but if the configuration is known at compile-time, you’d like the error to be caught at compile-time. (I need to mention that quasi-quoting is able to mix run-time and compile-time data — I’m just simplifying things to describe this use case.)

Quasi-quoting to the rescue. I recently gave an example of Haskell’s quasi-quoting abilities in a post about how it can be used to provide an injection-proof form of string interpolation (via Interpolique). One of my favorite applications is Michael Snoyman’s Hamlet, a type-safe HTML generation library.

(If you’d like to see what it looks like to implement a quasi-quoter in Haskell, I’ve got some code up on github that demonstrates this in the case of string interpolation, as mentioned above.)

Quasi-quoting is basically syntactic sugar for Template Haskell. Consequently your quasi-quoters are able to reach into the environment and interact with the rest of the code (all in a type-safe, purely functional way, of course). In the string interpolation example above, for instance, the code

author = "broker"
content = "' or 1=1;"

query = [$interpolique| insert into posts values(^^author , ^^content ); |]


sets query equal to the following

*Test> query 
InterpoliquedString
    " insert into posts values(b64d(\"YnJva2Vy\"), b64d(\"JyBvciAxPTE7\")); "


which was generated by inspecting the values of author and content at run-time, encoding them in base64, and then interpolating them into the result you see here. The fact that author and content were strings was determined at compile-time, so there wasn’t any chance of any shenanigans when the code actually executed.

For instance, if I instead had the code

author = 2 :: Int
content = "' or 1=1;"

query = [$interpolique| insert into posts values(^^author , ^^content ); |]


I’d get a compile-time error:

Test.hs:9:23:
    Couldn't match expected type `String' against inferred type `Int'
    In the first argument of `InterpoliqueQQ.b64enc', namely `author'


which I think is pretty cool.

Type families and associated types.

I must admit that I only use the “associated types” half of this, although the feature is slightly more general. Anyway, I’ll describe the part that I use.

Type families give you a way to compute which type you want to use. Yes, sounds weird, but it’s amazing.

A typical first example of this is the associative map. Every modern language has these: it is just an array where the lookup key doesn’t need to be an Int (think HashMap and the like).

In Haskell this can be described like so:

{-# LANGUAGE TypeFamilies, FlexibleInstances #-}

class GenericMap a where
  type Key a
  type Value a
  get :: a -> Key a -> Value a
  set :: a -> Key a -> Value a -> a


The first two parts of this class definition are the so-called “associated types.” The easiest way to see this in use is with an example of what an instance might look like. Here I’ll do something crazy and define the function type String -> Int as an instance of this class (the Haskell Wiki article on type families has other examples, some of which you might find more conventional):

instance GenericMap (String -> Int) where
  type Key (String -> Int) = String
  type Value (String -> Int) = Int
  get f k = f k
  set f k v' = \k' -> if k == k' then v' else f k'


This instance works:

sampleMap :: String -> Int
sampleMap s = length s

sampleMap' = set sampleMap "foo" 4

...

*Main> get sampleMap "monkey"
6
*Main> get sampleMap "foo"
3
*Main> get sampleMap' "foo"
4
*Main> get sampleMap' "bar"
3
*Main> get sampleMap' "monkey"
6


which is all well and good.

Now, I haven’t yet given any reasons why this type families business is any good. The answer has to do with polymorphism: sometimes you want to write a function whose type signature is so damned flexible you just can’t figure out how to write it. You try a few examples, but each is too restrictive. But there’s a pattern to it. If you’re in this boat, type families can help.

I’d give an example of this, except I already did in polymorphic first class labels. (Which, by the way, is another feature I’d like to see in other languages.)

Another application of type families is type-level programming (functional dependencies can also be used for this, but as type families get better, my interest in seeing functional dependencies in other languages will dwindle). Type-level programming is an insane idea where you do computation in the type system at compile-time.

This actually can be helpful in situations where you have really complicated properties you want to express statically about your program. For instance, I had a situation where certain types had a “size” associated with them. I had functions that were polymorphic over arguments of a given size. Some of these functions would yield a new type that was twice the size of the input.

How do you express that statically, if the goal is to still be polymorphic? Type families can do it. I wrote a class whose sole job was to use type families to do arithmetic in the freaking type system. It looked roughly like this:

class HasSize a where
  type Size a

class Doubler a where
  type Double a

...

-- The ~ operator asserts type equality, so this next
-- line basically reads "the size of b is `Double' the
-- size of a."
someFunction :: ( Size b ~ Double (Size a) ) => a -> b
someFunction = ...

I would not describe it as pretty, but it solved my problem, and it gave me a compile-time guarantee that an important design invariant was being met. The syntax is easy to read as well. And if it looks like I’m applying functions to types, it’s because I am.
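
To make the idea concrete, here is a minimal self-contained variant of the same trick (every name in it is made up for this post): widths are tracked as phantom types, and expand promises, in its type, that its output is twice as wide as its input.

{-# LANGUAGE TypeFamilies, EmptyDataDecls #-}

data N64    -- type-level size tags
data N128

type family Twice n
type instance Twice N64 = N128

newtype Key n = Key [Bool]   -- a key whose width is the phantom type n

expand :: Key n -> Key (Twice n)
expand (Key bits) = Key (bits ++ bits)

smallKey :: Key N64
smallKey = Key (replicate 64 False)

bigKey :: Key N128
bigKey = expand smallKey     -- type-checks because Twice N64 = N128

If expand were used somewhere a Key N64 is expected, or the Twice instance were missing, the mistake would surface at compile-time.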

Rank-2 types.

You don’t often see these on lists of great things about Haskell, but I love them. To say that a type is “rank-2” is basically a statement about just how polymorphic it is. I use this feature in two different ways: the first is to solve a polymorphism problem, the second is to prevent tainted data from leaking into places it doesn’t belong (I’m in love with this second application and I have no clue how to do it statically in any other language — tell me in the comments if you do!).

Here is an example of how I use it to get some extra polymorphism:

{-# LANGUAGE Rank2Types #-}

useFoo :: (forall a . a -> [a]) -> b -> ([b], [String])
useFoo f b = ( f b, f "bar" )

This function takes two arguments (a function and a value of some other type) and uses them to build a tuple by applying that function twice. The forall asserts that the function we are given must work for any type a, which is why we can apply it to the mystical input of type b or to an ordinary String.

If I were to rewrite this function without the forall I'd get a type error (two type errors, actually):

useFoo1 :: (a -> [a]) -> b -> ([b], [String])
useFoo1 f a = ( f a, f "bar" )


gives me

temp.hs:14:18:
    Couldn't match expected type `[Char]' against inferred type `b'
      `b' is a rigid type variable bound by
          the type signature for `useFoo1' at temp.hs:13:25
    In the first argument of `f', namely `a'
    In the expression: f a
    In the expression: (f a, f "bar")

temp.hs:14:23:
    Couldn't match expected type `a' against inferred type `[Char]'
      `a' is a rigid type variable bound by
          the type signature for `useFoo1' at temp.hs:13:12
    In the first argument of `f', namely `"bar"'
    In the expression: f "bar"
    In the expression: (f a, f "bar")

Absent the forall, the type checker assumes that the function I'm providing works for some single type a, and attempts to determine just which type that happens to be. That is, the compiler allows me to be a little ambiguous with my type signature, figuring that there is a particular type I have in mind and that type inference will work out what it is. But then I try to use the function at two different types (b and String), and the compiler is quite upset. (In fact, the way I've written the signature for useFoo1, Haskell cannot assume that a, b, and String are the same type, and that is exactly what the errors above are telling me: the rigid a can be matched with neither b nor [Char].)

While this application is nice, as I alluded to above, in my mind the killer application is tracking tainted data. Here are two common scenarios where this is something you want to do:

  • You have some function which accepts untrusted user input, and you want to be certain that whatever value it returns has been scrubbed clean. This is handy for a function like, say, useUserInputToBuildSQLQuery. (There are many other ways to solve this problem, of course; see the sketch after this list for one of them.)
  • You have a function which allocates some resources, uses them, then frees them, and you want to make sure it doesn't return a dangling handle. (I'm not aware of another way of solving this problem, and again would appreciate any comments with other ideas.)
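
Here is a minimal sketch of the first scenario (every name in it is hypothetical): untrusted input is tagged with a phantom type s, and because the only way to consume it is through a rank-2 function, the tainted value itself can never leak out; only scrubbed Strings can.

{-# LANGUAGE Rank2Types #-}

newtype Tainted s = Tainted String   -- constructor would not be exported

-- Stand-in for real escaping.
scrub :: Tainted s -> String
scrub (Tainted raw) = filter (/= '\'') raw

-- The forall means the continuation's result type cannot mention s,
-- so it cannot smuggle the Tainted value out.
withUserInput :: String -> (forall s. Tainted s -> a) -> a
withUserInput raw k = k (Tainted raw)

buildQuery :: String -> String
buildQuery input =
  withUserInput input $ \name ->
    "select * from users where name = '" ++ scrub name ++ "'"

Attempting withUserInput input id, which would hand the tainted value straight back, is rejected with the same kind of "quantified type variable escapes" error shown for runST below.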

The best example of that second scenario is Haskell's ST monad. Code that runs in the ST monad is able to create mutable variables. If you have a function written in the ST monad, you can execute it using the runST function, whose signature is

Prelude> :m +Control.Monad.ST
Prelude Control.Monad.ST> :t runST
runST :: (forall s. ST s a) -> a


The key to how this works is the forall in the signature of runST. In essence, it is preventing code in the ST monad from returning one of these mutable variables. So the following code works:

{-# LANGUAGE Rank2Types #-}

import Control.Monad.ST
import Data.STRef

exampleST :: ST s Int
exampleST =
     do myMutableVar <- newSTRef 0
        modifySTRef myMutableVar (\n -> n+1)
        n <- readSTRef myMutableVar
        return n

...

*Main> runST exampleST 
1


but the following code does not:

{-# LANGUAGE Rank2Types #-}

import Control.Monad.ST
import Data.STRef

exampleST1 :: ST s (STRef s Int)
exampleST1 =
     do myMutableVar <- newSTRef 0
        modifySTRef myMutableVar (\n -> n+1)
        return myMutableVar

...

*Main> runST exampleST1

:1:0:
    Inferred type is less polymorphic than expected
      Quantified type variable `s' escapes
    In the first argument of `runST', namely `exampleST1'
    In the expression: runST exampleST1
    In the definition of `it': it = runST exampleST1

People who read my blog will not be surprised when I mention that Oleg Kiselyov and Chung-chieh Shan have shown that this approach can be used to implement region-based resource management with good granularity. (This is a paper I've been bringing up a lot recently, as it is the underpinning of memory management in Potential.)

Conclusion.

Haskell has a reputation for being hard to learn, though I feel this reputation is a bit dated now that we have good resources like Learn You a Haskell for Great Good and Real World Haskell. Certainly one of the hardest parts about learning Haskell is that so many of the examples of "good Haskell" that we hold up rely on many of the features I mentioned above, and most of them seem foreign to new Haskellers. That's hard to avoid: Haskell is, after all, a research testbed.

I don't know if you'll feel the same way as I do, but after gaining some experience with using these tools in my own code, it is frustrating to leave them behind when working in other languages. Every language has its seed of grace, elegance, and brilliance that, if it gets into you and grows, will make you into a zealot. I feel that these are Haskell's seeds.


Comments

  1. kl (30 June, 2010)

    Statically typed side-effects are in D — called pure functions.

  2. josh (30 June, 2010)

    A lot of these features already are in other languages.

  3. dagit (30 June, 2010)

    @josh, Could you please provide examples?

    This is a great article about why Haskell’s type system is very handy. I’m not sure if people who don’t already see the value of static typing will get it. It seems like it might be easy to read this article and draw the conclusion that static typing forces us to create esoteric solutions to the problems it creates. Of course, as a Haskeller who has used types to great success, I know what you’re getting at. But, would a programmer who only knows PHP get why static types are so important?

    Perhaps that’s the topic for other articles.

  4. Nice blog post. A nice example to go under your “Typed side-effects” is definitely STM. Limiting effects to only STM and not IO, in order to attain ACID properties, is brilliant. Clojure has STM, but it has no way to disallow non-transactional code within a transaction.

  5. SJS (30 June, 2010)

    Typo alert!

    “Conclusiuon” isn’t a word.

  6. Rúnar (1 July, 2010)

    Here’s how to get rank-2 types in Scala, and apply your trick:

    type Id[A] = A

    trait ~>[F[_], G[_]] { def apply[A](a: F[A]): G[A] }

    def useFoo[B](f: Id ~> List, b: B): (List[B], List[String]) = (f(b), f("bar"))

  7. Hi,

    > that Scala provides this as well (they call it trait).

    Traits are not really typeclasses in Scala, but typeclasses can be emulated by using other constructs: http://bit.ly/cGkjXa

    Eric.

  8. Rank-2 types are like generic (free) types in F#. The following example should be self-explanatory:

    > let afunc = (fun a -> if a.GetType().Equals(typeof<int>) then printfn "int" else printfn "non-int");;
    val afunc : 'a -> unit
    > afunc 1;;
    int
    val it : unit = ()
    > afunc "s";;
    non-int
    val it : unit = ()

    Quasi-quoting reminds me of F# Quotations. Abstract Syntax Trees can be generated by simply enclosing an expression in angle brackets and "at" signs, i.e.

    > <@ 1 @>;;
    val it : Quotations.Expr =
    Value (1)
    {CustomAttributes = [NewTuple (Value ("DebugRange"),
    NewTuple (Value ("stdin"), Value (2), Value (3), Value (2), Value (4))
    )];
    Raw = …;
    Type = System.Int32;}

    Transforming the AST’s into a Domain-Specific Language can be accomplished by using the SpecificCall Active Pattern from the Microsoft.FSharp.Quotations.DerivedPatterns namespace.

    Typed side effects–you can create application-specific notions of side effects in F# with computation expressions which are also useful for concurrency as they are the underpinning for asynchronous workflows. More specifically, you may explicitly define the Bind() member method in the Microsoft.FSharp.Control.AsyncBuilder class such that execution of a let! expression corresponds to the custom Bind() method definition, e.g.

    {| let! pattern = expr in cexpr |}
    // Translates to…
    builder.Bind(expr, (fun pattern -> {| cexpr |}))

    (* Although it’s not technically a side effect since Bind applies to immutable variables
    (F# variables are immutable by default unless declared with the mutable keyword,) the CLR allows you to cheat a little–either through boxing or by completely reinitializing the variable. Observe: *)

    let anImmutableVar = 1
    let mutable aMutableVar = 2
    aMutableVar <- 3
    let aBoxedVar = ref "hello"
    aBoxedVar := "world"
    let anImmutableVar = 4

    /* Separation of class/data definitions (i.e. Java interfaces vs. Haskell classes) is quite similar to the concept of abstract class definitions in C# */

    public abstract class AbstractClass {
    public void Show(String s){Console.WriteLine(s);}
    }
    public class Class : AbstractClass {
    public int i;
    public int k;
    public Class(){i=1;k=2;}
    }
    void Main(){
    Class c = new Class();
    c.Show((c.i+c.k).ToString());
    }

    F# has the [] attribute for declaring such data. Note that C# can define conventional interfaces as well and features some out-of-the-box such as IDisposable and IEnumerable.

    I don’t know Haskell at all so I’m sure the Sapir-Whorf hypothesis applies here. I was unable to grasp the idea of polymorphic first class labels, but I did my best to interpret the remainder of the programming paradigms you’ve outlined. Please comment if I’ve made any glaring mistakes.

  9. @duper Rank-2 types are *not* the same as generic types in F#. F# generic types are analogous to polymorphic types in Haskell. F# does not have Rank-2 types.

    F# also does not have typed side-effects. Any F# function may cause side-effects and the type of the function does not reflect this.

    CLR interfaces are similar to type classes but type classes are more powerful in both obvious and subtle ways. One obvious benefit of type classes is that the class can provide an implementation of one or more functions, but you are not bound to ‘single-inheritance’. In other words, they combine the best aspects of interfaces and abstract classes. Also, type class instances for data types can be defined separately from the underlying data type itself. Imagine being able to define your own interface and then define how a built-in type implements it.
