Wednesday, April 11, 2007

Haskell for C# 3 Programmers

I've been occupying myself as best I can, waiting impatiently for .NET 3.5 to come out. In the meantime I've been playing with Lisp and Haskell, two of the languages with the most cache among the smart folks who hang out at Lambda the Ultimate. Well it’s happened: I've fallen head over heels for Haskell. It's just so beautiful. It is a pure functional language and is therefore limited in the kinds of operations that can be performed. As a result a little bit of well-thought out syntax to support these operations goes a long way towards improving readability.

It occurs to me that it is worthwhile to introduce C# users to Haskell because many of the improvements coming in C# 3.0 (and many of those introduced in C# 2.0) are actually inspired by Haskell. Learning Haskell is a great way of training yourself to think functionally so you are ready to take full advantage of C# 3.0 when it comes out. There are a variety of Haskell tutorials available on the internet so I will just focus on some similarities between C# and Haskell with the aim of deepening your understanding of both.

Type Inference

Haskell and C# 3.0 both have type inference. In C# 3.0 you write this:
var number = 3;

In Haskell you write this:

number = 3

Haskell’s type inference is a quite a bit smarter than C#'s, but I’ll give you examples of that later.

Lazy evaluation and Streams

In C# I can create a stream of all natural numbers like this:

IEnumerable<int> NaturalNumbers()
{
var i = 0;
while (true){
yield return i;
i++;
}
}

The reason that this loop does not continue forever is that the algorithm stops as soon as a yield command is reached and resumes where it left off when it is started again. In Haskell you can duplicate this behavior by using the colon operator to construct streams.

nat :: [a]
nat = 0 : map (\x -> x +1) nat

This function specifies that the first item in the stream is 0 and then the next result is a list derived from chasing the function's tail and applying a lambda to each item. The map function is equivalent to select in C# 3.0. The first item will be 0, the next will be 1 because the previous was 0, and the next will be 2 because the previous was 1, and so on. The expression on the left side of the colon is an item and the right side is a list. You can build lists like this in Haskell:

0: 1: 2 : 3: 4 : [] -- [] is an empty list in Haskell

This syntax is useful because it signals that the algorithm can stop as soon as it gets a value that it needs, just like yield. It is also very useful to express lists using this colon separated syntax when you are creating recursive list processing functions but I’ll explain that later. For now just make sure you can recognize this syntax as the construction of a list.

Functions, Functions, Functions

It probably won’t come as a shock to you that functional programming relies heavily on functions. Consequently Haskell provides all sorts of useful syntax for manipulating them. The thing that caught me off guard when I first tried to learn Haskell was the function signatures. Look at this example of a function that adds two integers:
add :: Integer -> Integer -> Integer
add x y = x + y

Does this seem odd to you? You might have expected to see something like this:
add :: (Integer, Integer) -> Integer
add (x,y) = x + y


In fact the definition above will work, but it's not really the preferred way of defining functions in Haskell. If you read the first function signature left to right it would say: “add is a function that accepts an integer, which returns a function that accepts an integer, which returns an integer.” One of Haskell's elegant features is that you can take a function that accepts n arguments and convert it into a function that accepts n-1 arguments just by specifying one argument. This is known as partial application. For example given the first add definition I can do this:
addone = add 1 -- returns function that accepts 1 argument

addone 5 -- this returns 6

In Haskell all functions accept a maximum of one argument and return one value. The value returned with be another function with one less parameter until all the arguments have been specified in which case the output value is returned. As I’ll explain later this behavior is very useful when writing certain types of recursive functions.

Recursion

Pure functional languages have no loops. Instead they rely on recursion to repeat operations. Recursive functions have a terminating case and an inductive case. As a result you often see code like this (C#):

public int Add(int x, int y)
{
if (x == 0) // terminating case
return y;
else
return Add(x-1,y+1); // inductive case
}

Haskell allows you to specify the terminating case in a different definition than the inductive case. This is called pattern matching and results in a much more declarative style of coding:


{-- function signature (takes two Nums, returns a Num) --}

add :: Num a => a -> a -> a
add 0 y = y -- terminating case
add x y = (add (x-1) (y+1)) -- inductive case

Take a look at the add function’s signature. add is a generic function. Pretend that "a" is "T" and “Num a =>” is "where T : Num" and it will be a lot clearer. In fact our add function will work on all numeric data types that implement the Num interface (Integer, Double, etc). This doesn’t work in C# because you can’t include operator definitions in an interface. In Haskell, operators are just like any other function and are not bound to a particular class. They just use a symbol instead of a proper identifier. This allows you to include them in interface definitions and thus use them in generic functions.

Now some of you might be asking “Isn’t all this recursion awfully expensive?” No, because Haskell and many other functional languages are equipped to recognize a particular kind of recursion known as tail recursion. When Haskell recognizes this type of recursion it translates it into a simple loop so that the stack doesn’t grow. For instance our C# version of Add above could be translated to the following by a smart compiler:

public int Add(int x, int y)
{
do {
if (x == 0) // terminating case
return y;
else
{
x = x-1;
y = y+1;
}
} while(true);
}

So how do you tell when a function is tail recursive or not? It’s easy. If a function ends with a call to the same function it is tail recursive. The add function defined above is tail recursive because at the end of function add is called again. Some functions are not tail recursive which means they do grow the stack. Take the following flatten function which takes a list of lists and flattens it into a list:


flatten :: [[a]] -> [a]
flatten [] = [] –- if list is empty, return empty list
flatten (y:ys) = y ++ (flatten ys)



Notice that the last operation to execute is the list concatenation (++) function. This means that the concatenation function must wait for a result from the recursive call to flatten before it can finish its computation. As a result flatten will consume both time and stack space because it will have to keep track of where we are in each previous recursive call.

The good news is that we can take many recursive functions and convert them into tail recursive functions using a straightforward technique. We introduce an accumulator function that adds a new parameter and uses it like a variable to store the computation done so far.

flattenAcc :: [a] -> [[a]] -> [a]
flattenAcc acc [] = acc
flattenAcc acc (y:ys) = flattenAcc (acc ++ y) ys

flatten = flattenAcc [] –-partial application

I want you to notice that we are using the colon list syntax in the inductive case definition of flattenAcc. This is a great example of pattern matching. We are effectively saying that in this version of flattenAcc y represents the first item in the list passed as the second parameter and ys represents the rest of that list. Haskell uses its own syntax in its pattern matching expressions resulting in very readable code.

In our revised version the last procedure called in flattenAcc is flattenAcc. This makes our function tail recursive. In order to accomplish this flattenAcc effectively stores the state of the flattened list so far in the newly introduced "acc" function parameter. Remember that when a tail recursive function get translated into a loop the function parameters act like variables. Note that we don’t need to specify a type signature for flatten because Haskell infers it.

*Note: Derek Elkins and Cale Gibbard point out in the comments that tail recursive functions are not always preferable to recursive ones because of Haskell's lazy evaluation. If you can write a non-recursive function that uses the colon syntax it is probably better than a tail recursive one that doesn't. This is because Haskell's lazy evaluation enabled you to use the non-tail recursive version on an infinite stream without getting a stack overflow.
Now you’re familiar with the basic concepts of functional programming. Next time we’ll look at features that inspired C# query comprehensions and nullable types.

110 comments:

  1. nat = 0 : map (\x -> x +1) nat

    can be shortened to

    nat = 0 : map (+1) nat

    ReplyDelete
  2. "Unfortunately", laziness "gets in the way". While transforming non-tail-recursive code to a tail-recursive form is important and useful for functional programming in general, dealing with laziness requires a little more care, and often "non-tail-recursive" versions are preferrable. flatten is an example of this, the first version is better in many ways. While I don't believe it happens in this case, oftentimes naively writing code "tail-recursively" in Haskell will actually -make- it overflow the stack. Another (actual) benefit of the first version of flatten is that it will work on infinite lists. http://www.haskell.org/hawiki/StackOverflow gives a simple example and some explanation.

    ReplyDelete
  3. Echoing Derek's comment, you actually usually don't want tail-recursion in Haskell, but rather functions which produce values as quickly as possible, and include their recursive calls as parts of the data structure returned. This means that fewer recursive calls will have to be made, and generally less work will need to be done in order to determine the initial parts of the data structure, if that's all that's needed. Where you really do want tail recursion, you also probably want strictness (though quite often the compiler will be able to tell this, there are cases where it won't be able to, and you'll need to use annotations to tell it you want strictness, or strict higher-order functions).

    Otherwise, the article is quite nice! It would be good if more of the industry programmers (and language designers!) would pay attention to languages like Haskell, as there are lots of cool ideas here.

    ReplyDelete
  4. You've inspired me to look further into Haskell. This will take up quite a bit of my free time. I hope you don't mind if I bill you for that.

    ReplyDelete
  5. I really appreciate these posts, they are very insightful.

    Here's a question regarding the style of the Haskell examples. When you provided the equivalent C#, you used an explicit name "NaturalNumbers", but in the Haskell only the abbreviated "nat". Is this just because that's the Haskell-ish way?

    I prefer explicit naming, as the abbreviated names in the other Haskell example cause one more layer of mental translation for me. (Will I be laughed at by all the serious Haskell guys?)

    ReplyDelete
  6. You will not be laughed at for using NaturalNumber in Haskell. At least not any more than for using integerNumber in C. :)
    I like the type name Natural myself.

    ReplyDelete
  7. anonymous: Thanks for pointing out that you can use (+1) as an abbreviation for the lambda. I was aware of this but the reason I expanded it was so that it would look more like the C# lambda x => x + 1.

    derek, cale: Thanks very much for pointing out the effect of laziness on tail recursive functions. I see now that non-recursive functions are more composable. That's great news actually. I find tail recursive functions a lot less intuitive. Now I can write recursive functions guilt free! I will change the article to focus less on tail recursion.

    christopher: The Haskell compiler forces all function identifiers to be three characters long. Kidding. I just didn't want the line to wrap :-).

    ReplyDelete
  8. public int Add(int x, int y)
    {
    if (x == 0) // terminating case
    return y;
    else
    return Add(x-1,y+1); // inductive case
    }

    If I ever saw code that bad I'd beat somebody about the head!
    Bad, bad example of recursion.

    ReplyDelete
  9. The code example was not intended as best practice (or even good practice). I was trying to demonstrate what was going in Haskell in C# code so that the people who found Haskell's syntax difficult could understand. Thanks for making it clear that I need to remove any ambiguity about this though.

    ReplyDelete
  10. Great article! Thanks.

    ReplyDelete
  11. Thanks for interesting article.

    ReplyDelete
  12. Thank You! Very interesting article. Do you can write anything else about it?

    ReplyDelete
  13. Very interesting site. Blog is very good. I am happy that I think the same!

    ReplyDelete
  14. Nice! Nice site! Good resources here. I will bookmark!

    ReplyDelete
  15. Excellent website. Good work. Very useful. I will bookmark!

    ReplyDelete
  16. I see first time your site guys. I like you :)

    ReplyDelete
  17. Viagra
    Generic Viagra
    Google
    Cialis
    -------------------------------------------------------
    Levitra
    Propecia
    Meridia
    Yahoo!
    -------------------------------------------------------
    Zocor
    Soma
    Prozac

    ReplyDelete
  18. sfxfk8 Your blog is great. Articles is interesting!

    ReplyDelete
  19. Indulge yourself and find partners for hot Sexual Encounters and Adult Dating at Adult friend finder free dating site!
    If you are looking for a one night stand or a casual encounter, then the Adult Swingers Club is where the game is played.
    If you practice a different range of sexual and sensual activities then adult personals has the Club for you.
    Download over 2000 adult dvd movies, available formats: windows, mpeg, psp and ipod!
    Shop for Adult Toys, DVDs and Lingerie and other Erotic Adult Products at adult sex toys shop and adult toys store.

    ReplyDelete
  20. gGo6X8 Wonderful blog.

    ReplyDelete
  21. actually, that's brilliant. Thank you. I'm going to pass that on to a couple of people.

    ReplyDelete
  22. actually, that's brilliant. Thank you. I'm going to pass that on to a couple of people.

    ReplyDelete
  23. H6OwZh You have a talant! Write more!

    ReplyDelete
  24. jy2LzY Please write anything else!

    ReplyDelete
  25. dp1tEt Nice Article.

    ReplyDelete
  26. Please write anything else!

    ReplyDelete
  27. Ls2KzT The best blog you have!

    ReplyDelete
  28. TLlqi4 actually, that's brilliant. Thank you. I'm going to pass that on to a couple of people.

    ReplyDelete
  29. actually, that's brilliant. Thank you. I'm going to pass that on to a couple of people.

    ReplyDelete
  30. actually, that's brilliant. Thank you. I'm going to pass that on to a couple of people.

    ReplyDelete
  31. Please write anything else!

    ReplyDelete
  32. FKJcd9 write more, thanks.

    ReplyDelete
  33. Please write anything else!

    ReplyDelete
  34. actually, that's brilliant. Thank you. I'm going to pass that on to a couple of people.

    ReplyDelete
  35. actually, that's brilliant. Thank you. I'm going to pass that on to a couple of people.

    ReplyDelete
  36. What is a free gift ? Aren't all gifts free?

    ReplyDelete
  37. Build a watch in 179 easy steps - by C. Forsberg.

    ReplyDelete
  38. Build a watch in 179 easy steps - by C. Forsberg.

    ReplyDelete
  39. Lottery: A tax on people who are bad at math.

    ReplyDelete
  40. Suicidal twin kills sister by mistake!

    ReplyDelete
  41. When there's a will, I want to be in it.

    ReplyDelete
  42. Clap on! , Clap off! clap@#&$NO CARRIER

    ReplyDelete
  43. Suicidal twin kills sister by mistake!

    ReplyDelete
  44. The gene pool could use a little chlorine.

    ReplyDelete
  45. Build a watch in 179 easy steps - by C. Forsberg.

    ReplyDelete
  46. The gene pool could use a little chlorine.

    ReplyDelete
  47. Change is inevitable, except from a vending machine.

    ReplyDelete
  48. Friends help you move. Real friends help you move bodies.

    ReplyDelete
  49. Lottery: A tax on people who are bad at math.

    ReplyDelete
  50. C++ should have been called B

    ReplyDelete
  51. Beam me aboard, Scotty..... Sure. Will a 2x10 do?

    ReplyDelete
  52. I'm not a complete idiot, some parts are missing!

    ReplyDelete
  53. C++ should have been called B

    ReplyDelete
  54. What is a free gift ? Aren't all gifts free?

    ReplyDelete
  55. C++ should have been called B

    ReplyDelete
  56. What is a free gift ? Aren't all gifts free?

    ReplyDelete
  57. Oops. My brain just hit a bad sector.

    ReplyDelete
  58. Ever notice how fast Windows runs? Neither did I.

    ReplyDelete
  59. Suicidal twin kills sister by mistake!

    ReplyDelete
  60. The gene pool could use a little chlorine.

    ReplyDelete
  61. Give me ambiguity or give me something else.

    ReplyDelete
  62. 640K ought to be enough for anybody. - Bill Gates 81

    ReplyDelete
  63. Change is inevitable, except from a vending machine.

    ReplyDelete
  64. Oops. My brain just hit a bad sector.

    ReplyDelete
  65. Build a watch in 179 easy steps - by C. Forsberg.

    ReplyDelete
  66. A lot of people mistake a short memory for a clear conscience.

    ReplyDelete
  67. Give me ambiguity or give me something else.

    ReplyDelete
  68. 640K ought to be enough for anybody. - Bill Gates 81

    ReplyDelete
  69. Suicidal twin kills sister by mistake!

    ReplyDelete
  70. Suicidal twin kills sister by mistake!

    ReplyDelete
  71. I'm not a complete idiot, some parts are missing!

    ReplyDelete
  72. 640K ought to be enough for anybody. - Bill Gates 81

    ReplyDelete
  73. What is a free gift ? Aren't all gifts free?

    ReplyDelete
  74. Build a watch in 179 easy steps - by C. Forsberg.

    ReplyDelete
  75. Ever notice how fast Windows runs? Neither did I.

    ReplyDelete
  76. Give me ambiguity or give me something else.

    ReplyDelete
  77. Oops. My brain just hit a bad sector.

    ReplyDelete
  78. Build a watch in 179 easy steps - by C. Forsberg.

    ReplyDelete
  79. Meridia weight loss drug (sibutramine) is a weight loss aid, prescribed together with a larger plan of diet and exercise for people who need to lose 30 pounds or more, or for overweight people with additional factors risk (high cholesterol, hypertension).
    Proactol weight loss medication is a daily supplement clinically proven that easily help to reduce excess body weight and become attractive, slender person you have always wanted to be. Proactol will absorb up to 28% of fat in everything we eat.
    Regenon weight loss medication is used in the short term treatment of obesity. Reduce the effect their appetite tends to decrease after a couple of weeks. Because of this, these drugs are useful only during the first weeks of a program for weight loss. Regenon can help you lose weight while you are learning new ways to eat and exercise. Changes in eating habits and activity level should be developed and continued long-term in order to continue to lose weight and maintain the weight lost from returning.
    Tenuate weight loss medication decreases appetite. It is used on a short-term (a few weeks), in combination with diet, to help you lose weight. This drug is sometimes prescribed for other uses; Offices You can address your doctor or pharmacist for more information.
    Xenical Weight Loss medication a gastrointestinal lipase inhibitor used in the management of obesity in adult and adolescent patients age 12 and older. This medicine may be used during the weight loss phase or following weight loss to assist in weight management. This medicine works by inhibiting the digestion of fats from the diet and should be used with a reduced-calorie diet.

    ReplyDelete