Thursday, January 29, 2009

F#: Real Sharp

Eric Lippert's recent post on C#'s inability to infer the type of fields in classes is the perfect opportunity to contrast C# and F#.  As Erik points out you can't do this in C#:

class Customer { const age = 20; private var name = string.Empty; }

It seems that Eric has been getting a little flack from C# developers who want smarter type inference in their language.  I find this amusing as type inference was perhaps the most controversial of all the features ever added to C#.  No doubt some of the controversy was due to misinterpretations of var's meaning, with many believing it was a way of adding dynamic typing to C#.  However there were a great many developers that were genuinely concerned that the availability of the "var" keyword would allow their peers to write inscrutable code.

"I'm going to take a shower...in the bathroom."

Fast forward a few years and it's clear that the sky hasn't fallen.  In retrospect the benefits of type inference are obvious.  In everyday communication when tend to leave out information that can be readily inferred based on context.  For example we don't often find ourselves saying things like this:

"Sure I've read 'War and Peace'...the book."

Such pronouncements would no doubt be considered by others to be bizarre at best, and mildly condescending at worst.  Of course if there is the potential for misinterpretation we add supplementary information:

"Sure I've read 'War and Peace'...the comic."

Static typing was originally introduced to help compilers verify code correctness, not to enhance code clarity for developers.  As opposed to clarifying code, repetitive type information tends to obscure its meaning by lowering the signal-to-noise ratio.   Quick, what does the following code do?

Enumerable.Range(0, 365) .Select<int, DateTime>(new Func<int, DateTime>(day => new DateTime(2000, 1, 1).AddDays(day))) .Where<DateTime>(new Func<DateTime, bool>(x => x.DayOfWeek == DayOfWeek.Friday && x.Day == 13)) .GroupBy<DateTime, int>(new Func<DateTime, int>(date => date.Month / 4));

Well obviously it takes all the days in 2000, finds all the friday the 13ths, and groups them by quarter.  You might think that no programming language would impose such a heavy burden on its developers but Java 6 does.  Although it is technically possible to write Linq-style code in Java 6, the language's inability to do type inference forces you to write something like the code above.  Despite the fact that replacing a loop with a Linq query adds several technical advantages (no state bugs, can be parallelised, etc) it would be irresponsible to do so.  The problem is that the decreased code clarity of the superior solution outweighs its advantages.  We must never forget that code is for human beings to read first and foremost.  This example shows that the real benefit of type inference is not that it saves us keystrokes, but that it enables us to express new, more complex abstractions.

The Best of All Worlds

Imagine if you could remove 90% of the type declarations in your code, leaving almost nothing but type conversions, constructor calls, and logic.  What if your code could be as concise and readable as Python code but as fast as C# code?

Let's take a look at a simple example.

public static IEnumerable<T> QuickSort<T>(this IEnumerable<T> list) where T : IComparable { if (!list.Any()) { return Enumerable.Empty<T>(); } var pivot = list.First(); var smaller = list.Skip(1).Where(item => item.CompareTo(pivot) <= 0).QuickSort(); var larger = list.Skip(1).Where(item => item.CompareTo(pivot) > 0).QuickSort(); return smaller.Concat(new[] { pivot }).Concat(larger); }

This is a simple recursive quick sort routine written in C#.  Now let's take a look at the same algorithm in F#:

let rec quicksort list = match list with | [] -> [] | x::xs -> quicksort [for item in xs do if (compare item x) <= 0 then yield item] @ [x] @ quicksort [for item in xs do if (compare item x) > 0 then yield item]

To ensure a fair comparison I'll explain the syntax a little bit.  F# is whitespace-aware which removes the need for all those brackets.  Just adding an new level of indent is enough to signal a new block to the compiler.  The "rec" modifier indicates to the compiler that this function is recursive. 

The first operation is a pattern match.  F# allows you to match objects against literal declarations which makes for extremely readable code. If the list is an empty list ([]) then an empty list is returned. The "x::xs" is a pattern that means "the first item in the list as x, and the rest of the items as xs".  This makes sense when you learn that in F# you can create a list like this:

let myNewList = 1::[2; 3; 4] // this creates a new list [1; 2; 3; 4]

The "@" operator is a list concatenation operator and the list comprehensions should be straightforward to anyone familiar with Linq. 

Now that you can read this code and understand what it is doing there are a few things to notice here:

1. There are no type declarations whatsoever

F# can tell that the list parameter is a list based on the fact that an attempt is made to match it against an empty list.  Neat trick huh?

2.  The return type of the method is inferred.

Why wouldn't it be?

3. There are no type constraints specifying that the type of list must inherit from IComparable.

In F# all methods are generic by default.  If you pass a parameter to another function then F# tries implicitly adds a constraint that the parameter's type must inherit from the type of that function's argument.  Sometimes F# will not have enough information to infer the type of a method parameter in which case you can just specify the type:

let getChars (str:String) = str.Chars

However if you pass the method parameter to another function F# can usually infer what its type is:

let getChars str = str + "a"

Beautiful Code

In F# type declarations are the exception, not the rule.  You will find that F# does a surprisingly good job of inferring the type of your variables.  In fact I have written a few non-trivial data structures in F# without specifying any type information whatsoever.  I was concerned at first that the lack of explicit type declarations would make my code more difficult to follow but the opposite has proven to be true.  F# code is much easier for me to understand than C# code.  In fact, it's not even close.

It's not just the lower signal-to-noise ratio either.  Although type inference is an integral feature, pattern matching and recursion also force code into a predictable, declarative structure.  All of F#'s idioms work in harmony to create beautiful code and that's why I love it.

P.S. It also has no problem whatsoever with this :-) :

type Customer() = let mutable name = String.Empty let age = 20

18 comments:

Anonymous said...

I really like topic of your post. F# really is a beautiful language.

I hope that you'll consider cleaning up the whitespace in your post though. The large gaps make it a poor representation of F#'s beauty.

Anonymous said...

FWIW, there's a bug in your C# QuickSort. When calculating smaller you need to skip the pivot (using Enumerable.Skip), otherwise there's an infinite recursion.

Jafar Husain said...

Thanks Anonymous,

Fixed.

Anonymous said...

Nice post!

I think a slightly more readable implementation of quicksort is:

let rec quicksort = function
| head :: tail ->
(tail |> List.filter (fun item -> item <= head) |> quicksort)
@ [head]
@ (tail |> List.filter (fun item -> item > head) |> quicksort)
| _ -> []

Besides, may be (just may be) it is more true to F# design idea.

Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Affordable Luxurious Wedding Dress Blog said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...

I have to take issue with your statement that "We must never forget that code is for human beings to read first and foremost". The first and foremost purpose of code is to do what it's supposed to do. Otherwise, the best code possible would always consist of just one comment:

// Achieve purpose

A bubble sort may be inherently more readable than a quicksort (I realize that this is debatable, but it's just for illustration), but if it makes your user wait 10 minutes longer than they're comfortable then you write the more complex, less readable quicksort because the first purpose is to achieve reasonable performance. After you've achieved that, the next most important goal is readability.

About Me

My photo
I'm a software developer who started programming at age 16 and never saw any reason to stop. I'm working on the Presentation Platform Controls team at Microsoft. My primary interests are functional programming, and Rich Internet Applications.