Thursday, January 29, 2009

F#: Real Sharp

Eric Lippert's recent post on C#'s inability to infer the type of fields in classes is the perfect opportunity to contrast C# and F#.  As Erik points out you can't do this in C#:

class Customer { const age = 20; private var name = string.Empty; }

It seems that Eric has been getting a little flack from C# developers who want smarter type inference in their language.  I find this amusing as type inference was perhaps the most controversial of all the features ever added to C#.  No doubt some of the controversy was due to misinterpretations of var's meaning, with many believing it was a way of adding dynamic typing to C#.  However there were a great many developers that were genuinely concerned that the availability of the "var" keyword would allow their peers to write inscrutable code.

"I'm going to take a shower...in the bathroom."

Fast forward a few years and it's clear that the sky hasn't fallen.  In retrospect the benefits of type inference are obvious.  In everyday communication when tend to leave out information that can be readily inferred based on context.  For example we don't often find ourselves saying things like this:

"Sure I've read 'War and Peace'...the book."

Such pronouncements would no doubt be considered by others to be bizarre at best, and mildly condescending at worst.  Of course if there is the potential for misinterpretation we add supplementary information:

"Sure I've read 'War and Peace'...the comic."

Static typing was originally introduced to help compilers verify code correctness, not to enhance code clarity for developers.  As opposed to clarifying code, repetitive type information tends to obscure its meaning by lowering the signal-to-noise ratio.   Quick, what does the following code do?

Enumerable.Range(0, 365) .Select<int, DateTime>(new Func<int, DateTime>(day => new DateTime(2000, 1, 1).AddDays(day))) .Where<DateTime>(new Func<DateTime, bool>(x => x.DayOfWeek == DayOfWeek.Friday && x.Day == 13)) .GroupBy<DateTime, int>(new Func<DateTime, int>(date => date.Month / 4));

Well obviously it takes all the days in 2000, finds all the friday the 13ths, and groups them by quarter.  You might think that no programming language would impose such a heavy burden on its developers but Java 6 does.  Although it is technically possible to write Linq-style code in Java 6, the language's inability to do type inference forces you to write something like the code above.  Despite the fact that replacing a loop with a Linq query adds several technical advantages (no state bugs, can be parallelised, etc) it would be irresponsible to do so.  The problem is that the decreased code clarity of the superior solution outweighs its advantages.  We must never forget that code is for human beings to read first and foremost.  This example shows that the real benefit of type inference is not that it saves us keystrokes, but that it enables us to express new, more complex abstractions.

The Best of All Worlds

Imagine if you could remove 90% of the type declarations in your code, leaving almost nothing but type conversions, constructor calls, and logic.  What if your code could be as concise and readable as Python code but as fast as C# code?

Let's take a look at a simple example.

public static IEnumerable<T> QuickSort<T>(this IEnumerable<T> list) where T : IComparable { if (!list.Any()) { return Enumerable.Empty<T>(); } var pivot = list.First(); var smaller = list.Skip(1).Where(item => item.CompareTo(pivot) <= 0).QuickSort(); var larger = list.Skip(1).Where(item => item.CompareTo(pivot) > 0).QuickSort(); return smaller.Concat(new[] { pivot }).Concat(larger); }

This is a simple recursive quick sort routine written in C#.  Now let's take a look at the same algorithm in F#:

let rec quicksort list = match list with | [] -> [] | x::xs -> quicksort [for item in xs do if (compare item x) <= 0 then yield item] @ [x] @ quicksort [for item in xs do if (compare item x) > 0 then yield item]

To ensure a fair comparison I'll explain the syntax a little bit.  F# is whitespace-aware which removes the need for all those brackets.  Just adding an new level of indent is enough to signal a new block to the compiler.  The "rec" modifier indicates to the compiler that this function is recursive. 

The first operation is a pattern match.  F# allows you to match objects against literal declarations which makes for extremely readable code. If the list is an empty list ([]) then an empty list is returned. The "x::xs" is a pattern that means "the first item in the list as x, and the rest of the items as xs".  This makes sense when you learn that in F# you can create a list like this:

let myNewList = 1::[2; 3; 4] // this creates a new list [1; 2; 3; 4]

The "@" operator is a list concatenation operator and the list comprehensions should be straightforward to anyone familiar with Linq. 

Now that you can read this code and understand what it is doing there are a few things to notice here:

1. There are no type declarations whatsoever

F# can tell that the list parameter is a list based on the fact that an attempt is made to match it against an empty list.  Neat trick huh?

2.  The return type of the method is inferred.

Why wouldn't it be?

3. There are no type constraints specifying that the type of list must inherit from IComparable.

In F# all methods are generic by default.  If you pass a parameter to another function then F# tries implicitly adds a constraint that the parameter's type must inherit from the type of that function's argument.  Sometimes F# will not have enough information to infer the type of a method parameter in which case you can just specify the type:

let getChars (str:String) = str.Chars

However if you pass the method parameter to another function F# can usually infer what its type is:

let getChars str = str + "a"

Beautiful Code

In F# type declarations are the exception, not the rule.  You will find that F# does a surprisingly good job of inferring the type of your variables.  In fact I have written a few non-trivial data structures in F# without specifying any type information whatsoever.  I was concerned at first that the lack of explicit type declarations would make my code more difficult to follow but the opposite has proven to be true.  F# code is much easier for me to understand than C# code.  In fact, it's not even close.

It's not just the lower signal-to-noise ratio either.  Although type inference is an integral feature, pattern matching and recursion also force code into a predictable, declarative structure.  All of F#'s idioms work in harmony to create beautiful code and that's why I love it.

P.S. It also has no problem whatsoever with this :-) :

type Customer() = let mutable name = String.Empty let age = 20

Saturday, January 24, 2009

Much Ado About Nothing

When I started at Microsoft a few months ago there was a brief discussion of the pro's and con's of exposing Nullable property values in our controls. On one hand, nullables are awkward to work with in some languages (read: C#) and are poorly understood by some of our customers. On the other hand the inability to express "no data" is an impedance mismatch when data-binding to data retrieved from a database.

In the end we chose to embrace nullable values and I'm glad because the fact that WinForms controls didn't use them caused me tremendous pain in my former position as an app developer. That said it can't be denied that working with nullables can be a pain, especially when you don't care about the distinction between a null value and the default value. Someone ran into this problem on the Silverlight Discussions mailing list today and I thought of a trick that hadn't occurred to me before: use an extension method.

public static T ValueOrDefault<T>(this Nullable<T> that)
where T : struct
{
if (!that.HasValue)
{
return default(T);
}
return that.Value;
}

Now we don't have to remember to check for nullness before pulling a value out of a nullable property to avoid a potential NullReferenceException. In short this...

if (myCheckbox.IsChecked != null && myCheckbox.IsChecked.Value)
{
// ...
}

...becomes this...

if (myCheckbox.IsChecked.ValueOrDefault())
{
// ...
}

...and hopefully nullables just got a little less awkward.

-- Edit. I completely missed the GetValueOrDefault method on Nullable. Oops.

About Me

My photo
I'm a software developer who started programming at age 16 and never saw any reason to stop. I'm working on the Presentation Platform Controls team at Microsoft. My primary interests are functional programming, and Rich Internet Applications.