Thursday, March 1, 2007

Runtime macros in C# 3.0

Macros are perhaps the most powerful tool available to programmers for creating new kinds of abstractions. They allow developers to extend their programming language with new constructs and behaviors. Many languages have this feature, the most well-known of which is LISP. A macro is a function which operates on code, transforming it into different code (usually expanding it). Just before compilation the compiler parses the code into an abstract syntax tree (AST) and passes the applicable pieces to the macro function. The developer need only indicate somehow which portions of code they would like to transform with their macro. This is a useful technique if you notice recurring patterns in your code and your language does not provide you with sufficiently powerful abstractions to factor them out.

C# 3.0 does NOT add compile-time macros to the language, but allows you to do the same type of code manipulation at run-time. In this article I'm going to create a macro function that adds a useful new abstraction to C#.

A year or so ago Microsoft Research released an extension of the C# language called C-Omega. It was an experimental language written by Erik Meijer (Mr. Haskell), among others, and was the precursor to C# 3.0. Many cool features in C-Omega did not make it into C# 3.0. I understand why most of the omitted features were left out, but there was one feature I really liked that didn't make the cut.

Ask yourself how many times you have written code like this:


int? friendsSonsAge = null;
if (person != null && person.Friend != null && person.Friend.Son != null)
{
    friendsSonsAge = person.Friend.Son.Age;
}

OK, OK. I was never really good at short contrived code examples, but the point is that you often want the value of a property buried deep in a hierarchy of objects, each of which may or may not be null. In most mainstream languages (Java, C, JavaScript, VB.NET, C#, Ruby) you have no recourse but the ugly boilerplate code shown above.

In C-Omega you can do this:


int? friendsSonsAge = person.Friend.Son.Age;

if (friendsSonsAge != null)
{
    // do something
}


If any of the intermediate objects in the expression on the right of the assignment is null, the whole expression evaluates to null. This is called null propagation and it works very nicely together with nullable types. I have no idea why this feature wasn't added to C# 3.0. No matter, we can create a macro that duplicates this behavior ourselves. :-)

The necessary ingredient for a macro is the ability to turn code into data. C# exposes code as data in the form of an expression tree. If you assign a lambda function to a variable of type Expression<TDelegate>, the C# compiler builds an AST of the function instead of creating an executable delegate.


// oldEnoughToDrink cannot be executed directly; it points to an AST representing the function
Expression<Func<int, bool>> oldEnoughToDrink = (age) => age >= 19;
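To make the code-versus-data distinction concrete, here is a small sketch that inspects such a tree and then compiles it back into a delegate. The class name ExpressionTreeDemo is made up for illustration; BinaryExpression and Expression<TDelegate>.Compile come from System.Linq.Expressions.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical demo class, not part of the original article.
static class ExpressionTreeDemo
{
    static void Main()
    {
        // Assigning a lambda to Expression<TDelegate> yields data, not a delegate.
        Expression<Func<int, bool>> oldEnoughToDrink = age => age >= 19;

        // The body is a binary node whose parts can be inspected like any object.
        var body = (BinaryExpression)oldEnoughToDrink.Body;
        Console.WriteLine(body.NodeType);   // GreaterThanOrEqual
        Console.WriteLine(body.Left);       // age
        Console.WriteLine(body.Right);      // 19

        // Compile() turns the tree back into an executable delegate.
        Func<int, bool> compiled = oldEnoughToDrink.Compile();
        Console.WriteLine(compiled(21));    // True
        Console.WriteLine(compiled(18));    // False
    }
}
```

Compiling is the expensive step, which is worth keeping in mind for the macro that follows.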

C# cannot translate arbitrary expressions or code blocks into data, only lambda functions, so we need to nest our expression inside one. Behold, my null propagation macro function:

Person person = null;
int? personsFriendsSonsAge = Macros.GetValue<int>(() => person.Friend.Son.Age);


In this example personsFriendsSonsAge will be null because the local variable person is null. The above code does not throw a null reference exception because the lambda function is not executed, but converted to an AST, and then passed to the GetValue function. Notice how we've nested the expression inside a lambda function with no arguments:


() =>
person.Friend.Son.Age


The C# compiler knows to convert this lambda expression into an expression tree because the type of the first argument to the GetValue function is an Expression:

public static Nullable<T> GetValue<T>(Expression<Func<T>> f) where T : struct


Here is the body of the macro:

[Image: source listing of the GetValue function body]



The GetValue function takes the expression passed to it and wraps it in a series of AndAlso (short-circuit evaluation "&&" operator) expressions comparing each member expression to null. I build the conditional expression from the inside-out by nesting each previous AndAlso in a new one. I skip a check for null if a member expression is referencing a value type which can never be null. Finally I compile the expression tree into a function, execute it, and return the result.
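The steps described above can be sketched roughly as follows. This is a minimal reconstruction from that description and not the original listing. The Person class is a hypothetical model, and the sketch assumes the lambda body is a pure chain of field or property accesses that ends at a captured variable or a static member.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical model class, used only for this illustration.
public class Person
{
    public Person Friend { get; set; }
    public Person Son { get; set; }
    public int Age { get; set; }
}

public static class Macros
{
    public static T? GetValue<T>(Expression<Func<T>> f) where T : struct
    {
        var member = f.Body as MemberExpression;
        if (member == null)
            throw new NotSupportedException("Only member access chains are supported.");

        // Walk from the outermost access inward, nesting each earlier
        // AndAlso inside a new one so the innermost object is tested first.
        Expression test = null;
        Expression current = member.Expression;
        while (current is MemberExpression)
        {
            // Value types are skipped (Nullable<T> members are not handled here).
            if (!current.Type.IsValueType)
            {
                Expression notNull = Expression.NotEqual(
                    current, Expression.Constant(null, current.Type));
                test = test == null ? notNull : Expression.AndAlso(notNull, test);
            }
            current = ((MemberExpression)current).Expression;
        }

        // The chain must end at a captured variable (a constant closure
        // object) or at a static member (no target expression at all).
        if (current != null && !(current is ConstantExpression))
            throw new NotSupportedException("Only member access chains are supported.");

        // Build: test ? (T?)body : (T?)null
        Expression body = Expression.Convert(member, typeof(T?));
        if (test != null)
            body = Expression.Condition(test, body, Expression.Constant(null, typeof(T?)));

        // Compile the rewritten tree into a function and run it immediately.
        return Expression.Lambda<Func<T?>>(body).Compile()();
    }
}
```

With this sketch, Macros.GetValue<int>(() => person.Friend.Son.Age) returns null when person, person.Friend, or person.Friend.Son is null, and the age otherwise.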

The end result is that a little function is generated that looks like this:


() => (person != null && person.Friend != null && person.Friend.Son != null)
    ? (int?)person.Friend.Son.Age
    : null;

Now there are several things to be aware of:

1. It's slow. Generating code at runtime is comparatively expensive even if it's just a little function. That's what's nice about compile-time macros: the code generation happens ahead of time. That said, execution speed is on par with an eval in IronPython, so it's still quite reasonable. Just don't use it in a tight loop.

2. This version of the function only works for value types. This is because it returns the generic Nullable<T> struct, which cannot be parameterized with a reference type. You can easily add another function that works with reference types by modifying the original slightly. Having one function for value types and another for reference types is a necessary headache if we want to preserve type safety.

3. This is an oversimplified version for the purposes of demonstration. It only handles member access expressions like form.Size.Width, but won't work if there is a method call or indexer in the expression.

// will throw an exception because the expression contains an indexer access and a call to the GetBrother method
int? age = Macros.GetValue<int>(() => myCustomer.Friends[0].GetBrother().Age);

This should not be perceived as a limitation of the language, but rather a limitation on my free time :-). In fact it is completely possible to handle method calls. The trick is to ensure that you don't call the same method again and again when making null comparisons. This can be accomplished by introducing a new lambda function every time you need to store the result of a computation. I'll leave that as an exercise for the reader. :-)
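For readers who want a starting point, here is one possible sketch of that trick. Each receiver is evaluated once, bound to the parameter of a freshly generated lambda, null-checked, and only then used by the rest of the chain. The Guard helper, the Person class, and its Calls counter are all hypothetical names introduced for this illustration. Static calls and array indexing are treated as the root of the chain and are not guarded.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical model class; Calls counts how often GetBrother runs.
public class Person
{
    public static int Calls;
    public int Age { get; set; }
    public Person Brother { get; set; }
    public Person GetBrother() { Calls++; return Brother; }
}

public static class Macros
{
    public static T? GetValue<T>(Expression<Func<T>> f) where T : struct
    {
        Expression guarded = Guard(
            f.Body, value => Expression.Convert(value, typeof(T?)), typeof(T?));
        return Expression.Lambda<Func<T?>>(guarded).Compile()();
    }

    // Rewrites node so that its receiver is computed once, stored in a
    // lambda parameter, tested for null, and then handed to rest.
    static Expression Guard(Expression node, Func<Expression, Expression> rest, Type resultType)
    {
        Expression receiver;
        Func<Expression, Expression> rebuild;
        var member = node as MemberExpression;
        var call = node as MethodCallExpression;

        if (member != null && member.Expression != null
            && !(member.Expression is ConstantExpression))
        {
            receiver = member.Expression;
            rebuild = r => Expression.MakeMemberAccess(r, member.Member);
        }
        else if (call != null && call.Object != null)
        {
            receiver = call.Object;
            rebuild = r => Expression.Call(r, call.Method, call.Arguments);
        }
        else
        {
            return rest(node); // root of the chain: nothing left to guard
        }

        return Guard(receiver, value =>
        {
            if (value.Type.IsValueType)
                return rest(rebuild(value)); // value types need no null check

            // (p => p != null ? rest(p.Member) : null)(value)
            ParameterExpression p = Expression.Parameter(value.Type, "p");
            Expression body = Expression.Condition(
                Expression.NotEqual(p, Expression.Constant(null, p.Type)),
                rest(rebuild(p)),
                Expression.Constant(null, resultType));
            return Expression.Invoke(Expression.Lambda(body, p), value);
        }, resultType);
    }
}

static class Demo
{
    static void Main()
    {
        var john = new Person { Brother = new Person { Age = 30 } };
        Console.WriteLine(Macros.GetValue<int>(() => john.GetBrother().Age)); // 30
        Console.WriteLine(Person.Calls);                                      // 1

        var lonely = new Person();
        Console.WriteLine(Macros.GetValue<int>(() => lonely.GetBrother().Age).HasValue); // False
    }
}
```

Because every intermediate result lives in a lambda parameter, GetBrother runs exactly once per evaluation, at the cost of a few extra generated lambdas.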

The important thing to take away from this article is that converting code to data (and vice versa) has many more uses than just generating SQL from C# queries. C# has evolved into a very powerful language for metaprogramming and I recommend learning as much as you can about it so that you can leverage these new capabilities.

10 comments:

Anonymous said...

Have you tried memoizing the resulting delegate? That way, you don't have to rebuild and recompile for the same expression.

Jafar Husain said...

Yes. Unfortunately this does not work if the expression is bound to a locally-scoped variable because a new one is instantiated every time. This would work fine for private member variables though.

nevyn said...

If you like being able to message-to-nil, you should try Objective-C ;-) Cocotron plus mono/CocoaSharp, now that would be interesting...

Anonymous said...

public static T GetValue<T>(Func<T> func) where T : class
{
    try
    {
        return func();
    }
    catch
    {
        // returning null requires T to be a reference type
        return null;
    }
}

Obviously you shouldn't use exceptions in this way, but in this case, it would be faster and less buggy.

Anonymous said...

I have been puzzled by this post for some time now. I am very interested in Macros in C#. However, there isn't much info on the web about them. Your post is my main reference.

Thus I will bug you about it :-).

Doesn't C# have an Eval method? Because if it does, you could just take the expression and run Eval on it, and return the result. What am I missing?

Thank you,
Andrei

Jafar Husain said...

C# does not have an eval method. A string is not the preferred way to create code. You should use the factory methods on System.Linq.Expressions.Expression instead. This is much cleaner and works well with LINQ queries. It is more verbose and harder to read, however.

Edvard Pitka said...

How about using monads to do this? It would allow you to write code like this: person.NullSafe(p=>p.Address).NullSafe(a=>a.ZipCode). I just posted a blog on how to do this.

Nhilesh Baua said...

Has anybody tried CodeDOM, the .NET API that lets you programmatically construct code and compile it with the .NET compilers without much fuss?

-------------
BAN the EVAL

About Me

I'm a software developer who started programming at age 16 and never saw any reason to stop. I'm working on the Presentation Platform Controls team at Microsoft. My primary interests are functional programming, and Rich Internet Applications.