Monday, February 19, 2007

LINQ to Code

With all of the information on the Internet telling you how much easier LINQ makes data access, you may have missed some of the less obvious implications of the new language features. I'm referring specifically to how much easier it is to generate code using the new LINQ API. I'm going to demonstrate a brief example that uses a domain-specific language (DSL) to serialize a class to a fixed-length, record-based text file. This is useful if you find yourself having to work with legacy file formats from the dark ages. Rather than reinvent the wheel, let's steal the approach used for XML serialization in the .NET Framework.

In order to convert your C# queries to SQL at run-time, LINQ introduces a new construct called an expression tree. An expression tree is just a lambda expression that C# represents as data instead of executable code. For a good example of how to build expression trees and compile them, see this post in Don Box's blog. If you choose to represent a function as an expression tree, you can analyze it at run-time and convert it to another form, like a SQL query or executable code. This is known as metaprogramming (short definition: code that manipulates code). When the compiler converts your code into a data tree, it represents it using objects in the new System.Linq.Expressions namespace. The question I asked is “Can I use these objects to build a function at run-time?” The answer is yes.
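To make the code-versus-data distinction concrete, here is a minimal sketch. The names ExpressionDemo and SquareTree are made up for illustration and do not come from the original listing. The same lambda is assigned once to a delegate and once to an expression tree, the tree is inspected, and then it is compiled into a callable function.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionDemo
{
    // Assigned to Expression<...>, the lambda is stored as a data structure.
    // (SquareTree is a hypothetical name used only for this sketch.)
    public static readonly Expression<Func<int, int>> SquareTree = x => x * x;

    public static void Main()
    {
        // Assigned to Func<...>, the same lambda is ordinary executable code.
        Func<int, int> asCode = x => x * x;
        Console.WriteLine(asCode(7));             // 49

        // The tree can be examined at run-time: its body is a multiplication node.
        var body = (BinaryExpression)SquareTree.Body;
        Console.WriteLine(body.NodeType);         // Multiply

        // Compile() turns the tree into a delegate that can be invoked.
        Func<int, int> compiled = SquareTree.Compile();
        Console.WriteLine(compiled(7));           // 49
    }
}
```

The compiler does the tree-building here. The rest of this post is about doing the same thing by hand with the factory methods on Expression.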

As you probably know, in order to serialize a class as XML you mark it up with attributes. You may not realize this, but this is a form of metaprogramming. When you attach attributes to language constructs you are using a domain-specific language, which is a little language designed for a specific task. This is a very powerful technique because a language geared specifically to a task is usually much easier to use, and less error-prone, than a general-purpose language like C#. Using attributes to specify a DSL is great because it groups the work to be performed on data with the data itself, instead of storing it in an external file where it can get out of sync with changes.

Here is the file I want to read from:

1600 Pennsylvania Avenue           G.W. Bush 
1600 Pennsylvania Avenue           C. Rice   
1600 Pennsylvania Avenue           D. Cheney 

Here is the class I want to deserialize from the file:

[TextRecordSerializable]
public class Customer
{
    private string address;

    [FieldRange(0, 35)]
    public string Address
    {
        get { return address; }
        set { address = value; }
    }

    private string name;

    [FieldRange(35, 10)]
    public string Name
    {
        get { return name; }
        set { name = value; }
    }
}

The FieldRange attributes just store the starting character index and the length of the field in the record. The TextRecordSerializable attribute indicates that the class can be serialized. These two attributes are the only “commands” in our DSL. Simple, huh? Now a full implementation would be a little more complex, allowing for type conversions and such, but it would still be simple enough to describe with attributes and parameters. What we want to do is convert our DSL into real, honest-to-goodness executable code that deserializes the object from a string (and vice versa). Basically we want to generate the following function:

Func<string, Customer> customerParser = line => new Customer { Address = line.Substring(0, 35), Name = line.Substring(35, 10) };

This function can be built for ANY type at run-time. The code that does it might seem complex at first, but stick with it. It's not as complicated as it seems. (The listing was posted as an image because I haven't figured out how to keep Blogger from mauling my code, so if anyone knows how, please leave a comment.) The code aliases System.Linq.Expressions.Expression as "Ex" to keep things short.
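Since the image is not available in text form, here is a rough reconstruction of what such a serializer could look like. It is a sketch under assumptions, not the original listing: the attribute property names (Start, Length), the BuildParser helper, and the choice to skip non-string properties are all guesses. It uses only the public factory methods on Expression and keeps the "Ex" alias.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using Ex = System.Linq.Expressions.Expression;

[AttributeUsage(AttributeTargets.Class)]
public sealed class TextRecordSerializableAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Property)]
public sealed class FieldRangeAttribute : Attribute
{
    public int Start { get; }    // assumed name: starting character index
    public int Length { get; }   // assumed name: field length
    public FieldRangeAttribute(int start, int length) { Start = start; Length = length; }
}

public class TextRecordSerializer<T> where T : new()
{
    // Built once per closed generic type in the static constructor, then cached.
    private static readonly Func<string, T> parser;

    static TextRecordSerializer() { parser = BuildParser(); }

    private static Func<string, T> BuildParser()
    {
        if (!typeof(T).IsDefined(typeof(TextRecordSerializableAttribute), false))
            throw new InvalidOperationException(typeof(T).Name + " is not marked [TextRecordSerializable].");

        ParameterExpression line = Ex.Parameter(typeof(string), "line");
        MethodInfo substring = typeof(string).GetMethod("Substring", new[] { typeof(int), typeof(int) });

        // Read the DSL (the attributes), filter it, and turn each field into
        // a "Property = line.Substring(start, length)" binding.
        // Non-string properties are skipped in this sketch.
        var bindings =
            from property in typeof(T).GetProperties()
            let range = (FieldRangeAttribute)Attribute.GetCustomAttribute(property, typeof(FieldRangeAttribute))
            where range != null && property.PropertyType == typeof(string)
            select (MemberBinding)Ex.Bind(
                property,
                Ex.Call(line, substring, Ex.Constant(range.Start), Ex.Constant(range.Length)));

        // line => new T { ...bindings... }
        Expression<Func<string, T>> lambda =
            Ex.Lambda<Func<string, T>>(Ex.MemberInit(Ex.New(typeof(T)), bindings), line);
        return lambda.Compile();
    }

    public T Deserialize(string line) { return parser(line); }
}

[TextRecordSerializable]
public class Customer
{
    [FieldRange(0, 35)] public string Address { get; set; }
    [FieldRange(35, 10)] public string Name { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var serializer = new TextRecordSerializer<Customer>();
        string line = "1600 Pennsylvania Avenue".PadRight(35) + "G.W. Bush".PadRight(10);
        Customer customer = serializer.Deserialize(line);
        Console.WriteLine(customer.Name.Trim());   // G.W. Bush
    }
}
```

Note that a record shorter than the declared field ranges makes Substring throw, so a real implementation would want to validate the line length first.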




This code is just beautiful. The data flows out of our DSL, is filtered, and is converted to an abstract syntax tree (AST), all in a single expression. We create our method in the static constructor and then cache it in a static variable. We then call this function inside a public class method called “Deserialize.” The end result is that we can do this:

var serializer = new TextRecordSerializer<Customer>();
var customer = serializer.Deserialize("1600 Pennsylvania Avenue           G.W. Bush ");
Console.WriteLine(customer.Name);

Once our expression is compiled into a function, the code runs as fast as it would have if we had written it ourselves in raw C#. I will leave the method that generates the Serialize function as an exercise for the reader. Next time we'll look at using expressions to create much more advanced DSLs than are possible with simple attributes.

31 comments:

  1. Interesting. But where did you find documentation? The docs in Orcas for the new features are really a shame. They are a dump of method declarations without any examples...

  2. Most of the documentation for LINQ can be found at Microsoft's LINQ site; however, there is precious little information about expression trees. I just went spelunking with IntelliSense and found what I needed. The expression trees are very similar in structure to the CodeDOM, but they are built with static factory methods in the System.Linq.Expressions.Expression class.

  3. Your code is really cool, man! It would be great if Anders & Co. extended the expression trees to... statement lists :) Then it would work with functions with more than one statement: with ifs, whiles, fors... and so on.

  4. I agree. This would be ideal, but if they optimize tail recursion in lambda expressions when they are compiled we will be able to do anything we could do otherwise, but perhaps not as efficiently.

  5. Hey Jafar! I really like the stuff you are putting out on this blog, and look forward to your book! :)

    BUT - in the interest of being able to follow along, is there anywhere I can get the code in text and not picture format? I am lazy! (maybe stupid as well, if the code is there but I've missed it).

  6. Flattery will get you everywhere. :-) Alas, Blogger doesn't allow me to upload any files except images. I would send it to you by e-mail but it's terribly disorganized, full of aborted ideas, and stuffed into one project which is basically my playground. I plan to clean up the code, GPL it, and post it for everyone once I find a way to serve it.

    Sorry 'bout the wait

  7. Very interesting post. LINQ has a really promising future, and this expression tree framework is really the foundation of it all.

    To copy/paste code to your blog, you should try CopySourceAsHtml from Colin Coller (http://www.jtleigh.com/CopySourceAsHtml - the server seems to be down at the time I write this comment). I hope it can help.

  8. Did you know about Haskell in WinHugs? I got an assignment about Haskell... I have a lot of problems with it and do not know how to program it.

  9. Thanks for the head start on a project I'm working on to read fixed records from text files. But what if I want to expose a property other than a string from the class to deserialize, like so:

    [TextRecordSerializable]
    public class Customer
    {
        private string address;

        [FieldRange(0, 35)]
        public string Address
        {
            get { return address; }
            set { address = value; }
        }

        private string name;

        [FieldRange(35, 10)]
        public string Name
        {
            get { return name; }
            set { name = value; }
        }

        // INTEGER type property
        private int zipcode;

        [FieldRange(45, 5)]
        public int ZipCode
        {
            get { return zipcode; }
            set { zipcode = value; }
        }
    }

    Can you point me in the right direction so that I can handle multiple data types in the Lambda expression? Thanks

  10. Absolutely. It's best to use the TypeConverter framework that .NET exposes. All you need to do is modify your LINQ query to retrieve the type converter for each type that is not a string and then modify the expression to invoke the appropriate method on the type converter.

  11. Thanks for the lightning fast response. After reading your suggestion I found myself digging through MSDN and finding very little in the way of useful info. I understand how to separate the query from the binding expression, but the disconnect is how to use the type converter within this query and then use this query to create an expression that invokes the appropriate type converter. Can you expand a bit more? Thanks a lot.
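For readers with the same question, one possible reading of the TypeConverter suggestion is sketched below. This is an assumption about what was meant, not code from the post. FieldReader and Build are hypothetical names, and trimming the field before conversion is a guess about how padded numeric fields should be handled. The idea is to look up the converter once while building the tree, embed it as a constant, and emit a call to its ConvertFromInvariantString method.

```csharp
using System;
using System.ComponentModel;
using System.Linq.Expressions;
using System.Reflection;
using Ex = System.Linq.Expressions.Expression;

public static class FieldReader
{
    // Builds: line => (TField)converter.ConvertFromInvariantString(
    //                     line.Substring(start, length).Trim())
    public static Func<string, TField> Build<TField>(int start, int length)
    {
        ParameterExpression line = Ex.Parameter(typeof(string), "line");
        MethodInfo substring = typeof(string).GetMethod("Substring", new[] { typeof(int), typeof(int) });
        MethodInfo trim = typeof(string).GetMethod("Trim", Type.EmptyTypes);
        MethodInfo convert = typeof(TypeConverter).GetMethod(
            "ConvertFromInvariantString", new[] { typeof(string) });

        // The converter is resolved once, at tree-building time.
        TypeConverter converter = TypeDescriptor.GetConverter(typeof(TField));

        Ex text = Ex.Call(Ex.Call(line, substring, Ex.Constant(start), Ex.Constant(length)), trim);

        // ConvertFromInvariantString returns object, so cast (or unbox) to the field type.
        Ex value = Ex.Convert(
            Ex.Call(Ex.Constant(converter, typeof(TypeConverter)), convert, text),
            typeof(TField));

        return Ex.Lambda<Func<string, TField>>(value, line).Compile();
    }
}

public static class Program
{
    public static void Main()
    {
        Func<string, int> readZip = FieldReader.Build<int>(45, 5);
        string line = "1600 Pennsylvania Avenue".PadRight(35) + "G.W. Bush".PadRight(10) + "20500";
        Console.WriteLine(readZip(line));   // 20500
    }
}
```

Inside a serializer, the expression held in `value` would take the place of the plain Substring call when binding a non-string property.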

  20. Jafar,

    Great post! In your "end result" code, don't you need to pass a type to the generic TextRecordSerializer in order to do work, let alone compile? In other words, this:

    var serializer = new TextRecordSerializer();

    should be this:

    var serializer = new TextRecordSerializer[Customer]();

    (Sorry I couldn't use angle brackets, blogger disallows them.)

    Have you posted the code anywhere since you wrote this?

    You might consider cleaning up the comments a bit, too. The spammers are such a shame.

  24. Hey Chris,

    You are correct. In fact the generic type declaration _was_ in there but was interpreted as an HTML tag. Thanks for pointing it out.

    Mea culpa, I haven't posted the code anywhere. I wrote this when VS2008 was in CTP and I no longer have it. The most interesting bits are in the code snippet. From there you should be able to fill it out. We'll call it a learning exercise :-).
