Blocks/Closures For C!
Chris Lattner, of Apple’s compiler team, writes:
Until there is more real documentation, this is a basic idea of Blocks: it is closures for C. It lets you pass around units of computation that can be executed later. For example:
    void call_a_block(void (^blockptr)(int)) {
      blockptr(4);
    }
    void test() {
      int X = ...
      call_a_block(^(int y){ print(X+y); });  // references stack var snapshot
      call_a_block(^(int y){ print(y*y); });
    }
In this example, when the first block is formed, it snapshots the value of X into the block and builds a small structure on the stack. Passing the block pointer down to call_a_block passes a pointer to this stack object. Invoking a block (with function call syntax) loads the relevant info out of the struct and calls it. call_a_block can obviously be passed different blocks as long as they have the same type.
This is very exciting: it’s the kind of new abstraction the C family has needed for years. As you know if you’ve worked in Ruby or Python or Smalltalk or any functional language, the ability to declare an anonymous function inline, and pass it as a parameter to another function, opens the door to creating new and useful control structures. Blocks are to control structures as struct is to data structures.
For example, one of the things that blew my mind about Smalltalk-80 when I first learned it was how the basic conditional and loop operations were not hardwired into the language; instead, they were just methods of built-in classes. For example:
root := y>=0 ifTrue: [y sqrt] ifFalse: [-1]
(Apologies if I’ve messed up the syntax; it’s been a long time.) Yes, this is an if-then-else expression. But literally, it’s the message ifTrue:ifFalse: being sent to the (boolean) result of the expression y>=0, with its parameters being two blocks. And what happens at runtime is that the class True implements ifTrue:ifFalse: by evaluating the first block parameter and returning its value, while the class False implements it by evaluating the second block.
This is, not surprisingly, a bit too expensive a way to implement the ubiquitous if-then-else, so in practice the compiler optimizes this expression into hardwired bytecodes instead. But it demonstrates that you can define comparable control structures yourself for your own purposes … such as for URL routing in a web application, which many Ruby and Python frameworks allow to be customized with blocks.
I think blocks will also make it possible to implement better concurrency patterns like Actors in a clean way. You may recall that I was investigating these, but after a few days I gave up because the necessary stack munging confused the Obj-C runtime too much. Once blocks are available (and hopefully integrated into Objective-C and Foundation) I’ll have to give it another try.
August 31st, 2008 at 6:10 AM
Spectacular news! I’d long ago given up hope that we’d see this kind of fundamental change in Objective C. The relatively modest changes to the language introduced in Objective C 2.0 seemed to cement that for me.
Mark
September 1st, 2008 at 12:51 AM
It seems very interesting. Are Blocks being designed only for Objective-C, or will there be an ANSI C version too?
September 1st, 2008 at 6:48 AM
Thanks, it will be very interesting to follow this ‘thread’. There is apparently not much out there about Blocks, so maybe the readers of this post will be interested in a post on macresearch.org.
September 1st, 2008 at 11:51 AM
Carlos — Blocks are being added to the C compiler; there’s nothing specific to Objective-C about them, so they could be used in arbitrary C code, as long as you’re compiling it with Clang.
September 2nd, 2008 at 7:52 AM
Why do they have to add the ‘^’ operator? Would it not be possible to just create a ‘block’ or anonymous function inline, and pass it to any old C function that takes a function pointer?
September 2nd, 2008 at 8:24 AM
Andy — A block is not just a function pointer. It’s a “closure” that carries along a reference to the variables in scope at the time the block was created. At runtime that involves an invisible extra parameter, plus some fairly complex bookkeeping to handle the case where the block’s scope exits from the stack while the block is still alive.
Also, I’m pretty certain that the compiler wouldn’t be able to parse an expression like “call_a_block((int y){ print(X+y); })” — it needs some indicator, like “^”, to tell it where the block subexpression begins.
September 4th, 2008 at 4:44 AM
Will this feature be added to gcc as well, or just to Clang? Also, will it be possible to store a block definition in a variable and pass it as an argument, or will only anonymous blocks be supported?
September 4th, 2008 at 9:21 AM
Timothy — I would guess that Apple won’t be putting blocks into gcc, since they’re switching over to Clang. (The complexity of gcc, and the GPL license, make it hard to integrate its parser into Xcode in the ways Apple wants to do in the future to support better syntax-driven editing and code refactoring.) But of course gcc is open source, so someone else could add blocks to it if they wanted.
As I understand it, a block pointer is a first-class data type, so you can store them in variables, pass them as arguments, et cetera; whatever you could do with a regular C function pointer.
September 5th, 2008 at 9:22 AM
Actually, Chris’ post is about how Apple has already added Blocks to gcc, and now is adding them to clang. They prototyped them in clang, then implemented them in gcc, and now are adding the final implementation back to clang. If you check out the llvm-gcc svn trunk, you’ll find all the code there (you can even build your own llvm-gcc with working blocks).
Of course, as with all Apple features in gcc, it will take ages till the features get merged into the FSF branch. Mainline gcc still doesn’t have a single objc-2.0 feature merged (and since nobody seems to care enough, I doubt that’ll ever happen). Pity; they’d be just as useful on the other platforms gcc supports.