Uncle Jens’s Coding Tips
May 6th, 2007 by jens

Ever since Brent “NetNewsWire” Simmons posted his Thoughts On Large Cocoa Projects the other week, I’ve wanted to add some of my own tips. I’ve worked on some big projects (iChat, Apple’s Java runtime, OpenDoc) and have sometimes had to find my way around in others (Safari, Mail), so I know what Brent means when he says:

There’s no way I can remember, with any level of detail, how every part of [my app] works. I call it the Research Barrier, when an app is big enough that the developer sometimes has to do research to figure things out…

It’s been said many times that “the main person you’re writing comments for is yourself, six months in the future.” It’s always a good idea to keep that shadowy figure in mind while you code. Here are some other techniques I’ve found invaluable:

Prefix your instance-variable names

Make instance variables instantly identifiable by prefixing their names with a particular character, so you can tell them apart from local variables. I use “_”, so my ivars have names like “_delegate” or “_name”. I’ve seen “m”, for “member”, used in C++ code. MacApp used to use “f” for “field”. But it doesn’t matter what you use as long as you do something to distinguish them.

Also useful, to a lesser degree, is prefixing global/static variables with a different character like “g” or “s”.

Why do this? Because the scope and lifespan of a variable are incredibly important things to know about it. When I’m figuring out a piece of code, I want to know “where did this variable come from? What does it affect? How long does it last?” I also want to be able to see at a glance what state of the object a method uses or changes.

I find this notation applied in most large projects at Apple; and I always notice the lack of such prefixes when I’m trying to figure out some code that doesn’t use them. I must confess that I’ve sometimes gone as far as to insert the prefixes into other people’s code myself just so I can keep things straight while working on it.

(Note: Some might dismiss this as being a form of Hungarian Notation. Not so. Hungarian Notation annotates a variable with its type. This was useful in the distant past, when languages like BCPL and pre-ANSI C didn’t have much type-checking, so it was up to the programmer to make sure s/he didn’t pass the wrong type to a function call. Nowadays the compiler does that for us, making it much less important to tag that onto the variable name. By comparison, the scope of a variable remains very useful for the programmer to know, and can be identified using a single unobtrusive character like “_”.)
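As a rough illustration of how the prefixes read in practice, here is a minimal sketch in plain C (which also compiles as Objective-C). The Counter type and every name in it are made up for this example. In Objective-C the “_” fields would be ivars that a method names without any “self->”, which is where the prefix matters most.

```c
#include <stdbool.h>

bool gStrictMode = false;        /* "g": global, visible program-wide */
static int sCountersCreated = 0; /* "s": static, private to this file */

typedef struct {
    int _value;                  /* "_": state that belongs to one object */
    int _limit;
} Counter;

void CounterInit(Counter *self, int limit) {
    self->_value = 0;
    self->_limit = limit;
    sCountersCreated++;
}

/* Adds `amount`. Past the limit, strict mode refuses (returns false);
   otherwise the value is clamped to the limit. */
bool CounterAdd(Counter *self, int amount) {
    int next = self->_value + amount;   /* no prefix: a local, gone on return */
    if (next > self->_limit) {
        if (gStrictMode)
            return false;
        next = self->_limit;
    }
    self->_value = next;
    return true;
}

int CountersCreated(void) {
    return sCountersCreated;
}
```

Reading CounterAdd(), the names alone show that one global is consulted, two pieces of object state are read, one of them is changed, and “next” is a scratch value.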

“TEMP” and “FIX” markers

If you’re making a temporary change, put a magic word like “//TEMP” in a comment right next to it, so you’ll remember to take it out ASAP. By “temporary” I mean something that’s just for your own immediate debugging/testing purposes and shouldn’t be left in the code for long. We all do this—verbose logging inside inner loops, commenting out a block of code to see whether it caused a regression, quickie scaffolding to get something up and running. The kind of thing you don’t want to check in, or put into a build. So put in a special marker word the moment you make the change.

Then (just as important) search your project for “TEMP” before any checkpoint such as “svn commit”, or just before you finish work for the day, and take out any of those temporary changes that are left.

(Apparently it’s possible to configure your Subversion repository to do such a check for you, and prevent you from submitting any changes containing the word “TEMP”, but I’ve never looked up how to do it.)
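The search itself is simple enough to automate. Here is a minimal sketch in plain C of the core check, counting occurrences of a marker in a buffer of source text. CountMarkers is a made-up name, and this is not the Subversion hook mentioned above, only the kind of test such a hook would have to run. In day-to-day use a project-wide find in your editor does the same job.

```c
#include <stddef.h>
#include <string.h>

/* Returns how many times `marker` occurs in `source` (non-overlapping). */
int CountMarkers(const char *source, const char *marker) {
    size_t markerLength = strlen(marker);
    int count = 0;

    // An empty marker would match everywhere, so treat it as "nothing found":
    if (markerLength == 0)
        return 0;

    // Hop from one occurrence to the next until strstr() runs out:
    const char *hit = strstr(source, marker);
    while (hit != NULL) {
        count++;
        hit = strstr(hit + markerLength, marker);
    }
    return count;
}
```

Searching for “//TEMP” rather than bare “TEMP” is an assumption made here to avoid false hits on words like “TEMPLATE”; it only works if you always write the marker the same way.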

If you know something needs to be fixed or finished up in a piece of code, but you’re not going to do it right away, put a different magic word like “FIX” or “TODO” in a comment right next to it, along with an explanation. This creates a sort of integrated to-do list inside your project. I’ve found it invaluable because (a) I’m forgetful, and (b) I’m lazy. Or more charitably: I can only think so many layers deep at a time. Robust code needs to handle any possible combination of circumstances, but that whole tree can be too complex to keep in mind, especially when first writing that code. So what I do is focus on the “normal” flow of control, and maybe one or two likely others, but leave little “FIX” breadcrumbs behind to remind me to tackle the other circumstances later. Things like “FIX: Retry on timeout” or “FIX: This assumes the text is a single line”. It’s also useful if I notice a mistake in a part of the code I’m not working on right now, and don’t want to derail my current task by going to fix it.

At a later point in the project, though, it becomes important to search for all the remaining “FIX” comments and give them a more formal representation as items in your bug database.

Your own special logging function

Define and use your own “Log()” function/macro, instead of the built-in ones. Console logging is your friend. Breakpoints and single-stepping are the microscope, but logging is the stethoscope, the EEG, the higher-level view of what the whole program is doing. (It’s hard to believe now that, until OS X, the Mac didn’t have a console. If you wanted to do even the most basic logging, you had to use a third-party tool or else DebugStr() which dropped you into MacsBug on every call.)

But logging can become too much of a good thing. You don’t want your program to spew lots of stuff to the console in normal use, because that slows it down a lot and makes the console less useful for anyone else. So you should turn off all your excellent logging except when you need it.

The lazy way to turn off logging is to comment out, or even delete, all of those NSLog() calls. Don’t do that! You’ll probably need them later on, when your supposedly 100%-debugged code turns out to have one more problem. Instead, make your own “Log()” macro that you can turn on or off with a single switch:
// Logging.h
extern BOOL gLogging;
#define Log   if(!gLogging) ; else NSLog
#define Warn(FMT,...)   NSLog(@"WARNING: " FMT, ##__VA_ARGS__)

That’s all it takes. Now you can use “Log()” just like “NSLog()”, except that nothing will get logged unless you set the value of gLogging to YES. Initialize that variable based on a “Logging” user default [code left as an exercise for the reader] and you can then go into Xcode’s Executable inspector and add the command-line arguments “-Logging YES” to enable logging whenever you run your app in Xcode.

But sometimes you want to log warnings about Bad Stuff, which should always appear, even if Log() is turned off. You could always use regular NSLog for this, but I find it useful to have a Warn() function:

#define Warn(FMT,...)   NSLog(@"WARNING: " FMT, ##__VA_ARGS__)

Again, Warn() is called just like NSLog(). But it’s always enabled, and it prefixes the message with “WARNING:”, which looks important and stands out even in the midst of a bunch of logorrhea. It looks distinctive in your source code, too.
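For anyone who wants to try the two macros outside Cocoa, here is a minimal sketch in plain C. It rests on several assumptions: LogLine() is a made-up stand-in for NSLog() that prints to stderr, InitLoggingFromArgs() is a hypothetical helper that hand-parses “-Logging YES” (in a Cocoa app the user default would supply the value instead), and sLinesWritten exists only so the behavior can be checked. The “##__VA_ARGS__” form is a GNU extension that gcc and clang accept.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

bool gLogging = false;         /* the single on/off switch */
static int sLinesWritten = 0;  /* bookkeeping for this sketch only */

/* Stand-in for NSLog(): writes one formatted line to stderr. */
static void LogLine(const char *format, ...) {
    va_list args;
    va_start(args, format);
    vfprintf(stderr, format, args);
    va_end(args);
    fputc('\n', stderr);
    sLinesWritten++;
}

/* The "if ... ; else" shape makes the call vanish when logging is off,
   and it still behaves when Log() is itself the body of an if/else. */
#define Log  if (!gLogging) ; else LogLine

/* Always enabled; the adjacent string literals are joined by the compiler. */
#define Warn(FMT, ...)  LogLine("WARNING: " FMT, ##__VA_ARGS__)

/* Hypothetical helper: turns logging on if the arguments contain
   "-Logging YES", and off if they contain "-Logging" with anything else. */
void InitLoggingFromArgs(int argc, char *argv[]) {
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-Logging") == 0)
            gLogging = (strcmp(argv[i + 1], "YES") == 0);
    }
}
```

Call InitLoggingFromArgs(argc, argv) once at the top of main(); after that, Log("fetched %d items", count) prints only when the switch is on, while Warn("retrying after timeout") always prints.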

Break your functions into paragraphs

Try to organize the code inside a function into “paragraphs” of no more than a dozen lines, prefixed with a comment describing what that paragraph does. Most people nowadays have the good sense not to write functions/methods that are more than a page long. But even a page of unadorned code can be hard to figure out six months later. So give yourself the Cliff’s Notes, little sub-headings that describe what’s going on:

// Build a header dictionary out of the input:
...
// Serialize that into RFC822 format:
...
// Send it:
...
if( ok ) {
    // Success! Now parse the response:
    ...
} else {
    // Failed; interpret the error code:
    ...
}

You can think of these comments as being the labels inside the boxes in an imaginary flow-chart of the code. I find them very useful later on—for example, if there’s some problem in that RFC822-format data, I can find at a glance the dozen lines that are responsible for it.
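For a complete, if much smaller, instance of the same layout, here is a sketch in plain C. FormatHeaderLine is a hypothetical helper, not taken from any real mail library, and its validation rules are simplified assumptions rather than a faithful reading of RFC822.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Writes "Name: value\r\n" into `out`. Returns the number of characters
   written, or -1 if the input is unusable or the buffer is too small. */
int FormatHeaderLine(const char *name, const char *value,
                     char *out, size_t outSize) {
    // Reject input that would corrupt the header:
    if (name == NULL || value == NULL || name[0] == '\0')
        return -1;
    if (strchr(name, ':') != NULL || strpbrk(value, "\r\n") != NULL)
        return -1;

    // Serialize into "Name: value" form, ending with CRLF:
    int length = snprintf(out, outSize, "%s: %s\r\n", name, value);

    // Treat truncation as failure rather than returning half a line:
    if (length < 0 || (size_t)length >= outSize)
        return -1;
    return length;
}
```

Even at this size, the three comments show where to look if headers come out malformed (the middle paragraph) versus rejected outright (the first).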

Zero tolerance for compiler warnings

As soon as you begin a project, turn on “Treat Warnings As Errors” and enable “All” warnings. This will prevent any warnings from creeping into your project.

“Treat Warnings As Errors” has its own checkbox in the “Warnings” section of the “Build” tab of the project/target inspector. There’s no checkbox for “All Warnings”, so you’ll have to edit “Other Warning Flags” and add “-Wall”. (Don’t be fooled by the name: “All Warnings” doesn’t enable all warnings; it skips some annoyingly pedantic ones. There’s also “-Wmost”, which skips even more, but I couldn’t tell you which ones.)

Compiler warnings must seem like a good idea to compiler writers. They point out things that are likely to be mistakes, but which are still valid code, so the compiler shouldn’t lay down the law and make the programmer fix them. A warning passes the buck. But for a developer, warnings are bad to leave in your code:

  1. They stick around. The first time you get a warning on line 348, you might look at the code and decide that it’s not a problem. Unfortunately, that doesn’t make the compiler warning go away. It’ll reappear every time that source file is recompiled, without any indication of whether or not you’ve OK’d it.
  2. They mutate. Even if you remember that the warning on line 348 isn’t a problem, and even if you haven’t edited line 348, the warning could become a real problem in the future if you change something that line 348 depends on (such as the declaration of a variable that it uses.) If you ignore the warning, you won’t know about the problem.
  3. They accumulate. If you don’t keep your code clear of warnings, they grow and grow. By the time a source file has a dozen warnings in it, you’re not likely to notice a new one. It’s also annoying wading through warnings when you’re looking for the actual errors.

Unfixed warnings are especially bad when you’re making architectural changes to your code. A real-life horror story: A colleague of mine was tasked with making someone else’s library 64-bit compatible. She didn’t know the code, but fortunately gcc has a number of warnings that point out typical 64-bit issues, such as casting a pointer to an integer that’s too small to contain it. Unfortunately, though, this library already generated thousands of compiler warnings when it built in regular 32-bit. So the first order of business was to clean up the code and get rid of those warnings, before she could flip the 64-bit switch and get to the 64-bit ones. And guess what? She found that the old warnings were pointing out two serious bugs, one of which had been reported but wasn’t reproducible enough for the original programmer to track it down.

I’ve seen some people complain that “there’s no way I can get this perfectly-legal code to compile without warnings!” All I can say is that I’ve never run into such a problem. Sometimes you have to make the code a little bit more verbose, like adding type-casts or breaking an expression into two statements, but not often, and these changes don’t affect the quality of the compiled code.
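To give a flavor of those small rewrites, here is a sketch in plain C of two common cases: an assignment used as a loop condition, and a pointer stored in an integer. The function names are invented for illustration, and the warning texts quoted in the comments are what gcc typically reports; exact wording varies by compiler and version.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sums the lengths of the strings in a NULL-terminated array. */
size_t TotalLength(const char *const *items) {
    size_t total = 0;
    const char *item;
    /* `while (item = *items++)` draws "suggest parentheses around assignment
       used as truth value". Spelling out the comparison says what is meant
       and costs nothing in the compiled code. */
    while ((item = *items++) != NULL)
        total += strlen(item);
    return total;
}

/* Storing a pointer in an `int` truncates it on 64-bit targets ("cast from
   pointer to integer of different size"). uintptr_t is the integer type
   meant for holding a pointer value. */
uintptr_t AddressAsInteger(const void *pointer) {
    return (uintptr_t)pointer;
}
```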

Jumping to definitions (Cmd-double-click)

This is just a basic Xcode goodie, but once in a while I find an experienced developer who’s never heard of it, so it’s worth pointing out:

In Xcode, hold down Command and double-click on an identifier to jump to its definition. This works for classes, methods, functions, variables, even #defines. It works for identifiers in system frameworks too—it’ll jump to the declaration in a header. If there are multiple choices (like a method name used in several classes) you’ll get a pop-up menu listing them.

A related tip is to Option-double-click to look up the documentation of an identifier in Xcode’s documentation window.

Don’t use tab characters in source files!

The world will never come to an agreement on whether a tab character indents 8 spaces or 4, especially on the Mac, where lots of Unix tools (and Unix source code) are hard-coded for 8. So since different people will have their tab-width preferences set differently, just don’t use tab characters in your source code if you want everyone to be able to read it.

In Xcode, go to the Indentation pref pane and uncheck “Tab key inserts tab, not spaces”. In TextMate, check “Soft Tabs” in the tabs pop-up at the bottom of the editor window. You won’t notice a difference in editing text, but your source code will now look properly indented to everyone.

EOF

That’s all for now … I hope some of you actually read this far, or even found some of these tips novel/useful. I have a few others in mind that I might write up later, especially if people liked this post.


42 Responses  
  • Daniel Jalkut writes:
    May 6th, 2007 at 1:37 PM

    Good tips. I like to use “o” prefix for IB outlets. I am one of those who uses “m” prefix for regular instance variables. I used to use _ but convinced myself it was safer to use m to avoid namespace collision with Apple.

    The meta-tagging-comments are a great idea. I have a variety of habits along these lines. Mark Dalrymple suggested an extension to the “FIXME” type comments, which would work well in a group environment. Basically just tack on initials or whatever to make it clear who wrote the comments. So I might write: FIXME(dcj): we need to clean up this crud! One of the problems with FIXME tags I’ve noticed in group environments is once they get stale, nobody takes responsibility for them.

  • Matt Deatherage writes:
    May 6th, 2007 at 1:40 PM

    I’ve seen some people complain that “there’s no way I can get this perfectly-legal code to compile without warnings!” All I can say is that I’ve never run into such a problem.

    Apparently, you’ve never had to use a deprecated API for which there is still no approved replacement. This is especially a Carbon problem, where one-and-only APIs like TrackDrag still require regions and other deprecated QuickDraw structures that you have to create with deprecated APIs. Cursor management, QuickTime media handlers, even PackBits all require deprecated APIs, but as of now, there are still no approved replacements.

    It’s not just Carbon, either. A while back, Jonathan Kew noted that CGFontCreateCopyWithVariations generates a compiler warning that it’s deprecated, even though that function was new in Tiger.

    I understand that this is why there’s now a switch to turn off only the warnings for deprecated functions, and it’s possible to isolate them in wrapper functions in individual files for maximum granularity, but it deserves mention that there are a fat lot of Macintosh applications that cannot be built without any warnings whatsoever, and maybe this is why so many people don’t try.

  • Todd Ransom writes:
    May 6th, 2007 at 1:46 PM

    I tried adding -Wall but I end up with a warning “suggest parentheses around assignment used as truth value” every time I use an enumerator. How do you handle this since it is such a common pattern in Cocoa code?

    // normal enumerator code
    while (thisObject = [enumerator nextObject])
    {
        // do something
    }

  • Dmitry Chestnykh writes:
    May 6th, 2007 at 2:01 PM

    Wow, so great that you guys are writing coding tips at the perfect time for me, when I’m learning Cocoa and Mac programming :)

    Jens, thank you!

  • Dmitry Chestnykh writes:
    May 6th, 2007 at 2:02 PM

    Todd, here we go:

    while ((thisObject = [enumerator nextObject]))

    It tells the compiler to first evaluate your expression in parentheses, then use the value of this expression to check for “while”.

    (Just found the same warnings in my project :)

  • JMC writes:
    May 6th, 2007 at 3:32 PM

    I’m not trying to rehash the old holy war of tabs vs spaces, but I have to disagree on that tip. I think there are just as many reasons to use tabs as there are to use spaces. Some of the more notable reasons include the fact that since a tab is a single character, it can make your source code slightly smaller. Also, with tabs you allow the person viewing your code to decide the amount of indent. One distinction that most people get wrong is the idea of indenting vs formatting. You should use tabs for indenting and then use spaces to make sure things line up.

  • acdha (LJ) writes:
    May 6th, 2007 at 4:22 PM

    I’ll second JMC’s comments: I find tabs much easier to work with because I can set them to my preferred level, which turns out to be a surprisingly big win when working with code using the “wrong” indentation settings - enough so that I’ll take the time to convert a file if I need to debug anything non-trivial.

    On the subject of Log(), I’ll go a step further: any moderately complex program which isn’t completely trivial needs to have a log system which is always on and accessible for end users who need to debug it. If you don’t have more specific needs (e.g. logging some information to a database or having to deal with privacy/security-sensitive data) the Unix syslog(message, priority level) model works well enough for most things.

    A concrete example: we’ve had endemic problems with the OS X NFS client, DirectoryService and automount - these are difficult to diagnose because they happen infrequently under fairly heavy use without an obvious cause. I listed the components above in order of decreasing frustration: the NFS client logs absolutely nothing about problems which require the client to be rebooted; DirectoryService logs tons of useless information (when signaled with USR1/2) but nothing helpful like actual error codes, and automount actually provides a good view into its functioning at the syslog debug priority. We’ve been able to kludge around the automount bugs and partially obviate the DS bugs but while the NFS client causes daily data-loss problems, the lack of any meaningful debugging capability has prevented us from providing much useful information to Apple engineering and they’re unlikely to have a good handle on this without something more reproducible or a psychic debugger. In consequence we’re moving those jobs to Linux - not just because it actually works but also because it’s far more debuggable, which is worth more to us than the OS X’s advantages in other areas.

    (Package management is another example of this effect: if you want an app to be used in large environments it needs to be either trivial to manage or be Microsoft Office.)

  • Peter N Lewis writes:
    May 6th, 2007 at 4:56 PM

    If you wanted to do even the most basic logging, you had to use a third-party tool or else DebugStr() which dropped you into MacsBug on every call.

    Actually, DebugStr could log the message to MacsBug without dropping into MacsBug by adding “;g” to the end of the text (after the ; was a command, g just means continue “go”). This was also quite useful for running commands like “scramble heap”, “heap check”, “stack crawl” and such. For example DebugStr( “Message;hc;g” ) would log Message, check the heap and continue only if the heap was still valid - very useful for finding those memory corruption bugs, all too frequent occurrences in Mac OS 6-9 without protected memory and with such small amounts of memory space and memory corruption prone Handles.

    Good article - we use f for field and g for global, it definitely helps. We also prefix sg for local file globals to distinguish from program wide globals.

  • Jens Alfke writes:
    May 6th, 2007 at 5:00 PM

    Matt: You have a point that deprecation can be a problem. It’s probably worth turning off that particular warning in such projects.

    Todd: I actually try not to use NSEnumerators; they’re expensive. I iterate arrays by numeric index, and other collections by first getting an array of allValues or allKeys. Fortunately Leopard finally adds a “for( x in y)” that makes enumerating collections easy and fast!

    JMC: I don’t agree. The space savings are a few hundred bytes per file. And I find the “wrong” indentation level in files much less annoying than the “wrong” tab width, because the latter makes the indentation completely impossible to follow without reformatting. But let’s be gentlemen and agree to disagree :)

  • alex.r. writes:
    May 6th, 2007 at 6:39 PM

    Actually, good Hungarian notation is not really used to indicate the type of a variable but its intended use. Which is still pretty useful nowadays.

    See http://blogs.msdn.com/larryosterman/archive/2004/06/22/162629.aspx for a good summary.

  • Maciej Stachowiak writes:
    May 6th, 2007 at 6:45 PM

    I agree with most of this advice. One exception: if you find a function is complex enough to break into “paragraphs” with a comment before each one, you should consider factoring the paragraphs into separate, well-named functions. I’d do your example as:


    NSDictionary* headers = buildHeaderDictionary(input);

    NSData* serializedHeaders = rfc822Serialization(headers);

    bool ok = sendRequest(serializedHeaders, payload, &response, &error);
    if (ok)
        parseResponse(response);
    else
        handleError(error);

    Or something to that effect.

    The more you can make the code itself read like English, the less comments are needed. I usually reserve comments for explaining why the code is doing something, rather than what it’s doing, which I prefer to express through function and variable names. Explaining highly non-obvious algorithms usually merits a comment too (like references for a hash function or handwritten sorting algorithm).

  • Steve Miner writes:
    May 6th, 2007 at 7:13 PM

    Don’t use _underscore as your prefix. See Naming Methods

    Names of most private methods in the Cocoa frameworks have an underscore prefix (for example, _fooData ) to mark them as private. From this fact follow two recommendations.

    Don’t use the underscore character as a prefix for your private methods. Apple reserves this convention.
    If you are subclassing a large Cocoa framework class (such as NSView) and you want to be absolutely sure that your private methods have names different from those in the superclass, you can add your own prefix to your private methods. The prefix should be as unique as possible, perhaps one based on your company or project and of the form “XX_”. So if your project is called Byte Flogger, the prefix might be BF_addObject:
    Although the advice to give private method names a prefix might seem to contradict the earlier claim that methods exist in the namespace of their class, the intent here is different: to prevent unintentional overriding of superclass private methods.

  • Jens Alfke writes:
    May 6th, 2007 at 7:30 PM

    Maciej: I do try to keep methods short, but beyond a certain point I think it can diminish readability, since the reader has to skip between different methods to follow what’s going on; and also because a lot of state may need to be passed between those methods. As always, it’s a matter of taste where one chooses to stop in breaking things up.

    Steve: I was talking about instance variable names, not method names. Ivar names don’t have the same name-conflict issues at runtime because they’re not late-binding (they compile into integer offsets from ‘self’ instead of being looked up at runtime.)

  • Daniel Jalkut writes:
    May 6th, 2007 at 8:02 PM

    Ivar names don’t have the same name-conflict issues at runtime because they’re not late-binding (they compile into integer offsets from ’self’ instead of being looked up at runtime.)

    That’s a really good point, which I hadn’t really thought about. I am not sure if it will convince me to change my m-prefix for iv’s, but I breathe a sigh of relief about some of my existing classes that still have the _-prefixed iv names.

  • Peter Hosey writes:
    May 6th, 2007 at 8:15 PM

    And I find the “wrong” indentation level in files much less annoying than the “wrong” tab width, because the latter makes the indentation completely impossible to follow without reformatting.

    But a tab is always the right width, because it’s the viewer who sets it, not the author. Indenting with spaces will look wrong when you move the code to someone who uses more or fewer spaces than you do; tabs don’t have that problem.

    The problem comes when you use tabs to create columns. That’s wrong, because then the columns don’t line up when the tab width changes. That, I think, is where your objection originates. You should always use spaces to create columns.

    Indentation, however, is the proper use of a tab, and tabs are the proper way to indent.

  • Jens Alfke writes:
    May 6th, 2007 at 9:01 PM

    Peter: It’s not that simple. If the tab width is set to 8, as in all Unix-derived code (and all the Cocoa sources I’ve seen), then the editor uses a mixture of tabs and four spaces to get the 4-character indents. (Which is dumb, but blame the old Unix guys who started it, or more accurately the companies that made the terminals that had fixed 8-character tab stops.) So that code is going to look completely messed-up to someone with different tab settings. Because I work with some of that code, I keep my tabs set to 8.

    But if you have tabs set to 8 and you view code that uses 4-char tabs for indentation, the indentation level is 8 characters, which is pretty ridiculous looking and makes most normal code fall off the right edge of the window. So a tab is absolutely _not_ the right width for me.

    (Moreover, nearly all code I’ve seen uses extra spaces for indentation. Xcode does this for you in Obj-C code, to make the colons line up. That stuff looks really awful if you change the indentation width.)

    OK, I declare a moratorium on further arguing about tabs. But if you keep using them, I am not going to work on your code. ;-)

  • Blacktiger writes:
    May 6th, 2007 at 9:39 PM

    After reading a C++ style guide online, I’ve recently been experimenting with postfixing member variables with an underscore (aka var_). I don’t like doing that though… so then I started prefixing function parameters with an underscore (aka _var) and leaving member variables alone. The great thing about doing this is that you only need to do it at the implementation of a function. If you have an interface defined you can leave it alone.

  • Harvey Swik writes:
    May 6th, 2007 at 9:57 PM

    Jens: The content here is awesome (Especially Warn()) but it hurts my eyes to read it.

    Orange on Orange on Orange does not good contrast make. :-(

  • Graeme Mathieson writes:
    May 6th, 2007 at 11:40 PM

    Jens, it strikes me that your Log macro could be made a little more robust. My C is pretty rusty, but it’s described well here: Stupid C++ tricks: Adventures in Assert (OK in terms of assert instead of logging).


