Defensive Coding in Objective-C

By uliwitness

When programming in a C-descended language like Objective C, there are many things that can easily go wrong. To avoid the worst of these errors, programmers have come up with various coding conventions that make it harder to cause such bugs. We’re not talking about indentation or spacing, but rather about “mini-patterns” that ensure certain errors are caught more easily. Here’s my spontaneous, certainly not exhaustive list:

Autorelease Early, Autorelease Often

When you allocate a new object that doesn’t immediately go into an instance variable, it is easy to forget to release that object and leak it. Even if you remember to call -release on it at the end of your method, someone might later add a return statement somewhere, and overlook there’s an object in need of releasing.

One way to fix this is to use goto and a bail: label to cause all exits from your method to go through one funnel point that releases everything again. Kind of a “dealloc method for your method”. goto is not inherently bad (that’s just a rumour brought about by a misinterpretation of the title of a paper by Mr. Dijkstra). That said, the code quickly becomes hairy if you have many different error exits from your method.

An easier way to fix this is to just remember to -autorelease the object right after you create it. That way, at the moment of creation, where it is glaringly obvious there’s an object in need of later cleanup (based on alloc/init or copy in the name), you already ensure you’re not leaking. If someone needs it later, they can always retain it explicitly. Leak-free code for free, and even for people with the attention span of a goldfish (Or poisson rouge, as my favorite leak-hunting colleague would say).

NIL Everything That Isn’t Bolted Down

The problem with C is that local variables are not initialized to zero. Nor are pointers to released objects cleared to nil. No, local variables contain arbitrary numbers that happened to be on the stack when they were created, and variables valiantly keep pointing at the spot that used to house your NSString long after you’ve released it. “There! Look! There’s a big bunch of nothing here that seems to be an NSString!”

A good way to avoid spending hours trying to track down dangling pointers is to set them to nil whenever they contain nothing. Every time you declare a pointer like

	NSString* myString;

stop yourself and instead initialize it properly

	NSString* myString = nil;

You’ll be grateful you did that the moment someone adds an if statement around a few lines that used to assign a value to this variable.

The same applies to objects that you dispose of. The moment you dispose of an object, set the variable that used to point to it to nil (be it an instance variable, a global, or just a local one). Again, in a complex function, someone might insert an if statement that releases your object, and miss that under certain conditions, the code you wrote still tries to access that object. When you later debug that code, nil will make it obvious the object is gone. On the other hand, any old pointer, probably still pointing at valid-looking remnants of the object that used to be there, will not obviously be invalid to you.

I’ve defined myself a DESTROY() macro like GNUstep has it to help with this. DESTROY() first releases an object, then sets its variable to nil. But I only write DESTROY(myVar);.

Don’t Use Accessors in Constructors or Destructors

This may sound a bit odd, but there is a reason to this madness. Constructors (i.e. -init methods in ObjC) and destructors (i.e. -dealloc or -finalize) are special methods: They are called before your object has fully been initialized, or may be called after it has already been partially torn down.

If someone subclasses your class, your object is still an object of that subclass. So, by the time your -dealloc method is called, the subclass has already been asked to do its -dealloc, and most of the instance variables are gone. If you now call an accessor, and that accessor does anything more than change the instance variable (e.g. send out notifications to interested parties), it might pass a pointer to its half-destructed zombie self to those other objects, or make decisions based on half-valid object state. The same applies to the constructor, but of course in reverse.

Now, some people say that accessors should not be doing anything but change instance variables, but that is utter nonsense. If that was all they’re supposed to do, we wouldn’t need them. Accessors are supposed to maintain encapsulation. They’re supposed to insulate you from the internal details of how a particular object does its work, so you can easily revise the object to work in a completely different way, without anyone on the outside noticing. If an accessor could only change an instance variable, you would have very limited means to change this internal implementation.

Moreover, I don’t think Apple would have introduced Key-Value-Coding and Key-Value-Observing if they didn’t agree at least partially that it’s fine to do a bunch of things in response to an accessor being used to mutate an object.

Mind you, all of this only applies to accessors being called on self from your constructor. If you’re setting up another object, you essentially have no choice but to use its accessors, and it would very often violate encapsulation if you did otherwise.

In Fact, Don’t Do Anything Big in Constructors and Destructors

The above rule can actually be made more generic: Whenever you do anything in a constructor or a destructor, try to think whether you really need to do it here and now. They’re mainly there to manage your instance variables. If you have to register for notifications or otherwise access external objects, it’s always safer to do it elsewhere, when you can be sure that your object has been completely constructed.

A neat trick in constructors is to call -performSelector:withObject:afterDelay:0 on yourself. This will ensure a method to initialize stuff gets called on your object the next time through the event loop. Of course, for many objects that opens yet another can of worms (imagine you’d just created an NSScanner and had to wait for the event loop to run once before you can use it!).

Another thing that sometimes works is to access external objects lazily. E.g. the first time someone calls any of the -scanXXX methods on an NSScanner, it could transparently and implicitly do some more involved setup and set a flag that this setup has happened.

I have a similar recommendation for destructors: You should try to close files or relinquish external resources explicitly, before your object is released, if you can. There’s nothing wrong in having code in your constructor that makes sure of this as well (i.e. to avoid leaking open file descriptors), but it is desirable to have that as a fallback, not as the preferred API.

Now, before you go all “goto considered harmful” on me: I’m not saying doing worthwhile things in constructors in destructors was bad. Rather, all I’m saying is that other options for good places to do it are often overlooked. Both because the whole matter of half-constructed/destructed objects can get hairy, and also because anyone else can retain your object and thus prevent your resource from going away.

If the object is by itself the resource, that is exactly what you want. It is what retain/release was designed for, after all. But if the resource simply represents a file or a hardware device, and someone deletes the file or unplugs the device, you must be able to cope gracefully with your object still existing because some nit retains it, even though the actual resource is gone.

And if you want to call methods that a subclass would want to override (and in Objective-C, there is no such thing as a “method that can not be overridden”, by design!), you’d prefer to have a fully-initialized object ready.

Follow Apple’s Singleton Design Pattern

There is a nice little example implementation of the Singleton design pattern on Apple’s developer web site. Implement it.

While I think the -retain/-release methods should actually be left alone so you get some decent crashes and notice when someone retains/releases a singleton the wrong number of times (retaining or releasing should be allowed on any object, even if just to make it easy to keep certain code agnostic of the precise type it’s dealing with, so we can’t just make it throw an exception), they got a lot of details right:

They don’t wait for -init to return to set the global singleton variable. After all, singletons can be subclassed, too (such a subclass usually gets instantiated instead of the superclass, as just like on Highlander, there can only be one). If any of the -init methods do anything that might trigger code that might in turn call your +sharedManager method (like, I don’t know, register for IOKit notifications and send NSNotifications when they come in), this would invite endless recursion. Since the singleton global hasn’t been set yet, that second call would create a second singleton instance, which would in turn trigger the notifications, which would in turn create a third singleton … and so on.

What Apple’s code does is to cleverly override +alloc to set the global variable. That way, it is already set before anyone ever gets around to doing anything with the object. They also have a lock on the class. So, even in a multi-threaded implementation, they only allocate the object once. Since they return nil on subsequent attempts to alloc the object, they also only ever allocate and init one object.

It’s a very solid implementation, and whenever I’ve taken a shortcut on this in the past (and the code on my site will show you I have been doing this this until fairly recently), it’s caused me pain. I’m glad I finally understand it now.

Clear Your Weak References

One convention in Cocoa is that you don’t retain your delegate. This kind of “weak” reference from the delegator to the delegate may seem dangerous at first, but makes complete sense in the common use case:

Usually, a controller object creates another object and retains it, and sets itself up as that object’s delegate to be able to modify or benefit from its behaviour. For example, an NSWindowController creates an NSWindow, becomes its owner by retaining it, and makes itself the delegate of that window.

Now, if the NSWindow retained its delegate, if it retained the NSWindowController, we would have a retain circle: When the NSWindowController is released by the last external party, it would still have a retain count of 1, because the NSWindow would have its delegate retained. However, the NSWindow would also have a retain count of 1, because the NSWindowController created the NSWindow and kept it retained. So both are waiting for the other to release them. Only then would their -dealloc methods get called, which would release the other one. They’d be like two lovers lost in space, separate from everyone else, but closely holding on to each other.

So, the rule was laid down: You don’t retain your delegate, as the delegate probably already owns you. But what happens if someone else retains the object you own? Your NSWindowController is released, it relinquishes its hold on the NSWindow by releasing it. But that other guy still has the NSWindow retained, so it stays open. Someone clicks your window and a delegate method is called.

Wait a second! The NSWindowController was the delegate! But it is gone!? Well, unless the NSWindowController was considerate enough to tell the NSWindow, by calling its -setDelegate: method and setting it to nil, NSWindow wouldn’t know. It would find itself yelling at a dead object, probably crashing.

So what have we learned? Unless you’re a fan of zombies, you’ll appreciate setting any weak references to yourself to nil in your -dealloc method.

In case you’re wondering who might be mad enough to retain your objects, look no farther than the deferred method call mechanisms, particularly NSTimer, NSThread, NSInvocation and the -performSelector:... family of methods that eventually end up with your NSRunLoop.

Of course, you can go and invalidate the timer, cancel the -performSelector:s, and in many cases you well should, but in other cases, you may actually want all of these operations to be performed on the object before it goes away (though maybe not in our example of an NSWindow). And of course this isn’t really a good example, because a good design would probably not install timers and the likes on objects but themselves (that usually violates encapsulation, after all). Then again, with NSInvocation you have no choice.

Use symbolic constants

Cocoa and Foundation make use of string constants for identification purposes a lot. Notification names, keys in NSDictionary objects. You also use them elsewhere, to refer to files on disk, processes using their bundle identifier etc.

Now, everyone knows that defining a string constant as a symbolic constant using #define or by defining it as a variable at global scope makes it easier to change this string later. Particularly if the string is used in several places. But often, people “know” that this constant will never need to be changed, so they just hard-code it. Bad idea. There are other advantages to a symbolic constant:

The Compiler knows about symbolic constants.

That is right. That means that, should you mistype the symbolic constant, the compiler will only see an unknown identifier. If you mistype a regular string constant, all the compiler sees is a string. A compiler has no idea that “MyPrettyColor” and “MyPrettyColour” are supposed to be the same thing, but one of them is obviously a mis-spelling. If you had defined a symbolic constant like

#define MyPrettyColour    "MyPrettyColour"

It would compile to the exact same code as using the pure string constant, but if you mistype MyPrettyColour as MyPrettyColor, the compiler would immediately tell you about that and you wouldn’t wonder why a dictionary value always gets returned as nil even though you know for certain you put it in the dictionary.

This applies similarly to any other kind of constant, be it an int, a double or whatever. It’s easy to hit 111111 when you meant to write 11111, and that excess digit is not always easily noticed, as our mind tends to “correct” what we see as it tries to make sense of it. If you define a symbolic constant, the compiler will catch any additional letter you type by accident. Even better, you can define the constant correctly. Tend to forget the U at the end of unsigned numbers? The constant will always contain it, you only have to think of it once. If you forgot it, you can simply add it, and all other spots that use the constant are magically fixed.

And last but not least, symbolic constants can improve readability. Imagine drawing code where you deal with margins, line widths etc. Now in one spot you draw a button, and in another you hit-test it. To do hit-testing, you inset your rectangle. What do you think is more maintainable?

buttonHotRect = NSInsetRect( [self bounds], 67, 42 );

or

buttonHotRect = NSInsetRect( [self bounds],
                             MyLeftMarginWidth + MyLineWidth + MyLineWidth
                             + MyRightShadowWidth + MyRightMarginWidth,
                             MyTopMarginHeight + MyLineHeight + MyLineHeight
                             +MyBottomShadowHeight + MyBottomMarginHeight );

If you ever change the drawing, how likely is that you’ll recall what separate numbers 67 or 42 were made up of? Any compiler worth its salt will fold the numeric constants and thus generate the same code for both of them. There is no reason to not go for readable code.

Closing words

Reading this, you may think I should just stop writing thoughtless or bad code instead of doing things like these to mask the issues. But the matter of the fact is: Everyone has a bad day, everyone makes a mistake. Particularly when you program in teams and you’re programming all week long and there are deadlines when you have to ship, the likelihood of mistakes increases.

And even if you’re not working in a team, remember the Zarra description of programming alone: You’re programming in a team of three people. Past You, who was a moron, Present You, who is average, and Future You, who is a genius. You’re already being annoyed by Past You‘s lack of skill, you don’t want to make it any harder on Future You.

Following the rules in this article will make many bugs more obvious while you’re debugging them, and will prevent many crashes from happening in the first place.

So, do you guys have any neat coding tricks to share that I forgot to mention?

4 Comments Leave a comment

  1. Hey there,

    Just kind of getting to grips with obj-c and coding in general. Was working through a little programming exercise (in loops) when I didn’t initialise a variable, but was wanting to sum an integer that I’d requested from the user. I used the ‘% 10′ trick to grab the last number entered, place it in a variable, then continue through the loop, summing as the loop iterated. The sum came out very large indeed sometimes, despite only a small number being entered by the user. I thought that it might have been translating it to hex or something, but that wasn’t it. Then I noticed, when putting in a NSLog command every loop to see what the variable was doing, that the sum variable was incrementing as expected, but it was starting off with some spurious value. Once I’d noticed this, I initialised the variable at 0, and the problem was cured. If I didn’t initialise the variable at 0, the number that it was being intialised by xcode for appeared to be 32767. Is this a similar thing you are talking about with initialising strings at NIL? Where does that number come from?

    Thanks

  2. There is no “empty” variable. As such, whenever you declare a variable, it just contains whatever random value was at that position in memory before. This can be a different value each time, or the same (arbitrary) value each time, or it can even be zero.

    You can only get a predictable value in a variable by explicitly assigning a value to the variable, or by using an API call that clears the memory for you (e.g. when you alloc/init an Objective C object, all its instance variables get initialized to zero by alloc for you, or if you use the calloc() function, the memory that it gives you is cleared to zeroes).

  3. Thank you for your reply – and clear explanation.

    Ian

  4. Wow what a great post. Just what I was looking for – how to write proper code. Do you have more of this great stuff?
    Luda

Share your thoughts