13 September 2006 @ 08:31 am
Computer Language Geeking  
Raymond Chen writes today about when to mark a method as virtual in a C++ class. He makes some good points, which in turn could be read as a critique of object-oriented languages that make every function overridable.

I'll need to think on that a bit. One of those problems with language flexibility that we don't think enough about is the testability aspect - every bit of power a language grants you is likely to be abusable in some way, even if they're not all as bad as C++ :-)
(Anonymous) on September 14th, 2006 12:37 am (UTC)
(tdl posting anonymously)

I agree, but I think of it in terms of interfaces and contracts. Every piece of capability that a class exposes to other classes is part of its interface, and so each piece of capability comes with requirements and invariants that need to be well-understood and documented. If class B derives from A, then B is a client of A, though it's almost never described that way. So B, as a client, must fulfill all of the invariants that A has (we hope) made clear.

Method-level overridability (via virtual) is one such capability, and--as the original post mentions--it frequently comes with invariants that A better mention, and B better respect. But that's true of all of A's interface, not just virtual. "Protected" is another part of a class's interface that comes with invariants that derived classes must respect.

For six months last year I had the chance to work in Objective-C, for my actual job. (It's always been C++ otherwise.) Obj-C is, of course, a language in which every method is overridable. What's more, every class is introspectable, which means you can dig through and see what methods exist. And--best of all--there's no public/private.

So if you were psychotic, you could dig through the core Cocoa classes to find out what's there. Later, when you're up against a deadline and you realize that some Cocoa class isn't parameterized properly (as is often the case) and you can't override its behavior in the way you want, and suddenly you remember seeing a method called __SliderPreUpdateInternal, which was clearly internal and not part of the interface, but you can *guess* what it does, and you try overriding it and hey! it seems to work...

People did this. It's an Obj-C way-of-life sort of thing. I always found a better way because I'm prissy and not nearly pragmatic enough. But that means I get to be haughty and look down my nose at people who violate encapsulation because the language is "flexible" and lets them get away with it.

(Anonymous) on September 14th, 2006 01:10 am (UTC)
(tdl posting anonymously again, because he likes to hear himself type and because noah made the mistake of geeking about languages)

Something I should have mentioned, in the context of testability: I mentioned invariants that a class A must document, and B must respect. Ideally, A would not only document, but also enforce those invariants when possible. At runtime at the very least, or at compile time for extra credit. (Sometimes a contract can't be enforced programmatically. Thread locks are a good example. But it's rare.)
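
To make the runtime-versus-compile-time distinction concrete, here is a minimal sketch. The names `Point3`, `componentAt`, and `component` are made up for illustration, and `static_assert` assumes a C++11 or later compiler, which is newer than this thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical three-component point used only for this illustration.
using Point3 = std::array<double, 3>;

// Runtime enforcement: the index is only known when the program runs,
// so the check is an assertion (active unless NDEBUG is defined).
inline double componentAt(const Point3& p, std::size_t i) {
    assert(i < 3);
    return p[i];
}

// Compile-time enforcement: the index is a template argument, so an
// out-of-range value such as component<3>(p) fails to compile.
template <std::size_t I>
double component(const Point3& p) {
    static_assert(I < 3, "component index must be 0, 1, or 2");
    return p[I];
}
```

The compile-time form only works when the caller knows the index statically, which is one reason runtime checks remain the common case.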

Adequately documenting and enforcing A's contract adds a burden to A's author, and so it's in A's author's interest to keep A's contract as simple as possible. Raymond's post says, in effect, that "virtual" complicates a class's contract and so should be used carefully. I totally agree.
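
One common way to keep "virtual" from widening the contract is to make the public method non-virtual and expose a single narrow hook for derived classes. The sketch below is only an illustration of that idea, not anything from Raymond's post; `Slider`, `CountingSlider`, `setValue`, and `onValueChanged` are hypothetical names, and the 0 to 100 range is an assumption.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical base class: the public method is non-virtual and owns the
// contract; derived classes customize only through one virtual hook.
class Slider {
public:
    virtual ~Slider() = default;

    // Non-virtual, so every caller goes through the same range check.
    // Contract: v must be in [0, 100]; otherwise std::out_of_range is thrown.
    void setValue(int v) {
        if (v < 0 || v > 100) {
            throw std::out_of_range("Slider value must be in [0, 100]");
        }
        value_ = v;
        onValueChanged(v);  // the one documented extension point
    }

    int value() const { return value_; }

private:
    // Derived classes may override this, but cannot bypass the check above.
    virtual void onValueChanged(int /*newValue*/) {}

    int value_ = 0;
};

// Hypothetical derived class, i.e. a client of Slider: counts notifications.
class CountingSlider : public Slider {
public:
    int changes() const { return changes_; }

private:
    void onValueChanged(int) override { ++changes_; }

    int changes_ = 0;
};
```

With this shape, the only thing A has to document for B is when the hook is called and what it may assume, which is a much smaller contract than "any method may be replaced".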

A language that *only* has virtual methods prevents us from simplifying our contracts. Languages without protected/private further hamper that ability.

In general, I'm a big fan of language features that let me rein in a class's interface, to put a boundary around it and make statements about what things are possible and which are not. This goes against arbitrary open-ended flexibility, but it assists enormously with testability and stability. If I have to choose between hypothetical surprising future uses of my code and stability, I'll choose stability any day.

Noahangelbob on September 14th, 2006 02:23 pm (UTC)
This makes a lot of sense and seems like a good way to think of it. I do a fair amount of verification of my 'contract' for APIs, mostly using assertions (compile-time is usually much harder to verify). I keep thinking about problems like these in terms of having multiple program modes, roughly analogous to debug versus release... But of course, the more of those you have, the harder it is to test reasonably. And the more different you make a single debug mode from release mode, the more potential you have for one-mode-only bugs, which are bad news.

(Anonymous) on September 14th, 2006 03:00 pm (UTC)
Yeah, I tend to scatter gobs of assertions, some of which are only enforced in a non-release build, and some of which are enforced all the time. (Using two different assert-like macros.) I used to work in code in which every single statement was inner-loop, and so basically all of my assertions were debug-only. More recently I've worked in interactive apps where a lot of high-level interfaces aren't inner loop, so nowadays most of my assertions are always-on, which helps fight the one-mode-only problem.
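
A pair of assert-like macros along these lines might look roughly like the following. This is a sketch under assumptions: the names `ALWAYS_ASSERT` and `DEBUG_ASSERT` and the `checkedDivide` example are invented here, and the real macros in any given codebase would likely report failures differently.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Always-on check: stays active in release builds. On failure it prints
// the location and the failed expression, then aborts.
#define ALWAYS_ASSERT(cond)                                       \
    do {                                                          \
        if (!(cond)) {                                            \
            std::fprintf(stderr, "%s:%d: check failed: %s\n",     \
                         __FILE__, __LINE__, #cond);              \
            std::abort();                                         \
        }                                                         \
    } while (0)

// Debug-only check: forwards to the standard assert, which compiles to
// nothing when NDEBUG is defined.
#define DEBUG_ASSERT(cond) assert(cond)

// Hypothetical function using both kinds of check.
inline int checkedDivide(int numerator, int denominator) {
    // Cheap and outside any inner loop, so it is checked in every build.
    ALWAYS_ASSERT(denominator != 0);
    int quotient = numerator / denominator;
    // Internal sanity check on the result, checked only in debug builds.
    DEBUG_ASSERT(quotient * denominator + numerator % denominator == numerator);
    return quotient;
}
```

Keeping the two macros visibly distinct at the call site makes it easy to see, and later change, which checks survive into release.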

In theory you can later convert an always-on assertion to a debug-only assertion, but it took me a while to realize that this choice should actually percolate out to the documentation for your class. For every prerequisite that you document on a class (eg, "index must be >= 0 and < size"), you need to mention the consequence of violation. Does it post a fatal error? Warning? Exception? Different consequences mean different usage patterns for the client.

So how do you document a prerequisite that you only enforce in debug builds? The client can't count on your enforcing it, so it should be documented that a violation will lead to "undefined behavior". That's the sort of thing that leads to huge problems in a public API, so I almost never use debug-only assertions when testing public API prerequisites.

In other words, debug-only assertions are excellent for testing my own internal invariants, but when an assertion is part of the public "contract", I use the appropriate exception or error-posting mechanism, and avoid debug-only tests like the plague.
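
As a rough illustration of that split, here is a sketch in which the public precondition is enforced with an exception in every build, while a private invariant is checked with a debug-only assertion. The `History` class and its members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical class: public contract checks are always on, internal
// invariant checks are debug-only.
class History {
public:
    // Public contract: index must be < size().
    // Consequence of violation: throws std::out_of_range in every build.
    int entry(std::size_t index) const {
        if (index >= items_.size()) {
            throw std::out_of_range("History::entry: index out of range");
        }
        return items_[index];
    }

    void push(int value) {
        items_.push_back(value);
        ++pushCount_;
        checkInvariant();
    }

    std::size_t size() const { return items_.size(); }

private:
    // Internal invariant, not part of the public contract, so a debug-only
    // assert is enough: clients cannot violate it, only this class can.
    void checkInvariant() const {
        assert(pushCount_ == items_.size());
    }

    std::vector<int> items_;
    std::size_t pushCount_ = 0;
};
```

The documented consequence ("throws `std::out_of_range`") is the same in debug and release, so clients can rely on it.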

The rare exception is something like operator[] on a Vec3d class. That's always gonna be inner-loop, so I bite the bullet and say "If index is < 0 or >= 3, undefined behavior occurs."
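
For the inner-loop case, a sketch might look like this. The layout of `Vec3d` and the `dot` helper are assumptions made for the example; the point is only that the precondition is documented as undefined behavior and checked solely in debug builds.

```cpp
#include <cassert>

// Hypothetical Vec3d with an inner-loop-friendly operator[].
struct Vec3d {
    double v[3];

    // Precondition: 0 <= i < 3. If index is < 0 or >= 3, undefined
    // behavior occurs. The assert is present only when NDEBUG is not
    // defined, so release builds pay nothing for the check.
    double& operator[](int i) {
        assert(i >= 0 && i < 3);
        return v[i];
    }

    double operator[](int i) const {
        assert(i >= 0 && i < 3);
        return v[i];
    }
};

// Typical inner-loop use: a dot product.
inline double dot(const Vec3d& a, const Vec3d& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```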


Noahangelbob on September 14th, 2006 03:53 pm (UTC)
Again, makes sense. I'm less allergic to saying "undefined behavior" than you are, but that may be a difference in what interfaces we're used to using. The C standard library, for instance, actually has a *lot* of undefined behavior: freeing the same pointer twice, say, or almost any pointer-y stuff. So there's already a lot of "no, seriously, just don't do that" :-)

But yeah, I keep the vast majority of my assertions on at all times, for similar reasons.