"Computers are getting smarter all the time. Scientists tell us that soon they will be able to talk to us. And by 'they', I mean 'computers'. I doubt scientists will ever be able to talk to us."
- Dave Barry
Dealing with uninitialized memory
Friday, June 18, 2010

A common source of nondeterministic, hard-to-find bugs is uninitialized variables. In a local scope the compiler typically detects this, but for member variables in a class you're typically on your own. You forget to initialize a bool and in 255 cases out of 256 you get 'true', so the code might appear to work if that's a reasonable initial value. Until your bool ends up at a memory address with a zero in that byte.

One way to deal with this problem is to simply initialize all memory to zero first thing in the constructor. Unfortunately, that adds both a runtime cost and a code size cost. I figured someone must have attacked this problem before, and I'm sure someone has, but googling for it I haven't found much. So I came up with my own approach, which I'll be adding to Framework4. Basically it's a small class and a set of macros that simplify things. In my class definition I just tag the range of variables I want to check, typically all of them.

class MyClass
{
public:
    MyClass();
    ...
private:
    CLASS_BEGIN()
    // Variables go here
    CLASS_END()
};

This inserts a start and an end tag into the class, which are basically two uint32 values.
In the constructor I do this:

MyClass::MyClass() : CLASS_INIT()
{
    ...
}

CLASS_INIT() will initialize all memory between the tags to 0xBAADC0DE, before any other initialization happens.

At some point where I expect everything to be set up, for instance the end of the constructor or maybe after some call to Init() or whatever, I simply add CLASS_VERIFY(), which checks that no memory is left as 0xBAADC0DE. It also checks the start and end tags to make sure they have not been touched, which will detect common out-of-range writes as well.
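
The post doesn't show what the macros expand to, so here is only a rough guess at what a debug-build expansion could look like. Everything in the memcheck namespace, the tag values, the member names m_classBegin and m_classEnd, and the two example classes are assumptions made up for this sketch, not the Framework4 code. It relies on members being initialized in declaration order, so the begin tag's initializer runs before any of the tagged members are set.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers; names and tag values are assumptions for this sketch.
namespace memcheck {

const std::uint32_t kFill     = 0xBAADC0DE; // "never written" pattern from the post
const std::uint32_t kBeginTag = 0x7A6B5C4D; // arbitrary guard value
const std::uint32_t kEndTag   = 0x1E2F3A4B; // arbitrary guard value

// Paints every 32-bit word between the two tags with kFill, writes the end
// tag, and returns the begin tag so the call can act as a member initializer.
// Both tags are 4-byte aligned, so the span is a whole number of words.
inline std::uint32_t Fill(std::uint32_t* begin, std::uint32_t* end)
{
    unsigned char* p = reinterpret_cast<unsigned char*>(begin + 1);
    unsigned char* const stop = reinterpret_cast<unsigned char*>(end);
    for (; p < stop; p += sizeof(kFill))
        std::memcpy(p, &kFill, sizeof(kFill));
    *end = kEndTag;
    return kBeginTag;
}

// Returns true if both tags are intact and no word still holds kFill.
inline bool Verify(const std::uint32_t* begin, const std::uint32_t* end)
{
    if (*begin != kBeginTag || *end != kEndTag)
        return false;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(begin + 1);
    const unsigned char* const stop = reinterpret_cast<const unsigned char*>(end);
    for (; p < stop; p += sizeof(kFill)) {
        std::uint32_t word;
        std::memcpy(&word, p, sizeof(word)); // avoids reading members through the wrong type
        if (word == kFill)
            return false;
    }
    return true;
}

} // namespace memcheck

// Assumed debug-build expansions of the macros named in the post.
#define CLASS_BEGIN()  std::uint32_t m_classBegin;
#define CLASS_END()    std::uint32_t m_classEnd;
#define CLASS_INIT()   m_classBegin(memcheck::Fill(&m_classBegin, &m_classEnd))
#define CLASS_VERIFY() assert(memcheck::Verify(&m_classBegin, &m_classEnd))

// Example class that initializes everything it declares.
class Complete
{
public:
    Complete() : CLASS_INIT(), m_count(0), m_scale(1.0f) { CLASS_VERIFY(); }
    bool Check() const { return memcheck::Verify(&m_classBegin, &m_classEnd); }
private:
    CLASS_BEGIN()
    std::int32_t m_count;
    float m_scale;
    CLASS_END()
};

// Example class that forgets m_scale; Check() reports it.
class Forgetful
{
public:
    Forgetful() : CLASS_INIT(), m_count(0) {}
    bool Check() const { return memcheck::Verify(&m_classBegin, &m_classEnd); }
private:
    CLASS_BEGIN()
    std::int32_t m_count;
    float m_scale;
    CLASS_END()
};
```

Two caveats with this sketch: padding words between members (for instance in front of a double or a 64-bit pointer) are never written by anyone, so they would be reported as uninitialized, and the final-build variant where the macros expand to nothing is left out because the post doesn't say how that is arranged.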

Adding this to my simple test app I found a whole bunch of variables I didn't initialize. Most of them were simply unused though. I can only imagine how many bugs could be avoided if something like this were added to the main system classes in a large-scale project. And best of all, this comes at no extra cost at runtime because in final builds those macros are defined to nothing.

Will F
Friday, June 18, 2010

Neat technique, but one question: what do CLASS_BEGIN() and CLASS_END() expand to? I'm not clear how you'd convert those macros to a range of variables to initialize without adding new members to the class (which seems like an undesirable side effect)

Will F
Friday, June 18, 2010

Ugh, never mind; I fail at reading comprehension

Yury
Friday, June 18, 2010

Does it really work? Consider the following example:

CLASS_BEGIN();
u8 byte0;
u8 byte1;
u8 byte2;
u8 byte3;
CLASS_END();

MyClass::MyClass()
: byte0(0)
, byte1(0)
// oops, missed one
, byte3(0)
{
CLASS_VERIFY();
}

The verification function will look for 0xBAADC0DE but won't find it, since three of the four bytes were changed. At the same time, one variable wasn't initialized at all.

Yes, byte2 will have a deterministic value, though it will vary from build to build as the class description changes.

fmoreira
Friday, June 18, 2010

@Yury: I think it was just a way for Humus to illustrate what he means. You could define a 1-byte code to initialize all the data.

The main problem here is that 0xBAADC0DE (or any other code) can actually be valid data for some variables, so you could get false positives.
But don't get me wrong, this approach is very clever!

What about static code analyzers? Shouldn't these kinds of tools check for this type of flaw?

Arseny Kapoulkine
Friday, June 18, 2010

There is one thing about class layouts that drives me crazy, and that is changing your memory layout between builds. This leads to all sorts of weird "unreproducible" bugs.

Some compilers (gcc) allow you to declare a zero-size array member, though that's not portable.

Anyway, why do you need those markers? If that's a debugging feature, just work on the whole class.

Mike
Friday, June 18, 2010

Is there any reason why you are not handling the initialisation of memory to a known marker value in the memory allocator?
It seems to me this would be consistent with C++ ways of working (no risk of overwriting vtable pointers etc.) and would not require the CLASS_INIT macro. Of course I probably have only part of the picture; maybe you use placement new, for example ...
Mike

sqrt[-1]
Friday, June 18, 2010

Lots of suggested ways of finding uninitialised member variables here:
http://stackoverflow.com/questions/2099692/easy-way-find-uninitialized-member-variables

from compiler flags in gcc (and VC team systems) to external tools.

Humus
Friday, June 18, 2010

Yury, yes that's a problem. It will only detect at a 4-byte granularity. If you have a bunch of chars or bools it will only detect the case where all four are uninitialized. You could potentially make special cases for those though. Like a range of bools where all are initialized to something other than 0 or 1 and then checked on a byte level. For chars it's trickier since no particular value can be considered "invalid". And bitfields are pretty much hopeless.
Anyway, even if it doesn't catch everything, it catches many. And I suppose once an ASSERT() triggers you'd better look over all the variables in the class.
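
The special case for a range of bools could be sketched roughly like this. The names FillBools, BoolsInitialized, kBoolFill and the Flags struct are made up for illustration, and the sketch assumes one-byte bools that are stored as 0 or 1, which holds on common compilers but is not guaranteed by the language.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumption of this sketch: a bool occupies exactly one byte.
static_assert(sizeof(bool) == 1, "sketch assumes one-byte bools");

// Hypothetical fill byte; any value other than 0 or 1 would do.
const unsigned char kBoolFill = 0xBD;

// Paints a range of bools with a byte that no properly written bool holds.
inline void FillBools(void* first, std::size_t count)
{
    std::memset(first, kBoolFill, count);
}

// Returns true if every byte in the range is 0 or 1, that is, every bool
// in the range has been written since FillBools() was called.
inline bool BoolsInitialized(const void* first, std::size_t count)
{
    // Inspect raw bytes; reading a painted byte as a bool would be invalid.
    const unsigned char* bytes = static_cast<const unsigned char*>(first);
    for (std::size_t i = 0; i < count; ++i)
        if (bytes[i] > 1)
            return false;
    return true;
}

// Example range: four bools, like the four bytes in Yury's example.
struct Flags
{
    bool visible;
    bool enabled;
    bool dirty;
    bool locked;
};
```

With this, setting three of the four flags after FillBools() still fails the check, which is the case the 4-byte comparison misses.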
