Memory

Accessing memory is one of the simplest operations in programming: fetching the data to be operated on. You would think an operation this common and trivial, central to the very basics of programming, would be second nature. That it is instead the single largest cause of programmer error and security breaches, as well as the single largest sink of programmer time in the debugger, goes to show just how far awry the C++ language has gone.

The main symptoms of this category of error are:

  1. Access violations (Windows) or segfaults (*nix)
  2. Random memory corruption ("Trampling")

While this type of error can sometimes be deduced easily, it can also cause bugs that only reveal themselves every third blue moon, in Release mode only, on end-user computers to which the development team has no access. This page aims to document some of the many, many ways this bug can occur. It aims to be near-exhaustive, although it will almost certainly never be complete.

Accessing Invalid Memory

Using Uninitialized Variables

Two major causes of accessing invalid memory are:

  1. Using an uninitialized pointer (points to somewhere random in memory)
  2. Using an uninitialized index (has some random index which is probably far, far beyond the bounds of the associated array)
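
As a minimal sketch of the usual defence, assuming C++11 or later: give every pointer and index a known value at the point of declaration, so a forgotten assignment yields a detectable state (a null pointer, an index equal to the size) rather than a random address. The function names here are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a pointer to the first negative element, or nullptr if there is none.
// The pointer is initialized at declaration, so every path returns a
// well-defined value even when the loop never assigns to it.
const int* find_first_negative(const std::vector<int>& values) {
    const int* result = nullptr;            // never left uninitialized
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < 0) {
            result = &values[i];
            break;
        }
    }
    assert(result == nullptr || *result < 0);
    return result;
}

// Returns the index of the first negative element, or values.size() if none.
std::size_t index_of_first_negative(const std::vector<int>& values) {
    std::size_t index = values.size();      // known "not found" value, not garbage
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < 0) {
            index = i;
            break;
        }
    }
    return index;
}
```

Callers can then test the result (against nullptr, or against size()) before using it, which is impossible to do reliably with an uninitialized value.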

Accessing Beyond Array/Container Bounds (buffer overflow)

Access without a bounds check

This is a common security hole that attackers exploit in just about any kind of code. It is caused by assuming a given index is in range rather than checking, even when that index depends on user input. For example, strcpy()ing a user-supplied string into a fixed-length buffer can be abused by an attacker who provides an oversized string, causing a buffer overflow that may result in crashes (simple DoS attacks) or even code injection (a vantage point from which to install a virus, back door, or other malicious code). The solution is distressingly simple for most situations: check your bounds. For example, strcpy() has an alternative, strncpy(), which takes the maximum number of characters to copy as an extra parameter. Note that strncpy() does not write a terminating NUL when the source is at least that long, so you must add one yourself. Better yet, use std::string, which keeps track of its own size and automatically grows its buffer as needed.
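
The two options above might look roughly like this. This is only a sketch: copy_bounded and copy_growing are hypothetical helper names, not standard functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Copies src into dest (capacity dest_size bytes), truncating if necessary.
// Returns true if the whole string fit, false if it had to be truncated.
bool copy_bounded(char* dest, std::size_t dest_size, const char* src) {
    assert(dest != nullptr && src != nullptr);
    if (dest_size == 0) {
        return false;                        // no room even for the terminator
    }
    std::strncpy(dest, src, dest_size - 1);  // never writes past dest[dest_size - 2]
    dest[dest_size - 1] = '\0';              // strncpy does not terminate on truncation
    return std::strlen(src) < dest_size;
}

// The std::string alternative: the string sizes and grows its own buffer.
std::string copy_growing(const char* src) {
    return std::string(src);
}
```

With an 8-byte buffer, copy_bounded keeps the first 7 characters of an oversized input and reports the truncation, while copy_growing simply holds the whole input.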

A more blatant example would be telling the server you shot player #31 in a game with only 29 players.
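
A server-side check for that case could be sketched as follows. The handler name record_hit and the hits_taken vector are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical server-side handler: records a hit on player `target`.
// Returns false (and changes nothing) if the client sent an index that
// does not correspond to a player in this game.
bool record_hit(std::vector<int>& hits_taken, std::size_t target) {
    if (target >= hits_taken.size()) {
        return false;                 // e.g. player #31 in a 29-player game
    }
    ++hits_taken[target];
    return true;
}
```

The point is that the index arrives from an untrusted client, so it is compared against the container's own size before it is ever used as a subscript.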

Actual size out of sync with expected size for bounds check (improper bounds check)

This is another common security hole. It is caused either by an explicit user-supplied size that is erroneous and used inconsistently (say, allocating based on that number, but remaining open to reading beyond that buffer length by depending on, say, there being a null at the end of the input; see "Access without a bounds check" above), or by storing a copy of a buffer or array length for whatever reason and then reusing it even after the buffer has been resized.

The Microsoft JPEG vulnerability would be an example of this, I believe: the buffer size relied on the .jpeg header's size (or dimensions?) field(s), while the actual loading of data depended on the absolute file size (which could be far larger than the .jpeg header indicated).
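
To make the mismatch concrete, here is a sketch using an invented format (not JPEG, and not any real protocol): byte 0 declares the payload length and the payload follows. The reader refuses input whose declared size disagrees with the number of bytes actually present, instead of trusting one and copying by the other.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical format: input[0] is the declared payload length, the rest is
// the payload. Returns true and fills `payload` only if the declared length
// matches the number of bytes actually received.
bool read_payload(const std::vector<std::uint8_t>& input,
                  std::vector<std::uint8_t>& payload) {
    if (input.empty()) {
        return false;                          // not even a length byte
    }
    const std::size_t declared = input[0];
    const std::size_t actual = input.size() - 1;
    if (declared != actual) {
        return false;                          // header and data disagree: reject
    }
    payload.assign(input.begin() + 1, input.end());
    assert(payload.size() == declared);
    return true;
}
```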

Change of Container size during iteration (improper bounds check)

Common programming error. It usually arises from passing a container size as an argument, but anything that stores the size of a container independently of the container's own record is vulnerable. The usual problem is objects removing themselves from containers while those containers are in use (being iterated over). While there are other problems associated with such behavior, the overflow aspect at least can be avoided by always relying directly on what a container reports its own size to be, rather than on a copy of that.
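
A small sketch of relying on the container's own size: the loop bound is re-read from size() on every pass, so it stays correct as elements are removed. The function name drop_odd is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Removes every odd value from `values` while looping by index.
// The bound comes from values.size() on each iteration; a size cached
// before the loop would go stale after the first erase.
void drop_odd(std::vector<int>& values) {
    for (std::size_t i = 0; i < values.size(); /* advanced below */) {
        if (values[i] % 2 != 0) {
            // The vector shrinks by one, and index i now names the next element.
            values.erase(values.begin() + static_cast<std::ptrdiff_t>(i));
        } else {
            ++i;
        }
    }
}
```

Had the loop been written against a count passed in by the caller, it would keep indexing past the new end once the first element was erased.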

Off by 1 error

Common programming error. This one can take many forms, such as starting at 1 in a 0-based array. The classic poster child involves copying C-style character arrays: if one allocates space for strlen(string1) characters for string2 and then calls strcpy(string2, string1), a tiny and very hard to track down buffer overflow occurs (1 character in size). This is because strlen() does not count the terminating NUL of a string, which is required to properly store C-style strings.
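
The corrected allocation looks something like the sketch below. The helper name duplicate_c_string is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns a heap-allocated copy of `source`; the caller must delete[] it.
char* duplicate_c_string(const char* source) {
    assert(source != nullptr);
    const std::size_t length = std::strlen(source);  // excludes the terminating NUL
    char* copy = new char[length + 1];               // +1 makes room for the NUL
    std::strcpy(copy, source);                       // copies the characters and the NUL
    return copy;
}
```

Dropping the "+ 1" reproduces the bug: strcpy() would still write length + 1 bytes into a buffer of only length bytes.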

Accessing Invalidated Iterators

Using an erase()d iterator while looping

Common programming error. Because the correct handling is unintuitive, this one gets special mention. It usually comes up when you're looping over a container and removing elements from it.

for ( std::vector<T>::iterator i = v.begin() ; i != v.end() ; ++i ) {
    if ( ... ) v.erase(i);
    // Problem!  v.erase(i) invalidates the iterator i, meaning our for statement's ++i
    // invokes undefined behavior, usually resulting in a crash or skipped elements.
    // Instead we want to use the iterator returned from erase, which points to the
    // next element, but this has a caveat too:
 
    if ( ... ) i = v.erase(i);
    // Problem!  Since i now refers to the next element in the container, our for statement's
    // ++i will cause us to skip over this next element entirely.  Worse yet, if there
    // isn't a next element -- that is, if we just erased the last element and now
    // i == v.end() -- the ++i will advance us past end(), invoking undefined behavior,
    // usually resulting in a crash after the iterator hits some inaccessible memory
    // during its rampaging buffer overflow, because the i != v.end() check ended up
    // being skipped the only time i was equal to it.
}
 
// The "safe" std::vector erase version:
for ( std::vector<T>::iterator i = v.begin() ; i != v.end() ; /* NO INCREMENT HERE */ ) {
    if ( ... ) {
        i = v.erase(i);
        // Before C++11 this didn't work with associative containers like std::set, since
        // the standard didn't have their erase(iterator) return an iterator.  Some
        // implementations returned one anyway, but portable pre-C++11 code should not rely
        // on it.  Since C++11, std::set::erase(iterator) returns the next iterator as well.
    } else {
        ++i;
    }
}
 
// The "safe" std::set erase version:
for ( std::set<T>::iterator i = s.begin() ; i != s.end() ; /* NO INCREMENT HERE */ ) {
    if ( ... ) {
        s.erase(i++);
        // This won't work with std::vector, since std::vector::erase(iterator)
        // invalidates all iterators from (and including) iterator to end().
    } else {
        ++i;
    }
}
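
As an alternative to the hand-written loops, here is a sketch of two other approaches. For std::vector it uses the erase-remove idiom (std::remove_if followed by a single erase), which avoids touching an invalidated iterator at all. For std::set it uses the iterator returned by erase, which assumes C++11 or later. The predicate is_negative stands in for the elided condition and is purely hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Stand-in for whatever condition decides that an element should go.
bool is_negative(int value) { return value < 0; }

// Erase-remove idiom: std::remove_if moves the kept elements to the front and
// returns the new logical end; one erase() call then trims the leftover tail.
void remove_negatives(std::vector<int>& v) {
    v.erase(std::remove_if(v.begin(), v.end(), is_negative), v.end());
}

// C++11 and later: std::set::erase(iterator) returns the next iterator, so
// the same loop shape as the std::vector version works for sets too.
void remove_negatives(std::set<int>& s) {
    for (std::set<int>::iterator i = s.begin(); i != s.end(); /* no increment */) {
        if (is_negative(*i)) {
            i = s.erase(i);
        } else {
            ++i;
        }
    }
}
```

The erase-remove form also shifts each surviving element at most once, whereas erasing inside a loop shifts the tail of the vector on every removal.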

Mutating Container during iteration

Mutating Container while storing iterators

(note to self for commenting: remember that playlist example that only failed on one computer?)

Accessing Invalidated Memory

Accessing Out-of-scope Stack Variables

Accessing Deleted Memory

Rule of 3 Violation

Unmanaged shallow copy

delete-ing a previously deleted pointer that was not set to NULL

Accessing Relocated Elements

Accessing pointers to std::vector elements after a resize() has moved them to a new location
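
One common way around this, sketched here with a hypothetical function name, is to remember an element's index rather than its address. The index still identifies the same element after the vector has moved its storage, whereas a pointer taken beforehand may dangle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Grows `values` by `extra` zero elements and returns the element that was at
// position `index` beforehand. The element is looked up again through its
// index after the growth instead of through a pointer taken before it.
int value_after_growth(std::vector<int>& values, std::size_t index, std::size_t extra) {
    assert(index < values.size());            // precondition: index names an existing element
    values.resize(values.size() + extra);     // may move every element to a new buffer
    return values[index];                     // the index is unaffected by the move
}
```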