When to use the Heap in C++
When I first learned memory management, the idea of a heap and of allocating my own memory sounded awesome. I quickly jumped into this brave new world and started heap allocating anything and everything I could think of. The number of segfaults in my code climbed, but I paid no attention: the heap was cool and I wanted to be a cool kid.
After writing code this way for a good year, and getting very good at allocating memory and keeping track of pointers, I had a new thought. Perhaps the heap wasn’t the best way to go about things. I listened to a talk given by Bjarne Stroustrup and he decried usage of the heap. After listening to this talk, and collecting my own thoughts on the matter, I came to hold a much more mature memory management style.
Always stack allocate…
The stack is your friend. It’s automatically managed, has a reasonable scope, and as long as you work within some reasonably defined limits, it’s perfectly suitable for a great many algorithms. Nothing prevents you from taking a pointer to a stack allocated piece of memory and passing it to other functions. The variable stays valid until the function that declared it returns, so any function called from within it can safely use that pointer.
You can use the stack to pass in character arrays to receive stream output from a file, you can use the stack to pass variables by reference into sub-functions, and you can use the stack to create STL data structures. In fact, there’s fundamentally nothing you can’t do with the stack, but there are some constraints that can make life difficult.
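Here is a minimal sketch of that pattern. The function names (fill_greeting, greeting_length) and the 64-byte buffer size are made up for illustration. The buffer lives in the caller's stack frame, and the callee only borrows it for the duration of the call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: writes a greeting into a caller-supplied buffer.
// It never owns the memory; it only borrows it while the call is running.
std::size_t fill_greeting(char* buffer, std::size_t capacity) {
    const char message[] = "hello from the stack";
    assert(capacity >= sizeof(message));           // room for text plus '\0'
    std::memcpy(buffer, message, sizeof(message)); // copies the terminator too
    return sizeof(message) - 1;                    // length without the '\0'
}

// Hypothetical caller: the buffer is an ordinary local (automatic) variable.
std::size_t greeting_length() {
    char buffer[64];                               // lives in this stack frame
    std::size_t written = fill_greeting(buffer, sizeof(buffer));
    // buffer is still valid here because this function has not returned yet.
    assert(std::strlen(buffer) == written);
    return written;
}   // buffer is released automatically; there is nothing to free
```

The one thing to avoid is returning the pointer itself out of greeting_length, since the buffer is gone as soon as that function returns.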
… unless you have a good reason not to
As beautiful as the stack is, there are some genuine reasons not to use it for certain tasks. The primary concern with using the stack is that it is of a fixed (small) size, and allocating large objects will almost certainly make your day difficult. If you need 100KB as a buffer for reading a binary file or network stream, then the stack is not the right place for this data structure. On Windows the default stack size is 1MB. For POSIX threads the default varies by platform: it is commonly several megabytes on Linux, but can be much smaller elsewhere, and the minimum a thread stack may be given (PTHREAD_STACK_MIN) is often only around 16kB. This really isn’t that much room to play with when we start introducing large buffers and other data structures with a primary purpose of bulk storage.
The heap too is of a fixed size, but it’s generally much larger (you have a whole address space out there). For most intents and purposes it is unlimited.
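As a rough sketch of the 100KB buffer case, a std::vector keeps the bulk storage on the heap while the vector object itself stays a small local variable. The fake_read function below is a hypothetical stand-in for a real file or socket read, and kBufferSize is a name I picked for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative size only: the 100KB buffer mentioned in the text.
constexpr std::size_t kBufferSize = 100 * 1024;

// Hypothetical stand-in for a file or socket read: fills the destination
// with a dummy byte and reports how many bytes it "read".
std::size_t fake_read(char* dest, std::size_t capacity) {
    assert(dest != nullptr);
    for (std::size_t i = 0; i < capacity; ++i) {
        dest[i] = 'x';
    }
    return capacity;
}

std::size_t read_large_chunk() {
    // The vector object is a small handle on the stack; the 100KB of
    // storage it manages is allocated on the heap.
    std::vector<char> buffer(kBufferSize);
    return fake_read(buffer.data(), buffer.size());
}   // the vector's destructor frees the 100KB here
```

Writing char buffer[100 * 1024] as a local array instead would take a tenth of a 1MB stack in a single frame.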
Handling Resource Allocation Gracefully
So, the point here is that you should be wary of using the heap for much of anything. It makes a beautiful promise of virtually limitless, unscoped memory, but in reality it is a liar, and will covertly infiltrate your code base, introducing the nastiest of bugs. Unfortunately we have to live with it, and we have to use it from time to time. How do we use it wisely?
I’ve found a strategy that works fairly well. It’s a general strategy for resource management called Resource Acquisition Is Initialization, or RAII (don’t worry, I didn’t think of it myself). The core idea here is to always manage memory in constructors and destructors.
The STL containers are very good examples of this. When you create a vector on the stack, the elements inserted into that vector certainly aren’t stack allocated, instead they are managed by the vector container, and kept on the heap. When you push items onto the vector, they are stored in heap storage, and when the vector goes out of scope, its resources are automatically freed by the destructor, and the memory is released back to the heap.
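One way to watch this happen is to count live objects. The Tracked type below is a hypothetical element type written just for this sketch. It bumps a counter in its constructors and decrements it in its destructor, so we can check that the vector cleans up every element when it goes out of scope.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

int live_after_scope() {
    {
        std::vector<Tracked> items;   // the handle lives on the stack
        items.push_back(Tracked());   // the elements are stored on the heap
        items.push_back(Tracked());
        items.push_back(Tracked());
        assert(Tracked::live == 3);   // temporaries are already destroyed
    }   // items goes out of scope: its destructor destroys all elements
    return Tracked::live;
}
```

There is no delete anywhere in that function, and yet nothing leaks.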
I would argue that any time you have an object that needs to be stored on the heap, allocation should be done this way. Encapsulate the object in a class, and allocate/deallocate resources in the constructor and destructor. The benefits are numerous.
- Resources now follow the call graph. Heap allocated memory is now cleanly deallocated when the containing stack allocated class falls out of scope.
- Exception handling is now simplified. Destructors are still called as the stack is unwound when exceptions take place, elegantly releasing resources. We no longer have to create a long list of possible inconsistent states to check for, and fix up, when exceptions do take place.
- Implementation details are hidden better. Without explicit steps required to manage memory, we can create better encapsulation and worry less about how memory is managed inside objects. Think about the last time you had to worry about allocation inside of a vector or map.
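The points above can be sketched with a small hand-rolled wrapper. Buffer, risky_work and the live_count counter are hypothetical names invented for this example, and in real code a std::vector or std::unique_ptr would usually do the same job with less typing. The sketch allocates in the constructor, frees in the destructor, and shows the cleanup still happening when an exception unwinds the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical RAII wrapper around a heap-allocated byte buffer.
class Buffer {
public:
    explicit Buffer(std::size_t size)
        : data_(new char[size]()), size_(size) { ++live_count; }
    ~Buffer() { delete[] data_; --live_count; }

    // Copying is disabled so two objects never free the same memory.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    char* data() { return data_; }
    std::size_t size() const { return size_; }

    static int live_count;  // only here so the example can observe cleanup

private:
    char* data_;
    std::size_t size_;
};
int Buffer::live_count = 0;

// Hypothetical function that fails halfway through its work.
void risky_work() {
    Buffer scratch(4096);             // heap memory acquired here
    assert(Buffer::live_count == 1);
    throw std::runtime_error("something went wrong");
    // No cleanup code needed: unwinding runs ~Buffer() for scratch.
}

bool cleaned_up_after_exception() {
    try {
        risky_work();
    } catch (const std::runtime_error&) {
        return Buffer::live_count == 0;  // the buffer was already freed
    }
    return false;
}
```

Notice that risky_work has no try/catch and no cleanup path of its own. The destructor is the cleanup path.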
Things that aren’t memory
The cool thing about RAII is that it applies nicely to things that aren’t memory too. Any sort of unmanaged resource can follow the RAII strategy and achieve the same benefits I outlined. Candidates for this include network sockets, database connections, and locks and mutexes.
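For locks, the standard library already ships this pattern as std::lock_guard, which locks a mutex in its constructor and unlocks it in its destructor. The Counter struct and the guarded_increment and run_demo functions below are hypothetical, just enough scaffolding to show the mutex being released even when the function exits by throwing.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical shared state protected by a mutex.
struct Counter {
    std::mutex mutex;
    int value = 0;
};

void guarded_increment(Counter& counter, bool fail) {
    // Locks in the constructor, unlocks in the destructor.
    std::lock_guard<std::mutex> guard(counter.mutex);
    if (fail) {
        throw std::runtime_error("failed while holding the lock");
    }
    ++counter.value;
}   // guard is destroyed here on both the normal and the exception path

int run_demo() {
    Counter counter;
    try {
        guarded_increment(counter, true);
    } catch (const std::runtime_error&) {
        // The lock was released during stack unwinding.
    }
    assert(counter.value == 0);        // the failed call changed nothing
    guarded_increment(counter, false); // can take the mutex again
    return counter.value;
}
```

The same shape works for a socket or a database connection: open it in a constructor, close it in the destructor, and let scope do the rest.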
Closing Remarks
I think, as is typical in C++, that with memory we are given a large collection of tools. The language gives us a huge chest with funny-shaped manipulators, and no real guidance on how to effectively or safely use them. I’ve seen a lot of code, and the memory management schemes are all different. This one makes a lot of sense, and is highly efficient. Ever since I switched to this strategy of memory management I’ve seen the number of bugs in my code reduced significantly. It’s elegant and seems to ‘just work’.
The stack must almost always be used.
Andy Harglesis
26 Nov 12 at 9:53 pm
The heap should almost never be used.
Andy Harglesis
26 Nov 12 at 9:54 pm
Do not allocate heap memory if it serves no purpose.
Andy Harglesis
26 Nov 12 at 9:54 pm
Trust the stack; it’s your friend.
Andy Harglesis
26 Nov 12 at 9:54 pm