Why doesn't C++ have a garbage collector?

Implicit garbage collection could have been added, but it just didn't make the cut, probably because of implementation complications and because people could not reach a general consensus fast enough.

A quote from Bjarne Stroustrup himself:

I had hoped that a garbage collector
which could be optionally enabled
would be part of C++0x, but there were
enough technical problems that I have
to make do with just a detailed
specification of how such a collector
integrates with the rest of the
language, if provided. As is the case
with essentially all C++0x features,
an experimental implementation exists.

There is a good discussion of the topic here.

General overview:

C++ is very powerful and allows you to do almost anything. For this reason it doesn't automatically push onto you many things that might impact performance. A form of garbage collection can be implemented with smart pointers (objects that wrap a pointer together with a reference count and delete the pointee when the count reaches 0).

C++ was built with competitors in mind that did not have garbage collection. Efficiency was the main criticism that C++ had to fend off when compared with C and others.

There are 2 types of garbage collection…

Explicit garbage collection:

C++0x offers reference-counted garbage collection through pointers created with shared_ptr.

If you want it you can use it; if you don't want it you aren't forced into using it.

For versions before C++0x, boost::shared_ptr exists and serves the same purpose.
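As a rough illustration of this opt-in style of collection, here is a minimal sketch using std::shared_ptr. The Tracked type and the shared_ownership_demo function are hypothetical names made up for this example; they only exist to make the object's lifetime observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type for this illustration: it counts its live instances.
struct Tracked {
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns the number of live Tracked objects once every owner is gone.
int shared_ownership_demo() {
    {
        std::shared_ptr<Tracked> first = std::make_shared<Tracked>();
        assert(first.use_count() == 1);
        {
            std::shared_ptr<Tracked> second = first;  // share ownership
            assert(first.use_count() == 2);
            assert(Tracked::alive == 1);              // still a single object
        }   // second is destroyed: the count drops to 1, the object survives
        assert(first.use_count() == 1);
        assert(Tracked::alive == 1);
    }   // first is destroyed: the count reaches 0 and the object is deleted
    return Tracked::alive;
}
```

The object is destroyed at the exact point where its last owner goes out of scope, which is what makes this form of collection explicit and predictable.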

Implicit garbage collection:

C++ does not have transparent garbage collection, however. It will be a focus point for future C++ specs.

Why doesn't TR1 have implicit garbage collection?

There are a lot of things that TR1 of C++0x should have had. Bjarne Stroustrup stated in previous interviews that TR1 didn't have as much as he would have liked.

To add to the debate here.

There are known issues with garbage collection, and understanding them helps to understand why there is none in C++.

1. Performance?

The first complaint is often about performance, but most people don't really realize what they are talking about. As illustrated by Martin Beckett, the problem may not be performance per se, but the predictability of performance.

There are currently 2 families of GC that are widely deployed:

  • Mark-And-Sweep kind
  • Reference-Counting kind

Mark and Sweep is faster (less impact on overall performance), but it suffers from a "freeze the world" syndrome: when the GC kicks in, everything else is stopped until the GC has finished its cleanup. If you wish to build a server that answers in a few milliseconds… some transactions will not live up to your expectations 🙂

The problem with Reference Counting is different: it adds overhead, especially in multi-threaded environments, because the count has to be atomic. Furthermore, there is the problem of reference cycles, so you need a clever algorithm to detect those cycles and eliminate them (generally implemented by a "freeze the world" too, though a less frequent one). In general, as of today, this kind (even though normally more responsive, or rather, freezing less often) is slower than Mark and Sweep.

I have seen a paper by Eiffel implementers who were trying to implement a Reference Counting Garbage Collector with a global performance similar to Mark and Sweep, but without the "freeze the world" aspect. It required a separate thread for the GC (typical). The algorithm was a bit frightening (at the end), but the paper did a good job of introducing the concepts one at a time and showing the evolution of the algorithm from the simple version to the full-fledged one. Recommended reading, if only I could put my hands back on the PDF file…

2. Resource Acquisition Is Initialization (RAII)

It's a common idiom in C++ to wrap the ownership of a resource within an object to ensure that it is properly released. It's mostly used for memory since we don't have garbage collection, but it's also useful in many other situations:

  • locks (multi-thread, file handle, …)
  • connections (to a database, another server, …)

The idea is to properly control the lifetime of the object:

  • it should be alive as long as you need it
  • it should be killed when you're done with it

The problem with GC is that, while it helps with the former and ultimately guarantees the latter, this "ultimately" may not be soon enough. If you release a lock, you'd really like it to be released now, so that it does not block any further calls!
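To make the "released now" point concrete, here is a small sketch built on std::lock_guard. ObservableLock, critical_section and released_promptly are hypothetical names invented for this example; the fake lock is not thread-safe and only records its state so that the release can be observed.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical lock that records its state so the release can be observed.
// std::lock_guard accepts any type that provides lock() and unlock().
struct ObservableLock {
    bool held = false;
    void lock()   { held = true; }
    void unlock() { held = false; }
};

void critical_section(ObservableLock& lock, bool fail) {
    std::lock_guard<ObservableLock> guard(lock);  // acquired here
    assert(lock.held);
    if (fail) throw std::runtime_error("work failed");
}   // released here on every exit path, normal return or exception

// Returns true if the lock is free right after the call, even if it threw.
bool released_promptly(bool fail) {
    ObservableLock lock;
    try {
        critical_section(lock, fail);
    } catch (const std::runtime_error&) {
        // the guard's destructor already ran during stack unwinding
    }
    return !lock.held;
}
```

With a real std::mutex the shape is the same: the release is tied to the end of the scope, not to whenever a collector decides to run.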

Languages with GC have two workarounds:

  • don't use GC when stack allocation is sufficient: this is normally done for performance reasons, but in our case it really helps since the scope defines the lifetime
  • the using construct… but it's explicit (weak) RAII, while in C++ RAII is implicit, so the user CANNOT unwittingly make the error (by omitting the using keyword)

3. Smart Pointers

Smart pointers often appear as a silver bullet for handling memory in C++. I have often heard: "we don't need GC after all, since we have smart pointers."

One could not be more wrong.

Smart pointers do help: auto_ptr and unique_ptr use RAII concepts, extremely useful indeed. They are so simple that you can write them by yourself quite easily.
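To give an idea of how little is needed, here is a minimal single-owner pointer, assuming a C++11 compiler. OwnedPtr, Widget and unique_ownership_demo are hypothetical names for this sketch; in real code std::unique_ptr is the thing to use.

```cpp
#include <cassert>
#include <utility>

// Hypothetical minimal single-owner pointer (std::unique_ptr is the real one).
template <typename T>
class OwnedPtr {
public:
    explicit OwnedPtr(T* p = nullptr) : ptr_(p) {}
    ~OwnedPtr() { delete ptr_; }                 // RAII: release on scope exit

    OwnedPtr(const OwnedPtr&) = delete;          // exactly one owner: no copies
    OwnedPtr& operator=(const OwnedPtr&) = delete;

    OwnedPtr(OwnedPtr&& other) noexcept : ptr_(other.ptr_) {
        other.ptr_ = nullptr;                    // ownership moves over
    }
    OwnedPtr& operator=(OwnedPtr&& other) noexcept {
        if (this != &other) {
            delete ptr_;
            ptr_ = other.ptr_;
            other.ptr_ = nullptr;
        }
        return *this;
    }

    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }
    T* get() const { return ptr_; }

private:
    T* ptr_;
};

// Hypothetical payload that counts its live instances.
struct Widget {
    static int alive;
    int value;
    explicit Widget(int v) : value(v) { ++alive; }
    ~Widget() { --alive; }
};
int Widget::alive = 0;

// Returns the number of live Widget objects after the owner leaves scope.
int unique_ownership_demo() {
    {
        OwnedPtr<Widget> a(new Widget(7));
        OwnedPtr<Widget> b(std::move(a));        // transfer ownership to b
        assert(a.get() == nullptr);
        assert(b->value == 7);
        assert(Widget::alive == 1);
    }   // b is destroyed here and deletes the Widget
    return Widget::alive;
}
```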

When one needs to share ownership, however, it gets more difficult: you might share among multiple threads, and there are a few subtle issues with the handling of the count. Therefore, one naturally goes toward shared_ptr.

It's great, that's what Boost is for after all, but it's not a silver bullet. In fact, the main issue with shared_ptr is that it emulates a GC implemented by Reference Counting, but you need to implement the cycle detection all by yourself… Urg

Of course there is this weak_ptr thingy, but I have unfortunately already seen memory leaks despite the use of shared_ptr because of those cycles… and when you are in a multi-threaded environment, they are extremely difficult to detect!
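Here is a minimal sketch of such a leak. Node and leaked_nodes_after_cycle are hypothetical names for this illustration, and the weak_ptr observer is only there so the demo can clean up after itself once the leak has been measured.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type that counts its live instances.
struct Node {
    static int alive;
    std::shared_ptr<Node> next;   // owning link
    Node()  { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Returns how many nodes are still alive after every external owner is gone.
int leaked_nodes_after_cycle() {
    std::weak_ptr<Node> observer;          // observes without owning
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;                       // a and b now own each other
        assert(a.use_count() == 2);
        observer = a;
    }   // a and b go out of scope, but each count only drops to 1
    int leaked = Node::alive;              // neither destructor has run

    // Cleanup for the demo only: break the cycle by hand.
    if (auto a = observer.lock()) {
        a->next.reset();
    }
    return leaked;
}
```

Nothing in shared_ptr notices that the two nodes are unreachable; the counts simply never reach 0.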

4. What's the solution?

There is no silver bullet, but as always, it's definitely feasible. In the absence of GC one needs to be clear on ownership:

  • prefer having a single owner at one given time, if possible
  • if not, make sure that your class diagram does not have any cycle pertaining to ownership, and break any such cycle with a careful application of weak_ptr
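As a sketch of the second guideline, the owning edge below goes from parent to child, while the back edge is a weak_ptr that does not take part in ownership. Parent, Child and live_objects_after_release are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <memory>

struct Child;

// Hypothetical types: the parent owns the child, the child only observes.
struct Parent {
    static int alive;
    std::shared_ptr<Child> child;   // owning edge
    Parent()  { ++alive; }
    ~Parent() { --alive; }
};

struct Child {
    static int alive;
    std::weak_ptr<Parent> parent;   // non-owning back edge
    Child()  { ++alive; }
    ~Child() { --alive; }
};

int Parent::alive = 0;
int Child::alive = 0;

// Returns the number of objects still alive once the last owner is gone.
int live_objects_after_release() {
    {
        auto p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;              // back edge adds no ownership
        assert(p.use_count() == 1);

        // To use the back edge, promote it to a temporary owner.
        if (auto owner = p->child->parent.lock()) {
            assert(owner.get() == p.get());
        }
    }   // p is destroyed: the Parent goes first, then its Child
    return Parent::alive + Child::alive;
}
```

The design choice is to decide, for every edge in the class diagram, which side owns and which side merely refers back.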

So indeed, it would be great to have a GC… however it's no trivial issue. And in the meantime, we just need to roll up our sleeves.

Why doesn't C++ have a garbage collector?

What type? Should it be optimised for embedded washing machine controllers, cell phones, workstations or supercomputers?
Should it prioritise GUI responsiveness or server loading?
Should it use lots of memory or lots of CPU?

C/C++ is used in just too many different circumstances.
I suspect something like Boost smart pointers will be enough for most users.

Edit: Automatic garbage collectors aren't so much a problem of performance (you can always buy more servers); it's a question of predictable performance.
Not knowing when the GC is going to kick in is like employing a narcoleptic airline pilot: most of the time they are great, but not when you really need responsiveness!
