Tuesday, November 16, 2010

Fun with function call profiling

On a lark, I decided to construct an experiment to see how much (if any) difference in runtime there was for various permutations on function calls in C++. I wanted to test objects vs. pointers, and static vs. dynamic binding. This was not a rigorously defined experiment, and I was honestly expecting not much difference at all. However, I was shocked to discover that these variations create some substantial differences in runtime. Further, this finding is interesting because function call overhead is not reported by profiling.

I defined four trivial foo() functions that do nothing and return nothing:

void foo() { return; }

class Unvirtual {
public:
void foo() { return; }
};

class Base {
public:
virtual void foo() { return; }
};

class Derived: public Base {
public:
void foo() { return; }
};


The main() code called foo() 50 billion times in each of the following manners:

Unvirtual o; o.foo();
Base o; o.foo();
Derived o; o.foo();
Unvirtual *o = new Unvirtual; o->foo();
Base *o = new Derived; o->Base::foo();
Derived *o = new Derived; o->foo();
::foo();
Base *o = new Base; o->foo();
Base *o = new Derived; o->foo();


main() was engineered to make sure that the dominant factor in the runtime was the function call overhead -- the code is small enough that the program should be immune from I/O delay and cache misses, and I compiled with -O0 so that nothing would get inlined. I made one empty loop just so that I could subtract the loop overhead from the runtime. I used times() to get the runtime, and I ran everything 6 times to average the results. Here's what I found:



Let's go through the list of dismaying findings!

First: pointers cost runtime. Look at the difference between unvirtual.foo() and unvirtual->foo() -- a jump of 35% in runtime overhead!

Second: virtual functions cost you dearly. Look at the jump between unvirtual->foo() and base=>derived->foo(), which is an astounding 177% increase in runtime overhead.

Third: static derived objects incur the same cost as pointers. I do not really understand this case. Any static object should have static binding, so the compiler should be able to figure out at compile-time which function to call, and there should be no run-time cost at all. But if there IS a runtime element to resolving which function to call, then it should have a significant overhead cost like the others. I'm not sure why a static derived object incurs a small additional cost, but it was consistent across the 6 runs.

One huge zit on this data set is the cost of the global ("::foo()") function. As a global function, there's no dynamic component to it. In fact, it should be the absolute cheapest function call because it doesn't have to place the implicit 'this' variable onto the stack like all the others do. However, it has one of the worst runtimes, falling right in the middle of all the dynamic types! This defies explanation. It also had the largest swing in its 6 runtimes, ranging from 13k to 27k, a spread of over 100%. (All of the other runtimes correlated to within a percent.)

So what do we take away from all this? I think the important lesson is the relatively high cost of virtual functions, even when it's possible to figure out which function to call at compile-time. However, keep in mind that function call overhead is usually a trivial part of realistic program runtime, so don't interpret this data to mean you should avoid virtual functions in any speed-critical application.

No comments:

Post a Comment