Now, it's very tempting to violate this transparency rule, because performance can be improved by making what seem like innocent assumptions. Here's an example: using direct code cache addresses for return instructions. That is, we place our code cache addresses on the application stack as return addresses, so that returns avoid our indirect branch lookup cost. We get a significant performance improvement on several benchmarks.
But we have to make sure that we catch every instance where the application reads a return address and stores it somewhere, because the value it reads is now a code cache address rather than the original application address. We were able to do that for the SPEC CPU2000 benchmarks shown here, but only with some specific pattern matches and hacks. Larger programs have more cases of address reading, and we gave up trying to extend this to our desktop applications. Solving this in general would require checking all loads, which is far too costly.
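To make the tradeoff concrete, here is a minimal toy model of the shortcut described above. It is only a sketch, not our actual implementation: the class `ToyRuntime`, the table `APP_TO_CACHE`, and both addresses are hypothetical names and values chosen for illustration. The model pushes either the original application return address (transparent) or the code cache address (shortcut), and counts how many indirect branch lookups each mode pays for.

```python
# Toy model only: every name and address below is hypothetical.
APP_TO_CACHE = {0x401005: 0x7F000040}  # app return address -> code cache copy


class ToyRuntime:
    def __init__(self, transparent):
        self.transparent = transparent
        self.stack = []    # models the application stack
        self.lookups = 0   # counts indirect branch lookups performed

    def call(self, app_return_addr):
        """Model a call: push a return address onto the application stack."""
        if self.transparent:
            # Transparent: the application sees its own original address.
            self.stack.append(app_return_addr)
        else:
            # Shortcut: push the code cache address directly.
            self.stack.append(APP_TO_CACHE[app_return_addr])

    def app_reads_return_address(self):
        """Model application code that inspects its own return address."""
        return self.stack[-1]

    def ret(self):
        """Model a return: produce the code cache address to resume at."""
        target = self.stack.pop()
        if self.transparent:
            self.lookups += 1             # pay for a lookup on every return
            return APP_TO_CACHE[target]
        return target                     # already a cache address, no lookup


def run(transparent):
    rt = ToyRuntime(transparent)
    rt.call(0x401005)
    seen = rt.app_reads_return_address()  # what the application observes
    resume = rt.ret()                     # where execution continues
    return seen, resume, rt.lookups
```

In this model, `run(True)` and `run(False)` resume at the same code cache address, and the shortcut mode performs no lookup. The difference is in the first returned value: in shortcut mode the application observes a code cache address where it expected its own return address, which is exactly the kind of read that would have to be caught and fixed up.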
We found that every shortcut like this violates some program's dependencies. This illustrates one of our biggest lessons: to run real-world applications, we must be absolutely transparent.
Copyright © 2004 Derek Bruening