[Info] GetUnitAura reverse engineering notes

**EmilyStrange** · 01-04-2010

I am looking forward to Apoc's notes with interest.

I don't think I have the time or inclination to go diving through that much code. GDC and E3 are rolling towards us, and that means there will be lots of fires to put out between now and the end of May.

It will be interesting to see, once Apoc's notes are published, what was inlined, and what wasn't, and infer why Blizzard chose to do it.

That said, there are many functions that will never be inlined due to size, and many that cannot, due to compiler technology.

Inlining with the inline keyword these days, for the most part, is a suggestion to the compiler to prefer inlining, rather than a dictate. The only good way to force inlining on member functions, in C++, is to define the function in the class body -- which gets messy as anything when dealing with templates or complex multiple inheritance trees. Everything else, in terms of non-member methods, is up for grabs.

Edit: Also, just because one function is inlined in one location, due to an "inline" directive on the function declaration, does not mean it will be inlined when called from a different function.

P.S. #define could be used for forced inlining, but there is such a thing as "code maintainability."

**Cypher** · 01-04-2010

Originally Posted by EmilyStrange

[SNIP]
Inlining with the inline keyword these days, for the most part, is a suggestion to the compiler to prefer inlining, rather than a dictate. The only good way to force inlining on member functions, in C++, is to define the function in the class body -- which gets messy as anything when dealing with templates or complex multiple inheritance trees. Everything else, in terms of non-member methods, is up for grabs.

Even if you define the member in the class body, that still does not force inlining of the function. It's no different to using the 'inline' keyword, it's simply implicit rather than explicit.
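A minimal sketch of the two forms being compared here. The class `Aura` and its members are hypothetical names chosen for illustration. As far as the C++ language is concerned both member functions are inline functions; whether either call is actually expanded at the call site is left to the compiler.

```cpp
// Hypothetical class, for illustration only.
class Aura {
public:
    explicit Aura(int id) : id_(id) {}

    // Defined inside the class body: implicitly an inline function.
    int id() const { return id_; }

    // Only declared here; defined below with the explicit keyword.
    int doubled_id() const;

private:
    int id_;
};

// Defined outside the class body and marked inline explicitly.
// For the language this has the same status as Aura::id() above.
inline int Aura::doubled_id() const { return id_ * 2; }
```

Neither form is a guarantee. Both only make the definition available and permit it to appear in several translation units.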

Originally Posted by EmilyStrange

[SNIP]
P.S. #define could be used for forced inlining, but there is such a thing as "code maintainability."

1. The preprocessor has nothing to do with force inlining. What I think you mean is that you could use compiler-specific extensions for forced-inlining and select the correct one at compile-time via use of the preprocessor.

2. Of course there is such a thing as 'code maintainability'. You said it yourself in the first post. You said that defining member functions inside the class body "gets messy". Now, why would one care about mess in the source code of a project? BECAUSE IT MAKES IT HARDER TO MAINTAIN!

**EmilyStrange** · 01-05-2010

You are correct, the preprocessor has nothing to do with inlining functions in the compiler. I wasn't clear on the definition of inlining. What I mean by use of the preprocessor is that you would define a "pseudo-function", i.e. just a code snippet, that would expand into whatever code you need to call. It is not a function, per se, but it is inlining the code.
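A rough sketch of such a "pseudo-function" next to a real inline function. All names here (`SQUARE_MACRO`, `square`, `next_value`) are made up for illustration. The macro is pasted textually at every use, so an argument with side effects is evaluated once per occurrence, which is one of the maintainability costs.

```cpp
// Function-like macro: the preprocessor pastes this text at each use site.
#define SQUARE_MACRO(x) ((x) * (x))

// Ordinary inline function: the argument is evaluated exactly once.
inline int square(int x) { return x * x; }

// Hypothetical helper with a visible side effect: bumps the counter.
inline int next_value(int& counter) { return ++counter; }

// Counts how many times next_value runs when passed through the macro.
int calls_made_by_macro() {
    int counter = 0;
    (void)SQUARE_MACRO(next_value(counter));  // expands to two calls
    return counter;
}

// Counts how many times next_value runs when passed through the function.
int calls_made_by_function() {
    int counter = 0;
    (void)square(next_value(counter));  // one call, then the multiply
    return counter;
}
```

The macro version calls the helper twice, the function version once.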

Defining the function in the class declaration, rather than in the implementation file, on every decent C++ compiler I have used to date, will *always* inline the function, for given values of "always." i.e. always being if the function *can* be inlined. Polymorphism is the biggest one that will change "always" into "yeah, right..." There are other cases where this explicit inlining will fail. Unlike the use of the "inline" keyword though, defining inside of the class body is more explicit than a hint and is not dependent on which compiler flags were specified.

With a project that is approaching more than a decade in development, with so many programmers that have cycled through the development team, code maintainability and stability of the client and server (with so many millions of players) are most likely higher priority than anything else these days.

**Cypher** · 01-05-2010

Originally Posted by EmilyStrange

You are correct, the preprocessor has nothing to do with inlining functions in the compiler. I wasn't clear on the definition of inlining. What I mean by use of the preprocessor is that you would define a "pseudo-function", i.e. just a code snippet, that would expand into whatever code you need to call. It is not a function, per se, but it is inlining the code.

Defining the function in the class declaration, rather than in the implementation file, on every decent C++ compiler I have used to date, will *always* inline the function, for given values of "always." i.e. always being if the function *can* be inlined. Polymorphism is the biggest one that will change "always" into "yeah, right..." There are other cases where this explicit inlining will fail. Unlike the use of the "inline" keyword though, defining inside of the class body is more explicit than a hint and is not dependent on which compiler flags were specified.

With a project that is approaching more than a decade in development, with so many programmers that have cycled through the development team, code maintainability and stability of the client and server (with so many millions of players) are most likely higher priority than anything else these days.

Oh okay, now I understand what you mean in regards to using the preprocessor. My bad. It's just such an ugly 'solution' I didn't think of it first off.

It doesn't matter whether the implementations you tested do it or not. You stated outright that it's a way to force inlining. It is NOT. The C++ Standard says nothing about it other than it's simply an implicit inline. And I can tell you right now that if your functions are large enough, compilers WILL reject the implicit inline. MSVC for example did last time I tested it.

It is no different to using the 'inline' keyword, it's simply implicit. Any difference in behaviour is implementation-specific and should not be relied on (both because it makes your code unportable, and because it's undocumented and could change at any time via new compiler releases).

If you NEED a function to be inlined, then use compiler-specific extensions. Otherwise just use 'inline' and let the compiler do its job. If the function is not being inlined and you seriously think it makes a difference, then use PGO. With PGO data the compiler will be able to make a much better decision about whether the function should be inlined than any human ever could, so it's by far your best bet.
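For reference, a minimal sketch of selecting a compiler-specific extension through the preprocessor. The macro names `FORCE_INLINE` and `NO_INLINE` and the two functions are assumptions made for this example; `__forceinline`, `__declspec(noinline)`, and the GCC/Clang `always_inline` and `noinline` attributes are the vendor extensions themselves. Even these have documented limits (recursion, for instance), so check the vendor documentation before relying on them.

```cpp
// Pick the vendor extension at compile time; fall back to plain 'inline'.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#  define NO_INLINE    __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#  define NO_INLINE    __attribute__((noinline))
#else
#  define FORCE_INLINE inline   // unknown compiler: only a hint
#  define NO_INLINE             // unknown compiler: no control at all
#endif

// Hypothetical functions: same body, opposite inlining requests.
FORCE_INLINE int add_forced(int a, int b) { return a + b; }
NO_INLINE    int add_outlined(int a, int b) { return a + b; }
```

The two functions behave identically; only the generated code at the call sites should differ, which can be confirmed in a disassembler.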

EDIT:

Did you modify your original post or did I misread? I swear you originally said "there is no such thing as code maintainability". If I misread, then sorry, but if you ninja-edited to correct it, please don't in future. Note the correction at the bottom of your post with an EDIT tag instead. It gets very confusing and disrupts the flow of conversation otherwise.

EDIT 2:

I just thought of a reason you may think that manually marking a function as 'inline' may cause problems. Due to the way C++ compilers work (i.e. the compiler works on a per-translation unit basis, and the linker then consolidates all translation units) some functions may not be inlined correctly because the compiler cannot 'see' the definition of the function at the time of compilation. This is why it's normally recommended to put the definition of an inline function in a header file, and also why templates MUST be in a header file, because the compiler can't instantiate it if the implementation is in a separate translation unit.

Whether that's still an issue on modern optimizing compilers is another question though. It should definitely not be an issue if LTCG is enabled, though it may remain an issue on some compilers, especially in Debug mode, and maybe in Release mode with LTCG and other fancy stuff disabled (as LTCG can often heavily affect compilation time).
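A single-file sketch of the layout described above, with comments marking where each part would normally live. The file names and function names are hypothetical. The point is only which definitions the compiler can see while compiling a given translation unit when LTCG is off.

```cpp
// ---- would live in a header, e.g. a hypothetical "clamp_utils.h" ----

// Definition is visible in every translation unit that includes the
// header, so each one is able to inline calls to it.
inline int clamp_to_byte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// Template: the definition has to be visible where it is instantiated
// (unless an explicit instantiation is provided elsewhere).
template <typename T>
T larger_of(T a, T b) { return a < b ? b : a; }

// Declaration only. Other translation units see just this line, so
// without LTCG they have nothing to inline.
int checksum(const unsigned char* data, int len);

// ---- would live in the matching "clamp_utils.cpp" ----

int checksum(const unsigned char* data, int len) {
    int sum = 0;
    for (int i = 0; i < len; ++i) {
        sum += clamp_to_byte(data[i]);  // definition visible: inlinable here
    }
    return sum;
}
```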

**EmilyStrange** · 01-05-2010

Originally Posted by EmilyStrange

With a project that is approaching more than a decade in development, with so many programmers that have cycled through the development team, code maintainability and stability of the client and server (with so many millions of players) are most likely higher priority than anything else these days.

Was the edit I made, about 2 minutes after I did the original post. Nobody had responded at that time.

It is no different to using the 'inline' keyword, it's simply implicit. Any difference in behaviour is implementation-specific and should not be relied on (both because it makes your code unportable, and because it's undocumented and could change at any time via new compiler releases).

Agreed. It is implementation specific. My statements only apply to specific compilers. Though this is how Intel, VC++ and GCC work, *at this time*.

Due to the way C++ compilers work (i.e. the compiler works on a per-translation unit basis, and the linker then consolidates all translation units) some functions may not be inlined correctly because the compiler cannot 'see' the definition of the function at the time of compilation.

Agreed. Visibility of a particular object within the compiler memory is the #1 reason why something may be inlined in one module, but not in another. Defining your function in the declaration file only ensures that the visibility of the function is raised, but it may also increase code size for each module that includes the class, even if the class is not instantiated, simply having a reference to another module in your code can include it.

Whether that's still an issue on modern optimizing compilers is another question though. It should definitely not be an issue if LTCG is enabled, though it may remain an issue on some compilers, especially in Debug mode, and maybe in Release mode with LTCG and other fancy stuff disabled (as LTCG can often heavily affect compilation time).

LTCG is often easily broken, or at least, partially broken, by implementation specific details. Though it is another pass (and dog slow on some projects, though Xoreax can negate part of that problem), LTCG is not an exhaustive solution so does not guarantee optimal generation of code, though it helps significantly.

**Cypher** · 01-05-2010

Originally Posted by EmilyStrange

Was the edit I made, about 2 minutes after I did the original post. Nobody had responded at that time.

I was referring to your previous post. Never mind though, it's not important.

Originally Posted by EmilyStrange

Agreed. It is implementation specific. My statements only apply to specific compilers. Though this is how Intel, VC++ and GCC work, *at this time*.

That may be true, but as I've already stated, it's not documented behaviour, so there is no way you can rely on it, which makes it (for the most part) useless if you NEED the function to be inlined for technical reasons (it's very rare that this is necessary, but there have been one or two times when it has come up for me, both when working on very low level projects).

Hence, you're better off using documented compiler-specific extensions. At least that way you get specific and documented guarantees, rather than what is effectively a guess (especially when the compiler is updated).

Btw. Proof? (Not saying it's not true, simply saying that it's a pretty useless claim to make without proof, as it makes it even more unreliable)

Originally Posted by EmilyStrange

Agreed. Visibility of a particular object within the compiler memory is the #1 reason why something may be inlined in one module, but not in another. Defining your function in the declaration file only ensures that the visibility of the function is raised, but it may also increase code size for each module that includes the class, even if the class is not instantiated, simply having a reference to another module in your code can include it.

You're mixing up quite a bit of terminology there, but if I'm interpreting you correctly, you're saying that including unnecessary headers into a translation unit can affect the generated object file?

Whilst that may be true, as soon as the linker sees that the excess functions are not referenced it should strip them.

Exceptions to this would be if the functions are exports, or the linker is not in an optimizing mode, but the former situation is rare, and the latter is irrelevant, as the only time it would crop up is when compiling for Debug, in which case code size isn't really an issue.

Another exception would be if the header was for example using anonymous namespaces and dumping large static objects into each translation unit, but this is another very rare case...

Small objects I could understand, but the resulting change in code size would have no effect on the execution speed of the resulting binary, as they would never get in the way of CPU cache optimizations. They'd be run once at the start, then never referenced again, so it's not like they'd affect the resulting size of any performance-sensitive code.

Originally Posted by EmilyStrange

LTCG is often easily broken, or at least, partially broken, by implementation specific details. Though it is another pass (and dog slow on some projects, though Xoreax can negate part of that problem), LTCG is not an exhaustive solution so does not guarantee optimal generation of code, though it helps significantly.

I never stated LTCG is an 'exhaustive solution', nor did I state it guarantees 'optimal generation of code'. PGO on the other hand, comes pretty close.

Fact: The compiler is 'smarter' than you. It knows a lot about CPU internals and can make much better judgements about what should and shouldn't be inlined.

This is especially true when you give it PGO data which resembles the most common real-world usage of the software.

The only times you should really be overriding the compiler are when:
a) You need to for technical reasons. This is only relevant in extremely low level software where the function probably isn't performance critical anyway, but you need it to be inlined for other reasons.
b) You don't want to go to the extreme of using PGO, and you have done extensive profiling with both the function inlined and outlined, and the performance difference is large enough to justify it.
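As a rough illustration of case (b), here is a sketch of timing the same function body in an inlinable and a deliberately outlined form. Every name here (`BENCH_NOINLINE`, `mix_inline`, `mix_outlined`, `Timing`, `time_loop`) is hypothetical, and the non-MSVC branch assumes a GCC or Clang compatible compiler. It is not a rigorous benchmark: real measurements need an optimized build, many repetitions, and a realistic workload.

```cpp
#include <chrono>
#include <cstdint>

#if defined(_MSC_VER)
#  define BENCH_NOINLINE __declspec(noinline)
#else
#  define BENCH_NOINLINE __attribute__((noinline))  // assumes GCC/Clang
#endif

// Same body twice: one the compiler is free to inline, one it is told not to.
inline std::uint32_t mix_inline(std::uint32_t x) { return x * 2654435761u + 1u; }
BENCH_NOINLINE std::uint32_t mix_outlined(std::uint32_t x) { return x * 2654435761u + 1u; }

struct Timing {
    std::uint32_t result;   // final accumulator, keeps the loop from being discarded
    double milliseconds;    // wall-clock time for the whole loop
};

// Runs f over a dependent chain of inputs and reports the elapsed time.
template <typename F>
Timing time_loop(F f, std::uint32_t iterations) {
    const auto start = std::chrono::steady_clock::now();
    std::uint32_t acc = 0;
    for (std::uint32_t i = 0; i < iterations; ++i) {
        acc = f(acc ^ i);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return Timing{acc, elapsed.count()};
}
```

Calling `time_loop` once with a lambda that wraps `mix_inline` and once with a lambda that wraps `mix_outlined` gives two timings to compare; the `result` fields should match, since both variants compute the same values.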

Unless I'm missing something?

At any rate, you still haven't given any better alternatives to what I've already suggested... Or been able to refute any of my arguments.

All you've really said is "agreed", "agreed" and "LTCG isn't perfect". (Yes, I know I'm leaving out some details, but let's get real, for the most part this is the gist of your post.)

**EmilyStrange** · 01-05-2010

I am not trying to refute any of your points.

We are discussing the technical merits, pitfalls and shortcomings of various methods of "inlining" code. I do not think you have missed anything. I do not think I have missed anything.

You're mixing up quite a bit of terminology there, but if I'm interpreting you correctly, you're saying that including unnecessary headers into a translation unit can affect the generated object file?

Yes, sorry, it was a rushed post. Perforce had just bitten me at the time I was writing that so I was trying to juggle two thoughts -- "WTF?" and "coherent argument" -- at the same time. You are correct in your interpretation of my poorly worded statement.

My only "proof" with regard to how a particular compiler does or does not do a particular thing is through my usage of that compiler. As it is obvious that we both know that compiler implementation and technology change over time, any assumption about how a compiler will implement a particular feature that is outside the scope of the language specification (and when it comes to C++, even that isn't guaranteed) is based purely on personal discovery and personal usage scenarios. I'll certainly concede that until I can provide definitive proof of how a particular version of a particular compiler handles a specific set of situations, the "proof" is based on observational evidence that has not been documented to a sufficient level.

I never stated LTCG is an 'exhaustive solution', nor did I state it guarantees 'optimal generation of code'.

I was not attempting to claim that you did on either point. Just pointing out, since you raised Link-time Code Generation as a possible fix, that it was not a panacea for code generation problems when it came to object visibility.

I think we're both arguing the same subject, from almost the same viewpoint, with slightly varying degrees of what is "acceptable" from our compilers, but also seeing if we can one-up each other on how much we know about this particular technology.

Shall we concede that we're both stating the same thing or should we continue for the benefit of the peanut gallery?

**Cypher** · 01-05-2010

Oh lord, how did we get from me correcting the mistake in your post on inlining (i.e. pointing out that declaring member functions in the class body is simply an implicit inline as far as the C++ standard goes, nothing more) to where we are now...
