I've been doing some research on building a self-encrypting/decrypting injected DLL.
I've been wondering if it would be possible to encrypt/decrypt the injected binary before and after every call into the injected code. At first I thought the process would be cumbersome and slow, but my tests are showing otherwise. Even on a slow, single-core machine, a simple rotating XOR cipher (hardly "encryption," but I'm assuming Blizzard isn't going to subject my machine to forensic analysis) can encrypt/decrypt roughly a billion bytes per second. That's fast enough to do on every single call -- every single frame, too -- without a noticeable client slowdown, unless your injected code gets huge.
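For the curious, the kind of rotating XOR I'm talking about is about this simple (a minimal sketch; the key and rotation scheme here are just placeholders, the real ones get generated at injection time):

```cpp
#include <cstdint>
#include <cstddef>

// In-place rotating XOR over a block of memory. Since XOR is its own inverse,
// the same routine both encrypts and decrypts as long as the same key and
// rotation are used. Key generation (e.g. from GetTickCount()) is assumed to
// happen once, at injection time.
void RotXorBlock(uint8_t* data, size_t len, uint32_t key)
{
    for (size_t i = 0; i < len; ++i)
    {
        data[i] ^= static_cast<uint8_t>(key);
        // Rotate the 32-bit key left by one bit so the byte stream
        // doesn't just repeat every four bytes.
        key = (key << 1) | (key >> 31);
    }
}
```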
If you put a simple call gate into a non-standard section (obviously the encryptor/decryptor itself needs to stay unencrypted; self-decrypting functions are a bitch), routed everything through that call gate, and had the call gate do the encrypt/decrypt on entry to and exit from the injected DLL, you'd have a basically undetectable binary. It would be obvious that SOMETHING was there, but not what it was.
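To make that concrete, here's roughly what the call gate looks like expressed in C++ -- the real thing would be hand-rolled ASM sitting in its own unencrypted section, and the globals here are placeholders the injector would fill in:

```cpp
#include <cstdint>
#include <cstddef>

// Filled in by the injector at runtime; names and layout are placeholders for
// this sketch. The encrypted region covers everything except the section
// holding the call gate itself.
static uint8_t* g_EncryptedBase = nullptr;
static size_t   g_EncryptedSize = 0;
static uint32_t g_Key           = 0;

void RotXorBlock(uint8_t* data, size_t len, uint32_t key);  // from the sketch above

typedef void (*HookFn)(void* context);

// Decrypt the body of the DLL, run the real hook, then re-encrypt before
// handing control back to WoW. Not thread-safe yet -- see weakness 3) below.
void CallGate(HookFn realFn, void* context)
{
    RotXorBlock(g_EncryptedBase, g_EncryptedSize, g_Key);   // decrypt
    realFn(context);                                        // the "real" injected code
    RotXorBlock(g_EncryptedBase, g_EncryptedSize, g_Key);   // re-encrypt
}
```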
The weaknesses of the approach are:
1) Some slowdown in the client due to encrypting/decrypting large blocks of memory on a regular basis. This can be mitigated by reducing the amount of memory processed (for instance, we can use the file alignment instead of the section alignment when sizing the sections, trimming up to 0xE00 bytes per section). Also, the XOR operation is insanely fast, so I don't think this will really be a problem.
2) The encryptor/decryptor is hashable. However, I anticipate implementing this in two parts, both using raw ASM (which implies limited portability; I know Cypher will complain about that). In the end, the ASM shouldn't look much different from a million similar blocks of ASM, so it will be extremely hard to hash without inducing false positives. Also, since the encryption/decryption will essentially make the entire REST of the binary unhashable, ultimately this will significantly decrease the hashable footprint of the app. You might even be able to skip hiding the module VA since (unless you do something really stupid like name it WoWHack.dll) it will just look like a DLL with garbage in it.
3) It will take some care and work to make this thread-safe. There are two primary ways to do it. One is to use a spinlock to essentially make the injected DLL single-threaded (additional threads would just spin until the first call exits). This is relatively simple to implement, but it has some limitations -- if your detoured thread took a long time to execute, a second thread could wait for a long time. If the threads were dependent on each other in some way (or worse, if you had a re-entrant call), you could deadlock. The other alternative is to implement a "use count" mechanism, again guarded by a spinlock (see the sketch after this list). Essentially each thread would increment the use count on entry and decrement it on exit. Whichever thread increments the use count to 1 knows it needs to decrypt the rest of the PE; likewise, whichever thread decrements it back to 0 re-encrypts the PE. This is a little more performant (no long waits just because a second thread showed up), and it's also safe against re-entrancy (a re-entrant call would just increment, and later decrement, the use count).
3a) The thread-safety mechanisms above could themselves be hashable if not implemented carefully. However, I'm guessing there are at least a few lock cmpxchg's in WoW already.
4) Although I say this makes the module "unhashable," it's not 100%. There's always the possibility that a second thread could run Warden or something while the first thread is in a hooked routine (and thus the code is plain and hashable). However, this is a very, very small chance, and I don't even know if Warden is capable of running on other threads.
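Here's a rough sketch of the use-count variant from 3), with the spinlock built on an interlocked compare-exchange. Again, the names are placeholders and the real version would be ASM living in the unencrypted section:

```cpp
#include <windows.h>
#include <cstdint>
#include <cstddef>

// Same placeholders as in the call-gate sketch; filled in by the injector.
static uint8_t* g_EncryptedBase = nullptr;
static size_t   g_EncryptedSize = 0;
static uint32_t g_Key           = 0;

void RotXorBlock(uint8_t* data, size_t len, uint32_t key);  // rotating XOR from earlier

static volatile LONG g_Lock     = 0;   // spinlock guarding the use count
static LONG          g_UseCount = 0;   // how many threads are inside the DLL

static void SpinAcquire()
{
    // lock cmpxchg under the hood -- spin until we swap 0 -> 1.
    while (InterlockedCompareExchange(&g_Lock, 1, 0) != 0)
        ;
}

static void SpinRelease()
{
    InterlockedExchange(&g_Lock, 0);
}

// Trampolines call GateEnter() before jumping into the injected code and
// GateExit() after it returns.
void GateEnter()
{
    SpinAcquire();
    if (++g_UseCount == 1)                                    // first one in decrypts
        RotXorBlock(g_EncryptedBase, g_EncryptedSize, g_Key);
    SpinRelease();
}

void GateExit()
{
    SpinAcquire();
    if (--g_UseCount == 0)                                    // last one out re-encrypts
        RotXorBlock(g_EncryptedBase, g_EncryptedSize, g_Key);
    SpinRelease();
}
```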
So, this is a fair amount of work. Here's what needs to happen:
1) Injector code (simple, and done)
2) After injection, the injector process needs to:
2a) Open the remote module and read its section table (see the first sketch after this list).
2b) Turn each injected call into a trampoline that does the encryption/decryption (this implies that the injector process needs to know the list of injected calls somehow; I was thinking that __declspec(dllexport) would work, since the exports will all be encrypted too, so you're not any more detectable -- see the export-walking sketch after this list).
2c) Manually build the encryptor/decryptor via injected ASM. This sounds hard, but it's not really; you can cook up a random key with something like GetTickCount() (thus preventing key-based hashing by Warden), then inject the calls to do the thread-barrier stuff, the decryption, the actual call, and the re-encryption.
2d) Unlink the module from the loaded-module list (this is extra stealth, and it also prevents the app from crashing when something walks the PEB LDR lists and hits your encrypted DLL header -- see the unlink sketch after this list). For extra security, wipe out the PE header and the import/export tables (now that you're done with them).
2e) Encrypt the sections (pretty much everything except your decryption code, if you decide to keep it in a custom code section).
2f) CreateRemoteThread on the decryptor call gateway, passing in a pointer to the "real" init code as well as the table of encryption entry trampolines built up in 2b.
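To give an idea of what 2a looks like, here's a sketch that reads the remote section table. It assumes the injector already has a handle with PROCESS_VM_READ, knows the remote module base, and is the same bitness as the client:

```cpp
#include <windows.h>
#include <vector>
#include <cstdint>

// Step 2a: read the PE section table of the already-injected module out of the
// remote process. hProc and remoteBase are assumed to come from the injector.
std::vector<IMAGE_SECTION_HEADER> ReadRemoteSections(HANDLE hProc, uint8_t* remoteBase)
{
    IMAGE_DOS_HEADER dos = {};
    IMAGE_NT_HEADERS nt  = {};
    std::vector<IMAGE_SECTION_HEADER> sections;

    if (!ReadProcessMemory(hProc, remoteBase, &dos, sizeof(dos), nullptr) ||
        dos.e_magic != IMAGE_DOS_SIGNATURE)
        return sections;

    if (!ReadProcessMemory(hProc, remoteBase + dos.e_lfanew, &nt, sizeof(nt), nullptr) ||
        nt.Signature != IMAGE_NT_SIGNATURE)
        return sections;

    // The section headers immediately follow the optional header.
    uint8_t* firstSection = remoteBase + dos.e_lfanew
                          + FIELD_OFFSET(IMAGE_NT_HEADERS, OptionalHeader)
                          + nt.FileHeader.SizeOfOptionalHeader;

    sections.resize(nt.FileHeader.NumberOfSections);
    ReadProcessMemory(hProc, firstSection, sections.data(),
                      sections.size() * sizeof(IMAGE_SECTION_HEADER), nullptr);
    return sections;
}
```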
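For 2b, the export-walking side could look something like this. It maps a local copy of the DLL (without running DllMain) and pulls out every exported name and RVA, on the assumption that every hook handler is dllexport'ed; add the remote module base to each RVA and you have the target address for each trampoline:

```cpp
#include <windows.h>
#include <cstdint>
#include <string>
#include <vector>

// Step 2b helper: the export list *is* the trampoline list.
struct ExportEntry { std::string name; uint32_t rva; };

std::vector<ExportEntry> ListExports(const char* dllPath)
{
    std::vector<ExportEntry> out;

    // Map the image without calling DllMain or resolving imports.
    HMODULE mod = LoadLibraryExA(dllPath, nullptr, DONT_RESOLVE_DLL_REFERENCES);
    if (!mod) return out;

    uint8_t* base = reinterpret_cast<uint8_t*>(mod);
    auto dos = reinterpret_cast<IMAGE_DOS_HEADER*>(base);
    auto nt  = reinterpret_cast<IMAGE_NT_HEADERS*>(base + dos->e_lfanew);

    DWORD expRva = nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;
    if (expRva)
    {
        auto exp   = reinterpret_cast<IMAGE_EXPORT_DIRECTORY*>(base + expRva);
        auto names = reinterpret_cast<DWORD*>(base + exp->AddressOfNames);
        auto ords  = reinterpret_cast<WORD*> (base + exp->AddressOfNameOrdinals);
        auto funcs = reinterpret_cast<DWORD*>(base + exp->AddressOfFunctions);

        for (DWORD i = 0; i < exp->NumberOfNames; ++i)
            out.push_back({ reinterpret_cast<char*>(base + names[i]), funcs[ords[i]] });
    }

    FreeLibrary(mod);
    return out;
}
```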
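And for 2d, the unlink itself is the well-known PEB walk. This is the in-process version (which the init code from 2f could run before the first re-encryption); the injector-side version is the same walk done with ReadProcessMemory/WriteProcessMemory. Only the in-memory-order list is shown since that's the one winternl.h exposes publicly; the load-order and init-order lists are spliced the same way:

```cpp
#include <windows.h>
#include <winternl.h>

// Step 2d: unlink our module from the PEB loader list so module walks never
// touch the (now encrypted) DLL header.
void UnlinkModule(HMODULE hModule)
{
    PEB* peb = NtCurrentTeb()->ProcessEnvironmentBlock;
    LIST_ENTRY* head = &peb->Ldr->InMemoryOrderModuleList;

    for (LIST_ENTRY* cur = head->Flink; cur != head; cur = cur->Flink)
    {
        LDR_DATA_TABLE_ENTRY* entry =
            CONTAINING_RECORD(cur, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
        if (entry->DllBase == (PVOID)hModule)
        {
            // Splice this entry out of the doubly linked list.
            cur->Blink->Flink = cur->Flink;
            cur->Flink->Blink = cur->Blink;
            break;
        }
    }
}
```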
Once this is done, the remote DLL is injected, encrypted, and running. The init code (after the initial decryption) will set up hooks as usual (IAT, detours -- they should all work), but instead of pointing directly at the functions in your injected module, they should point at the appropriate entries in the trampoline table from 2b.
Et voila, a self-encrypting/decrypting injected DLL.
If I can make it work, it will be nigh impossible for Warden to signature-hash your encrypted code. Better yet, it shouldn't be tightly coupled to any individual codebase; it will be reusable, and it will make Blizzard's work much, much harder. Warden will be able to detect that SOMETHING is in your process (unless you detour/hook VirtualQuery etc.), but not what, and even a memory snapshot -- as long as the code was in the encrypted state -- will just look like a virus. Plausible deniability at its best.
So right now most of this is at the POC stage, but I'm hoping to have something working within a week (sooner if I can get some of the heavy hitters to help me with important chunks, like the mechanism for listing the hook functions that need call-gate trampolines, or the thread-safety mechanisms, etc.).
One of the reasons I'm even posting about this is to ask for help; unlike the usual "security through obscurity" techniques, if I can make this work, it will make the code theoretically unhashable (since I'm using polymorphic virus techniques), except for the encrypt/decrypt call gate (and I have a few tricks up my sleeve to tweak that, including metamorphic techniques, again from VX tech). This means that even if Warden devs read this thread, they won't be able to counter the technique without obtaining a memory dump and running forensic analysis. Any real reverse engineer could counter it in minutes, but the point is that Blizzard doesn't have the time and money to do forensic analysis on a bunch of random boxes, especially if nothing is tripping the hash-signature detection routines in Warden to tip Blizz off that THIS box is special. It becomes an essentially unbreakable Catch-22 for them: either they go super hardcore on their detection -- probably banning lots of innocents, probably crashing a lot, and probably running SUPER slow due to network transfers of HUGE amounts of memory data for analysis -- OR they simply won't be able to counter it.
By the way, if anyone steals this idea and implements it before I can, at least give credit to AMM for the idea.
I'm open to anyone pointing out flaws I've missed in my gedankenversuch here.