Use of XNA if he hasn't done his own Matrix classes.
Most of Managed DirectX is just a COM wrapper. However, the math stuff is rewritten in C# due to the overhead of calling through the wrapper.
Interesting thing, that. I moved a goodly chunk of my bot out-of-process (for performance reasons, not security reasons; I know the risks) so there were a handful of functions I had to call from a non-render thread, including some big ones like DoString, and also TraceLine.
At first, I was super paranoid about races, and put locking EVERYWHERE (my render thread would block while another thread was calling DoString, etc.). My assumption was that, esp. w/DoString, since I'm potentially modifying a global state (the Lua state object in this case), it's a Very Bad Thing to do fully free-form calls.
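For anyone following along, the "lock everywhere" approach looks roughly like the sketch below. This is not the actual bot code: `FakeLuaState`, `GuardedDoString` and `HammerDoString` are made-up names, and the counter just stands in for whatever the real DoString does to the Lua state.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the game's Lua state; the real one lives in the engine.
struct FakeLuaState {
    int executed = 0;
};

// One lock shared by the render thread and every worker thread.
std::mutex g_engineLock;
FakeLuaState g_lua;

// Hypothetical wrapper: a real bot would call the engine's DoString in here.
void GuardedDoString(const std::string& script) {
    assert(!script.empty());
    std::lock_guard<std::mutex> hold(g_engineLock);  // blocks while another thread is inside
    ++g_lua.executed;                                // stands in for mutating the shared Lua state
}

// Start `threads` workers that each make `callsPerThread` guarded calls,
// then report how many calls the fake state saw in total.
int HammerDoString(int threads, int callsPerThread) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([callsPerThread] {
            for (int i = 0; i < callsPerThread; ++i) {
                GuardedDoString("print('hi')");
            }
        });
    }
    for (auto& worker : pool) {
        worker.join();
    }
    return g_lua.executed;
}
```

The render thread would take the same `g_engineLock` around its own engine calls, which is why it stalls whenever a worker is inside `GuardedDoString`.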
Edit: at FIRST at first, I used a super sekrit synch context to force everything to run in the render thread. But this was SLLLLOOOOOWWWW, so I went with an external thread/set the TLS magic method...
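A minimal sketch of the "force everything onto the render thread" idea, assuming you already have some per-frame hook to pump it from. `RenderThreadQueue`, `Post` and `Pump` are names invented for this example, not anything from the engine or from the original synch context.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Work items posted by any thread, executed only by the thread that calls Pump().
class RenderThreadQueue {
public:
    // Callable from any thread: enqueue work for the render thread.
    void Post(std::function<void()> task) {
        assert(task);  // reject empty callables up front
        std::lock_guard<std::mutex> hold(lock_);
        pending_.push(std::move(task));
    }

    // Call once per frame from the render thread (for example from a
    // hypothetical EndScene hook). Returns how many tasks ran.
    int Pump() {
        std::queue<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> hold(lock_);
            std::swap(batch, pending_);  // grab everything, release the lock quickly
        }
        int ran = 0;
        while (!batch.empty()) {
            batch.front()();  // runs on the caller's (render) thread
            batch.pop();
            ++ran;
        }
        return ran;
    }

private:
    std::mutex lock_;
    std::queue<std::function<void()>> pending_;
};
```

The catch is visible in the design: nothing posted runs until the next `Pump()`, so every call from outside waits at least one frame, which is one plausible reason this style feels slow.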
Performance was fine with the synch in, but on a whim, I took it (the synch) out, and the interesting thing is that I've yet to hit a single race condition (or in-game GPF, for that matter).
So, I'm not sure if I'm just getting lucky (although making literally hundreds of out-of-render-thread calls per second with no synchronization for many hours strains credulity about how "lucky" I can be; surely I'd have hit at least ONE fatal race during that time!) or what, but I'm starting to think that the "don't call engine APIs out of the render thread because you'll race and crash the game" thing may be more myth than reality.
Of course, none of this addresses what I believe is the REAL root cause of people doing out-of-thread calls having problems: crappy code. My code (although ugly to read) is NOT crappy, and thinks carefully about what's shared, what happens when, and so forth. Also, unlike most of the samples I've read on here, I'm not leaking memory/handles like a sieve. So, maybe that's it.
In fact, if I had an easy way to CreateRemoteThread, set the TLS stuff and any other setup state (ECX, pushing structures onto the stack, etc.), and an easy way to get non-trivial results back... I might just skip my current in-proc shim for render thread-specific calls and CreateRemoteThread for each call. I'd have to look at the thread startup/teardown perf costs first, though.
Last edited by amadmonk; 01-06-2010 at 04:16 PM.
Don't believe everything you think.
Release builds of C++ use method thunks, IIRC. So the "address of function" you're using is actually just giving you the address of a thunk, and you're copying the thunk into remote memory, which (of course) isn't valid in the target process.
I ran into this problem many moons ago when working on my "encryption call gate" code. There's a VC++ linker setting to disable the thunks, but I can't remember what it is (or even what the feature is called atm... it's not inlining, but something similar... Cypher would know).
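If you'd rather follow the thunk than fight the linker, here's a rough sketch. It assumes the thunk is a plain x86 `jmp rel32` (opcode 0xE9 followed by a 32-bit displacement), which is what incremental-link thunks usually look like but is not something stated in this thread; `ResolveJumpThunk` is a made-up helper name.

```cpp
#include <cassert>
#include <cstdint>

// If `code` starts with an x86 `jmp rel32` (opcode 0xE9), return the address the
// jump lands on; otherwise return `address` unchanged. `code` must point at
// 5 or more readable bytes, and `address` is where those bytes live in memory.
std::uintptr_t ResolveJumpThunk(const unsigned char* code, std::uintptr_t address) {
    assert(code != nullptr);
    if (code[0] != 0xE9) {
        return address;  // not a jump thunk: treat it as the real function body
    }
    // Assemble the little-endian 32-bit displacement byte by byte.
    const std::uint32_t raw = static_cast<std::uint32_t>(code[1]) |
                              (static_cast<std::uint32_t>(code[2]) << 8) |
                              (static_cast<std::uint32_t>(code[3]) << 16) |
                              (static_cast<std::uint32_t>(code[4]) << 24);
    const std::int32_t rel = static_cast<std::int32_t>(raw);
    // The displacement is relative to the end of the 5-byte jmp instruction.
    return address + 5 + static_cast<std::uintptr_t>(static_cast<std::intptr_t>(rel));
}
```

You'd call it with the function pointer's bytes and its own address, then copy from the returned address instead. It only follows one hop and only handles this one encoding.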
Thank you amadmonk and kynox.
Besides incremental linking, there's another switch to turn off RTC (runtime checks: chunks of code appended to the end of a function to detect stack corruption), which reduces the size of the chunk that has to be copied to the remote process. The injection memory size can be reduced to 1024 or even less.
Yeah, forgot to mention that. It references a .rdata variable, so you would need to relocate it if you were to inject it.. which would cause problems.
What you should do is include a disassembler (there are some very lightweight ones around) in your project, so you're only injecting what you need to.