I can't find anything that can build a call trace while keeping the target process's UI usable, so I'm writing my own. My approach is to patch branch instructions with INT 3 and use an in-process vectored exception handler to log each hit and continue. I'm hoping the UI will stay at least somewhat responsive because (a) I'm not single-stepping, and (b) I'm handling the breakpoints in-process rather than from an external debugger.
Before I get too deep into things, does anyone have recommendations for a utility that can build complete traces? By complete, I mean "no missing branches". A ring buffer is fine, as long as there are no gaps in what it retains. Periodic stack snapshots aren't of any real interest to me.
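
To make "a ring buffer with no gaps" concrete, here is a minimal sketch in C of the kind of log I have in mind. All of the names (`trace_ring`, `trace_push`, `trace_at`, `TRACE_CAPACITY`) are made up for illustration and are not taken from any existing tool. It also assumes a single writer, so a real handler running on several threads would need per-thread buffers or atomics on top of this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; a power of two so the slot index can be masked. */
#define TRACE_CAPACITY 8u

typedef struct {
    uint64_t  seq;   /* position of this branch in the full trace */
    uintptr_t from;  /* address of the patched branch instruction */
    uintptr_t to;    /* address that control transferred to */
} trace_entry;

typedef struct {
    trace_entry entries[TRACE_CAPACITY];
    uint64_t    next_seq;  /* total number of branches logged so far */
} trace_ring;

static void trace_init(trace_ring *r)
{
    r->next_seq = 0;
}

/* Log one branch; once the buffer is full, the oldest entry is overwritten. */
static void trace_push(trace_ring *r, uintptr_t from, uintptr_t to)
{
    trace_entry *e = &r->entries[r->next_seq & (TRACE_CAPACITY - 1u)];
    e->seq  = r->next_seq;
    e->from = from;
    e->to   = to;
    r->next_seq++;
}

/* Number of entries currently retained (at most TRACE_CAPACITY). */
static size_t trace_count(const trace_ring *r)
{
    return r->next_seq < TRACE_CAPACITY ? (size_t)r->next_seq
                                        : (size_t)TRACE_CAPACITY;
}

/* Entry i of the retained window, where i = 0 is the oldest. */
static const trace_entry *trace_at(const trace_ring *r, size_t i)
{
    assert(i < trace_count(r));
    uint64_t first = r->next_seq - trace_count(r);
    return &r->entries[(first + i) & (TRACE_CAPACITY - 1u)];
}

/* Returns 1 if the retained window has consecutive sequence numbers. */
static int trace_is_contiguous(const trace_ring *r)
{
    size_t n = trace_count(r);
    for (size_t i = 1; i < n; i++) {
        if (trace_at(r, i)->seq != trace_at(r, i - 1)->seq + 1)
            return 0;
    }
    return 1;
}
```

The sequence number is the point of the design: old entries may fall off the back, but every branch inside the retained window is accounted for, and a reader can verify that by checking that the `seq` values are consecutive.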