Using the Task Parallel Library to Enforce Render-Thread Invocation

  1. #1
    amadmonk (Active Member)

    Using the Task Parallel Library to Enforce Render-Thread Invocation

    A pretty common problem when writing in-process bots is this:

    1. I want to do some long-running work that will bog down the render thread, so I want to put it in another thread (or a background thread, or whatever). There are a lot of examples of this kind of work (local pathfinding, status spamming to a network connection, etc.).

    2. However, some or all of this work must run in the render thread. This is because WoW stores some important constructs in thread-local storage (TLS), which means that unless you poke this TLS info into every other thread you create, you risk crashing the game any time you touch certain pieces of information.

    It is definitely possible to solve this problem by setting the TLS values in the worker thread and being very, very careful with your thread synchronization. Even then, you run the risk of crashes because game state can be inconsistent under this architecture (for instance, accessing the same Lua state object on multiple threads at the same time is NEVER going to be very safe, and that's just one example). Until now, the only really 100% "safe" solution has been to always run in the render thread.

    Microsoft has been pushing code parallelism a lot lately, with things like parallel LINQ (PLINQ) stealing much of the spotlight. However, a bit of code that's standard in .NET 4.0 is the Task Parallel Library (System.Threading.Tasks). Essentially this gives you, among other things, a seamless and easy way to package a unit of work up into a "task" and ship it off to... somewhere else... to be run, and then fetch the results locally. In WoW, you can use a little bit of glue code to ship all of your render thread work to the render thread, but make it look like it runs locally!

    Enough talk. Here's the code snippet:

    Code:
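    // Requires: using System; using System.Collections.Generic;
    //           using System.Runtime.InteropServices; using System.Threading.Tasks;

    // Win32 call that returns the native ID of the calling thread.
    [DllImport("kernel32.dll")]
    private static extern uint GetCurrentThreadId();

    // Native ID of the render thread. Assumption: you capture it once from
    // inside the render thread (e.g. on the first EndScene call).
    private uint _renderThreadId;

    // Tasks waiting for the render thread; only modified under lock (Work).
    public readonly Queue<Task> Work = new Queue<Task>();

    // Runs a function on the render thread and hands its result back to the caller.
    public T RunOnRenderThread<T>(Func<T> function)
    {
        var task = new Task<T>(function);
        RunOnRenderThread(task);
        return task.Result;
    }

    // Same as above, for work that returns nothing.
    public void RunOnRenderThread(Action action)
    {
        RunOnRenderThread(new Task(action));
    }

    public void RunOnRenderThread(Task task)
    {
        // if we're already on the render thread, just run...
        if (GetCurrentThreadId() == _renderThreadId)
        {
            task.RunSynchronously();
        }
        else
        {
            // otherwise, queue up a task to do the work
            lock (Work)
                Work.Enqueue(task);

            // and wait for it (blocks until EndScene has run the task)
            task.Wait();
        }
    }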
    Now you're building up a queue of Task objects to execute in the render thread. So in the render thread's EndScene method, you need something like this:

    Code:
    ...
    
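    // Cheap unlocked peek so an empty queue costs almost nothing per frame;
    // anything enqueued just after this check is picked up on the next frame.
    if (Work.Count > 0)
        lock (Work)
            while (Work.Count > 0)
            {
                // Drain the whole queue while holding the lock.
                var task = Work.Dequeue();
                task.RunSynchronously();
            }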
    
    ...
    And that (along with tracking the render thread ID, so that RunOnRenderThread knows which thread is the render thread) is all you need. Note that we favor the render thread in the lock contention: each EndScene drains the entire queue while holding the lock, whereas each call to RunOnRenderThread only holds the lock long enough to enqueue one item. This is for two reasons. First, the render thread is more important. Second, if you use this code a lot, it's easy to forget that each cross-thread call blocks the worker until the next EndScene, so you are essentially "gating" your worker thread to run as fast as the render thread does -- one call per frame!

    There are ways to get around the one-call-per-frame limitation. One of the simplest is to package more work into a single task. These are just Actions or Funcs, after all, so you can write an anonymous delegate that executes a batch of them at a time. This is what I do in my code, and it's pretty effective.
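
    The batching idea above can be sketched as a small helper. This is only an illustration: RenderBatch and Combine are hypothetical names (not part of the TPL or of the code in this post), and the sketch assumes every call in a batch returns the same type.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the TPL.
public static class RenderBatch
{
    // Wraps several delegates in one, so the whole group costs a single
    // cross-thread round trip instead of one round trip per call.
    public static Func<T[]> Combine<T>(params Func<T>[] calls)
    {
        // Nothing runs here; the returned delegate invokes every call, in
        // order, on whichever thread eventually executes it.
        return () => calls.Select(call => call()).ToArray();
    }
}
```

    Passing the combined delegate to RunOnRenderThread would then queue one task for the whole group, for example RunOnRenderThread(RenderBatch.Combine(() => a.GetPosition(), () => b.GetPosition())), where a and b are made-up objects standing in for whatever you need to read.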

    Here's a sample usage:

    Code:
    ...
    var myPosition = RunOnRenderThread(() => player.GetPosition());
    ...
    And that's it... now the GetPosition call will always automatically happen on the right thread. If you're on the render thread, this will just loop back and execute synchronously. If you're NOT on the render thread, it will queue up a work item, wait for it to be completed, and then shuttle the result back to the calling thread.

    Note that this isn't magic -- you could have done this exact same thing with a queue of Actions or Funcs or whatever. However, the TPL makes the code much, much cleaner and simpler -- you don't have to write any of the waiting/synchronization code yourself (each task potentially has a wait while we compute the results, plus you have to associate queued results with queued tasks). So, while the TPL doesn't technically (in this case) allow you to do anything you couldn't do before, it makes the code so much cleaner that there's no real reason NOT to implement this in your bots; even if you don't use it very often due to the "render thread gating" effect, it can save you some grief.
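
    For comparison, a hand-rolled work item without the TPL might look roughly like the sketch below. ManualWorkItem is a hypothetical name; the point is only to show the result storage, exception capture, and wait handle that you would otherwise have to write yourself.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for Task<T>, written by hand.
public sealed class ManualWorkItem<T>
{
    private readonly Func<T> _function;
    private readonly ManualResetEventSlim _done = new ManualResetEventSlim(false);
    private T _result;
    private Exception _error;

    public ManualWorkItem(Func<T> function)
    {
        _function = function;
    }

    // Called by the render thread when it dequeues the item.
    public void Execute()
    {
        try { _result = _function(); }
        catch (Exception ex) { _error = ex; }   // keep the failure for the waiting caller
        finally { _done.Set(); }                // always release the waiting caller
    }

    // Called by the worker thread; blocks until Execute has finished.
    public T WaitForResult()
    {
        _done.Wait();
        if (_error != null)
            throw new InvalidOperationException("Work item failed", _error);
        return _result;
    }
}
```

    To keep items of different result types in one queue you would additionally need a non-generic interface or base class exposing Execute, which is one more piece that the Task / Task<T> pair already covers.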

    Edit: forgot to mention that this technique is infinitely extensible; if you want to define some methods that can invoke in a parallel thread, with the simple requirement that the TLS data is set, you can extend this design pattern so that you use a custom Task object that has an "invocation policy." This policy can then determine (a) whether it's okay to run this task synchronously, or (b) if not, how to run it in the "correct" invocation context. Using this, you can have all kinds of interesting arbitrary call patterns, including RPC over the network (but figuring out how to serialize an Action or Func is something that I'll leave for advanced readers). You can also use this technique to enforce a memoization policy (which can dramatically improve your FPS, since for most values you really don't need a "live" reading every frame). Of course, things can get weird if you have "chained" policies like a memoization policy and a cross-thread invocation policy.
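
    As a rough illustration of the memoization idea only (not the custom Task with an invocation policy described above), a time-based cache around an expensive call could look like this. Memoized<T> and its maxAge parameter are hypothetical names chosen for the sketch.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: caches the result of an expensive (e.g. cross-thread)
// call and only re-invokes the source once the cached value is too old.
public sealed class Memoized<T>
{
    private readonly Func<T> _source;
    private readonly TimeSpan _maxAge;
    private readonly Stopwatch _age = new Stopwatch();
    private readonly object _gate = new object();
    private T _value;

    public Memoized(Func<T> source, TimeSpan maxAge)
    {
        _source = source;
        _maxAge = maxAge;
    }

    public T Value
    {
        get
        {
            lock (_gate)
            {
                // Serve the cached value while it is still fresh.
                if (_age.IsRunning && _age.Elapsed <= _maxAge)
                    return _value;
            }

            // Call the source outside the lock: if it blocks waiting on
            // another thread, other readers are not stuck behind it.
            T fresh = _source();

            lock (_gate)
            {
                _value = fresh;
                _age.Restart();
            }
            return fresh;
        }
    }
}
```

    Wrapping a cross-thread call this way, for example new Memoized<T>(() => RunOnRenderThread(...), TimeSpan.FromMilliseconds(100)), would mean most reads never touch the queue at all. Two threads that miss the cache at the same moment may both refresh it; the sketch accepts that duplicate work rather than holding a lock across the call.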

    Edit2: yes, astute readers will realize that setting an invocation policy on generic Task objects begins to approach AOP. But since I really loathe all of the aspect-weaving tools I've come across, I'll leave the full AOP implementation up to you.
    Last edited by amadmonk; 12-13-2010 at 02:02 PM.
    Don't believe everything you think.

  2. #2
    Robske (Contributor)
    Very nice, thank you amadmonk - I was looking for something like this.
    "Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live." - Martin Golding
    "I cried a little earlier when I had to poop" - Sku

  3. #3
    XTZGZoReX (Active Member)
    Good approach; exactly what the TPL was made for.

    I should mention, though, that Task objects are relatively expensive; don't overuse them.

  4. #4
    dook123 (Active Member)
    Lots of things are "expensive" depending on "relative" conditions. I was once told OOP overhead was expensive and to write everything in assembly.
    ------------------------------
    If not me than who?

  5. #5
    amadmonk (Active Member)
    I'm instantiating, invoking, and GC'ing on the order of 100-500 Task objects/frame without seeing a frame rate hit.

    They may be expensive, but apparently not expensive enough to matter.

  6. #6
    adaephon (Active Member)
    Nice work, looks interesting. .NET 4 also has thread-safe collections, so it might be worth checking out System.Collections.Concurrent.ConcurrentQueue<T> instead of a normal queue + locks.
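
    A minimal sketch of that suggestion, keeping the structure of the first post: RenderThreadDispatcher, BindToCurrentThread and Pump are hypothetical names, and the managed thread ID is used here as a stand-in for the GetCurrentThreadId() / _renderThreadId pair so the example runs outside the game.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper around the pattern from the first post.
public sealed class RenderThreadDispatcher
{
    private readonly ConcurrentQueue<Task> _work = new ConcurrentQueue<Task>();
    private volatile int _renderThreadId = -1;

    // Call once from the render thread (e.g. on the first EndScene).
    public void BindToCurrentThread()
    {
        _renderThreadId = Thread.CurrentThread.ManagedThreadId;
    }

    public T RunOnRenderThread<T>(Func<T> function)
    {
        var task = new Task<T>(function);
        RunOnRenderThread(task);
        return task.Result;
    }

    public void RunOnRenderThread(Task task)
    {
        if (Thread.CurrentThread.ManagedThreadId == _renderThreadId)
        {
            task.RunSynchronously();
        }
        else
        {
            _work.Enqueue(task);   // thread-safe without a lock statement
            task.Wait();
        }
    }

    // Call from EndScene: runs every task currently in the queue and
    // returns how many were executed.
    public int Pump()
    {
        int ran = 0;
        Task task;
        while (_work.TryDequeue(out task))
        {
            task.RunSynchronously();
            ran++;
        }
        return ran;
    }
}
```

    One behavioral difference from the locked version: the render thread no longer shuts out enqueuers while it drains, so a task queued during Pump may run in the same frame instead of waiting for the next one.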

  7. #7
    XTZGZoReX (Active Member)
    Originally Posted by adaephon
        Nice work, looks interesting. .NET 4 also has thread-safe collections, so it might be worth checking out System.Collections.Concurrent.ConcurrentQueue<T> instead of a normal queue + locks.
    Absolutely. It uses CAS (compare-and-swap) instead of locks for most operations.

    Originally Posted by amadmonk
        I'm instantiating, invoking, and GC'ing on the order of 100-500 Task objects/frame without seeing a frame rate hit.

        They may be expensive, but apparently not expensive enough to matter.
    OK, when you're in that range you should be just fine. The stuff I deal with is in the 100,000-200,000 range, so yeah... I had to resort to a Task-less actor model based on coroutines because the Task objects were thrashing the GC.

    Originally Posted by dook123
        Lots of things are "expensive" depending on "relative" conditions. I was once told OOP overhead was expensive and to write everything in assembly.
    I'm not an unreasonable person; I do realize what real overhead in the real world is. But that's completely irrelevant here.

  8. #8
    amadmonk (Active Member)
    Originally Posted by XTZGZoReX
        OK, when you're in that range you should be just fine. The stuff I deal with is in the 100,000-200,000 range, so yeah... I had to resort to a Task-less actor model based on coroutines because the Task objects were thrashing the GC.
    Ah yes, most of my local bot logic (it's distributed) is in Lua, so there's not a need for quite so many cross-thread invocations. Since I get simple property gets (etc.) for "free" in Lua, I don't have to do thousands (or more) of checks from a remote thread.
