OS X Pattern Searching

  1. #1
    ProbablyEngine

    OS X Pattern Searching

    I'm working on pattern searching in OS X for my releases of ProbablyEngine.

    Here is the code I have now. It's incredibly slow.

    Code:
    /*
     *
     * Author: Grant Douglas (@Hexploitable)
     *
     */
    
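    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>             // memcpy, memcmp
    #include <dlfcn.h>
    #include <mach/mach.h>          // task_for_pid, mach_task_self
    #include <mach/mach_error.h>    // mach_error_string
    #include <mach/mach_vm.h>       // mach_vm_region
    #include <mach/vm_map.h>
    #include <mach-o/dyld_images.h>
    
    int pid = 0;
    int g_pid = 0;
    #define sigSize 4 //Number of bytes used in signature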
    
    
    // Scans size bytes starting at addr in the target process, one byte offset
    // at a time. The return value is an address in the target, not a pointer
    // that can be dereferenced locally.
    mach_vm_address_t *scanMem(int pid, mach_vm_address_t addr, mach_vm_size_t size)
    {
        task_t t;
        if (task_for_pid(mach_task_self(), pid, &t) != KERN_SUCCESS)
            return NULL;
        mach_vm_size_t bytesRead = 0;
        kern_return_t kr_val;
        pointer_t buf;
        mach_msg_type_number_t sz;
        //Method signature - taken from disassembler
        unsigned char signature[sigSize] = "\x74\x69\x83\xf9";
    
        while (bytesRead < size)
        {
            // One kernel call per offset: read sigSize bytes at addr
            if ((kr_val = vm_read(t, addr, sigSize, &buf, &sz)) == KERN_SUCCESS)
            {
                int match = (sz == sigSize && memcmp((const void *)buf, signature, sigSize) == 0);
                // vm_read() hands back a freshly mapped buffer each time; release it
                vm_deallocate(mach_task_self(), buf, sz);
                if (match)
                    return (mach_vm_address_t *)addr;
                printf("[-] 0x%llx ---> vm_read()\r", (unsigned long long)addr);
                fflush(stdout);
                //usleep(50);
            }
            else
            {
                printf("[-] 0x%llx ---> vm_read().\r", (unsigned long long)addr);
                fflush(stdout);
            }
            addr += sizeof(unsigned char);
            bytesRead += sizeof(unsigned char);
        }
        printf("[i] Scanning ended without a match.\r\n");
        fflush(stdout);
        return NULL;
    }
    
    
    // Walks the target's VM regions from address upward, then scans from the
    // start of the first region found.
    unsigned int *getMemRegions(task_t task, mach_vm_address_t address)
    {
        kern_return_t kret;
        vm_region_basic_info_data_64_t info;
        mach_vm_size_t size = 0;
    
        mach_port_t object_name;
        mach_msg_type_number_t count;
        mach_vm_address_t firstRegionBegin = 0;
        mach_vm_address_t lastRegionEnd = 0;
        mach_vm_size_t fullSize = 0;    // running total of region sizes
    
        int regionCount = 0;
        int flag = 0;
        while (flag == 0)
        {
            // count is an in/out parameter, so reset it before each call
            count = VM_REGION_BASIC_INFO_COUNT_64;
            //Attempts to get the region info for given task
            kret = mach_vm_region(task, &address, &size, VM_REGION_BASIC_INFO_64, (vm_region_info_t) &info, &count, &object_name);
            if (kret == KERN_SUCCESS)
            {
                if (regionCount == 0)
                {
                    firstRegionBegin = address;
                    regionCount += 1;
                }
                fullSize += size;
                address += size;
            }
            else
                flag = 1;
        }
        lastRegionEnd = address;
        printf("[+] Region to scan: 0x%llx - 0x%llx\n", (unsigned long long)firstRegionBegin, (unsigned long long)lastRegionEnd);
        // fullSize sums the region sizes only; gaps between regions are not counted
        unsigned int *ptrToFunc = (unsigned int *)scanMem(pid, firstRegionBegin, fullSize);
        return ptrToFunc;
    }
    
    
    int main() {
        kern_return_t rc;
        mach_port_t task;
        mach_vm_address_t addr = 1;
    
        printf("[+] Please specify the pid of target process.\n");
        if (scanf("%d", &pid) != 1)
        {
            fprintf(stderr, "[-] Could not read a pid.\n");
            exit(1);
        }
        g_pid = pid;
    
        rc = task_for_pid(mach_task_self(), pid, &task);
        if (rc != KERN_SUCCESS)
        {
            fprintf(stderr, "[-] task_for_pid() failed, error %d - %s\n", rc, mach_error_string(rc));
            exit(1);
        }
    
        printf("[i] RC %d ---> Task %u\n\n", rc, task);
    
        // sym is an address inside the target process; it is only printed here
        unsigned int *sym = getMemRegions(task, addr);
        if (sym != NULL)
            printf("[$] Located target function ---> %p\n", (void *)sym);
        else
            printf("[-] Didn't find the function.\n");
    
        return 0;
    }
    I've returned the code to its original version, with comments and verbose output. Even with the extra bits removed, it's way too slow. Does anyone have any experience working with this on OS X?

  2. #2
    Jadd
    So you're issuing a 4-byte read at every byte offset; I can see why that would be slow. Instead, do one big memory read at the beginning and use the same buffer for each comparison.

    Edit: I'm sure you can find some simple Windows FindPattern snippets which you could easily convert if need be.
    Last edited by Jadd; 10-21-2013 at 05:16 PM.
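
A minimal sketch of the single-buffer approach suggested above, assuming the target's bytes have already been copied into a local buffer by one large read. The name find_pattern is hypothetical and not from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of sig in buf, or -1 if absent.
 * buf is assumed to hold bytes already copied out of the target process. */
long find_pattern(const unsigned char *buf, size_t buf_len,
                  const unsigned char *sig, size_t sig_len)
{
    assert(buf != NULL && sig != NULL);
    if (sig_len == 0 || sig_len > buf_len)
        return -1;
    /* Stop early enough that buf[i .. i + sig_len) stays inside the buffer. */
    for (size_t i = 0; i + sig_len <= buf_len; i++) {
        if (memcmp(buf + i, sig, sig_len) == 0)
            return (long)i;
    }
    return -1;
}
```

Because the read happens once, the loop itself never enters the kernel; its cost is only the comparisons.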

  3. #3
    ProbablyEngine
    I'd thought about reading in a few KB at a time. I'll work with that some.

  4. #4
    Jadd
    Originally Posted by phelpsben
    I'd thought about reading in a few kb at a time, I'll work with that some.
    The fewer memory read calls, the better. Even if you load whole modules into memory, the buffers they are stored in will be freed once the scan is done, so the overhead is only present for a very short time.

  5. #5
    ProbablyEngine
    Here is the version I've got now. It reads in 1 MB at a time, and searches are pretty much instant (less than 80 ms).

    Code:
    // Byte Searching for OS X
    // BenPhelps http://benphelps.me/
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>     // memcpy
    #include <stdint.h>     // uint32_t
    #include <mach/mach.h>
    
    unsigned int scanMem(task_t wow, mach_vm_address_t start, mach_msg_type_number_t size, unsigned char *signature, int signature_size)
    {
        unsigned int buffer_size = 0x100000;
        int bytes_read = 0;
        uint32_t sz;
        while (bytes_read <= size)
        {
            unsigned char buffer[buffer_size];
            // offset the read by start so the scan begins where the caller asked
            unsigned int address = start + bytes_read;
            pointer_t buffer_pointer;
            vm_read(wow, address, buffer_size, &buffer_pointer, &sz);
            // copy over to us
            memcpy(buffer, (const void *)buffer_pointer, sz);
            // parse 1mb
            unsigned int buffer_position = 0;
            while (buffer_position <= buffer_size) {
                unsigned int signature_start = buffer_position;
                unsigned int signature_position = 0;
                // parse bytes
                while (buffer[signature_start + signature_position] == signature[signature_position]) {
                    signature_position++;
                    if(signature_position == signature_size){
                        // address of the match in the target process
                        return address + buffer_position;
                    }
    
                }
                buffer_position++;
            }
            bytes_read+=buffer_size;
        }
        return 0;
    }
    
    
    int main() {
    
        kern_return_t kern_return;
        mach_port_t task;
    
        int pid = 0;
        scanf("%d", &pid);
    
        kern_return = task_for_pid(mach_task_self(), pid, &task);
        if (kern_return != KERN_SUCCESS)
        {
            printf("task_for_pid() failed, error %d - %s", kern_return, mach_error_string(kern_return));
            exit(1);
        }
    
        int signature_size = 4;
        unsigned char signature[4] = "\xBB\x8D\x24\x3F";
    
        unsigned int ptr = scanMem(task, 0x1000, 0xffff3000, signature, signature_size);
    
        if (ptr){
            printf("%X\n", ptr);
        }
        else {
            printf("Nothing found.\n");
        }
    
        return 0;
    }

  6. #6
    Cypher
    That's a good start, but there are still a few things you probably want to address.

    e.g.
    What happens if a signature straddles a buffer boundary? (Currently it seems you overflow the end of buffer and read invalid memory.)
    What happens if vm_read fails?
    What happens if vm_read only does a partial read? (Not familiar with the OSX APIs so perhaps this can't happen, but based on the 'sz' parameter I'm assuming that it can...)
    What happens if sizeof(unsigned int) != sizeof(void*)? (Often the case on 64-bit platforms. You probably want something like uintptr_t -- or an actual pointer type.)
    That's a pretty massive buffer to be putting on the stack. You may want to consider moving that to the heap.
    You would probably get even better performance with a more appropriate algorithm (i.e. something smarter than 'brute force'), but it of course depends on your use case, average signature length, etc. Example: the Boyer-Moore-Horspool algorithm (see Wikipedia).
    Probably more that I'm overlooking that someone else will undoubtedly pick up.

    Good luck.
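
For reference, a compact sketch of the Boyer-Moore-Horspool search mentioned above. It illustrates the technique and is not code from the thread; bmh_search is a hypothetical name, and it works on a local buffer that is assumed to have been read already.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Boyer-Moore-Horspool search. Returns the offset of the first match of
 * needle in hay, or -1 if there is none. */
long bmh_search(const unsigned char *hay, size_t hay_len,
                const unsigned char *needle, size_t needle_len)
{
    size_t skip[UCHAR_MAX + 1];

    assert(hay != NULL && needle != NULL);
    if (needle_len == 0 || needle_len > hay_len)
        return -1;

    /* A byte that does not occur in the needle allows a full-length jump. */
    for (size_t i = 0; i <= UCHAR_MAX; i++)
        skip[i] = needle_len;
    /* For bytes in the needle (all but its last position), jump just far
     * enough to line up their last occurrence. */
    for (size_t i = 0; i + 1 < needle_len; i++)
        skip[needle[i]] = needle_len - 1 - i;

    size_t pos = 0;
    while (pos + needle_len <= hay_len) {
        if (memcmp(hay + pos, needle, needle_len) == 0)
            return (long)pos;
        /* Shift by the table entry of the haystack byte that sits under the
         * needle's last position. */
        pos += skip[hay[pos + needle_len - 1]];
    }
    return -1;
}
```

With a 4-byte signature the table can skip at most 4 bytes per step, so the gain over a plain scan is modest; it grows with longer signatures.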

  7. #7
    ProbablyEngine
    What happens if a signature straddles a buffer boundary? (Currently it seems you overflow the end of buffer and read invalid memory.)
    I've only really thought about that; I figured I could fix it by reading the signature length backwards, in bytes, on subsequent passes.

    What happens if vm_read only does a partial read? (Not familiar with the OSX APIs so perhaps this can't happen, but based on the 'sz' parameter I'm assuming that it can...)
    I honestly don't know how the API truly works; it's not documented at all.

    That's a pretty massive buffer to be putting on the stack. You may want to consider moving that to the heap.
    Very true, but it exists for less than a second, so I'm not too worried.

    You would probably get even better performance with a more appropriate algorithm
    I might do this as an exercise in C, but I honestly don't think it's required (not for my use case anyhow).
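
One way to handle the boundary, failed-read, and stack-size questions discussed above, sketched under assumptions: consecutive chunks overlap by the signature length minus one, the chunk buffer lives on the heap, and only the bytes a read actually returned are compared. The read_fn callback, struct fake_mem, and read_fake are hypothetical stand-ins (on OS X the callback could wrap vm_read()), so the logic can be exercised without a target process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reader: copies up to len bytes at addr into dst and returns
 * how many bytes it really copied (0 on failure). */
typedef size_t (*read_fn)(void *ctx, uint64_t addr, void *dst, size_t len);

/* Demo reader over a local array that pretends to live at address base. */
struct fake_mem { const unsigned char *bytes; uint64_t base; size_t len; };

size_t read_fake(void *ctx, uint64_t addr, void *dst, size_t len)
{
    const struct fake_mem *m = ctx;
    if (addr < m->base || addr - m->base >= m->len)
        return 0;                               /* address not mapped */
    size_t avail = m->len - (size_t)(addr - m->base);
    if (len > avail)
        len = avail;                            /* short read at the end */
    memcpy(dst, m->bytes + (addr - m->base), len);
    return len;
}

/* Scans [start, start + size) in chunks of chunk_len bytes. Returns the
 * address of the first match, or 0 if nothing was found. */
uint64_t scan_chunked(read_fn rd, void *ctx, uint64_t start, uint64_t size,
                      const unsigned char *sig, size_t sig_len, size_t chunk_len)
{
    assert(rd != NULL && sig != NULL && sig_len > 0 && chunk_len >= sig_len);
    unsigned char *buf = malloc(chunk_len);     /* heap, not stack */
    if (buf == NULL)
        return 0;

    uint64_t found = 0;
    uint64_t off = 0;
    while (found == 0 && off < size) {
        uint64_t left = size - off;
        size_t want = left < chunk_len ? (size_t)left : chunk_len;
        size_t got = rd(ctx, start + off, buf, want);   /* may be short or 0 */

        /* Compare only inside the bytes that were really read. */
        for (size_t i = 0; i + sig_len <= got; i++) {
            if (memcmp(buf + i, sig, sig_len) == 0) {
                found = start + off + i;
                break;
            }
        }
        if (want < chunk_len)
            break;                              /* that was the final chunk */
        /* Advance by less than a full chunk so the next read starts
         * sig_len - 1 bytes before this chunk's end. */
        off += chunk_len - (sig_len - 1);
    }
    free(buf);
    return found;
}
```

Addresses are carried as uint64_t so nothing depends on unsigned int being as wide as a pointer. As in the thread's code, 0 doubles as "not found".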

  8. #8
    ioctl
    Originally Posted by phelpsben

    Very true, but it exists for less than a second, so im not too worried.
    Not sure you asked for a code review, but you're getting one =)

    It's not the amount of time that's important. Stack space is pretty limited on many platforms, and it is best not to assume that there's 1 MB of it. Your code may misbehave in subtle or not so subtle ways if you overflow the stack.

    I'm not sure why you're doing the memcpy() in there. Why not just scan through the buffer that vm_read() returns? If you don't want to do a bunch of casting to get the compiler to stop complaining, just do it once: "unsigned char *buffer = (unsigned char*) buffer_pointer". Based on the docs, I assume that the pointer will be valid until the next time you call vm_read(). You're not checking for errors on vm_read(), but I assume you already know that you should =)

    Don't manually search for your pattern. All the code after "// parse 1mb" can be replaced with a single call to memmem(). It's easier to read and less prone to error than doing it yourself, and is, in all likelihood, much faster as well. As a bonus, you only have to refer to your memory buffer once, so you can pass buffer_pointer directly in with one cast to make the compiler happy.
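
A minimal sketch of the memmem() replacement described above, assuming the bytes are already in a local buffer; find_with_memmem is a hypothetical wrapper name. One caveat on the buffer-lifetime assumption: as far as I know, vm_read() maps a new region into the caller on every call and leaves it there until vm_deallocate() is called, so a scanning loop should release each buffer once it has been searched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* memmem() is a common extension (glibc, the BSDs, OS X) rather than ISO C.
 * Some <string.h> headers only expose it behind a feature macro, so it is
 * declared here explicitly with its usual prototype. */
void *memmem(const void *haystack, size_t haystack_len,
             const void *needle, size_t needle_len);

/* Returns the offset of sig inside buf, or -1 if it is not present. */
long find_with_memmem(const void *buf, size_t buf_len,
                      const void *sig, size_t sig_len)
{
    assert(buf != NULL && sig != NULL);
    const unsigned char *hit = memmem(buf, buf_len, sig, sig_len);
    return hit ? (long)(hit - (const unsigned char *)buf) : -1;
}
```

Since memmem() is told the haystack length, it cannot run past the end of the buffer the way the hand-written loop can.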

    Code:
     int signature_size = 4;
     unsigned char signature[4] = "\xBB\x8D\x24\x3F";
    This is poor style and fragile. You have three constants in these two lines that need to be kept in sync for the program to be correct, and they already disagree: the string literal is five bytes long, including the trailing \0, but the array only has room for 4. C accepts this and silently drops the \0 (C++ rejects it), so the code works, but nothing ties the two 4s to the literal. When you initialize an array with a string constant (as you do here), C lets you leave out the array size, getting rid of one unnecessary constant. Further, sizeof(myArrayVariable) returns the size of the array in bytes:

    Code:
    unsigned char signature[] = ".....";
    int signature_size = sizeof(signature) - 1; // we don't care about the trailing nul.
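
A small sketch of the two ways to let the compiler size the array, using the signature bytes from the thread. The variable and helper names are hypothetical. The brace form is an alternative not mentioned above: it stores exactly the listed bytes, so there is no trailing NUL to subtract.

```c
#include <assert.h>
#include <stddef.h>

/* String-literal initializer: the compiler sizes the array and appends a NUL. */
static const unsigned char sig_from_string[] = "\xBB\x8D\x24\x3F";
static_assert(sizeof(sig_from_string) == 5, "four bytes plus the NUL");

/* Brace initializer: exactly the listed bytes, nothing appended. */
static const unsigned char sig_from_bytes[] = {0xBB, 0x8D, 0x24, 0x3F};
static_assert(sizeof(sig_from_bytes) == 4, "four bytes, no NUL");

/* Length helpers so callers never repeat the literal 4. */
size_t sig_from_string_len(void) { return sizeof(sig_from_string) - 1; }
size_t sig_from_bytes_len(void)  { return sizeof(sig_from_bytes); }
```
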
    Finally, it's important to understand what operations involve calling into the kernel: basically any I/O, interaction between tasks, etc. There is a massive overhead for doing this, and minimizing the number of times you do it is a huge win. That's why your 1 MB buffer is way faster than your 4-byte reads: you are making roughly one millionth as many system calls (one per 1,048,576 bytes scanned instead of one per byte).
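
To put numbers on the call count, here is a small worked calculation. read_calls is a hypothetical helper, and the range is the 0xffff3000 bytes passed to scanMem in the 1 MB version.

```c
#include <assert.h>
#include <stdint.h>

/* Number of read calls needed to cover range bytes when each call advances
 * the scan position by step bytes (rounded up). */
uint64_t read_calls(uint64_t range, uint64_t step)
{
    assert(step > 0);
    return range / step + (range % step != 0);
}
```

For that range, advancing one byte per call takes 4,294,914,048 calls, while advancing 0x100000 bytes per call takes 4,096.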

  9. #9
    ProbablyEngine
    Thanks, I'll work on fixing this up.

    I have no idea what I'm doing in C and there is pretty much nothing online that covers this, so everything helps!

  10. #10
    snakeninny
    Originally Posted by phelpsben
    Here is the version I've got now, it reads in 1MB at a time, searches are pretty much instant (less than 80ms).

    It's a good example of searching a string pattern in memory, but how do I search for a number, say an integer value?

  11. #11
    ioctl
    Use pointer casting to treat your integer as a binary string.

    Code:
      int needle = 0x12345678;
      int needle_length = sizeof(needle);
      if (memmem(haystack, haystack_length, &needle, needle_length)) { .... }
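
A self-contained version of the snippet above, as a sketch: find_int32 is a hypothetical helper, and the haystack is assumed to be a local copy of the target's memory. The integer's bytes are compared in this machine's byte order, which is assumed to match the target's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* memmem() is an extension that some <string.h> headers hide behind a
 * feature macro, so it is declared explicitly with its usual prototype. */
void *memmem(const void *haystack, size_t haystack_len,
             const void *needle, size_t needle_len);

/* Returns the offset of the first occurrence of value's in-memory bytes
 * inside haystack, or -1 if they do not occur. */
long find_int32(const void *haystack, size_t haystack_len, int32_t value)
{
    assert(haystack != NULL);
    const unsigned char *hit = memmem(haystack, haystack_len, &value, sizeof value);
    return hit ? (long)(hit - (const unsigned char *)haystack) : -1;
}
```

The search is byte-wise, so it also finds values stored at unaligned offsets.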
