Basic introduction to the structure of WoW's files and hex editing
Introduction
So .. I don't exactly know where to start. What we want to learn today is how to hex edit models, WMOs, ADTs etc. to get more editing possibilities. For example, you can edit different aspects of models, such as the animations or bones. It's just more in-depth than normal file swapping.
To understand everything, do not skip parts of this guide. Maybe you don't need to read the basics about hex, but reading them will help you understand the whole thing. Also, not everything is in the right order, since I am writing this down as it comes to my mind.
Basics about Hex editors etc
Everyone who has edited their armor the itemcache.WDB way has used a hex editor. For those who haven't, I will start by describing the basic use of a hex editor. The others may skip this section.
So, what is a hex editor? It's an application that allows you to edit files at the level of their raw bytes.
On computers everything is made up of zeros and ones. But no one can read 110011001110101011010110110101100100000011110010110111101110101 (b), which is a text file saying "fukk you". Text on computers is encoded in ASCII.
Every letter has a number assigned to it. The ASCII code for U is 85, which is 1010101 (b) written in binary. But as said, no one wants to read binary, so we use the hexadecimal system. In hex, 85 is 55 (h).
If you don't know what hex is, search for it on Google. Do not continue before you know that.
Let's get back to the "fukk you". In hex this is: 66 75 6b 6b 20 79 6f 75 (h). You can test that by converting each of these numbers to decimal and then typing the letters in a text editor by holding [alt] and entering the number on the numpad, aka the number keys on the right side of the keyboard.
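If you would rather let a script do the conversion, here is a minimal Python sketch. It only reproduces the numbers from the text; the helper names to_hex and to_bits are made up for illustration.

```python
# Convert the example text to its ASCII codes in hex and binary.
text = "fukk you"
data = text.encode("ascii")

def to_hex(raw):
    # Two hex digits per byte, separated by spaces.
    return " ".join(f"{b:02x}" for b in raw)

def to_bits(raw):
    # Eight binary digits per byte, separated by spaces.
    return " ".join(f"{b:08b}" for b in raw)

print(to_hex(data))
print(to_bits(data))
# The letter U in decimal, hex and binary.
print(ord("U"), hex(ord("U")), bin(ord("U")))
```

Note that to_bits keeps the leading zero of every byte, so its output has one more digit at the front than the binary string quoted earlier.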
If you'd open that file in a hex editor now, you will see the numbers named above. It may look like this:
But wait .. which editor to use?! Well, I use Notepad++ with the hex editor plugin. It's a free, extended notepad like the one shipped with Windows, but waaaaaaaaay better. So .. just Google for Notepad++ or use an editor of your own choice. I will always refer to Notepad++ if I need to.
The first thing to do after opening a file in Notepad++ is to switch to the hex view. Just click the H button or press [ctrl]+[alt]+[shift]+[h].
So, how is our window built up? At the top we have Notepad++'s tabs. It's like tabbed browsing: you can have several files open at the same time. Nothing great, but useful sometimes.
The remaining parts are the same in every editor. The layout is divided into 2 big parts, the header and the content. In the header you can see the 18 labels for the columns: the Address one, the +x ones and the Dump one.
The Dump column:
Here you can see our great "fukk you". That column just shows an ASCII dump of our content.
The Address column and the +xes:
This shows us where in the file we are. Since we only have such a small bit of text, we only see 000000000 in the Address column.
But in the +x columns we see "66 75 6b 6b 20 79 6f 75" again, divided into separate columns.
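To make the three kinds of columns concrete, here is a rough Python sketch that prints a file's bytes in the same Address / +x / Dump arrangement. It is an approximation of what a hex editor shows, not Notepad++'s actual output format; the function name hexdump is an assumption.

```python
def hexdump(raw, width=16):
    """Return one text line per row: address, hex bytes, ASCII dump."""
    pad = width * 3 - 1  # room for a full row of "xx " groups
    lines = []
    for row in range(0, len(raw), width):
        chunk = raw[row:row + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Bytes that are not printable ASCII are shown as "." in the dump.
        dump = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{row:08x}  {hex_part:<{pad}}  {dump}")
    return lines

for line in hexdump(b"fukk you"):
    print(line)
```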
To ask which character is at position 4 in the file, we can also say "what is the letter at 0x4?". Positions are counted from zero, so 0x4 is actually the fifth byte. The 0x prefix shows us that the number is in hex, since programmers are too lazy to write down HEX! all the time.
So what does 0x4 mean .. it's just saying .. 4. Since you guys all know hex ... I hope.
Now try to read which number is at 0x4 yourself. If you got 0x20, you're right and may now try to get the one at 0x7. If you get 0x75 there, you're just awesome.
But we do not always have just one line in there... Let's think about 5 lines of text. That size may be around 0x1CA bytes. Just load a part of this guide into your test file now. Be sure to copy more than one line, so you get a bit more than a "page" of hex. Now try to browse to a more "complicated" address. Maybe take 0x5D. I will use that in my next example.
To browse to a specific position, also called an offset, you need to split your address. Since you have 0x5D, you split it into 0x50 and 0xD, so that the last digit stands on its own. What do you get by doing that with 0x1F4DE?
Right, 0x1F4D0 and 0xE! Maybe you got it already. The first number is the row in the Address column and the second one is the +x column.
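The splitting rule can be written down in a couple of lines of Python. This is only a sketch of the arithmetic, assuming 16 bytes per row as in the layout described above; split_offset is a made-up helper name.

```python
def split_offset(offset):
    """Split an offset into (Address row, +x column) for 16 bytes per row."""
    row = offset & ~0xF  # everything except the last hex digit
    col = offset & 0xF   # the last hex digit
    return row, col

data = b"fukk you"
# Offsets count from zero, so data[0x4] is the fifth byte.
print(hex(data[0x4]), hex(data[0x7]))

row, col = split_offset(0x1F4DE)
print(hex(row), hex(col))
```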
Now try to get some practice in reading it.
You should be able to use a hex editor now. Just practice a bit on some text files...
What's the difference to WoW's files now?
Well, files that describe models, worlds, textures and so on don't only contain text like in the ASCII -> hex example. In addition, those files have a defined structure.
WoW uses different file structures. If you open a WMO, you will find a lot of differences to an M2, but not to an ADT.
We divide them into 2 main types:
Chunked file structures:
Examples of chunked files: ADT, WDT, WDL, WMO.
These files are made up of blocks. You get a block defining the file type, one defining the header, aka some basic information, and other ones defining the actual content of the file. Those blocks are identified by 4 characters in the ASCII we all know now. But: they are in reversed order! If you look for the MOHD chunk, you have to search for DHOM in the file. This is most likely because WoW does not see MOHD as text but as a number that identifies the chunk, and that number just - god knows why - matches readable text, which makes it possible for us to read them.
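A small Python sketch of the reversal, for illustration only. It shows that the bytes found in the file, read as one little-endian 32-bit number, give the same value as the readable name written out in order, which fits the guess that the game compares numbers rather than text. The helper name chunk_id_on_disk is made up.

```python
import struct

def chunk_id_on_disk(name):
    """Return the 4 bytes to search for in the file for a chunk name."""
    return name.encode("ascii")[::-1]

magic = chunk_id_on_disk("MOHD")
print(magic)  # the reversed identifier as stored in the file

# Read the stored bytes as a single little-endian uint32.
as_number = struct.unpack("<I", magic)[0]
print(hex(as_number))
```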
Following this identifier is the size of the block. But wait .. how do we tell WoW the size of the block? For this we need some basics about data types.
Well, we have the following basic data types in our computers:
- Integer, defining a number like 10.
- Float, defining a number like 5.23.
- Characters, those are the ASCII ones named above.
These are the basic ones. We can divide them into several subtypes again, since every type of data can come in different sizes. Since the basic types are 32 bits (4 bytes), they can only hold a specific range of numbers. I don't know the exact values by heart, but you may look them up on Wikipedia or Google. Sometimes we need a larger range of numbers, or we only need positive numbers. So we get unsigned types for the positive-only ones and types with more bytes for the larger ranges.
We call the large integer int64 or long, depending on the author. The small ones are int16 (short) and int8 (char, since a character has the same size). The unsigned ones are identified by a little u in front of the type. So uint16 is an integer holding a medium range of numbers that are never negative.
For floats, we call the big ones double, and there are no small ones. But double is actually not used that often in WoW.
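The sizes and ranges can be looked up with Python's struct module instead of Wikipedia. A minimal sketch; the type names on the left are the ones used in this guide, and the one-letter codes are struct's standard format characters.

```python
import struct

# Guide's type name -> struct format character.
CODES = {
    "int8": "b", "uint8": "B",
    "int16": "h", "uint16": "H",
    "int32": "i", "uint32": "I",
    "int64": "q", "uint64": "Q",
    "float": "f", "double": "d",
}

# "<" selects standard sizes, independent of the machine the script runs on.
sizes = {name: struct.calcsize("<" + code) for name, code in CODES.items()}

def int_range(bits, signed):
    """Smallest and largest value an integer of the given width can hold."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

for name, size in sizes.items():
    print(name, size, "bytes")
print(int_range(32, True), int_range(16, False))
```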
That was our little vacation in data-type-land. Let's get back to our file structures.
I stopped at the chunked structure with its blocks that start with 4 characters and a size. The size is an integer, a uint32 one. So as an example we may see DHOM. .. (the reversed MOHD) in our ASCII dump. This actually represents: 44 48 4f 4d 01 20 00 00 (h).
As you might think now, the size is 0x01200000: Wrong. In our files, numbers are stored little-endian, meaning the least significant byte comes first and the most significant one last. So our size is 0x00002001, or shorter 0x2001! You always have to keep that in mind! It's like that all the time, and it applies per data type. So if you have 2 int16s next to each other, 00 40 15 0E, you get 0x4000 and 0x0E15!
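Here is a hedged Python sketch of the byte-order rule, using the numbers from the example and the reversed chunk identifier described earlier. The iter_chunks helper and the two-chunk sample buffer are made up for illustration; it assumes every chunk is simply 4 ID bytes, a uint32 size, and then that many bytes of content.

```python
import struct

# Chunk header: reversed "MOHD" followed by a uint32 size.
header = bytes.fromhex("44 48 4f 4d 01 20 00 00")
magic, size = struct.unpack("<4sI", header)  # "<" means little-endian

# Two int16 values stored next to each other.
first, second = struct.unpack("<2h", bytes.fromhex("00 40 15 0e"))

def iter_chunks(buf):
    """Yield (name, data_start, size) for each chunk in buf."""
    pos = 0
    while pos + 8 <= len(buf):
        raw_id, length = struct.unpack_from("<4sI", buf, pos)
        yield raw_id[::-1].decode("ascii"), pos + 8, length
        pos += 8 + length  # skip the header and the content

# Made-up buffer with two tiny chunks, only to exercise the walker.
sample = (b"DHOM" + struct.pack("<I", 4) + bytes(4)
          + b"NDOM" + struct.pack("<I", 6) + b"a.mdx\0")

print(magic, hex(size), hex(first), hex(second))
print(list(iter_chunks(sample)))
```

The data_start value is the position right after the size field, which is the point all in-chunk offsets in this guide are counted from.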
So what comes after our size? The content. No, I can't explain here how every chunk's content is made up. That's far too much! Let's just get to the other file structure:
non-chunked file structure:
Yes, I know .. lawl, chunked and non-chunked.. but it's just the easiest way to divide the two :P
Well, non-chunked formats are for example: M2, BLP, DBC, WDB, BLS.
These files are built up a bit strangely. For these files, WoW knows where specific data is held. So for example with models (M2s), WoW knows that there's a uint32 at 0x134 describing the number of ribbon emitters. Well .. what about models having different file sizes then? Yeah, at 0x138 is a uint32 that tells us where in the file the ribbon emitters are held!
Now WoW is able to look up how many emitters there are and where they are. You are able to do this too! Let's just try.
Since not every model has ribbon emitters, I will name a model that has some: Creatures\zigguratcrystal\zigguratcrystal.m2. I hope I am not wrong and that file really has some, since I am not able to check this at the moment. But well ..
At 0x134 in this file there should be at least a 1. And at 0x138 you will find a huge number. This is the offset to the emitters themselves. You should be able to browse there if you read everything above.
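The same lookup can be sketched in Python. The positions 0x134 and 0x138 are taken from the text above and may differ between client versions, so treat them as assumptions. The buffer below is fake data, not a real M2 file, and read_count_and_offset is a made-up helper name.

```python
import struct

RIBBON_COUNT_POS = 0x134  # uint32 count, per the text; uint32 offset follows

def read_count_and_offset(buf, count_pos=RIBBON_COUNT_POS):
    """Read a uint32 count and the uint32 file offset stored right after it."""
    count, offset = struct.unpack_from("<II", buf, count_pos)
    return count, offset

# Fake file contents: one emitter whose data would start at 0x1C0.
fake_model = bytearray(0x200)
struct.pack_into("<II", fake_model, RIBBON_COUNT_POS, 1, 0x1C0)

count, offset = read_count_and_offset(fake_model)
print(count, hex(offset))
```

To try it on a real model, read the file with open(path, "rb").read() and pass the bytes in place of fake_model.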
Let's try to edit something!
I think this is the part you will actually learn the most from. As an example I will show you how to modify doodads held in WMOs. If you don't know what that is: WMOs (aka houses) are able to hold several sets of models that make up the interior. In Elwynn there are a lot of those human houses. They have all the beds and cupboards, the books etc. defined in a doodad set.
Overview of our problem
So, what do we have to think of when it comes to that set:
We have that basic house. Now we want to add a book. The first thing we'll need is a book! As a model. This model is defined by a filename, in this case "World\Goober\g_book1.m2". This filename is defined in the MODN chunk of the WMO, which contains all the model filenames used in the WMO. If you open up a human house, you will come across around 30 files there, depending on the size. Those filenames are padded with zeros. They need to have the file type .MDX, even if they are .M2s! Always keep that in mind!
Then we need information about where the model is. This information is held in the MODD chunk. That block holds all the information about the placement etc. for every model in the WMO. There are 40 bytes per doodad. Each entry is made up like this:
- name_offset - uint32 - the offset to the filename. This is relative to the MODN chunk, counting from after its size field!
- X, Y, Z - 3 floats - the position of the model inside the WMO.
- W - float - a mathematical thing used for the orientation of the doodad. It's done with quaternions .. that's not easy to explain and I won't do it here.
- A, B, C - 3 floats - the A, B and C of the orientation quaternion! It's pretty shitty :P
- scale - float - the scale of the model.
- color - 4 uint8 - a color .. there is no obvious reason for a model to have a color, so it's 0,0,0,255 most of the time .. maybe it's used for lighting, but who knows ..
That was the structure of the MODD chunk.
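The field list can be turned into a small Python parser. This is a sketch that follows the order and sizes given in the list exactly as written; it has not been checked against the game client, and the names DOODAD, Doodad and parse_doodad are made up.

```python
import struct
from collections import namedtuple

# uint32 name_offset, 8 floats (X Y Z, W, A B C, scale), 4 uint8 color.
DOODAD = struct.Struct("<I8f4B")
Doodad = namedtuple("Doodad", "name_offset pos w abc scale color")

def parse_doodad(buf, index, data_start=0):
    """Decode entry number `index`; data_start is the position after the size field."""
    f = DOODAD.unpack_from(buf, data_start + index * DOODAD.size)
    return Doodad(f[0], f[1:4], f[4], f[5:8], f[8], f[9:13])

# Made-up entry: first filename, position (1, 2, 0.5), scale 1, color 0,0,0,255.
sample = DOODAD.pack(0, 1.0, 2.0, 0.5, 1.0, 0.0, 0.0, 0.0, 1.0, 0, 0, 0, 255)
doodad = parse_doodad(sample, 0)
print(DOODAD.size, doodad)
```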
Then, the WMO / WoW needs to know how many doodads are in that file xD So you need to edit the number of models and doodads in the header chunk. This is the MOHD one. What you need to edit is:
- 0x10 uint32 nModels - number of M2 models imported
- 0x14 uint32 nDoodads - number of doodads (M2 instances)
Again, all offsets in the chunks are relative to the position after the size field!
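As a sketch, the two counters can be read and raised like this in Python. The offsets 0x10 and 0x14 come from the list above; the buffer, the starting counts (30 and 45) and the helper name bump_counts are invented for the example.

```python
import struct

N_MODELS = 0x10  # nDoodads follows directly at 0x14

def bump_counts(buf, mohd_data_start, models=0, doodads=0):
    """Add to nModels and nDoodads in place; return the new values."""
    pos = mohd_data_start + N_MODELS
    n_models, n_doodads = struct.unpack_from("<II", buf, pos)
    n_models += models
    n_doodads += doodads
    struct.pack_into("<II", buf, pos, n_models, n_doodads)
    return n_models, n_doodads

# Fake root file: a MOHD chunk with 64 bytes of content starting at position 8.
root = bytearray(b"DHOM" + struct.pack("<I", 64) + bytes(64))
struct.pack_into("<II", root, 8 + N_MODELS, 30, 45)

counts = bump_counts(root, 8, models=1, doodads=1)
print(counts)
```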
As I told you before, there are doodad sets. It's a bit complicated and I'd suggest you only edit WMOs with a single set. The reason is that the MODS chunk only holds how many doodads are in each specific set, so you can most likely only edit the count of doodads in that set. In that case you need the uint32 at 0x18, the number of doodad instances in this set. Better don't edit WMOs with more than one set as long as you don't know which set is used on the map!
So .. you might expect this to be all the information the WMO needs to know where a doodad is, but no. You might have noticed that a WMO is split into name.WMO and name_000.WMO, name_001.WMO etc. Those _xxx files are the ones that actually describe the model, and the other one is the root file that holds general information.
All the information above is in the root file. But the child file wants to know that it has a model too!
So the child WMO's MODR chunk holds references to the doodad instances in the MODD chunk. Those references are uint16s in a row, just one after the other.
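Reading and extending such a list of uint16 references could look like this in Python. It is a sketch that works on the chunk's content only (the bytes after the size field); the helper names and the sample indices are made up.

```python
import struct

def read_refs(payload):
    """Decode the content of a MODR chunk into a list of doodad indices."""
    count = len(payload) // 2
    return list(struct.unpack(f"<{count}H", payload[:count * 2]))

def append_ref(payload, doodad_index):
    """Return the content with one more reference added at the end."""
    return payload + struct.pack("<H", doodad_index)

# Made-up content referencing doodad instances 0, 1 and 5.
payload = bytes.fromhex("00 00 01 00 05 00")
print(read_refs(payload))
print(read_refs(append_ref(payload, 6)))
```

After appending, the chunk's size field has to grow by 2 as well.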
Let's come to the edit part
So, what I "teach" you to edit now is the position and the filename of that model. In addition, you will be able to add a new model or a new doodad instance too, if you understood it all.
So, position .. You remember? Right, MODD. Let's bring the structure back to mind and copy it down here again :P
- name_offset - uint32 - the offset to the filename. This is relative to the MODN chunk, counting from after its size field!
- X, Y, Z - 3 floats - the position of the model inside the WMO.
- W - float - a mathematical thing used for the orientation of the doodad. It's done with quaternions .. that's not easy to explain and I won't do it here.
- A, B, C - 3 floats - the A, B and C of the orientation quaternion! It's pretty shitty :P
- scale - float - the scale of the model.
- color - 4 uint8 - a color .. there is no obvious reason for a model to have a color, so it's 0,0,0,255 most of the time .. maybe it's used for lighting, but who knows ..
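So .. the position is the 3 floats after the name_offset. Meaning, counted from the start of the entry (for the first doodad that is right after MODD's size field): + 0x4, + 0x8 and + 0xC. Each float is 4 bytes, so the third one starts at 0xC, not 0xA. Those are the 3 coordinates!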
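A Python sketch of where the three coordinates of a given doodad sit and how to overwrite them. It assumes 40-byte entries with the position directly after the 4-byte name_offset, as listed above; with 4 bytes per float the coordinates land at + 0x4, + 0x8 and + 0xC of the entry. The helper names and the fake buffer are made up.

```python
import struct

ENTRY_SIZE = 40        # bytes per doodad, as stated in the text
POSITION_OFFSET = 0x4  # X starts right after the uint32 name_offset

def position_offsets(modd_data_start, index):
    """File positions of X, Y and Z for doodad number `index`."""
    base = modd_data_start + index * ENTRY_SIZE + POSITION_OFFSET
    return base, base + 4, base + 8

def set_position(buf, modd_data_start, index, x, y, z):
    """Overwrite the three little-endian floats in place."""
    x_pos, _, _ = position_offsets(modd_data_start, index)
    struct.pack_into("<3f", buf, x_pos, x, y, z)

# Fake chunk: 8 header bytes followed by two empty entries.
buf = bytearray(8 + 2 * ENTRY_SIZE)
set_position(buf, 8, 1, 1.0, 2.0, 0.5)
print([hex(p) for p in position_offsets(8, 1)])
```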
It shouldn't be hard for you to edit those coordinates now. The only problem: they are floats. Floats are stored in a way that we can't understand by simply converting the hex to decimal.. To get around that, you need a little application that does the conversion for you. I coded one myself. It just asks you for a float and writes it out to a file, from which you can copy it. To use it, just extract both files and run the .exe.
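The same conversion can also be sketched in a few lines of Python, assuming 32-bit floats stored little-endian like the other numbers in these files. This is not the tool mentioned above, just a stand-in that does the same job; the function names are made up.

```python
import struct

def float_to_hex(value):
    """Bytes to type into the hex editor for a 32-bit float."""
    return " ".join(f"{b:02x}" for b in struct.pack("<f", value))

def hex_to_float(text):
    """Decode 4 bytes copied from the hex editor back into a float."""
    return struct.unpack("<f", bytes.fromhex(text))[0]

print(float_to_hex(1.0))
print(float_to_hex(-12.5))
print(hex_to_float("00 00 80 3f"))
```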
Here is the link: RapidShare: 1-Click Webhosting
You should be able to move the model around now. Well, at least I hope so :P
Let's come to editing the model itself, aka the filename. The filenames are stored in the MODN chunk named above, and all need to have the file type .MDX, even if they are .M2s! Always keep that in mind!
Keep it simple and only edit the last filename this time. You remember MODD -> name_offset? If you changed the length of a filename in the middle, you'd have to change every doodad instance that points to a later filename! Just try to edit the last name into something else. You will see that it won't work. Yes, you forgot about the size .. As said, every chunk has its size. If you change the chunk's length, you have to update its size field too! Modify that and you will see that the model changes.
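Replacing the last filename and fixing the size in one go could be sketched like this in Python. It assumes the chunk is 4 ID bytes, a uint32 size and then zero-terminated names, and it writes exactly one zero byte after the new name; whether the game expects more padding at the end is not something this guide states, so treat that as an open assumption. The function name and sample names are made up.

```python
import struct

def replace_last_name(chunk, new_name):
    """Return (new_chunk, name_offset) with the last filename swapped out."""
    magic, size = struct.unpack_from("<4sI", chunk, 0)
    payload = chunk[8:8 + size]
    # Start of the last name: after the last zero that precedes it.
    start = payload.rstrip(b"\0").rfind(b"\0") + 1
    # Everything before the last name stays put, so earlier offsets stay valid.
    new_payload = payload[:start] + new_name.encode("ascii") + b"\0"
    new_chunk = magic + struct.pack("<I", len(new_payload)) + new_payload
    return new_chunk, start

# Made-up MODN chunk with two short names.
old = b"NDOM" + struct.pack("<I", 13) + b"a.mdx\0bb.mdx\0"
new_chunk, name_offset = replace_last_name(old, "World\\Goober\\g_book1.mdx")
print(name_offset, struct.unpack_from("<I", new_chunk, 4)[0])
```

The returned name_offset is the value a MODD entry needs in order to point at the new name.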
The next and last thing: a short look at adding new doodad instances.
- You will have to add another 40 bytes to the MODD / doodad information chunk.
- You will have to edit the child WMO's list of instances.
- Same for the set and the header.
I think that should be enough to get the idea.
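The first step of that list can be sketched in Python as follows. It uses the 40-byte layout from the field list above and assumes that W = 1 with A = B = C = 0 means "no rotation", which is the usual identity quaternion but is not confirmed by this guide. The names ENTRY and append_doodad and the sample values are made up.

```python
import struct

ENTRY = struct.Struct("<I8f4B")  # 40 bytes, same layout as the field list

def append_doodad(modd_chunk, name_offset, pos, scale=1.0):
    """Return (new_chunk, new_index) with one more doodad entry at the end."""
    magic, size = struct.unpack_from("<4sI", modd_chunk, 0)
    payload = modd_chunk[8:8 + size]
    new_index = size // ENTRY.size  # index of the entry being added
    # Assumed identity rotation (W=1, A=B=C=0) and the usual color 0,0,0,255.
    entry = ENTRY.pack(name_offset, pos[0], pos[1], pos[2],
                       1.0, 0.0, 0.0, 0.0, scale, 0, 0, 0, 255)
    payload += entry
    return magic + struct.pack("<I", len(payload)) + payload, new_index

# Made-up MODD chunk that already holds two (empty) entries.
modd = b"DDOM" + struct.pack("<I", 80) + bytes(80)
new_modd, new_index = append_doodad(modd, 6, (1.0, 2.0, 0.5))
print(new_index, struct.unpack_from("<I", new_modd, 4)[0])
```

The remaining steps still apply afterwards: add new_index to the child WMO's reference list, and raise the counts in the set and in the header.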
Outro
Well, now I've come to the point where I don't know what else to write about ..
Just ask me and I may add some things. I hope you have been able to understand everything.
Please give feedback if you have something to add or correct.