Explorations on Random Numbers, Part 4
filed in Code on Aug.02, 2011
Astute readers will notice that the sample code in the previous post passes in a constant value as the seed to the random number generator. What would happen if we tried passing in the current time? Obviously, that would completely prevent the optimization we saw last time that just calculates the result, since the compiler won’t know ahead of time what time you’re running your program. So, let’s compile this code down:
int main()
{
RAND tRand;
RandInit(tRand, time(NULL));
unsigned int nRandom = RandGetNum(tRand);
printf("%u\n", nRandom);
return 0;
}
This is the code that the compiler generates for RandGetNum() in x86:
; Comments added by me ; For reference: v = 36969 * (v & 0xFFFFFFFF) + (v >> 32); ; edx:eax contains v 0137101B imul eax,eax,9069h ; Multiply low part of V by 36969 01371021 add edx,eax ; Add multiplied low part and high part
Holy crap! That’s just TWO instructions for the whole thing! That’s even better than the 64-bit version! Speaking of which, what does that look like now?
; Comments added by me ; For reference: v = 36969 * (v & 0xFFFFFFFF) + (v >> 32); ; rax contains v 000000013F881024 mov rdx,rax ; Copy v into rdx 000000013F881027 shr rdx,20h ; Set rdx to the high part of v 000000013F88102B imul eax,eax,9069h ; Multiply the low part of v by 36969 000000013F881031 add edx,eax ; Add multiplied low part to high part of v
Hmmm… That hasn’t changed much at all from the looped version we saw earlier. In fact, it’s exactly the same set of operations: move, shift, multiply, add. Something interesting does come out of this, though, and that is that the 32-bit code and the last two instructions in the 64-bit code are identical, which means that the preceding two instructions in the 64-bit code are just “ceremony”. That is, they’re getting the right data into the right registers for the later operations to “just work.” In the 32-bit code, the high and low parts are already split up by definition, since the only way to operate on a 64-bit register using 32-bit instructions is to split it up into two 32-bit registers. In the 64-bit code, there is an easy way to get the lower 32-bits of a register (rax is a 64-bit extension of eax in the same way that eax is a 32-bit extension of ax), but no built-in way to get the upper 32-bits other than copying the value and shifting it right.
Ultimately, we have to trust that the compiler has some understanding of what the code is doing, and how to translate that into efficient and correct machine code. It can be surprising, though, just exactly how much understanding the compiler has about what’s going on and what the best way to organize the code is. The whole point of these last two posts has been to delve into what the compiler is actually doing (don’t be afraid of the disassembly window!).
Your code is going to be running on a real computer, and having an understanding of what the computer is actually doing and how the code is executed is part of what makes a great programmer. It can be interesting, educational, and fun to see what code the compiler is generating.
Leave a Reply