Hi Mike
1. I want to save the current rsp in a temp var for further restoration after the function call. What should the proper size of this var be under 64 bits?
It should be stored as a 64-bit pointer type - presumably long long *
2. I want to be able to push my args to the 64-bit function stack. I'm currently doing it in a mixed for() loop in C/assembly. I could have used memcpy() instead, but calling yet another function on every call to my own one would be far too slow. So:
2.1. What is the structure of the function call stack under 64 bits?
2.2. What are the stack sizes of 64-bit function call args and where are they located in the stack and in what order?
2.3. Where is the return address located and what is its size in the call stack?
Assuming you want to use the MS64 bit calling convention:
Each stack slot is 8 bytes wide - even for longs / dwords etc.
Create a stack frame of at least 32 bytes, with rsp aligned down to a 16-byte boundary. If you fail to align rsp to 16 bytes, you will be exiled to Crashbania.
sub rsp,48 'will take 5 or 6 params
The first 4 params are passed in registers in this order: RCX, RDX, R8, R9
But floats passed directly (by value) must go in the corresponding SIMD registers instead: XMM0, XMM1, XMM2, XMM3
The lower 32 bytes of your stack frame are used by the function as a spill zone for these registers.
If there are more than 4 params, store them in [rsp+32], [rsp+40], [rsp+48] etc.
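To make the layout above concrete, here is a rough C sketch of the arithmetic only (it does not touch the real stack). The helper names frame_bytes and arg_offset are made up for illustration, and it assumes every parameter occupies one 8-byte slot.

```c
#include <stddef.h>

/* Hypothetical helper: how many bytes to reserve with "sub rsp,N"
   for a call that takes nargs parameters. */
size_t frame_bytes(size_t nargs)
{
    size_t slots = nargs < 4 ? 4 : nargs;  /* spill zone always covers 4 slots */
    size_t bytes = slots * 8;              /* every slot is 8 bytes wide */
    return (bytes + 15) & ~(size_t)15;     /* round up to a multiple of 16 */
}

/* Hypothetical helper: offset from rsp (just before the call) of the
   slot belonging to parameter number index, counting from 0.
   Slots 0..3 are the spill zone; their values travel in registers. */
size_t arg_offset(size_t index)
{
    return index * 8;
}
```

With these, frame_bytes(5) and frame_bytes(6) both give 48, and arg_offset(4) gives 32, matching the [rsp+32] slot for the fifth param.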
The call is made in the usual way - pushing the 8-byte return address onto the stack, so it sits at [rsp] on entry to the function. This, of course, puts the stack pointer out of 16 byte alignment, so it is one of the duties of the function prologue to remedy this situation.
After making the call, release your stack frame:
add rsp,48
Values are returned in the RAX register, except for floats, which are returned in SIMD register XMM0.
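If it helps to see the alignment bookkeeping as numbers, the sequence can be modelled in plain C. This is only a sketch: the function names are invented here, rsp is just an integer, and the "push" step assumes the prologue fixes the alignment by pushing one 8-byte register (push rbp is one common way, but not the only one).

```c
#include <stdint.h>

/* Hypothetical model: each function mimics what one instruction does to rsp. */

/* call: pushes the 8-byte return address */
uint64_t after_call(uint64_t rsp) { return rsp - 8; }

/* push reg: e.g. push rbp in the prologue (an assumption, see above) */
uint64_t after_push(uint64_t rsp) { return rsp - 8; }

/* sub rsp,n: reserve the stack frame */
uint64_t after_sub(uint64_t rsp, uint64_t n) { return rsp - n; }

/* add rsp,n: release the stack frame */
uint64_t after_add(uint64_t rsp, uint64_t n) { return rsp + n; }

/* 1 if rsp sits on a 16-byte boundary, 0 otherwise */
int is_aligned16(uint64_t rsp) { return (rsp & 15) == 0; }
```

Starting from an aligned value, after_call leaves rsp 8 bytes off the boundary, after_push brings it back, and a sub/add pair with the same multiple of 16 keeps it aligned and returns it to where it was.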
2.4. Are there any "dark areas" in the function call stack that are reserved for the system and thus shouldn't be overwritten with my call stack construction methods?
Only the spill zone
3. A 32-bit return instruction ret N has its N sized as a 16-bit quantity. What is its proper size under 64 bits?
ret n is not used - as with cdecl, the caller releases the stack - but its operand remains 16 bits.
These rules can be vexatious, so for internal functions you may prefer to use stdcall or cdecl or something entirely customised, as long as the stack pointer is kept in 16 byte (128 bit) alignment before making calls.
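For a customised scheme, one possible pattern is to save rsp in a 64-bit temp var, force the alignment, and put the saved value back afterwards. The C below only models that on an integer. StackModel, enter_aligned and leave_aligned are names invented for this sketch, and it assumes the frame size you pass is a multiple of 16.

```c
#include <stdint.h>

/* Hypothetical model of the stack pointer plus the temp var that holds
   its saved value. Both are 64 bits (8 bytes) wide. */
typedef struct {
    uint64_t rsp;    /* stands in for the real register */
    uint64_t saved;  /* the temp var from question 1 */
} StackModel;

/* Round a stack-pointer value down to a 16-byte boundary,
   which is what "and rsp,-16" does. */
uint64_t align_down16(uint64_t rsp)
{
    return rsp & ~(uint64_t)15;
}

/* Save rsp, align it down, then reserve frame bytes.
   frame is assumed to be a multiple of 16. */
void enter_aligned(StackModel *s, uint64_t frame)
{
    s->saved = s->rsp;                      /* mov [saved],rsp */
    s->rsp = align_down16(s->rsp) - frame;  /* and rsp,-16 then sub rsp,frame */
}

/* Restore rsp from the temp var, whatever happened in between. */
void leave_aligned(StackModel *s)
{
    s->rsp = s->saved;                      /* mov rsp,[saved] */
}
```

Restoring from the saved value rather than using add rsp,N means the original rsp comes back even when the align-down step moved it by an unknown amount.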