NASM - The Netwide Assembler

NASM Forum => Programming with NASM => Topic started by: frp on July 30, 2013, 06:58:27 AM

Title: x86_64 stack frame and alignment
Post by: frp on July 30, 2013, 06:58:27 AM
Hello,

I am a beginner in the assembly world and I'm trying to adapt some examples and tutorials to my 64-bit OS X machine. Reading the System V Application Binary Interface, I found this:

Quote
In addition to registers, each function has a frame on the run-time stack. This stack grows downwards from high addresses. (...) The end of the input argument area shall be aligned on a 16 (...) byte boundary. In other words, the value (%rsp + 8) is always a multiple of 16 when control is transferred to the function entry point. The stack pointer, %rsp, always points to the end of the latest allocated stack frame.

What does that mean ("aligned on a 16 byte boundary")? And should I do something in my program, like pushing something on the stack, to ensure that the alignment (?) is correct?

Thanks for any help
Title: Re: x86_64 stack frame and alignment
Post by: dogman on July 30, 2013, 11:35:45 AM
Hi. Alignment means the address is evenly divisible by some value. 16-byte alignment means the address is an integer multiple of 16; addresses like 1600, 1616, and 1632 would all qualify. This is something the OS and tools enforce: by the time your program gets invoked, your frame is already aligned.

That brings us to your next question of how to keep things aligned. I don't know the answer other than always pushing 16-byte items, but I very much doubt that is correct, because on a 64-bit system most of the data (addresses, etc.) is 8 bytes, not 16. Maybe you have to make sure your frame remains 16-byte aligned when you call a library function; you can do that with a few lines of code. I'm sure Frank or Rob or one of the other guys can clear this up.
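
As a rough illustration of the "multiple of 16" test described above (not part of the original post): an address is 16-byte aligned exactly when its low four bits are zero. The routine name `is_aligned16` is made up for this sketch, and it assumes the System V convention of passing the first integer argument in rdi.

```nasm
; Sketch only: is_aligned16 is a hypothetical helper, not from the thread.
; Input:  rdi = address to test
; Output: eax = 1 if rdi is a multiple of 16, otherwise 0
global is_aligned16

section .text
is_aligned16:
    xor  eax, eax       ; clear the result first (test below sets the flags afterwards)
    test rdi, 15        ; AND with 0b1111 without storing; ZF=1 when the low 4 bits are 0
    setz al             ; al = 1 if aligned, 0 if not
    ret
```

With this check, 1600 (0x640) counts as aligned, while 1608 (0x648) does not.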
Title: Re: x86_64 stack frame and alignment
Post by: frp on July 31, 2013, 08:42:18 AM
Thank you very much for your reply.
I understand now what "alignment" means. But I am still not sure what to do in my programs  ;)
Getting started in assembly is not so easy (but it is instructive)!
Title: Re: x86_64 stack frame and alignment
Post by: dogman on July 31, 2013, 09:49:44 AM
Don't worry, I don't know what to do in my Intel assembly programs either ;D So far all I have done is push doublewords on the stack and pop them off and Linux hasn't complained. I haven't had to deal with any alignment issues.

But there are many smart helpful guys here who can help us!

Title: Re: x86_64 stack frame and alignment
Post by: Gerhard on July 31, 2013, 11:30:25 PM
Hi frp,

you can use the align 16 (or whatever) statement to reach that goal. What is your development platform? Maybe I can help with a few appropriate links.

Gerhard
Title: Re: x86_64 stack frame and alignment
Post by: frp on August 01, 2013, 08:12:55 AM
Hi,
I (try to) use NASM 2.10 on 64-bit OS X. I found a lot of documentation about 32-bit Linux or Windows assembly, but for OS X, I have to guess a lot of things!
Anyway thanks for your replies.
Title: Re: x86_64 stack frame and alignment
Post by: Gerhard on August 01, 2013, 12:42:34 PM
Hi frp,

okay, I've found this thread (http://masm32.com/board/index.php?topic=1911.0) in Hutch's forum. It's about 64-bit Linux programming, but since OS X is a modified BSD, we have the same ABI. Hope that helps.

Gerhard
Title: Re: x86_64 stack frame and alignment
Post by: frp on August 02, 2013, 08:51:15 AM
Thank you !
I will study this example.
Title: Re: x86_64 stack frame and alignment
Post by: Gerhard on August 02, 2013, 01:01:08 PM
you're always welcome.

Gerhard
Title: Re: x86_64 stack frame and alignment
Post by: Rob Neff on August 02, 2013, 04:19:42 PM
128-bit operands, which are permissible according to the x64 ABI, must be aligned properly, and thus 64-bit functions must account for that possibility whether they use parameters that large or not. When you call another function from within your own function, you are likewise required to ensure the stack is aligned properly prior to the call. So if you make use of the stack to store function variables, you need to make sure of the stack alignment prior to calling another function (in addition to ensuring your own variables are aligned properly on the stack).

X64 programming in assembly is more complex than 32-bit programming when it comes to calling conventions. But if you understand the rules properly you can develop macros to handle this for you automatically, or you can use NASMX (http://forum.nasm.us/index.php?board=11.0), which will help you with most of the housekeeping. ;)

Edit: Note that, unlike Linux or Windows, I make no guarantees regarding OSX as most of the Mach/OS support was based on educated "guesses" as I don't have an OSX box to test on.  The *BSDs are supported so you may be able to hack together a working program.
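
One way to picture the housekeeping described above is the sketch below. It assumes the System V AMD64 ABI; `helper` and `with_locals` are hypothetical names used only for illustration, and the caller is assumed to pass an integer in rdi and a vector in xmm0. The idea is that rsp is 8 short of a multiple of 16 on entry, so one push restores alignment, and reserving locals in multiples of 16 keeps it.

```nasm
; Sketch only: `helper` is a hypothetical external function, not from the thread.
extern helper
global with_locals

section .text
with_locals:
    push rbp                    ; entry: rsp = 16n + 8; after this push: rsp = 16n
    mov  rbp, rsp
    sub  rsp, 32                ; reserve locals in a multiple of 16 so rsp stays aligned
    mov  qword [rbp-8], rdi     ; an 8-byte local
    movaps [rbp-32], xmm0       ; a 16-byte local; rbp-32 is a multiple of 16
    call helper                 ; rsp is a multiple of 16 here, as the ABI requires
    mov  rsp, rbp               ; release the locals
    pop  rbp
    ret
```

If the locals needed, say, 24 bytes, rounding the reservation up to 32 would be the simplest way to keep the call site aligned.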
Title: Re: x86_64 stack frame and alignment
Post by: Frank Kotler on August 02, 2013, 05:58:45 PM
At the risk of stating the obvious, what frp needs to align the stack is not the "align" directive (although that may help, too) but:
Code:
and rsp, -16
You don't want to do just that, of course, since we need to get our old rsp back, but that's the part that guarantees stack alignment. See the example. Thanks, Gerhard!

Best,
Frank
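
A minimal sketch of the pattern Frank describes, where the old rsp is kept in rbp so it can be restored after the aligned call. It assumes Linux x86-64 and a C library providing `puts`; the file name in the build comment is made up, and on OS X the symbols would presumably need a leading underscore (`_main`, `_puts`) and the `macho64` output format.

```nasm
; Sketch only. Possible build on Linux (hello.asm is a hypothetical file name):
;   nasm -f elf64 hello.asm && gcc -no-pie hello.o -o hello
extern puts
global main

section .data
msg:    db "stack is aligned", 0

section .text
main:
    push rbp
    mov  rbp, rsp           ; remember the incoming stack pointer
    and  rsp, -16           ; clear the low 4 bits: round rsp down to a multiple of 16
    lea  rdi, [rel msg]     ; first integer argument: pointer to the string
    call puts
    mov  rsp, rbp           ; restore the saved stack pointer
    pop  rbp
    xor  eax, eax           ; return 0 from main
    ret
```

Rounding down is safe because the stack grows toward lower addresses: the AND can only move rsp into unused space, never over live data.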

Title: Re: x86_64 stack frame and alignment
Post by: Gerhard on August 02, 2013, 07:17:49 PM
Frank,

Quote
At the risk of stating the obvious, what frp needs to align the stack is not the "align" directive (although that may help, too) but:

some data must be aligned, for example for access with xmm or ymm registers. In addition, AMD recommends aligning loop hot spots to 16 bytes.

Quote
Code:
and rsp, -16

yes, that'll do the stack alignment.

Gerhard
Title: Re: x86_64 stack frame and alignment
Post by: Rob Neff on August 03, 2013, 02:21:53 AM
Quote
At the risk of stating the obvious, what frp needs to align the stack is not the "align" directive (although that may help, too) but:

To reinforce what Frank states: the NASM ALIGN directive aligns code or data within a section (such as .data), not the stack.
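
For contrast with stack alignment, here is a small sketch of what ALIGN does do. The names `vec` and `load_vec` are made up, and the explicit `align=16` on the section is an assumption added so the section itself starts on a 16-byte boundary regardless of NASM version.

```nasm
; Sketch only: `vec` and `load_vec` are illustrative names.
global load_vec

section .data align=16
        db 1                        ; one stray byte, so the next label would be misaligned
align 16                            ; pad up to the next multiple of 16 within the section
vec:    dd 1.0, 2.0, 3.0, 4.0       ; four single-precision floats, 16 bytes in total

section .text
load_vec:
    movaps xmm0, [rel vec]          ; movaps faults unless the address is 16-byte aligned
    ret
```

Nothing here touches rsp, which is why the directive cannot stand in for `and rsp, -16`.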

Quote
Code:
and rsp, -16

yes, that'll do the stack alignment.


(Please note that the following is definitely way more than a beginner should be expected to understand, but if you're going to program as per the X64 calling convention it needs to be said.)

It's not always that simple. That will only work if the function you are calling does not need more than the INTEGER or FLOATING_POINT registers provided by the X64 calling convention. If you are calling a function that expects more parameters than that, you need to use the stack for parameter passing. If you then perform a simple AND operation on RSP, you can inadvertently cause the function parameters you just pushed on the stack to be offset incorrectly. If you push parameters on the stack after the AND, the function you call "may" execute correctly, depending on what you've PUSHed and the alignment of the stack prior to the call. Regardless, you still have to account for your own function's stack after the called function returns. This is only the tip of the iceberg.

Fortunately, most functions exposed by the OS and most libraries shouldn't exceed the maximum available register usage for the X64 calling convention.  Thus my previous paragraph regarding stack parameters doesn't apply for the majority of cases.  However, be aware of the possibility as you will most likely run into it for large non-trivial programs, graphical applications, or even very poorly designed interfaces for that matter.

As an aside, note that Windows X64 programming does in fact maintain the old stack reservation requirements (think 32-bit C calling convention) for backward compatibility, and the register usage for parameters is more restricted. But I digress - we're not talking Windows programming here.  ;D
Title: Re: x86_64 stack frame and alignment
Post by: frp on August 03, 2013, 08:29:15 AM
Thank you all for these detailed explanations! I can't say I understood everything, but I think I get the idea.
Now I will try to write a small but useful program to see if I can make something work... I guess I will post again on this forum  ;) I will take a look at NASMX too.

Again thanks for your help.

(Is there a "solved" button somewhere ?)
Title: Re: x86_64 stack frame and alignment
Post by: Gerhard on August 03, 2013, 01:01:49 PM
Hi Rob,

Quote
To reinforce what Frank states: the Nasm ALIGN directive will align variables in the .DATA segments - not the stack.

that's true.

Quote
( Please note that the following is definitely way more than a beginner should be expected to understand but if you're going to program as per the X64 calling convention it needs said. )

It's not always that simple.  That will only work if the function you are calling does not exceed the maximum INTEGER or FLOATING_POINT registers used by the X64 calling convention.

sure, but if we're talking about the Unix ABI, we can pass 6 integer arguments in the registers RDI, RSI, RDX, RCX, R8, R9, in that order. Furthermore, the registers XMM0-XMM7 are used to pass single and double precision floating point arguments. It seems to me that a procedure or function which needs more than 14 parameters is a little bit overloaded. But if we needed that, we could place the parameters inside an array and pass the array pointer to our procedure.

But no offense, in general you're right.

Gerhard
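
The register assignment listed above can be sketched with a call that mixes integer and floating point arguments. This assumes Linux x86-64 and the C library's `printf`; the values are arbitrary. The `mov eax, 1` line is not mentioned in the thread: for variadic functions the System V ABI expects al to hold an upper bound on the number of vector registers used.

```nasm
; Sketch only: assumes nasm -f elf64 and linking through gcc -no-pie.
extern printf
global main

section .data
fmt:    db "%d %d %f", 10, 0
val:    dq 2.5

section .text
main:
    push rbp                ; also brings rsp to a multiple of 16
    mov  rbp, rsp
    lea  rdi, [rel fmt]     ; integer argument 1: format string
    mov  esi, 7             ; integer argument 2
    mov  edx, 9             ; integer argument 3
    movsd xmm0, [rel val]   ; floating point argument 1
    mov  eax, 1             ; variadic call: al = number of vector registers used
    call printf
    xor  eax, eax           ; return 0 from main
    pop  rbp
    ret
```

The integer and floating point arguments are counted separately, so the double goes in xmm0 even though it is the fourth argument overall.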
Title: Re: x86_64 stack frame and alignment
Post by: Rob Neff on August 04, 2013, 01:06:42 AM
Your understanding is close but still not there yet. The statement I made was regarding the calling convention's defined registers allocated for INTEGER or FLOATING_POINT. The key word in that sentence is "or". You are thinking of 14 registers in total, not of 6 INTEGER or 8 FLOATING_POINT registers. Allow me to challenge you to write a simple program which calls the following X library window function:

Code:
Window XCreateSimpleWindow(Display *display, Window parent, int x, int y,
                           unsigned int width, unsigned int height,
                           unsigned int border_width, unsigned long border,
                           unsigned long background);

This is not a function you've written. It comes from a library that you must interface with, so you do not have the luxury of defining the interface, nor of packaging up all the values into an array or struct and passing a simple pointer. This particular function requires 9 integer/pointer values and zero floating point values.

If you get stuck I'll let you peek into the NASMX Linux X64 Demo5 and examine the NASM-generated binary object file using objdump to see one way that this can be handled. Hopefully this little challenge will strengthen your understanding of some of the subtleties of X64 programming.  ;)

Edit: spelling
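
One possible shape of such a call, offered as a sketch rather than as the NASMX approach: the first six integer arguments go in registers, the remaining three are pushed right to left, and 8 bytes of padding are reserved first so that rsp is still a multiple of 16 at the call. The variables `display` and `parent`, the routine name `make_window`, and the literal values are all hypothetical; the caller is assumed to have filled in the two variables beforehand.

```nasm
; Sketch only: `display`, `parent` and `make_window` are hypothetical names.
extern XCreateSimpleWindow
global make_window

section .bss
display: resq 1
parent:  resq 1

section .text
make_window:
    push rbp
    mov  rbp, rsp               ; rsp is now a multiple of 16
    sub  rsp, 8                 ; padding: 8 + 3*8 = 32 keeps rsp aligned at the call
    push qword 0xFFFFFF         ; argument 9: background (pushed first, highest address)
    push qword 0                ; argument 8: border
    push qword 1                ; argument 7: border_width (lowest address, at [rsp])
    mov  rdi, [rel display]     ; argument 1
    mov  rsi, [rel parent]      ; argument 2
    mov  edx, 10                ; argument 3: x
    mov  ecx, 10                ; argument 4: y
    mov  r8d, 400               ; argument 5: width
    mov  r9d, 300               ; argument 6: height
    call XCreateSimpleWindow
    mov  rsp, rbp               ; drops the three stack arguments and the padding
    pop  rbp
    ret                         ; the Window id comes back in rax
```

The padding goes in before the pushes, not after, so the callee still finds its seventh argument directly above the return address.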