Topic: Getting C Structure Fields in Assembly

js19
Getting C Structure Fields in Assembly
« on: March 27, 2016, 06:02:55 PM »
My program is supposed to fill in a book structure (author, title, subject, year), store it behind a pointer, and then use assembly to access the structure and compare the year.

In my C code:
Code: [Select]
struct book {
    char author[20 + 1];
    char title[32 + 1];
    char subject[10 + 1];
    unsigned int year;
};

In my assembly code:
Code: [Select]
mov ebx, [book1]
mov ecx, [book2]
These hold the book pointers.
How do I go about accessing book1->year in assembly? I've tried something like mov edi, [esp + 21] and got nothing.
Thanks!

Frank Kotler (NASM Developer)
Re: Getting C Structure Fields in Assembly
« Reply #1 on: March 27, 2016, 08:52:43 PM »
Hi js19,

I'm afraid I'm not very good at C. I don't follow the reasoning for:
Code: [Select]
mov edi, [esp + 21]
What's on the stack? Why 21? What do you mean by "nothing"? edi is zero?

I would expect "year" to be 65 bytes into the structure (although C can align it as it wishes, I think). I'd take a wild-asmed guess at:
Code: [Select]
mov ebx, [book1] ; how is "book1" defined?
mov eax, [ebx + 65]
but you might have to dereference the pointer:
Code: [Select]
mov ebx, [book1]
; or maybe mov ebx, book1?
mov ebx, [ebx]
mov eax, [ebx + 65]

Perhaps you could compile your C code with the "-S" switch and see if C will confess where it has put things?

Maybe someone who knows C will come along, or try posting more of your code. Good luck!

Best,
Frank


Bryant Keller (Forum Moderator)
Re: Getting C Structure Fields in Assembly
« Reply #2 on: March 28, 2016, 02:30:03 AM »
Hey js19,

My program is supposed to fill in a book structure (author, title, subject, year), store it behind a pointer, and then use assembly to access the structure and compare the year.

I'm not sure about how your class deals with this stuff, but NASM does actually support structures.

In my C code:
Code: [Select]
struct book {
    char author[20 + 1];
    char title[32 + 1];
    char subject[10 + 1];
    unsigned int year;
};

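Awesome! You've got a C structure. Let's turn that into a NASM structure with the STRUC/ENDSTRUC directives. Keep in mind that NASM lays the fields out back to back with no padding, so these offsets match the C layout only if the C compiler adds no padding either.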

Code: [Select]
STRUC book
.author:  RESB (20+1) ; reserve 20+1 bytes for book.author
.title:   RESB (32+1) ; reserve 32+1 bytes for book.title
.subject: RESB (10+1) ; reserve 10+1 bytes for book.subject
.year:    RESD 1      ; reserve 1 32-bit unsigned integer for book.year
ENDSTRUC

In my assembly code:
Code: [Select]
mov ebx, [book1]
mov ecx, [book2]
These hold the book pointers.
How do I go about accessing book1->year in assembly?

You would use the notation [base_ref + structure.element] to access the elements of your structure. In your example, mov ebx, [book1] loads the pointer stored in book1 into ebx (and likewise book2 into ecx), so ebx now holds the address of the structure. Remember from your C classes that the '->' operator is shorthand: book1->year means (*book1).year. In the latter form we can clearly see that we are dereferencing a pointer and then selecting a field of a regular structure. This is exactly how we do it in assembly: the memory operand [ebx + book.year] does the dereference and the field selection in one step.

Code: [Select]
mov ebx, [book1]           ; ebx = book1 (the pointer stored in the variable)
                           ; no "mov ebx, [ebx]" here: that would load the first 4 bytes of author
mov edi, [ebx + book.year] ; edi = (*book1).year, i.e. book1->year

When trying to translate ideas from C to assembly, it's usually best to avoid the syntactic sugar and stick to the longer forms, because they translate better. That includes writing if (<bool-expr>) <true-clause> else <false-clause> rather than the shorter <bool-expr>?<true-clause>:<false-clause>; the compact forms can make translation a little harder.

If your teacher wants you to use the numbers directly, you could always count the number of bytes from the beginning of the STRUC to the .year label and use that in place of book.year. In your case, assuming the C compiler adds no padding before year, it would be:

book.year = (20+1) + (32+1) + (10+1) = 21 + 33 + 11 = 65

So the above code would change to:

Code: [Select]
mov ebx, [book1]           ; ebx = book1 (the pointer stored in the variable)
                           ; no "mov ebx, [ebx]" here: ebx already points at the structure
mov edi, [ebx + 65]        ; edi = book1->year, with the offset written out
« Last Edit: March 28, 2016, 03:01:24 AM by Bryant Keller »

About Bryant Keller
bkeller@about.me

Bryant Keller (Forum Moderator)
Re: Getting C Structure Fields in Assembly
« Reply #3 on: March 28, 2016, 02:43:31 AM »
js19,

I would expect "year" to be 65 bytes into the structure (although C can align it as it wishes, I think). I'd take a wild-asmed guess at:

Frank makes a really good point here. The C standard leaves structure padding up to the implementation, which lets compilers lay things out so they work better on a given system. The char arrays (author, title, subject) only need byte alignment, so they normally sit back to back, but an unsigned int usually wants a 4-byte boundary. On a typical x86 compiler that means 3 padding bytes after subject, which puts year at offset 68 rather than 65 (and makes the whole structure 72 bytes). So check what your compiler actually does before hard-coding the number. Eric S. Raymond wrote a really nice paper on C structure packing that you can review for more information on this.

Code: [Select]
mov ebx, [book1] ; how is "book1" defined?
mov eax, [ebx + 65]
but you might have to dereference the pointer:
Code: [Select]
mov ebx, [book1]
; or maybe mov ebx, book1?
mov ebx, [ebx]
mov eax, [ebx + 65]

Very important question here. Above I assumed book1 is a variable that holds the address of some memory obtained via malloc() or mmap(), but is that a correct assumption? I assumed so because you used '->' in your example, but it is worth confirming that this is what's actually going on.
« Last Edit: March 28, 2016, 03:11:08 AM by Bryant Keller »
