Author Topic: Mixed assembly and C  (Read 43440 times)

Offline dbfn

  • Jr. Member
  • *
  • Posts: 17
Re: Mixed assembly and C
« Reply #15 on: January 26, 2010, 01:17:00 AM »
all right then....

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Mixed assembly and C
« Reply #16 on: January 26, 2010, 03:08:40 AM »
Well, I still don't understand why your example ran on Keith's Mac! (OMG, they really *are* better! :)

I sympathize with your problems with structures in C. When I was trying to teach myself C, I utilized structures extensively. Had *no* idea what I was doing, just bashed at it until it worked - which it eventually did. Now that I've learned assembly, I understand why "returning the structure" wasn't working. The return value is a... well, 16-bit then, 32-bit or more now, value - "the structure" ain't going to fit.

Thanks to Bryant for the example!

Best,
Frank


Offline Keith Kanios

  • Full Member
  • **
  • Posts: 383
  • Country: us
    • Personal Homepage
Re: Mixed assembly and C
« Reply #17 on: January 26, 2010, 06:30:17 AM »
Hum...try then...o-o

Segfaulted on 32-bit Ubuntu. I checked a dump of the file and the Ubuntu version is still trying to access the stack, whereas the Mac version is not.

Ubuntu Binary Dump:
Code:
; Disassembly of file: main.bin
; Mon Jan 25 23:37:56 2010
; Mode: 32 bits
; Syntax: YASM/NASM
; Instruction set: 80386
 
 
global _edata
global __bss_start
global _end
global teste
global _start
global h
 
 
SECTION .text   align=16 execute                        ; section number 1, code
 
teste:  ; Function begin
        push    ebp                                     ; 080480A0 _ 55
        mov     ebp, esp                                ; 080480A1 _ 89. E5
        mov     ecx, dword [ebp+8H]                     ; 080480A3 _ 8B. 4D, 08
        mov     dword [h], 10                           ; 080480A6 _ C7. 05, 080490E8(d), 0000000A
        mov     dword [?_001], 30                       ; 080480B0 _ C7. 05, 080490EC(d), 0000001E
        mov     eax, dword [h]                          ; 080480BA _ A1, 080490E8(d)
        mov     edx, dword [?_001]                      ; 080480BF _ 8B. 15, 080490EC(d)
        mov     dword [ecx], eax                        ; 080480C5 _ 89. 01
        mov     dword [ecx+4H], edx                     ; 080480C7 _ 89. 51, 04
        mov     eax, ecx                                ; 080480CA _ 89. C8
        pop     ebp                                     ; 080480CC _ 5D
        ret     4                                       ; 080480CD _ C2, 0004
; teste End of function
 
_start: ; Function begin
        call    teste                                   ; 080480D0 _ E8, FFFFFFCB
        mov     esi, h                                  ; 080480D5 _ BE, 080490E8(d)
        mov     eax, 1                                  ; 080480DA _ B8, 00000001
        mov     ebx, 0                                  ; 080480DF _ BB, 00000000
; Note: Function does not end with ret or jmp
        int     -128                                    ; 080480E4 _ CD, 80
; _start End of function
 
 
SECTION .bss    align=4 noexecute                       ; section number 2, bss
 
h:                                                      ; qword
        resb    4                                       ; 080490E8
 
?_001:                                                  ; dword
        resd    1                                       ; 080490EC

I'm not exactly an expert in C or compiler theory, and perhaps I am missing something, but reading from a stack slot that the caller never set up seems rather foolish... especially when it sits above everything this function knows about on a downward-growing stack ([ebp+0..3] = saved ebp; [ebp+4..7] = return address; [ebp+8 and up] = ???).

Perhaps there was a faulty assumption by the compiler that a pointer to the structure was going to be supplied as the first/only argument?

Perhaps mov ecx, dword [ebp+8H] is consistently grabbing an invalid or NULL pointer, causing a segfault on the subsequent mov dword [ecx], eax?
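
The hidden-pointer guess can be sketched at the C level. This assumes the struct holds two 32-bit integers, which matches the dump (h at 080490E8, the second dword at 080490EC). Only teste and h come from the dump; struct pair and teste_lowered are made-up names for illustration, and the lowered function is an approximation of what the Ubuntu code does, not the compiler's actual output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct in the thread: two 32-bit members. */
struct pair {
    int32_t a;
    int32_t b;
};

/* Global instance, playing the role of `h` in the dump. */
struct pair h;

/* What the programmer writes: return the struct by value. */
struct pair teste(void)
{
    h.a = 10;
    h.b = 30;
    return h;
}

/*
 * Roughly what the Ubuntu disassembly does instead: the caller is expected
 * to pass the address of a result slot as a hidden first argument
 * (read from [ebp+8]); the callee copies the struct there and hands the
 * same address back (in eax).
 */
struct pair *teste_lowered(struct pair *ret)
{
    assert(ret != NULL);   /* the generated code has no such check */
    h.a = 10;
    h.b = 30;
    ret->a = h.a;          /* mov dword [ecx], eax    */
    ret->b = h.b;          /* mov dword [ecx+4H], edx */
    return ret;            /* mov eax, ecx            */
}
```

The `ret 4` at the end of the Ubuntu teste fits this reading, since it pops one 4-byte argument that the caller was expected to push. The _start in the dump pushes nothing before the call, so [ebp+8] holds whatever was already on the stack at process entry (on Linux that is normally the argument count), which is not a usable address.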


Mac Binary Dump:
Code:
; Disassembly of file: main.bin
; Mon Jan 25 23:23:53 2010
; Mode: 32 bits
; Syntax: YASM/NASM
; Instruction set: 80386


global __mh_execute_header
global start
global _teste
global _h

__mh_execute_header equ 00001000H                       ; 4096


SECTION ._TEXT.__text align=1 execute                   ; section number 1, code

start:  ; Function begin
        call    _teste                                  ; 1FB7 _ E8, 0000000C
        mov     eax, 1                                  ; 1FBC _ B8, 00000001
        mov     ebx, 0                                  ; 1FC1 _ BB, 00000000
        int     -128                                    ; 1FC6 _ CD, 80

_teste:
        push    ebp                                     ; 1FC8 _ 55
        mov     ebp, esp                                ; 1FC9 _ 89. E5
        sub     esp, 8                                  ; 1FCB _ 83. EC, 08
        call    ?_001                                   ; 1FCE _ E8, 00000000
?_001:  pop     ecx                                     ; 1FD3 _ 59
        lea     eax, [ecx+102DH]                        ; 1FD4 _ 8D. 81, 0000102D
        mov     eax, dword [eax]                        ; 1FDA _ 8B. 00
        mov     dword [eax], 10                         ; 1FDC _ C7. 00, 0000000A
        lea     eax, [ecx+102DH]                        ; 1FE2 _ 8D. 81, 0000102D
        mov     eax, dword [eax]                        ; 1FE8 _ 8B. 00
        mov     dword [eax+4H], 30                      ; 1FEA _ C7. 40, 04, 0000001E
        lea     eax, [ecx+102DH]                        ; 1FF1 _ 8D. 81, 0000102D
        mov     eax, dword [eax]                        ; 1FF7 _ 8B. 00
        mov     edx, dword [eax+4H]                     ; 1FF9 _ 8B. 50, 04
        mov     eax, dword [eax]                        ; 1FFC _ 8B. 00
        leave                                           ; 1FFE _ C9
        ret                                             ; 1FFF _ C3
; start End of function


SECTION ._DATA.__common align=4 noexecute               ; section number 2, bss

_h:                                                     ; byte
        resb    8                                       ; 2000


SECTION ._IMPORT.__pointers align=4 noexecute           ; section number 3, data

        db 00H, 20H, 00H, 00H                           ; 3000 _ . ..

The Mac version above, although more indirect, is a little "smarter" (which probably has more to do with ABI/API design): it only returns the values of the two structure members in eax and edx and never dereferences anything from the caller's stack. This is the only major difference between the two systems/binaries, and it could definitely account for the segfault.
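
To picture the register-pair return, here is a small C model of it, assuming a struct of two 32-bit integers. The low half of a 64-bit value stands in for eax (first member) and the high half for edx (second member), mirroring the `mov edx, dword [eax+4H]` / `mov eax, dword [eax]` pair at the end of _teste. The names struct pair, pack_pair and unpack_pair are hypothetical, and this is only a model of the idea, not what the Mac compiler emits.

```c
#include <stdint.h>

/* Hypothetical stand-in for the struct in the thread: two 32-bit members. */
struct pair {
    int32_t a;
    int32_t b;
};

/* Model the callee side: first member in the low half ("eax"),
 * second member in the high half ("edx"). */
uint64_t pack_pair(struct pair p)
{
    return ((uint64_t)(uint32_t)p.b << 32) | (uint32_t)p.a;
}

/* Model the caller side: split the two "registers" back into members. */
struct pair unpack_pair(uint64_t regs)
{
    struct pair p;
    p.a = (int32_t)(uint32_t)(regs & 0xFFFFFFFFu);
    p.b = (int32_t)(uint32_t)(regs >> 32);
    return p;
}
```

Nothing here needs an address supplied by the caller, which is why a caller that pushes no arguments still works with this convention.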

In the end, however, you did something that was theoretically acceptable to the compiler but the real results were obviously nothing that was intended. I would recommend brushing up on Pointers in C/C++.

Another note about what you are trying to achieve. If you want direct access to the C struct in ASM, you would most likely want to put an extern for it in the ASM code and address it directly.
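
A sketch of the C side of that arrangement, assuming the struct has two 32-bit members (struct pair and store_like_asm are made-up names). The assembly module would declare the symbol with something like `extern h` (`_h` on the Mac, going by the dump) and then use fixed offsets from it, so the useful thing to pin down in C is the layout. The function below only imitates in C what stores such as `mov dword [h], 10` and `mov dword [h+4], 30` would do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the struct in the thread: two 32-bit members. */
struct pair {
    int32_t a;
    int32_t b;
};

/* Defined in C with external linkage, so an assembly module can refer to it. */
struct pair h;

/* Compile-time checks on the layout the assembly side would rely on. */
_Static_assert(offsetof(struct pair, a) == 0, "a is expected at [h]");
_Static_assert(offsetof(struct pair, b) == 4, "b is expected at [h+4]");
_Static_assert(sizeof(struct pair) == 8, "struct is expected to be 8 bytes");

/* C imitation of direct stores through the symbol plus a byte offset. */
void store_like_asm(void)
{
    unsigned char *base = (unsigned char *)&h;
    int32_t ten = 10;
    int32_t thirty = 30;

    memcpy(base + 0, &ten, sizeof ten);       /* like: mov dword [h], 10   */
    memcpy(base + 4, &thirty, sizeof thirty); /* like: mov dword [h+4], 30 */
}
```

If the struct ever changes, the static assertions fail at compile time instead of the assembly silently writing to the wrong offsets.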


Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Mixed assembly and C
« Reply #18 on: January 26, 2010, 07:35:04 AM »
That is truly bizarre. And some fancy footwork on the part of the Mac!

I have a question about mixed C and asm. What's the C for? :)

Best,
Frank


Offline Bryant Keller

  • Forum Moderator
  • Full Member
  • *****
  • Posts: 360
  • Country: us
    • About Bryant Keller
Re: Mixed assembly and C
« Reply #19 on: January 26, 2010, 04:39:01 PM »
dbfn,

You could _technically_ return a struct even if it doesn't fit into eax, although I've never seen it done in C. The technique I have in mind reached into the caller's frame to modify a previously declared structure at a known offset in that frame. I don't remember exactly where I saw this (I do remember it was written with GoAsm's USEDATA directive), but for the sake of modularity it would be worthless -- it has to know where the variable is in the previous frame.

Personally, I prefer to use the dereferenced pointer method shown in my previous post because it could easily be extended to void pointer types to allow for structures which share a base structure (think along the lines of inheritance, very useful in IP based protocols where different "payloads" decide the rest of the structure even though the "header" is effectively the same).
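
The previous post isn't reproduced here, so the following is only a guess at the shape of that idea: every message type starts with the same header struct, functions take a void pointer, and the header decides how the rest is interpreted. All names (msg_header, msg_ping, msg_data, msg_summary) are invented for the sketch and do not come from any real protocol.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message kinds stored in the shared header. */
enum msg_kind { MSG_PING = 1, MSG_DATA = 2 };

/* The "base" structure that every message begins with. */
struct msg_header {
    uint8_t  kind;
    uint16_t length;
};

/* Two "derived" layouts; the header must be the first member so that a
 * pointer to the whole message is also a valid pointer to the header. */
struct msg_ping {
    struct msg_header hdr;
    uint32_t sequence;
};

struct msg_data {
    struct msg_header hdr;
    uint8_t bytes[4];
};

/* Takes any message through a void pointer, inspects the header, and
 * returns a per-kind summary value, or -1 for an unknown kind. */
long msg_summary(const void *msg)
{
    const struct msg_header *hdr = msg;

    switch (hdr->kind) {
    case MSG_PING: {
        const struct msg_ping *ping = msg;
        return (long)ping->sequence;
    }
    case MSG_DATA: {
        const struct msg_data *data = msg;
        long sum = 0;
        for (size_t i = 0; i < sizeof data->bytes; i++)
            sum += data->bytes[i];
        return sum;
    }
    default:
        return -1;
    }
}
```

Since only a pointer crosses the call boundary, it does not matter to the caller (or to assembly code calling in) how large any particular payload is.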

Just to be consistent, I would avoid returning structures. Yes, some compilers will detect a structure's size and put it in a "known" location such as the FPU registers, or adjust the stack by moving ebp/esp to make room for the value. But to be honest, there is no "universal method" used by every compiler, and your code will be restricted to a specific set of build tools, whereas with the pointer method the return value is always going to be in eax.

Quote
Thanks to Bryant for the example!

Frank,

No problem man. Actually, for the sake of this new forum I've decided to start saving a local copy of my responses as well as working examples in a neat directory structure.

Code:
/home/bkeller/Desktop/Support/[username]/[subject]/
/home/bkeller/Desktop/Support/[username]/[subject]/src/*.*
/home/bkeller/Desktop/Support/[username]/[subject]/responses/###.txt

This way, not only do I know for certain that all the code I post will work (no more tongue-in-cheek code), but I'll also have a way to quickly look up past responses in the event I get a repeat question. I hope this will make me more effective in responding to any questions.

We've talked before about me joining SourceForge to take part in the discussion group, but as I mentioned then, I hated the SourceForge-based forums due to their poor support for displaying code snippets. (You would think a site dedicated to hosting open source projects would at least allow users to post readable code.) I honestly can't express how happy I am that the board is now running SMF, and I look forward to being a regular member here. :)

Keith,

Macs are so cute. The Mac dump you posted has me completely boggled as to why in the world it would want to use position-relative addressing to access memory locations. That code reminds me of the old K32B.INC include Homer and I created for tiny executables on Win32 (no imports) through the use of PRA (the delta method) and a few PEB hacks. Outside of that include, I've never actually seen PRA used for, well, anything (except maybe shellcode and viral-type code). lol

If you have any other code dumps like that (since I don't own a Mac), toss them to me in PM; I'd love to see what other little tricks Mac compilers have picked up. :)

Regards,
Bryant Keller

About Bryant Keller
bkeller@about.me