Topic: Converting C code to Nasm

Offline axper

  • Jr. Member
  • *
  • Posts: 4
Converting C code to Nasm
« on: November 17, 2013, 11:31:02 AM »
Hello, I am trying to convert a small C file to nasm.
Here is my C file: http://pastebin.com/KZ6L2TK4
It works fine when compiling with the following command:
Code: [Select]
gcc -lglfw -lGL -lm -m32 -o main main.c
I'll be using "-m32" to generate 32 bit code all along for simplicity.

Now the conversion part.
Using gcc's "-S -masm=intel" argument I first generate an assembly file with intel syntax:
Code: [Select]
gcc -m32 -S -masm=intel -Os -o main.intel-gcc.s main.c
Here's the result: http://pastebin.com/ZNwr9S9a
This is gcc's Intel syntax, which differs a bit from nasm's syntax. So I try to adapt it to nasm syntax and remove unnecessary parts. I wrote the following sed script to do the trick: http://pastebin.com/vW53pZDV. That does most of the work, but the result isn't parseable by Nasm yet.
Here's the result: http://pastebin.com/Dk1rm4Fg
The rest of the work must be done manually, as I don't know enough sed to automate that.
Next, we must take care of segments. Just try to compile with nasm and examine the output. I use the following command:
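To illustrate the kind of rewriting such a script does, here is a minimal sketch. It is not the sed script linked above (its contents are not reproduced in this thread). The helper name gas_to_nasm and the sample input lines are hypothetical. It covers three well-known differences: gcc's Intel output writes "DWORD PTR [..]" where Nasm expects "dword [..]" (size keywords are case-insensitive in Nasm), it prefixes addresses with "OFFSET FLAT:", and it uses ".long" where Nasm uses "dd".

```shell
#!/bin/sh
# Sketch only: gas_to_nasm is a hypothetical helper, not the author's sed script.
gas_to_nasm() {
    # 1. drop the "PTR" keyword after size specifiers
    # 2. drop "OFFSET FLAT:" (in Nasm a bare label already means its address)
    # 3. turn the ".long" data directive into "dd", keeping the indentation
    sed -e 's/ PTR / /g' \
        -e 's/OFFSET FLAT://g' \
        -e 's/^\([[:space:]]*\)\.long[[:space:]]\{1,\}/\1dd /'
}

# Made-up sample lines in the style of "gcc -S -masm=intel" output.
printf '\tmov\teax, DWORD PTR [ebp-4]\n\tpush\tOFFSET FLAT:buffer\n\t.long\t42\n' | gas_to_nasm
```

A real conversion needs many more rules (section directives, string directives, gcc's ".L" labels), which is why the manual steps that follow are still required.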
Code: [Select]
nasm -f elf32 -o main.o main.nasm
I put the .text segment at the top: just rename ".text" to "section .text". Next, move the offending ".section .rodata etc" segments to the end of the file. Also delete the repeated ".text" and ".section .text.startup etc" lines. Then, rename the ".data" line to ".bss" and in that segment rename all dd instructions to resd, and all db instructions to resb. Also remove the .align lines from there. Next, insert a line "section .data" above all the .rodata segments we moved to the end of the file earlier, and delete those ".section .rodata etc" lines.
Here's the result: http://pastebin.com/XrtMm6LF
Next, Nasm needs to know about external library functions used, such as strcpy and exit. I manually copied their names and put them at top of the file with extern declaration. Finally, Nasm needs a "global _start" line.
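The manual copying of function names could be partly automated. The sketch below is an assumption about how one might do it, not something from the original workflow. The helper name emit_externs and the sample input are hypothetical. It prints one "extern" line for every distinct name that appears as a direct call target.

```shell
#!/bin/sh
# Sketch only: emit_externs is a hypothetical helper.
# Reads assembly source on stdin and prints "extern NAME" once per call target.
emit_externs() {
    sed -n 's/^[[:space:]]*call[[:space:]]\{1,\}\([A-Za-z_][A-Za-z0-9_]*\)[[:space:]]*$/extern \1/p' | sort -u
}

# Made-up sample input: two library calls, one of them repeated.
printf '\tcall\tstrcpy\n\tmov\teax, 1\n\tcall\texit\n\tcall\tstrcpy\n' | emit_externs
```

The list will also contain functions defined in the same file (main, for example), so those entries have to be removed by hand before pasting the result at the top of the file.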
Here's the final result: http://pastebin.com/EbUyEfcb
For some reason, I am getting many warnings like this: main.c.nasm:158: warning: dword data exceeds bounds
Is this because I assigned the wrong storage size to my variables? Also, which segment do I put initialized variables in (not constants)?

Finally, link the resulting main.o with the following command:
Code: [Select]
ld -s -dynamic-linker \
/usr/lib32/ld-linux.so.2 \
-L/usr/lib32 \
-lc -lm -lglfw -lGL -m elf_i386 \
-o main \
main.o

./main - and I get this line as output: zsh: killed     ./main

If someone would be kind enough to look into my steps and point out my mistakes or offer suggestions I would be very thankful.

I am using Arch Linux x64
gcc version: 4.8.2
nasm version: 2.10.09
ld version: 2.23.2

As a side note, anti-spam methods on this forum are really annoying and confusing. I had to find a mouse! Then spent half an hour before realizing that after assembling the picture I should click register as there is no indicator of correctness or auto-verification.

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Converting C code to Nasm
« Reply #1 on: November 17, 2013, 12:52:26 PM »
This is rather a large mouthful to chew!

Quote
Then, rename the ".data" line to ".bss" and in that segment rename all dd instructions to resd, and all db instructions to resb.

I suspect this is where you've gone wrong. Your "section .bss" doesn't look right at all. "resd 0" makes no sense, and "resd some_large_number" reserves an insane amount of memory. That's probably why you're being "killed". I'd change 'em back to "dd", etc. (in "section .data")
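A quick bit of arithmetic shows why this matters. "dd N" stores the single 4-byte value N, while "resd N" reserves N dwords, i.e. N * 4 bytes. The value used below is hypothetical (it is not taken from the pasted file): 1065353216 is 0x3F800000, the bit pattern a compiler emits for the float 1.0. The helper name resd_bytes is also made up for this sketch.

```shell
#!/bin/sh
# Sketch only: resd_bytes is a hypothetical helper for illustration.
# "resd N" reserves N dwords, so the size in bytes is N * 4.
resd_bytes() {
    echo $(( $1 * 4 ))
}

# Bytes occupied by "dd 1065353216": always one dword.
echo 4
# Bytes reserved by "resd 1065353216" after a blind dd -> resd rename.
resd_bytes 1065353216
```

The second figure is close to 4 GiB for a single variable, far more than a 32-bit process can reasonably map, which fits the "killed" result.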

Agner Fog's "objconv" ( http://www.agner.org ) will disassemble your executable - or an .o file - into Nasm syntax (objconv -fnasm main main.af). This might be easier to work with than gcc's "-S" switch. There's still a lot of "C cruft" in it, but at least what you've got will be Nasm syntax. You've put so much work into this method that it may not be worth starting over.

I sympathize about the "anti-spam measures". If I were freshly trying to register here, it would probably keep me out. I blame the spammers. I think (I hope!) it will go away after you've logged in a few times.

Good luck - with both!

Best,
Frank


Offline axper

Re: Converting C code to Nasm
« Reply #2 on: November 17, 2013, 03:59:13 PM »
Thank you so much!
Converting from binary code seems a better approach indeed (I blame google for not showing me that).
For anyone wondering:
Code: [Select]
#!/bin/sh
gcc -m32 -c -o main.c.o main.c
objconv -fnasm main.c.o main.nasm
sed -i 's|st(0)|st0  |g' main.nasm
sed -i 's|noexecute|         |g' main.nasm
sed -i 's|execute|       |g' main.nasm
sed -i 's|: function||g' main.nasm
sed -i 's|?_|L_|g' main.nasm
sed -i -n '/SECTION .eh_frame/q;p' main.nasm
sed -i 's|;.*||g' main.nasm
sed -i 's/\r//g' main.nasm
sed -i 's|\s\+$||g' main.nasm
nasm -f elf32 -o main.nasm.o main.nasm
gcc -m32 -lglfw -lGL -lm -o main main.nasm.o
I misunderstood the res* instructions but now objconv sorted it all out.
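To show what the text substitutions in that script do, here is the same set of rules applied as a stream filter to a few sample lines. The function name clean_objconv and the sample lines are hypothetical, modeled only on the patterns the script targets. The .eh_frame truncation and the carriage-return rule are left out, and the trailing-whitespace rule is written in portable form.

```shell
#!/bin/sh
# Sketch only: clean_objconv is a hypothetical wrapper around the same
# substitutions the script above applies in place with "sed -i".
clean_objconv() {
    sed -e 's|st(0)|st0  |g' \
        -e 's|noexecute|         |g' \
        -e 's|execute|       |g' \
        -e 's|: function||g' \
        -e 's|?_|L_|g' \
        -e 's|;.*||g' \
        -e 's|[[:space:]]*$||'
}

# Made-up lines in the style the substitutions expect.
clean_objconv <<'EOF'
global main: function
SECTION .text   align=1 execute
?_001:  fld     st(0)    ; duplicate top of stack
EOF
```

The order of the rules matters in one place: "noexecute" has to be blanked before "execute", otherwise a stray "no" would be left behind.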
And to answer my own question about segments:
  • initialized variables go to .data section
  • uninitialized variables go to .bss section
  • initialized constants go to .rodata section
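As a small illustration of that list, the sketch below writes a three-section Nasm file and assembles it if nasm is installed. The file name and the labels (counter, buffer, greeting) are hypothetical and not taken from the converted program.

```shell
#!/bin/sh
# Sketch only: the file name and all labels are hypothetical.
demo="${TMPDIR:-/tmp}/sections_demo.nasm"

cat > "$demo" <<'EOF'
section .data                   ; initialized variables (writable)
counter:    dd 42               ; one dword holding the value 42

section .bss                    ; uninitialized variables (only a size)
buffer:     resb 64             ; 64 bytes, no initial contents stored

section .rodata                 ; initialized constants (read-only)
greeting:   db "hello", 0       ; NUL-terminated string
EOF

# Assemble only when nasm is available on this machine.
if command -v nasm >/dev/null 2>&1; then
    nasm -f elf32 -o "${demo%.nasm}.o" "$demo" && echo "assembled"
fi
```

Note that the dd/db directives appear only in .data and .rodata, while .bss uses the res* forms, whose operand is a count rather than a value.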