Topic: Get NASM code from C code automatically

Offline AndrewF

  • Jr. Member
  • *
  • Posts: 15
Get NASM code from C code automatically
« on: May 27, 2014, 12:06:05 AM »
Is there a way to get NASM code from C code without writing everything manually?

For example, gcc has an option that saves the generated assembly code of the program to a file, simply by adding a flag to the compile command.
I'm looking for a similar method that gives me pure NASM code.

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Get NASM code from C code automatically
« Reply #1 on: May 27, 2014, 12:39:11 AM »
Well... David Lindauer's old cc386 used to output Nasm syntax, but it's not very up-to-date. Using gcc, you can ask it for Intel syntax with "-S -masm=intel". Nasm won't assemble the result, but it would be closer. Maybe I don't understand what you're trying to do. If you've got C code, why do you need Nasm code? Why not let gcc assemble it with (G)as?
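
As a rough sketch of the gcc route Frank mentions, assume a hypothetical source file transpose.c like the one below (the file name and the transpose function are made up for illustration, not taken from AndrewF's program). Running gcc with -S stops after compilation and writes the assembly to a file, and -masm=intel selects Intel syntax for it. The output still contains GAS directives, so it is a starting point for reading, not something NASM will accept unchanged.

```c
/* transpose.c - hypothetical example file.
 *
 * To write gcc's assembly for this file in Intel syntax:
 *     gcc -O2 -S -masm=intel transpose.c -o transpose.s
 */
#include <assert.h>
#include <stddef.h>

/* Copy the transpose of a rows x cols row-major matrix src into dst,
 * which must hold cols x rows elements and must not be src itself. */
void transpose(const float *src, float *dst, size_t rows, size_t cols)
{
    assert(src != dst);  /* in-place transposition is not handled here */
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            dst[c * rows + r] = src[r * cols + c];
}
```

Comparing transpose.s produced with and without optimization flags is a cheap way to see how much of the "generic" look comes from the default settings.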

Offline AndrewF

  • Jr. Member
  • *
  • Posts: 15
Re: Get NASM code from C code automatically
« Reply #2 on: May 27, 2014, 02:58:17 AM »
I want to create routines that are better optimized for a specific CPU; the standard gcc compilation procedure generates assembly that is too generic.
Writing all the NASM code from scratch is very difficult and takes too much work, so it would be useful to get the NASM code automatically. I could then edit and optimize it, replacing older instructions and parallelizing loops where possible.
« Last Edit: May 27, 2014, 03:00:23 AM by AndrewF »

Offline encryptor256

  • Full Member
  • **
  • Posts: 250
  • Country: lv
Re: Get NASM code from C code automatically
« Reply #3 on: May 27, 2014, 11:18:13 AM »
Quote from: AndrewF on May 27, 2014, 12:06:05 AM
Is there a way to get NASM code from C code without writing everything manually?

Not directly, but it might be possible:

GCC is insanely configurable, with ~"five billion" command switches and options to choose from and turn on or off.

Only GCC could answer this question properly.

I think Intel syntax is the thing you are looking for, like Frank suggests.

I think there are only two syntaxes: Intel and others. :D

For me, Intel always makes sense, but "others" - not really.


Bye.
Encryptor256's Investigation \ Research Department.

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Get NASM code from C code automatically
« Reply #4 on: May 30, 2014, 05:38:19 PM »
Occurs to me that Agner Fog's "objconv", run on the C-built executable or .o file(?) might help you. "-fnasm" no space. http://www.agner.org

Best,
Frank


Offline AndrewF

  • Jr. Member
  • *
  • Posts: 15
Re: Get NASM code from C code automatically
« Reply #5 on: June 01, 2014, 03:15:01 AM »
Quote from: Frank Kotler on May 30, 2014, 05:38:19 PM
Occurs to me that Agner Fog's "objconv", run on the C-built executable or .o file(?) might help you. "-fnasm" no space. http://www.agner.org

Best,
Frank

How do I use this software on Linux? Is there a deb package?

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Get NASM code from C code automatically
« Reply #6 on: June 01, 2014, 04:03:24 AM »
"make install" not doing it for you? Mmmm, looks like there's a "build.sh" in there.

Best,
Frank


Offline AndrewF

  • Jr. Member
  • *
  • Posts: 15
Re: Get NASM code from C code automatically
« Reply #7 on: June 01, 2014, 12:28:52 PM »
Quote from: Frank Kotler on June 01, 2014, 04:03:24 AM
"make install" not doing it for you? Mmmm, looks like there's a "build.sh" in there.

Best,
Frank

I changed the permissions on build.sh to make it executable, but when I try to run it I get this error:

Code: [Select]
g++: error: *.cpp: No such file or directory
g++: fatal error: no input files
compilation terminated.

I'm using Kubuntu 64-bit, and I have unzipped the source into a folder named objectconvsource in my home directory.

If I try to run

Code: [Select]
-> `cd objectconvsource`

-> `g++ -o objconv -O2 *.cpp && make && sudo make install`

I get the error

   
Code: [Select]
No targets specified and no makefile found.  Stop.
« Last Edit: June 01, 2014, 12:54:29 PM by AndrewF »

Offline Mixolydian

  • Jr. Member
  • *
  • Posts: 21
Re: Get NASM code from C code automatically
« Reply #8 on: June 01, 2014, 01:57:36 PM »
Is there a way of doing this with ndisasm, something like:

Code: [Select]
main.asm: main.c
gcc main.c -o main.elf
objcopy -O binary main.elf main.bin
"$(NASM)\ndisasm" -u main.bin -O main.asm

Although this doesn't work for multiple reasons.

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Get NASM code from C code automatically
« Reply #9 on: June 01, 2014, 06:25:38 PM »
Ndisasm will produce Nasm code, but probably not in a very useful format.

AndrewF, it "sounds" like you're not in the right directory... although if you altered the permissions on "build.sh" you must be...

I hate to say this, but beating a compiler these days isn't easy. Not like the "good old days". As encryptor256 suggests, you might be better off studying your compiler switches.

Best,
Frank
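
To illustrate what "studying your compiler switches" might look like in practice, here is a minimal sketch that is not taken from AndrewF's program. The function name scale_add and the file name scale_add.c are hypothetical. With -O3, gcc tries to vectorize simple loops; -march=native lets it use the instruction set of the machine it is built on, and -fopt-info-vec asks it to report which loops it vectorized. The restrict qualifiers promise that x and y do not overlap, which is often what the vectorizer needs to know.

```c
/* scale_add.c - hypothetical example file.
 *
 * One possible build line for inspecting gcc's vectorization:
 *     gcc -O3 -march=native -fopt-info-vec -c scale_add.c
 */
#include <assert.h>
#include <stddef.h>

/* Compute y[i] += a * x[i] for n elements.
 * restrict tells the compiler that x and y never alias. */
void scale_add(float a, const float *restrict x, float *restrict y, size_t n)
{
    assert(x != y);  /* the restrict promise would be broken otherwise */
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

If the report shows the loop was vectorized, hand-written assembly has to beat that result rather than the unoptimized output.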


Offline AndrewF

  • Jr. Member
  • *
  • Posts: 15
Re: Get NASM code from C code automatically
« Reply #10 on: June 02, 2014, 01:01:25 AM »
Quote from: Frank Kotler on June 01, 2014, 06:25:38 PM
Ndisasm will produce Nasm code, but probably not in a very useful format.

AndrewF, it "sounds" like you're not in the right directory... although if you altered the permissions on "build.sh" you must be...

I hate to say this, but beating a compiler these days isn't easy. Not like the "good old days". As encryptor256 suggests, you might be better off studying your compiler switches.

Best,
Frank

Thank you for the help. I have now set up the suggested program correctly. The generated asm files contain NASM instructions, and the first part is readable, but the second part contains a lot of lines like these, and I don't understand what their purpose is or how I should interpret them:

Code: [Select]
SECTION .data   align=1 noexecute                       ; section number 2, data


SECTION .bss    align=1 noexecute                       ; section number 3, bss


SECTION .rodata.str1.8 align=8 noexecute                ; section number 4, const

?_035:                                                  ; byte
        db 0AH, 0AH, 53H, 74H, 61H, 72H, 74H, 20H       ; 0000 _ ..Start
        db 4DH, 61H, 74H, 72H, 69H, 78H, 20H, 4DH       ; 0008 _ Matrix M
        db 65H, 6DH, 6FH, 72H, 79H, 20H, 41H, 6CH       ; 0010 _ emory Al
        db 6CH, 6FH, 63H, 61H, 74H, 69H, 6FH, 6EH       ; 0018 _ location
        db 2EH, 2EH, 2EH, 0AH, 00H, 00H, 00H, 00H       ; 0020 _ ........

?_036:                                                  ; byte
        db 0AH, 0AH, 45H, 6EH, 64H, 20H, 4DH, 61H       ; 0028 _ ..End Ma
        db 74H, 72H, 69H, 78H, 20H, 4DH, 65H, 6DH       ; 0030 _ trix Mem
        db 6FH, 72H, 79H, 20H, 41H, 6CH, 6CH, 6FH       ; 0038 _ ory Allo
        db 63H, 61H, 74H, 69H, 6FH, 6EH, 2EH, 0AH       ; 0040 _ cation..
        db 00H, 00H, 00H, 00H, 00H, 00H, 00H, 00H       ; 0048 _ ........

?_037:                                                  ; byte
        db 0AH, 0AH, 53H, 74H, 61H, 72H, 74H, 20H       ; 0050 _ ..Start
        db 4DH, 61H, 74H, 72H, 69H, 78H, 20H, 4DH       ; 0058 _ Matrix M
        db 65H, 6DH, 6FH, 72H, 79H, 20H, 44H, 65H       ; 0060 _ emory De
        db 61H, 6CH, 6CH, 6FH, 63H, 61H, 74H, 69H       ; 0068 _ allocati
        db 6FH, 6EH, 2EH, 2EH, 2EH, 0AH, 00H, 00H       ; 0070 _ on......

?_038:                                                  ; byte
        db 0AH, 0AH, 45H, 6EH, 64H, 20H, 4DH, 61H       ; 0078 _ ..End Ma
        db 74H, 72H, 69H, 78H, 20H, 4DH, 65H, 6DH       ; 0080 _ trix Mem
        db 6FH, 72H, 79H, 20H, 44H, 65H, 61H, 6CH       ; 0088 _ ory Deal
        db 6CH, 6FH, 63H, 61H, 74H, 69H, 6FH, 6EH       ; 0090 _ location
        db 2EH, 0AH, 00H, 00H, 00H, 00H, 00H, 00H       ; 0098 _ ........

?_039:                                                  ; byte
        db 0AH, 0AH, 53H, 74H, 61H, 72H, 74H, 20H       ; 00A0 _ ..Start
        db 4DH, 61H, 74H, 72H, 69H, 78H, 20H, 54H       ; 00A8 _ Matrix T
        db 72H, 61H, 6EH, 73H, 70H, 6FH, 73H, 69H       ; 00B0 _ ransposi
        db 74H, 69H, 6FH, 6EH, 2EH, 2EH, 2EH, 0AH       ; 00B8 _ tion....
        db 00H, 00H, 00H, 00H, 00H, 00H, 00H, 00H       ; 00C0 _ ........

//and so on....


Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Get NASM code from C code automatically
« Reply #11 on: June 02, 2014, 02:20:05 AM »
The first three lines are section declarations. The cruft after the section name looks like what Nasm would default to if you didn't say. Shouldn't do any harm to leave it in there. The rest of it looks like strings. Hard to say without seeing your C code. I don't see a ".text" section. That may be in the "and so on" part... and is probably what you're interested in. Good progress. Keep on keepin' on.

Best,
Frank
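
Frank's guess that the db lines are strings can be checked by reading the bytes as ASCII. The sketch below copies the bytes of ?_035 from the listing; the array name and both function names are hypothetical. The first zero byte is the string terminator, and the remaining zeros are most likely padding up to the 8-byte alignment of the section (an assumption based on the align=8 attribute).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Bytes copied from the ?_035 label in the objconv listing. */
static const unsigned char label_035[] = {
    0x0A, 0x0A, 0x53, 0x74, 0x61, 0x72, 0x74, 0x20,
    0x4D, 0x61, 0x74, 0x72, 0x69, 0x78, 0x20, 0x4D,
    0x65, 0x6D, 0x6F, 0x72, 0x79, 0x20, 0x41, 0x6C,
    0x6C, 0x6F, 0x63, 0x61, 0x74, 0x69, 0x6F, 0x6E,
    0x2E, 0x2E, 0x2E, 0x0A, 0x00, 0x00, 0x00, 0x00
};

/* Hypothetical helper: count the bytes before the first zero byte. */
size_t text_length(const unsigned char *data, size_t size)
{
    assert(data != NULL);
    size_t n = 0;
    while (n < size && data[n] != 0x00)
        n++;
    return n;
}

/* Hypothetical helper: print the text part, showing line feeds as "\n". */
void print_escaped(const unsigned char *data, size_t size)
{
    size_t len = text_length(data, size);
    for (size_t i = 0; i < len; i++) {
        if (data[i] == 0x0A)
            fputs("\\n", stdout);
        else
            putchar(data[i]);
    }
    putchar('\n');
}
```

Calling print_escaped(label_035, sizeof label_035) prints \n\nStart Matrix Memory Allocation...\n, which looks like a message string from the C source, so these blocks are data for the program's output and not code to optimize.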


Offline AndrewF

  • Jr. Member
  • *
  • Posts: 15
Re: Get NASM code from C code automatically
« Reply #12 on: June 02, 2014, 01:50:14 PM »
...
« Last Edit: June 04, 2014, 02:33:44 AM by AndrewF »

Offline AndrewF

  • Jr. Member
  • *
  • Posts: 15
Re: Get NASM code from C code automatically
« Reply #13 on: June 04, 2014, 01:53:51 AM »
Ok.
Now I have removed the unneeded frame part and the problematic keywords, but I don't know how to fix these problems:

Code: [Select]
test.asm:189: warning: absolute address can not be RIP-relative
test.asm:225: warning: absolute address can not be RIP-relative
test.asm:286: warning: absolute address can not be RIP-relative

That happens in these lines

Code: [Select]
vmovss  xmm1, dword [rel .LC0]
Code: [Select]
vmovss  xmm8, dword [rel .LC2]
Code: [Select]
vucomiss xmm0, dword [rel .LC1]
which causes the assembler to return these errors:
Code: [Select]
test.asm:189: error: symbol `?_015.LC0' undefined
test.asm:225: error: symbol `matrixum.LC2' undefined
test.asm:286: error: symbol `?_025.LC1' undefined

I have read that in NASM syntax, using the 64-bit absolute form requires QWORD.
So could the problem be caused by the fact that this section of my code contains only dwords?
Code: [Select]
SECTION .rodata.cst4                   ; section number 4, const

.LC0:                                                   ; dword
        dd 3F800000H                                    ; 0000 _ 1.0

.LC1:   dd 00000000H                                    ; 0004 _ 0.0


SECTION .rodata.cst16              ; section number 5, const

.LC2:                                                   ; dword
        dd 80000000H, 00000000H                         ; 0000 _ -0.0 0.0
        dd 00000000H, 00000000H                         ; 0008 _ 0.0 0.0


« Last Edit: June 04, 2014, 02:38:46 AM by AndrewF »

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Get NASM code from C code automatically
« Reply #14 on: June 04, 2014, 04:28:08 AM »
I'm way over my head with this, AndrewF. I have almost no experience with 64-bit code. The first three lines are "only" warnings. It "might" work anyway, but probably not. You could try removing the "rel", but that probably won't work either. You might try making the whole thing "default rel". If objconv thinks gcc encoded it as "rel", it probably did. I don't know why Nasm thinks it's a problem.

As you know, symbols starting with a dot are Nasm's idea of how to implement a local label. The scope is from one non-local label to the next. Do you see labels like "?_015", "matrixum", and/or "?_025" anywhere (with or without a ':')? You might try stuffing such labels in - someplace that looks probable. If ".LC0" etc. are reused, you're going to have to be careful to get 'em in the right place! If they're not reused, removing the dot might work(?).

I think that all addresses would be qword, but I don't see any problem with them pointing to dword or even byte data. As I said, this is way beyond my experience. Good luck!

Best,
Frank
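
On the dword question: in an instruction such as vmovss xmm1, dword [rel .LC0], the dword describes the size of the data being loaded (one 32-bit float), not the size of the address, so the dd constants are consistent with it. The sketch below, which assumes float is a 32-bit IEEE 754 type as it is on x86-64, shows what the constants in the listing encode. The helper names are hypothetical, and the idea that .LC2 serves as a sign mask for negation is a guess from its value, not something the thread confirms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a single-precision float. */
float float_from_bits(uint32_t bits)
{
    float value;
    assert(sizeof value == sizeof bits);  /* assumes a 32-bit float */
    memcpy(&value, &bits, sizeof value);  /* well-defined way to copy the bits */
    return value;
}

/* Reinterpret a single-precision float as its 32-bit pattern. */
uint32_t bits_from_float(float value)
{
    uint32_t bits;
    assert(sizeof value == sizeof bits);
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Flip only the sign bit, as an XOR with 80000000H (the first dword of
 * .LC2) would do. That .LC2 is used this way is an assumption. */
float negate_with_mask(float x)
{
    return float_from_bits(bits_from_float(x) ^ UINT32_C(0x80000000));
}
```

Here float_from_bits(0x3F800000) gives 1.0f, matching the comment on .LC0, and the pattern 80000000H is negative zero, which is why XOR-ing with it changes nothing but the sign.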