Author Topic: 64-bit assembly - some basic functionality, and Windows and c interfacing  (Read 25660 times)

Offline Sydney Grew

  • Jr. Member
  • Posts: 2
Greetings to all. This is about a simple example I made, showing how to use 64-bit assembly language to perform some basic but useful functions such as:

- Reading the command line
- Displaying text with printf
- Getting text with gets and _getch
- Listing the contents of a directory using _findfirst64 and _findnext64
- Reading a file with fopen, feof, fread, and fclose
- Calling Windows APIs
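
A note on the gets item in the list above: gets places no limit on how much it writes into the buffer, and the usual substitute is a bounded read with the standard fgets. What follows is only a minimal sketch in C. The helper name read_line and its return convention are assumptions made for illustration and are not part of the original example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the original example): read one line
   from `in` into `buf`, never writing more than `cap` bytes.
   Returns the length of the stored text, or -1 on end of file or error.
   If the line is longer than the buffer, the remainder stays in the
   stream and is returned by the next call. */
long read_line(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL && cap > 0);

    if (fgets(buf, (int)cap, in) == NULL)
        return -1;                      /* nothing could be read */

    buf[strcspn(buf, "\r\n")] = '\0';   /* drop the trailing line ending */
    return (long)strlen(buf);
}
```

A caller would use it as read_line(stdin, name, sizeof name), where name is an ordinary char array. From assembly the same idea applies: pass the buffer, its size, and the stdin stream to fgets in rcx, rdx and r8.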

And there is a second simple example showing why it may be better to call c functions from an assembly language main program, rather than to write a main program entirely in c.

They use nasm in a 64-bit windows environment, so may be of interest to some people. Full instructions for building them are included.

Um . . . at present all this may be seen in a general discussion forum where I uploaded them last week - here is the link:

http://artmusic.smfforfree.com/index.php/topic,249.0.html

but if preferred I could paste them here as well.

And thanks so much to the authors of nasm for all their hard work over many years to provide what is now a very useful program indeed!

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • Posts: 2667
  • Country: us
Re: 64-bit assembly - some basic functionality, and Windows and c interfacing
« Reply #1 on: December 24, 2011, 04:18:03 PM »
Shouldn't be a problem to paste it here. I think this forum is configured so that you need to make 5(?) posts before you can attach files. We can probably override that, if necessary...

Code: [Select]
; Simple examples of the use of 64-bit assembly language to perform
; a few fundamental functions in a "c" and "windows" environment.

; ====================================================================

extern printf
extern _getch
extern gets
extern _findfirst64
extern _findnext64
extern _findclose
extern fopen
extern fread
extern feof
extern fclose
extern GetCommandLineA

; ====================================================================

buflen  equ   2000000                 ; File input buffer size

; ====================================================================

; Structure definition used by findfirst64 and findnext64:

struc finddata64_t
  attrib:       resq  1    ; 32-bit attribute field plus 4 bytes of alignment padding
  time_create:  resq  1
  time_access:  resq  1
  time_write:   resq  1
  size:         resq  1
  name:         resb  260
endstruc

; ====================================================================

; This directive is essential; 64-bit code normally addresses things
; in a new mode, relative to the instruction pointer!

default rel

; ====================================================================

section .code

[BITS 64]
global _start

; The conventional prologue
_start: mov   [rsp + 8],rcx
        push  r15
        push  r14
        push  r13
        sub   rsp,128           ; Stack space

; Get the command line (an example of one of the many hundreds of
; Windows APIs which may be called in this way)
        xor   rcx,rcx
        xor   rdx,rdx
        xor   r8,r8
        xor   r9,r9
        call  GetCommandLineA   ; Returns rax pointing to the command line

; Display something using printf
        mov   rcx,mess1
        mov   rdx,rax           ; Pointer to the command line (%s)
        xor   r8,r8
        xor   r9,r9
        call  printf

; Get some text from the user (note - no check for buffer overflow here - to avoid
; that one could call fgets on the C runtime's stdin stream, since fgets accepts a
; maximum count)
        mov   rcx,buffer
        xor   rdx,rdx
        xor   r8,r8
        xor   r9,r9
        call  gets
;
        mov   rcx,youent
        mov   rdx,buffer
        xor   r8,r8
        xor   r9,r9
        call  printf

; List a selection of files in the current directory
        mov   rcx,wci         ; The mask *.txt will select all text files
        mov   rdx,finddata64  ; Address of the result structure
        xor   r8,r8
        xor   r9,r9
        call  _findfirst64
;
        cmp   rax,-1
        je    goout           ; --> No file of that kind found
        mov   [hfile],rax     ; Save the file "handle"

; Display the file's name and size
filope: mov   rcx,fileinfo
        mov   rdx,finddata64+name   ; Address of the file name
        mov   r8,[finddata64+size]  ; File size - note that it is a quadword here (%lld)
        xor   r9,r9
        call  printf

; Open the file for reading in binary mode
        mov   rcx,finddata64+name   ; Address of the file name
        mov   rdx,ipmods      ; Mode string - important - must be zero-terminated!
        xor   r8,r8
        xor   r9,r9
        call  fopen

; It will return either a FILE * or null
        or    rax,rax
        jz    error                 ; --> Open failed
        mov   qword [filpoi],rax    ; Save the FILE pointer

; Test whether end of file has been reached
        mov   rcx,rax         ; The FILE pointer
        xor   rdx,rdx
        xor   r8,r8
        xor   r9,r9
        call  feof
;
        test  eax,eax         ; feof returns an int, so only eax is meaningful
        jnz   goteof          ; --> Yes it has

; Read the first twenty bytes of the file
        mov   rcx,buffer      ; Destination
        mov   rdx,1           ; unit or item
        mov   r8,20           ; Read 20 bytes
        mov   r9,qword [filpoi]     ; FILE *
        call  fread
;
        or    rax,rax
        jz    goteof          ; --> No readable bytes

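; Display those 20 bytes in hexadecimal. Note that r13 and r14 are
; non-volatile in the Win64 calling convention, so printf preserves them.
; (If fread returned fewer than 20 bytes, the rest shown are left over
; from earlier use of the buffer.)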
        mov   r13,buffer      ; First byte to be displayed
        mov   r14,buffer+20
disbak: mov   rcx,bytesmes
        xor   rdx,rdx
        mov   dl,[r13]
        xor   r8,r8
        xor   r9,r9
        call  printf
        inc   r13
        cmp   r13,r14
        jne   disbak
;
        mov   rcx,bmafter
        xor   rdx,rdx
        xor   r8,r8
        xor   r9,r9
        call  printf

; Close the file
goteof: mov   rcx,[filpoi]
        xor   rdx,rdx
        xor   r8,r8
        xor   r9,r9
        call  fclose

; Wait for a key-press - if the user types "x" we exit right away
        xor   rcx,rcx
        call  _getch
        cmp   al,'x'
        je    goout

; Otherwise we go on to find the next file, if there is one
        mov   rcx,[hfile]     ; The findfirst handle again
        mov   rdx,finddata64
        xor   r8,r8
        xor   r9,r9
        call  _findnext64
;
        test  eax,eax         ; _findnext64 returns an int: 0 means a file was found
        jz    filope          ; --> Found one, so go back and open it
;
        jmp   goout           ; --> No more files found

; ---------------------------------------------------

; Error exit
error:  mov   rcx,errmes
        xor   rdx,rdx   
        xor   r8,r8
        xor   r9,r9
        call  printf

; Normal exit
; Call _findclose if necessary to terminate the directory loop
goout:  mov   rcx,[hfile]
        or    rcx,rcx
        jz    noneed
        xor   rdx,rdx
        xor   r8,r8
        xor   r9,r9
        call  _findclose
noneed:

; Standard epilogue
        xor   rax,rax
;
        add   rsp,128
        pop   r13
        pop   r14
        pop   r15
        ret

; ====================================================================

section .data

wci     db      "*.txt",0 ; Wild-card will find all .txt files
ipmods  db      "rb",0    ; Modes for file binary read

        align   16
filpoi  dq      0         ; FILE pointer
hfile   dq      0         ; Handle from findfirst

; An instance of the structure finddata64_t. The initializations (the lines
; beginning with "at") are not always necessary.
finddata64:
  istruc finddata64_t
  at attrib,      dq  0
  at time_create, dq  0
  at time_access, dq  0
  at time_write,  dq  0
  at size,        dq  0
  at name,        db  0
iend

errmes   db   "Error",0
mess1    db   'Command line is ->%s<-',0x0d,0x0a, 'Please type your name: ',0
youent   db   'You entered "%s"',0x0d,0x0a,0x0d,0x0a,0
fileinfo db   'File name "%s," file size %lld',0x0d,0x0a,0
bytesmes db   '%2.2x ',0
bmafter  db   0x0d,0x0a, 0x0d,0x0a, 0x07,0    ; The 7 is a proper beep

; ====================================================================

section .bss

; Input buffer
        alignb  16
buffer: resb  buflen

; End of the programme ===============================================
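
One detail of the listing that may puzzle readers is why attrib is reserved as a quadword in the finddata64_t structure. A hedged way to check the layout is to mirror the structure in C and print the field offsets. The sketch below uses a hand-written stand-in named finddata64_mirror (a hypothetical name); the field types are an assumption based on the CRT's declaration in io.h, and the expected offsets assume a 64-bit target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the CRT's __finddata64_t, written out by hand so that it
   compiles anywhere. Assumed field types: a 32-bit attrib, three 64-bit
   times, a 64-bit size, then the name. */
struct finddata64_mirror {
    uint32_t attrib;
    int64_t  time_create;
    int64_t  time_access;
    int64_t  time_write;
    int64_t  size;
    char     name[260];
};

/* Print each field's offset and return the offset of `name`. */
size_t print_finddata_offsets(void)
{
    /* On a 64-bit target the compiler pads attrib out to 8 bytes so that
       time_create is 8-byte aligned, which is what "resq 1" allows for. */
    assert(offsetof(struct finddata64_mirror, time_create) == 8);

    printf("attrib      at %u\n", (unsigned)offsetof(struct finddata64_mirror, attrib));
    printf("time_create at %u\n", (unsigned)offsetof(struct finddata64_mirror, time_create));
    printf("time_access at %u\n", (unsigned)offsetof(struct finddata64_mirror, time_access));
    printf("time_write  at %u\n", (unsigned)offsetof(struct finddata64_mirror, time_write));
    printf("size        at %u\n", (unsigned)offsetof(struct finddata64_mirror, size));
    printf("name        at %u\n", (unsigned)offsetof(struct finddata64_mirror, name));

    return offsetof(struct finddata64_mirror, name);
}
```

If the offsets of size and name come out as 32 and 40, they agree with the five resq fields in the NASM struc, so finddata64+size and finddata64+name point where the CRT writes.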

And the batch file to build it:

Code: [Select]
c:\nasm\nasm -f win64 -Ox -Z Sample.err Sample.asm -l Sample.lst
if errorlevel 1 goto nasmfail
k:
cd \"Microsoft Visual Studio 10.0"\VC
call vcvarsall.bat x64
e:
link Sample.obj /subsystem:console /defaultlib:msvcrt.lib /defaultlib:kernel32.lib /entry:_start
if errorlevel 1 goto linkfail
pause
goto end

:nasmfail
:linkfail
@echo There were errors, so examine Sample.err
pause

:end

And here's the second one. A C program for comparison first:

Code: [Select]
/*
    An extremely simple "c" programme
*/
#include <stdio.h>

int main( void )
  {
  int numa, numb, numc;

  numa = 1;
  numb = 2;
  numc = numa + numb;

  printf( "The answer is %d\n", numc );   /* numc is an int, so %d rather than %lld */
  return 0;
  }

And the batch file to build it...

Code: [Select]
k:
cd \"Microsoft Visual Studio 10.0"\VC
call vcvarsall.bat x64
e:
cl Simplec.c
pause

Quote
After compilation the resultant executable file simplec.exe occupies 51,712 bytes on my hard drive (which of course will be rounded up to the next highest allocation unit, and becomes 53,248 bytes on my machine).
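
The rounding mentioned in the quote can be checked with a few lines of C. This is only a sketch: the function name size_on_disk is hypothetical, and the 4096-byte allocation unit is an assumption chosen because it matches the figures quoted.

```c
#include <assert.h>
#include <stdint.h>

/* Round a file size up to a whole number of allocation units.
   `unit` is the allocation-unit (cluster) size in bytes; 4096 is an
   assumption that matches the figures quoted in the thread. */
uint64_t size_on_disk(uint64_t size, uint64_t unit)
{
    assert(unit > 0);
    return (size + unit - 1) / unit * unit;
}
```

With these assumptions size_on_disk(51712, 4096) gives 53248, which is 13 units of 4096 bytes.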

Secondly, now, consider this assembly language programme, which calls a "c" function. Again the "c" function adds 1 to 2 and obtains the result 3. The only real difference between this function and the main programme above is that this is called from a main programme written in assembly language:

Code: [Select]
; Example of calling a "c" function from an assembly language main programme.

; ====================================================================

extern printf
extern _getch
extern trythis

; ====================================================================

default rel

; ====================================================================

section .code

[BITS 64]
global _start

; The conventional prologue
_start: mov   [rsp + 8],rcx
        push  r15
        push  r14
        push  r13
        sub   rsp,128           ; Stack space

; Send a C function two numbers to be added
        mov   qword [numa],1
        mov   rcx,[numa]
        mov   rdx,2
        xor   r8,r8
        xor   r9,r9
        call  trythis       ; The return value will come back in rax

; Display the return value
        mov   rcx,retmes
        movsxd rdx,eax      ; trythis returns an int, so sign-extend eax for %lld
        xor   r8,r8
        xor   r9,r9
        call  printf

; Wait for a key-press
        xor   rcx,rcx
        call  _getch

; Standard epilogue
        xor   rax,rax
;
        add   rsp,128
        pop   r13
        pop   r14
        pop   r15
        ret

; ====================================================================

section .data

numa    dq  0
retmes  db  'We are now back after calling the C function; it has returned %lld',0x0d,0x0a,0

; End of Programme ===================================================

And the C function to call...

Code: [Select]
/*
   An extremely simple "c" function
*/
#include <stdio.h>

int trythis( int numa, int numb )
  {
  int numc;

  numc = numa + numb;

  printf( "We are now in the c function and the answer is %d\n", numc );   /* int, so %d */
  return numc;
  }

And the batch file:

Code: [Select]
c:\nasm\nasm -f win64 -Ox -Z SampleCF.err SampleCF.asm -l SampleCF.lst
if errorlevel 1 goto nasmfail
k:
cd \"Microsoft Visual Studio 10.0"\VC
call vcvarsall.bat x64
e:
cl /c Simplecf.c
if errorlevel 1 goto cfail
link SampleCF.obj Simplecf.obj /subsystem:console /defaultlib:msvcrt.lib /defaultlib:kernel32.lib /entry:_start
if errorlevel 1 goto linkfail
pause
goto end

:nasmfail
:cfail
:linkfail
@echo There were errors!
pause

:end

Quote
After assembly and compilation the resultant executable file SampleCF.exe occupies just 4096 bytes on my hard drive. In other words, the version with a single "c" main programme takes 13 times the space of the version with an essentially identical "c" function called from assembly language. Odd is it not? Fishy even perhaps.

:) C adds some cruft to our program? I'm shocked, shocked I tell ya! :)

Quote
And of course the assembly code - if it is time critical - is guaranteed to run about five times faster than the "c" code.

Have you tested this? I'd be surprised if it were true. IME, C used to be much slower than asm, but recent versions (gcc, at least) are as fast or faster(!). Depends on what you're doing, of course.
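
For anyone who wants to test a speed claim like that, a rough timing harness is enough to get a first impression. The following is a sketch in C under stated assumptions: the names time_calls and add_two are hypothetical, add_two merely stands in for the routine being measured, and clock() is the only timer used.

```c
#include <assert.h>
#include <time.h>

/* Call `fn` `iterations` times and return the elapsed time in seconds
   as reported by the standard clock() function. */
double time_calls(long (*fn)(long), long arg, long iterations)
{
    assert(fn != NULL && iterations > 0);

    volatile unsigned long sink = 0;    /* stops the calls being optimised away */
    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        sink += (unsigned long)fn(arg + i);
    clock_t stop = clock();

    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}

/* Hypothetical workload standing in for the routine under test. */
long add_two(long n)
{
    return n + 2;
}
```

To compare the two approaches, the same routine would be written once in C and once in assembly, both linked into one program, and each passed to time_calls with the same iteration count. Several runs are needed before the numbers mean much.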

A couple of comments...

Quote
But escape sequences such as "\n" will not work.

Actually, if you use "back quotes" instead of single or double quotes to delimit your format string, Nasm (recent versions) will accept "\n" and the other usual C escape sequences. You do still need the explicit zero terminator!

Code: [Select]
section .code

Unless I'm mistaken, ".text" is the "known" name for a code section (these names are case sensitive!). I guess a "default" section has the right attributes ("treated like '.text'", the manual says) so it shouldn't make any difference...

Anyone interested in this should probably visit your original link to see your full comments.

Thanks for the examples! We're short of examples, especially for 64-bit.

Best,
Frank


Offline Sydney Grew

  • Jr. Member
  • Posts: 2
« Reply #2 on: December 25, 2011, 01:02:03 PM »
Many thanks for pasting all that stuff so expertly!

Offline fgenolini

  • Jr. Member
  • Posts: 2
« Reply #3 on: February 26, 2013, 09:27:18 AM »
For a simple C program that uses only the Win32 API (no C++, no CRT, ...), the generated object and executable code is comparable in size to the equivalent binaries produced from assembler code.

Compare:

Code: [Select]
#define VC_EXTRALEAN
#define WIN32_LEAN_AND_MEAN
#include <windows.h>

// For C only (no C++), use this entry point, as WinMain is too big and not required
EXTERN_C int WINAPI WinMainCRTStartup(void) {
  int retVal = MessageBoxA(0, "Hello, World!", "64 bit Windows C", 0);
  ExitProcess(retVal);
}

against:

Code: [Select]
bits 64
extern MessageBoxA
extern ExitProcess

section .text
  global main
main:
  sub rsp, 0x28         ; shadow space; also keeps rsp 16-byte aligned for the calls
  mov rcx, 0            ; hWnd = HWND_DESKTOP
  mov rdx, qword message        ; LPCSTR lpText
  mov r8, qword title           ; LPCSTR lpCaption
  mov r9d, 0            ; uType = MB_OK
  call MessageBoxA
  mov ecx, eax          ; uExitCode = MessageBox(...)
  call ExitProcess      ; never returns, so rsp stays adjusted for this call too

section .data
  title: db     "64 bit Windows assembler", 0
  message: db   "Hello, world!", 0

Both C and assembler can be made into EXE files of exactly the same size, running at seemingly the same speed.  Only the object files differ in size (the one from C being 1 KiB instead of 0.5 KiB from assembler).

To make this happen, specific compiler and linker options must be set.  Using a cygwin Makefile for GNU make:

Code: [Select]
PATH := /cygdrive/c/Program Files (x86)/Microsoft Visual Studio 11.0/VC/BIN/amd64:${PATH}
VCINCLUDE = "c:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\INCLUDE"
SDKINCLUDE = "c:\Program Files (x86)\Windows Kits\8.0\Include\um"
SDKSHAREDINCLUDE = "c:\Program Files (x86)\Windows Kits\8.0\Include\shared"
VCLIBPATH = "c:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\LIB\amd64"
SDKLIBPATH = "c:\Program Files (x86)\Windows Kits\8.0\Lib\win8\um\x64"

CC = cl.exe
COPTS = /nologo /TC /favor:INTEL64 /MT /GA /GR- /Ox /w /Y- /I ${VCINCLUDE} /I ${SDKINCLUDE} /I ${SDKSHAREDINCLUDE}

LD = link.exe
LDOPTS = /nologo /MACHINE:X64 /OPT:REF /OPT:ICF /nodefaultlib /SUBSYSTEM:WINDOWS
LDOBJS = /LIBPATH:${VCLIBPATH} /LIBPATH:${SDKLIBPATH} kernel32.lib user32.lib

ASM = /cygdrive/c/nasm/nasm.exe
ASMOPTS = -f win64 -Ox

all: test1.exe test2.exe

test1.exe: test1.obj
        ${LD} ${LDOPTS} /entry:main /OUT:test1.exe test1.obj ${LDOBJS}

test1.obj: test1.asm
        ${ASM} ${ASMOPTS} -o test1.obj test1.asm

test2.obj: test2.c
        $(CC) ${COPTS} /c test2.c

test2.exe: test2.obj
        ${LD} ${LDOPTS} /OUT:test2.exe test2.obj ${LDOBJS}
« Last Edit: February 27, 2013, 08:45:47 AM by fgenolini »

Offline Rob Neff

  • Forum Moderator
  • Full Member
  • Posts: 429
  • Country: us
« Reply #4 on: February 26, 2013, 05:20:50 PM »
I disagree with the original post: I have found that a C program does not need to be that big.

You are comparing apples to oranges.

I would prefer that this thread not degenerate into a "let's make it even smaller" discussion ( we've beaten that topic to death elsewhere on these forums ) but rather keep the OP's original intent of showing 64-bit Windows programming examples.

I would, however, caution the OP on making assertions regarding executable size/speed considerations until after considerable experience is acquired.   ;)

Edit:  Actually, I just realized this thread was necro'd.  Sheesh!  :o
« Last Edit: February 26, 2013, 05:27:27 PM by Rob Neff »

Offline fgenolini

  • Jr. Member
  • Posts: 2
« Reply #5 on: February 26, 2013, 07:19:17 PM »
Sorry, I did indeed get a warning that the thread was more than 120 days old.
I should have created a new thread and titled it on its own merit.

<<Win64 example code: comparable assembly and C code>>

The original post from 2011 made several points rather than just giving a single code example.  One of the points made was that you could make the resulting executable smaller by calling a C function from assembly instead of writing main in C.  This point is incorrect: if you use the correct entry point (such as WinMainCRTStartup) and the correct compiler and linker options, there is no need to preamble your code with assembly.
Assembly code shines elsewhere, not in the example given.

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • Posts: 2667
  • Country: us
« Reply #6 on: February 26, 2013, 07:44:02 PM »
Thanks for the example and compiler/linker options, regardless of how old the original post was!

Best,
Frank