OpenGL/OpenAL game in 100% NASM x86_64 Assembly

Rodrigo Robles:
When I first saw x86_64 I was amazed. 16 general-purpose 64-bit registers plus 16 128-bit floating-point registers is much more than a guy raised on the 6502 could imagine.

I thought it would be so easy to code in this Assembly that the effort would be close to writing C code. So some time ago I decided to write a small OpenGL/OpenAL game in 100% Assembly to measure the productivity and prove the viability of writing large programs in x86_64. In recent years I have made some small retro games for Android in JavaScript, so I could make a comparison.

I chose to make a revamp of the 1982 classic Attack of the Timelord. Here are the sources: https://gitlab.com/RodrigoRobles/trevaskas-2

Here is a screenshot of the game:


The graphics are quite simple because it is a retro game, but there is no obstacle to making larger games with fancy graphics in pure x86_64 Assembly.

As I expected, the productivity in hours/FP (for code that is not heavily optimized) was close to that of JavaScript, which shows that "basic" x86_64 is much easier to write than earlier 8-bit or x86 architectures. (Of course, optimized modern multithreaded SIMD code costs much more than ordinary x86_64 code.)

It proves Randall Hyde's point of view:
"Software engineers estimate that developers spend only about thirty percent of their time coding a solution to a problem. Even if it took twice as much time to write a program in assembly versus some HLL, there would only be a fifteen percent difference in the total project completion time. In fact, good assembly language programmers do not need twice as much time to implement something in assembly language."

Being happy with the results, I later wrote a paper on the subject of large x86_64 Assembly programs: https://drive.google.com/uc?id=1_fKS97tb0UzWJ0RqZXpfTA8odrkCK5bE&export=download

I also created an itch.io page: https://rodrigo-robles.itch.io/trevaskas-ii

You can see a video of gameplay here: https://youtu.be/GzBffhLwkR4

Deskman243:
If that library is closed source and based on C how is any of that possible?

fredericopissarra:
Just a couple of considerations on the source code (and your "paper")...

1. You don't need to align the stack pointer to a DQWORD (16-byte) boundary if the routine neither uses the stack nor calls another function. The sub rsp,8 / add rsp,8 pair as prolog and epilog isn't always necessary;

2. If you are loading a 32-bit value into a 64-bit register, use E?? instead of R??. Writing to a 32-bit register zero-extends the result into the full 64-bit register, and the instruction will be smaller and faster (since there's no REX prefix if registers below R8 are used). For example, instead of xor rax,rax, use xor eax,eax.

3. There's no real gain in using assembly for C-like routines, unless you are prepared to optimize the code in ways GCC can't, for example by using SSE4.2 for string routines. GCC does a better job with integer divisions, for instance, than simply using div/idiv (especially with literal divisors). I recommend considering writing freestanding routines in C.

4. Using -fno-pie goes against the SysV ABI for x86-64; you should consider using RIP-relative effective addressing in your code.
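
To illustrate point 3, here is a minimal C sketch of the kind of transformation GCC typically applies to an unsigned division by the literal 10: a multiplication by a precomputed reciprocal followed by a shift, instead of a div instruction. The function name div10 is hypothetical, and the constant is the commonly used one for 32-bit unsigned operands; the exact code GCC emits depends on its version and flags.

```c
#include <stdint.h>

/* Hypothetical helper: computes n / 10 without a div instruction.
   0xCCCCCCCD is ceil(2^35 / 10). The 64-bit product shifted right by 35
   equals the exact quotient for every 32-bit unsigned n. */
uint32_t div10(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}
```

In assembly this comes down to one imul and one shr, which is usually much cheaper than a div with the divisor loaded into a register.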

Overall the code is very good! Just for fun, I can try to optimize it my way and show you the result here, if there is interest in such a thing...

[]s
Fred

fredericopissarra:
Another thing... this:

--- Code: ---
  section .data
  ...
width:  dq 1
  ...
  section .text
  ...
  movq xmm0,[width]
  ...

--- End code ---
This will not load 1.0 (double) into XMM0, but the QWORD integer 1 (0x0000000000000001). The correct approach is to convert the integer representation to double, as in:

--- Code: ---
  ; The explicit qword size is needed because cvtsi2sd also accepts a dword source.
  cvtsi2sd xmm0,qword [width]

--- End code ---
The same applies the other way around:

--- Code: ---
  ; write the double as an integer
  cvtsd2si rax,xmm0  ; destination MUST be a register.
  mov [width],rax

--- End code ---
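
For readers more at home in C, the difference between the two loads can be sketched as follows. This is only an illustration under the assumption of IEEE 754 doubles; the function names load_bits and convert_value are hypothetical and do not come from the game's sources.

```c
#include <stdint.h>
#include <string.h>

/* Like movq xmm0,[width]: the 8 bytes are copied unchanged, so the
   integer's bit pattern is simply reinterpreted as a double. */
double load_bits(int64_t v)
{
    double d;
    memcpy(&d, &v, sizeof d);
    return d;
}

/* Like cvtsi2sd xmm0,qword [width]: the integer value is converted,
   so 1 becomes 1.0. */
double convert_value(int64_t v)
{
    return (double)v;
}
```

With v = 1, convert_value returns 1.0, while load_bits returns a tiny denormal number, because the bit pattern of 1.0 is 0x3FF0000000000000 and not 0x0000000000000001.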

And... NASM's default mode depends on the output format (for example, 16 bits for bin and 64 bits for elf64), so it is recommended that you tell the assembler explicitly that your code is 64 bits and uses RIP-relative addressing, at the beginning:

--- Code: ---
  bits 64
  default rel
--- End code ---
And every effective address loaded into a register should be loaded with LEA, like:

--- Code: ---
  mov eax,1        ; sys_write
  mov edi,eax      ; fd 1 = stdout
  lea rsi,[msg]    ; this is a rip relative effective address.
  mov edx,msg_size
  syscall
...
msg: db `Hello\n`
msg_size equ ($ - msg)
--- End code ---
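
For comparison, a rough C equivalent of the snippet above, assuming x86-64 Linux, where write() wraps system call number 1. The function name write_hello is hypothetical; the comments show which register each argument corresponds to in the assembly version.

```c
#include <unistd.h>

static const char msg[] = "Hello\n";

/* Returns the number of bytes written, or -1 on error. */
long write_hello(void)
{
    /* fd 1 = stdout (edi), buffer = msg (rsi), length = 6 (edx).
       sizeof msg - 1 drops the terminating NUL of the C string. */
    return (long)write(1, msg, sizeof msg - 1);
}
```

When compiled as position-independent code, GCC also loads the address of msg with a RIP-relative lea.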

Rodrigo Robles:

--- Quote from: Deskman243 on June 28, 2023, 11:51:54 AM ---
If that library is closed source and based on C how is any of that possible?

--- End quote ---

Are you talking about OpenGL and OpenAL?
They are not part of the project; the game calls these libraries to render graphics and play sound. Theoretically one could call the Linux audio and video drivers directly, but that would be really uncommon.
By the way, most (or all?) Linux distros use open-source libraries for this (libopengl, libopenal, freeglut).
