Topic: My own 64-bit `puts' instruction (No length required)

Offline MediocreVeg1

  • Jr. Member
  • *
  • Posts: 26
My own 64-bit `puts' instruction (No length required)
« on: January 13, 2021, 06:32:21 AM »
So I've been having a bit of difficulty with procedures, and the people on this forum helped a lot (and helped me understand 64-bit assembly in general). In return, I want to help others who may be having the same difficulty with the `puts' procedure I made. The main advantage of the procedure, not having to pass the length, relies on another procedure, `strlen', which calculates the length of the string (not including the terminating 0 character). It takes rdi (which should hold the address of the string) as a parameter and returns the length in rax. So here it is:
Code: [Select]
strlen:
    xor rax, rax ; Return value, will count string length
    dec rdi
.cntloop:
    inc rax
    inc rdi
    cmp byte [rdi], 0 ; Terminating character
    jnz strlen.cntloop

    dec rax ; So that it does not include the last terminating character
    ret
Thanks to this, the actual printing procedure is really simple. It also takes rdi as a parameter and prints the string that rdi points to:
Code: [Select]
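puts:
    mov rsi, rdi ; buf: address of the string (strlen changes rdi)
    call strlen
    mov rdx, rax ; count: length returned by strlen
    mov rax, 1 ; syscall number for write
    mov rdi, rax ; fd 1 = stdout
    syscall
    ret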

Note that the string doesn't need to have a 0 character at the end for this to work, but it would be better if you did put it.

This isn't as advanced as the `puts' macro I made, which didn't require the argument to be a variable and could take any number of arguments, but it couldn't print variables in .bss, and using too many macros is probably a bad idea. Anyway, do give me any suggestions to improve the code if you feel it could be better.
« Last Edit: January 13, 2021, 06:37:15 AM by MediocreVeg1 »

Offline MediocreVeg1

  • Jr. Member
  • *
  • Posts: 26
Re: My own 64-bit `puts' instruction (No length required)
« Reply #1 on: January 13, 2021, 07:47:26 AM »
UPDATE: I realised an issue with the procedure related to the way section .data works. From what I can tell, if you do this:
Code: [Select]
achar db "H"
bchar db "F"
the variables will be stored side by side in memory with nothing in between, so there is no terminating character between them. This means that if you print achar, it'll print bchar right after it (and keep going until it happens to hit a 0 byte). Because of this, I HIGHLY RECOMMEND YOU PUT THE NULL CHARACTER YOURSELF AT THE END OF EACH STRING. Apart from that, the rest is fine.
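
For anyone who wants to see the same effect outside the assembler, here is a rough C sketch. The array name `data` and the helper `count_until_nul` are made up for illustration: the two letters sit next to each other in one array, the way achar and bchar do in .data, and only the 0 placed by hand at the end stops the scan.

```c
#include <stddef.h>

/* Hypothetical stand-in for the .data layout: 'H' and 'F' are adjacent,
   with a single NUL placed by hand after the second byte. */
static const char data[] = { 'H', 'F', 0 };

/* Same idea as the byte-by-byte strlen in the first post:
   count bytes until a 0 is found. */
size_t count_until_nul(const char *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

Starting at `data` (the achar position) the count is 2, because the scan runs straight through 'F'. Starting at `data + 1` it is 1.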

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: My own 64-bit `puts' instruction (No length required)
« Reply #2 on: January 13, 2021, 09:23:09 PM »
Looks as if you've reinvented "gets". No protection against overflow.  Take it away!

Best,
Frank


Offline MediocreVeg1

  • Jr. Member
  • *
  • Posts: 26
Re: My own 64-bit `puts' instruction (No length required)
« Reply #3 on: January 14, 2021, 06:50:31 AM »
What do you mean by that? I actually haven't tried to make a `gets' without specifying the size because I'm not sure if that is possible with how I'm getting length currently.

Offline MediocreVeg1

  • Jr. Member
  • *
  • Posts: 26
Re: My own 64-bit `puts' instruction (No length required)
« Reply #4 on: January 14, 2021, 06:57:51 AM »
Oh right, if you are talking about the running on into the next variable thing, yeah, it is a bit annoying. If you really can't put the 0 at the end yourself, you can override db to automatically append a 0 with this macro:
Code: [Select]
%macro db 1+
    db %1, 0
%endmacro
(The plus makes it a greedy parameter, so %1 will include all the arguments.)
If you don't want to override db, or don't want this applied to every db line, you can give the macro a different name instead.

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: My own 64-bit `puts' instruction (No length required)
« Reply #5 on: January 14, 2021, 10:33:51 PM »
My Mistake. Sorry.

Best,
Frank


Offline MediocreVeg1

  • Jr. Member
  • *
  • Posts: 26
Re: My own 64-bit `puts' instruction (No length required)
« Reply #6 on: January 15, 2021, 08:53:12 AM »
Oh no it's really fine, I agree that the thing can be quite annoying, no denying that. Perhaps I should make another procedure that could allow input for size (basically normal printing but a bit shorter to call). My strlen also seems a bit brute force-ish, looping through the whole string, so I might try to change the code for that too.

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 373
  • Country: br
Re: My own 64-bit `puts' instruction (No length required)
« Reply #7 on: January 15, 2021, 03:36:20 PM »
If you intend to mimic the puts() function from libc, there are 2 things missing:

  • it prints an extra '\n' at the end of the string;
  • it returns a non-negative value on success (glibc returns the number of chars written) or -1 (EOF) in case of error.

A better approach:
Code: [Select]
  extern strlen

; int puts_( char * );
;
; puts() returns the number of chars written or -1 if error.
;
; Entry: RDI points to string
  align 16
puts_:
  push  rbx

  mov   rbx,rdi

  ; glibc strlen() is 100 times faster than that
  ; handmade routine.
  call  strlen wrt ..plt
 
  mov   edx,eax
  mov   rsi,rbx
  mov   ebx,edx     ; save for later.
  mov   eax,1
  mov   edi,eax
  syscall

  test  rax,rax
  js    .error

  ; puts() prints a final '\n'.
  mov   byte [rsp-8],`\n`     ; Use the 'red-zone'.
  mov   eax,1
  mov   edi,eax
  mov   edx,eax
  lea   rsi,[rsp-8]
  syscall
 
  test  eax,eax
  js    .error

  lea   eax,[rbx+1]
  pop   rbx
  ret

.error:
  mov   eax,-1
  pop   rbx
  ret
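
For comparison, here is a rough C sketch of the same flow. The name `puts_fd` and the file descriptor parameter are assumptions added for illustration, so the error path is easy to try. One difference to keep in mind: the raw Linux syscall reports failure as a negative errno value in RAX (hence the `js .error` checks), while the libc `write()` wrapper returns -1 and sets `errno`.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: like the assembly above, but with the file
   descriptor as a parameter instead of a hard-coded stdout. */
int puts_fd(int fd, const char *s)
{
    size_t len = strlen(s);

    if (write(fd, s, len) < 0)      /* libc returns -1 and sets errno */
        return -1;

    if (write(fd, "\n", 1) < 0)     /* puts() adds a final newline */
        return -1;

    return (int)len + 1;            /* string bytes plus the newline */
}
```

Like the assembly, this sketch does not retry partial writes. A bad descriptor is one simple way to reach the error path: `puts_fd(-1, "hi")` returns -1 with `errno` set to EBADF.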

Offline MediocreVeg1

  • Jr. Member
  • *
  • Posts: 26
Re: My own 64-bit `puts' instruction (No length required)
« Reply #8 on: January 15, 2021, 03:48:32 PM »
I actually didn't intend to mimic the function, but the extra features do seem helpful. I agree about my strlen being slow too, maybe I'll try a different approach with scasb or something similar. There are definitely some things I need to learn from this example (I tried so much to print a character without making a variable and now I hear about this "red zone"!) and I will try to improve on this. I also will probably want to make another procedure for the newline in case you want to print, for example, a greeting and a variable <name> on the same line. Java has a println() and print() and I've always found this helpful.
Also, do you know what kind of errors could arise from puts?
« Last Edit: January 15, 2021, 04:00:39 PM by MediocreVeg1 »

Offline MediocreVeg1

  • Jr. Member
  • *
  • Posts: 26
Re: My own 64-bit `puts' instruction (No length required)
« Reply #9 on: January 15, 2021, 05:34:19 PM »
[UPDATE] I basically remade my strlen procedure and I think it should be faster now. Here's the code:
Code: [Select]
strlen:
    mov rcx, 0xff ; (A string may be larger than 255 characters, in which case this would have to increase)
    xor rbx, rbx
    repnz scasb ; Will store in rbx
    sub rdi, rbx
    mov rax, rdi
    dec rax ; So as not to include 0
    ret

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 373
  • Country: br
Re: My own 64-bit `puts' instruction (No length required)
« Reply #10 on: January 17, 2021, 03:36:53 PM »
I've been trying to post this reply for the last 2 days.

Actually, my measurements were about a similar routine using rep/scasb against glibc's strlen().

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 373
  • Country: br
Re: My own 64-bit `puts' instruction (No length required)
« Reply #11 on: January 17, 2021, 04:07:00 PM »
The actual routine I've tested was this one:
Code: [Select]
  bits  64
  default rel

  section .text

; size_t strlen_( char * );
  global  strlen_
strlen_:
  xor   eax,eax
  mov   ecx,-1      ; Limit the scan to 2³²-1 bytes
  repnz scasb
  jnz   .not_found
  not   ecx
  dec   ecx
  mov   eax,ecx
  ret
.not_found:
  mov   rax,-1      ; Return maximum length if
                    ; NUL char not found.
  ret

Offline MediocreVeg1

  • Jr. Member
  • *
  • Posts: 26
Re: My own 64-bit `puts' instruction (No length required)
« Reply #12 on: January 18, 2021, 08:51:33 AM »
Interesting, I'm not exactly sure how you've used not here (Probably way faster than my sub alternative), but I'll try to figure it out. I'll also incorporate your error handling into my procedure. Thanks for the example!
As for the print statements, I did end up making separate puts and putsln procedures. Still not sure what kind of error could arise from them (I guess one could be that if strlen failed, it would return the same error code).

Offline fredericopissarra

  • Full Member
  • **
  • Posts: 373
  • Country: br
Re: My own 64-bit `puts' instruction (No length required)
« Reply #13 on: January 18, 2021, 02:22:02 PM »
Quote
Interesting, I'm not exactly sure how you've used not here (Probably way faster than my sub alternative), but I'll try to figure it out.
Not necessarily 'faster'. After REPNZ SCASB has consumed the string and its NUL, ECX holds 2³²-1-(len+1), and since 2³²-1-x is the same as ~x for a 32-bit x, NOT ECX gives len+1. I just used this fact to calculate the string length. (DEC ECX because we're excluding the final NUL char.)
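
A small C model of that calculation, for anyone who wants to check the arithmetic. The function name `strlen_via_not` is made up, and the sketch assumes the string really has a terminating NUL (it does not model the counter running out).

```c
#include <stdint.h>

/* Hypothetical model of the routine: `count` plays the role of ECX,
   starting at 2^32-1 and dropping by one per byte scanned,
   including the terminating NUL. */
uint32_t strlen_via_not(const char *s)
{
    uint32_t count = UINT32_MAX;        /* mov ecx,-1 */

    for (;;) {                          /* repnz scasb */
        count--;
        if (*s++ == 0)
            break;
    }

    /* count == 2^32-1 - (len+1), so ~count == len+1 */
    return ~count - 1u;                 /* not ecx / dec ecx */
}
```

For "hello", six bytes are scanned (five letters plus the NUL), so `count` ends at 2³²-1-6, `~count` is 6, and the result is 5.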

I found your approach strange, since sub rdi,rbx (with rbx == 0) does nothing except affect the flags. Didn't you mean to use 'mov rbx,rdi' instead of 'xor rbx,rbx'?

PS: Try to use E?? registers as often as possible. Using R?? will imply a REX prefix and bigger instructions. Moving to, or doing arithmetic/logical operations on, E?? registers will automatically zero the upper 32 bits of the R?? registers for you (this doesn't work in a few instructions, for example CDQ: it will zero upper RDX but not upper RAX).

PS: Notice in my routine, if '\0' isn't found in a block of 2³²-1 bytes it returns -1 (all bits set) in RAX. This allows you to test an error:
Code: [Select]
size_t size = strlen_( str );
if ( (long)size < 0 ) { /* handle error */ }
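
The same "return all bits set when no NUL is found" convention can be sketched in plain C. The name `strlen_bounded` and its `limit` parameter are assumptions for illustration, not part of the routine above; the sketch uses the standard `memchr` to do the scan.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical bounded length: scans at most `limit` bytes and returns
   (size_t)-1 if no NUL is found, mirroring the .not_found path. */
size_t strlen_bounded(const char *s, size_t limit)
{
    const char *nul = memchr(s, 0, limit);
    return nul ? (size_t)(nul - s) : (size_t)-1;
}
```

The caller should keep `limit` within the size of the buffer. The `(long)size < 0` test then works the same way on a 64-bit Linux target, where long and size_t have the same width.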
« Last Edit: January 18, 2021, 02:29:08 PM by fredericopissarra »

Offline MediocreVeg1

  • Jr. Member
  • *
  • Posts: 26
Re: My own 64-bit `puts' instruction (No length required)
« Reply #14 on: January 18, 2021, 03:06:55 PM »
Quote
I found your approach strange, since sub rdi,rbx (with rbx == 0) does nothing except affect the flags. Didn't you mean to use 'mov rbx,rdi' instead of 'xor rbx,rbx'?
Maybe I'm not understanding how scas works, but isn't the result of scasb stored in rbx in 64-bit assembly? That's why I cleared it with XOR before using scasb and why I subtracted it from rdi (which has the starting address of the string). I'm probably wrong here though.

Quote
PS: Try to use E?? registers as often as possible. Using R?? will imply a REX prefix and bigger instructions. Moving to, or doing arithmetic/logical operations on, E?? registers will automatically zero the upper 32 bits of the R?? registers for you (this doesn't work in a few instructions, for example CDQ: it will zero upper RDX but not upper RAX).
Wouldn't the assembler get confused if I used 64-bit syscalls on 32-bit registers? Or if I put some arguments of a syscall in R?? registers and others in E?? registers?

Quote
PS: Notice in my routine, if '\0' isn't found in a block of 2³²-1 bytes it returns -1 (all bits set) in RAX. This allows you to test an error:
Yeah, I put it into my procedure as well after you showed your example. Wouldn't this event be highly unlikely though? I think 2^32-1 is like 4294967295 bytes so every single byte after the starting address of the string would have to be non-zero, right?