Maybe I'm not understanding how scas works, but isn't the result of scasb stored in rbx in 64-bit assembly? That's why I cleared it with XOR before using scasb and why I subtracted it from rdi (which has the starting address of the string). I'm probably wrong here though.
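SCASB doesn't store a result in any register, and RBX is not involved. It reads a byte from ES:RDI, compares it with AL, sets the flags accordingly, and then updates RDI (incremented when DF=0). With the REPNZ (REPNE) prefix it repeats at most RCX times, decrementing RCX each time, for as long as ZF=0 (hence the NZ). So
strlen could be implemented as: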
; Same as: size_t strlen( const char * );
; the function assumes ALL strings will be NUL terminated.
strlen_:
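  xor eax,eax        ; AL = 0, the byte to scan for.
  lea ecx,[rax-1]    ; RCX = 2³²-1: limits the scan to 2³²-1 bytes, max.
  mov rdx,rdi        ; Save the start of the string.
  repnz scasb        ; Scan for '\0'...
  sub rdi,rdx        ; Bytes scanned, including the '\0'.
  lea rax,[rdi-1]    ; returns size in RAX ('\0' not counted).
  ret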
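To make the effect of REPNZ SCASB on RDI and RCX concrete, here is a rough C model. It is only a sketch, not a description of the hardware: the names `scas_state` and `repnz_scasb` are made up for illustration, and it assumes DF=0 (forward scan).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the registers after the scan. */
struct scas_state {
    size_t   advanced;  /* how far RDI moved from its starting value */
    uint64_t rcx;       /* count left in RCX */
    int      zf;        /* 1 if the last compare matched AL */
};

/* Models REPNZ SCASB with DF=0. ZF starts at 0 in this model; real
   hardware leaves the flags untouched when RCX is already 0. */
struct scas_state repnz_scasb(const unsigned char *rdi,
                              unsigned char al, uint64_t rcx)
{
    assert(rdi != NULL);
    struct scas_state st = { 0, rcx, 0 };
    while (st.rcx != 0) {
        st.zf = (rdi[st.advanced] == al); /* SCASB: compare AL with [RDI] */
        st.advanced++;                    /* RDI moves on, match or not   */
        st.rcx--;                         /* REP decrements RCX each time */
        if (st.zf)                        /* REPNZ stops once ZF=1        */
            break;
    }
    return st;
}
```

For the string "abc" and AL=0 the model advances the pointer by 4 (three characters plus the terminator), so the string length is the advance minus one.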
Wouldn't the assembler get confused if I used 64-bit syscalls on 32-bit registers? Or if I put some arguments of a syscall in R?? registers and others in E?? registers?
The assembler won't get confused: E?? registers are the lower 32 bits of the R?? registers. And, in x86-64 mode, when you write to an E?? register the upper 32 bits of the corresponding R?? register are automatically zeroed... Instructions using R?? registers need an extra prefix byte (the REX prefix); with E?? registers no prefix is needed, so the code is shorter...
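The zero-extension rule can be modelled in a few lines of C. This is just an illustrative sketch: `write_r32`, `write_r16` and `demo` are hypothetical names, and the "registers" are plain 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Writing a 32-bit register (E??) replaces the whole 64-bit register:
   the upper 32 bits become zero. */
uint64_t write_r32(uint64_t old_r64, uint32_t value)
{
    (void)old_r64;           /* the old contents are discarded entirely */
    return (uint64_t)value;  /* zero-extended to 64 bits */
}

/* For contrast: writing a 16-bit register (e.g. AX) keeps bits 16..63. */
uint64_t write_r16(uint64_t old_r64, uint16_t value)
{
    return (old_r64 & ~(uint64_t)0xFFFF) | value;
}

/* Mirrors `xor eax,eax` followed by `lea ecx,[rax-1]` from the routine. */
void demo(void)
{
    uint64_t rax = write_r32(0x1122334455667788ULL, 0);  /* xor eax,eax */
    uint64_t rcx = write_r32(0xDEADBEEFCAFEF00DULL,
                             (uint32_t)(rax - 1));       /* lea ecx,[rax-1] */
    assert(rax == 0);
    assert(rcx == 0x00000000FFFFFFFFULL);                /* 2^32-1, not -1 */
}
```

This is why the two-instruction sequence in the routine leaves RCX holding 2³²-1 rather than a 64-bit -1, whatever RCX contained before.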
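Notice that in my routine, if '\0' isn't found within the 2³²-1 bytes scanned, REPNZ stops because RCX reaches zero, and the value returned sits at the very top of the 32-bit range (about 2³²-1) instead of being a real length. This allows you to test for an error: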
... Wouldn't this event be highly unlikely though? ...
You are right!... It is easier to assume the routine expects ALL strings to be zero terminated...