It seems ironic to me that UNIX is built on null-terminated strings everywhere except where they would actually be nice to use, like when calling its kernel functions. Why should we have to pass a length? Go look for the null, ya buttheads!
Actually, sys_write and sys_read are BINARY input/output routines. Yes, they are used to work with text on the console via the stdin, stdout, and stderr file descriptors, but keep in mind that you might not always be using text. Say, for example, you want to read the contents of a bitmap image into memory. In that case you really don't want sys_read to stop at the first null byte it sees. The same goes for when you want to dump that bitmap image back to disk.
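To make the binary point concrete, here is a minimal sketch (not from the original post) that hands sys_write a buffer with a zero byte in the middle. The file name bindemo.asm and the label blob are made up for illustration; the 32-bit int 0x80 convention is assumed, as in the rest of this thread.

```nasm
; build with : nasm -f elf bindemo.asm && gcc -nostartfiles -nostdlib bindemo.o -o bindemo
; (hypothetical file name; on a 64-bit host gcc will likely need -m32 too)
bits 32
__NR_exit  equ 1
__NR_write equ 4
STDOUT     equ 1

section .rodata
; Five bytes with a null in the middle, standing in for binary data
blob:      db 'A', 'B', 0, 'C', 10
blob_size  equ $-blob

section .text
global _start
_start:
    mov edx, blob_size      ; byte count: all 5 bytes, the 0 included
    mov ecx, blob           ; buffer address
    mov ebx, STDOUT
    mov eax, __NR_write
    int 0x80                ; kernel copies exactly edx bytes, no null scan

    xor ebx, ebx            ; exit status 0
    mov eax, __NR_exit
    int 0x80
```

Piping the output through something like od -c should show all five bytes, the zero included. If the kernel stopped at the null, the 'C' and the newline would never make it out.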
Why should we have to go looking for the null, when the length is known in most cases (not this one)? Zero-terminated strings are actually kind of a dumb data structure (IMO). Length-prefixed strings (a la Pascal) are usually more efficient! (programmer's choice again, except where we have to interface with C).
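For comparison, this is roughly what "go look for the null" costs when the length is not stored anywhere. It is only a sketch: strlen_z is a made-up helper name, and the scan is the plain byte-at-a-time version rather than anything tuned.

```nasm
; build with : nasm -f elf scandemo.asm && gcc -nostartfiles -nostdlib scandemo.o -o scandemo
; (hypothetical file name; on a 64-bit host gcc will likely need -m32 too)
bits 32
__NR_exit  equ 1
__NR_write equ 4
STDOUT     equ 1

section .rodata
msg: db "Hello, World!", 10, 0      ; ASCIIZ, no stored length

section .text
global _start

; strlen_z (hypothetical helper)
;   in : ecx = address of an ASCIIZ string
;   out: edx = number of bytes before the terminator (ecx preserved)
strlen_z:
    xor edx, edx
.next:
    cmp byte [ecx+edx], 0   ; touch every byte until the null turns up
    je  .done
    inc edx
    jmp .next
.done:
    ret

_start:
    mov ecx, msg
    call strlen_z           ; O(n) walk that a stored length would avoid
    mov ebx, STDOUT
    mov eax, __NR_write
    int 0x80

    xor ebx, ebx            ; exit status 0
    mov eax, __NR_exit
    int 0x80
```

With a length-prefixed string the whole strlen_z call collapses into a single mov from the length field.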
I completely agree about having the length of the string with the string itself. That is one of the few places where I tend to be "wasteful" and actually deal with string references rather than strings themselves.
; build with : nasm -f elf strdemo.asm && gcc -nostartfiles -nostdlib strdemo.o -o strdemo
; (on a 64-bit host, gcc will likely need -m32 as well)
bits 32

__NR_exit  equ 1
__NR_write equ 4
STDOUT     equ 1

section .rodata
; Our DB arrays and size equates
; (the -1 leaves the null terminator out of the length)
dbaHello: DB "Hello, World!", 10, 0
dbaHello_size EQU ($-dbaHello-1)
dbaBye: DB "Goodbye, World!", 10, 0
dbaBye_size EQU ($-dbaBye-1)

section .data
; "String" structures: a length followed by a pointer to the bytes
Hello:
.length DD dbaHello_size
.string DD dbaHello
Bye:
.length DD dbaBye_size
.string DD dbaBye

section .text
global _start
_start:
    ; write(STDOUT, Hello.string, Hello.length)
    mov edx, [Hello.length]     ; byte count
    mov ecx, [Hello.string]     ; buffer address
    mov ebx, STDOUT             ; file descriptor
    mov eax, __NR_write
    int 0x80

    ; write(STDOUT, Bye.string, Bye.length)
    mov edx, [Bye.length]
    mov ecx, [Bye.string]
    mov ebx, STDOUT
    mov eax, __NR_write
    int 0x80

    ; exit(0)
    xor ebx, ebx
    mov eax, __NR_exit
    int 0x80
In the above example I use $-label-1 because the length is 1 byte (the null terminator) less than the full ASCIIZ string. I still put the null terminator in my strings simply because I might someday use one with a C function, and as a good forward-thinking programmer I don't want those C functions to puke on me.
I started doing this when I was coding under Windows and wanted to keep all my strings themselves in the CONST section; later I just adapted the same practice to the .rodata section. It's mostly useful for dynamic strings, where you might need to resize the allocated space: you can set .length to hold the max space allocated for that string (if it's reached, just get more memory). Personally, I think it's a good trade-off. Yeah, I may be using up more space, but I'll never have to do a null byte search.
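Here is a rough sketch of the dynamic-string idea, under a few assumptions that are not in the post above: the structure gets a separate .capacity field next to .length (rather than reusing .length for the max), the buffer is a fixed 32-byte block in .bss, and append is a made-up helper. Growing the buffer when it fills up is left out; the sketch only detects the condition.

```nasm
; build with : nasm -f elf dyndemo.asm && gcc -nostartfiles -nostdlib dyndemo.o -o dyndemo
; (hypothetical file name; on a 64-bit host gcc will likely need -m32 too)
bits 32
__NR_exit  equ 1
__NR_write equ 4
STDOUT     equ 1
BUF_MAX    equ 32               ; assumed fixed allocation for the demo

section .bss
buf: resb BUF_MAX

section .data
; "String" structure with an extra capacity field (an assumption)
Msg:
.capacity DD BUF_MAX            ; max bytes the buffer can hold
.length   DD 0                  ; bytes currently in use
.string   DD buf

section .rodata
part1: db "Hello, "
part1_size equ $-part1
part2: db "World!", 10
part2_size equ $-part2

section .text
global _start

; append (hypothetical helper)
;   in : esi = source bytes, ecx = byte count
;   out: CF clear on success, CF set if the bytes would not fit
append:
    mov eax, [Msg.length]
    mov edx, eax
    add edx, ecx                ; edx = length after the append
    cmp edx, [Msg.capacity]
    ja  .full                   ; no room: this is where you'd get more memory
    mov edi, [Msg.string]
    add edi, eax                ; edi = first free byte in the buffer
    cld
    rep movsb                   ; copy ecx bytes from esi to edi
    mov [Msg.length], edx       ; record the new length
    clc
    ret
.full:
    stc
    ret

_start:
    mov esi, part1
    mov ecx, part1_size
    call append
    mov esi, part2
    mov ecx, part2_size
    call append

    mov edx, [Msg.length]       ; no null search needed, the length is tracked
    mov ecx, [Msg.string]
    mov ebx, STDOUT
    mov eax, __NR_write
    int 0x80

    xor ebx, ebx                ; exit status 0
    mov eax, __NR_exit
    int 0x80
```

Both appends fit in the 32-byte buffer here, so the program just prints the joined string. The cost is a couple of extra dwords per string, which is the trade-off described above.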