Sure, you can use ecx. But if this is supposed to be the setup for a sys_write, ecx has to hold the address of the buffer, and 40 isn't going to be valid memory to find the buffer in...
In my "print the matrix" routine, I used ecx as a pointer into the buffer as I filled it up with digits/characters, so it was supposed to be "all set" when I got to the sys_write, but I screwed up! I decremented ecx to the beginning of my buffer, but never stored a space at that last [ecx], so the first byte that gets written is whatever happened to be sitting there. A classic "fencepost" or "off-by-one" error. My bad! It kinda works okay anyway, but it definitely isn't right. I'll fix it if I'm in the mood...
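For illustration only, here is a minimal sketch of the pattern described above. It is not the actual "print the matrix" routine: the buffer name `buf`, its size, and the value 123 are all made up, and it assumes 32-bit Linux, NASM syntax, and the int 0x80 interface. The digits are stored right to left with ecx as the pointer, so when the loop finishes ecx already holds the address sys_write needs. The marked store is the step that is easy to forget.

```nasm
; Hypothetical sketch, assuming 32-bit Linux and NASM.
; Build: nasm -f elf32 sketch.asm && ld -m elf_i386 -o sketch sketch.o

section .bss
    buf     resb 12             ; made-up scratch buffer for one field

section .text
global _start
_start:
    mov     eax, 123            ; made-up value to print
    mov     ecx, buf + 12       ; ecx = one past the end of the buffer
    mov     ebx, 10             ; divisor
.digit:
    xor     edx, edx            ; clear high half of dividend
    div     ebx                 ; eax = quotient, edx = remainder
    add     dl, '0'             ; remainder -> ASCII digit
    dec     ecx                 ; step back one byte...
    mov     [ecx], dl           ; ...and store the digit there
    test    eax, eax
    jnz     .digit              ; more digits to go

    dec     ecx                 ; step back once more for the separator
    mov     byte [ecx], ' '     ; <-- the store that is easy to forget

    ; sys_write(1, ecx, length): ecx must be an address, not a number like 40
    mov     edx, buf + 12
    sub     edx, ecx            ; length = end of buffer - current pointer
    mov     ebx, 1              ; stdout
    mov     eax, 4              ; sys_write
    int     0x80

    mov     eax, 1              ; sys_exit
    xor     ebx, ebx            ; exit status 0
    int     0x80
```

Leave out the marked `mov byte [ecx], ' '` and the length still counts that leading byte, so the write starts with whatever was in the buffer at that position. With a zero-filled .bss buffer that is a 0 byte, which most terminals won't display, and that is the kind of thing that can make the bug look like it "kinda works".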
The "mov cx, 5" was left over from your code. Doesn't do any good, but it doesn't do any harm. I tried to leave your code alone (consistent with making it work with dwords) and just add the "print the matrix" routine so you could track down what was happening more easily... and I didn't even get that right... sigh...
Best,
Frank