Author Topic: NASM assembly and brainf*ck

Chris Lutz

  • Guest
NASM assembly and brainf*ck
« on: December 07, 2008, 03:25:10 PM »
(I don't normally agree with censorship, but I figured I'd be polite in the subject.)

I'm trying to create a program that translates brainfuck (http://en.wikipedia.org/wiki/brainfuck) to assembly, because anyone can translate brainfuck to C and assembly is smaller and faster. I have a .bss section of this:

section .bss
lst resb 100

Then in the text section, I have the following assembly equivalents for brainfuck:

> = inc bh
< = dec bh
+ = inc [byte lst+bh]
- = dec [byte lst+bh]
. = push 0x1
    push [byte lst+bh]
    push 0x1
    mov eax,0x4
    call _syscall
    pop eax
    pop eax
    pop eax
, = push 0x1
    push [byte lst+bh]
    push 0x0
    mov eax,0x3
    call _syscall
    pop eax
    pop eax
    pop eax
[ = BEG0: cmp [byte lst+bh], 0
    je END0
] = END0: cmp [byte lst+bh], 0
    jne BEG0

I'm on Mac OS X Leopard, using the nasm provided with Apple. It says in the info files the syntax for specifying size is [byte lst+bh] or whatever, but when I use that it gives me "error: operation size not specified." When I take out the word "byte" and just have [lst+bh], it gives me the same error. When I put "byte [lst+bh]" it gives me some other error, and seems to be complaining about that more.

Why the hell is this happening, how can I fix it, and how soon do I need to switch to Linux/BSD?

Frank Kotler

  • NASM Developer
Re: NASM assembly and brainf*ck
« Reply #1 on: December 08, 2008, 02:35:46 AM »
I think if I wanted to translate from brainfuck to Norwegian, I'd learn Norwegian first...

[lst + bh] isn't going to be a valid addressing mode, no matter how much bling you put around it: an effective address can only use 16- or 32-bit registers, never an 8-bit one like bh. Can you arrange to use ebx? [lst + ebx] would be valid. You may still need to specify a size. In "cmp [lst + ebx], al", the register specifies the size; in "cmp byte [lst + ebx], 0" (or "cmp [lst + ebx], byte 0"), NASM needs to be told the size. In general, you don't want a size specifier inside the brackets (it does something different: there it sets the size of the offset, not of the operand).
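
As a rough sketch of how the brainfuck operations might look with ebx as the cell index: this is untested, assumes 32-bit code, the "lst resb 100" buffer from the question, ebx zeroed at startup, and a "_syscall" helper that just does "int 80h / ret". The BEG0/END0 labels are taken from the question; a real translator would have to generate a fresh pair for every loop.

;---------------------------
; ">" and "<": move to the next/previous cell
inc ebx
dec ebx

; "+" and "-": the size specifier goes outside the brackets
inc byte [lst + ebx]
dec byte [lst + ebx]

; ".": write the current cell to stdout
; (assumes the syscall leaves ebx alone)
lea eax, [lst + ebx] ; address of the current cell, not its contents
push 1 ; count
push eax ; buffer
push 1 ; stdout
mov eax, 4 ; __NR_write
call _syscall
add esp, 12

; "[": skip the loop if the current cell is zero
BEG0:
cmp byte [lst + ebx], 0
je END0

; ... translated loop body goes here ...

; "]": go around again if the current cell is non-zero
END0:
cmp byte [lst + ebx], 0
jne BEG0
;---------------------------

The "," operation would presumably be the same as "." with "push 0" for stdin and "mov eax, 3" for __NR_read.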

I'm not sure what you're trying to do... You've got an uninitialized (actually initialized to zero) buffer, "lst". You print one character, then overwrite the character with a character read from stdin. sys_write and sys_read expect the address of a buffer to write from/read into. What's supposed to be in "lst"? Characters, or (dword) addresses of characters? Printing zeros isn't going to do anything, printing from address zero is going to segfault. Which would you like? :)

Reading first and then printing the character might make more sense - not a lot, it'll just go "aabbccdd", but just to get the syntax...

;---------------------------
section .bss

buf resb 100

section .text
global _start
_start:  ; is this right for MacOSX?

mov ebx, buf ; address of buffer

top:
push 1 ; count
push ebx ; buffer
push 0 ; stdin
mov eax, 3 ; __NR_read
call _syscall
pop eax
pop eax
pop eax
; or add esp, 12
; or lea esp, [esp + 12]

; give the poor bastard a chance to quit
cmp byte [ebx], 'q'
je exit


push 1 ; count
push ebx ; buffer
push 1 ; stdout
mov eax, 4 ; __NR_write
call _syscall
pop eax
pop eax
pop eax
; or...

inc ebx
cmp ebx, buf + 100
jne top

exit:
push 0 ; exit code
mov eax, 1 ; __NR_exit
call _syscall
; no need to clean up stack :)
;--------------------

;----------------------
_syscall:
int 80h
ret
;----------------------

That's untested (haven't got a Mac), but it should be close to what I "understand" OSX wants - typos and brainos excepted. The way sys_read works, the poor user is going to have to type a character, hit "enter", type another character, hit "enter"... If he types multiple characters before hitting enter, sys_read still won't return until he hits enter. Only one character will appear in our buffer (at a time), but the excess characters will stay in the system buffer. We print one, and when sys_read comes around, read another... So far, so good. But suppose the pesky user types "abcqls"? We quit on 'q', the remaining characters ("ls" and the linefeed) show up on our command prompt, and we see a directory listing. Oops! Not exactly what we intended!
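
One possible way around that, sketched under the same assumptions as the code above (untested, 32-bit, stack-based arguments, "_syscall" doing "int 80h / ret"): instead of jumping straight to "exit" on 'q', jump to a hypothetical "drain" label that keeps reading single characters until it has swallowed the linefeed. The carry-flag test relies on what I understand to be the BSD convention for reporting errors; treat that as an assumption too.

;---------------------------
; use "je drain" in place of "je exit" after the 'q' test
drain:
push 1 ; count
push ebx ; buffer (reuse the current byte)
push 0 ; stdin
mov eax, 3 ; __NR_read
call _syscall
lea esp, [esp + 12] ; clean up the stack without touching flags
jc exit ; carry set = error (assumed BSD convention)
cmp eax, 1 ; anything other than one byte read (e.g. 0 at EOF)?
jne exit
cmp byte [ebx], 10 ; linefeed?
jne drain ; no - keep swallowing
jmp exit
;---------------------------

The "lea" is used here rather than "add esp, 12" only because "add" would overwrite the carry flag before we get to test it.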

Ahhh... gotta go - more later...

Best,
Frank