NASM - The Netwide Assembler
NASM Forum => Other Discussion => Topic started by: Mixolydian on June 01, 2014, 07:56:39 PM
-
Crambly is a "wrapper language" for NASM syntax assembly.
It provides several higher-level abstractions, such as naming local variables and function parameters, calling functions with all of the parameters in the correct order on a single line, and C-style "\n" escapes in strings.
Also note that Crambly tries to recognise and retain compatibility with original NASM syntax, so you can mix and match syntaxes in the same file; see the example backwardsCompatibility (https://github.com/CTurt/Crambly/blob/master/examples/backwardsCompatibility/backwardsCompatibility.cram).
Sample Crambly program:
section .data
addingString: db "Adding %d and %d\n", 0
resultString: db "Result is %d\n", 0
section .bss
result: resd 1
section .text
@global addFunction(number1, number2) {
call _printf(addingString, dword [number1], dword [number2])
add esp, 3*4
mov eax, [number1]
add eax, [number2]
}
@global _main() {
locd variable1
locd variable2
mov dword [variable1], 5
mov dword [variable2], 2
call addFunction(dword [variable1], dword [variable2])
mov dword [result], eax
call _printf(resultString, dword [result])
add esp, 2*4
}
Which Crambly converts to:
extern _printf
global addFunction
global _main
section .data
addingString: db "Adding %d and %d", 10, "", 0
resultString: db "Result is %d", 10, "", 0
section .bss
result: resd 1
section .text
addFunction:
push ebp
mov ebp, esp
and esp, -16
push dword [ebp+12]
push dword [ebp+8]
push addingString
call _printf
add esp, 3*4
mov eax, [ebp+8]
add eax, [ebp+12]
mov esp, ebp
pop ebp
ret 8
_main:
push ebp
mov ebp, esp
and esp, -16
sub esp, 16
mov dword [ebp-4], 5
mov dword [ebp-8], 2
push dword [ebp-8]
push dword [ebp-4]
call addFunction
mov dword [result], eax
push dword [result]
push resultString
call _printf
add esp, 2*4
mov esp, ebp
pop ebp
ret 0
The output from this program is:
Adding 5 and 2
Result is 7
Download is available from GitHub (https://github.com/CTurt/Crambly). Please give feedback.
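To illustrate the "parameters in the correct order and on the same line" idea, here is a minimal Python sketch of how a call line could be expanded into pushes. It is not Crambly's actual C source; the names CALL_RE and expand_call are hypothetical, and it assumes arguments never contain commas.

```python
import re

# Hypothetical pattern for a Crambly-style call such as:
#   call _printf(resultString, dword [result])
CALL_RE = re.compile(r"^\s*call\s+(\w+)\((.*)\)\s*$")

def expand_call(line):
    """Return the NASM lines for one Crambly call, or [line] if it is not one."""
    match = CALL_RE.match(line)
    if match is None:
        return [line]  # plain NASM: pass through untouched
    name, arg_text = match.groups()
    # Assumption: no argument contains a comma (e.g. no string literals).
    args = [a.strip() for a in arg_text.split(",") if a.strip()]
    # Arguments are pushed right to left, so the first one ends up on top.
    pushes = ["push " + arg for arg in reversed(args)]
    return pushes + ["call " + name]

for out in expand_call("call _printf(resultString, dword [result])"):
    print(out)
```

The three printed lines correspond to the push/push/call sequence in the generated _main above. The "add esp" cleanup is left to the user, as in the sample program.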
-
Cool!
One more thing to think about is how to define and access function arguments, like:
section .data
messageformat: db "%s: %s",0
section .text
printmessage title, message {
call _printf messageformat, title, message
}
OR
printmessage(title, message) {
call _printf messageformat, title, message
}
Well, I don't know. You are the artist here, and you should know how to define and pass function arguments! :)
Did you know that there is also NASMX, which already does what you are trying to do?
One more thing: you are using the "LEAVE" instruction, so note that there is also an "ENTER" instruction.
I don't know exactly how the "ENTER" instruction works, but it saves bytes and works roughly like this:
section .data
message: db "Hello, world! %d", 10, 0
variable: dd 0 ; dword, so its value can be passed for %d
section .text
main:
enter 0,0
push dword [variable] ; push the value, not the address
push message
call _printf
leave
ret
I have never used the "ENTER" instruction, so the above example may need adjusting before it assembles.
Basically it is a built-in instruction that does several things in one, like "push ebp" and "mov ebp, esp".
Bye.
-
I didn't know about NASM-X. I'll take a look at that later, this is basically just an exercise to help me learn assembly though.
That's a nice suggestion about being able to pass function arguments, I was thinking about doing that but it might take me a while to implement.
Thanks for telling me about the enter instruction! The key to good code is consistency: if I am using leave but not using enter it doesn't look very clean. I've updated the project to reflect this (see compiling with -a).
I've thought of a few more things which I may potentially implement, check the first post.
-
Using "leave" but "hand coding" the equivalent to "enter" is common. I am told that "enter" is "vector path" (so slower, though smaller) but "leave" is "direct path" (so smaller, but no slower). I don't know much about optimization at this level. I suggest you don't worry about it too much at this stage. The first operand to "enter" is the number of bytes to subtract from esp for local variables (0 if you haven't got any). The second operand is "lex level" - I suggest you just make it 0 unless you're looking for a puzzle.
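As a rough illustration of the "hand coded" equivalent, here is a small Python sketch of what a generator could emit in place of "enter N, 0" and "leave". The function names are made up for this example, and it only covers lex level 0.

```python
def expand_enter(local_bytes):
    """Hand-coded equivalent of `enter local_bytes, 0` (lex level 0 only)."""
    lines = ["push ebp", "mov ebp, esp"]
    if local_bytes:
        # The first operand of enter: bytes reserved for local variables.
        lines.append("sub esp, %d" % local_bytes)
    return lines

def expand_leave():
    """Hand-coded equivalent of `leave`."""
    return ["mov esp, ebp", "pop ebp"]

print("\n".join(expand_enter(16) + ["; function body"] + expand_leave()))
```

With local_bytes set to 0 the "sub esp" line is dropped, which gives the familiar two-instruction prologue.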
Looking at NASMX might be worthwhile to see how they do it, but it's a "macro set" and it sounds like you're taking a different approach (your wrapper written in asm?). Good start, in any case. Keep us posted.
Best,
Frank
-
Thanks Frank, I have taken a look at NASMX and it does seem to be a slightly different project. The good thing with mine is that you can convert your Crambly to real assembly and stop using my tool at any point; with NASMX, once you've started to use it, it becomes part of your code. Being written as an external tool also gives the project much more flexibility than a set of macros, so I should be able to implement functionality which isn't possible in NASMX. This is written in C since I've only been learning assembly for a few days.
-
... it will also find functions which you have created but not declared and declare them as global in the header.
Hi,
why do you think it would be a benefit to automatically declare all functions global (i.e. public)? I don't think it is.
I would prefer a more flexible way. Maybe with new directives, e.g. something like this:
@global _main {
call _printf message, dword [variable]
}
greetings
-
When I implemented the "automatically declaring all functions global" I was wondering if it was overkill.
I really like your suggestion of:
@global _main {
call _printf message, dword [variable]
}
I will switch to this method.
Another thing I just want to say is that I will try to make crambly compatible with NASM code so that you will be able to just use the parts of crambly which you want - or if you are converting assembly to crambly you will be able to do it in parts. For example, if you have the below code:
_main:
enter 0, 0
push dword [variable]
push message
call _printf
leave
ret
Crambly will not alter it in any way.
However, code which uses crambly syntax such as the function posted earlier in this post will be parsed.
EDIT: I have implemented the @global keyword.
-
What's the feature you are working on now?
-
At the moment I'm just cleaning up the code. It's gotten quite messy and I want to be able to make this open source eventually so I think it's important that I do this before continuing.
The next addition will either be the ability to access function arguments by their name, so something like this:
printNumber dword number {
call _printf numberFormat, number
}
Or local variables:
addOne dword number {
resd localNumber
mov localNumber, number
add localNumber, 1
mov eax, localNumber
}
Note, these are just examples of how it might work. I have not even started to add this in yet.
-
Yes, you are right, simple replacement won't work. There are a lot of traps awaiting you, e.g. your utility mustn't confuse labels with your local variable feature.
cmp eax, number
...
; a thousand lines of assembly code
...
jz number
-
For the time being at least, I think it will be a requirement that variable names cannot be function names; when I have to distinguish whether a parameter is a label or a variable the whole project gets too complex - especially this early on.
Crambly already has flaws which I am aware of, e.g.:
call _printf xor eax, eax
Crambly will treat "xor eax, eax" as parameters, when it should be a separate instruction.
So as you can tell, I have bugs of higher priority.
I was able to fix a bug today though, the below syntax used to not work, but now it does:
functionName
{
; some people prefer this syntax
}
Personally I like to write code like this:
functionName { ;the { on the same line as the function name
xor eax, eax ; indented with a tab, parameters separated with spaces
}
However I will do testing with different syntaxes.
-
For the time being at least, I think it will be a requirement that variable names cannot be function names; when I have to distinguish whether a parameter is a label or a variable
But your tool must always detect those situations, because if it silently runs over such things, the user will throw it out of the window.
call _printf xor eax, eax
I think your tool should terminate with an errorcode, because it's malformed code.
functionName
{
; some people prefer this syntax
}
functionName { ;the { on the same line as the function name
xor eax, eax ; indented with a tab, parameters separated with spaces
}
There is no difference between these two coding styles!
Both match a single rule:
function ::= <function_name> [whitespaces] '{'<all_the_function_stuff>'}'
A function starts with a function name,
followed by optional whitespace,
followed by a '{' sign,
followed by the function body,
and finally a '}' sign.
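The rule above can be sketched with a regular expression. This is only an illustration in Python, not how Crambly is implemented; FUNCTION_RE is a made-up name, and the pattern assumes no nested braces and no braces inside comments or strings.

```python
import re

# function ::= <function_name> [whitespaces] '{' <all_the_function_stuff> '}'
# \s* also matches newlines, so '{' may sit on the name's line or the next one.
FUNCTION_RE = re.compile(r"(?P<name>\w+)\s*\{(?P<body>[^{}]*)\}")

same_line = "functionName { ; brace on the same line\n\txor eax, eax\n}"
next_line = "functionName\n{\n\t; some people prefer this syntax\n}"

for source in (same_line, next_line):
    match = FUNCTION_RE.search(source)
    print(match.group("name"), repr(match.group("body").strip()))
```

Both layouts produce a match with the same function name, so the parser does not need a special case for either style. A real tool would want a proper tokenizer to handle braces in comments and strings.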
-
I think you are overestimating the project, it is not a fully fledged language. It is a "wrapper language". Its purpose is to make writing NASM assembly easier.
For example, crambly has no sense of what "mov" is.
Crambly sees:
call _printf message
And:
call _printf mov
As the same thing - a call to a function, followed by one parameter to be pushed on the stack.
I think your tool should terminate with an errorcode, because it's malformed code.
Using the model above, how can crambly tell if "call _printf xor eax, eax" is malformed, or if the user really wants:
push xor eax
push eax
call _printf
There is no way of knowing.
It is NASM's job to check whether the code will compile or if it is malformed.
-
I think you are overestimating the project, it is not a fully fledged language. It is a "wrapper language".
...
Excuse me, please, but I think you are underestimating the problems every language faces, no matter whether it is a 'standalone' programming language or a simple 'wrapper language'.
E.g. '@global' only works because your wrapper has the knowledge to distinguish it:
1. @global is definitely not legal NASM code.
2. @global is a well-defined directive of your wrapper language, hopefully not conflicting with other tools.
Using the model above, how can crambly tell if "call _printf xor eax, eax" is malformed, or if the user really wants:...
You shouldn't try to guess 'what the user really wants'. You should define and implement a rule that describes a legal function call or function declaration in 'Crappy'-style. You should make sure that no such construction could be legal NASM code. Otherwise your crappy tool conflicts with NASM.
It is NASM's job to check whether the code will compile or if it is malformed.
It is your job to know crappy-style code and to reliably distinguish between crappy code and other stuff.
If your tool is confronted with malformed crappy code, I think it should report this and maybe terminate. Guessing what the user wants is a battle that you won't win.
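One way to picture "define a rule, don't guess" is a strict classifier, sketched here in Python. It assumes the parenthesised call shape "call name(arg, ...)" used in the first post; the names STRICT_CALL_RE, LOOKS_LIKE_CALL_RE, CramblySyntaxError and classify are invented for this example.

```python
import re

# Assumed wrapper call shape: call name(arg, arg, ...), optional ; comment.
STRICT_CALL_RE = re.compile(r"^\s*call\s+\w+\([^()]*\)\s*(;.*)?$")
# Anything that merely starts like a wrapper call: "call name(".
LOOKS_LIKE_CALL_RE = re.compile(r"^\s*call\s+\w+\s*\(")

class CramblySyntaxError(Exception):
    """Raised for lines that start like a wrapper call but break the rule."""

def classify(line):
    if STRICT_CALL_RE.match(line):
        return "crambly"  # expand it
    if LOOKS_LIKE_CALL_RE.match(line):
        raise CramblySyntaxError("malformed call: " + line.strip())
    return "nasm"  # pass through; NASM will judge it

print(classify("call _printf(message, dword [variable])"))
print(classify("call _printf xor eax, eax"))
```

Under this rule "call _printf xor eax, eax" is handed to NASM unchanged, so NASM reports the error, while a half-written wrapper call such as "call _printf(message" is rejected by the tool itself. Arguments containing parentheses are not handled by this simple pattern.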
-
OK; I've given it some thought and have decided to rewrite Crambly from scratch. It will be much more intelligent now that I understand what I have to do.
-
Sorry for the wait, I had some paid programming work so obviously that took priority.
I finally found the time to rewrite Crambly from the ground up and I am pleased to announce its release!
See the first post for the download link and a sample Crambly program which shows off all of the features I've been working on.
-
Hi,
nice work, but ...
_main:
push ebp
mov ebp, esp
and esp, -16
mov dword [ebp], 5
mov dword [ebp-4], 2
what are you doing here?
local variables should look like this:
sub esp, 8 ; reserve enough bytes for my local vars
mov dword [ebp-4], 5
mov dword [ebp-8], 2
And if you intend to use resd as directive for local vars, I would say, that's not a good idea.
btw: I would prefer an output that preserves the original source formatting. Only your generated source should start at the beginning of a line; everything else should stay as found in the original source file.
EDIT: and you have left the function arguments on the stack
-
Thanks for the feedback, I will fix the local variables.
you have left the function arguments on the stack
Originally I was going to leave this for the user to clean up, but I haven't fully decided.
I would prefer an output that preserves the original source formatting.
Yes, I will definitely implement this in the future.
EDIT: I've just made some changes based on your suggestions. Here's the new adding.asm file that is generated:
extern _printf
global addFunction
global _main
section .data
addingString: db "Adding %d and %d", 10, "", 0
resultString: db "Result is %d", 10, "", 0
section .bss
result: resd 1
section .text
addFunction:
push ebp
mov ebp, esp
and esp, -16
push dword [ebp+12]
push dword [ebp+8]
push addingString
call _printf
add esp, 3*4
mov eax, [ebp+8]
add eax, [ebp+12]
add esp, 8
mov esp, ebp
pop ebp
ret
_main:
push ebp
mov ebp, esp
and esp, -16
sub esp, 16
mov dword [ebp-4], 5
mov dword [ebp-8], 2
push dword [ebp-8]
push dword [ebp-4]
call addFunction
mov dword [result], eax
push dword [result]
push resultString
call _printf
add esp, 2*4
add esp, 16
mov esp, ebp
pop ebp
ret
And if you intend to use resd as directive for local vars, I would say, that's not a good idea.
I could change the res to loc, so it would be:
locd test
mov dword [test], 5
-
I've just uploaded Version 0.2 to GitHub.
Local variables are now accessed correctly, function parameters are now taken off the stack before returning, and local variables are declared with locb, locw, locd, and locq instead of resb, resw, resd, and resq.
EDIT: I forgot to mention that the reason I sub esp, 16 instead of just sub esp, 8 is that Crambly automatically aligns to multiples of 16 to support 64-bit; this is the same thing that gcc does. For example:
locb variable
Reserves 16 bytes, even though we only use 1.
But:
locq variable1
locq variable2
locb variable3
Reserves 32 bytes, even though we only use 17.
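The arithmetic behind those numbers can be sketched in Python as follows. This is an illustration, not Crambly's source: layout_locals and LOCAL_SIZES are hypothetical names, and it assumes variables are packed downwards from ebp in declaration order with no per-variable alignment.

```python
# Sizes in bytes for the loc* directives named in this thread.
LOCAL_SIZES = {"locb": 1, "locw": 2, "locd": 4, "locq": 8}

def layout_locals(declarations):
    """Return ({name: ebp_offset}, bytes_to_reserve) for (directive, name) pairs."""
    offsets = {}
    used = 0
    for directive, name in declarations:
        used += LOCAL_SIZES[directive]
        offsets[name] = -used  # the variable lives at [ebp - used]
    reserved = (used + 15) & ~15  # round up to the next multiple of 16
    return offsets, reserved

print(layout_locals([("locd", "variable1"), ("locd", "variable2")]))
print(layout_locals([("locq", "variable1"), ("locq", "variable2"), ("locb", "variable3")]))
```

Two locd variables use 8 bytes and reserve 16, with the variables at [ebp-4] and [ebp-8]; two locq plus one locb use 17 bytes and reserve 32.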
-
extern _printf
global addFunction
global _main
section .data
addingString: db "Adding %d and %d", 10, "", 0
resultString: db "Result is %d", 10, "", 0
section .bss
result: resd 1
section .text
addFunction:
push ebp
mov ebp, esp
and esp, -16
push dword [ebp+12]
push dword [ebp+8]
push addingString
call _printf
add esp, 3*4
mov eax, [ebp+8]
add eax, [ebp+12]
add esp, 8 ; that isn't a cleanup
mov esp, ebp
pop ebp
ret ; no callee cleanup
_main:
push ebp
mov ebp, esp
and esp, -16
sub esp, 16
mov dword [ebp-4], 5
mov dword [ebp-8], 2
push dword [ebp-8]
push dword [ebp-4]
call addFunction
; and no caller clean up
mov dword [result], eax
push dword [result]
push resultString
call _printf
add esp, 2*4
add esp, 16 ; that's not necessary
mov esp, ebp ; stack balancing is done here
pop ebp
ret
You still haven't fixed the clean up, take a look at this:
http://en.wikipedia.org/wiki/X86_calling_conventions#Callee_clean-up
and take a look at your stack balancing
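To make the difference concrete, here is a small Python sketch of an epilogue generator for both conventions. The name epilogue and its parameters are made up for this example, and it assumes every stack parameter is 4 bytes wide.

```python
def epilogue(param_count, callee_cleans=True, param_size=4):
    """Return epilogue lines for a function taking param_count stack parameters."""
    # mov esp, ebp already discards locals and anything pushed in the body,
    # so no extra "add esp, N" is needed before it.
    lines = ["mov esp, ebp", "pop ebp"]
    cleanup = param_count * param_size if callee_cleans else 0
    # Callee clean-up: "ret N" pops the caller's arguments on return.
    # Caller clean-up: plain "ret", and the caller adds to esp after the call.
    lines.append("ret %d" % cleanup if cleanup else "ret")
    return lines

print(epilogue(2))                       # callee removes two dword arguments
print(epilogue(2, callee_cleans=False))  # caller must "add esp, 8" afterwards
```

For a function with two dword parameters the callee clean-up variant ends in "ret 8"; with no parameters it falls back to a plain "ret".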
-
Thanks for helping me so much; I think I've finally got everything working correctly now, please try the latest Version 0.4 and tell me if there are any bugs still present.
-
sorry, I haven't tried it. But I've two notes, not really bugs.
- because you tried to align the stack you should know that your local vars are unaligned. Was it intended?
- ret 0 is not necessary; maybe it's slower than simply ret
I assume that your crappy tool is not written in assembly. Which language do you use for it?
_main:
push ebp
mov ebp, esp
and esp, -16
sub esp, 16
mov dword [ebp-4], 5 ; ebp is unaligned esp
mov dword [ebp-8], 2
push dword [ebp-8]
push dword [ebp-4]
call addFunction
;...
mov esp, ebp
pop ebp
ret 0 ; the 0 is not necessary, and it could be slower
ret ; than this.
-
The tool is written in C - I will release the source, but not right now. It's not ready yet.
Regarding the incorrect alignment:
sub esp, 16
mov dword [ebp-4], 5 ; ebp is unaligned esp
mov dword [ebp-8], 2
Should it be this?
sub esp, 16
mov dword [ebp-12], 5
mov dword [ebp-8], 2
ret 0 ; the 0 is not necessary, and it could be slower
ret ; than this.
OK; that's a simple fix.
I'm really busy for the rest of the week and I'm away on the weekend but I'll try and get an update fixing these two "notes" when I can.
Once again, thanks for the feedback.
-
Should it be this?
sub esp, 16
mov dword [ebp-12], 5
mov dword [ebp-8], 2
No.
I'm really busy for the rest of the week and I'm away on the weekend but I'll try and get an update fixing these two "notes" when I can.
I understand, but if you want to understand the stack and what goes on with the stack frame, you'll need to take a little time to work through it.
-
What should it be?
-
Let the user do the work, if they need aligned local variables.