NASM - The Netwide Assembler

NASM Forum => Announcements => Topic started by: Keith Kanios on November 08, 2010, 05:26:18 PM

Title: NASM - Official Recursive Macro Support!
Post by: Keith Kanios on November 08, 2010, 05:26:18 PM
This is an official announcement regarding the rewrite/revamp of the NASM preprocessor, now available as part of the official/main NASM source code starting with NASM 2.10rc2.

This rewrite addresses issues with implementing recursive macros (%rmacro/%irmacro) as previously attempted, %exitmacro and also makes room for new directives.

New NASM Directives:

This rewrite also addresses efficiency issues with heavy/nested macro invocations.

Please help us test this rewrite further by downloading/compiling the NASM 2.10rc2 and testing it against your NASM code/snippets.

NASM 2.10rc2 is available for download (source and binaries) at http://www.nasm.us/pub/nasm/releasebuilds/2.10rc2/ (http://www.nasm.us/pub/nasm/releasebuilds/2.10rc2/)

Please note that the official documentation will be updated over the next few weeks as we approach an official/stable NASM 2.10 release.

If there are any questions/comments/concerns, please reply to this thread.

Thanks. -Keith
Title: Re: NASM - Official Recursive Macro Support!
Post by: cm on November 08, 2010, 07:28:34 PM
A bug report. Posting it here as I guess it's on-topic enough (appears to be a 2.10rc2 issue), and the SF.net bug tracker doesn't let me post a new bug for whatever reason.

The Windows executable of 2.10rc2, as distributed, appears to crash on %fatal in included files.

Two files form a simple test case of this; assemble them with the command line "nasm test.asm".

test.asm contains:
Code: [Select]
%include "test.mac"
test.mac contains:
Code: [Select]
%fatal Test
This crashes for me with a GPF.

Notably, a local Windows build compiled by OpenWatcom 1.9 does not crash and outputs the %fatal message as expected. (I tried compiling with Visual Studio 2008 but that didn't work. Probably because of some incompatibility to VS 2005.)

Edit: This simple test case doesn't crash with my OW build, but assembling my whole project crashes the same with either the distributed build or the OW build.
Title: Re: NASM - Official Recursive Macro Support!
Post by: Klod on November 10, 2010, 03:12:38 AM
Just downloaded Nasm 2.10rc2 win binaries.
Noted decrease in assemble time (faster)
So far all test programs I tried compiled correctly.

One observation:
Quote
%macro mac1 5
   %rep %0
      %Warning %? %0 %1 %2 %3 %4 %5
      %rotate 1
   %endrep
%endm
mac1 firstarg,2,3,4,string

:7: warning: (mac1:2) %? 5 firstarg 2 3 4 string

Loss of macroname %? %??

Regards
Klod


Title: Re: NASM - Official Recursive Macro Support!
Post by: Cyrill Gorcunov on November 10, 2010, 08:19:07 PM
A bug report. Posting it here as I guess it's on-topic enough (appears to be a 2.10rc2 issue), and the SF.net bug tracker doesn't let me post a new bug for whatever reason.

The Windows executable of 2.10rc2, as distributed, appears to crash on %fatal in included files.

Two files form a simple test case of this; assemble them with the command line "nasm test.asm".

test.asm contains:
Code: [Select]
%include "test.mac"
test.mac contains:
Code: [Select]
%fatal Test
This crashes for me with a GPF.

Notably, a local Windows build compiled by OpenWatcom 1.9 does not crash and outputs the %fatal message as expected. (I tried compiling with Visual Studio 2008 but that didn't work. Probably because of some incompatibility to VS 2005.)

Edit: This simple test case doesn't crash with my OW build, but assembling my whole project crashes the same with either the distributed build or the OW build.

The fix pushed upstream as commit 55cc4d04235cb884a885682b5a52f367ec7d50c3.
Please give it a shot by using nasm snapshot. Thanks for report!
Title: Re: NASM - Official Recursive Macro Support!
Post by: Cyrill Gorcunov on November 10, 2010, 08:22:10 PM

:7: warning: (mac1:2) %? 5 firstarg 2 3 4 string

Loss of macroname %? %??

Regards
Klod


Hi Klod, probably ;) I hope I manage to find time for this issue at this weekend.
Though no promises, not enough spare time :(

But observations from nasm users are quite important for us -- so please continue testing ;)
Title: Re: NASM - Official Recursive Macro Support!
Post by: cm on November 11, 2010, 12:57:31 PM
The fix pushed upstream as commit 55cc4d04235cb884a885682b5a52f367ec7d50c3.
Please give it a shot by using nasm snapshot. Thanks for report!

Okay, that appears to fix this problem. (Though I had to recompile on my own using OW 1.9 and the OW makefile doesn't contain the perl commands and I had to execute these manually.)

I see other issues now: Assembling my project will take a lot longer than with previous releases. During that time, NASM uses up to 50% CPU time (as I have a dual core system, this means the core NASM runs on is fully utilized) for as long as one minute. The resulting binaries appear correct.

Notably, several warnings about unterminated strings are displayed after periods of waiting. Investigating these, I found that they were generated by "code" in %if 0 blocks, which I use for long comments because of the lacking %comment in previous releases.

Further testing reveals that replacing %if 0 by %comment where appropriate does not change anything: assembling still takes long and the "unterminated string" warnings are still displayed. (This is a minor bug. I do not want these syntax checks in %comment blocks, and arguably they shouldn't be done in disabled parts of %if either.) Even more testing (with a file that contains lots of long %if 0 or %comment blocks) reveals that the time issue does not appear related to the parsing of the commented lines that results in warnings. If I find out what exactly is causing the long delay I'll let you know.

Edit: One thing I found is that in preproc.c, line 5319 (where there is a comment "sanity check: is this supposed to be here?"), the commented function call (to free_tlist) should indeed be executed. Of course I'm not very familiar with the source, but it seems the right thing to do. Assembling my project with this line commented (as in the repo) takes about 50 seconds, and peaks at a memory usage of 70 MiB. With the line uncommented, it takes about 30 seconds and peaks at 45 MiB. Notably, with 2.10rc1 the same action takes about 3 seconds and peaks at 6 MiB. All produce the same result. So the current revision certainly has some leaks and possibly other stuff slowing it down. The mentioned function call appears to solve some of these.
Title: Re: NASM - Official Recursive Macro Support!
Post by: Cyrill Gorcunov on November 11, 2010, 07:42:03 PM
The fix pushed upstream as commit 55cc4d04235cb884a885682b5a52f367ec7d50c3.
Please give it a shot by using nasm snapshot. Thanks for report!

Okay, that appears to fix this problem. (Though I had to recompile on my own using OW 1.9 and the OW makefile doesn't contain the perl commands and I had to execute these manually.)

I see other issues now: Assembling my project will take a lot longer than with previous releases. During that time, NASM uses up to 50% CPU time (as I have a dual core system, this means the core NASM runs on is fully utilized) for as long as one minute. The resulting binaries appear correct.

Notably, several warnings about unterminated strings are displayed after periods of waiting. Investigating these, I found that they were generated by "code" in %if 0 blocks, which I use for long comments because of the lacking %comment in previous releases.

Further testing reveals that replacing %if 0 by %comment where appropriate does not change anything: assembling still takes long and the "unterminated string" warnings are still displayed. (This is a minor bug. I do not want these syntax checks in %comment blocks, and arguably they shouldn't be done in disabled parts of %if either.) Even more testing (with a file that contains lots of long %if 0 or %comment blocks) reveals that the time issue does not appear related to the parsing of the commented lines that results in warnings. If I find out what exactly is causing the long delay I'll let you know.

Edit: One thing I found is that in preproc.c, line 5319 (where there is a comment "sanity check: is this supposed to be here?"), the commented function call (to free_tlist) should indeed be executed. Of course I'm not very familiar with the source, but it seems the right thing to do. Assembling my project with this line commented (as in the repo) takes about 50 seconds, and peaks at a memory usage of 70 MiB. With the line uncommented, it takes about 30 seconds and peaks at 45 MiB. Notably, with 2.10rc1 the same action takes about 3 seconds and peaks at 6 MiB. All produce the same result. So the current revision certainly has some leaks and possibly other stuff slowing it down. The mentioned function call appears to solve some of these.

ok, new preproc code needs to be polished that is why we pushed it out early and want the feedback ;)
If this is not a secret -- what kind of project you have and if it's open-sourced mind to provide the
code so we could be able to compile it and figure out what eats cpu time.
Title: Re: NASM - Official Recursive Macro Support!
Post by: Cyrill Gorcunov on November 11, 2010, 07:47:25 PM
I read inaccurate -- you've explained what is happening. Will take a look. Thanks!
Title: Re: NASM - Official Recursive Macro Support!
Post by: cm on November 11, 2010, 08:42:36 PM
Quote from: Cyrill Gorcunov
ok, new preproc code needs to be polished that is why we pushed it out early and want the feedback ;)

Eh yeah. I appreciate it I guess. It has good new features and maybe we can make the speed-ups noticeable for me too :D

Quote
If this is not a secret -- what kind of project you have and if it's open-sourced mind to provide the
code so we could be able to compile it and figure out what eats cpu time.

I read inaccurate -- you've explained what is happening. Will take a look. Thanks!

Yeah, but with that it still takes multiple times longer than the revision from before the preproc-rewrite merge. So there is still some work to do. My stuff isn't ready so there are no public releases yet. But it is free software (2-clause BSD) so if you point me to where I can find your e-mail address, or write me a PM with it or something, it's no problem for me to send a current copy to you. Uncompressed it is about 600 KiB large. (It's also a program for DOS, but since the only build tool used is NASM (with multi-section bin output :)) that shouldn't be an issue for you.)
Title: Re: NASM - Official Recursive Macro Support!
Post by: Cyrill Gorcunov on November 12, 2010, 07:01:58 PM
yup, gorcunov@gmail.com
Title: Re: NASM - Official Recursive Macro Support!
Post by: Keith Kanios on November 13, 2010, 03:42:10 PM
:7: warning: (mac1:2) %? 5 firstarg 2 3 4 string

Loss of macroname %? %??

A fix has been pushed to the source code repository.
Title: Re: NASM - Official Recursive Macro Support!
Post by: Klod on November 13, 2010, 11:56:36 PM

Quote
Keith Kanios:
A fix has been pushed to the source code repository.

I do not have the resources to compile the fix with my setup, I will wait for an updated snapshot
Title: Re: NASM - Official Recursive Macro Support!
Post by: Keith Kanios on December 18, 2010, 04:58:40 PM
Notably, several warnings about unterminated strings are displayed after periods of waiting. Investigating these, I found that they were generated by "code" in %if 0 blocks, which I use for long comments because of the lacking %comment in previous releases.

Pushed a simple fix for this.
Title: Re: NASM - Official Recursive Macro Support!
Post by: Keith Kanios on December 18, 2010, 05:08:45 PM
One thing I found is that in preproc.c, line 5319 (where there is a comment "sanity check: is this supposed to be here?"), the commented function call (to free_tlist) should indeed be executed. Of course I'm not very familiar with the source, but it seems the right thing to do. Assembling my project with this line commented (as in the repo) takes about 50 seconds, and peaks at a memory usage of 70 MiB. With the line uncommented, it takes about 30 seconds and peaks at 45 MiB. Notably, with 2.10rc1 the same action takes about 3 seconds and peaks at 6 MiB. All produce the same result. So the current revision certainly has some leaks and possibly other stuff slowing it down. The mentioned function call appears to solve some of these.

Uncommenting it does indeed pass tests and still builds various code. I probably commented that when I was tracking down another related bug.

I've uncommented the line and pushed to GIT.
Title: Re: NASM - Official Recursive Macro Support!
Post by: cm on December 18, 2010, 07:16:48 PM
Notably, several warnings about unterminated strings are displayed after periods of waiting. Investigating these, I found that they were generated by "code" in %if 0 blocks, which I use for long comments because of the lacking %comment in previous releases.

Pushed a simple fix for this.

I've looked into the source myself and did not yet implement a good solution. In your fix, are these error conditions still handled properly when expanding non-%comment ExpDefs later?

Besides, it isn't optimal that tokenize() is always used - even if the (untokenized, ie input) line was only read to be discarded. Anything except certain preproc directives is discarded in %comment blocks, and ignored %if parts (and some other less common cases). My current method of checking for whitespace, then a preproc ID with a letter, and discarding the line if it doesn't look like that, actually decreases performance (by a noticeable amount!) in some reference cases though. A better method might be to check there only for the specific directives and processing them directly if an expansion is currently defined. But implementing that well might require understanding the hash table creation and lookup, which I don't. Just rambling here.

Your fix for %unmacro on a currently active macro is okay but I'm currently playing around with a mechanism where the check that frees any ExpDef except macro ones will additionally check for a certain flag - set by %unmacro if it would display your error. I store that flag in the `state' field as that's only used by %if (and maybe %while).

I'll probably get that %unmacro patch (along with a bunch of other preproc patches) ready during the holidays.

EDIT: By the way, I think the linked list `expansions' wasn't properly used anywhere in the repo. Just confused me a little when I reviewed/compared one of your patches right now.
Title: Re: NASM - Official Recursive Macro Support!
Post by: Keith Kanios on December 18, 2010, 08:15:09 PM
Notably, several warnings about unterminated strings are displayed after periods of waiting. Investigating these, I found that they were generated by "code" in %if 0 blocks, which I use for long comments because of the lacking %comment in previous releases.

Pushed a simple fix for this.

I've looked into the source myself and did not yet implement a good solution. In your fix, are these error conditions still handled properly when expanding non-%comment ExpDefs later?

No, but the latest push does.

Besides, it isn't optimal that tokenize() is always used - even if the (untokenized, ie input) line was only read to be discarded. Anything except certain preproc directives is discarded in %comment blocks, and ignored %if parts (and some other less common cases). My current method of checking for whitespace, then a preproc ID with a letter, and discarding the line if it doesn't look like that, actually decreases performance (by a noticeable amount!) in some reference cases though. A better method might be to check there only for the specific directives and processing them directly if an expansion is currently defined. But implementing that well might require understanding the hash table creation and lookup, which I don't. Just rambling here.

Yes, I have come to the same conclusion. I am currently in "make it work" mode and will revisit this for "make it better" mode.

Your fix for %unmacro on a currently active macro is okay but I'm currently playing around with a mechanism where the check that frees any ExpDef except macro ones will additionally check for a certain flag - set by %unmacro if it would display your error. I store that flag in the `state' field as that's only used by %if (and maybe %while).

I'll probably get that %unmacro patch (along with a bunch of other preproc patches) ready during the holidays.

What is the advantage/benefit of this?

EDIT: By the way, I think the linked list `expansions' wasn't properly used anywhere in the repo. Just confused me a little when I reviewed/compared one of your patches right now.

Define properly used.
Title: Re: NASM - Official Recursive Macro Support!
Post by: cm on December 18, 2010, 09:33:34 PM
No, but the latest push does.

What if the processed line is a preproc directive that ends/changes/goes deeper, though? In most simple cases (like %endcomment) these will just check that the line ends after the directive, but in other cases (think %elif) such a line might actually be parsed. What happens then if there is an unterminated %[ construct?

Yes, I have come to the same conclusion. I am currently in "make it work" mode and will revisit this for "make it better" mode.

Okay. I'll report back, if I should get something that works (and handles all cases well).

What is the advantage/benefit of this?

I don't think it has an intrinsic one - an error might help users to catch obscure errors; support for it (possibly with a (suppressible) warning - I don't show one) eases some things. But note that you partially break compatibility (not that this is such a bad thing in a major new release); previously, %unmacro on an active macro appeared to work fine - the only issue, as I reported, appeared to be that the macro's name would not show correctly in warnings/errors caused by that macro's content. In my implementation of this, %unmacro will immediately unlink the ExpDef from the list of macros but will not free it just yet if it is still active. The code that frees ExpDefs will free a macro's ExpDef only if it reached cur_depth == 0 and the state was set by %unmacro as I mentioned; all other ExpDefs are freed unconditionally.

EDIT: By the way, I think the linked list `expansions' wasn't properly used anywhere in the repo. Just confused me a little when I reviewed/compared one of your patches right now.

Define properly used.

I didn't review the code in the repo again (yet, but I don't really intend to) but I think it wasn't read anywhere really. `istk->expansion' is the linked list of currently active expansions while `defining' points to the currently defined expansion. `expansions' is unnecessary and not used. In my current source I removed it entirely. (Actually I see that the variable still existed, but it wasn't accessed anywhere (in my modified source).)

In your documentation push you state that %while<condition> is implemented. I think the appropriate directives are checked for already, but the code (for anything but plain %while) is not yet implemented. If you intend to change that, you'll have to think about how to store the %while's condition type in the ExpDef. (You could also go with the current implementation's way to store the actual condition and re-parse the necessary part of the stored condition line each time the condition is to be evaluated.)

This brings me to another thing I thought about: There are a number of fields in ExpDefs and ExpInvs that are only used for specific expansion types (often, these fields are specific to macros). Do you think that it would be worth it to do something about that?
Title: Re: NASM - Official Recursive Macro Support!
Post by: Keith Kanios on December 19, 2010, 06:06:56 PM
No, but the latest push does.

What if the processed line is a preproc directive that ends/changes/goes deeper, though? In most simple cases (like %endcomment) these will just check that the line ends after the directive, but in other cases (think %elif) such a line might actually be parsed. What happens then if there is an unterminated %[ construct?

Not sure. If you can work up an example that breaks it, I'll take a look at it.

Yes, I have come to the same conclusion. I am currently in "make it work" mode and will revisit this for "make it better" mode.

Okay. I'll report back, if I should get something that works (and handles all cases well).

OK.

What is the advantage/benefit of this?

I don't think it has an intrinsic one - an error might help users to catch obscure errors; support for it (possibly with a (suppressible) warning - I don't show one) eases some things. But note that you partially break compatibility (not that this is such a bad thing in a major new release); previously, %unmacro on an active macro appeared to work fine - the only issue, as I reported, appeared to be that the macro's name would not show correctly in warnings/errors caused by that macro's content. In my implementation of this, %unmacro will immediately unlink the ExpDef from the list of macros but will not free it just yet if it is still active. The code that frees ExpDefs will free a macro's ExpDef only if it reached cur_depth == 0 and the state was set by %unmacro as I mentioned; all other ExpDefs are freed unconditionally.

Can probably simplify that as mark, then free in pp_getline:

Code: [Select]
} else if ((ed->type != EXP_MMACRO) || (ed->unlink == true)) {

Now, do we want to prevent use of a macro that is marked in such a manner?

EDIT: By the way, I think the linked list `expansions' wasn't properly used anywhere in the repo. Just confused me a little when I reviewed/compared one of your patches right now.

Define properly used.

I didn't review the code in the repo again (yet, but I don't really intend to) but I think it wasn't read anywhere really. `istk->expansion' is the linked list of currently active expansions while `defining' points to the currently defined expansion. `expansions' is unnecessary and not used. In my current source I removed it entirely. (Actually I see that the variable still existed, but it wasn't accessed anywhere (in my modified source).)

Can't remember what expansions was implemented for, probably safe to remove.

In your documentation push you state that %while<condition> is implemented. I think the appropriate directives are checked for already, but the code (for anything but plain %while) is not yet implemented. If you intend to change that, you'll have to think about how to store the %while's condition type in the ExpDef. (You could also go with the current implementation's way to store the actual condition and re-parse the necessary part of the stored condition line each time the condition is to be evaluated.)

I don't intend to expand on it. Feel free to rework it so that it is more like the %if family, if the rest of the dev team approves.

This brings me to another thing I thought about: There are a number of fields in ExpDefs and ExpInvs that are only used for specific expansion types (often, these fields are specific to macros). Do you think that it would be worth it to do something about that?

Yes, once the current effort is "stable" and we know the processor is not going to change much further, we can revisit such optimizations.
Title: Re: NASM - Official Recursive Macro Support!
Post by: cm on December 19, 2010, 07:17:19 PM
Can probably simplify that as mark, then free in pp_getline:

Code: [Select]
} else if ((ed->type != EXP_MMACRO) || (ed->unlink == true)) {

No, we have to check whether this is the last invocation of the macro that terminated. Here's my current code.

Code: [Select]
/*
 * The state field of mmacro ExpDefs contains one of these
 * values. "Free defered" indicates that this mmacro should
 * be freed by its last invocation's termination (like other
 * ExpDefs are anyway). This is set by %unmacro if necessary.
 */
enum {
    MACRO_NORMAL = 0, MACRO_FREEDEFERED
};

This is in pp_getline:
Code: [Select]
                        /*
                         * The expansion that ran out is a %while or %rep that
                         * terminated, or an %if construct wrote all its content,
                         * or a multi-line macro invocation completed.
                         */

                        if (ed->cur_depth > 0)
                            ed->cur_depth --;
                        else
                            nasm_assert(ei->type == EXP_IF);

                        /*
                         * Free the expansion's definition if so requested.
                         * This frees the ExpDefs of all %rep, %while and %if
                         * constructs. Additionally, %unmacro will set its
                         * macro's state to "Free defered" in case the macro
                         * is currently expanding. (%comment does not actually
                         * create invocations; %final and pre-defs do create
                         * such but are anonymous (ie they have no ExpDef).)
                         *
                         * Note that entered macros are only freed here if we
                         * are the last invocation to terminate. All of the
                         * other expansion types are only invoked once; the
                         * termination of that single invocation therefore
                         * unconditionally marks they should be freed.
                         */
                        if (ed->type != EXP_MMACRO
                            || (ed->state == MACRO_FREEDEFERED && ed->cur_depth == 0)) {
                            free_expdef(ed);
                        }

This is in do_directive (%unmacro):
Code: [Select]
                *ed_p = ed->next;

                /*
                 * Free the macro's definition if it isn't expanding, or defer
                 * freeing it until all of the macro's expansions terminate. Note
                 * that we unlinked the definition anyway, so new expansions won't
                 * use the definition anymore.
                 */
                if (!ed->cur_depth)
                    free_expdef(ed);
                else {
                    ed->state = MACRO_FREEDEFERED;
                }

Now, do we want to prevent use of a macro that is marked in such a manner?

In the above code fragment and comment I note that usage of the macro (ie new invocations) are immediately made impossible in the %unmacro handler by unlinking it from the list. I think that this is the right behaviour, and it is efficiently and simply implemented by always unlinking the ExpDef.

I don't intend to expand on it. Feel free to rework it so that it is more like the %if family, if the rest of the dev team approves.

Will do so later on, but if a release/RC goes out with the current implementation then the doc could explicitly say that only %while is implemented.