Author Topic: SIGNIFICANT speed boost by labels.c changes  (Read 3865 times)


  • Guest
SIGNIFICANT speed boost by labels.c changes
« on: January 26, 2005, 12:23:09 AM »
I'm writing this (probably in the wrong project place, but I don't know where else) just to let you know about a nice performance boost I got today.

Preamble: I noticed that labels.c is one of the most time-consuming source files, at least when labels are used extensively. Therefore I hacked around a bit.

File: labels.c
- Structure label: new field: int labels.defn.hid;
     (replaces the unused .dummy field, so no space is added)
- Function find_label():
   - store the full hash value (before the %= ) in .hid
   - compare against .hid FIRST, and only do the full
     string compare if the hashes match
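The idea above can be sketched roughly as follows. This is a hypothetical reconstruction, not NASM's actual code: the field name hid comes from the post, but the table layout, bucket count, and hash function are illustrative stand-ins. The point is that a one-integer compare on the cached full hash rejects almost every mismatch, so strcmp() runs only on real candidates.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define LABEL_HASHES 37              /* illustrative bucket count */

/* Hypothetical label record; 'hid' caches the full (pre-modulo)
 * hash value so lookups can reject mismatches cheaply. */
struct label {
    const char *name;
    unsigned int hid;                /* full hash, stored before %= */
    struct label *next;              /* chain within one bucket */
};

static struct label *ltab[LABEL_HASHES];

/* Stand-in for whatever hash labels.c actually computes. */
static unsigned int hash_name(const char *s)
{
    unsigned int h = 0;
    while (*s)
        h = h * 31 + (unsigned char)*s++;
    return h;
}

static struct label *define_label(const char *name)
{
    unsigned int h = hash_name(name);
    struct label *lp = malloc(sizeof *lp);
    lp->name = name;
    lp->hid = h;                     /* cache the full hash */
    lp->next = ltab[h % LABEL_HASHES];
    ltab[h % LABEL_HASHES] = lp;
    return lp;
}

static struct label *find_label(const char *name)
{
    unsigned int h = hash_name(name);
    struct label *lp;

    for (lp = ltab[h % LABEL_HASHES]; lp; lp = lp->next) {
        /* One int compare first; strcmp() only when the
         * full hashes agree. */
        if (lp->hid == h && strcmp(lp->name, name) == 0)
            return lp;
    }
    return NULL;
}
```

Within one bucket, chained entries share the same hash modulo the bucket count but usually differ in the full hash, which is exactly where the cached .hid pays off.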

Effect: assembly times for all my modules dropped by more than 35%. My biggest source went from >11 seconds to <7 seconds assembly time. Even on very simple sources like a boot sector I see a significant speedup.

Maybe this is useful for other symbol types too; I'll look into it.
Another minor improvement: my find_label() search is optimized for the common case of global labels (eliminating the prevlabel comparison). This gave me about a 3% speedup.

And last and least:
Structure permts contains a size field which isn't needed, because the size is constant. I eliminated this field too.

If someone wants a source image, tell me the right place to post it.

Hoping somebody makes use of it,


  • Guest
Re: SIGNIFICANT speed boost by labels.c changes
« Reply #1 on: January 26, 2005, 01:00:46 AM »
Yep, the NASM 0.98 way of handling labels is not
optimal. In my local forked version I went to a
different design, which relies on dynamic memory
allocation, a flexible hash bucket count, and a
more suitable hash algorithm.

SF #893507: replace memory allocation in labels.c
SF #888518: add support for -H as well as ...
SF #900810: switch to Jenkins' hash to improve...

The fundamental idea is to reduce the number of
linear lookups, i.e. the number of labels per hash
bucket, by increasing the number of hash buckets
and employing a good hash algorithm that ensures a
nearly even distribution amongst the buckets.
Worked out _really_ well for me.  :)
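For reference, the tracker entry mentions switching to Jenkins' hash; a well-known variant is his "one-at-a-time" function, sketched below. This is an illustration of the kind of hash meant, the exact variant the fork adopted may differ, and bucket_of() and its parameters are hypothetical helpers for this example.

```c
#include <stdint.h>

/* Jenkins' one-at-a-time hash: cheap per byte, with final
 * mixing steps that spread entropy across all output bits. */
static uint32_t one_at_a_time(const char *key)
{
    uint32_t h = 0;
    while (*key) {
        h += (unsigned char)*key++;
        h += h << 10;
        h ^= h >> 6;
    }
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* With a well-mixing hash, a simple modulo spreads labels
 * nearly evenly, keeping every bucket chain short. */
static uint32_t bucket_of(const char *name, uint32_t nbuckets)
{
    return one_at_a_time(name) % nbuckets;
}
```

More buckets plus an even distribution means each lookup walks a chain of only a handful of entries instead of a long linear list, which is exactly the effect described above.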