Author Topic: fast substring search  (Read 16517 times)

Offline vygr

  • Jr. Member
  • *
  • Posts: 9
fast substring search
« on: May 23, 2016, 06:40:45 PM »
Hi folks, new member here, but not new to assembler coding.

I've been doing a project for a while in NASM https://github.com/vygr/Asm-Kernel, and inside the complied C/C++ like language, created wth macros, for app level coding I'm thrashing a substring search and a few related macros while doing the token parsing and register allocation phases.

The specific file where all this goes on is https://github.com/vygr/Asm-Kernel/blob/master/inc/code.inc, and the function that get's thrashed a lot is this:

   %macro sub_string 2
      ;%1 = param to find
      ;%2 = param to search
      %strlen %%l1 %1
      %strlen %%l2 %2
      %assign _pos 0
      %if %%l1 <= %%l2
         %assign _pos %%l2 + 1 - %%l1
         %rep _pos
            %substr %%ss2 %2 _pos, %%l1
            %ifidn %%ss2, %1
               %exitrep
            %else
               %assign _pos _pos - 1
            %endif
         %endrep
      %endif
   %endmacro

I use it to return the position of a substring within another string, or 0 if not found.

Is this the best I can do ? Speed wise ? Is there a better way, maybe request that some similar function be added to NASM perhaps ?

I also find myself doing a lot of list searches constructed from single line macros, so tending to build a list with a prefix and using an %assign as the list element. So for instance pushing a token in the token parser I do.

   %macro push_token 2
      %deftok %%t %1
      %xdefine _token_%[_token_sp] %%t
      %assign _token_type_%[_token_sp] %2
      %assign _token_sp _token_sp + 1
   %endmacro

_token_sp is the last value etc and to enumerate through, something like this.

   %macro print_token_list 0
      %assign %%n 0
      %rep _token_sp
         %warning token %%n: n: _token_%[%%n] t:_token_type_%[%%n]
         %assign %%n %%n + 1
      %endrep
   %endmacro

Again, am I missing a trick here ? Is there a way to express a list in a better faster way ?

May I just say I think NASM is great :) Best assembler I've ever used, first one I've written a compiler in the macro preprocessor with anyways.

Regards to all

Chris

------
Chris Hinsley

Offline vygr

  • Jr. Member
  • *
  • Posts: 9
Re: fast substring search
« Reply #1 on: May 25, 2016, 01:26:06 PM »
In order to give a little context, and for those that don't want to click on my project link and have a look. Here is an example of the Network load monitor app, written in the c++'ish compiled script.

%include 'inc/func.inc'
%include 'inc/mail.inc'
%include 'inc/gui.inc'
%include 'class/class_window.inc'
%include 'class/class_flow.inc'
%include 'class/class_button.inc'
%include 'class/class_progress.inc'
%include 'class/class_string.inc'
%include 'tests/gui/gui1/app.inc'

;;;;;;;;;;;
; test code
;;;;;;;;;;;

   fn_function tests/gui/gui1/app

      ptr window
      ptr window_panel
      ptr panel
      uint cpu_total
      uint cpu_count
      pptr task_progress

      ptr msg
      ulong select1
      ulong select2
      ulong mailbox
      pulong task_mailboxes
      struct task_mailbox, ml_mailbox

      ptr string
      ptr progress
      int width
      int height
      ulong owner

      ;init app vars
      push_scope

      ;create my window
      static_call window, create, {}, {window}
      static_call window, get_panel, {window}, {window_panel}
      static_call string, create_from_cstr, {"Network Task Monitor"}, {string}
      static_call window, set_title, {window, string}
      static_call string, create_from_cstr, {"Status Text"}, {string}
      static_call window, set_status, {window, string}

      ;add my panel
      static_call flow, create, {}, {panel}
      static_call flow, set_flow_flags, {panel, flow_flag_down | flow_flag_fillw}
      static_call flow, add_back, {panel, window_panel}

      ;allocate array for progress bars
      static_call sys_cpu, total, {}, {cpu_total}
      static_call sys_mem, alloc, {cpu_total * long_size}, {task_progress, _}

      ;add num cpus progress bars to my app panel
      assign {0}, {cpu_count}
      loop_start
         static_call progress, create, {}, {progress}
         static_call progress, set_max, {progress, 48}
         static_call progress, set_color, {progress, 0xff00ff00}
         static_call progress, add_back, {progress, panel}
         assign {progress}, {task_progress[cpu_count * long_size]}
         assign {cpu_count + 1}, {cpu_count}
      loop_until {cpu_count == cpu_total}

      ;set to pref size
      method_call window, pref_size, {window}, {width, height}
      static_call window, change, {window, 32, 32, width, height}

      ;set owner
      static_call sys_task, tcb, {}, {owner}
      static_call window, set_owner, {window, owner}

      ;add to screen and dirty
      static_call gui_gui, add, {window}
      static_call window, dirty_all, {window}

      ;allocate array for child mailbox ID's
      static_call sys_mem, alloc, {cpu_total * mailbox_id_size}, {task_mailboxes, _}

      ;open global farm
      static_call sys_task, open_global, {"tests/gui/gui1/child", task_mailboxes, cpu_total}

      ;init task mailbox
      static_call sys_mail, mailbox, {&task_mailbox}

      ;set up mailbox select array
      static_call sys_task, mailbox, {}, {select1, _}
      assign {&task_mailbox}, {select2}

      ;app event loop
      loop_start
         ;new round of samples ?
         if {cpu_count ==  cpu_total}
            ;send out sample commands
            loop_start
               assign {cpu_count - 1}, {cpu_count}
               static_call sys_mail, alloc, {}, {msg}
               assign {1}, {msg->sample_mail_command}
               assign {sample_mail_size}, {msg->ml_msg_length}
               assign {task_progress[cpu_count * long_size]}, {msg->sample_mail_progress}
               assign {task_mailboxes[cpu_count * mailbox_id_size].mb_mbox}, {msg->ml_msg_dest.mb_mbox}
               assign {task_mailboxes[cpu_count * mailbox_id_size].mb_cpu}, {msg->ml_msg_dest.mb_cpu}
               assign {select2}, {msg->sample_mail_reply_id.mb_mbox}
               static_call sys_cpu, id, {}, {msg->sample_mail_reply_id.mb_cpu}
               static_call sys_mail, send, {msg}
            loop_until {!cpu_count}
         endif

         ;select on 2 mailboxes
         static_call sys_mail, select, {&select1, 2}, {mailbox}
         static_call sys_mail, read, {mailbox}, {msg}

         ;which mailbox had mail ?
         if {mailbox == select1}
            ;dispatch event to view
            method_call view, event, {msg->ev_data_view, msg}
         else
            ;update progress bar
            static_call progress, set_val, {msg->sample_mail_progress, msg->sample_mail_task_count}
            static_call progress, dirty, {msg->sample_mail_progress}

            ;count up replies
            assign {cpu_count + 1}, {cpu_count}
         endif

         ;free event message
         static_call sys_mem, free, {msg}

         ;be friendly
         static_call sys_task, yield
      loop_end

      ;wait for outstanding replys
      loop_while {cpu_count != cpu_total}
         static_call sys_mail, read, {select2}, {msg}
         static_call sys_mem, free, {msg}
         assign {cpu_count + 1}, {cpu_count}
      loop_end

      ;send out exit commands
      loop_start
         assign {cpu_count - 1}, {cpu_count}
         static_call sys_mail, alloc, {}, {msg}
         assign {0}, {msg->sample_mail_command}
         assign {sample_mail_size}, {msg->ml_msg_length}
         assign {task_mailboxes[cpu_count * mailbox_id_size].mb_mbox}, {msg->ml_msg_dest.mb_mbox}
         assign {task_mailboxes[cpu_count * mailbox_id_size].mb_cpu}, {msg->ml_msg_dest.mb_cpu}
         static_call sys_mail, send, {msg}
      loop_until {!cpu_count}

      ;free arrays
      static_call sys_mem, free, {task_mailboxes}
      static_call sys_mem, free, {task_progress}

      ;deref window
      static_call window, deref, {window}

      pop_scope
      vp_ret

   fn_function_end

And this is the child process source that gets launched by that with the open_global call.

%include 'inc/func.inc'
%include 'tests/gui/gui1/app.inc'

;;;;;;;;;;;
; test code
;;;;;;;;;;;

   fn_function tests/gui/gui1/child
      ;monitor task

      ptr msg

      push_scope
      loop_start
         ;read mail command
         static_call sys_mail, mymail, {}, {msg}
         breakif {!msg->sample_mail_command}

         ;sample command
         static_call sys_task, count, {}, {msg->sample_mail_task_count}
         assign {msg->sample_mail_reply_id.mb_mbox}, {msg->ml_msg_dest.mb_mbox}
         assign {msg->sample_mail_reply_id.mb_cpu}, {msg->ml_msg_dest.mb_cpu}
         static_call sys_mail, send, {msg}

         ;be friendly
         static_call sys_task, yield
      loop_end
      static_call sys_mem, free, {msg}

      pop_scope
      vp_ret

   fn_function_end

Regards

Chris

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: fast substring search
« Reply #2 on: May 25, 2016, 03:38:03 PM »
Hi Chris,

Welcome to the Forum. Thanks for the compliments about Nasm, and the sample code.

I'm not a very sophisticated macro user. I was hoping that maybe someone on the NASMX crew might have some insight, but what you're doing is quite different, I think. Sorry I can't help you much with it.

Best,
Frank


Offline vygr

  • Jr. Member
  • *
  • Posts: 9
Re: fast substring search
« Reply #3 on: May 25, 2016, 03:48:45 PM »
Thanks Frank.

Is there any chance of having a new function native to NASM aded that does what I'm doing in the string search macro ?

  %macro sub_string 2
      ;%1 = param to find
      ;%2 = param to search
      %strlen %%l1 %1
      %strlen %%l2 %2
      %assign _pos 0
      %if %%l1 <= %%l2
         %assign _pos %%l2 + 1 - %%l1
         %rep _pos
            %substr %%ss2 %2 _pos, %%l1
            %ifidn %%ss2, %1
               %exitrep
            %else
               %assign _pos _pos - 1
            %endif
         %endrep
      %endif
   %endmacro

Maybe, similar to the %substr command, but it defines a numeric macro, like an %assign but it's the result of doing the substring in string search ? This would seriously speed up my parser and optimizer, plus I believe be useful to the wider audience of NASM users.

Usage something like:

%search result 'substr' 'searched string'

Best regards

Chris

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: fast substring search
« Reply #4 on: May 25, 2016, 04:23:22 PM »
Hi Chris,

Although I am nominally a "developer" 'cause I added a few trivial bits of code - a long time ago - I am not currently active in Nasm development. My observation is that the guys currently doing it don't have a lot of time. If you could produce a patch to add this function, maybe they'd include it. Wouldn't hurt to ask.

https://lists.sourceforge.net/lists/listinfo/nasm-devel/

Honestly, I'm not optimistic, but you could discuss it with them.

Best,
Frank


Offline vygr

  • Jr. Member
  • *
  • Posts: 9
Re: fast substring search
« Reply #5 on: May 25, 2016, 04:31:04 PM »
OK, I'll look into adding patch !

Thanks Frank.

Chris

Offline vygr

  • Jr. Member
  • *
  • Posts: 9
Re: fast substring search
« Reply #6 on: May 25, 2016, 07:08:42 PM »
In case people are interested, this is an example of the lower level virtual processor that the script compiler macros target. Again this is all NASM macro's but this time keeping you insulated from the actual x64 raw operations. At this level you still get assistance with calling functions and not having to worry about juggling your registers manually, there topologically sorted and checked for being a DAG, no more worrying if you got your inputs wrong. :)

%include 'inc/func.inc'
%include 'class/class_vector.inc'

   fn_function class/vector/set_capacity
      ;inputs
      ;r0 = vector object
      ;r1 = vector capacity
      ;outputs
      ;r0 = vector object
      ;trashes
      ;r1-r3, r5-r8

      ;do we allready have room ?
      vp_mul long_size, r1
      vp_cpy [r0 + vector_capacity], r2
      if r1, >, r2
         ;grow the dynamic array
         vp_push r0
         vp_xchg r1, r2
         s_call sys_mem, grow, {[r0 + vector_array], r1, r2}, {r1, r2}
         vp_pop r0
         vp_cpy r1, [r0 + vector_array]
         vp_cpy r2, [r0 + vector_capacity]
      endif
      vp_ret

   fn_function_end

Regards all

Chris

Offline vygr

  • Jr. Member
  • *
  • Posts: 9
Re: fast substring search
« Reply #7 on: May 25, 2016, 07:16:33 PM »
And an example where the machinery that the s_call macro and variants use can be turned to other uses is the way you can call things like external libs, for example the sdl graphics lib.

   %macro sdl_create_window 6
      ;title, x, y, w, h, flags
      set_src %1, %2, %3, %4, %5, %6
      set_dst r7, r6, r2, r1, r8, r9
      map_src_to_dst
      sdl_call sdl_SDL_CreateWindow
   %endmacro

You can declare the parameter mapping and let the DAG checker do the work. You can't accidentally trip yourself up with the wrong order of copies.

Chris

Offline Bryant Keller

  • Forum Moderator
  • Full Member
  • *****
  • Posts: 360
  • Country: us
    • About Bryant Keller
Re: fast substring search
« Reply #8 on: May 26, 2016, 08:14:56 AM »
Hey Frank, sorry that I'm not as active as I probably should be, my Internet connectivity is questionable at best.

vygr,
Welcome to the forum! Frank mentioned something called NASMX, it's a general purpose macro package providing high level constructs and quite a bit of abstraction for the purpose of cross platform programming in NASM. NASMX does include a bunch of macros like nx_strtok, nx_strstr, and some general HL macros... but the ones you've got are good enough on their own and I doubt you would gain much more insight from NASMX.

Warning: Going off-topic here :D

However, there is something else that NASMX has that your code "insinuates" that you may want/need... A simple, but EXTREMELY POWERFUL type system. The type system you've implemented look a lot like what I did on the original version of NASMX which does indeed provides a good system for user-defined types, NASMX goes quite a bit beyond that (it even supports union types).

With NASMX, we take the approach of collecting as much information as possible as early as we can, and generate code as late as possible. The method your using is much more direct, which is good for it's ability to reduce compile-time overhead. However, you'll have a hard time trying to implement advanced types like union with the direct method. In fact, NASMX would probably still be using the more direct method if we weren't so determined to support union's. The compile-time overhead will be a major factor in whether or not you might want to implement a NASMX like type system, you're code is already doing much more work than NASMX does by providing code generation for an evaluator, it might not be worth it to create all the extra symbols (using up more memory at compile-time).

To give you a quick run-through of the NASMX type system.. At the top of ]NASMX.INC we define a bunch of symbols that for the built-in types and registers. Later some type specific classifications are defined for the built-ins and a few symbols are created for identification of user-types. Afterwards, we create some aliases that C users tend to like better. All these definitions have a purpose. They make our built-in types act the same way that our user-types will eventually act. This means we can create these macros and use them on both user-defined types and built-in types. The rest of the type system starts here and runs on until the nx_isalnum macro.


About Bryant Keller
bkeller@about.me

Offline vygr

  • Jr. Member
  • *
  • Posts: 9
Re: fast substring search
« Reply #9 on: May 26, 2016, 11:15:55 AM »
Hi Bryant, thanks for the reply. Was beginning to hear tumble weed at my end ;)

Do you think that a request, or a patch to add a %search built in would be worthwhile to the powers that be in NASM dev ? It would sure be a lot faster.

I'll take a look over the type system code, I currently have a simple, but as you say effective, type system for the evaluator, currently it just uses a string to indicate the type and address needs like pointer to pointer to... 'ppppI' for example. In the expression compiler as things are dereferenced the first char gets stripped off and the virtual register slot re-tagged with the remains. I hold all the generated instructions in an output buffer and run a simple optimize pass to convert references into copies and migrate constants, again, simple stuff but leaves the app level script code looking less embarrassing than it would otherwise :)

One thing I was going to do is to eventually emit an ARM and VP byte code form of my dynamic bound function format using NASM as purely a macro processor. Any thoughts on that ? Is there ever going to be ARM instruction support in NASM ?

Best regards

Chris

Offline vygr

  • Jr. Member
  • *
  • Posts: 9
Re: fast substring search
« Reply #10 on: May 26, 2016, 11:44:55 AM »
As far as compile time goes, I'm currently able to compile from a clean start, the OS, LIBS, GUI, and demo apps in 20 seconds. But the more I'm filling out the application level scripts that time is taking a hit. I've begun to avoid using extra %defines and %assign and conversions between string and token in order to boost performance.

Doing the whole thing from scratch in NASM, well macro assembler, might be seen as crazy, but climbing Everest is crazier in my book and people do that just because they can. :)

I once had a conversation where I told somebody that with a decent macro assembler you could do everything, and I'm going to make good on that statement. And enjoy myself in the process :)

Regards

Chris

Offline Bryant Keller

  • Forum Moderator
  • Full Member
  • *****
  • Posts: 360
  • Country: us
    • About Bryant Keller
Re: fast substring search
« Reply #11 on: May 27, 2016, 02:49:25 AM »
Hi Bryant, thanks for the reply. Was beginning to hear tumble weed at my end ;)

I live in a rural area, Internet access is a luxury not always available. I try to connect as often as I can. :)

Do you think that a request, or a patch to add a %search built in would be worthwhile to the powers that be in NASM dev ? It would sure be a lot faster.

I think some side effects to %substr itself would be a nice addition. I've not really dug into NASM source code in a long time, but maybe NASM could be updated to return the index of the last substing found in a built-in symbol. I think a feature like that would be kinda nice.

I'll take a look over the type system code, I currently have a simple, but as you say effective, type system for the evaluator, currently it just uses a string to indicate the type and address needs like pointer to pointer to... 'ppppI' for example. In the expression compiler as things are dereferenced the first char gets stripped off and the virtual register slot re-tagged with the remains. I hold all the generated instructions in an output buffer and run a simple optimize pass to convert references into copies and migrate constants, again, simple stuff but leaves the app level script code looking less embarrassing than it would otherwise :)

Yeah, that's one of the downsides to using macros that generate symbols, every time they are used they increase the overall compile time overhead.

One thing I was going to do is to eventually emit an ARM and VP byte code form of my dynamic bound function format using NASM as purely a macro processor. Any thoughts on that ? Is there ever going to be ARM instruction support in NASM ?

ARM support is something that a lot of people have asked for but never has seen the light of day. That said, I remember (a long time ago) there was something called NARM (a NASM port for ARM) though it was of a much older version of NASM and the developer's abandoned it long ago.

About Bryant Keller
bkeller@about.me

Offline vygr

  • Jr. Member
  • *
  • Posts: 9
Re: fast substring search
« Reply #12 on: May 27, 2016, 08:05:42 PM »
Even if %substr did emit some side effects it still wouldn't do this job. %substr just lets you create a sub string of an existing string, dons't do anything that you could use to search for the presence of a string within a string. You can construct a function to do that using %substr but it ends up slow because you have to create a new symbol and copy the contents for every positional test you perform, slow slow slow.

I will look into a patch for NASM to add %search later once I hit my patience limit with my compile times. I'm fairly sure that the feature would be accepted, it harms nothing and can benefit NASM developers.

Even without direct support for ARM, I'm sure that you can get NASM to emit ARM code just building up the opcodes ones self with 'db''s. A pain to do the jump offsets yourself but not impossible. There are assemblers of old that did just that, they only ever allowed you to produce 'db's as the output, but they had enough that you could produce code for multiple targets with the correct macros packs.

Heck if all else fails, I'll have to add writing an assembler to my list ;)

Regards

Chris