Author Topic: Fill memory (Read 17459 times)

TightCoderEx · « **on:** May 11, 2012, 11:45:15 PM »

In essence this procedure has but one purpose, and that is to fill a block of memory with a single byte. To cover different situations, it has three entry points: InitFrame, FillMem and ZeroMem.

InitFrame: Calculates the area from the difference between RSP and RBP. You can call this procedure anywhere in the caller, so long as there is a stack frame and no other registers have been pushed on the stack (anything pushed would lie between RSP and RBP and be overwritten).
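
For illustration, a caller might look something like the sketch below. The name `Caller` and the 64-byte frame are made up for the example. It assumes the listing in this post is assembled with InitFrame declared global, and that the direction flag is clear, since the fill relies on STOS moving upward.

```nasm
bits 64
extern  InitFrame               ; assumed: the posted listing, with "global InitFrame" added

section .text
global  Caller                  ; hypothetical name

Caller: push    rbp
        mov     rbp, rsp        ; establish the stack frame
        sub     rsp, 64         ; reserve 64 bytes of locals
        cld                     ; STOS must move upward (DF = 0)
        xor     eax, eax        ; AL = 0, fill with nulls
        call    InitFrame       ; fills [RSP, RBP); on return RCX = 64, RDI = RSP
        ; ... locals at [rbp - 64] through [rbp - 1] are now zero ...
        leave                   ; mov rsp, rbp / pop rbp
        ret
```

The call itself pushes the return address, which is why InitFrame adds 8 to RSP before measuring the distance to RBP.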

Code:

; =========================================================================================================
; This procedure has three entry points, each of which finally falls into the routine that fills memory.
; Iteration count is reduced by writing as many QWORDs as possible, but a region of any size will be
; filled correctly.

;	ENTRY:	 AL = Fill pattern (Bits 8 - 63 are irrelevant as they will be shifted out)
;		RCX = Size of area in bytes
;		RDI = Pointer to area to be written.

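;	LEAVE:	RAX = Pattern extended through all 64 bits
;		RCX = Unchanged, except after InitFrame, where it holds the size of the fill area.
;		RDI = Pointer to fill area.
;		 DL = Copy of fill byte, but only when .Shift ran (AL = 01h - FEh); otherwise unchanged.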

; ---------------------------------------------------------------------------------------------------------

	; InitFrame only requires AL be set, RCX & RDI are calculated
	
  InitFrame:	mov	rdi, rsp		; Get pointer to base of fill area
    		add	rdi, 8			; Bump past caller's return address
    		mov	rcx, rbp
    		sub	rcx, rdi		; Get actual number of bytes to fill

FillMem has three parts:
1: AL = 0: simply bounce to ZeroMem, which sets RAX accordingly.
2: AL = -1 (FFh): simply set RAX to zero and decrement it once.
3: Any other value, 01h through FEh (1 through 127 or -128 through -2): a little more involved, as the pattern has to be copied through RAX.

InitFrame falls straight into this code, which is how its pattern gets determined.

Code:

    		
    	; Test if we are supposed to be filling with nulls
    	
     FillMem:	or	 al, al			; Are we going to fill with nulls
    		jz	ZeroMem
    		
    	; Test if we are supposed to be filling with -1's
    	
    		inc	 al			; if AL = FF, bump to NULL to set ZF
    		jnz	.Shift			; ZF = 0 means we have to extend pattern
    		
    	; Quicker than using .Shift, and a byte shorter than mov rax, -1.
    	
    		xor	rax, rax
    		dec	rax
    		jmp	ZeroMem + 3		; Skip the 3-byte xor rax, rax at ZeroMem

Is there a better way of doing this?

Code:


	; Shift contents of AL through RAX
	    		
      .Shift:	push	rcx			; Save size of fill area
    		xor	ecx, ecx		; Trash bits
    		mov	 cl, 7
    		dec	 al			; Adjust back to original value
    		mov	 dl, al			; Save a copy of fill byte
    		
    	  .L0:	shl	rax, 8			; Shift in 8 zero bits
    	  	mov	 al, dl			; and copy fill byte into nullified space
    	  	loop	.L0
    	  	
    		pop	rcx			; Retrieve buffer size
    		jmp	$ + 5			; Bounce over xor rax, rax at ZeroMem (2-byte jmp + 3-byte xor)
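
On extending AL through RAX: one well-known alternative to the shift loop is a single multiply by 0101010101010101h. The sketch below is not part of the posted routine. `Broadcast` is a hypothetical label, and like .Shift it clobbers RDX.

```nasm
bits 64
section .text

; Hypothetical helper, not part of the posted routine.
; ENTRY: AL  = fill byte (bits 8 - 63 of RAX are ignored)
; LEAVE: RAX = fill byte repeated in all eight byte positions
;        RDX = 0101010101010101h (clobbered)
Broadcast:
        movzx   eax, al                     ; clear bits 8 - 63, keep the fill byte
        mov     rdx, 0x0101010101010101     ; a 1 in every byte position
        imul    rax, rdx                    ; byte * 0101...01h copies it to every position
        ret
```

Because the fill byte is below 256, no carries cross from one byte position into the next. The multiply also gives the right result for 00h and FFh, so with this approach the two special cases would be an optimisation rather than a necessity.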

Finally, finish by filling the buffer.

Code:

    
	; This entry point simply nullifies RAX, as more often than not calling code would need to
	; do this
      
     ZeroMem:	xor	rax, rax		; Set Fill pattern
      
	; RAX = Fill pattern
	; RCX = Size of buffer
	; RDI = Pointer of area to be filled
      
	; As the size may not be a multiple of 8, the preamble tests bits 0 - 2 of RCX. Each set bit means
	; one byte, one word or one dword is written first, leaving only whole qwords to fill
      
      		push	rcx			; Preserve
      		push	rdi
      		
      	; If bit 0 is on, fill one byte
      		sar	rcx, 1			; Shift bit 0 into CY
      		jnc	$ + 3
      		stosb
      		
      	; Remaining size is now a multiple of 2; if bit 1 was on, fill one word
      	
      		sar	rcx, 1			; Shift bit 1 into CY
      		jnc	$ + 4
      		stosw
      		
      	; Remaining size is now a multiple of 4; if bit 2 was on, fill one dword
      	
      		sar	rcx, 1			; Shift bit 2 into CY
      		jnc	$ + 3
      		stosd
      		
      	; RCX now equals the number of qwords to fill
      	
      		rep	stosq			; Finish by writing RCX qwords (REP is the documented prefix for STOS).
      		
      		pop	rdi
      		pop	rcx			; Recover
      		
      		ret

I've tested this fairly extensively, but not with buffers smaller than 8 bytes, or with empty ones for that matter. It doesn't seem reasonable that this would be used for an area that small.
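
Since the small sizes were left untested, here is a rough stand-alone check. It assumes Linux x86-64 and a build along the lines of `nasm -f elf64 check.asm && ld check.o -o check`. The names `Fill`, `buf` and `check.asm` are made up. `Fill` repeats the tail logic from the post with labels in place of the `$ + n` jumps. The program fills 0 through 16 bytes with AAh inside a zeroed 32-byte buffer and counts the AAh bytes afterwards.

```nasm
; check.asm (hypothetical file name), Linux x86-64 assumed
bits 64
global  _start

section .bss
buf:    resb    32                      ; 8 spare bytes, up to 16 filled, 8 spare bytes

section .text

; RAX = pattern, RCX = size in bytes, RDI = destination (RCX and RDI preserved)
Fill:   push    rcx
        push    rdi
        sar     rcx, 1                  ; bit 0 of size into CF
        jnc     .bit1
        stosb
.bit1:  sar     rcx, 1                  ; bit 1 of size into CF
        jnc     .bit2
        stosw
.bit2:  sar     rcx, 1                  ; bit 2 of size into CF
        jnc     .rest
        stosd
.rest:  rep     stosq                   ; RCX = number of whole qwords left
        pop     rdi
        pop     rcx
        ret

_start: cld                             ; STOS must count upward
        xor     ebx, ebx                ; RBX = size under test, 0 through 16
.next:  lea     rdi, [rel buf]
        mov     ecx, 32
        xor     eax, eax
        rep     stosb                   ; zero the whole buffer
        lea     rdi, [rel buf + 8]
        mov     rcx, rbx
        mov     rax, 0xAAAAAAAAAAAAAAAA
        call    Fill
        lea     rsi, [rel buf]
        mov     ecx, 32
        xor     edx, edx                ; RDX = number of AAh bytes found
.scan:  cmp     byte [rsi], 0xAA
        jne     .skip
        inc     edx
.skip:  inc     rsi
        dec     ecx
        jnz     .scan
        cmp     rdx, rbx                ; exactly 'size' bytes written?
        jne     .fail
        inc     ebx
        cmp     ebx, 16
        jbe     .next
        xor     edi, edi                ; every size matched: exit status 0
        jmp     .exit
.fail:  mov     edi, 1                  ; some size wrote the wrong byte count: status 1
.exit:  mov     eax, 60                 ; sys_exit on Linux x86-64
        syscall
```

Tracing the logic by hand, a size of 7 writes one byte, one word and one dword and then zero qwords, and a size of 0 writes nothing at all, so the program should exit with status 0 (`echo $?` after running it).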

Frank Kotler · « **Reply #1 on:** May 12, 2012, 06:02:04 PM »

Code:

or rax, -1

... might be a shorter way to fill rax with -1. Dunno how it would compare for speed...

Best,
Frank

TightCoderEx · « **Reply #2 on:** May 12, 2012, 07:23:04 PM »

Quote from: Frank Kotler on May 12, 2012, 06:02:04 PM

Code:
or rax, -1
... might be a shorter way to fill rax with -1. Dunno how it would compare for speed...

Good catch Frank, and it is shorter by 2 bytes. I don't generally concern myself too much with cycles unless they are in a high iteration count loop. I try to get from point A to point B as efficiently as possible algorithm-wise, using the least amount of code, so that the logic stands right out at a glance. At least, that's the objective anyway.
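
For reference, the encodings involved should be as follows. They are worth confirming against a listing file (`nasm -l`), and the byte counts assume NASM's default optimisation, which picks the sign-extended short immediate forms.

```nasm
bits 64
section .text
        mov     rax, -1                 ; 48 C7 C0 FF FF FF FF   7 bytes
        xor     rax, rax                ; 48 31 C0               3 bytes
        dec     rax                     ; 48 FF C8               3 bytes, 6 for the pair
        or      rax, -1                 ; 48 83 C8 FF            4 bytes
```

On speed, `or rax, -1` still reads the old value of RAX, so it may have to wait for whatever last wrote that register, whereas the xor is generally recognised by x86-64 processors as a zeroing idiom that does not. For a one-off setup like this the difference is unlikely to matter.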

NASM - The Netwide Assembler
