Topic: align and nop

Olsonist
align and nop
« on: April 17, 2015, 12:05:51 AM »
I have a single jmp instruction at the beginning of a 16-byte block, followed by an align directive:

%use         smartalign
ALIGNMODE      generic,16
BITS            64

section         .text

jmp elsewhere
align 16

The assembled output disassembles as:

0000000000000060   eb92               jmp   xxxx
0000000000000062   66666690           nop
0000000000000066   66666690           nop
000000000000006a   66666690           nop
000000000000006e   6690               nop

Is it possible to tell align to combine those 4 nops into a single 14 byte nop? This is on Haswell and in fact it will incur an LCP stall the first time. But then after that it will live in the micro-instruction cache for the rest of eternity.


Rob Neff
Re: align and nop
« Reply #1 on: April 17, 2015, 03:59:00 PM »
Quote
Is it possible to tell align to combine those 4 nops into a single 14 byte nop? This is on Haswell and in fact it will incur an LCP stall the first time. But then after that it will live in the micro-instruction cache for the rest of eternity.

What is a 14-byte nop? According to Intel's Optimization Reference Manual:
Code:
3.5.1.10 Using NOPs
Code generators generate a no-operation (NOP) to align instructions. Examples of NOPs of different
lengths in 32-bit mode are shown below:
1-byte: XCHG EAX, EAX
2-byte: 66 NOP
3-byte: LEA REG, 0 (REG) (8-bit displacement)
4-byte: NOP DWORD PTR [EAX + 0] (8-bit displacement)
5-byte: NOP DWORD PTR [EAX + EAX*1 + 0] (8-bit displacement)
6-byte: LEA REG, 0 (REG) (32-bit displacement)
7-byte: NOP DWORD PTR [EAX + 0] (32-bit displacement)
8-byte: NOP DWORD PTR [EAX + EAX*1 + 0] (32-bit displacement)
9-byte: NOP WORD PTR [EAX + EAX*1 + 0] (32-bit displacement)

If you are not satisfied with NASM's default behavior, you could certainly replace your align directive with some combination of the above.
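
As a rough illustration of that suggestion, here is a minimal sketch that fills the 14 bytes after a 2-byte short jmp with one 9-byte and one 5-byte NOP from the table. The bytes are written with db so the exact encodings are pinned down; in 64-bit mode they decode with RAX-based addressing instead of EAX. The ret at the target is only there to make the fragment complete and is not from the original post.

```nasm
BITS 64

section .text

        jmp short elsewhere                               ; 2 bytes (EB xx)
        db 66h, 0Fh, 1Fh, 84h, 00h, 00h, 00h, 00h, 00h    ; 9-byte NOP: nop word [rax+rax*1+0], disp32
        db 0Fh, 1Fh, 44h, 00h, 00h                        ; 5-byte NOP: nop dword [rax+rax*1+0], disp8
elsewhere:                                                ; lands at offset 16 from the start of the block
        ret                                               ; placeholder target (assumption)
```

This only works when the distance to the boundary is known in advance (here 14 bytes), which is the part the align macro normally computes for you.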

Olsonist
Re: align and nop
« Reply #2 on: April 17, 2015, 09:03:56 PM »
The maximum length of an Intel 64 or IA-32 instruction is 15 bytes, and it's very easy to construct a 15-byte nop:
Quote
db 66h
db 66h
...
nop            ; 90h, the one-byte "xchg eax, eax" encoding
I'll hack on the align macro and submit that.

Olsonist
Re: align and nop
« Reply #3 on: April 17, 2015, 10:40:06 PM »
This kind of does what I need:

Quote
%macro NOP_N 1
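   times %1 db 67h   ; %1 address-size prefixes (67h); keep %1 <= 14
   NOP               ; one-byte NOP (90h), so the total length is %1 + 1 bytes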
%endmacro
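
For anyone who wants the pad length computed automatically, the following is one possible sketch, not the stock align macro and not anything that has been submitted to NASM. ALIGN_ONE_NOP is a hypothetical name. It assumes the alignment is at most 16 (so the padding never exceeds the 15-byte instruction limit), that the section itself is aligned at least that strictly, and it uses 66h prefixes as in Reply #2 instead of 67h. The expressions use only +, -, / and % on $-$$, the same style of times expression the standard align macro relies on.

```nasm
BITS 64

section .text align=16

; ALIGN_ONE_NOP n  (hypothetical macro; assumes n <= 16)
; pad = (n - (($-$$) % n)) % n is the distance to the next n-byte boundary.
; (pad + n - 1) / n is 0 when pad = 0 and 1 otherwise, so the first line
; emits pad - 1 prefixes when padding is needed and nothing when it is not.
%macro ALIGN_ONE_NOP 1
        times ((((%1) - (($-$$) % (%1))) % (%1)) - ((((%1) - (($-$$) % (%1))) % (%1)) + (%1) - 1) / (%1)) db 66h
        ; After the prefixes the remaining gap is 1 byte (or 0), so this
        ; emits the single 90h opcode that the prefixes attach to.
        times (((%1) - (($-$$) % (%1))) % (%1)) nop
%endmacro

        jmp short elsewhere     ; 2 bytes, leaving a 14-byte gap
        ALIGN_ONE_NOP 16        ; 13 x 66h followed by 90h
elsewhere:
        ret                     ; placeholder target (assumption)
```

With the 2-byte jmp at the start of a 16-byte block, the macro should produce thirteen 66h prefixes and one 90h, which a disassembler shows as a single 14-byte nop.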