Author Topic: Check if SSE2 instructions present

Offline Borneq

  • Jr. Member
  • *
  • Posts: 26
Check if SSE2 instructions present
« on: April 30, 2011, 12:17:41 PM »
How can I determine whether my P4 processor supports SSE2? Can I use CPUID? Is CPUID available under Linux?

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Check if SSE2 instructions present
« Reply #1 on: April 30, 2011, 06:42:11 PM »
Yeah, cpuid is available under Linux. That's about the only part I'm sure of. According to the first reference I came across...

we want cpuid with eax = 1, and bit 26 in edx indicates SSE2 available. Here's my attempt to test it:

Code:
; nasm -f elf gotsse2.asm
; ld -m elf_i386 -o gotsse2 gotsse2.o
; for P4, you won't need the "-m elf_i386"
; that tells a 64-bit ld that we want 32-bit code

global _start

section .data
    yep db "got sse2", 10
    yep_len equ $ - yep
    nope db "no sse2", 10
    nope_len equ $ - nope

section .text
_start:
    xor eax, eax
    cpuid
    ; "vendor string" in ebx, edx, ecx
    ; max level supported in eax
    cmp eax, 1
    jl exit

    mov eax, 1
    cpuid
    ; feature flags in edx, bit 26 = SSE2
    test edx, 1 << 26
    jnz gotsse2
    mov edx, nope_len
    mov ecx, nope
    jmp both
gotsse2:
    mov edx, yep_len
    mov ecx, yep
both:
    ; sys_write to stdout
    mov ebx, 1
    mov eax, 4
    int 80h
exit:
    ; sys_exit with status 0
    xor ebx, ebx
    mov eax, 1
    int 80h

It claims I've got sse2. No idea if that's correct or not.

With eax = 0, ebx, edx, ecx (in that order) contain the "vendor string" ("GenuineIntel" for us), and eax returns the "maximum supported level" of cpuid. I discovered that asking "number of cores" gives 16 on my P4. I wish! I forget the eax value for "how many cores", but it exceeds "maximum supported level", so it's probably good to check. I don't think there's any question of eax = 1 being supported, but I check anyway...
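
To see the vendor string for yourself, something like this should do. It's only a minimal sketch, assuming 32-bit Linux and the same build commands as above; the file name and the "vendor" buffer are names I made up. The one thing to watch is the register order: the twelve characters come back in ebx, edx, ecx.

```nasm
; nasm -f elf vendor.asm
; ld -m elf_i386 -o vendor vendor.o
; (file name and buffer name are just for illustration)

global _start

section .bss
    vendor resb 13              ; 12 bytes of vendor string + newline

section .text
_start:
    xor eax, eax                ; cpuid level 0
    cpuid                       ; eax = max supported level
    mov [vendor], ebx           ; note the order: ebx, edx, ecx
    mov [vendor + 4], edx
    mov [vendor + 8], ecx
    mov byte [vendor + 12], 10  ; append a newline

    mov edx, 13                 ; length
    mov ecx, vendor             ; buffer
    mov ebx, 1                  ; stdout
    mov eax, 4                  ; sys_write
    int 80h

    xor ebx, ebx
    mov eax, 1                  ; sys_exit
    int 80h
```

On an Intel chip this should print "GenuineIntel"; other vendors return their own twelve-character string.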

I'll attach another cpuid example that returns a more extensive string than "GenuineIntel"... Unless I'm mistaken, the Hz reported is the maximum it's capable of, and won't indicate if we're "throttled down" or "overclocked", so really not that useful(?). The string is "front padded" with spaces - would give neater output if these were trimmed... and a newline added...
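
In case the attachment goes missing, here's a rough sketch of the same idea with the trimming and the newline added. It is not the attached file, just my assumption of how it could be done: ask level 80000000h for the highest extended level, then collect 16 bytes each from 80000002h through 80000004h. The "brand" buffer and the labels are made-up names.

```nasm
; nasm -f elf brand.asm
; ld -m elf_i386 -o brand brand.o

global _start

section .bss
    brand resb 49               ; 48 bytes from cpuid + one spare byte (stays 0)

section .text
_start:
    mov eax, 80000000h
    cpuid                       ; eax = highest extended level
    cmp eax, 80000004h
    jb exit                     ; unsigned compare: no brand string available

    mov edi, brand
    mov esi, 80000002h          ; first of the three brand-string levels
.next:
    mov eax, esi
    cpuid                       ; 16 bytes of text in eax, ebx, ecx, edx
    mov [edi], eax
    mov [edi + 4], ebx
    mov [edi + 8], ecx
    mov [edi + 12], edx
    add edi, 16
    inc esi
    cmp esi, 80000004h
    jbe .next

    mov ecx, brand              ; skip the "front padding"
.skip:
    cmp byte [ecx], ' '
    jne .find_end
    inc ecx
    jmp .skip

.find_end:
    mov edx, ecx                ; look for the terminating 0
.scan:
    cmp byte [edx], 0
    je .done
    inc edx
    jmp .scan

.done:
    mov byte [edx], 10          ; replace the 0 with a newline
    inc edx
    sub edx, ecx                ; length of trimmed text + newline
    mov ebx, 1                  ; stdout
    mov eax, 4                  ; sys_write
    int 80h

exit:
    xor ebx, ebx
    mov eax, 1                  ; sys_exit
    int 80h
```

The spare 49th byte is never written, so both scanning loops are guaranteed to stop there even if the CPU hands back an unterminated or all-space string.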

In Linux, this is all figured out for us in /proc/cpuinfo - might be easier to read it from there...
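
If you want to go that route from asm, here's a minimal sketch. It assumes the "flags" line for the first processor lands within the first 4096 bytes of /proc/cpuinfo and that a single read returns it, which seems to hold on the machines I know of but isn't guaranteed. It looks for " sse2" followed by a space or a newline, so it won't be fooled by a longer flag name that merely starts with "sse2".

```nasm
; nasm -f elf procsse2.asm
; ld -m elf_i386 -o procsse2 procsse2.o

global _start

section .data
    path db "/proc/cpuinfo", 0
    yep db "got sse2", 10
    yep_len equ $ - yep
    nope db "no sse2", 10
    nope_len equ $ - nope

section .bss
    buf resb 4096

section .text
_start:
    mov eax, 5                  ; sys_open
    mov ebx, path
    xor ecx, ecx                ; O_RDONLY
    int 80h
    test eax, eax
    js exit                     ; negative return = error

    mov ebx, eax                ; file descriptor
    mov eax, 3                  ; sys_read
    mov ecx, buf
    mov edx, 4096
    int 80h
    cmp eax, 6
    jl nosse2                   ; error, or too short to hold " sse2" + delimiter

    mov esi, buf
    lea edi, [buf + eax - 6]    ; last position where all 6 bytes fit
.scan:
    cmp esi, edi
    ja nosse2
    cmp dword [esi + 1], 'sse2' ; four characters compared at once
    jne .next
    cmp byte [esi], ' '         ; must be preceded by a space
    jne .next
    cmp byte [esi + 5], ' '     ; and followed by a space...
    je gotsse2
    cmp byte [esi + 5], 10      ; ...or the end of the line
    je gotsse2
.next:
    inc esi
    jmp .scan

nosse2:
    mov edx, nope_len
    mov ecx, nope
    jmp both
gotsse2:
    mov edx, yep_len
    mov ecx, yep
both:
    mov ebx, 1                  ; stdout
    mov eax, 4                  ; sys_write
    int 80h
exit:
    xor ebx, ebx
    mov eax, 1                  ; sys_exit
    int 80h
```

Compared with cpuid this reports what the kernel decided to expose rather than what the CPU says directly, so the two can disagree if a feature has been masked off.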