Author Topic: sys_write confusion  (Read 15718 times)

Offline Gunner

  • Jr. Member
  • *
  • Posts: 74
  • Country: us
    • Gunners Software
sys_write confusion
« on: July 23, 2012, 01:37:52 AM »
I have been using MASM to write programs for over 10 years, so I am not new to this; what I am new to is Linux.  I feel like a newb all over again.

Before anyone says just link and use the C functions, I will get to that after I understand what "goes on down here" in Linux.

So to create a buffer dynamically, we need to call sys_brk with ebx = 0 to get the address of the end of our data section.  Then we call it again with a size.  So, why does this still work with the second call to sys_brk commented out?
Code:
SECTION     .bss
New_Break   resd    1
lpBuffer    resb    12

Org_Break   resd    1

SECTION     .text
global      _start

_start:

    mov     ebx, 0
    mov     eax, sys_brk
    int     80H
    mov     [New_Break], eax
    ;~ push    eax
   
    push    eax
    call    PrintNum
   
    ;~ pop     eax
    ;~
    ;~ add     eax, 5
    ;~ mov     ebx, eax
    ;~ mov     eax, sys_brk
    ;~ int     80H
    ;~ mov     edi, eax
    ;~ push    eax
    ;~ call    PrintNum

    mov     edx, [New_Break]
    mov     dword [ecx], "ABCD"
    mov     byte [ecx +4], 10
   
    mov     edx, 5
    mov     eax, sys_write
    mov     ebx, stdout
    int     80H
       
    mov     ebx, Org_Break
    mov     eax, sys_brk
    int     80H
    push    eax
    call    PrintNum

Exit: 
    mov     eax, sys_exit
    xor     ebx, ebx
    int     80H

and the output:
Code:
gunner@LinuxDevel ~/Desktop/asm/projects/memalloc $ ./memalloc
136335360
ABCD
136335360

Offline Bryant Keller

  • Forum Moderator
  • Full Member
  • *****
  • Posts: 360
  • Country: us
    • About Bryant Keller
Re: sys_write confusion
« Reply #1 on: July 23, 2012, 02:49:55 AM »
Linux allocates you memory in chunks of 4 kB (one page) at a time. Your executable and all its data are smaller than 4 kB, so you have some wiggle room. If you were to increase the size of your executable to ~3.9 kB, a page fault would occur when you tried to write the data into memory. The reason you use sys_brk is to make sure that, when you reach the 4 kB boundary, a new page will be allocated for you and, if the page is swapped out, your data isn't lost. The page size isn't fixed everywhere, either: on some architectures it's one of the options in menuconfig when you build a custom kernel, so I don't suggest just "assuming" that you've got plenty of space because your program doesn't exceed the 4 kB boundary.
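
To put numbers on the "wiggle room": a minimal sketch, assuming a page size of 4096 (the PAGE_SIZE constant and this little program are illustrative, not from the thread), that works out how many bytes lie between the current break and the end of the page it falls in. When the break is already page-aligned the slack is zero; the 136335360 printed in the first post is 0x08205000, which is a multiple of 4096.

```nasm
; Sketch only: bytes of slack between the break and the end of its page.
PAGE_SIZE   equ     4096            ; assumption, not queried from the kernel
sys_exit    equ     1
sys_brk     equ     45

SECTION     .text
global      _start

_start:
    xor     ebx, ebx                ; ebx = 0: just ask where the break is
    mov     eax, sys_brk
    int     80H                     ; eax = current break

    mov     ecx, eax
    add     ecx, PAGE_SIZE - 1
    and     ecx, -PAGE_SIZE         ; ecx = break rounded up to a page boundary
    sub     ecx, eax                ; ecx = slack left in that page (0 if aligned)

    ; ecx now holds the slack; look at it in a debugger or print it
    xor     ebx, ebx
    mov     eax, sys_exit
    int     80H
```

This should assemble as a 32-bit program, for example with nasm -f elf32 followed by ld -m elf_i386.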


Gunner
« Reply #2 on: July 23, 2012, 03:01:52 AM »
Ah, makes sense.

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: sys_write confusion
« Reply #3 on: July 23, 2012, 04:25:40 AM »
Hi Gunner,

I see your helpful posts over at AsmCommunity all the time. Glad to see you over here!

Code:
; ...
    mov     edx, [New_Break]
    mov     dword [ecx], "ABCD"
    mov     byte [ecx +4], 10

Typo in line one here? What's in ecx? (artifact from PrintNum would be my guess - lpBuffer?) I fear that this is working "by chance". In any case, you print "ABCD(LF)"...

Code:
   
    mov     edx, 5
    mov     eax, sys_write
    mov     ebx, stdout
    int     80H
       
    mov     ebx, Org_Break
In Nasm, this is like "offset Org_Break". Not what you want here! I suspect that it's working because sys_brk seems to "round off" to PAGE_SIZE, and "offset Org_Break" is "close enough".
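
For readers coming from MASM, the difference in two lines. This is a fragment, not a full program. It assumes Org_Break is a dword in .bss, as in the first post, and that the break returned by the first sys_brk call was stored there beforehand (the posted code never writes to it, so that store is an assumption).

```nasm
    mov     ebx, Org_Break          ; ebx = ADDRESS of the variable (MASM: offset Org_Break)
    mov     ebx, [Org_Break]        ; ebx = the dword stored there, i.e. the saved break

    ; assumed earlier in the program, right after the first sys_brk call:
    ;     mov     [Org_Break], eax
```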

As Bryant points out, we really shouldn't ASSume that PAGE_SIZE is 4096. There's an "accessor function" to return PAGE_SIZE... but I forget the name. In your commented out code, you ask for 5 more bytes. sys_brk is going to round this up to PAGE_SIZE - you could use the return from this to determine what PAGE_SIZE actually is.
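
In C the accessor is getpagesize(), or sysconf(_SC_PAGESIZE). Without libc, one option is the auxiliary vector the kernel leaves on the initial stack after the environment pointers; its AT_PAGESZ entry (type 6) carries the page size. A minimal sketch, assuming a plain static 32-bit executable entered at _start; the label names are made up for illustration.

```nasm
; Sketch: read the page size from the auxiliary vector, no libc needed.
AT_PAGESZ   equ     6
sys_exit    equ     1

SECTION     .text
global      _start

_start:
    mov     esi, esp                ; [esp] = argc
    lodsd                           ; eax = argc, esi -> argv[0]
    lea     esi, [esi + eax*4 + 4]  ; skip argv[] and its NULL terminator
.skip_env:
    lodsd                           ; walk envp[] ...
    test    eax, eax
    jnz     .skip_env               ; ... up to and including its NULL
.next_aux:
    lodsd                           ; eax = entry type
    mov     edx, eax
    lodsd                           ; eax = entry value
    cmp     edx, AT_PAGESZ
    je      .found
    test    edx, edx                ; type 0 (AT_NULL) ends the vector
    jnz     .next_aux
    mov     eax, 4096               ; fallback guess if AT_PAGESZ is missing
.found:
    ; eax = page size in bytes
    mov     ebx, eax
    shr     ebx, 8                  ; exit status is only 8 bits wide,
    mov     eax, sys_exit           ; so report page size / 256
    int     80H
```

After running it, echo $? in the shell shows the page size divided by 256.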

In order to use this like malloc() and actually get just 5 bytes, you'd want to use sys_brk to get a PAGE_SIZE block, and "carve out" 5 bytes... plus enough "meta information" to be able to free() it.
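
A rough sketch of that "carve out" idea: a bump allocator that moves the break a page at a time and hands out small blocks, each preceded by a 4-byte size header. Everything here (alloc_small, pool_next, pool_end, PAGE_SIZE = 4096) is a hypothetical name or assumption for illustration, and there is no free(); the header only records what a free() would need to know.

```nasm
PAGE_SIZE   equ     4096            ; assumption
sys_exit    equ     1
sys_write   equ     4
sys_brk     equ     45
stdout      equ     1

SECTION     .bss
pool_next   resd    1               ; next unused byte in the pool (0 = no pool yet)
pool_end    resd    1               ; current break, i.e. end of the pool

SECTION     .text
global      _start

_start:
    mov     ecx, 5                  ; ask for 5 bytes
    call    alloc_small
    test    eax, eax
    jz      .exit
    mov     dword [eax], "ABCD"
    mov     byte [eax + 4], 10
    mov     ecx, eax                ; sys_write wants the buffer in ecx
    mov     edx, 5
    mov     ebx, stdout
    mov     eax, sys_write
    int     80H
.exit:
    xor     ebx, ebx
    mov     eax, sys_exit
    int     80H

; alloc_small: ecx = bytes wanted; returns eax = pointer, or 0 on failure
alloc_small:
    push    ebx
    push    esi
    lea     esi, [ecx + 4 + 3]      ; payload + 4-byte size header ...
    and     esi, -4                 ; ... rounded up to a multiple of 4
    mov     eax, [pool_next]
    test    eax, eax
    jnz     .check
    xor     ebx, ebx                ; first call: ask where the break is
    mov     eax, sys_brk
    int     80H
    mov     [pool_next], eax
    mov     [pool_end], eax
.check:
    lea     ebx, [eax + esi]        ; ebx = where the pool would end up
    cmp     ebx, [pool_end]
    jbe     .carve
    add     ebx, PAGE_SIZE - 1      ; not enough room: move the break up
    and     ebx, -PAGE_SIZE         ; to the next page boundary
    mov     eax, sys_brk
    int     80H
    cmp     eax, ebx                ; on failure the old break comes back
    jb      .fail
    mov     [pool_end], eax
    mov     eax, [pool_next]
.carve:
    mov     [eax], esi              ; header: total size of this block
    add     [pool_next], esi
    add     eax, 4                  ; caller gets the address after the header
    jmp     .done
.fail:
    xor     eax, eax
.done:
    pop     esi
    pop     ebx
    ret
```

A real free() would need a free list on top of this; the sketch only shows where the size header lives.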

Another way to get more memory is with sys_mmap. I prefer sys_brk because the new memory is contiguous with the memory we've already got, where sys_mmap gives us memory up around 0x40000000 or so. We probably shouldn't ASSume that either, but it seems to be a good bet.
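
For comparison, a minimal sketch of the mmap route. It uses mmap2 (syscall 192 on 32-bit x86), which takes its arguments in registers, rather than the older sys_mmap (90), which takes a pointer to an argument block; that swap is a choice made for this example, not something from the thread. The 4096-byte length is likewise just an assumption.

```nasm
sys_exit        equ 1
sys_write       equ 4
sys_mmap2       equ 192
stdout          equ 1
PROT_READ       equ 1
PROT_WRITE      equ 2
MAP_PRIVATE     equ 0x02
MAP_ANONYMOUS   equ 0x20

SECTION     .text
global      _start

_start:
    xor     ebx, ebx                ; addr = 0: let the kernel pick
    mov     ecx, 4096               ; length
    mov     edx, PROT_READ | PROT_WRITE
    mov     esi, MAP_PRIVATE | MAP_ANONYMOUS
    mov     edi, -1                 ; fd = -1 for anonymous memory
    xor     ebp, ebp                ; offset = 0
    mov     eax, sys_mmap2
    int     80H
    cmp     eax, -4096
    ja      .exit                   ; -4095..-1 means -errno

    mov     dword [eax], "ABCD"
    mov     byte [eax + 4], 10
    mov     ecx, eax                ; buffer
    mov     edx, 5
    mov     ebx, stdout
    mov     eax, sys_write
    int     80H
.exit:
    xor     ebx, ebx
    mov     eax, sys_exit
    int     80H
```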

Anyway, I think that's why it's working even with the commented out code. It "should" segfault, I think (if you'd really had "[New_Break]" in ecx). I haven't tested your code - need to provide PrintNum and a few odds and ends before it'll assemble. I shall do this. The number you're printing isn't what I'd "expect"... different kernel, perhaps?

In any case, looking forward to hearing more from ya!

Best,
Frank


Gunner
« Reply #4 on: July 23, 2012, 10:55:33 PM »
The "mov ebx, Org_Break" is from one of the few things I found on this; they used it to "free" the memory by setting the break back to the last label in the .bss section.  Doesn't seem right to me.  Originally, after the first sys_brk, I moved that address into Org_Break.

Let me get the code back to where it was and I will post it.  There really aren't many Asm examples on this out there.  I have written quite a few tutorials for Windows and MASM.  Haven't found anything really useful for NASM and Linux, so as I learn Linux, I am going to write tutorials for each step.

Just checked my reg date for AsmCommunity - 2002; damn time has gone by!  Back in the day, many great folks shared their knowledge over there.

Frank Kotler
« Reply #5 on: July 24, 2012, 12:09:08 AM »
That works. I still have concerns about the possible typo on line 39. What's in ecx?

I'm surprised that sys_brk gives an "exact" answer, and that it appears to give you exactly 5 more bytes. My recollection is that it "rounded off" to the top of the page. No great surprise that my memory is faulty. :)

I like your "dwtoa". I'll have to study that!

Tutorials? Excellent!

Best,
Frank


Gunner
« Reply #6 on: July 24, 2012, 12:31:59 AM »
ecx, heh.  Yeah, typo/remnants.  It works (but why?!?  I will have to look into that), but jeesh, that would be a hair-pulling bug to find in a large program when something goes wrong.  That edx should be ecx.
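
Putting the fixes from this thread together, a sketch of what the corrected program might look like: save the original break, extend it by 5 bytes, fill the new memory through ecx, print it, then put the break back. The syscall numbers are the 32-bit int 80H ones, PrintNum is left out, and this is a reconstruction from the discussion rather than the code that was actually posted.

```nasm
sys_exit    equ     1
sys_write   equ     4
sys_brk     equ     45
stdout      equ     1

SECTION     .bss
Org_Break   resd    1

SECTION     .text
global      _start

_start:
    xor     ebx, ebx                ; ebx = 0: query the current break
    mov     eax, sys_brk
    int     80H
    mov     [Org_Break], eax        ; remember it so it can be restored

    lea     ebx, [eax + 5]          ; ask for 5 more bytes
    mov     eax, sys_brk
    int     80H
    cmp     eax, ebx
    jb      Exit                    ; break did not move: allocation failed

    mov     ecx, [Org_Break]        ; ecx (not edx) = start of the new memory
    mov     dword [ecx], "ABCD"
    mov     byte [ecx + 4], 10

    mov     edx, 5                  ; length
    mov     ebx, stdout
    mov     eax, sys_write
    int     80H

    mov     ebx, [Org_Break]        ; contents, not address: give the memory back
    mov     eax, sys_brk
    int     80H

Exit:
    xor     ebx, ebx
    mov     eax, sys_exit
    int     80H
```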

Yeah, I have two so far for NASM:
NASM - Linux Getting command line parameters
NASM - Linux Terminal Input/Output w/int 80H

Frank Kotler
« Reply #7 on: July 26, 2012, 09:08:34 AM »
Those are great! One minor nitpick... you consistently equate stderr to 3. Pretty sure stderr is 2.
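
For reference, the three standard descriptors as equates, with a tiny sketch that writes a message to stderr. The message text and the msg/msg_len labels are made up for the example.

```nasm
stdin       equ     0
stdout      equ     1
stderr      equ     2

sys_exit    equ     1
sys_write   equ     4

SECTION     .data
msg         db      "something went wrong", 10
msg_len     equ     $ - msg

SECTION     .text
global      _start

_start:
    mov     ecx, msg                ; buffer
    mov     edx, msg_len            ; length
    mov     ebx, stderr             ; file descriptor 2
    mov     eax, sys_write
    int     80H

    mov     ebx, 1                  ; non-zero exit status
    mov     eax, sys_exit
    int     80H
```

Running it with 2>/dev/null in the shell should hide the message, which is a quick way to confirm it really went to stderr.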

A guy named Jeff Owens wrote a huge package of stuff for Linux/Nasm. An IDE, an extensive library, documentation... tons of stuff - I'm not sure what all of it is! Jeff has abandoned the project - discouraged that no one was using it, I think. One part of it that I find extremely useful is "asmref". Asm-friendly descriptions of all the system calls - even example code for 'em! Plus kernel symbols, structures, etc. If you use his IDE, it's available as pop-up help, but it can be used freestanding, too. A great resource, IMHO!

http://home.myfairpoint.net/fbkotler/asmtools-0.9.69.tar.gz

I'm not sure all of this is up-to-date - some (all?) of it may be available at SourceForge, too. This is what I've got. Jeff's page has disappeared, so this may be "it".

Another good resource is Konstantin Boldyshev's page at http://asm.sf.net (formerly linuxassembly.org, but that one's dead). Konstantin's looking for a new maintainer, too. Sigh.

Where's Betov with that rebirth? :)

Best,
Frank


Gunner
« Reply #8 on: July 26, 2012, 11:34:37 PM »
Betov?  He wrote RosAsm, right?  A person named guga seems to be doing something with it now.  Or do you mean ReactOS?

Bryant Keller already brought the stderr mistake to my attention.  After tomorrow, I have two weeks off, so plenty of time to fix typos and add more tutorials.