Topic: What prevents your program from writing to memory that doesn't belong to it?

hhh3h

  • Jr. Member
I was told that programs are allocated a certain amount of memory by the operating system, in which they can read and write.  But what happens if they try to access memory beyond that range?

Does some system monitor memory access to ensure that you don't write to another area of memory?  And does that system limit performance since it has to check every time you access the heap to ensure it's within your range?

Rob Neff

  • Forum Moderator
  • Full Member
Does some system monitor memory access to ensure that you don't write to another area of memory?

Yes, that is the job of the operating system, working together with the CPU's memory-management hardware.  Once the OS has allocated memory and assigned it to your application, your application can read/write that memory without a performance hit, because the CPU checks each access in hardware as part of address translation.  When you attempt to access memory outside the range mapped for your process, the CPU raises a fault; the operating system traps that fault and, by default, kills your app.  This is true for all protected-mode operating systems - basically all 32/64-bit operating systems available today.

FYI: This protection is closely tied to, and synchronized with, the memory paging algorithm used by the OS - a whole 'nother topic. :)
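
To watch the fault-and-kill behaviour described above without losing the whole program, one option is to do the risky access in a child process and let the parent report what happened. This is only a minimal sketch, assuming a POSIX system (fork/waitpid); the helper name probe_write is made up for this illustration, and the signal reported for a bad address is typically SIGSEGV but is ultimately up to the OS.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: try to write one byte at addr in a child process.
 * Returns 0 if the write succeeded, the number of the signal that killed
 * the child (typically SIGSEGV) if it faulted, or -1 on internal error. */
int probe_write(void *addr)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: volatile keeps the compiler from dropping the store. */
        volatile char *p = (volatile char *)addr;
        *p = 1;
        _exit(0);   /* only reached if the write was allowed */
    }

    /* Parent: find out how the child ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);   /* the kernel killed the child */
    return 0;
}
```

Calling probe_write() on the address of an ordinary local variable should return 0, while calling it with NULL should return a signal number. Note that the write lands in the child's copy of memory, so the parent's variable is left untouched.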

hhh3h

  • Jr. Member
Ok that makes sense, but when the OS gives memory to your application, does it give it all at once, or does it only give memory a little bit at a time (like when you dimension/malloc memory for a specific variable)?

The reason I ask is I am curious about out of bounds exceptions (like if you were to access index 5 of an array that only has indices 0-4).  Are they trapped automatically by the OS monitoring which addresses are being accessed?  Or do you have to manually ensure that each index accessed is indeed within the bounds of an array?

Rob Neff

  • Forum Moderator
  • Full Member
Ok that makes sense, but when the OS gives memory to your application, does it give it all at once, or does it only give memory a little bit at a time (like when you dimension/malloc memory for a specific variable)?

When the OS loads your program you have a default area called the heap - memory "reserved" for your application to use.  By reserved I mean the OS will let you allocate from it up to its maximum size.  Only once you actually call malloc() does that memory become accessible to your program.

When you define local variables - i.e. those that will reside on the stack - those are instantly available to you (up to your program's default stack size).  These sizes can be specified via command line parameters to the linker.  I recommend you not use those parameters unless you have a very specific need to do so.

The reason I ask is I am curious about out of bounds exceptions (like if you were to access index 5 of an array that only has indices 0-4).  Are they trapped automatically by the OS monitoring which addresses are being accessed?  Or do you have to manually ensure that each index accessed is indeed within the bounds of an array?

As a general rule: Always always always test the index to ensure you do not exceed an array boundary.

For arrays allocated on the stack, an out-of-bounds write will introduce very subtle bugs: it will overwrite either other variables or things such as return addresses.  These will not necessarily cause a fault on their own!

For arrays allocated from the heap - you may have contiguous pages of memory allocated to your program, where you might once again overwrite other variables used by your program.  If by chance you just keep writing and writing to an array way past its end boundary, you will encounter the fault only once your buggy app attempts to read/write memory it doesn't have rights to.

Note that in protected mode the operating system will ensure you do not overwrite any system memory owned by the OS itself.  Thus protected mode really means protecting the OS from YOU. :)
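
The "always test the index" rule above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not anything from a real library: the struct simulates a stack frame in which another variable sits directly after the array (no padding, which holds on common ABIs), and the names frame, unchecked_store and checked_store are hypothetical. The unchecked version writes through the raw bytes of the struct, so the overflow stays inside the program's own memory; no fault is raised and the neighbour is silently clobbered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_LEN 5

/* Simulated frame: 'neighbour' lives directly after 'array' in memory. */
struct frame {
    int array[ARRAY_LEN];   /* valid indices: 0-4 */
    int neighbour;          /* an unrelated variable */
};

/* No bounds check on the array: computes the byte offset the way raw
 * indexing would and writes there. The assert only keeps this demo
 * inside *f so that its behaviour stays well defined. */
void unchecked_store(struct frame *f, size_t index, int value)
{
    assert(index < sizeof(struct frame) / sizeof(int));
    memcpy((char *)f + index * sizeof(int), &value, sizeof value);
}

/* With a bounds check: refuses any index outside 0-4. size_t is
 * unsigned, so a "negative" index shows up as a huge value and is
 * rejected as well. Returns 0 on success, -1 if out of bounds. */
int checked_store(struct frame *f, size_t index, int value)
{
    if (index >= ARRAY_LEN)
        return -1;
    f->array[index] = value;
    return 0;
}
```

With a frame whose neighbour holds 1234, checked_store(&f, 5, 9000) returns -1 and leaves neighbour alone, whereas unchecked_store(&f, 5, 9000) "succeeds" and neighbour becomes 9000, with no complaint from the OS.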

georgelappies

  • Jr. Member
Cool, interesting thread. But is there no way to get around this? How can one program alter the execution of another program for instance by replacing a call to a function with its code instead of the original program's code?

TightCoderEx

  • Full Member
with its code instead of the original program's code?

The first thing that comes to my mind is: why would you want to do this?  In assembly, there are very few limitations on what you can do, and therefore the combinations and permutations of such an exercise are far too numerous to list.  Maybe give us some specifics of what you need to do, but this does sound a lot like wanting to know how to write a virus.

Frank Kotler

  • NASM Developer
  • Hero Member
Nobody here wants to write a virus. If anyone does, take it elsewhere - "how to" write malware is absolutely forbidden here!

However, how to write good "secure" code ought to be on topic - I don't think there's any objection to saying "don't use gets()"... and don't write an asm equivalent to gets()!
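
For the "don't use gets()" advice, the usual C alternative is fgets(), which is told how big the buffer is and so cannot run past its end. A minimal sketch follows; the wrapper name read_line is hypothetical, and stripping the trailing newline is just a convenience chosen for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from 'in' into buf, never writing
 * more than 'size' bytes (including the terminating '\0').
 * Returns buf on success, or NULL on end-of-file, error, or size == 0. */
char *read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL);
    if (buf == NULL || size == 0)
        return NULL;

    /* fgets() takes an int, so clamp very large sizes. */
    int limit = size > INT_MAX ? INT_MAX : (int)size;
    if (fgets(buf, limit, in) == NULL)
        return NULL;

    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if any */
    return buf;
}
```

If the input line is longer than the buffer, the extra characters are not lost and nothing is overwritten; they simply stay in the stream and are returned by the next call.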

How all of this "magic" works is an interesting topic for discussion, but probably too long and complex to cover fully here. It's a mixture of (OS) software and the hardware capabilities of the 386 and later. Rob has mentioned "paging". Other topics to learn more about might include "privilege levels" (or "rings": "user" programs run in "ring 3" and only the kernel should have access to "ring zero"; if you can "get ring zero" you can do anything, so it's a bug in the OS if a user program can do this), "virtual memory" (related to "paging"), and perhaps "segmentation"... Besides the Intel/AMD manuals, the OSdev guys probably have a lot of info on this.

Best,
Frank


Cyrill Gorcunov

  • NASM Developer
  • Full Member
Cool, interesting thread. But is there no way to get around this? How can one program alter the execution of another program for instance by replacing a call to a function with its code instead of the original program's code?

Usually the OS provides some way for legitimate programs to patch other programs if needed.
I have no idea how Windows deals with it, but on Linux you can use ptrace() for that.
Still, your program needs sufficient rights, or the OS will simply refuse your attempts.

http://en.wikipedia.org/wiki/Ptrace
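
As a debugger-style illustration of the point about rights, here is a minimal sketch, assuming Linux with glibc, in which a parent changes a variable in its own child. The child explicitly grants permission with PTRACE_TRACEME; without that (or suitable privileges) the kernel would refuse. The names shared_value and patch_child_value are made up for this example, and returning the result through the exit status is just a convenient way to keep the demo short.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copied into the child by fork(); parent and child each get their own. */
static volatile long shared_value = 7;

/* Hypothetical demo: overwrite the child's copy of shared_value via
 * ptrace and return the value the child then reports in its exit status.
 * Returns -1 if tracing is not permitted or anything else fails. */
int patch_child_value(long new_value)
{
    assert(new_value >= 0 && new_value <= 127);   /* must fit exit status */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: allow the parent to trace us, then stop and wait. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(255);
        raise(SIGSTOP);
        _exit((int)shared_value);   /* runs after the parent resumes us */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;   /* child never stopped: tracing was refused */

    /* Parent: read the old word, overwrite it in the child, resume. */
    errno = 0;
    long old = ptrace(PTRACE_PEEKDATA, pid, (void *)&shared_value, NULL);
    if ((old == -1 && errno != 0) ||
        ptrace(PTRACE_POKEDATA, pid, (void *)&shared_value,
               (void *)new_value) == -1 ||
        ptrace(PTRACE_CONT, pid, NULL, NULL) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The parent's own shared_value stays at 7 throughout; only the child's copy is changed, because the two processes have separate address spaces.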

TightCoderEx

  • Full Member
I have no idea how Windows deals with it, but on Linux you can use ptrace() for that.

In Windows, the entry point of each API function begins with mov edi, edi (probably mov rdi, rdi on the 64-bit version). This facilitates hot patching, so Windows can apply an update without interrupting applications. That's not a big deal on desktops, but I suppose it's really handy on servers, so they don't have to shut down all the time.

Code: [Select]
        mov     edi, edi
is replaced with
Code: [Select]
        jmp     to_where_ever
to facilitate the patching mechanism.

hhh3h

  • Jr. Member
Well this is a very interesting read.  My main reason for posting this question was that in many high-level languages, if you declare an array as having a certain number of elements, and then you mistakenly try to read or write an element past that point, the program will throw an exception.  It makes debugging nice and easy.

So I was wondering if the above exception was caused by (1) the compiler itself adding these checks into the code, or (2) whether it was just a general error thrown by the OS, so as long as you trap it you'll be fine.

^ And my conclusion from the above is that #2 is false and not to be relied upon to avoid data corruption... because the OS will only throw said exception if you try to read/write to memory outside your entire program's allocation (it doesn't care if you read beyond the max size of a local array variable within your program's allocation).

^ And as for #1, I am still unsure.  Does the compiler add these checks in?  So for instance, if all you write is (pseudo-code):

Code: [Select]
// Assume you have declared an array called myArray with 4 elements (indices 0-3);
myArray[4] = 9000;

Does the compiler automatically add in checks like this?

Code: [Select]
Compare 4 to count of myArray;
If 4 is less than 0 or greater than 3, throw an error;
Otherwise, 4 is within bounds, so let myArray[4] = 9000;

Thanks
« Last Edit: May 15, 2012, 12:27:48 AM by hhh3h »

Bryant Keller

  • Forum Moderator
  • Full Member
Usually the OS provides some way for legitimate programs to patch other programs if needed.
I have no idea how Windows deals with it, but on Linux you can use ptrace() for that.
Still, your program needs sufficient rights, or the OS will simply refuse your attempts.

http://en.wikipedia.org/wiki/Ptrace

Windows has ReadProcessMemory and WriteProcessMemory for accessing remote processes. The downside to using these functions is that some antivirus products will flag any app that uses them as a virus.

Actually, it's a bit of both 1 & 2. If the stack is corrupted by a bounds violation, you overwrite things which shouldn't be overwritten (like return addresses). The exception you're referring to is a segmentation fault, which happens in response to you overwriting that return address: at the next 'ret' instruction, the code jumps to a memory location that the operating system says you don't have access to. Some compilers try to detect this corruption before your 'ret' throws you into the unknown by placing a "magic cookie" on the stack and doing a quick check to make sure the cookie hasn't been corrupted; if it has, the program is terminated instead of being allowed to return (this is a common type of buffer overflow exploit prevention).
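
The "magic cookie" idea can be imitated by hand to see how it works. The sketch below is a simulation under stated assumptions, not what any particular compiler emits: real implementations (for example GCC with -fstack-protector) generate the check themselves and use a value chosen at run time, whereas here the cookie is a fixed constant and the "frame" is an ordinary struct. The names guarded_frame, unsafe_copy and copy_and_check are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_LEN 8
#define COOKIE  0xC0FFEE42u

/* Simulated frame: the cookie sits between the buffer and whatever an
 * overflow would hit next (in a real frame, the return address). */
struct guarded_frame {
    char     buf[BUF_LEN];
    uint32_t cookie;
};

/* Deliberately unsafe copy that ignores BUF_LEN, like gets() or strcpy().
 * It writes through the raw bytes of the frame; the assert only keeps
 * the demo inside the struct so its behaviour stays well defined. */
static void unsafe_copy(struct guarded_frame *f, const char *src, size_t n)
{
    assert(n <= sizeof(struct guarded_frame));
    memcpy((char *)f, src, n);
}

/* Returns 0 if the cookie survived the copy, -1 if it was overwritten. */
int copy_and_check(const char *src, size_t n)
{
    struct guarded_frame f;
    f.cookie = COOKIE;                    /* "prologue": plant the cookie */
    unsafe_copy(&f, src, n);
    return f.cookie == COOKIE ? 0 : -1;   /* "epilogue": verify it */
}
```

A copy of up to 8 bytes leaves the cookie intact and returns 0; a 12-byte copy of 'A' characters runs over it and returns -1, which is the point at which a real program would abort rather than execute 'ret'.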

About Bryant Keller
bkeller@about.me

Cyrill Gorcunov

  • NASM Developer
  • Full Member
In Windows, the entry point of each API function begins with mov edi, edi.

You miss the point. What you're pointing to is the prologue of an entry point, but before you can patch another program's memory, you need to attach to the other process's address space, and that is what ptrace() is about. (On Windows, I think one needs to obtain a process handle first, and if your program doesn't have enough privileges, you simply can't patch another program's memory.) Once you're attached, you can do whatever you want: patch prologues, or completely overwrite the program's code segment.

hhh3h

  • Jr. Member
Thank you very much for the info and insight, you guys.