Author Topic: Character Pointers - Total Noob

Offline pprocacci

  • Jr. Member
  • *
  • Posts: 11
Character Pointers - Total Noob
« on: April 07, 2010, 08:11:50 AM »
Hello all,

I'm new to assembly, and Frank was very helpful with my initial posting to the nasm-users mailing list, pointing me in this direction for further information.  All have been very helpful.  However, in continuing my learning experience, I am having a really rough time understanding how to write to memory through an address passed to a subroutine on the stack.  I have a very brief example below; if someone cares to point out specifically why it doesn't work, please let me know.  I'm assuming it has to do with alignment or something, but I'm not well versed and not sure that is the proper terminology.  Without further ado, I am trying to mimic the bzero C library function available on POSIX-compliant systems.  This is for self-educational purposes only, to help me understand the basics.  I understand the C language quite well, as it's what I'm most familiar with, but would like to explore asm in more depth.  Thanks in advance!

# C Prototype
void bzero(char *addr, unsigned int);

#############################
section .text
global bzero

bzero:
        push ebp                ; Save the caller's frame pointer
        mov ebp, esp            ; Stack Frame
        mov eax, [ebp+8]        ; Move address of pointer into eax
        mov ecx, [ebp+12]       ; Move length into ecx
.loop:
        cmp ecx, 0              ; Is length zero?
        je .done                ; Yep, jump to done.
        mov byte [eax], 0       ; Move a zero into the address referenced in eax.
        sub ecx, 1              ; Subtract 1 from length
        add eax, 1              ; Add one to address
        jmp .loop               ; Jump to the top of the loop
.done:
        pop ebp                 ; Restore Frame
        ret                     ; Return
#############################

The above results in a segmentation fault due to the following instruction: "mov byte [eax], 0".  The address stored in eax appears correct based on the debugging I've already done: after loading the address into eax, I returned from the routine and printed the value in C.  It matched the original pointer passed to the subroutine.  Knowing this, I am nearly certain the problem is in fact "trying to move a zero into the address referenced in eax".

I've been scouring Google for NASM examples of a subroutine that does this, but I've come up empty.  This question is trivial, and I am hoping the answer is quite trivial as well.

I appreciate any help with my lack of asm knowledge, and thank in advance those who respond.

~Paul

Host: FreeBSD 7.2 (32bit) Pentium 3
nasm -ggnu -f elf -o bzero.o bzero.s
gcc -g -o test test.c bzero.o
« Last Edit: April 07, 2010, 04:45:25 PM by pprocacci »
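
For readers coming from C, the loop in the listing above corresponds roughly to the following. This is only a sketch for comparison: the name my_bzero is an assumption, chosen so it does not clash with the C library's own bzero, and the comments map each statement to the instruction it mirrors.

```c
/* Hypothetical C counterpart of the assembly routine; my_bzero is an
 * illustrative name, not part of any library. */
void my_bzero(char *addr, unsigned int len)
{
    while (len != 0) {  /* cmp ecx, 0 / je .done  */
        *addr = 0;      /* mov byte [eax], 0      */
        len -= 1;       /* sub ecx, 1             */
        addr += 1;      /* add eax, 1             */
    }                   /* jmp .loop              */
}
```

Like the assembly version, this only works when addr points at memory the program is allowed to write.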

Offline Frank Kotler

  • NASM Developer
  • Hero Member
  • *****
  • Posts: 2667
  • Country: us
Re: Character Pointers - Total Noob
« Reply #1 on: April 07, 2010, 04:55:43 PM »
Hi Paul,

Your bzero "works for me". I'm calling it like so (since I don't "do C" much):

Code:
; nasm -f elf callbzero.asm
; ld -o callbzero callbzero.o bzero.o

global _start
extern bzero

section .bss
    buffer resb 10

section .text
_start:
    nop
    push 10
    push buffer
    call bzero
    add esp, 4 * 2
   
    mov eax, 1
    int 80h

How are you calling it to get the segfault?

If the number in eax were "right", it would not segfault. Unless it's a bug in your CPU - I haven't found one yet! :)

Best,
Frank
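
A rough C counterpart of the caller above, for anyone who would rather test from C. It is a sketch under assumptions: zero_bytes is a hypothetical stand-in for the assembly routine so the example compiles without bzero.o, and the static array plays the role of "buffer resb 10" (storage without an initializer typically lands in .bss, which is writable).

```c
#include <string.h>  /* memset, used only to dirty the buffer first */

static char buffer[10];  /* no initializer: typically placed in .bss */

/* Hypothetical stand-in for the assembly routine. */
static void zero_bytes(char *addr, unsigned int len)
{
    while (len--)
        *addr++ = 0;
}

/* Returns the number of nonzero bytes left in buffer after zeroing it. */
int zero_static_buffer(void)
{
    unsigned int i;
    int nonzero = 0;

    memset(buffer, 'x', sizeof buffer);               /* make the effect visible */
    zero_bytes(buffer, (unsigned int)sizeof buffer);  /* push 10 / push buffer / call */
    for (i = 0; i < sizeof buffer; i++)
        if (buffer[i] != 0)
            nonzero++;
    return nonzero;
}
```

Because the buffer is writable, the call completes and zero_static_buffer reports no nonzero bytes.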


pprocacci
« Reply #2 on: April 07, 2010, 05:21:05 PM »
Hey Frank.  Thanks for the follow-up.  Here is the C program that I've written to call my test subroutine:

############################
extern void bzero();

int main(void){
  char *testing = "This is a test";

  bzero(testing,1);
  printf("%s\n", testing);
  _exit(0);
}
############################

This should produce no visible output (just a newline), since the first character should be a zero.  Instead, it results in a segmentation fault as described above.  With a slight modification to my assembly routine, I can return the pointer that was passed in:

############################
section .text
global byte_zero

byte_zero:
        push   ebp              ;
        mov    ebp, esp         ; Stack Frame
        mov eax, [ebp+8]        ; Move address of pointer into eax
        pop ebp
        ret
############################

This, coupled with a small modification to my C source, lets me compare the pointer passed with the pointer returned:

############################
extern char *bzero();

int main(void){
  char *testing = "This is a test";

  printf("%p %p\n", testing, bzero(testing,1));
  _exit(0);
}
############################

If I run this:
nat# ./test
0x8048525 0x8048525

And here is a small debugging session:
nat# gdb ./test
<--snip-->
(gdb) b bzero
Breakpoint 1 at 0x8048493: file bzero.s, line 5.
Starting program: /root/tmp/test

Breakpoint 1, bzero () at bzero.s:7
7               mov eax, [ebp+8]        ; Move address of pointer into eax
Current language:  auto; currently asm
(gdb) s
8               mov ecx, [ebp+12]       ; Move length into ecx
(gdb) x/b 0x8048555
0x8048555 <_fini+97>:   0x54
(gdb) x/b 0x8048556
0x8048556 <_fini+98>:   0x68
<--snip-->
Program exited normally.
(gdb)
###############################################

I just found out something interesting as well.  While running this through the debugger, the program _did_ complete without a segfault!  It's only when I run it without the debugger that I run into problems.  Does this sound like anything you've run into before?

Now, again, when I run without debugging, I get a core file.  If I examine where it died:

###############################################
nat# gdb ./test test.core
<--snip-->
#0  bzero.loop () at bzero.s:12
12              mov byte [eax], 0
###############################################

Weird weird weird!

Also, just a brief mention of this, I ran your callbzero and it worked!  Now I'm really confused!

~Paul
« Last Edit: April 07, 2010, 05:34:42 PM by pprocacci »

Frank Kotler
« Reply #3 on: April 07, 2010, 05:56:04 PM »
The pointer 0x8048525 is the tipoff! I should have gone on to explain that for eax to be "good", it would have to point to writable memory, as well as "valid" memory. 0x8048xxx is "read-only" memory - "section .text" lives there, and apparently C is putting your string there, too. In a "small" program, writable memory - "section .data" and "section .bss" would start at 0x8049xxx, with the "xxx" being the size of the code (plus header), "rounded up".

Apparently, you're trying to write to a "constant string"(?) placed in read-only memory. If I moved "buffer" into "section .text", or put it into "section .rdata" (is it rdata or rodata? I never use it, so I'd have to look it up), my code would do the same thing. I don't know how you persuade C that this is a "non-constant" string... "static"? "volatile"? Does the "regular" bzero work on this string? In any case, I'm pretty sure it's a "C problem", your assembly language is fine!

Best,
Frank

P.S. http://www.nasm.us/doc/nasmdoc7.html#section-7.9.2 - It is "section .rodata" - that's what you don't want!
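
One rough way to see this layout from the C side is to print where a string literal, an initialized array, and an uninitialized array end up. This is a sketch only: the names in_data, in_bss, and print_regions are assumptions, the section placement in the comments is what toolchains typically do rather than a guarantee, and the addresses printed on a modern system will generally not look like 0x8048xxx.

```c
#include <stdio.h>

static char in_data[] = "initialized";  /* typically .data (writable) */
static char in_bss[16];                 /* typically .bss  (writable) */

/* Prints where each object lives; the addresses are platform-specific.
 * Returns 1 if the literal is distinct from both writable arrays. */
int print_regions(void)
{
    const char *literal = "This is a test";  /* typically .rodata (read-only) */

    printf("literal: %p\n", (const void *)literal);
    printf(".data:   %p\n", (void *)in_data);
    printf(".bss:    %p\n", (void *)in_bss);
    return literal != in_data && literal != in_bss;
}
```

If the literal's address sits next to the code rather than next to the two arrays, writing through it is likely to fault in the same way.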


pprocacci
« Reply #4 on: April 07, 2010, 06:24:09 PM »
Frank, you nailed it!

############################
        .file   "test.c"
        .section        .rodata
.LC0:
        .string "This is a test"
############################

As far as convincing gcc that the string should be "writable", I honestly don't know, as I've never had to convince gcc of such a thing.  It has generally always "just worked".  I won't go any further debugging this, as it's clear that what you stated is spot on, and a C discussion belongs elsewhere.

Again, thank you for your help!

~Paul

Offline Bryant Keller

  • Forum Moderator
  • Full Member
  • *****
  • Posts: 360
  • Country: us
    • About Bryant Keller
Re: Character Pointers - Total Noob
« Reply #5 on: April 08, 2010, 01:55:31 PM »
This has been a gripe of mine ever since the release of gcc4. The GCC developers dropped support for the -fwritable-strings command line argument in newer versions. You can, however, make writable C strings without doing any section hacks or arsing with gas code. Instead of using char * label = "literal"; use the alternate convention of char label[] = "literal";. This second form is still writable.
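
A minimal sketch of the array form, assuming nothing beyond standard C. The function name zero_first_byte_of_array is hypothetical; the point is only that the array declaration gives the program its own writable copy of the characters.

```c
#include <string.h>

/* Returns the length of the string after zeroing its first byte. */
size_t zero_first_byte_of_array(void)
{
    char testing[] = "This is a test";  /* array: a private, writable copy */

    testing[0] = '\0';                  /* fine: this storage belongs to us */
    /* char *p = "This is a test"; p[0] = '\0';  would be undefined behavior,
     * because p would point at the literal itself. */
    return strlen(testing);
}
```

With the first byte zeroed, the string is empty, which is the result the original test program was expecting to see.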


Frank Kotler
« Reply #6 on: April 08, 2010, 04:25:44 PM »
I hoped there'd be someone around here who knew that. Thanks, Bryant!

Funny coincidence, the last problem I dealt with that was "your subroutine looks okay, show us the calling code" involved "[]", too - "push size1" vs "push [size1]", in that case. Important, those "Republican parentheses"! :)
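
For the C-minded, the difference between the two pushes can be sketched like this. In NASM, a bare symbol such as size1 is the variable's address, while brackets (with a size, as in "push dword [size1]") give its contents. The function names below are assumptions made up for the illustration.

```c
#include <stdint.h>

static uint32_t size1 = 10;

/* Analogue of NASM "push size1": the address of the variable. */
uintptr_t arg_as_address(void)
{
    return (uintptr_t)&size1;
}

/* Analogue of NASM "push dword [size1]": the contents of the variable. */
uint32_t arg_as_contents(void)
{
    return size1;
}
```

A routine expecting a length gets 10 from the second form and some large address from the first, which is why the calling code matters as much as the subroutine.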

Some C discussions don't belong here. Whether "void main" is legal is probably not productive to discuss (although what it would mean might be). But if it involves "how do I interface my asm with it", I think it's "on topic" to discuss C's little "ways". After all, one of the reasons to learn asm is to better understand what HLLs are doing, no?

Best,
Frank


pprocacci
« Reply #7 on: April 08, 2010, 06:12:02 PM »
Bryant:
As far as using label[] as opposed to *label, I actually figured that out as well after posting.  Like you mentioned, I did remember seeing -fwritable-strings a long, long time ago, but when I looked for it in my docs (`man gcc`), it wasn't there.  I only very recently started using gcc 4.x; I had been using gcc 2.x for the longest time on an ancient FreeBSD machine (firewall/router).  In fact, this is probably the first time I tried compiling something I had written since the upgrade.  This explains a lot.
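
When the string is not known at declaration time, or the array form is inconvenient, another option is to copy the literal into heap memory and modify the copy. The sketch below assumes only the standard library; writable_copy is a hypothetical helper name.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd, writable copy of s, or NULL if allocation fails.
 * The caller is responsible for calling free() on the result. */
char *writable_copy(const char *s)
{
    size_t n = strlen(s) + 1;  /* include the terminating NUL */
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```

The returned pointer can be handed to a routine like the one in this thread without faulting, since heap memory is writable.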

In regards to the C discussion, I suppose it could be useful for beginners like myself coming from a C background to find out about the little tidbits that we take for granted.  Most lists that I am familiar with generally like to keep their discussions to the reason the forum was created in the first place.  (i.e. it's an asm forum, keep it that way!)   Given this specific case though, I think it is important to discuss why it wasn't working.  I was trying to avoid bringing a C discussion into a "nasm forum".  This isn't necessarily a C problem though.  GCC is to blame!

I'll quit while I'm ahead, good talking to both of you!

~Paul