Author Topic: How do I do a 16bit copy from source? (Read 40544 times)

ben321 · « **on:** March 24, 2015, 07:18:31 AM »

I have a 16bit program (com file or boot program) that after the last byte of code contains image/picture data. I need to copy that picture data from the source file (or disk) into ram. In a previous thread, I managed (with the help of others) to figure out how to write data into the VGA video-ram. But the next part is getting some data from the source. I was looking at using REP MOVSB which uses the DI register to specify the first destination address, the SI register to specify the first source address, and the CX register to specify how many bytes to copy. Unfortunately from what I've read, while DI is based in the DS (data segment) where it needs to be, the SI is based in the ES (extended segment) which is where it does NOT need to be for this particular operation. I'm pretty sure that my program's code (and the all-important image data that immediately follows the program's code) is NOT stored in the ES, but rather the CS (code segment). I need to know how to direct the SI source register to be based in the CS, not the ES (if this is even possible), so that I can load up the picture data from my bootable disk image or COM file, to be displayed on the screen.

My info source for the fact that the SI register operates in the ES segment of memory, is this web page http://faydoc.tripod.com/cpu/movsb.htm

Frank Kotler · « **Reply #1 on:** March 24, 2015, 10:41:42 AM »

But that's not what that web page says. I thought for a minute that you might have gotten ahold of the old Nasm v0.98 manual, which made exactly that mistake (I was one of several who fixed that and other mistakes in it). movsb moves from ds:si to es:di. So what you need to do is set es to your video memory at 0xA000, just like you did with ds in your last example.

Now ds... ds wants to point to your data. You say your data is right after your code. In a DOS .com file, everything's in one segment and dos sets your segregs to that segment, so you're all set. With 64000 bytes of data (for the whole screen), you don't have much room for code, but you can make it. A bootsector, however, is 512 bytes. After your code, that doesn't leave much room for picture. That's all that's going to be loaded by "real hardware", and presumably by your VM... unless you arrange to load more. In that case, you want to point ds:si to where you loaded it.

Best,
Frank
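
For reference, here is roughly what rep movsb does, written out as an ordinary loop. This is only an illustrative sketch: it assumes the direction flag is clear (as after cld), and unlike the real instruction it clobbers AL and the flags. The labels copy_loop and copy_done are made up for the example.

```nasm
bits 16

; Assumes DS:SI = source, ES:DI = destination, CX = byte count.
copy_loop:
    jcxz copy_done      ; stop when CX reaches zero
    mov al, [ds:si]     ; read one byte from the SOURCE, DS:SI
    mov [es:di], al     ; write it to the DESTINATION, ES:DI
    inc si
    inc di
    dec cx
    jmp copy_loop
copy_done:
```

The point is the direction of the copy: the source is always addressed through DS (unless overridden) and the destination always through ES.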

z80a · « **Reply #2 on:** March 24, 2015, 04:40:21 PM »

<humor, but true> In the beginning, the gods (IBM) sent some minions to Boca Raton to create their PC, which was based on a tester. There's a lot of history that explains where some of the later "quirks" came from, based on decisions made there. But IBM also released technical manuals with complete schematics AND ROM listings (POST & BIOS) for both the PC and AT. If you are fortunate enough to own them, or can get a copy, the BIOS gives you access to disk, video, keyboard, etc. You know, the Basic Input Output System. You can also find them (int access calls) online.

I have found that all the software interrupt access points still seem to support the original calls (plus later documented & undocumented ones). While the ROM (which most just refer to as the BIOS) will load only the first sector, your boot program can, after that, load any sectors you wish via int 13h. This is before an OS (DOS, for example) is loaded, so no MS-DOS calls (int 21h & company, or 32 bit Linux int 80h, 64 bit syscall) are available to you. But all of the BIOS is.

I showed you how you don't have to use DS, you can load ES, and so, as Frank stated, you could use movsb. In reality, I have found you can run 32 bit instructions (make sure you enable them in the assembler) and just drop to 16 bit instructions where needed (BIOS compatible calls). Again, look up the GRDB.EXE debugger (it supports 32 bit); you can easily step through code with it. As I recall, in Windows 7 a patch might be needed; Linux & PC-BSD don't need one, as it runs just fine in DOSBox. You can trace over actual hardware calls where needed to debug, before committing to a floppy or USB stick.

ben321 · « **Reply #3 on:** March 24, 2015, 08:26:33 PM »

I must have misread something then.

Also how do I set cs:si? That is, how do I set the si (source index) to work in the cs (code segment)? By default the source index works in the ds (data segment) (the ds:si configuration). This is to allow transferring data from the data segment to the extended segment. But in this case I need to configure the CODE SEGMENT to transfer to the extended segment. The configuration I need for the si is this program is the cs:si configuration. What commands do I need to do this?

z80a · « **Reply #4 on:** March 24, 2015, 10:11:16 PM »

In ".COM" files, and in every hand-off to a boot record I've seen, CS=DS=ES=SS. You DON'T change CS unless you've set up for the "branch being cut out from underneath you"! Remember, in segmented mode (real mode), CS:IP points to the next instruction. That is, CS shifted left 4 bits + IP equals the physical address of the next instruction. A 16 bit CS shifted left 4 bits plus a 16 bit IP gives a 20 bit address = 1M that the 8086 (8088) could address. Minus the ROM & video areas is where the 640K limit came from. If you look at a Master Boot Record, you will note they place data (the partition table) at the end of the 512 byte record.

You can PUSH CS on the stack and POP it into ES or DS if you wish them to be equal (again, on boot they should already be).
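
A minimal sketch of that push/pop trick, assuming the picture bytes sit right after the code under a label I've called picture (a hypothetical name, not from the original program). It is a fragment, not a complete program:

```nasm
bits 16

    push cs
    pop  ds             ; DS = CS (there is no "mov ds, cs" instruction)
    mov  si, picture    ; DS:SI now addresses the bytes that follow the code

    ; ... set ES:DI and CX, then "rep movsb", as usual ...

picture:                ; hypothetical label marking the start of the image data
    db 0
```

CS itself is never modified; DS is simply made to refer to the same memory, so the default DS:SI source addressing reaches the data.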

ben321 · « **Reply #5 on:** March 24, 2015, 10:53:21 PM »

But I don't want to change the definition of where CS is in memory. CS should stay exactly where it is. I want to change SI. I want to change SI so that it points to memory WITHIN CS. Normally SI points to data located in DS (data segment). But because my picture data is located right after the code in my COM file or bootable disk image, my picture data will not be in the DS segment, but rather in the CS segment. Therefore SI pointing to memory within DS is completely useless. I need to get my source index (SI) to point to my CS (code segment), or else my attempt to read the picture data that IMMEDIATELY follows the code will be useless. How do I get SI to point to memory within the CS segment?

Frank Kotler · « **Reply #6 on:** March 24, 2015, 10:58:03 PM »

Well...

Code: [Select]

cs rep movsb

would do it, but I'm not sure it's a good idea. I seem to recall reading that in some cases, if an interrupt occurs during the "rep", it can lose track of the cs override and go back to ds for the rest of the "rep". I don't know if that's just buggy hardware, but I know it "can" be a problem. What's the objection to just setting ds and es where you want 'em?
Warning: untested code ahead!

Code: [Select]

; for DOS
; nasm -f bin myprog.asm -o myprog.com
bits 16 ; the default, anyway

org 100h ; where dos WILL load us

section .text ; the default, anyway

mov ax, 13h ; BIOS video mode 13h: 320x200, 256 colours
int 10h

; dos will have cs=ds=es=ss= our one and only segment
; we want our "destination segment" to be VGA ram
mov ax, 0A000h
mov es, ax

mov si, mypic ; ds:si = source (ds is already our segment)
xor di, di ; es:di = A000:0000, the top-left pixel

mov cx, 64000 ; 320 * 200 bytes

cld ; make sure si/di count upwards
rep movsb

; wait for a key
xor ax, ax
int 16h

; politely restore text mode
mov ax, 3
int 10h

ret

mypic:
times 320 db 0
times 320 db 1
times 320 db 2
; etc. (200 rows of 320 bytes for a full screen)

For faster execution, if you're moving a multiple of four bytes, divide cx by four and do "rep movsd" (one of those 32-bit instructions that we can use in 16-bit code on a 386 or later, as z80a mentions).
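
A sketch of that variant, assuming a 386 or later CPU and that ds:si and es:di have already been set up as in the code above:

```nasm
bits 16
cpu 386                 ; movsd is not available before the 386

    cld                 ; copy upwards
    mov cx, 64000 / 4   ; 16000 dwords instead of 64000 bytes
    rep movsd           ; 4 bytes per iteration, DS:SI -> ES:DI
```

In 16-bit code NASM adds the operand-size prefix for movsd by itself, so nothing else needs to change.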

You may still be a little confused how segmented memory works when you mention the data being "in" cs. Data is in memory. It is where it is. A combination of a segment register and an offset either does or does not point to it - doesn't matter which segment register it is.

A bootsector gets loaded to 7C00h. Usually, you'd want "org 7C00h", and ds set to 0. But it can be done as "org 0" (the default, if you don't say) with ds set to 7C0h (they refer to the same memory). It's a good bet that your BIOS (or VM) jumps to 0:7C00h, but there are rumors that certain Compaq Presario models had a BIOS that jumped to 7C0h:0... so we shouldn't really "count on" the value in cs.
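
One common way to stop caring which of the two the BIOS used is a far jump at the very start, which loads cs with a known value. A minimal sketch, where the stack location just below 7C00h is my own assumption and the labels start and hang are only illustrative:

```nasm
bits 16
org 7C00h

    jmp 0:start         ; far jump: cs = 0 whether we arrived via 0:7C00h or 7C0h:0
start:
    xor ax, ax
    mov ds, ax          ; ds = 0 matches "org 7C00h"
    mov es, ax
    cli
    mov ss, ax
    mov sp, 7C00h       ; assumption: stack grows down from just below the boot sector
    sti
    cld

hang:
    hlt                 ; nothing else to do in this sketch
    jmp hang

    times 510-($-$$) db 0
    dw 0AA55h           ; boot signature the BIOS looks for
```

Assemble with "nasm -f bin"; the output is exactly 512 bytes.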

Your first example was all position-independent code (thanks to those relative jumps) and would have worked anywhere it was loaded. Now that you're trying to access data, you may need to "care more" about these issues...

Edit: I see z80a has posted. Segregs should be equal after dos loads a .com file. I wouldn't "count on" this being true when "any" BIOS boots us.

Edit2: Well I see you've posted too, Ben. See if this clears anything up. If not, we can try again...

Best,
Frank

ben321 · « **Reply #7 on:** March 24, 2015, 11:26:31 PM »

So by default ALL the segments refer to the same memory? So reading from DS (the default operation for MOVSB) will automatically be reading from CS as well, unless the location of DS has previously been changed in your code?

And just curious, how many bytes are in a segment? If segment location 0xA000, offset 0x0000 is the same as flat memory address 0xA0000, then it would appear that each segment is exactly 16 bytes in size. But that makes no sense, because if each segment was 16 bytes in size, then the maximum possible offset within a segment would be 0xF. But I know for a fact that the maximum offset within a segment is in fact 0xFFFF. The alternative would be that the segment number represented the most significant 2 bytes of a 4 byte address, and the offset represented the lower 2 bytes of the address. But I know that that is also not the case, because if it was segment 0xA000 for the VGA video memory would correspond to a flat address of 0xA0000000, which I know is NOT the address for video memory. I've read that the correct address for VGA video memory is actually 0xA0000.

It would seem that neither of the 2 obvious possible explanations for segment:offset notation are the correct explanation. So can you please tell me how Segment 0xA000 with Offset 0x0000 ends up representing a flat memory address of 0xA0000?

Frank Kotler · « **Reply #8 on:** March 25, 2015, 12:22:52 AM »

In real mode, the value in the segment register is multiplied by 16 (shifted left by 4) and added to the offset to form a linear address (which is the same as the physical address, since paging isn't enabled).

In protected mode - once we've set bit 0 of cr0 and made a far jump to reload cs - the rules change. The value in a segment register (called a "selector") is used as an index into an array of "descriptors" which describe various properties of the segment - "base", "limit", various "flags" - all arranged in an order that must make some "hardware sense" because it sure wasn't for the convenience of programmers! Now, the "base" gets added to the offset to form a linear address. Once "paging" is enabled, the high bits tell which "page table" to look in to find the physical address, the low bits give the offset into that physical "page" (4k usually, although the hardware also allows 4M pages, or 2M with PAE - selected by a bit in the page directory entry, not in the descriptor). The OS takes care of this - don't worry about it.

In any OS you're likely to encounter, the "base" is 0 and the "limit" is 4G-1, so all "selectors" refer to the entire 4G memory space (fs is likely an exception). ds:esi and es:edi point to the same memory. We're mostly pretty happy to forget about segmented memory.

But you're interested in real mode segmented memory, so a segment of 0xA000 multiplied by 16 is 0xA0000, added to an offset of 0x0000 is... 0xA0000... and there's your pixel!

While segments can start as close as 16 bytes apart, a "full segment" is 64k.

Best,
Frank
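
The real mode rule can be checked at assembly time. This is just a sketch: LINEAR is a macro name I made up, and the file produces no output bytes, it only fails to assemble if one of the checks is wrong.

```nasm
; linear address = segment * 16 + offset
%define LINEAR(seg,off) (((seg) << 4) + (off))

%if LINEAR(0xA000, 0x0000) != 0xA0000
  %error "A000:0000 should be linear A0000h"
%endif

%if LINEAR(0xA000, 0xFFFF) != 0xAFFFF
  %error "the last byte reachable from segment A000h should be AFFFFh"
%endif

%if LINEAR(0x07C0, 0x0000) != LINEAR(0x0000, 0x7C00)
  %error "07C0:0000 and 0000:7C00 should be the same byte"
%endif
```

So segment 0xA000 covers linear 0xA0000 through 0xAFFFF: a full 64k window that merely starts on a 16-byte boundary.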

ben321 · « **Reply #9 on:** March 25, 2015, 12:36:41 AM »

Ok. I understand some of that more now. But doesn't that mean that (at least in real mode) segments can overlap? If each segment index is multiplied by only 16, then wouldn't Segment 0x0001 + Offset 0x0000 refer to the same memory address as Segment 0x0000 + Offset 0x0010?

z80a · « **Reply #10 on:** March 25, 2015, 01:05:24 AM »

Yes, they most definitely can overlap.
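
A small DOS .com sketch that shows the overlap at run time: it reads the same byte through two different segment:offset pairs and compares them. The labels and messages are made up for the example, and it assumes DOS int 21h function 09h for printing.

```nasm
; nasm -f bin overlap.asm -o overlap.com
bits 16
org 100h

    mov ax, ds
    inc ax                    ; the next segment starts 16 bytes further on
    mov es, ax

    mov al, [marker]          ; DS:marker
    cmp al, [es:marker-16]    ; (DS+1):(marker-16) is the same linear address
    jne different

    mov dx, msg_same
    jmp print
different:
    mov dx, msg_diff
print:
    mov ah, 09h               ; DOS: print the $-terminated string at DS:DX
    int 21h
    ret

marker   db 42h
msg_same db "same byte", 13, 10, "$"
msg_diff db "different bytes", 13, 10, "$"
```

Since (DS+1)*16 + (marker-16) equals DS*16 + marker, both reads hit the same physical byte.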

z80a · « **Reply #11 on:** March 25, 2015, 01:10:45 AM »

There is also no memory protection, so you can have self-modifying code.

Frank Kotler · « **Reply #12 on:** March 25, 2015, 01:43:51 AM »

Yup. Makes it a real PITA to do arithmetic on. If my bootsector is loaded at 0x7C00 and I've got 64000 bytes of data to load, what segment should I use so that it won't overlap? I suppose in this case, you could load it straight to video memory at segment 0xA000... ?

Best,
Frank
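
One possible answer, worked out at assembly time: the first paragraph boundary after the boot sector. This is a sketch with made-up symbol names, and the 0x80000 ceiling is a deliberately conservative assumption about where free conventional memory ends.

```nasm
%assign BOOT_BASE 0x7C00                          ; where the BIOS put the boot sector
%assign BOOT_SIZE 512
%assign PIC_SIZE  64000                           ; 320 * 200 bytes

%assign LOAD_SEG  (BOOT_BASE + BOOT_SIZE) >> 4    ; first segment past the boot sector
%assign LOAD_END  (LOAD_SEG << 4) + PIC_SIZE      ; linear address just past the picture

%if LOAD_SEG != 0x07E0
  %error "expected the load segment to be 07E0h"
%endif

%if LOAD_END != 0x17800
  %error "expected the picture to end at linear 17800h"
%endif

%if LOAD_END > 0x80000
  %error "picture would run past the assumed end of free memory"
%endif
```

Loading at 07E0:0000 keeps the whole picture inside one 64k segment (offsets 0 through 0xF9FF) and clear of the code at 0x7C00.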

z80a · « **Reply #13 on:** March 25, 2015, 02:11:12 AM »

I'm assuming you're not messing with your hard drive's boot record, but with a floppy or USB stick. I further assume you have created a bitmap memory image of what you're trying to display, so that it can be loaded directly into video memory with the proper video mode enabled.
Why not write your program to use BIOS int 13h to read sectors? Use Unix 'dd' (there's a Windows version) to place your boot program in the first sector, and use 'dd' again to write your bitmap to the sectors your boot program reads.
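
A rough sketch of such a boot sector, using the classic int 13h function 02h (CHS read). The assumptions are mine: the data starts in sector 2 of a 1.44M floppy image, only the remaining 17 sectors of the first track are read in one call (many BIOSes will not read across a track boundary), and the buffer at 07E0:0000 is simply the first free memory after the boot sector. A full 64000-byte picture needs 125 sectors, so real code would loop over tracks and heads.

```nasm
bits 16
org 7C00h

SECTORS equ 17              ; assumption: rest of track 0 on an 18-sector floppy

start:
    xor ax, ax
    mov ds, ax              ; ds = 0 matches "org 7C00h"
    cli
    mov ss, ax
    mov sp, 7C00h           ; assumption: stack just below the boot sector
    sti

    mov [boot_drive], dl    ; the BIOS passes the boot drive number in DL

    mov ax, 07E0h
    mov es, ax
    xor bx, bx              ; es:bx = 07E0:0000, buffer for the data
    mov ah, 02h             ; int 13h: read sectors
    mov al, SECTORS         ; how many sectors
    mov ch, 0               ; cylinder 0
    mov cl, 2               ; sector 2 (sectors count from 1; 1 is this boot sector)
    mov dh, 0               ; head 0
    mov dl, [boot_drive]
    int 13h
    jc  hang                ; carry set = read error; this sketch just stops

    ; the data is now at 07E0:0000; point ds:si there before copying it anywhere

hang:
    hlt
    jmp hang

boot_drive db 0

    times 510-($-$$) db 0
    dw 0AA55h               ; boot signature
```

Whether a BIOS will read straight into video memory at A000:0000 depends on the hardware, so loading to ordinary RAM first and then using rep movsb is the more cautious route.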
