Code Gems #4

Code Gems Part 4

This text comes from IMPHOBIA Issue X - June 1995

* Demand paging on SVGA boards in PM *

(This is going to be deep protected mode system coding, be prepared...) It would be very nice to 'map' the video memory into the linear address space so that we could reach it as a one megabyte long array. Some cards support this; the rest do not: on those cards only the 'bank switching' routines give access to the entire video memory. Our goal is to reduce the number of bank switches as far as possible. Several techniques have been developed, but many of them share a big problem: the routine which determines whether a bank switch is necessary must run a great many times.

The next method solves this problem. It maps a 1MB long memory area to the video memory on any SVGA card, and a bank switch occurs only when necessary. It works in protected and flat virtual mode only, NOT in (flat) real mode. Essentially it's a kind of 'virtual memory' technique based on PAGING. Let's set up the 4k page table, reserving one megabyte above the highest physical memory address (let's say from 800000h to 8fffffh), and map it to a0000h in 64k steps:

Linear address|Physical addr|Notes
--------------+-------------+---------
     0-   fff |     0-   fff|This is
  1000-  1fff |  1000-  1fff|the very
  2000-  2fff |  2000-  2fff|normal
...           |             |mapping,
7ff000-7fffff |7ff000-7fffff|no diff.
--------------+-------------+---------
800000-800fff | a0000- a0fff|800000-
801000-801fff | a1000- a1fff|80ffff:
...           |             |mapped to
80f000-80ffff | af000- affff|a0000-
              |             |affff
--------------+-------------+---------
810000-810fff | a0000- a0fff|810000-
811000-811fff | a1000- a1fff|81ffff:
...           |             |mapped to
81f000-81ffff | af000- affff|a0000-
              |             |affff
--------------+-------------+---------
...           |             |
--------------+-------------+---------
8f0000-8f0fff | a0000- a0fff|8f0000-
...           |             |8fffff:
8ff000-8fffff | af000- affff|mapped to
              |             |a0000-
              |             |affff
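
The mapping in the table can be written down as a small C sketch. This is only an illustration under assumptions, not the original setup code: the names (map_linear, build_page_table) are made up for this example, the page table entries are treated as one flat array as if the page tables were laid out contiguously, and hooking the tables into the page directory is not shown. Only the 'present' and 'writable' bits of a page table entry are used.

```c
#include <stdint.h>

#define PAGE_SIZE     0x1000u     /* 4k pages */
#define PTE_PRESENT   0x001u      /* bit 0 of a page table entry */
#define PTE_WRITABLE  0x002u      /* bit 1 of a page table entry */
#define WINDOW_BASE   0x800000u   /* linear start of the 1MB window (from the text) */
#define WINDOW_SIZE   0x100000u   /* one megabyte */
#define VGA_BASE      0xA0000u    /* physical start of the video segment */
#define BANK_SIZE     0x10000u    /* 64k banks */

/* Physical address that a linear address maps to under the table above
   (hypothetical helper, named for this example only). */
uint32_t map_linear(uint32_t linear)
{
    if (linear >= WINDOW_BASE && linear < WINDOW_BASE + WINDOW_SIZE)
        return VGA_BASE + ((linear - WINDOW_BASE) % BANK_SIZE);
    return linear;                /* normal one-to-one mapping elsewhere */
}

/* Fill 'entries' page table entries, entry i describing linear page i.
   Every page is marked present here; 0x900 entries cover 0..8fffffh. */
void build_page_table(uint32_t *pte, uint32_t entries)
{
    for (uint32_t i = 0; i < entries; i++) {
        uint32_t linear = i * PAGE_SIZE;
        pte[i] = (map_linear(linear) & ~(PAGE_SIZE - 1u))
               | PTE_PRESENT | PTE_WRITABLE;
    }
}
```

With this sketch, build_page_table(table, 0x900) would produce the entry a0003h for linear page 810h, i.e. physical a0000h with the present and writable bits set.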

Great. Between 8 and 9 megabytes we can now address the a0000-affff segment sixteen times. Now comes the TRICK. Mark all 4k pages in 810000-8fffff as 'NOT PRESENT' and the pages in 800000-80ffff as 'PRESENT', and hook interrupt 0e (the 'page fault' exception). If a page fault occurs, it means that a bank switch is needed: mark the pages of the faulting 64k block as 'PRESENT' and the old ones as 'NOT PRESENT', do the bank switch, and return from the exception. (After changing page table entries the TLB has to be flushed, e.g. by reloading CR3, otherwise the CPU may keep using the old entries.)
The fault handler looks like this:

        push    eax edx

; Get page fault address (the CPU
; stores it in CR2, not CR3):
        mov     eax,cr2
    ; Subtract starting address
        sub     eax,800000h
    ; Put bank's number to AL
        shr     eax,16

; SVGA bankswitch
        mov     dx,svga_switch_port
        out     dx,al

; Mark pages present/absent
        (not too difficult to do :-)
; Bye
        pop     edx eax
    ; Discard the error code the CPU
    ; pushed (raw IDT handler assumed)
        add     esp,4
        iretd

This example assumes that the video memory can be browsed in 64k banks and the bank-switch is simple :-)
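
The 'mark pages present/absent' step left out of the handler could look roughly like the following C model. It is a sketch under the same assumptions as the text (64k banks, a simple bank register), not the original code: window_pte, init_window, handle_window_fault and svga_set_bank are names invented for this example, and svga_set_bank only records the bank number where a real handler would do the OUT to svga_switch_port.

```c
#include <stdint.h>

#define PTE_PRESENT     0x001u
#define PTE_WRITABLE    0x002u
#define WINDOW_BASE     0x800000u  /* linear start of the window (from the text) */
#define VGA_BASE        0xA0000u
#define BANK_SIZE       0x10000u
#define PAGES_PER_BANK  16u        /* 64k bank / 4k page */
#define NUM_BANKS       16u

/* The 256 page table entries covering 800000h-8fffffh (hypothetical layout). */
uint32_t window_pte[NUM_BANKS * PAGES_PER_BANK];
unsigned current_bank;
unsigned last_bank_written;        /* stands in for the real bank register */

/* Stand-in for "out dx,al" to the card's bank switch port. */
void svga_set_bank(unsigned bank) { last_bank_written = bank; }

/* Set or clear the present bit on the 16 pages of one bank. */
void set_bank_present(unsigned bank, int present)
{
    uint32_t *p = &window_pte[bank * PAGES_PER_BANK];
    for (unsigned i = 0; i < PAGES_PER_BANK; i++) {
        if (present) p[i] |= PTE_PRESENT;
        else         p[i] &= ~PTE_PRESENT;
    }
}

/* Map every 64k block to a0000h-affffh; only block 0 starts out present. */
void init_window(void)
{
    for (unsigned i = 0; i < NUM_BANKS * PAGES_PER_BANK; i++)
        window_pte[i] = (VGA_BASE + (i % PAGES_PER_BANK) * 0x1000u) | PTE_WRITABLE;
    set_bank_present(0, 1);
    current_bank = 0;
    svga_set_bank(0);
}

/* fault_addr is the faulting linear address (what the CPU leaves in CR2).
   Returns the bank that was switched in. */
unsigned handle_window_fault(uint32_t fault_addr)
{
    unsigned bank = (fault_addr - WINDOW_BASE) / BANK_SIZE;
    set_bank_present(current_bank, 0);   /* old block becomes NOT PRESENT */
    set_bank_present(bank, 1);           /* faulting block becomes PRESENT */
    svga_set_bank(bank);
    current_bank = bank;
    /* A real handler must flush the TLB here, e.g. by reloading CR3. */
    return bank;
}
```

A fault at 812345h, for instance, gives bank 1: the sixteen entries of block 0 lose their present bit and the sixteen entries of block 1 gain it.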

Now let's observe the problems...

1. Paging must be enabled. This causes some slowdown, but paging can be disabled while no video operations are in progress.

2. EMM compatibility. There is no problem with VCPI; the only difference is that paging must not be disabled during virtual-mode callbacks. The big problem is DPMI, which doesn't allow modifying the page table.

3. Reading the video memory. Some SVGA cards have separate write and read bank registers, but that makes no difference here: the scheme keeps a single 'present' window for both reads and writes, and the handler above does not look at whether the fault came from the read or the write side of a MOVS. So a MOVS whose source and destination lie in different banks will switch banks TWICE in a single instruction! This means that reading the video memory should be avoided wherever possible.

4. Words and doublewords written across bank boundaries. This is the roughest problem. If no special code is inserted to handle this case, the system will fall into an infinite exception cycle :-( Let's take a look at an example: a STOSD writes to 80fffe. It causes a page fault since the pages from 810000 on are not present. The fault handler enables the pages in 810000-81ffff, disables 800000-80ffff, and returns to the instruction which caused the exception, but then a new fault is generated immediately because the address 80fffe is now in an absent page...
Of course this can be avoided by
a) writing bytes only,
b) writing words to even addresses and dwords to dword boundaries.
If neither of these conditions can be satisfied, the fault handler must decode the instruction which caused the exception and emulate it. This can be pretty simple if only one or two kinds of instructions access the area in question. Of course it's enough to check only those instructions which caused a page fault at an offset above fffc within a bank:

        mov     eax,cr2
        cmp     ax,0fffch
        ja      possible_bank_override
normal_fault:
        ...
        iretd
; Check the instruction which caused
; the page fault (STOSD's code is AB)

possible_bank_override:
    ; Saved EIP lies just above the
    ; error code on the stack
        mov     edx,[esp+4]
        cmp     byte ptr[edx],0abh
        jne     normal_fault

; Now emulate a STOSD
        ...
        iretd
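
What 'emulate a STOSD' has to achieve can be modelled in C as below. This is a sketch under stated assumptions rather than the original routine: the card is simulated by an array of sixteen 64k banks, and straddles_bank, emulate_store_dword and svga_set_bank are names made up for this example. The store is done byte by byte in little-endian order, switching the bank when the 64k boundary is passed.

```c
#include <stdint.h>

#define WINDOW_BASE 0x800000u     /* linear start of the window (from the text) */
#define BANK_SIZE   0x10000u      /* 64k banks */
#define NUM_BANKS   16u

/* Simulated video memory: 16 banks behind one 64k window
   (a stand-in for the real card, for illustration only). */
uint8_t  vram[NUM_BANKS][BANK_SIZE];
unsigned current_bank;

/* Stand-in for the real bank switch (an OUT to the card). */
void svga_set_bank(unsigned bank) { current_bank = bank; }

/* Nonzero if an access of 'size' bytes at 'linear' crosses a bank boundary. */
int straddles_bank(uint32_t linear, uint32_t size)
{
    uint32_t offset = (linear - WINDOW_BASE) % BANK_SIZE;
    return offset + size > BANK_SIZE;
}

/* Emulate a dword store into the window.
   Returns 0 on success, -1 if the access leaves the 1MB window. */
int emulate_store_dword(uint32_t linear, uint32_t value)
{
    if (linear < WINDOW_BASE ||
        linear - WINDOW_BASE > NUM_BANKS * BANK_SIZE - 4u)
        return -1;

    for (unsigned i = 0; i < 4; i++) {
        uint32_t rel  = linear + i - WINDOW_BASE;
        unsigned bank = rel / BANK_SIZE;
        if (bank != current_bank)
            svga_set_bank(bank);          /* boundary passed: next bank */
        vram[bank][rel % BANK_SIZE] = (uint8_t)(value >> (8 * i));
    }
    return 0;
}
```

In a real handler the emulation would also have to advance EDI by four, move the saved EIP past the STOSD, and leave the page table consistent with the bank that ends up selected; those steps are not shown here.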

Ok. This is the end until the next issue. If you have questions, ideas, or anything else, feel free to contact me:

           Ervin / AbaddoN
              Ervin Toth
       48/A, Kiss Janos street
            1126 Budapest
               HUNGARY
           (+36)-1-201-9563
           ervin@sch.bme.hu