Operating Systems Development Series Bootloaders 4
This series is intended to demonstrate and teach operating system development from the ground up.
Introduction
Welcome! In the previous tutorial we talked about how to load and execute a sector. We also took a look at the Rings in assembly language, and a detailed look at the BIOS Parameter Block (BPB).
In this tutorial, we are going to use everything we learned to parse the FAT12 file system, and load our second stage loader by name.
This tutorial is going to have a lot of code in it. I will do my best to explain everything in detail. Also, this tutorial will have some math in it as well.
Ready?
The reason I clear interrupts (cli) is that, otherwise, interrupts will still execute and wake the CPU (hence the system is not halted) even when I wanted it to be. Execution then continues past the hlt instruction into whatever bytes follow it, which can cause problems. So, just having a hlt instruction (without cli) can end up Triple Faulting the CPU.
Because of this, I always end all of my demos with cli and hlt.
Filesystems - Theory
Yippee! It's time to talk about filesystems :)
A File System is nothing more than a specification. It helps create the concept of a "file" on a disk.
A file is a group of data that represents something. This data can be anything we want it to be. It all depends on how we interpret the data.
As you know, a sector is 512 bytes in size. A file is stored across these sectors on disk. If the file exceeds 512 bytes, we have to give it more sectors. Because not all files are an exact multiple of 512 bytes in size, we will need to fill in the rest of the bytes in the last sector (the ones that the file doesn't use). Kind of like what we did for our bootloader.
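To make the rounding concrete, here is a minimal sketch in C (run on your development machine, not in the bootloader). The function names sectors_for_file and padding_bytes are made up for this illustration; the only thing taken from the text is the 512 byte sector size.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole sectors needed to hold file_size bytes (rounds up). */
uint32_t sectors_for_file(uint32_t file_size, uint32_t bytes_per_sector)
{
    assert(bytes_per_sector != 0);
    return file_size / bytes_per_sector + (file_size % bytes_per_sector != 0);
}

/* Unused bytes at the end of the file's last sector (the padding). */
uint32_t padding_bytes(uint32_t file_size, uint32_t bytes_per_sector)
{
    uint32_t sectors = sectors_for_file(file_size, bytes_per_sector);
    return sectors * bytes_per_sector - file_size;
}
```

For example, a 1000 byte file with 512 byte sectors needs 2 sectors, and the last 24 bytes of the second sector are padding.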
In the FAT file systems, sectors are grouped into Clusters: a cluster is a fixed-size series of contiguous sectors, and a file is stored in one or more clusters. For example, our kernel will most likely span a lot of sectors. To load our kernel, we will need to load the clusters (the sectors) from where it is located.
If a file's clusters are not contiguous on disk, the file is said to be Fragmented. We will need to collect the different parts of the file.
There are a lot of different kinds of file systems. Some are widely used (like FAT12, FAT16, FAT32, NTFS, ext (Linux), and HFS (used in older Macs)); other filesystems are only used by specific companies for in-house use (like GFS, the Google File System).
A lot of OS developers also create their own versions of the FAT file systems (or even something completely new). These are usually not as good as the most common filesystems though (like FAT and NTFS).
Okay, so we know a little about file systems now. We are going to be using FAT12 for its simplicity. If we decide, we can always use a different one. :)
FAT12, as originally released, has no support for hierarchical directories. This means there is only one directory: the Root Directory.
Cluster addresses are only 12 bits long, which limits the maximum number of clusters to 4096.
Files are tracked in the FAT by 12 bit cluster addresses. The cluster address stored for a file represents the starting cluster of that file.
Because of the limited number of clusters, the maximum number of files possible is 4,077.
The disk size is stored only as a 16 bit count of sectors, limiting it to 32 MiB in size (with 512 byte sectors).
FAT12 uses the value "0x01" to identify partitions.
FAT16 has support for directories, and over 64,000 files as it uses a 16 bit cluster (file) address, as opposed to the 12 bit address of FAT12. However, FAT16 and FAT12 are very similar.
To make things simple, we are going to use FAT12. We might spruce things up with FAT16 (or even use FAT32) later :) (FAT32 is quite different than FAT12/16, so we might just use FAT16 later.)
http://www.brokenthorn.com/Resources/OSDev6.html 1/9
8/18/2015 Operating Systems Development Series
Boot Sector | Extra Reserved Sectors | File Allocation Table 1 | File Allocation Table 2 | Root Directory (FAT12/FAT16 Only) | Data Region containing files and directories
This is a typical formatted FAT12 disk, from the bootsector to the very last sector on disk.
Understanding this structure will be important when loading and searching for our file.
Note that there are 2 FATs on a disk. They are located *right after* the reserved sectors (or right after the bootsector, if there are no extra reserved sectors).
Also note: The Root Directory is right after all of the FATs. This means...
if we add up the number of sectors used by all of the FATs, and the reserved sectors, we can get the first sector of the Root Directory. By searching the Root Directory for a simple string (our filename), we can effectively find the exact sector of the file on disk :)
Boot Sector
This section contains the BIOS Parameter Block and the bootloader. Yep--Ours. The BIOS Parameter Block contains information that helps describe our disk.
Extra Reserved Sectors
Remember the bpbReservedSectors member of our BPB? Any extra reserved sectors are stored here, right after the bootsector.
File Allocation Tables (FATs)
Remember that a cluster represents a series of contiguous sectors on disk. The size of each cluster is normally 2 KiB to 32 KiB. The file pieces are linked from one cluster to another using a common data structure, such as a Linked List.
There are two FATs. However, the second one is just a copy of the first for data recovery purposes. It usually isn't used.
The File Allocation Table (FAT) is a list of entries that map to each of these clusters. They help identify important information to aid in storing data to these clusters.
Each entry is a 12 bit value that represents a cluster. The FAT is a linked list-like structure with these entries that helps identify what clusters are in use.
A FAT is just an array of these values--that's all. When we find the starting cluster from the Root Directory, we can look through the FAT to find which clusters to load. How? We simply check the value. If the value is between 0x002 and 0xFEF, this value represents the next cluster to load for the file.
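Here is a small C sketch of that check. Only the 0x002 to 0xFEF "next cluster" range comes from the text above; the other ranges (free, reserved, bad cluster, end of file) are the conventional FAT12 meanings and are filled in here as an assumption. The names fat12_kind and fat12_classify are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    FAT12_FREE,      /* 0x000: cluster is not in use               */
    FAT12_RESERVED,  /* 0x001 and 0xFF0-0xFF6: reserved values     */
    FAT12_NEXT,      /* 0x002-0xFEF: number of the next cluster    */
    FAT12_BAD,       /* 0xFF7: bad cluster                         */
    FAT12_END        /* 0xFF8-0xFFF: last cluster of the file      */
} fat12_kind;

/* Decide what a 12 bit FAT entry value means. */
fat12_kind fat12_classify(uint16_t value)
{
    assert(value <= 0x0FFF);          /* entries are only 12 bits wide */
    if (value == 0x000) return FAT12_FREE;
    if (value == 0x001) return FAT12_RESERVED;
    if (value <= 0xFEF) return FAT12_NEXT;
    if (value <= 0xFF6) return FAT12_RESERVED;
    if (value == 0xFF7) return FAT12_BAD;
    return FAT12_END;
}
```

A loader keeps following entries while the result is FAT12_NEXT and stops at FAT12_END.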
Let's look at this in a deeper way. A cluster, as you know, represents a series of sectors. We define the amount of sectors it represents from the BIOS Parameter Block:
bpbBytesPerSector: DW 512
bpbSectorsPerCluster: DB 1
In our case, each cluster is 1 sector. When we get the first cluster of Stage 2 (we get this from the root directory), we use it as the starting cluster number in the FAT. Once we find the starting cluster, we just reference the FAT to determine the next cluster. (The FAT is just an array of 12 bit numbers. We just compare this number with the value range described above to determine what to do with it.)
Root Directory Table
The root directory is a table of 32 byte values that represent information regarding files and directories. Each 32 byte value uses the format:
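Bytes 0-7 : DOS filename (padded with spaces)
Bytes 8-10 : DOS file extension (padded with spaces)
Byte 11 : File attributes. This is a bit pattern:
    Bit 0 : Read Only
    Bit 1 : Hidden
    Bit 2 : System
    Bit 3 : Volume Label
    Bit 4 : This is a subdirectory
    Bit 5 : Archive
    Bit 6 : Device (internal use)
    Bit 7 : Unused
Byte 12 : Unused
Byte 13 : Create time, fine resolution (in units of 10 ms)
Bytes 14-15 : Created time, using the following format:
    Bit 0-4 : Seconds divided by 2 (0-29)
    Bit 5-10 : Minutes (0-59)
    Bit 11-15 : Hours (0-23)
Bytes 16-17 : Created date in the following format:
    Bit 0-4 : Day (1-31)
    Bit 5-8 : Month (1=January; 12=December)
    Bit 9-15 : Year (0=1980; 127=2107)
Bytes 18-19 : Last access date (Uses same format as above)
Bytes 20-21 : EA Index (Used in OS/2 and NT, don't worry about it)
Bytes 22-23 : Last modified time (See bytes 14-15 for format)
Bytes 24-25 : Last modified date (See bytes 16-17 for format)
Bytes 26-27 : First Cluster
Bytes 28-31 : File Size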
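The important parts for us are the filename and extension (bytes 0-10), the first cluster (bytes 26-27), and the file size (bytes 28-31)--everything else is just junk Microsoft added that we can add support for when we create a FAT12 driver, much later.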
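If it helps to see the entry as a data structure, here is a sketch of the 32 byte layout as a C struct, plus helpers that unpack the standard FAT date and time bit fields. The struct and function names are my own for this illustration, and the struct only matches the on-disk bytes on a little-endian machine (such as x86), since the multi-byte fields are stored little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 32 byte root directory entry. Every field is naturally aligned,
   so no packing attribute is needed on typical compilers. */
struct fat_dir_entry {
    uint8_t  name[8];          /* bytes 0-7  : filename, space padded   */
    uint8_t  ext[3];           /* bytes 8-10 : extension, space padded  */
    uint8_t  attributes;       /* byte 11    : attribute bits           */
    uint8_t  reserved;         /* byte 12    : unused                   */
    uint8_t  create_time_fine; /* byte 13    : 10 ms units              */
    uint16_t create_time;      /* bytes 14-15                           */
    uint16_t create_date;      /* bytes 16-17                           */
    uint16_t access_date;      /* bytes 18-19                           */
    uint16_t ea_index;         /* bytes 20-21                           */
    uint16_t modify_time;      /* bytes 22-23                           */
    uint16_t modify_date;      /* bytes 24-25                           */
    uint16_t first_cluster;    /* bytes 26-27                           */
    uint32_t file_size;        /* bytes 28-31                           */
};

static_assert(sizeof(struct fat_dir_entry) == 32, "entry must be 32 bytes");

/* Date: bits 0-4 day, bits 5-8 month, bits 9-15 years since 1980. */
unsigned fat_date_day(uint16_t d)   { return d & 0x1F; }
unsigned fat_date_month(uint16_t d) { return (d >> 5) & 0x0F; }
unsigned fat_date_year(uint16_t d)  { return 1980u + (d >> 9); }

/* Time: bits 0-4 seconds/2, bits 5-10 minutes, bits 11-15 hours. */
unsigned fat_time_second(uint16_t t) { return (t & 0x1F) * 2u; }
unsigned fat_time_minute(uint16_t t) { return (t >> 5) & 0x3F; }
unsigned fat_time_hour(uint16_t t)   { return t >> 11; }
```

For the bootloader itself, only the offsets of the name (0) and first_cluster (26) matter.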
Wait a sec! Remember that DOS filenames are limited to 11 bytes? This is why:
Bytes 0 through 10, hmm... that's 11 bytes. Having a filename less than 11 bytes will mess up the data entry (the 32 byte entry table displayed above). This, of course, is bad :) Because of this, we have to pad the filenames with spaces, and ensure they are 11 bytes.
Remember in a previous tutorial I explained how there are internal and external filenames? The filename structure I explained is the internal filename. As it is limited to 11 bytes, the filename "Stage2.sys" has to become "STAGE2  SYS" (note the padding spaces).
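The padding rule can be sketched in C like this. The function name fat_internal_name is hypothetical, and the sketch assumes the usual DOS convention: up to 8 name characters and up to 3 extension characters, uppercased, each part padded with spaces.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Convert an external name such as "Stage2.sys" into the 11 byte,
   space padded internal form. out receives 11 characters plus a
   terminating NUL. Returns 0 on success, -1 if the name does not fit. */
int fat_internal_name(const char *name, char out[12])
{
    assert(name != NULL && out != NULL);

    const char *dot = strrchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    size_t ext_len  = dot ? strlen(dot + 1) : 0;

    if (base_len == 0 || base_len > 8 || ext_len > 3)
        return -1;

    memset(out, ' ', 11);             /* start with all padding */
    out[11] = '\0';

    for (size_t i = 0; i < base_len; i++)
        out[i] = (char)toupper((unsigned char)name[i]);
    for (size_t i = 0; i < ext_len; i++)
        out[8 + i] = (char)toupper((unsigned char)dot[1 + i]);

    return 0;
}
```

The extension always starts at byte 8, no matter how short the name part is; that is why the padding goes in the middle.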
We are going to be referencing the BIOS Parameter Block (BPB) a lot. Here is the BPB that we created in the previous tutorials for reference:
bpbBytesPerSector: DW 512
bpbSectorsPerCluster: DB 1
bpbReservedSectors: DW 1
bpbNumberOfFATs: DB 2
bpbRootEntries: DW 224
bpbTotalSectors: DW 2880
bpbMedia: DB 0xF0
bpbSectorsPerFAT: DW 9
bpbSectorsPerTrack: DW 18
bpbHeadsPerCylinder: DW 2
bpbHiddenSectors: DD 0
bpbTotalSectorsBig: DD 0
bsDriveNumber: DB 0
bsUnused: DB 0
bsExtBootSignature: DB 0x29
bsSerialNumber: DD 0xa0a1a2a3
bsVolumeLabel: DB "MOS FLOPPY "
bsFileSystem: DB "FAT12   "
Please see the previous tutorial for a detailed explanation of each member.
What we are trying to do is to load a second stage loader. Let's look at what we need to do in detail:
I am using "STAGE2.SYS" for my second stage. You can look at an example of its internal filename in the above section.
Creating Stage 2
Okay, Stage2 is a separate program from the bootloader. Our Stage2 will be very similar to a DOS COM program, sound cool?
All Stage2 does right now is print a message and halt. Everything here you have already seen in the bootloader:
; Note: Here, we are executed like a normal
; COM program, but we are still in Ring 0.
; We will use this loader to set up 32 bit
; mode and basic exception handling

org 0x0                 ; offsets start at 0 (assumes the bootloader jumps to offset 0 of our segment)
bits 16                 ; we are still in real mode
jmp main                ; skip over the Print routine to the entry point

;*************************************************;
; Prints a string
; DS=>SI: 0 terminated string
;************************************************;
Print:
        lodsb           ; load next byte from string from SI to AL
        or al, al       ; Does AL=0?
        jz PrintDone    ; Yep, null terminator found-bail out
        mov ah, 0eh     ; Nope-Print the character
        int 10h
        jmp Print       ; Repeat until null terminator found
PrintDone:
        ret             ; we are done, so return

;*************************************************;
; Second Stage Loader Entry Point
;************************************************;
main:
        cli             ; clear interrupts
        push cs         ; Ensure DS=CS
        pop ds
        mov si, Msg     ; DS:SI => message to print
        call Print
        cli             ; clear interrupts so hlt really halts
        hlt             ; halt the system

;*************************************************;
; Data Section
;************************************************;
Msg db "Preparing to load operating system...", 13, 10, 0 ; message text is illustrative
To assemble with NASM, just assemble it as a binary program (COM programs are binary), and copy it into the floppy disk image. For example:
No PARTCOPY required :)
To get the size of the root directory, just multiply the number of entries in the root directory by the size of each entry (32 bytes). Seems simple enough :)
In Windows, whenever you add a file or directory to a FAT12 formatted disk, Windows automatically adds the file information to the root directory, so we don't need to worry about it. This makes things much simpler.
Dividing the size of the root directory in bytes (the number of root entries multiplied by 32) by bytes per sector will tell us how many sectors the root directory uses.
Here is an example:
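The same arithmetic, sketched in C so the numbers are easy to check by hand. The function name root_dir_sectors is made up for this example; the values 224 and 512 are bpbRootEntries and bpbBytesPerSector from our BPB. Unlike a plain divide, this sketch rounds up, which only matters if the directory size is not an exact multiple of the sector size.

```c
#include <assert.h>
#include <stdint.h>

/* Sectors occupied by the root directory: entries * 32 bytes, in sectors. */
uint32_t root_dir_sectors(uint32_t root_entries, uint32_t bytes_per_sector)
{
    assert(bytes_per_sector != 0);
    uint32_t bytes = root_entries * 32u;            /* 32 byte entries */
    return (bytes + bytes_per_sector - 1) / bytes_per_sector;
}
```

With our BPB: 224 * 32 = 7168 bytes, and 7168 / 512 = 14 sectors.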
Remember that the root directory table is a table of 32 byte values (entries) that represent the file information.
Yippee--Okay, we know how many sectors to load in for the root directory. Now, let's find the starting sector to load from :)
This is another easy one. First, let's look at a FAT12 formatted disk again:
Boot Sector | Extra Reserved Sectors | File Allocation Table 1 | File Allocation Table 2 | Root Directory (FAT12/FAT16 Only) | Data Region containing files and directories
Okay, note that the Root Directory is located directly after both FATs and the reserved sectors. In other words, just add the sectors used by the FATs + reserved sectors, and you found the root directory!
For example...
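As a sketch in C, here is the whole disk layout computed from the BPB values. The struct and function names are hypothetical; root_dir_sectors is passed in as a parameter (14 for our disk, from 224 entries of 32 bytes with 512 byte sectors).

```c
#include <assert.h>
#include <stdint.h>

/* First sector (LBA) of each region on a FAT12 disk. */
struct fat12_layout {
    uint32_t fat_start;   /* first FAT starts after the reserved sectors */
    uint32_t root_start;  /* root directory starts after all the FATs    */
    uint32_t data_start;  /* data region starts after the root directory */
};

struct fat12_layout fat12_layout_of(uint32_t reserved_sectors,
                                    uint32_t number_of_fats,
                                    uint32_t sectors_per_fat,
                                    uint32_t root_dir_sectors)
{
    assert(number_of_fats >= 1);
    struct fat12_layout l;
    l.fat_start  = reserved_sectors;
    l.root_start = l.fat_start + number_of_fats * sectors_per_fat;
    l.data_start = l.root_start + root_dir_sectors;
    return l;
}
```

With our BPB (1 reserved sector, 2 FATs, 9 sectors per FAT) the root directory starts at sector 1 + 2 * 9 = 19, and the data region at 19 + 14 = 33.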
Pretty easy, huh? Now, we just read the sectors to some location in memory:
This example code is taken directly from the bootloader at the end of the tutorial. It loads the root directory:
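LOAD_ROOT:
        ; compute the size of the root directory (in sectors) and store it in cx
        xor cx, cx
        xor dx, dx
        mov ax, 0x0020                  ; 32 byte directory entry
        mul WORD [bpbRootEntries]       ; total size of directory
        div WORD [bpbBytesPerSector]    ; sectors used by directory
        xchg ax, cx                     ; cx = sector count, ax = 0
        ; compute the starting sector of the root directory and store it in ax
        mov al, BYTE [bpbNumberOfFATs]  ; number of FATs
        mul WORD [bpbSectorsPerFAT]     ; sectors used by FATs
        add ax, WORD [bpbReservedSectors] ; adjust for the reserved sectors (bootsector)
        ; ax = first sector, cx = sector count; the bootloader's sector read routine
        ; (not shown in this excerpt) then copies the root directory to offset 0x0200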
Let's look back at the 32 byte root directory table again (Section Root Directory Table). Remember the first 11 bytes represent the filename. Also remember that, because each root directory entry is 32 bytes, every 32 bytes will be the start of the next entry - pointing us back to the first 11 bytes of the next entry.
Hence, all we need to do is compare filenames, and jump to the next entry (32 bytes), and test again until we reach the end of the root directory. For example...
        mov cx, WORD [bpbRootEntries]   ; loop counter: number of entries to test
        mov di, 0x0200                  ; Root directory was loaded here
.LOOP:
        push cx
        mov cx, 11                      ; eleven character name
        mov si, ImageName               ; compare the 11 bytes with the name of our file
        push di
        repe cmpsb                      ; test for entry match (assumes ES=DS and direction flag clear)
        pop di
        pop cx                          ; restore loop counter (pop does not change the flags)
        je LOAD_FAT                     ; they match, so begin loading FAT
        add di, 32                      ; they don't match, so go to next entry (32 bytes)
        loop .LOOP
        jmp FAILURE                     ; no more entries left, file doesn't exist :(
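The same search, sketched in C against an in-memory copy of the root directory. The function names find_entry and entry_first_cluster are made up for this illustration; the name passed in must already be in the 11 byte internal form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { FAT_ENTRY_SIZE = 32, FAT_NAME_LEN = 11 };

/* Scan entry_count 32 byte entries for an 11 byte internal filename.
   Returns a pointer to the matching entry, or NULL if there is none. */
const uint8_t *find_entry(const uint8_t *root, size_t entry_count,
                          const char *name11)
{
    assert(root != NULL && name11 != NULL);
    for (size_t i = 0; i < entry_count; i++) {
        const uint8_t *entry = root + i * FAT_ENTRY_SIZE;
        if (memcmp(entry, name11, FAT_NAME_LEN) == 0)
            return entry;             /* first 11 bytes match */
    }
    return NULL;                      /* file does not exist */
}

/* Bytes 26-27 of an entry hold the first cluster (little-endian). */
uint16_t entry_first_cluster(const uint8_t *entry)
{
    return (uint16_t)(entry[26] | (entry[27] << 8));
}
```

This mirrors the assembly loop: compare 11 bytes, step 32 bytes, and give up after the last entry.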
Okay, so the root directory is loaded and we found the file's entry. How do we get its starting cluster?
This should look familiar :) To get the starting cluster, just reference byte 26 in the file entry:
mov dx, [di + 0x001A] ; di contains starting address of entry. Just reference byte 26 (0x1A) of entry
bpbNumberOfFATs: DB 2
bpbSectorsPerFAT: DW 9
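Okay, so how do we find out how many sectors there are in both FATs? Just multiply sectors per FAT by the number of FATs :) Seems simple,
...but... bpbNumberOfFATs is a byte while bpbSectorsPerFAT is a word, so we clear AX and load the byte into AL before multiplying: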
xor ax, ax
mov al, [bpbNumberOfFATs] ; number of FATs
mul WORD [bpbSectorsPerFAT] ; multiply by number of sectors per FAT
Now that we know how many sectors to read, just, um... read them :)
Yay! Now with the FAT stuff out of the way (Not completely!), load in stage 2!
LOAD_FAT:
xor ax, ax
mov al, BYTE [bpbNumberOfFATs] ; number of FATs
mul WORD [bpbSectorsPerFAT] ; sectors used by FATs
mov cx, ax
There is one little problem we haven't discussed yet though. Okay, we have a cluster number from the FAT. But, how do we use it?
The problem is that this cluster represents a linear address, while, in order to load sectors, we will need a sector/track/head address. (Interrupt 0x13)
There are two ways to access a disk. Either through Cylinder/Head/Sector (CHS) addressing or Logical Block Addressing (LBA).
The LBA represents an indexed location on disk. The first block being 0, then 1, and so on. LBA simply means that sectors are numbered sequentially, starting with LBA 0. Can't get more basic than that.
You will need to know how to convert between LBA and CHS.
Here's an example...
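LBACHS:
        xor dx, dx                      ; prepare dx:ax for operation
        div WORD [bpbSectorsPerTrack]   ; divide by sectors per track
        inc dl                          ; add 1 (absolute sector formula)
        mov BYTE [absoluteSector], dl
        xor dx, dx                      ; prepare dx:ax for operation
        div WORD [bpbHeadsPerCylinder]  ; divide by heads per cylinder
        mov BYTE [absoluteHead], dl     ; remainder = head (variable name assumed)
        mov BYTE [absoluteTrack], al    ; quotient = track (variable name assumed)
        ret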
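Here are both conversions as a C sketch, so the formulas can be checked with real numbers. The names cluster_to_lba, lba_to_chs and struct chs are made up for this illustration. The cluster formula assumes the usual FAT convention that the first data cluster is number 2, and that data_start is the first sector of the data region (33 on our disk: 1 reserved + 18 FAT + 14 root directory sectors).

```c
#include <assert.h>
#include <stdint.h>

struct chs {
    uint16_t track;   /* cylinder               */
    uint8_t  head;
    uint8_t  sector;  /* sectors count from 1   */
};

/* LBA of the first sector of a cluster. Cluster numbering starts at 2. */
uint32_t cluster_to_lba(uint16_t cluster, uint8_t sectors_per_cluster,
                        uint32_t data_start)
{
    assert(cluster >= 2);
    return data_start + (uint32_t)(cluster - 2) * sectors_per_cluster;
}

/* sector = (LBA mod SPT) + 1
   head   = (LBA / SPT) mod heads
   track  =  LBA / (SPT * heads)                                        */
struct chs lba_to_chs(uint32_t lba, uint16_t sectors_per_track,
                      uint16_t heads)
{
    assert(sectors_per_track != 0 && heads != 0);
    struct chs r;
    r.sector = (uint8_t)(lba % sectors_per_track + 1);
    r.head   = (uint8_t)((lba / sectors_per_track) % heads);
    r.track  = (uint16_t)(lba / ((uint32_t)sectors_per_track * heads));
    return r;
}
```

For our floppy (18 sectors per track, 2 heads), LBA 33 works out to track 0, head 1, sector 16.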
read it in:
Okay, remember each cluster number in the FAT entry is 12 bits. This is a problem. If we read in 1 byte, we are only copying a part of the cluster number!
Yet, then again, we run into a problem. Copying 2 bytes (from a 12 bit value) means that we will copy a part of the next cluster entry. For example, imagine this is your FAT:
byte 0    byte 1    byte 2    byte 3    byte 4    byte 5    byte 6    byte 7
01011101  01110100  01110101  00111101  00111010  01110100  00111100  00111100
|--cluster 0-||--cluster 1-|  |--cluster 2-||--cluster 3-|  |--cluster 4-|
Notice all even clusters occupy all of the first byte, but part of the second. Also notice that all odd clusters occupy a part of their first byte, but all of the second!
Okay, so what we need to do is to read a 2 byte (word) value from the FAT (this is our cluster).
If the cluster is even, mask out the top 4 bits, as they belong to the next cluster.
If it is odd, shift it down 4 bits (to discard the bits used by the previous cluster). For example...
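.EVEN_CLUSTER:
        and dx, 0x0FFF  ; even: keep the low 12 bits of the word in dx (register assumed)
        jmp .DONE
.ODD_CLUSTER:
        shr dx, 4       ; odd: shift down 4 bits to keep the high 12 bits
.DONE: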
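The even/odd rule is easier to test in C than in a bootloader, so here is a sketch. The function name fat12_entry is made up for this example. It assumes the FAT is in memory as a byte array, and that entry n starts at byte offset n * 1.5 (rounded down) and is read as a little-endian word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the 12 bit FAT entry for the given cluster number. */
uint16_t fat12_entry(const uint8_t *fat, uint16_t cluster)
{
    assert(fat != NULL);

    size_t offset = (size_t)cluster + cluster / 2;   /* cluster * 1.5 */

    /* Read 2 bytes as a little-endian word. */
    uint16_t word = (uint16_t)(fat[offset] | (fat[offset + 1] << 8));

    if (cluster & 1)
        return (uint16_t)(word >> 4);     /* odd: drop the low 4 bits   */
    return (uint16_t)(word & 0x0FFF);     /* even: mask the top 4 bits  */
}
```

With the FAT bytes F0 FF FF 03 40 00 FF 0F, entry 2 is 0x003, entry 3 is 0x004 and entry 4 is 0xFFF: a file starting at cluster 2 that continues through clusters 3 and 4 and then ends.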
Demo
The first shot contains the bootloader loading Stage 2 successfully. Stage 2 prints the loading operating system message.
The second shot displays an error message when it cannot find the file (within the root directory).
This demo contains most of the code in this lesson, 2 source files, 2 directories, and 2 batch programs. The first directory contains the stage 1
program -- our bootloader, the second directory contains our stage 2 program - STAGE2.SYS.
Conclusion
Wow, this tutorial was hard to write. Simply because it is hard explaining such a complex topic in every detail, while still trying to make it very easy to follow. I hope I did well :)
If you have any suggestions that could improve this tutorial, please let me know :)
In the next tutorial, we will begin building on Stage 2. We will talk about A20, and look at Protected Mode in a lot more detail...
~Mike
BrokenThorn Entertainment. Currently developing DoE and the Neptune Operating System
Would you like to contribute and help improve the articles? If so, please let me know!