Operating Systems

Filesystems

FATFS


The boot record

After the partition table, the boot-record is the second most important information on your hard drive. Most commercial applications for disc-recovery are capable of regenerating destroyed boot-records, but perhaps you want to try your luck (don't!). If that is the case the structure is like this:

The first three bytes contain a JUMP instruction to skip the information and make extensions possible (because the MBR loads this sector into memory and transfers execution to it). Following the jump is the record itself:

Boot record of FAT16
Field Offset Size Default
Jump 0 3  
OEM ID 3 8 MSWIN4.0
Bytes Per Sector 11 2 512
Sectors Per Cluster 13 1 See cluster sizes
Reserved Sectors 14 2 1
FAT's 16 1 2
Root Entries 17 2 512/544
Sectors (small, for FDD) 19 2 0
Media Descriptor 21 1 See Media Descriptors
Sectors Per FAT 22 2 Must be calculated
Sectors Per Track 24 2 Depends on your HDD, see Appendix
Heads 26 2 See above
Hidden Sectors 28 4  
Sectors (large, for HDD) 32 4 Size of the partition, see Boot Process
Physical Drive No. 36 1 80h
Current Head 37 1 0
Signature 38 1 for WinNT: 28h or 29h
Serial number (ID) 39 4 Random
Volume Label 43 11  
System ID (filesystem) 54 8 FAT12, FAT16, FAT, FAT32
Total   62  

(By "Default" is meant values, that do apply to most setups - at least, that's a good guess).

The OEM ID describes the program, that created the boot record. This is often "MSWIN4.0" (Win95), "IBM 20.0" (OS/2), "MSDOS5.0" (MS-DOS later than 4)

Bytes per sector is almost always 512.

Sectors per cluster is a bit more difficult. Clusters are in the FATFS the basic allocation unit, meaning that all files occupy at least one cluster (size up to 32KB). The lost space is called slack-space. To determine the size of each cluster Microsoft published this table:

Cluster sizes for drives
Disk type Drive size FAT type Sectors pr. cluster * Cluster size
Floppy 360 KB 12-bit 2 1 KB
" 720 KB 12-bit 2 1 KB
" 1.2 MB 12-bit 1 512 bytes
" 1.44 MB 12-bit 1 512 bytes
" 2.88 MB 12-bit 2 1 KB
Fixed 0 MB - 15 MB 12-bit 8 4 KB
" 16 MB - 127 MB 16-bit 4 2 KB
" 128 MB - 255 MB 16-bit 8 4 KB
" 256 MB - 511 MB 16-bit 16 8 KB
" 512 MB - 1023 MB 16-bit 32 16 KB
" 1024 MB - 2047 MB 16-bit 64 32 KB
*Valid values are 1,2,4,8,16,32,64 and 128 (only 1 to 64 supported)

Reserved sectors is the sectors before the FAT's. The value is at least 1 (the boot-record).

FAT's is the number of FAT's on the drive. Usually, this is 2.

Root Entries is the number of files/directories available in the root directory. As far as I know, this is often 512, but DOS uses 1 for the volume-label, meaning you actually only can have 511 files in your root directory. (my OS/2-bootdrive has 544 entries).

Sectors (small) is the size of the partition in sectors. If the number is too large to fit in this field, the size will be placed in "sectors (large)" and this field is 0.

Media Descriptor tells us which kind of disk we are dealing with. The following numbers are defined:

Media Descriptors
Type Capacity Size and type
F0h 2.88 MB 3.5", 2-sided, 36-sectors per track
F0h 1.44 MB 3.5", 2-sided, 18-sectors per track
F9h 720 KB 3.5", 2-sided, 9-sectors per track
F9h 1.2 MB 5.25", 2-sided, 15-sectors per track
FDh 360 KB 5.25", 2-sided, 9-sectors per track
FFh 320 KB 5.25", 2-sided, 8-sectors per track
FCh 180 KB 5.25", 1-sided, 9-sectors per track
FEh 160 KB 5.25", 1-sided, 8-sectors per track
F8h - fixed disk

Sectors per FAT. Now, this is where it gets complicated - I used good old math to solve this one, even though I have a feeling it is completely unnecessary.

Assume:
To is the total amount of sectors,
Fo is the amount of free sectors for data
Fs is the size of one FAT in sectors
Cs is the cluster size
Ss is the sector size
Rs is the reserved sectors before the FAT's
Re is the entries in the root-directory
Ds is the size of a entry (=32 bytes)
The size of the FAT must equal the free amount of sectors divided by the cluster size in sectors multiplied by two (because FAT-16 uses 2 bytes to describe 1 cluster, but FAT-12 only uses 1½ byte for each cluster) divided by the sector size (of course rounded up, but we'll overlook that for now)

     2 *  Fo
Fs = -------
     Cs * Ss

The free amount of sectors must be the total amount minus the FAT's, the Root Directory and the Reserved Sectors

Fo = To - 2 * Fs - Rs - Re*Ds/Ss

Lets solve that:

     2 *  ( To - 2 * Fs - Rs - Re*Ds/Ss)               2 ( To - Rs - Re*Ds/Ss )
Fs = -----------------------------------   ==>   Fs =  ------------------------
               Cs * Ss                                        Cs * Ss + 4

And then you just put in your values (remember to round the result up (131.02=132) ). And then verify the result by calculating how many cluster you got on your drive and how many the FAT can handle.

An alternative and a lot simpler way is to take the total amount of sectors in the partition, divide it by the number of clusters per sector and divide it by the amount of FAT-entries per sector. The calculation seems simpler, explains why you cannot make a partition larger than 509MB with a clustersize of 16 sectors, is a lot simpler to program, but it is not such a mathematically optimal approach as the previously described method - therefore it is probably the way DOS calculates it.

Sectors Per Track and Heads you got to get from some kind of information program, e.g. INT 13h, service 08h - see appendix 1.

Hidden sectors is the number of sectors before the boot-record on the physical disk (often equal to the RelativeSectors-field in the partition table). It is necessary to boot correctly.

Sectors (large) contains the number of sectors on the partition if "sectors (small)" could not handle the amount.

Now for all the starred entries. The starred entries are part of the so-called extended BIOS parameter block, which, as far as I know is used on all FAT drives.

The physical drive number contains the number which the drive is assigned. It seems that this number is either 80h for hard disks or 00h for floppy drives, and is actually assigned at run-time except if the drive is to be used as a boot-drive. If you have two partitions on a drive they'll be numbered 80h and 81h and your floppy drives is 00h and 01h.

Current Head should be without importance except for users of Windows NT, please refer to Microsoft Knowledge Base for more info.

Signature must be either 28h or 29h for Windows NT to recognize the drive.

ID should just be a random number different from other drives, but has no valuable meaning.

Volume Label you should be able to figure out without my help :)

System ID defines which file-system is in use on the disk. This is either FAT12 ("FAT") or "FAT16" depending on the size and media (for HDD 16-bit is most common, except for disk with less than 15MB capacity).

And here follows the executable code which makes the system load an operating system and start executing it.

And by the way, if you want to track a boot-record, the last two bytes in the boot-record are always 55h AAh.

The FAT

When using the FATFS you actually have a quite simple point of view on the drive (which supports the claim that Gates designed it during a 5-days stay at a hotel). Just think of the drive as a series of cells (clusters). When a file is allocated, you have a number, that points to the first cluster. In the FAT you have a list of all the clusters on the disk and on a clusters entry the number of the next cluster in the chain is written. So, if you have a file, MYFILE.TXT, which starts at cluster 3 and continues in cluster 5, 6, 7 and finally 9, the FAT would look like this:

[2] ?
[3] 5
[4] ?
[5] 6
[6] 7
[7] 9
[8] ?
[9] <EOF>

As far as I know, it is illegal to do backwards references, meaning entry 5 could never contain a number less than 6.

This means that the FAT is a very important tool in recreating the contents of a file, and therefore a drive with a destroyed FAT is quite badly hit. The developers of the FATFS saw this flaw and made two copies of the chain, but placed right after each other instead of placing them far from each other to protect them from corruption if a disk-crash happened (of course this would have caused performance loss and never protected them from misbehaving applications).

The size of each field in the FAT is either 16-bit or 12-bit giving respectively 65535 and 4096 clusters as the max. size of a media. The 12-bit version is only used on diskettes and drives with a size of less than 16MB.

A series of numbers are reserved for internal use. Depending on the size of each entry in the FAT (12- or 16-bit), the numbers are as follows:

Media Descriptors
Number Description
0 Free cluster
???? Cluster in use, next cluster in chain
FF0-FF6 / FFF0-FFF6 Cluster is reserved
FF7 /FFF7 Cluster contains bad sectors
FF8-FFF / FFF8-FFFF End of file

To establish the position of the first cluster on your drive, you just add up the reserved sectors, the two FATs and the size of the root directory. To verify it you could track the first directory entry and compare it to the cluster-boundary.

Since cluster 0 and 1 are reserved they contain [EOF] [EOF] (on HDD) or [reserved] [EOF] (on FDD). In hex it becomes (on HDD) F8 FF FF FF ... or (on FDD) F0 FF FF.

The Directory Entries

Directory entries(DE) contains the entry into the chain of clusters and the filename. The perhaps most important DE is the root directory, because it contains cluster indexes that points to all sub-dirs. The location of the root-dir can easily be established, as it is positioned following the FAT's, so just add up the values from the boot record. But a knock-out of the root-dir is actually not very bad, because as you probably noticed, all directories contain two entries "." and "..", and we can track those ones, since they always will be positioned in the beginning of a cluster if the cluster contains a directory.

MS-DOS always allocate one cluster to contain DE's whenever you create a directory. This is probably the case because directories are nothing more than files that DOS can interpret as directories, and therefore the same mechanism used to tie files together is used to tie large directories with sizes larger than one cluster together (on 8K clusters this is rarely used since a directory can contain 254 files, at least for DOS, because Win95 uses additional DE to store long filenames).

A simple example of a tree

             root-dir                     root-dir
            /   |    \                    +--mydir1
          /     |      \                  |  +-myfile1
        /       |        \                |  +-myfile2
      /         |          \              +--mydir2
   mydir1     mydir2      mydir3          |  +-myfile3
  /     |    /     |        |             |  +-myfile4
/       | myfile3  |        myfile5       +--mydir3
myfile1 |          |                         +-myfile5
        |      myfile4
     myfile2

'
C:\>dir
...
MYDIR1             -DIR-    ??????
MYDIR2             -DIR-    ??????
MYDIR3             -DIR-    ??????
C:\>cd mydir1
C:\MYDIR1>dir
...
MYFILE1        ??? ????
MYFILE2        ??? ????
C:\MYDIR1>

Now, the key to controlling such a tree is of course the structure of a DE:

Directory Entry
  Offset Size (bytes)
Name 0 8
Extension 8 3
Attributes 11 1
Reserved =00h 12 8
Index in EA DATA. SF (*) 20 2
Date 22 4
Entry Cluster in chain 26 2
Size of file in bytes 28 4
Total   32

(*) this only applies to users of OS/2, since OS/2 uses a file in the root-directory called EA DATA. SF, which contains the extended attributes for files (like long names and file types or icon positions in folders). I don't know if any scaling-factor is involved or if it is only a index to a record in the EA-file. In non-OS/2 systems this field should be zeroed out. Be aware that Win95 uses the reserved area for extra data storage and uses directory entries for storage of long filenames, that do not correspond to the table showed above - more about that in Win95 LFN.

Name contains the name, in uppercase, of the file. Blank spaces = #32 (space) " "

Extension contains the extension, in uppercase.

Attributes is a binary coded field with this content:

Attribute bits
Attribute Bit Binary Hex
Read Only 1 .......? 01h
Hidden 2 ......?. 02h
System File 3 .....?.. 04h
Volume ID 4 ....?... 08h
Directory 5 ...?.... 10h
Archive 6 ..?..... 20h

Date is a specially coded field of 32-bits composed like this (From Most Significant Bit (MSB) and down):

The DOS Date-Time format
Name Size (bits) Value
Year (from 1980 - meaning year=1980+value) 7 0..128
Month 4 1..12
Day 5 1..31
Hour (24-hour) 5 0..23
Minute 6 0..59
Seconds (times 2!) 5 0..30
Total size 32  

Entry Cluster in chain is the first cluster, that makes up the file. If the attributes.directory flag is set, this field points to a new directory entry.

Size of file in bytes, I believe, needs no explanation except the fact it can be a valuable resource in recovering damaged files since we can calculate how many clusters the file is supposed to consist of.

Lets learn by example, we have a file, MYFILE.TXT, with a size of 9948 bytes dated 1997-03-21:17.48.22 and ReadOnly plus Archive active starting in cluster 34656, lets analyze...

The DE for this file would look like this:

"MYFILE  TXT" 21 00 00 00 00 00 00 00 00 00 00 -- -- -- -- 60 87 DC 26 00 00
 -name---ext  at -------------zeros----------- ----date--- clus. ---size----

Tips

Appendix 1 - INT 13h Get Disk Geometry

INT 13h, service 08h:
Input:
AH=08h
DL=physical disk number,
00..79h is diskettes, 80h..FFh is HD's starting from 80h.
Return:
CF (carry flag) is set if error
AH = status code, see a interrupt list or ignore...:)
DL = number of disks responding to system
CH = max cylinder, add 1 for number of, get 2 upper bits from CL and insert as MSBs
CL = max sectors, add 1 for sectors/track, only lower 6 bits
CX: lsb left: SSSSSSHH:CCCCCCCC
max cyl = CCCCCCCCHH (lsb left)
max sec = SSSSSS (lsb left)
Taken almost directly from Ralph Brown interruptlist.

Appendix 2 - Described structures in C

/* these typedefs and their sizes MUST comply to Intels standards ! */
typedef unsigned char      byte;  /* 1 byte */
typedef unsigned int       word;  /* 2 bytes */
typedef unsigned long int  dword; /* 4 bytes */

/* Partion Table Entry [16 bytes] */
struct s_PartionEntry
{
 byte                   Bootflag;
 byte                   StartingSide;
 /* problem : lsb is lowest: SSSSSSHH,CCCCCCCC, but cyl=CCCCCCCCHH ,
    call fix_pt with a s_PartionTable */
 unsigned               StartingSector:6;
 unsigned               StartingCylinder:10;
 byte                   SystemIndicator;
 byte                   EndingSide;
 unsigned               EndingSector:6;
 unsigned               EndingCylinder:10;
 dword                  RelativeSectors;
 dword                  NumberOfSectors;
};

/* Partion Table / Master Boot Record [512 bytes] */
struct s_PartionTable
{
 byte                   LoadInstruction[446];
 struct s_PartionEntry  Partions[4];
 word                   Signature; /*=AA55*/
};

/* Dos Boot Record [512 byte] */
struct s_DosBootRecord
{
 byte                   JumpInstruction[3];
 byte                   OEMID[8];
 word                   BytesPerSector;
 byte                   SectorsPerCluster;
 word                   ReservedSectors;
 byte                   FATs;
 word                   RootEntries;
 word                   SmallSectors;
 byte                   Media;
 word                   FATSize;
 word                   TrackSize;
 word                   Heads;
 dword                  HiddenSectors;
 dword                  LargeSectors;
 byte                   DriveNumber;
 byte                   CurrentHead;
 byte                   Signature;
 dword                  ID;
 byte                   VolumeLabel[11];
 byte                   SystemID[8];
 byte                   LoadInstructions[512-64];
 word                   BR_Signature; /*=AA55h*/
};

/* Time-field in DirEntry [4 byte] */
struct s_DosTime
{
 unsigned Sec    :5;
 unsigned Min    :6;
 unsigned Hour   :5;
 unsigned Day    :5;
 unsigned Month  :4;
 unsigned Year   :7;
};

/* standard DOS Attributes in a DirectoryEntry [1 byte] */
struct s_DosAttributes
{
 unsigned ReadOnly  :1;
 unsigned Hidden    :1;
 unsigned System    :1;

 unsigned VolumeID  :1;
 unsigned Directory :1;
 unsigned Archive   :1;
 unsigned reserved  :2;
};

/* Directory entry [32 byte] */
struct s_DirEntry
{
 char                   Name[8];
 char                   Ext[3];
 struct s_DosAttributes Attributes;
 byte                   reserved[8];
 word                   EA_Index;
 struct  s_DosTime      Time;
 int                    EntryCluster;
 dword                  Size;
};


/* And now some sample code to print the structures */

/* Print a partion table on stdout */
void write_partiontable(struct s_PartionTable PT)
{
 int lp1;
 printf("Bt SI | Side    Cyl   Sect -> Side    Cyl   Sect |  StartSec       Size\n");
 for (lp1=0; lp1<4; lp1++)
  printf("%c  %02hX |  %3hu %6u %6u ->  %3hu %6u %6u | %9ld  %9ld = %5ld\n",
         ( (PT.Partions[lp1].Bootflag==0x080) ? 'Y' : 'N'),
         PT.Partions[lp1].SystemIndicator,
         PT.Partions[lp1].StartingSide,
         PT.Partions[lp1].StartingCylinder,
         PT.Partions[lp1].StartingSector,
         PT.Partions[lp1].EndingSide,
         PT.Partions[lp1].EndingCylinder,
         PT.Partions[lp1].EndingSector,
         PT.Partions[lp1].RelativeSectors,
         PT.Partions[lp1].NumberOfSectors,
         PT.Partions[lp1].NumberOfSectors/2048);
 if (!(PT.Signature==0xAA55)) printf("MBR does not conform to standards!\n");
};

/* Print a boot record on stdout */
void write_bootrec(struct s_DosBootRecord BR)
{
 printf("\nOEM ID\t\t\t\t%.8s",BR.OEMID);
 printf("\nBytes Per Sector\t\t%u",BR.BytesPerSector);
 printf("\nSectors per cluster\t\t%hu",BR.SectorsPerCluster);
 printf("\nReserved Sectors\t\t%u",BR.ReservedSectors);
 printf("\nNumber of FATs\t\t\t%hu",BR.FATs);
 printf("\nEntries in root-directory\t%u",BR.RootEntries);
 printf("\nSectors (small)\t\t\t%u",BR.SmallSectors);
 printf("\nMedia Descriptor\t\t%hXh",BR.Media);
 printf("\nSize of FAT in sectors\t\t%u",BR.FATSize);
 printf("\nLength of track, in sectors\t%u",BR.TrackSize);
 printf("\nHeads\t\t\t\t%u",BR.Heads);
 printf("\nHidden sectors\t\t\t%lu",BR.HiddenSectors);
 printf("\nLarge sector count\t\t%lu",BR.LargeSectors);
 printf("\nDrive number\t\t\t%hXh",BR.DriveNumber);
 printf("\nCurrent Head\t\t\t%hd",BR.CurrentHead);
 printf("\nSignature\t\t\t%hXh",BR.Signature);
 printf("\nID\t\t\t\t%08lXh",BR.ID);
 printf("\nVolume Label\t\t\t%.11s",BR.VolumeLabel);
 printf("\nSystem ID\t\t\t%.8s",BR.SystemID);
 printf("\n");
 if (!(BR.BR_Signature==0xAA55)) printf("BootRecord does not conform to standards!\n");
};

/* Print a DE on stdout */
void write_dir(struct s_DirEntry Entry)
{
 if ( (Entry.Name[0]>32) && (Entry.Name[0]<180) )
  printf("%.8s %.3s  %9lu  %4d-%02d-%02d  %02d.%02d.%02d %c %c %c %c %c %c ->  %4Xh = %d \n",
         Entry.Name,
         Entry.Ext,
         Entry.Size,
         Entry.Time.Year+1980,
         Entry.Time.Month,
         Entry.Time.Day,
         Entry.Time.Hour,
         Entry.Time.Min,
         Entry.Time.Sec*2,
         (Entry.Attributes.Archive ? 'A' : ' '),
         (Entry.Attributes.ReadOnly ? 'R' : ' '),
         (Entry.Attributes.Hidden ? 'H' : ' '),
         (Entry.Attributes.System ? 'S' : ' '),
         (Entry.Attributes.Directory ? 'D' : ' '),
         (Entry.Attributes.VolumeID ? 'V' : ' '),
         Entry.EntryCluster,
         Entry.EntryCluster);
};

/* small function to fix a partion table, use 0 to fix and 1 to unfix */
void fix_pt(struct s_PartionTable *PT, int unfix)
{
 int lp1;
 byte tmp;
 for (lp1=0; lp1<4; lp1++)
  if (!unfix)
  {
   tmp = PT->Partions[lp1].StartingCylinder & 0x03 << 8;
   PT->Partions[lp1].StartingCylinder >>= 2;
   PT->Partions[lp1].StartingCylinder += tmp;
   tmp = PT->Partions[lp1].EndingCylinder & 0x03 << 8;
   PT->Partions[lp1].EndingCylinder >>= 2;
   PT->Partions[lp1].EndingCylinder += tmp;
  } else
  {
   tmp = PT->Partions[lp1].StartingCylinder & 0x300 >> 8;
   PT->Partions[lp1].StartingCylinder <<= 2;
   PT->Partions[lp1].StartingCylinder += tmp;
   tmp = PT->Partions[lp1].EndingCylinder & 0x300 >> 8;
   PT->Partions[lp1].EndingCylinder <<= 2;
   PT->Partions[lp1].EndingCylinder += tmp;
  };
};

Links/Sources


Last updated 2001-11-24 FlushedSector
fs@proglang.cjb.net
Standard Disclaimer

Pierre's LibraryPierre's Library - Changelog:

Analyse d'audience