The structure of the FAT file system. File systems FAT, FAT32 and NTFS. Structure and placement of directories

There are many ways to store information and programs on a hard drive. The best-known approach saves information in the form of files and groups them into folders, each with its own purpose. However, few people have thought about how information is physically stored on the media.

In order for information to be stored on a physical medium, the medium must be prepared for use by the computer's operating system. The operating system allocates free disk space to save information. To make this possible, the disk is divided into small containers called sectors. Low-level formatting assigns a fixed size to each sector. The operating system groups these sectors into clusters. High-level formatting sets all clusters to the same size, typically between 2 and 16 sectors. Later, one or more clusters are allocated to each file. The cluster size depends on the operating system, the disk capacity, and the required speed.

In addition to the area for storing files, the disk contains areas needed for the operation of the operating system. These areas store boot information and the information that maps file addresses to physical locations on the disk. The boot area is used to start the operating system: after the BIOS finishes its startup, the boot area of the disk is read and executed.

FAT file system

The FAT file system appeared with the Microsoft DOS operating system and was improved several times afterwards. It exists in FAT12, FAT16 and FAT32 versions. The name FAT comes from the file system's use of a kind of database, the File Allocation Table, which contains an entry for each cluster on the disk. The version numbers refer to the number of bits used for the entry numbers in the table, which is why the file system has a limit on the supported disk size. In 1987, it did not support disks larger than 32 MB. With the advent of Windows 95, a new version of the file system, FAT32, came out with theoretical support for drives up to 2 TB. Problems with supporting large disks keep arising because the number of entries is fixed, limited by the number of bits used to identify a cluster. For example, FAT16 does not support more than 2^16, or 65536, clusters. The number of sectors in a cluster is also limited.

Another problem with large disks was the inefficient use of space by small files. Because the number of clusters is limited, the cluster size was increased in order to cover the entire capacity of the disk. This leads to wasted space, since most files are not an exact multiple of the cluster size. For example, FAT32 allocates 16 KB clusters for disk partitions ranging from 16 GB to 32 GB. To store a 20 KB file, you need two 16 KB clusters, which occupy 32 KB on disk. A 1 KB file takes up 16 KB of disk space. Thus, on average, 30-40% of the disk capacity is wasted when storing small files. Splitting a disk into small partitions reduces the cluster size, but this is not done in practice for disks with a capacity of more than 200 GB.
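The arithmetic behind the 20 KB and 1 KB examples can be checked with a few lines of Python. This is only an illustrative sketch: the function name allocated_size is made up for this example, and it assumes that an empty file occupies no clusters.

```python
KB = 1024

def allocated_size(file_size, cluster_size):
    """Return the bytes a file occupies on disk when space is given out in whole clusters."""
    # Ceiling division: a partly filled last cluster still counts as a full one.
    clusters = -(-file_size // cluster_size)
    return clusters * cluster_size

cluster = 16 * KB  # cluster size from the example above
for size in (20 * KB, 1 * KB):
    used = allocated_size(size, cluster)
    print(f"{size // KB} KB file occupies {used // KB} KB, wasting {(used - size) // KB} KB")
```

The wasted part of the last cluster grows with the cluster size, which is why a large cluster hurts most when the disk holds many small files.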

File fragmentation is also a significant problem of this file system. Since a file may need several clusters, which may not be physically consecutive, reading it takes longer and programs slow down. Therefore, there is a constant need for defragmentation.

NTFS file system

In the early 1990s, Microsoft started developing completely new software designed for environments with heavier resource demands than those of ordinary home users. For the needs of business and industry, the resources provided by DOS-based Windows operating systems had become insufficient. Microsoft worked with IBM on the OS/2 operating system with the HPFS (High Performance File System) file system. The joint development did not succeed, and soon each company went its own way again. Microsoft developed several versions of the Windows NT operating system, on which Windows 2000 and Windows XP are built. Each of them uses its own version of the NTFS file system, which continues to evolve.

NTFS (New Technology File System) is the standard file system for operating systems based on Windows NT. It was designed to replace FAT. NTFS is more flexible than FAT. Its system areas are stored mostly as files rather than as fixed structures like those of FAT, which allows them to be modified, expanded, or moved during use. A simple example is the Master File Table (MFT). The MFT is a kind of database holding various information about the files on a disk. Small files (1 KB or less) can be stored directly in the MFT. For large files NTFS allocates clusters, but unlike in FAT, the cluster size usually does not exceed 4 KB, and the built-in compression mitigates the problem of unused space allocated to files.

The NTFS file system is designed for a multi-user environment and has built-in security and access control mechanisms. For example, Windows 2000 and Windows XP (except the Home Edition) allow you to set access permissions for individual files and to encrypt them. However, a high level of security makes working with the computer more complicated for ordinary users. You must be extremely careful when setting passwords and file permissions so as not to lose important data.

Many users do not understand the basics of how Windows file systems work. Why bother with unnecessary theory, one might ask? In fact, it is a deep understanding of how the various file systems work that lets you choose the right file system for a particular storage medium. Sometimes a wrong choice becomes critical later, when you face the problem of recovering information or of premature wear of the media.

The file system consists of a file management system and a collection of files on a certain type of media (CD, DVD, FDD, HDD, Flash, etc.). A file management system gives users and applications the ability to access files, store them, and maintain the integrity of their content. The most common long-term storage medium in modern computing systems is the hard disk drive (HDD), also called a Winchester drive. This term applies to any sealed disk drive with aerodynamically designed magnetic read heads.

The file systems of modern operating systems are installed in hard disk partitions.

FAT32: simplicity and reliability.

There are three FAT file systems: FAT12 (for floppy disks), FAT16 and FAT32. They differ in the number of bits (12, 16 or 32) used to indicate the cluster number in the file management system. In FAT file systems, the space of any logical drive is divided into a system area and a data area. The system area consists of: BR, the boot record; RS, reserved sectors; FAT1 and FAT2, copies 1 and 2 of the file allocation table; RDir (root directory, ROOT), the root directory. The data area is divided into clusters, each of which is one or more contiguous sectors. In the FAT table, clusters belonging to the same file are linked in a chain. The File Allocation Table (FAT) is, in effect, the map of the data area. Each element of the FAT table (12, 16 or 32 bits) corresponds to one disk cluster and describes its state: free, in use, or bad. The FAT16 file management system uses a 16-bit word to indicate the cluster number, so 65536 clusters can be addressed.

A cluster is the minimum addressable unit of disk space allocated to a file. A file or directory occupies a whole number of clusters. Dividing the data area into clusters rather than sectors reduces the size of the FAT table, reduces file fragmentation, shortens file chains, and speeds up file access. The last cluster of a file may not be fully used, which leads to a noticeable loss of disk space if the cluster size is large. On a floppy disk, a cluster occupies 1 or 2 sectors; on a hard disk, 4, 8, 16, 32 or 64 sectors. Each directory entry has the following structure: file name, file attribute, reserved field, creation time, creation date, last access date, reserved field, last modification date, last modification time, initial FAT cluster number, file size.

In this example, the file named MyFile.txt is placed starting from the 8th cluster and spans 12 clusters. The chain of clusters for this case is 8, 9, A, B, 15, 16, 17, 19, 1A, 1B, 1C, 1D. Cluster number 18 is marked as bad with the code F7 and cannot be used to hold data. This code is set by disk formatting and checking utilities. Cluster 1D is marked with the code FF as the final one belonging to this file. Free clusters are marked with the code 0. When a new cluster is allocated for writing to a file, the first free cluster is taken. Since files on the disk are changed, deleted, moved, enlarged and reduced, this placement rule leads to fragmentation, i.e. the data of one file ends up in clusters that are not adjacent and are sometimes very far from each other. A complex chain is formed, which slows down work with files. Since the FAT is used very intensively when accessing the disk, it is loaded into RAM. FAT32 uses disk space much more efficiently, as it uses smaller clusters than previous versions of FAT. Compared to FAT16, this gives a saving of 10-16%.
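The "first free cluster" rule and the fragmentation it causes can be illustrated with a small Python sketch. It is not taken from any real driver: the helper append_cluster and the in-memory list fat are hypothetical, and the markers use 16-bit FAT16-style values rather than the one-byte codes of the example above.

```python
FREE = 0x0000   # marker of a free cluster
END = 0xFFFF    # end-of-chain marker (FAT16-style value, chosen for illustration)

def append_cluster(fat, last_cluster=None):
    """Take the first free cluster and link it to the end of a file's chain.

    fat is a list indexed by cluster number; last_cluster is the current last
    cluster of the file, or None when the file has no clusters yet.
    """
    for number in range(2, len(fat)):        # clusters are numbered from 2
        if fat[number] == FREE:
            fat[number] = END                # the new cluster ends the chain
            if last_cluster is not None:
                fat[last_cluster] = number   # the old tail now points to it
            return number
    raise RuntimeError("no free clusters left")

# Two reserved entries followed by eight free clusters.
fat = [0xFFF8, 0xFFFF] + [FREE] * 8

a1 = append_cluster(fat)        # file A gets cluster 2
a2 = append_cluster(fat, a1)    # file A gets cluster 3
b1 = append_cluster(fat)        # file B gets cluster 4
a3 = append_cluster(fat, a2)    # file A grows again and lands in cluster 5

print([a1, a2, a3], [b1])       # file A is now split around file B
```

Even in this tiny case, file A ends up in clusters 2, 3 and 5 because file B took cluster 4 in between, which is exactly the kind of broken chain described above.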

The attribute field of a directory entry can store the following values:

1) archive (set when a file is changed and cleared by a program that backs up files to another medium);

2) directory;

3) volume label;

4) system;

5) hidden;

6) read-only.

Long names in FAT32 are implemented using several directory entries for a single file: one entry for the 8.3 name and up to 20 more entries for the longest name, which can be up to 255 characters long. Therefore, long names are not recommended.

The main disadvantage of FAT is slow file handling. When a file is created, the rule applies that the first free cluster is selected. This leads to disk fragmentation and complex file chains, and hence to slower work with files.

In short, the FAT file system is something to be avoided today wherever possible, so it is important to choose a file system that lets you do without it.

NTFS: convenience and high speed.

One of the basic concepts used when working with NTFS is the concept of a volume. It is possible to create a fault-tolerant volume that spans several partitions, that is, to use RAID technology. NTFS divides the entire usable disk space of a volume into clusters, blocks of data that are addressed as units. NTFS supports cluster sizes from 512 bytes to 64 KB, with 2 or 4 KB being typical. Part of the disk is set aside for the MFT zone, the space into which the main service metafile, the MFT, can grow. Ordinary data is not normally written to this area. The MFT zone is kept empty so that the service file (MFT) fragments as little as possible as it grows.

The MFT (Master File Table) is a centralized directory of all the files on the disk, including itself. The MFT is divided into fixed-size 1 KB records, each record corresponding to a file. The first 16 files are of a service nature and are hidden from ordinary access; they are called metafiles, and the very first metafile is the MFT itself. These first 16 MFT elements are the only part of the disk that has a strictly fixed position. A copy of the first few of these entries is kept in the middle of the volume for safety, as they are very important. The remaining parts of the MFT file can be located anywhere on the disk; their position can be recovered from the MFT itself, starting from its very first element. Each file in NTFS is represented by streams: it does not have "just data" but rather "streams". One of the streams is the file data. You can define multiple data streams for a single file.

Main features of NTFS:

Work on large disks is efficient (much more efficient than in FAT);

There are means to restrict access to files and directories;

NTFS partitions provide local security for both files and directories;

A transaction mechanism has been introduced, in which file operations are logged;

Significant increase in reliability;

Removed many restrictions on the maximum number of disk sectors and/or clusters;

A file name in NTFS, unlike in the FAT and HPFS file systems, can contain any characters, including the full set of national alphabets, since the names are stored in Unicode, a 16-bit representation that gives 65536 different code values. The maximum length of a file name in NTFS is 255 characters.

NTFS also has built-in compression that you can apply to individual files, entire directories, and even volumes (and then override or reassign it as you see fit). A directory on NTFS is a special file that stores links to other files and directories.

NTFS provides file-level security; this means that permissions for volumes, directories and files may depend on the user's account and the groups to which the user belongs. Each time a user accesses a file system object, their permissions are checked against the permission list of that object. If the user has sufficient rights, the request is granted; otherwise, it is rejected. This security model applies both to local user logins on NT machines and to remote network requests.

NTFS also has some self-healing features. NTFS supports various mechanisms for checking system integrity, including transaction logging, which allows you to replay file write operations against a special system log.

The main drawback of the NTFS file system is that service data takes up a lot of space (for example, each element of the directory takes 2 KB) - for small partitions, service data can take up to 25% of the media volume.

Thus, when choosing a file system type, we are not making some abstract choice; we are making a set of decisions that affect the entire system as a whole. Why do you need to know all the ins and outs of the file system in such detail? It is needed for possible recovery of the file system, which we will discuss in one of the following articles =)

A file system is just a way of organizing data on the media; there is nothing complicated about this organization.

Perhaps you are thinking that "the file system is a complex and incomprehensible thing, because operating systems work with it, and it simply cannot all be that simple..."

You are partially right, but the whole trick is in the file system driver, i.e. in a program that provides an API for the other application programs. It just does things like:

  • create a file
  • delete a file
  • rename
  • copy
  • show directory contents
  • move to another directory, etc.

The very principle of the organization of the file system is simple.

In this post, I will not consider how the driver works and how it creates or deletes files; I will tell you about the principle of file organization in the FAT16 file system.

(there is a separate post about how to write a driver)

Why FAT16?

I find it the most convenient for learning; it is easy to understand. And once you know the idea, it is no longer difficult to learn other file systems such as FAT32, NTFS, etc.

Why do I need to know how the file system works?

Knowing the principle of organizing the file system, you can develop your own driver or file manager on any computing device.

Description of the FAT16 file system


The FAT16 file system divides the entire address space of the media into two areas:

  • system area
  • data area

For clarity, we will depict the entire address space as a rectangle. The small upper part of the rectangle (address space) is the system area, the lower massive one is the data area.

All data that we store on our media, i.e. all files and directories are stored in the data area. The system area, on the other hand, stores the parameters of this medium and the characteristics of files and directories - the file name, directory name, file attributes, etc.

Let's start with something simple: a few words about the data area and how data is stored there.

About the data area...

In order not to address every byte (although some storage media allow you to work byte by byte), the file system uses a different minimum addressable unit, the sector. The sector size is 512 bytes. In addition to the sector, the FAT16 file system also uses the concept of a cluster. A cluster is one or more contiguous sectors.

This parameter (the number of sectors per cluster) is often adjusted when formatting storage media, because the speed of work and the "data packing density" depend on it. FAT16, like all file systems, uses the concept of a file. A file is a data area that has a name and some attributes. Physically, in the data area, it is one or more occupied clusters, and a file occupies a whole number of clusters. Even if it occupies only slightly more than two clusters, the file system treats three clusters as occupied by the file. Therefore, the smaller the cluster size, the higher the "data packing density" and the more economically the data area is used. On the other hand, reading a file in large chunks, i.e. large clusters, is faster than reading it in small ones. Therefore, the choice of cluster size is a matter of compromise.

The FAT16 file system imposes limits on the cluster size, no more than 128 sectors (i.e. no more than 64 KB), and on the number of clusters, no more than 65525. If you use everything to the maximum, i.e. the maximum cluster size and the maximum number of clusters, it turns out that FAT16 cannot address more than about 4.29 GB of information.

If formatting is performed in automatic mode (when we do not specify the cluster size), the smallest cluster size is chosen for which the resulting number of clusters does not exceed 65525.
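Both the volume limit and the automatic choice of cluster size can be worked through in Python. This is a rough sketch under stated assumptions: the function min_sectors_per_cluster is hypothetical, it only tries power-of-two cluster sizes, and it ignores the space taken by the system area, so a real formatting utility may choose differently.

```python
SECTOR_SIZE = 512               # bytes per sector
MAX_SECTORS_PER_CLUSTER = 128   # FAT16 limit on cluster size, in sectors
MAX_CLUSTERS = 65525            # FAT16 limit on the number of clusters

# Largest cluster and largest addressable data area.
max_cluster_bytes = SECTOR_SIZE * MAX_SECTORS_PER_CLUSTER
max_volume_bytes = max_cluster_bytes * MAX_CLUSTERS
print(max_cluster_bytes, max_volume_bytes)

def min_sectors_per_cluster(data_sectors):
    """Return the smallest cluster size (in sectors) that keeps the cluster count within the limit."""
    sectors_per_cluster = 1
    while sectors_per_cluster <= MAX_SECTORS_PER_CLUSTER:
        if data_sectors // sectors_per_cluster <= MAX_CLUSTERS:
            return sectors_per_cluster
        sectors_per_cluster *= 2   # assumption: only powers of two are tried
    raise ValueError("volume is too large for FAT16")

# A data area of 2,000,000 sectors (a little under 1 GB).
print(min_sectors_per_cluster(2_000_000))
```

The product of the two limits is 4,294,246,400 bytes, which is where the figure of roughly 4.29 GB comes from.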

About the system area...

The system area is created when the media is formatted and is descriptive in nature. It consists of the following parts: the boot sector, the FAT tables and the root directory.

Let's analyze each part in more detail.

1. Boot sector

The boot sector consists of a parameter table and a loader program. The boot sector is usually 512 bytes in size, but it can be larger.

Consider the structure of the boot sector.

Do not be afraid of the large number of fields in the boot sector; it is redundant. For example, it stores information that is not relevant for flash drives: the number of sectors per track and the number of heads. So not all of the parameters will be useful to us.

If you look at the HEX code of some media formatted as FAT16, you will see the values of these fields. As an example, I will give the HEX code of a FAT16 image created in WinImage. To make the code easier to navigate, I have marked with colors which fragment of the HEX code belongs to which parameter.

P.S. The value of each field is read from right to left (the bytes are stored least significant first); for example, if 00 02h is written, it is actually 0200h, i.e. 512.

P.S. The boot sector always ends with 55AAh.

It is important to pay attention to the ReservedSectors parameter, the number of reserved sectors, at offset 0Eh. At the very beginning I said that the boot sector is usually 512 bytes in size, but it can be larger. Its size is determined by the ReservedSectors parameter; in our case ReservedSectors = 01h, so the boot sector occupies 1 sector, or 512 bytes.
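Reading these fields can be sketched in Python with the standard struct module. The function parse_boot_sector and the dictionary keys are made up for this example; the offsets of BytesPerSector (0Bh), NumberOfFATs (10h) and the total sector count (13h) follow the usual FAT layout and are an assumption on my part, since only some of the offsets are named in this post. The demo sector is synthetic, filled with the values of the example image.

```python
import struct

def parse_boot_sector(sector):
    """Extract the FAT16 parameters used in this article from a raw boot sector."""
    if len(sector) < 512:
        raise ValueError("a boot sector is at least 512 bytes long")
    if sector[510:512] != b"\x55\xAA":
        raise ValueError("the 55AAh signature is missing")
    # "<" means little-endian with no padding; H is 2 bytes, B is 1 byte.
    (bytes_per_sector, sectors_per_cluster, reserved_sectors, number_of_fats,
     root_entries, _total_sectors, media, sectors_per_fat) = struct.unpack_from(
        "<HBHBHHBH", sector, 0x0B)
    return {
        "BytesPerSector": bytes_per_sector,        # offset 0Bh (assumed)
        "SectorPerCluster": sectors_per_cluster,   # offset 0Dh
        "ReservedSectors": reserved_sectors,       # offset 0Eh
        "NumberOfFATs": number_of_fats,            # offset 10h (assumed)
        "RootEntries": root_entries,               # offset 11h
        "Media": media,                            # offset 15h
        "SectorPerFat": sectors_per_fat,           # offset 16h
    }

# Synthetic 512-byte sector with the values from the example image.
demo = bytearray(512)
struct.pack_into("<HBHBHHBH", demo, 0x0B, 512, 4, 1, 2, 512, 0, 0xF8, 1)
demo[510:512] = b"\x55\xAA"

info = parse_boot_sector(bytes(demo))
print(info)
```

The "<" prefix in the format string takes care of the right-to-left byte order mentioned in the P.S. above, so the bytes 00 02 come back as 512.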

2. FAT

After the boot sector, whose size is 512 * ReservedSectors bytes, comes the FAT1 table. Its size is determined by a two-byte boot sector field, SectorPerFat (16h). In the example above, this field equals 0001h, or 1, i.e. one sector or 512 bytes.

What is FAT?

First of all, it is an abbreviation of File Allocation Table. It is a table with one column and 512/2 = 256 rows (if the size of the FAT table is 512 bytes, i.e. SectorPerFat is 0001h, as in our case). Each row of the FAT table occupies 2 bytes, which is why the number of rows in our case is 512/2.

The table serves as a map of the clusters: each row describes one cluster, the first row the first cluster, the second row the second, and so on for all the clusters in the data area. The table is preceded by the table descriptor F8FFh (its first byte is the same value as at offset 15h of the boot sector) and the placeholder FFFFh. Next come the rows of the table, whose values can be the following:

  • 0000h: free cluster;
  • 0002h-FFEFh: number of the next cluster in the chain;
  • FFF0h-FFF6h: reserved;
  • FFF7h: defective cluster;
  • FFF8h-FFFFh: the last cluster in the chain.
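The value ranges in this list translate directly into a small classifier. This is just a sketch: the function name classify_fat16_entry and the returned labels are invented for illustration, and the value 0001h, which the list does not mention, is treated here as reserved by assumption.

```python
def classify_fat16_entry(value):
    """Map a 16-bit FAT16 table value to the meaning given in the list above."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a FAT16 entry is a 16-bit value")
    if value == 0x0000:
        return "free"
    if 0x0002 <= value <= 0xFFEF:
        return "next"        # number of the next cluster in the chain
    if 0xFFF0 <= value <= 0xFFF6:
        return "reserved"
    if value == 0xFFF7:
        return "bad"
    if value >= 0xFFF8:
        return "last"        # last cluster in the chain
    return "reserved"        # 0001h is not listed above; assumed reserved here

for raw in (0x0000, 0x0004, 0xFFF7, 0xFFFF):
    print(f"{raw:04X}h -> {classify_fat16_entry(raw)}")
```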

I will give an example of HEX code with an explanation.

I have framed the FAT1 table in blue and the FAT2 table (a copy of the FAT1 table) in red. The filled green square marks the table descriptor F8FFh and the placeholder FFFFh. The unfilled squares are table rows. I did not mark every row with a green frame; I circled only the non-zero ones.

I will explain how FAT is used and why it is needed a little later.

3. Root directory

After the FAT tables comes the root directory. This is an area of memory that contains 32-byte elements. Each element describes a file or directory located in the root directory, in other words "at the root" of the hard drive or flash drive. So the root directory describes everything that is at the root.

The size of the root directory depends on the boot sector parameter RootEntries (11h). It specifies the maximum number of 32-byte elements in the root directory. The size of the directory is therefore RootEntries * 32; in our case it is 512 * 32 = 16384 bytes.

Each element has the following structure:

I will give an example of a HEX code with an explanation.

I have framed the memory area occupied by the root directory in green and the 32-byte root directory entries in blue. The non-empty 32-byte elements are filled in blue.

There are two non-empty 32-byte elements here, which means the root directory stores two "somethings"; these can be files or other directories. In this case, to keep the example simple, two files are stored at the root: "1.txt" and "test.txt".

Let's take a closer look at these two 32-byte elements; for convenience, I marked the fragment of the HEX code and the corresponding parameter of the 32-byte element in the table with colors.

P.S. If the first byte of the file name is replaced by E5h, Windows Explorer will treat the file as deleted. Such a file can be restored by replacing the first character E5h in the name with the previous value. I'm not completely sure, but I think this is how the Recycle Bin works in Windows: when a file is placed in the bin, the operating system saves the file name somewhere and replaces the first byte of the name with E5h, and when restoring, it gives the file its former name.

P.S. File names in the FAT16 system are stored in the 8.3 format, i.e. 8 bytes are allocated for the name and 3 bytes for the extension. Names are encoded in ASCII, one byte per character. Therefore, the name cannot be longer than 8 characters, and the extension cannot be longer than 3. If the name is shorter than 8 characters, the missing bytes are filled with 20h (the space character in ASCII).

P.S. Let me remind you that the value of each field is read from right to left; for example, if 00 02h is written, it is actually 0200h, i.e. 512 in decimal.

The most important parameter for us is located at offset 1Ah: the "low word of first file cluster". It stores the number of the cluster in which the contents of the file are located, which means we can work with the contents of the file, i.e. read it, edit it, etc.

For example, "1.txt" is stored in cluster number 0x0003, or 3 in decimal. This means that if we go to cluster No. 3 in the data area (remember, the data area is just consecutive clusters), we get to the contents of this file.
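Pulling the name, the deleted mark and the first cluster out of a 32-byte element might look like this in Python. The function parse_dir_entry is hypothetical; the attribute byte at offset 0Bh and the 4-byte file size at offset 1Ch follow the usual FAT directory layout and are an assumption here, since this post only names the offset 1Ah. The sample entry is built by hand, with the name stored in upper case as is customary for 8.3 names.

```python
import struct

ATTR_DIRECTORY = 0x10   # attribute bit that marks a directory

def parse_dir_entry(entry):
    """Decode the fields of one 32-byte FAT16 directory element."""
    if len(entry) != 32:
        raise ValueError("a directory element is exactly 32 bytes long")
    # 8 bytes of name and 3 bytes of extension, padded with spaces (20h).
    name = entry[0:8].decode("ascii", errors="replace").rstrip(" ")
    ext = entry[8:11].decode("ascii", errors="replace").rstrip(" ")
    attributes = entry[0x0B]                       # assumed offset of the attribute byte
    # Little-endian: 2-byte first cluster at 1Ah, then 4-byte size at 1Ch.
    first_cluster, size = struct.unpack_from("<HI", entry, 0x1A)
    return {
        "name": f"{name}.{ext}" if ext else name,
        "deleted": entry[0] == 0xE5,               # E5h in the first byte marks a deleted file
        "is_directory": bool(attributes & ATTR_DIRECTORY),
        "first_cluster": first_cluster,
        "size": size,
    }

# Hand-made element: name, extension, attribute byte, 14 unused bytes, cluster 3, size 5.
sample = b"1       " + b"TXT" + bytes([0x20]) + bytes(14) + struct.pack("<HI", 3, 5)
parsed = parse_dir_entry(sample)
print(parsed)
```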

You may have a practical question: how do we find this third cluster? At what address is it?

How to find the cluster address knowing its number?

For this, you need to know how much space the system area occupies and how big the clusters are (i.e. how many 512-byte sectors a cluster contains).

The following figure will help you find out the size of the system area:

Example for my case

The boot sector occupies 512 * ReservedSectors bytes, in my case 512 bytes. Next, each FAT table occupies one sector, i.e. 512 bytes (since SectorPerFat is 1). There are two tables (because NumberOfFATs is equal to 2), so the two tables take 512 * 2 = 1024 bytes in total. The size of the root directory is 512 32-byte elements, i.e. 512 * 32 = 16384 bytes. Let's add it up:

512 (boot sector) + 1024 (two FAT tables) + 16384 (root directory) = 17920 bytes, or 0x4600 in hexadecimal.

As a result, in our case, the data area starts at 0x4600. Let's take a look:

We see the contents of some file, but not ours. The data of the file we are interested in (1.txt) is stored in cluster №3.

Now we need to find out the cluster size; the boot sector parameter SectorPerCluster (offset 0xD, 1 byte in size) will help us with this. In our case the cluster size is 4 sectors, i.e. 512 * 4 = 2048 bytes, or 0x800 in hexadecimal. It is important to note that clusters are numbered from two, not from one (!).

Let's calculate the address at which cluster No. 3 starts:

0x4600 (system area) + 0x800 (size of cluster No. 2) = 0x4E00

Let's calculate the address at which cluster No. 3 ends:

0x4E00 (beginning of cluster No. 3) + 0x800 (512 * 4, the size of one cluster in HEX) = 0x5600

As a result, cluster No. 3 lies in the address range 0x4E00-0x5600.
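The same calculation can be written as two small Python functions. The names data_area_offset and cluster_offset are made up for this sketch, and the numbers are those of the example image (ReservedSectors = 1, NumberOfFATs = 2, SectorPerFat = 1, RootEntries = 512, SectorPerCluster = 4).

```python
SECTOR_SIZE = 512   # bytes per sector
ENTRY_SIZE = 32     # bytes per root directory element

def data_area_offset(reserved_sectors, number_of_fats, sectors_per_fat, root_entries):
    """Size of the system area in bytes, which is also the address where the data area starts."""
    boot = reserved_sectors * SECTOR_SIZE
    fats = number_of_fats * sectors_per_fat * SECTOR_SIZE
    root = root_entries * ENTRY_SIZE
    return boot + fats + root

def cluster_offset(cluster, data_start, sectors_per_cluster):
    """Byte address of the first byte of the given cluster."""
    if cluster < 2:
        raise ValueError("clusters are numbered from 2")
    # Cluster 2 sits at the very start of the data area, hence the "- 2".
    return data_start + (cluster - 2) * sectors_per_cluster * SECTOR_SIZE

data_start = data_area_offset(1, 2, 1, 512)
start = cluster_offset(3, data_start, 4)
end = start + 4 * SECTOR_SIZE
print(hex(data_start), hex(start), hex(end))
```

Subtracting 2 from the cluster number is the step that is easiest to forget, since the numbering starts at two rather than at zero or one.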

Let's see the HEX code

I have framed the content of the 1.txt file in blue. Everything above the frame is the contents of another file. Empty areas of the sector are filled with 0x00.

So why do we need a FAT table?

If a file occupies more than one cluster (in our case, if the file is larger than 2048 bytes), the FAT table comes to the rescue. It is something like a "map" of clusters. That is, once we know the number of the cluster with which the file we are interested in begins, the first thing we need to look at is the row with the same number in the FAT.

If the value of the row is 0xFFF8-0xFFFF, this is the last cluster of the file, i.e. the file occupies just one cluster.

If the value of the row is 0x0002-0xFFEF, the file continues in another cluster. The value is the number of the next cluster, which holds the continuation of the file. We must continue reading the file from the cluster with that number.

After reading the new cluster, you need to look at the value of the row with its number in the FAT. If the value of the row is 0xFFF8-0xFFFF, this cluster is the last one in the file. If it is 0x0002-0xFFEF, it is the number of the next cluster; read on and repeat the action. Reading a file is a conditional loop.
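The conditional loop described above can be sketched in Python. The function cluster_chain and the list fat are hypothetical; the sketch assumes the table has already been read into a list indexed by cluster number, with indexes 0 and 1 holding the descriptor and the placeholder. The loop check is an extra safety measure for damaged tables, not something this post requires.

```python
def cluster_chain(fat, first_cluster):
    """Follow FAT16 rows from the first cluster of a file and return all of its clusters in order."""
    chain = []
    visited = set()
    cluster = first_cluster
    while True:
        if cluster < 2 or cluster >= len(fat):
            raise ValueError(f"cluster {cluster} is outside the table")
        if cluster in visited:
            raise ValueError("the chain loops back on itself (damaged FAT)")
        visited.add(cluster)
        chain.append(cluster)
        value = fat[cluster]
        if value >= 0xFFF8:               # 0xFFF8-0xFFFF: last cluster of the file
            return chain
        if not 0x0002 <= value <= 0xFFEF:
            raise ValueError(f"unexpected value {value:#06x} in the chain")
        cluster = value                   # number of the next cluster

# Rows 0 and 1 are the descriptor and placeholder; a file starts in cluster 3.
fat = [0xFFF8, 0xFFFF, 0xFFFF, 0x0004, 0x0007, 0xFFFF, 0xFFFF, 0xFFFF]
print(cluster_chain(fat, 3))
print(cluster_chain(fat, 2))
```

Reading the file then comes down to visiting the returned clusters in order and copying the bytes of each one.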

So we figured out the files, now it's time to deal with the directories.

What is a directory?

The directory for the FAT16 file system (and for many others) is a special zero-size file that stores a list of its contents.

Let's say we added the directory "TEST_DIR" containing the file "in_dir.txt". A new 32-byte element will then appear in the root directory; it describes the directory in the same way as a file, but with slight differences.

I have marked in red the parameters specific to directories: 0x10, the directory attribute, and 0x00000000, the file size.

As you can see in the blue square, the directory is in cluster No. 5. Let's see what is there.

The contents of the "file" TEST_DIR are in fact the same as those of the root directory, i.e. a set of 32-byte elements. I have marked each element with a green border.

The elements describe the name of the file or directory, its attributes and the number of the cluster in which its data is located. Any folder (other than the root) always contains two directories named "." and "..".

The first points to cluster No. 5, i.e. it is this same directory, and the second points to cluster number 0. Here this number means "root directory", i.e. it is the way out to the root directory.

The description of the file "in_dir.txt" is standard, the same as for entries in the root directory (see the root directory section). For us, the main thing is the number of the cluster in which the contents of this file are located (marked with a red square).

We look at cluster No. 6 and see the content of the file "in_dir.txt". I have marked the beginning of the cluster with a red line.


In addition to all its other tasks, the operating system fulfills its main purpose: it organizes work with data according to a certain structure. The file system is used for this purpose. What a file system (FS) is, what kinds exist, and other information about it are presented below.

General description

The file system is a part of the operating system that is responsible for placing, storing and deleting information on media, providing users and applications with this information, and ensuring its safe use. In addition, it is the file system that helps with data recovery in the event of a hardware or software failure. This is why the file system is so important. What kinds of file systems are there? There are several types:

For hard drives, that is, devices with random access;

For magnetic tapes, that is, devices with serial access;

For optical media;

Virtual systems;

Network systems.

The logical unit of data storage in the file system is a file, that is, an ordered collection of data that has a specific name. All data used by the operating system is presented in the form of files: programs, images, texts, music, videos, as well as drivers, libraries, and so on. Each such element has a name, type, extension, attributes, and size. So, a file system is a collection of such elements, together with the ways of working with them. Depending on the form in which it is used and the principles that apply to it, several main types of FS can be distinguished.

Program approach

If we consider the file system from the software side, it should be noted that it is a multi-level structure. At its top level is the file system switch, which provides an interface between the system and a specific application. It converts file requests into a format accepted by the next level, the file system drivers. These, in turn, call the drivers of the specific devices that store the required information.

In client-server applications, the performance requirements for the FS are quite high. Modern systems are designed to provide efficient access, support for large-capacity media, protection of data from unauthorized access, and preservation of the integrity of information.

FAT file system

This type was developed back in 1977 by Bill Gates and Marc McDonald. It was originally used in the 86-DOS operating system. Speaking of what the FAT file system is, it is worth noting that it initially could not support hard drives and only worked with floppy disks up to 1 megabyte. This restriction is no longer relevant; Microsoft used this FS for MS-DOS 1.0 and subsequent versions. FAT uses certain file naming conventions:

The name must start with a letter or number and can contain any ASCII character except for the space and special characters;

The name must be no longer than 8 characters, followed by a dot and then an extension consisting of three letters;

File names can use any case; case is neither distinguished nor preserved.

Since FAT was originally designed for the single-user DOS operating system, it made no provision for storing data about the owner or access rights. At the moment this file system is the most widespread; most operating systems support it to one degree or another. Its versatility makes it possible to use it on volumes that are shared by different operating systems. It is a simple FS that is not able to prevent file corruption caused by an incorrect computer shutdown. Operating systems based on it include special utilities that check the structure and correct file inconsistencies.

NTFS file system

This FS is the preferred choice for working with Windows NT, as it was developed specifically for it. The OS includes the convert utility, which converts FAT and HPFS volumes to NTFS volumes. Speaking of what the NTFS file system is, it is worth noting that it significantly expanded the ability to control access to particular directories and files, introduced many attributes, implemented dynamic file compression and fault tolerance, and supports the requirements of the POSIX standard. In this FS you can use names up to 255 characters long, while the short name is generated in the same way as in VFAT. In the event of an operating system failure, NTFS is able to recover itself, so the disk volume remains available and the directory structure does not suffer.

Features of NTFS

On an NTFS volume, each file is represented by a record in the Master File Table (MFT). The first 16 records are reserved by the file system itself for storing special information. The very first record describes the MFT itself. If the first record is destroyed, the second record is read to locate the mirror copy of the MFT, whose first record is identical to that of the main table. The logical center of the disk contains a copy of the boot file. The third record describes the log file, which is used for data recovery. The seventeenth and subsequent records contain information about the files and directories on the hard disk.

The transaction log contains a complete set of operations that change the volume structure, including operations to create files, as well as any commands that affect the directory structure. The transaction log is designed to recover NTFS from a system crash. The entry for the root directory contains a list of the directories and files that are in the root directory.

EFS Features

The Encrypting File System (EFS) is a Windows feature that stores information on a hard drive in encrypted form. Encryption is the strongest protection that this operating system can offer. For the user, encryption is a fairly simple operation: you only need to check a box in the properties of the folder or file. You can specify who is allowed to read such files. Files are encrypted when they are closed and are automatically ready for use when they are opened.

Features of RAW

Data storage devices are among the most vulnerable components: they are often damaged not only physically but also logically. Certain hardware problems can be fatal, while others have solutions. Sometimes users ask: "What is the RAW file system?"

As you know, in order to write any information to a hard drive or flash drive, the drive must have a file system. The most common are FAT and NTFS. RAW is not a file system in the usual sense. It is actually a logical error in the installed file system, which means that, as far as Windows is concerned, the file system is absent. Most often, RAW is associated with the destruction of the file system structure. After that, the OS not only cannot access the data, but also does not display technical information about the device.

UDF Features

The Universal Disk Format (UDF) is designed to replace CDFS and add support for DVD-ROM devices. It is a new implementation of the older format that meets the requirements of its governing standard, and it has the following features:

Filenames can be up to 255 characters long;

The name can be lower or upper case;

The maximum path length is 1023 characters.

Starting with Windows XP, this file system is read/write.

The exFAT file system is used for flash drives that are meant to be shared between computers running different operating systems, in particular Windows and Linux. exFAT became the "bridge" between them, since it can work with data coming from operating systems that each have their own native file system.

Conclusions

As is clear from the above, each operating system uses certain file systems. They are intended to store ordered data structures on physical storage media. If, while using a computer, you find yourself asking what the "destination file system" is, you most likely tried to copy a file to some media and received a message that the allowed size was exceeded. That is why you need to know which file sizes each file system accepts, so that you do not run into problems when transferring information.

Before the advent of the Microsoft Windows NT operating system, personal computer users seldom faced the problem of choosing a file system. All owners of the MS-DOS and Microsoft Windows operating systems (OS) used one of the varieties of the file system called FAT (FAT-12, FAT-16 or FAT-32).

Now the situation has changed. When you format a disk while installing Microsoft Windows NT/2000/XP, you need to choose between three file systems - FAT-16, FAT-32 or NTFS.

In this article, we will talk about the internal structure of the listed file systems and consider their inherent disadvantages and advantages. Armed with this knowledge, you will be able to make an informed choice in favor of a particular file system for Microsoft Windows.

Briefly about the FAT file system

The FAT file system appeared at the dawn of the development of personal computers and was originally intended for storing files on floppy disks.

Information is stored on disks and floppy disks in portions, in sectors of 512 bytes. The entire space of a floppy disk was divided into regions of a fixed length, called clusters. A cluster may contain one or more sectors.

Each file occupies one or more clusters, possibly non-contiguous. File names and other information about files, such as size and date of creation, are located in the initial area of the floppy disk dedicated to the root directory.

In addition to the root directory, other directories can be created in the FAT file system. Together with the root directory, they form a tree of directories containing information about files and directories. As for the location of file clusters on the disk, this information is stored in the initial area of the floppy disk, called the file allocation table (File Allocation Table, FAT).

For each cluster, the FAT table has its own individual cell, which stores information about how this cluster is used. Thus, the file allocation table is an array containing information about clusters. The size of this array is determined by the total number of clusters on the disk.

The directory stores the number of the first cluster allocated to a file or subdirectory. The numbers of the remaining clusters can be found using the FAT file allocation table.
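The lookup described above can be sketched in a few lines. This is a simplified model, not the on-disk format: the table is a plain Python list, and `END_OF_CHAIN` is an assumed single end marker chosen for the illustration.

```python
END_OF_CHAIN = 0xFFFF  # assumed end-of-chain marker for this toy model


def cluster_chain(fat, first_cluster):
    """Return the cluster numbers of a file, starting from the first
    cluster recorded in its directory entry and following the table."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            raise ValueError("loop detected in the cluster chain")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]  # each cell holds the number of the next cluster
    return chain


# Toy table: the file starts at cluster 2 and continues 2 -> 5 -> 7 -> 3.
toy_fat = [0, 0, 5, END_OF_CHAIN, 0, 7, 0, 3]
print(cluster_chain(toy_fat, 2))  # prints [2, 5, 7, 3]
```

The example also shows why a file's clusters do not have to be contiguous: the order is defined by the table, not by position on the disk.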

When the FAT table format was developed, the goal was to save space, because a floppy disk has a very small capacity (from 180 KB to 2.44 MB). Therefore, only 12 bits were allocated to store each cluster number. As a result, the FAT table was packed so tightly that it occupied only one sector of the floppy disk.

The FAT table contains critically important information about the location of directories and files. If the FAT table is damaged as a result of a hardware failure, a software error or the malicious effects of viruses, access to files and directories will be lost. Therefore, as a safety net, two copies of the FAT table are usually stored on the disk.
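To illustrate how tightly 12-bit cells can be packed, here is a minimal sketch in which two cells share three bytes. The function names are hypothetical, and the code models only the bit packing, not a complete table layout.

```python
def pack_fat12(entries):
    """Pack 12-bit values into bytes, two entries per three bytes."""
    entries = list(entries)
    if len(entries) % 2:
        entries.append(0)  # pad to an even number of entries
    out = bytearray()
    for a, b in zip(entries[0::2], entries[1::2]):
        out.append(a & 0xFF)                               # low 8 bits of a
        out.append(((a >> 8) & 0x0F) | ((b & 0x0F) << 4))  # high 4 of a, low 4 of b
        out.append((b >> 4) & 0xFF)                        # high 8 bits of b
    return bytes(out)


def read_fat12(table, n):
    """Read the n-th 12-bit entry from a packed table."""
    offset = n + n // 2  # each entry takes 1.5 bytes
    pair = table[offset] | (table[offset + 1] << 8)
    return pair >> 4 if n % 2 else pair & 0x0FFF


# A 512-byte sector has room for this many whole 12-bit entries:
ENTRIES_PER_SECTOR = 512 * 8 // 12  # 341
```

With this packing, one 512-byte sector holds 341 entries, compared with 256 if each entry took two full bytes.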

Various versions of FAT

After the advent of large-capacity hard disks (in those days, disks of 10-20 MB in size were considered large), the number of clusters increased, and 12 bits were not enough to store their numbers. A new 16-bit file allocation table format was developed, where two bytes were allocated to store the number of one cluster. The old file system designed for floppy disks became known as FAT-12, and the new one became FAT-16.

The enlarged FAT-16 table no longer fit in one sector; however, on large disks this drawback did not play a significant role. As before, two copies of the FAT table were stored on the disk as a safeguard.

However, when disk capacity began to be measured in hundreds of MB and even in gigabytes, the FAT-16 file system again became inefficient. For cluster numbers to fit into 16 bits when formatting large disks, the cluster size has to be increased to 16 KB or even more. This caused problems when a large number of small files had to be stored on the disk. Since file storage space is allocated in whole clusters, even a very small file takes up a disproportionate amount of disk space.
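The waste caused by large clusters is easy to estimate. The sketch below is a back-of-the-envelope calculation with a hypothetical helper; the 300-byte file size and the count of 10,000 files are made-up example values.

```python
def allocated_bytes(file_size, cluster_size):
    """Disk space taken by a file when space is handed out in whole clusters."""
    clusters = (file_size + cluster_size - 1) // cluster_size  # round up
    return clusters * cluster_size


CLUSTER = 16 * 1024   # 16 KB clusters, as in the text
small_file = 300      # example: a 300-byte file
count = 10_000        # example: ten thousand such files

used = count * allocated_bytes(small_file, CLUSTER)
payload = count * small_file
print(used, payload)  # prints 163840000 3000000
```

In this example about 3 MB of actual data occupies roughly 164 MB of disk space, because every 300-byte file takes a full 16 KB cluster.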

As a result, another, and apparently the last, attempt to improve the FAT file system was made - the cell size of the file allocation table was increased to 32 bits. This made it possible to format disks of hundreds of MB and several GB using a relatively small cluster size. The new file system became known as FAT-32.

Standard 8.3

Before the advent of Microsoft Windows 95, personal computer users were forced to use the very inconvenient "8.3 standard" for naming files, in which a file name could consist of at most 8 characters plus at most 3 extension characters. This limitation was imposed not only by the programming interface of the MS-DOS operating system, but also by the directory entry structure of the FAT file system.

After the structure of directory entries was modified, the limit on the number of characters in a file name was practically removed. A file name can now be up to 255 characters long, which is obviously sufficient in most cases. However, this modified FAT file system became incompatible with the MS-DOS operating system, as well as with the Microsoft Windows 3.1 and 3.11 shells that run in its environment.

You can read more about the formats of internal FAT structures in our article "Data Recovery in FAT Partitions" published on this site.

FAT file system limitations

When deciding whether to use the FAT file system to format a drive, you should be aware of its inherent limitations. These restrictions concern, first of all, the maximum size of a FAT drive, as well as the maximum size of a file located on this drive.

The maximum size of a FAT-16 logical drive is 4 GB, which is very small by modern standards. Microsoft, however, does not recommend creating FAT-16 disks larger than 200 MB, because disk space on them would be used very inefficiently.

Theoretically, the maximum size of a FAT-32 disk can be 8 TB, which should be enough to deploy any modern applications. This value is obtained by multiplying the maximum number of clusters (268,435,445) by the maximum cluster size allowed in FAT-32 (32 KB).
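The multiplication above can be checked directly. The sketch uses only the two figures quoted in the text; the constant and function names are illustrative.

```python
MAX_CLUSTERS = 268_435_445      # maximum number of clusters, from the text
MAX_CLUSTER_SIZE = 32 * 1024    # 32 KB, from the text


def fat32_max_volume_bytes():
    """Theoretical FAT-32 volume limit: clusters times cluster size."""
    return MAX_CLUSTERS * MAX_CLUSTER_SIZE


size = fat32_max_volume_bytes()
print(size)             # prints 8796092661760
print(size / 2**40)     # just under 8 (in units of 2**40 bytes)
```

The product is 8,796,092,661,760 bytes, which is just under 8 TB when a terabyte is counted as 2^40 bytes.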

However, in practice the situation looks a little different.

Due to internal limitations, the ScanDisk utility in Microsoft Windows 95/98 is unable to work with disks larger than 127.53 GB. A year ago, such a limitation would not have caused problems, but today inexpensive 160 GB disks have already appeared on the market, and soon their capacity will be even larger.

As for the new Microsoft Windows 2000/XP operating systems, they are not able to create FAT-32 partitions larger than 32 GB. If you need partitions of this size or more, Microsoft will suggest that you use the NTFS file system.

Another significant limitation of FAT-32 concerns file size - a file cannot exceed 4 GB. This limitation matters, for example, when recording video clips to disk or when creating large database files.

A FAT-32 directory can store a maximum of 65534 files.

Disadvantages of FAT

In addition to the limitations discussed above, the FAT file system has other disadvantages. The most significant, apparently, is the complete absence of access control tools, as well as the possibility of losing information about the location of all files after the destruction of a fairly compact FAT table and its copy.

By booting the computer from a system floppy disk, an attacker can easily access any files stored on disks with the FAT file system. It will not be difficult for him to then copy these files to a ZIP device or some other external storage medium.

When using FAT on server disks, it is impossible to provide reliable and flexible separation of user access to directories. For this reason, and also because of its low fault tolerance, FAT is not commonly used on servers.

The compactness of the FAT file allocation tables makes this file system a vulnerable target for computer viruses - it is enough to destroy the initial fragment of a FAT disk, and almost all data will be lost.

NTFS file system

The modern NTFS file system, developed by Microsoft for its Windows NT operating system, is free of the limitations and disadvantages of FAT. Since its inception, NTFS has undergone several enhancements, the most recent of which (at the time of this writing) was made in Microsoft Windows XP.

In the NTFS file system, all file attributes (name, size, location of file extents on disk, etc.) are stored in a hidden system file, $MFT. From one to several KB of $MFT is allocated to store information about each file (and directory). With a large number of files stored on disk, the size of the $MFT file can reach tens or even hundreds of MB.

Small files (on the order of hundreds of bytes) are stored directly in $MFT, which significantly speeds up access to them.

Note, however, that the overhead of NTFS for storing system information, although it exceeds the overhead of FAT, is still not very large compared to the volume of modern disks. Due to the fact that the $MFT file is usually located closer to the middle of the disk, the destruction of the first tracks of an NTFS disk does not lead to such fatal consequences as the destruction of the initial areas of a FAT disk.

The NTFS file system has many features not found in FAT. They allow you to achieve much more flexibility, reliability and security compared to FAT.

Below we list some of the most interesting features of modern versions of NTFS.

Access control tools

NTFS access control tools are quite flexible and allow you to control access at the level of individual files and directories, providing (or blocking) access to them to individual users or groups of users.

Although at first glance it may seem that access control tools are needed only for file servers, they will be required even if several users have access to the computer.

File encryption

The access control tools mentioned above will be useless if the physical NTFS disk falls into the hands of an attacker. Using modern utilities, the contents of such a disk can be easily read in any operating system environment - DOS, Microsoft Windows or Linux.

In order to protect user files from unauthorized access, Microsoft Windows 2000/XP operating systems provide additional encryption of files stored on NTFS partitions. And although the strength of such encryption may not be very high, it is quite sufficient in most cases.

Software RAID

Using NTFS, you can create a so-called software RAID 1 array (mirrored set). This array, made up of two physical or logical disks of the same size, allows you to duplicate (or, as they say, "mirror") files.

Such an array can save your files in the event of a physical failure of one of the disks that make up the array, so it is often used to increase the reliability of the disk system.

Volume Sets

The NTFS file system allows you to combine several partitions located on one or more physical disks into one logical volume. This may be necessary, for example, to store large database files that do not fit on one physical disk, or to create a directory with a total volume of files that exceeds the size of the physical disk.

Sets created from several partitions or physical disks are called a Volume Set (in Microsoft Windows NT terminology) or a Spanned Volume (in Windows 2000/XP terminology).

Packing files

To save disk space, you can use the ability of NTFS to pack (compress) files. In addition, NTFS allows you to create so-called sparse files that contain areas of null data. Such files can be large but take up little disk space because only the significant bytes of the file are actually stored.

Note that packing files will result in some slowdown. This, however, will not always matter. For example, office documents can be packed without a noticeable decrease in speed, but the same cannot be said about database files that are accessed simultaneously by a large number of users. With relatively inexpensive, high-capacity disks on the market, file packing should only be used when really needed. This, however, applies to other NTFS features as well.

Multi-stream files

If necessary, several streams of information can be stored in one file recorded on an NTFS disk. This makes it possible, in particular, to attach additional information to document files, to store several versions of a document in one file (for example, in different languages), to keep program code and data in separate streams of one file, etc.

Hard links

Hard links allow you to assign several different names to one physical file by placing these names (i.e. links to the file) in different directories. Deleting a link does not delete the file itself. Only when all links to the file are destroyed will the file itself be deleted.
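The behavior described above can be tried with Python's standard `os.link` call. This is a small sketch that works in a temporary directory; the file and directory names are made up, and it assumes the underlying file system supports hard links.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, give it a second name in another directory,
    delete the first name and show that the data is still reachable."""
    with tempfile.TemporaryDirectory() as tmp:
        dir_a = os.path.join(tmp, "a")
        dir_b = os.path.join(tmp, "b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)

        original = os.path.join(dir_a, "report.txt")
        alias = os.path.join(dir_b, "alias.txt")
        with open(original, "w") as f:
            f.write("data")

        os.link(original, alias)                 # second name for the same file
        links_before = os.stat(original).st_nlink

        os.remove(original)                      # removes one name only
        with open(alias) as f:
            content = f.read()                   # data is still there
        links_after = os.stat(alias).st_nlink
    return links_before, content, links_after
```

On a file system with hard link support, the link count is 2 while both names exist and drops to 1 after one of them is removed, and the contents stay readable through the remaining name.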

Note that such features are typical for file systems used in Unix-like operating systems, for example, in Linux, FreeBSD, etc.

Reparse points

NTFS system objects called reparse points allow the way any file or directory is handled to be redefined. For example, rarely used files or directories marked with a reparse point can actually be stored on magnetic tape and loaded to disk only when necessary.

Junction points

Using NTFS junction points, you can mount another hard drive or CD into a directory on a disk. This feature originally existed in the file systems of Unix-like operating systems.

Disk space quota

The NTFS file system used in Microsoft Windows 2000/XP allows you to set quotas, that is, to limit the disk space available to users. This feature is especially useful when creating file servers.

Change Logging

In the course of its work, the operating system performs various actions on files (creation, modification, deletion). All such changes are stored in a special journal created on an NTFS volume and can be used by backup programs, indexing systems, etc. Logging changes increases the reliability of the file system, in some cases allowing work to continue after non-critical failures of the operating system and hardware. Of course, most serious failures still make it necessary to recover data from a backup or with special data recovery utilities.

NTFS limitations

Despite the abundance of features, the NTFS file system also has some limitations. However, in most cases they do not play a significant role.

The maximum size of an NTFS logical drive is approximately 18,446,744 TB, which is obviously enough for all modern applications, as well as applications that will appear in the near future. The maximum file size is even larger, so this limitation is also not significant.
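The figure of about 18,446,744 TB matches the number of bytes addressable with 64 bits, expressed in decimal terabytes. That this is the origin of the figure is an assumption made for this check; the function name is illustrative.

```python
def bytes_to_decimal_tb(n_bytes):
    """Convert bytes to whole decimal terabytes (1 TB = 10**12 bytes)."""
    return n_bytes // 10**12


limit_bytes = 2**64  # assumed basis of the quoted limit: a 64-bit byte count
print(bytes_to_decimal_tb(limit_bytes))  # prints 18446744
```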

There is no limit to the number of files stored in a single NTFS directory, so here too NTFS has an advantage over FAT.

Comparison of NTFS and FAT for file access speed

In terms of future prospects, functionality, security and reliability, NTFS is far ahead of FAT. However, comparing the performance of these file systems does not give an unambiguous result, since performance depends on many different factors.

Because FAT is much simpler in operation and internal structures than NTFS, FAT is likely to be faster when dealing with small directories. However, if the contents of the directory are so small that they fit entirely within one or more $MFT file entries, or vice versa, if the directory is very large, NTFS will "win".

NTFS will most likely come out ahead when searching for non-existent files or directories (because this does not require a complete scan of the directory contents), when accessing small files (on the order of hundreds of bytes), and also in the case of severe disk fragmentation.

To increase the performance of NTFS, you can increase the cluster size, but this can lead to wasteful use of disk space when storing a large number of files whose sizes range from 1-2 KB to tens of KB. By increasing the cluster size to 64 KB, you can get the maximum performance improvement, but you will have to give up packing files and using defragmentation utilities.

Packing files located on small disks (about 4 GB) may increase performance, while packing files on large disks may decrease it. In any case, packing places an additional load on the CPU.

So what to choose - FAT or NTFS?

As you can see, NTFS has numerous advantages over FAT, and its limitations are negligible in most cases. If you are faced with choosing a file system, consider using NTFS first and FAT second.

What might be the barriers to replacing FAT with NTFS?

The most serious obstacle is the need to use Microsoft Windows NT/2000/XP. For normal operation of this OS, at least 64 MB of RAM and a processor with a clock speed of at least 200-300 MHz are required. However, these requirements are not met only by very old computers that are not capable of running modern versions of Microsoft Windows.

If your computer can run Microsoft Windows 2000/XP, and you do not have a single application designed exclusively for Microsoft Windows 95/98/ME, we recommend that you switch to the new operating system as soon as possible, replacing FAT with NTFS.

At the same time, you will also get a noticeable increase in reliability, because after installing all the necessary service packs, as well as the correct versions of peripheral device drivers, Microsoft Windows 2000/XP runs very stably.

In some cases, you have to combine several file systems on one physical disk. For example, if your computer has three operating systems (Microsoft Windows ME, Microsoft Windows XP and Linux), you can create three file systems - FAT, NTFS and Ext2FS. The first of them will be "visible" when working in Microsoft Windows ME and Linux, the second - only in Microsoft Windows XP, and the third - only in Linux (note that Linux can also access NTFS partitions).

But if you are creating a server (file, database or Web) based on Microsoft Windows NT/2000/XP, then NTFS is the only reasonable choice. Only in this case will it be possible to achieve the necessary stability, reliability and security of the server.

There is also a widely held (and, in our opinion, erroneous) opinion that home computer users need neither the Microsoft Windows NT/2000/XP operating system nor the NTFS file system.

Of course, if the computer is used exclusively for gaming, then for compatibility reasons it is best to install Microsoft Windows 98/ME and format the drives in FAT. However, if you work not only in the office but also at home, it is better to use modern, professional and reliable solutions. This will make it possible, in particular, to protect your computer against intrusion via the Internet, restrict access to directories and files with critical data, and also increase the chances of successful information recovery in the event of various kinds of failures.