Why so many filesystems for Linux? What’s the difference?
Why So Many File Systems?
There are three main reasons why there are so many File Systems on Linux:
* It’s open source: effectively everyone owns it.
* File Systems competing for better performance and/or scalability.
* File Systems allowing for compatibility/portability of existing data (migrations from other systems).
Open source means anyone can contribute, and many have. This has made
about 20 different file systems available for Linux, ranging from very
rudimentary file systems to extremely complex and feature-rich ones. As
storage needs have grown, so has the need for more scalable file
systems. This second reason has led to file systems that claim to run
faster, handle more files, scale to larger volumes, and handle more
concurrent access to data. Lastly, as mainframe and minicomputer
systems have given way to less expensive Intel Architecture based
commodity PC servers running Linux, and as users have moved from
non-Linux PC operating systems to Linux, the need to preserve access to
data stored on those other systems has resulted in additional file
systems that understand that data and storage.
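On a running Linux system, the kernel reports the file system drivers it currently has registered in /proc/filesystems. The following is a minimal Python sketch that parses that format. The function name and the sample text are illustrative assumptions, not part of the original article.

```python
def parse_proc_filesystems(text):
    """Split /proc/filesystems content into (disk_backed, virtual) name lists.

    Each line is either "nodev<TAB>name" for file systems that need no block
    device (proc, sysfs, ...) or "<TAB>name" for disk-backed ones.
    """
    disk_backed, virtual = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "nodev" and len(fields) > 1:
            virtual.append(fields[1])
        else:
            disk_backed.append(fields[-1])
    return disk_backed, virtual


# Illustrative sample; on a real system use open("/proc/filesystems").read()
SAMPLE = "nodev\tsysfs\nnodev\tproc\n\text3\n\tvfat\n\txfs\n"
disk_backed, virtual = parse_proc_filesystems(SAMPLE)
print(disk_backed)
print(virtual)
```

Note that this only shows file systems that are built into the kernel or whose modules are currently loaded, so the list on a given machine is usually shorter than the full set available for Linux.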
File System Comparison
The following list describes the characteristics of each Linux file
system and indicates when it is best used. The list is not exhaustive,
but focuses on file systems that have appreciable market share or
attention today. A detailed comparison of file system features can be
found at: http://en.wikipedia.org/wiki/Comparison_of_file_systems
- EXT2
- Recommended to move to EXT3
- Not Journaled
- POSIX access control
The EXT2 file system is the predecessor to the EXT3 file system. EXT2
is not journaled, and hence is no longer recommended (customers should
move to EXT3).
- EXT3
- Most popular Linux file system, limited scalability in size and number of files
- Journaled
- POSIX extended access control
The EXT3 file system is a journaled file system and the most widely
used on Linux today. It is the "Linux" file system. It is quite robust
and quick, although it does not scale well to large volumes or to a
great number of files. A scalability feature called htrees was recently
added, which significantly improved EXT3’s scalability. However, even
with htrees it is still not as scalable as some of the other file
systems listed. With htrees, it scales similarly to NTFS. Without
htrees, EXT3 does not handle more than about 5,000 files in a directory.
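Whether a given EXT2/EXT3 volume has a journal and htrees enabled can be read from its feature flags. The sketch below assumes the text output of `tune2fs -l <device>`, which includes a "Filesystem features:" line where the journal appears as `has_journal` and htrees as `dir_index`. The function names and the sample output are illustrative, not from the original article.

```python
def ext_features(tune2fs_output):
    """Return the set of feature flags from `tune2fs -l` style output."""
    for line in tune2fs_output.splitlines():
        if line.startswith("Filesystem features:"):
            return set(line.split(":", 1)[1].split())
    return set()  # no features line found


def describe(features):
    """Summarize journaling and htree status from a feature-flag set."""
    return {
        "journaled": "has_journal" in features,  # the journal EXT3 adds to EXT2
        "htree": "dir_index" in features,        # hashed directory index
    }


# Illustrative sample of the relevant part of the tool's output
SAMPLE = (
    "Filesystem volume name:   <none>\n"
    "Filesystem features:      has_journal ext_attr resize_inode dir_index filetype\n"
)
print(describe(ext_features(SAMPLE)))
```

A volume that reports neither flag behaves like plain EXT2, and one with `has_journal` but without `dir_index` is the "EXT3 without htrees" case described above.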
- FAT32
- Most limited file system, but most ubiquitous
- Not Journaled
- No access controls
FAT32 is the crudest of the file systems listed. Its popularity comes
from its widespread use in the Windows desktop world and from its
having become the file system of flash memory devices (digital cameras,
USB memory sticks, etc.). It has no built-in access control, so it is
small and works well in these portable and embedded applications. It
scales the least of the file systems listed. Most systems include FAT32
compatibility support due to its ubiquity.
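One concrete instance of FAT32's limited scaling is its per-file size limit: the file size is stored in a 32-bit field, so a single file cannot exceed 2^32 - 1 bytes (just under 4 GiB). The following small sketch shows a check one might run before copying files to a FAT32 device. The constant and function names are hypothetical, chosen for illustration.

```python
FAT32_MAX_FILE_SIZE = 2**32 - 1  # bytes; the on-disk size field is 32 bits wide


def fits_on_fat32(size_bytes):
    """Return True if a single file of this size can be stored on FAT32."""
    return 0 <= size_bytes <= FAT32_MAX_FILE_SIZE


print(fits_on_fat32(700 * 1024**2))  # a 700 MiB file
print(fits_on_fat32(4 * 1024**3))    # a file of exactly 4 GiB
```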
- GFS
- Useful in clusters for moderate scale out and shared SAN volumes
- Symmetrical Parallel Cluster File System, Journaled
- POSIX access controls
The Red Hat Global File System (from the Sistina acquisition) was open
sourced in mid 2004. It is a symmetrical parallel cluster file system
which allows multiple machines to access common data on a SAN (Storage
Area Network). This is important for allowing multiple machines access
to the same data to ease management (such as common configuration files
shared between multiple web servers). It also allows applications and
services which are written for direct disk access to be scaled out to
multiple nodes. The practical limit, however, is 16 machines in a SAN
cluster.
- GPFS
- Useful in clusters for scaleout of large files on shared SAN volumes
- Symmetrical Parallel Cluster File System, Journaled
- POSIX access controls
The IBM General Parallel File System is closed source. Like GFS, it is
a parallel cluster file system with similar characteristics. GPFS works
best when the node count is less than 10 and the files being accessed
are few and large. Video editing is the sweet spot for GPFS. Its
practical limit is 20 machines in a SAN cluster. GPFS also includes
very rich management features, such as Hierarchical Storage Management.
- JFS
- High performance and scalability
- Journaled
- POSIX extended access controls
The IBM Journaled File System is the file system used by IBM in AIX and
OS/2. It is a feature-rich file system ported to Linux to allow for
ease of migration of existing data. It has been shown to provide
excellent overall performance across a variety of workloads.
- NSS
- Best for shared LAN file serving, excellent scalability in number of files
- Journaled
- NetWare Trustee access control (richer than POSIX)
The Novell Storage Services file system is used in NetWare 5.0 and
above, and was most recently open sourced and included in Novell SUSE’s
SLES 9 SP1 Linux distribution and later (used in Novell’s Open
Enterprise Server Linux product). The NSS file system is unique in many
ways, chiefly in its ability to manage and support shared file services
across several file access protocols simultaneously. It is designed to
manage access control in enterprise file sharing environments, using a
unique model called the Trustee Model that scales to hundreds of
thousands of different users accessing the same storage securely. It
and its predecessor (NWFS) are the only file systems that can restrict
the visibility of the directory tree based on the UserID accessing the
file system. It and NWFS have built-in ACL rights inheritance. It
includes mature and robust features tailored for the file sharing
environment of the largest enterprises. The file system also scales to
millions of files in a single directory. NSS supports multiple data
streams and rich metadata (its features are a superset of existing
file systems on the market for data stream, metadata, namespace, and
attribute support).
- NTFS
- The Windows file system, best for workgroup shared LAN file serving
- Journaled
- Windows access controls (richer than POSIX)
NTFS is the Microsoft Windows file system for the Windows NT kernel
(Windows NT, Windows 2000, Windows XP, and Windows 2003). The open
source Linux implementation of this file system is only capable of
read-only access to existing NTFS data. This allows for migration from
Windows and access to Windows disks. NTFS includes an ACL model which
is not POSIX. The NTFS ACL model is unique to Microsoft, but is a
derivative of the Novell NetWare 2.x ACL model. NTFS is the default
(and virtually only) option on Windows servers. It includes rich
metadata and attribute features. NTFS has also supported multiple data
streams and ACL rights inheritance since its Windows 2000
implementation. In Windows 2003 R2, Microsoft included a feature called
"Access Based Enumeration". This is similar to visibility in NSS and
NWFS, but it is implemented as a feature of the CIFS protocol engine in
Windows 2003 R2 rather than in the file system layer, so it is only
available when accessing Windows 2003 via the CIFS protocol. See CIFS
below.
- NWFS
- Recommended move to NSS
- Not Journaled
- NetWare Trustee access control (richer than POSIX)
The NetWare [traditional] File System is used in NetWare 3.x through
5.x as the default file system, and is supported in NetWare 6.x for
compatibility. It is one of the fastest file systems on the planet;
however, it does not scale, nor is it journaled. An open source version
of this file system is available on Linux to allow access to its file
data. However, the open source version lacks the identity management
tie-ins, so it has found little utility. Customers of NWFS are
encouraged to upgrade to NSS.
- OCFS2
- Useful in database clusters and for moderate scale-out on shared SANs
- Symmetrical Parallel Cluster File System, Journaled
- POSIX access controls
The Oracle Cluster File System v2 is a symmetrical parallel cluster file
system specifically designed to support the Oracle Real Application
Clusters (RAC) Database. While it supports general file access, it does
not scale in number of files (like EXT3 without htrees). It is the
first symmetrical parallel cluster file system to be accepted into the
Linux Mainline Kernel (January 2006).
- PolyServe Matrix Server
- The best file system for cluster scaleout
- Symmetrical Parallel Cluster File System, Journaled
- POSIX access controls
Matrix Server is a symmetrical parallel cluster file system for Linux
(PolyServe has a version for Windows servers as well). Rooted in
technology from Sequent Computers, Matrix Server is the premier
parallel cluster file system on Linux today. It boasts an order of
magnitude better performance than competing parallel cluster file
systems (GFS, GPFS, OCFS2, etc.). It should be used when parallel
cluster file system scaling is needed.
- ReiserFS
- Best performance and scalability when number of files is great and/or files are small
- Journaled
- POSIX extended access controls
The Reiser File System is the default file system in SUSE Linux
distributions. ReiserFS was designed to remove the scalability and
performance limitations that exist in the EXT2 and EXT3 file systems.
It scales and performs extremely well on Linux, outscaling EXT3 with
htrees. In addition, ReiserFS was designed to use disk space very
efficiently. As a result, it is the best file system on Linux where
there are a great number of small files in the file system. As
collaboration (email) and many web serving applications have lots of
small files, ReiserFS is best suited for these types of workloads.
- VxFS
- Best for migrations from Unix to Linux
- Journaled (an asymmetric parallel cluster file system version is also available)
- POSIX access controls
The Veritas File System is closed source. The full Veritas storage
suite is essentially the Veritas File System that is popular on Unix
(including Solaris). Approximately 70% of Unix deployments in the world
are on top of the Veritas File System. As a result, this file system is
one of the best to use when data is to be directly migrated from Unix
to Linux, and when existing training in volume and file system
management is to be preserved within the IT staff. The Veritas File
System has excellent scalability characteristics, just as it has on
Unix systems. Veritas has recently ported their cluster version of VxFS
to Linux. Their parallel cluster file system (cVxFS) is an asymmetric
model, where one node is the master and all other nodes are effectively
read-only slaves (they can write through the master node).
- XFS
- Best for extremely large file systems, large files, and lots of files
- Journaled (an asymmetric parallel cluster file system version is also available)
- POSIX extended access controls
The XFS file system is open source and included in major Linux
distributions. It originated from SGI (Irix) and was designed
specifically for large files and large volume scalability. Video and
multimedia files are best handled by this file system. Scaling to
petabyte volumes, it also handles large amounts of data. It is one of
the few file systems on Linux which supports Data Migration (SGI
contributed the Hierarchical Storage Management interfaces to the Linux
kernel a number of years ago). SGI also offers a closed source parallel
cluster version of XFS called cXFS which, like cVxFS, is an
asymmetrical model. It has the unique feature, however, that its slave
nodes can run on Unix, Linux and Windows, making it a cross-platform
file system. Its master node must run on SGI hardware.
Comments:
A good article to cross reference is here.
www.novell.com/connectionmagazine/2006/q4/tech_talk_4.html