Linux Storage: Filesystems Introduction
Linux Filesystem
In this article, our focus is files and filesystems. The UNIX philosophy of "everything is a file" continues to hold true in Linux. Although it's not an absolute rule, most resources in Linux are treated as files. These files can encompass a wide range of content, from the text of a school assignment to the humorous GIF you download (from a source you trust, of course).
Linux also extends this notion to other elements, including devices and pseudo-devices. For instance, the command echo "Hello modern Linux users" > /dev/pts/0 displays the message Hello modern Linux users on the terminal attached to /dev/pts/0. While you might not typically think of these resources as files, you can interact with them using the same methods and tools you use for regular files. As another example, the kernel exposes runtime information about each process, such as its PID (Process ID) or the binary used to execute it, as files under /proc.
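As a small sketch of the same idea, ordinary redirection and text tools work unchanged on device files. The snippet assumes a typical Linux system where /dev/null and /dev/zero exist; the variable name zero_bytes is only illustrative.

```shell
#!/bin/sh
# Writing to a device uses the same redirection as writing to a regular file.
# /dev/null simply discards whatever is written to it.
echo "Hello modern Linux users" > /dev/null

# Reading from a device works like reading from a file.
# /dev/zero yields an endless stream of zero bytes; take the first four
# and print them as hexadecimal values.
zero_bytes=$(head -c 4 /dev/zero | od -An -tx1 | tr -d ' \n')
echo "first four bytes of /dev/zero: $zero_bytes"
```

Neither echo nor head knows that it is talking to a device; both go through the usual open, read, and write calls.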
VFS - Basic Concepts
Let's first delve into more precise definitions of essential terms:
Drive
A drive refers to a physical block device, which could be a hard disk drive (HDD) or a solid-state drive (SSD). In virtual machines, drives can also be emulated. Drives appear as device files such as /dev/sda and /dev/sdb (SCSI, SATA, and USB disks, which the kernel names sdX in the order they are detected), /dev/nvme0n1 (NVMe), or /dev/hda (legacy IDE naming on older kernels).
For example:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 232.9G 0 disk
├─sda1 8:1 0 500M 0 part /boot
├─sda2 8:2 0 1000M 0 part [SWAP]
└─sda3 8:3 0 231.4G 0 part /
Partition
Partitions are logical divisions of a drive, each consisting of a contiguous range of storage sectors. For instance, you might create two partitions on your HDD, resulting in /dev/sdb1 and /dev/sdb2.
$ sudo fdisk -l
Disk /dev/sda: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Device Start End Sectors Size Type
/dev/sda1 2048 1026047 1024000 500M EFI System
/dev/sda2 1026048 3074047 2048000 1000M Linux swap
/dev/sda3 3074048 488397134 485323087 231.4G Linux filesystem
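The Start, End, and Sectors columns are related by simple arithmetic, and sizes follow from the 512-byte logical sector size reported in the listing. A quick sketch using shell arithmetic and numbers copied from that listing (the variable names are illustrative):

```shell
#!/bin/sh
# Values copied from the fdisk listing.
sector_size=512
disk_sectors=488397168
start=2048       # first sector of /dev/sda1
end=1026047      # last sector of /dev/sda1

# A partition includes both its first and its last sector, hence the +1.
part_sectors=$(( end - start + 1 ))
part_mib=$(( part_sectors * sector_size / 1024 / 1024 ))
disk_bytes=$(( disk_sectors * sector_size ))

echo "/dev/sda1: $part_sectors sectors, $part_mib MiB"
echo "/dev/sda:  $disk_bytes bytes"
```

The results match the Sectors and Size columns for /dev/sda1 and the byte count in the Disk line.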
Volume
A volume is similar to a partition but more flexible: it can be resized and can span more than one drive. Like a partition, it is formatted with a specific filesystem. We will explore volumes in more depth in the section titled Logical Volume Manager.
$ sudo lvdisplay
--- Logical volume ---
LV Path /dev/vg01/lv01
LV Name lv01
VG Name vg01
LV UUID hW4pH6-ZHvm-DcTg-M64Y-JePM-1PfX-CpNdqH
LV Write Access read/write
LV Creation host, time myserver, 2023-01-01 12:00:00 +0000
LV Status available
Super Block
When a filesystem is formatted, a distinct section at its beginning is reserved for metadata about the filesystem itself. This metadata includes details like the filesystem type, block size and block counts, state, and the number of inodes per block group.
$ sudo dumpe2fs /dev/sda3 | grep "superblock"
Primary superblock at 0, Group descriptors at 1-1
Backup superblock at 32768, Group descriptors at 32769-32769
Backup superblock at 98304, Group descriptors at 98305-98305
Inodes
In a filesystem, inodes store the metadata of files, such as size, owner, location of the data blocks, timestamps, and permissions. Note that an inode stores neither the filename nor the file's contents. The contents live in data blocks that the inode points to, and filenames are kept in directories, which are essentially special files that map filenames to inode numbers.
$ ls -i filename.txt
12345 filename.txt
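Because names live in directories and not in inodes, one inode can have several names. The sketch below shows this with a hard link in a scratch directory; it assumes GNU stat (for the -c format option), and the file names are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
echo "some data" > "$workdir/original.txt"

# A hard link adds a second directory entry (name) for the same inode.
ln "$workdir/original.txt" "$workdir/alias.txt"

# GNU stat format: %i = inode number, %h = number of hard links.
inode_original=$(stat -c %i "$workdir/original.txt")
inode_alias=$(stat -c %i "$workdir/alias.txt")
link_count=$(stat -c %h "$workdir/original.txt")

echo "original.txt -> inode $inode_original"
echo "alias.txt    -> inode $inode_alias"
echo "link count:     $link_count"

rm -rf "$workdir"
```

Both names report the same inode number, and the link count of 2 records that two directory entries refer to it.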
Below is a table that includes some common filesystem commands:
VFS - Linux Virtual Filesystem
Linux uses an abstraction called the Virtual File System (VFS) to provide file-like access to many kinds of resources, whether they are held in memory, locally attached, or reachable over a network.
The fundamental idea behind VFS is to introduce a layer of indirection between clients (which make system calls) and the individual filesystems that carry out operations on specific devices or resources. In essence, VFS decouples standard operations such as open, read, and seek from their implementation details.
Within the kernel, VFS gives clients a consistent way to interact with resources, all built around the file concept. In Linux, a file has no prescribed structure; it is simply a sequence of bytes. Interpreting those bytes is left entirely to the client.
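To illustrate that interpretation is up to the client, the following sketch writes four bytes once and then views them in two ways with od. The scratch file and variable names are illustrative.

```shell
#!/bin/sh
# Four bytes, written once to a scratch file.
sample=$(mktemp)
printf 'ABCD' > "$sample"

# The same bytes, viewed as characters and as hexadecimal values.
as_chars=$(od -An -c "$sample" | tr -d ' \n')
as_hex=$(od -An -tx1 "$sample" | tr -d ' \n')

echo "as characters: $as_chars"
echo "as hex bytes:  $as_hex"

rm -f "$sample"
```

The kernel stored and returned the same four bytes in both cases; only the tool's presentation differs.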
VFS achieves this abstraction through the following mechanisms:
- Common Interface: VFS defines a standard set of system calls and data structures that applications and system components use to interact with files and directories. These include operations like open, read, write, close, stat, and many others. Regardless of the underlying filesystem, applications can use these common interfaces to perform file-related tasks.
- File Abstraction: VFS abstracts the concept of a "file." In Linux, a file is not limited to regular files; it can represent various resources, including regular files, directories, devices, sockets, and more. VFS treats all of these resources as files, providing a consistent way to access and manipulate them.
- Filesystem Drivers: Each specific filesystem, whether it's ext4, NTFS, FAT, or any other, has its own filesystem driver that interfaces with VFS. These drivers implement the low-level details of how data is stored, organized, and retrieved on a particular filesystem. VFS acts as a bridge between these drivers and the applications making system calls.
- Filesystem Registration: Filesystem drivers are either built into the kernel or loaded as modules when they are first needed. Each driver registers itself with VFS, declaring its filesystem type and the operations it implements. The currently registered types are listed in /proc/filesystems.
- Mounting: VFS allows multiple filesystems to be "mounted" at different mount points within the directory hierarchy. When a filesystem is mounted, VFS asks the registered driver for that type to read the filesystem's superblock, and the directory tree below the mount point then belongs to that filesystem. This enables Linux to support a variety of filesystem types simultaneously.
In summary, Linux's VFS abstracts access to different filesystems by presenting a common interface to applications and system components while delegating the actual filesystem-specific operations to the appropriate filesystem drivers. This abstraction allows Linux to support a wide range of filesystems seamlessly, making it a versatile and flexible operating system.
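A minimal sketch of the common interface in action: the same cat command reads a scratch file and a procfs entry, although different filesystem drivers serve the two. It assumes /proc is mounted, as on most Linux systems; the variable names are illustrative.

```shell
#!/bin/sh
# The same read path (cat uses open, read, and close) works on files that
# live on different filesystems: a scratch file and a procfs entry.
scratch=$(mktemp)
echo "stored on a regular filesystem" > "$scratch"

from_disk=$(cat "$scratch")
from_proc=$(cat /proc/version)

echo "scratch file: $from_disk"
echo "procfs file:  $from_proc"

# Filesystem types currently registered with VFS. In /proc/filesystems,
# lines marked "nodev" are types not backed by a block device; the type
# name is always the last field.
registered=$(awk '{ print $NF }' /proc/filesystems)
echo "registered filesystem types:"
echo "$registered"

rm -f "$scratch"
```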
Common Filesystem Operations
Creating filesystems
Before you can use a filesystem, you have to create it. This sets up the on-disk structures that make up the filesystem, using a partition or volume as the target. You can do this with the mkfs command.
mkfs stands for "make filesystem." It is a command-line tool that is available on most Linux distributions. The primary purpose of mkfs is to format a block device, such as a partition or a volume, into a specific filesystem format, allowing it to store and manage files and directories. To create a filesystem using the mkfs command, you need to specify the filesystem type you want to create, the target device (partition or volume), and any optional parameters. The general syntax of the mkfs command is as follows:
$ mkfs -t filesystem_type device
For example:
$ sudo mkfs -t ext4 /dev/sdb1
...
mke2fs 1.45.5 (07-Jan-2020)
Creating filesystem with 52428800 4k blocks and 13107200 inodes
Filesystem UUID: 9c7acdc2-4c19-4c15-bf5c-6c21b74a0b23
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912
Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done
...
The above command creates an ext4 filesystem on the partition /dev/sdb1, replacing whatever was stored there before. Once you have created the filesystem with mkfs, you can make it available in the filesystem tree.
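To experiment without touching a real partition, a filesystem can also be created inside a regular file. The sketch below assumes that mke2fs from e2fsprogs is installed (it is skipped otherwise) and uses a throwaway 16 MiB image. It then reads the ext4 magic number straight out of the superblock.

```shell
#!/bin/sh
# mke2fs often lives in /sbin, which may not be on an unprivileged PATH.
PATH="$PATH:/sbin:/usr/sbin"
image=$(mktemp)
magic=""

if command -v mke2fs >/dev/null 2>&1; then
    truncate -s 16M "$image"          # 16 MiB sparse file acting as the device
    mke2fs -q -F -t ext4 "$image"     # -F: the target is not a block device

    # The ext4 superblock starts at byte 1024 of the filesystem. Its magic
    # number 0xEF53 is stored little-endian at offset 56 within the
    # superblock, that is, at byte 1080 of the image.
    magic=$(od -An -tx1 -j 1080 -N 2 "$image" | tr -d ' \n')
    echo "superblock magic bytes: $magic"
else
    echo "mke2fs not found; skipping"
fi

rm -f "$image"
```

No root privileges are needed here because the target is an ordinary file owned by the user.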
Mounting filesystems
Mounting a filesystem involves connecting it to the existing filesystem hierarchy, which begins at the root (/) directory. You can achieve this by using the mount command, which requires two primary inputs:
- The device you want to attach
- The location within the filesystem hierarchy where it should be attached.
$ mount [options] device directory
You can also pass further options, such as -o ro for a read-only mount, or --bind to create a bind mount, which makes an existing directory visible at a second location in the hierarchy. We will explore bind mounts further in the context of containers.
$ sudo mount /dev/sdb1 /mnt/data
In this example, we mount the device /dev/sdb1 at the directory /mnt/data. Mount options such as ro (read-only) or noexec are passed as a comma-separated list after -o.
To mount a network share:
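$ sudo mount -t cifs //server/share /mnt/networkshare -o username=myuser,password=mypassword
Here, we are mounting a network share using the CIFS (Common Internet File System) filesystem type. We specify the server and share location as //server/share and the mount point as /mnt/networkshare. Additionally, we provide authentication details using the -o option. Because a password given on the command line ends up in the shell history and the process list, mount.cifs also accepts a credentials=<file> option that reads the username and password from a file.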
Check mounted filesystems
$ mount -t ext4,tmpfs
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=797596k,mode=755)
/dev/mapper/elementary--vg-root on / type ext4 (rw,relatime,errors=remount-ro)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
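The same information is available as a plain file: the kernel lists the mounts visible to a process in /proc/self/mounts, one per line, with the fields device, mount point, type, and options. A minimal sketch that filters this table by type follows; mounts_of_type is a hypothetical helper name, not a standard command.

```shell
#!/bin/sh
# Print the mount points of all filesystems of a given type.
# $1 = filesystem type
# $2 = mount table to read (defaults to the kernel's view for this process)
mounts_of_type() {
    awk -v fstype="$1" '$3 == fstype { print $2 }' "${2:-/proc/self/mounts}"
}

# Example: list every tmpfs mount point on the running system.
mounts_of_type tmpfs
```

The optional second argument makes it possible to run the helper against a saved copy of the mount table.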
Mounts made with the mount command are temporary: they last only until the system is shut down or rebooted. To make a mount persist across reboots, add it to the /etc/fstab file. For example:
# /etc/fstab: static file system information.
#
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/sda1 / ext4 defaults 0 1
/dev/sdb1 /data ext4 rw,user 0 2
- <file system>: The block device or remote filesystem to be mounted.
- <mount point>: The directory where the filesystem will be mounted.
- <type>: The type of filesystem (e.g., ext4, iso9660).
- <options>: Mount options (e.g., defaults, rw for read-write, ro for read-only).
- <dump>: Used by the dump command to determine whether to back up the filesystem (usually set to 0).
- <pass>: Used by the fsck command to determine the order of filesystem checks during boot (usually 1 for the root filesystem, 2 for other filesystems, and 0 to skip checking).
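Since the fields are whitespace-separated, an fstab file is easy to inspect with awk. The sketch below prints one summary line per entry and skips comments and blank lines. The helper name fstab_entries is hypothetical, and the sample file repeats the two entries shown above.

```shell
#!/bin/sh
# Print "mount point -> device (type)" for each entry in an fstab-style
# file. $1 defaults to /etc/fstab. Comment lines and lines with fewer
# than three fields are ignored.
fstab_entries() {
    awk 'NF >= 3 && $1 !~ /^#/ { printf "%s -> %s (%s)\n", $2, $1, $3 }' \
        "${1:-/etc/fstab}"
}

# Try it on a sample file with the two entries from the example.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/sda1 / ext4 defaults 0 1
/dev/sdb1 /data ext4 rw,user 0 2
EOF

entries=$(fstab_entries "$sample")
echo "$entries"
rm -f "$sample"
```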
Common Filesystem Layouts
Linux has a common filesystem layout that organizes files and directories in a structured manner. Understanding this layout is essential for navigating and managing a Linux system effectively. Here's an explanation of some common directories in the Linux filesystem:
The above directories, along with their respective purposes, form the core of the Linux filesystem layout.
Linux Pseudo Filesystems
Pseudo filesystems, often referred to as virtual filesystems, are a fundamental part of the Linux kernel. They allow access to kernel data structures, hardware devices, and system-related information as if these were regular files and directories, and they are essential for exposing system data to users, processes, and system utilities.
Below are some common pseudo filesystems in Linux:
/proc - Process Information
The /proc pseudo filesystem provides a dynamic view of running processes and kernel parameters. It allows users to access detailed information about processes, CPU and memory usage, system configurations, and more. Each process is represented by a directory with its PID (Process ID).
$ ls /proc
1 1469 1580 16606 19 39 762 crypto kpagecgroup slabinfo
10 14698 15848 1664 2 4 763 devices kpagecount softirqs
11 1470 15851 1675 20 40 8 diskstats kpageflags stat
12 14701 15852 16752 21 41 813 dma latency_stats swaps
120 1482 15873 16756 23 42 8394 driver loadavg sys
122 1484 15921 16757 257 43 8409 dynamic_debug locks sysrq-trigger
12220 1485 15976 168 26 5 8496 execdomains mdstat sysvipc
124 1487 15985 1685 27 51 8506 filesystems meminfo thread-self
1244 1489 16 1712 28 6 8619 fs misc timer_list
1278 1492 16053 1725 29 69 8632 interrupts modules tty
13 1497 16319 1731 3 70 8704 iomem mounts uptime
1314 1498 16523 1732 30 73 acpi ioports mtrr version
132 15 16571 1733 32 755 buddyinfo irq net vmallocinfo
133 1502 16572 1739 33 756 bus kallsyms pagetypeinfo vmstat
1370 1506 16573 1778 34 757 cgroups kcore partitions zoneinfo
14 1507 16575 178 35 758 cmdline key-users pressure
1401 1521 16578 179 36 759 consoles keys schedstat
14338 1549 16579 18 37 761 cpuinfo kmsg self
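The numbered directories in this listing are processes. As a sketch, the following reads a few details about the running shell from its own /proc directory using ordinary file tools. It assumes /proc is mounted; the variable names are illustrative.

```shell
#!/bin/sh
# $$ expands to the PID of the running shell; /proc/$$ is its directory.
pid=$$

# The "Name:" line of the status file holds the command name.
proc_name=$(awk '/^Name:/ { print $2 }' "/proc/$pid/status")

# exe is a symbolic link to the binary that the process executes.
proc_exe=$(readlink "/proc/$pid/exe")

# The first field of /proc/uptime is the time since boot in seconds.
uptime_seconds=$(cut -d ' ' -f 1 /proc/uptime)

echo "PID:    $pid"
echo "Name:   $proc_name"
echo "Binary: $proc_exe"
echo "Uptime: $uptime_seconds seconds"
```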
/sys - Kernel Parameters
The /sys pseudo filesystem (sysfs) exposes the kernel's view of devices, drivers, and other kernel objects. It provides device and system-related information, including hardware settings, power management, and kernel module parameters, some of which can be changed by writing to the corresponding files.
$ ls /sys
block bus class dev devices firmware fs hypervisor kernel module power
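As an example of reading device information from sysfs, each block device has a directory under /sys/block whose size file holds the device size in 512-byte sectors. The sketch below converts that to bytes. The helper name block_size_bytes and its optional second argument are illustrative assumptions, not part of any standard tool.

```shell
#!/bin/sh
# Size of a block device in bytes, read from sysfs.
# $1 = device name (for example sda)
# $2 = sysfs root (defaults to /sys)
# The "size" attribute is expressed in 512-byte sectors.
block_size_bytes() {
    sectors=$(cat "${2:-/sys}/block/$1/size")
    echo $(( sectors * 512 ))
}

# List every block device the kernel exposes, if any are visible.
for dev in /sys/block/*; do
    [ -e "$dev/size" ] || continue
    name=${dev##*/}
    echo "$name: $(block_size_bytes "$name") bytes"
done
```

On systems without visible block devices, such as some containers, the loop simply prints nothing.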
/dev - Device Files
Although /dev holds device files rather than purely kernel-generated data, on modern systems it is typically a devtmpfs instance that the kernel and udev populate at runtime. The special device files it contains serve as interfaces to hardware devices, allowing users and applications to interact with devices as if they were files, which makes /dev an essential part of Linux device management.
$ ls /dev
autofs loop-control nvram tty15 tty32 tty5 ttyS0 vcsa4
block loop0 ptmx tty16 tty33 tty50 ttyS1 vcsa5
btrfs-control loop1 pts tty17 tty34 tty51 ttyS2 vcsa6
char loop2 random tty18 tty35 tty52 ttyS3 vcsu
console loop3 rtc tty19 tty36 tty53 uhid vcsu1
core loop4 rtc0 tty2 tty37 tty54 uinput vcsu2
cpu loop5 shm tty20 tty38 tty55 urandom vcsu3
cpu_dma_latency loop6 snapshot tty21 tty39 tty56 userfaultfd vcsu4
cuse loop7 stderr tty22 tty4 tty57 vcs vcsu5
disk mapper stdin tty23 tty40 tty58 vcs1 vcsu6
fd mqueue stdout tty24 tty41 tty59 vcs2 vfio
full net tty tty25 tty42 tty6 vcs3 vhost-net
fuse ng0n1 tty0 tty26 tty43 tty60 vcs4 vhost-vsock
hpet null tty1 tty27 tty44 tty61 vcs5 xvda
hugepages nvme0 tty10 tty28 tty45 tty62 vcs6 xvda1
initctl nvme0n1 tty11 tty29 tty46 tty63 vcsa xvda127
input nvme0n1p1 tty12 tty3 tty47 tty7 vcsa1 xvda128
kmsg nvme0n1p127 tty13 tty30 tty48 tty8 vcsa2 zero
log nvme0n1p128 tty14 tty31 tty49 tty9 vcsa3
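Device files are distinguished from regular files by their type (character or block) and by a pair of major and minor numbers that tell the kernel which driver and device they refer to. A short sketch that inspects /dev/null, assuming GNU stat for the -c format option:

```shell
#!/bin/sh
# test -c and test -b distinguish character and block device files.
if [ -c /dev/null ]; then
    null_kind="character device"
else
    null_kind="not a character device"
fi

# GNU stat: %t and %T print the major and minor device numbers in hex.
# -L follows a symbolic link, should /dev/null be one.
null_numbers=$(stat -L -c '%t:%T' /dev/null)

echo "/dev/null is a $null_kind with major:minor $null_numbers"
```

On Linux, /dev/null is the character device with major number 1 and minor number 3.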
/run - Runtime Data
The /run directory stores runtime data and system state information, such as the PID files and sockets of system services. It is typically a tmpfs held in memory, so its contents do not persist across reboots.
$ ls /run
acpid.socket chrony dbus irqbalance mount sshd.pid user
agetty.reload chrony.d faillock lock rpcbind sssd.pid utmp
amazon-ec2-net-utils cloud-init gssproxy.pid log screen sudo
atd.pid console gssproxy.sock lsm sepermit systemd
auditd.pid credentials initctl motd setrans tmpfiles.d
blkid cryptsetup initramfs motd.d sm-notify.pid udev
/tmp - Temporary Files
The /tmp directory is used for temporary files created by various processes. It is typically cleared upon system reboot and provides a convenient location for programs to create and manage temporary data.
$ ls /tmp/
pyright-15985-9EBarp4ZNzX6
python-languageserver-cancellation
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-chronyd.service-HMONVW
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-dbus-broker.service-U5nshu
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-policy-routes@ens5.service-T1MlC0
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-systemd-logind.service-jE798m
systemd-private-1d0916b7fbfb43c4845b381cb85e2885-systemd-resolved.service-6zqxRV
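Whether /tmp and /run are memory-backed depends on the distribution and its configuration. One way to check is to ask which filesystem type backs a directory. The sketch below assumes GNU stat, whose -f option reports on the filesystem rather than the file; fs_type_of is a hypothetical helper name.

```shell
#!/bin/sh
# Name of the filesystem type that backs a path (GNU stat, filesystem mode).
fs_type_of() {
    stat -f -c %T "$1"
}

# Report the backing filesystem type for a few well-known directories.
for dir in /run /tmp /proc; do
    if [ -d "$dir" ]; then
        echo "$dir is on: $(fs_type_of "$dir")"
    fi
done
```

A result of tmpfs means the directory lives in memory and is emptied at reboot.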
Linux pseudo filesystems play a vital role in providing transparency and accessibility to kernel data and system information. They enable users and system administrators to interact with the kernel and hardware devices through familiar file and directory structures. Understanding and utilizing these virtual filesystems is essential for efficient system monitoring, debugging, and configuration on Linux-based systems.