
EXT2

FGJ: Create:2024/10/14 Update: [2024-12-18]

  • Intro(EXT2-FS)

    • Notes

      Caution

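      a).: The examples in this post model a 1 MiB ext2 file system laid out with block = 1024B and inode size = 128B.
      b).: Multi-byte values are read as little-endian integers. For example, [80 00 00 00] ==int=> 0x00000080 = 0x80 = 128.
      c).: Bitmap bytes are written as (MSB)10001001(LSB) and counted from the LSB; see the block-bitmap description and the wiki. For example, 10000111 marks items 1, 2, 3 and 8, not 1, 6, 7 and 8.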
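The two conventions above (little-endian integers, bitmap bits counted from the LSB) can be sketched in Python as follows. The helper names `le_int` and `set_positions` are made up for this illustration and are not part of any ext2 tool.

```python
def le_int(hex_bytes: str) -> int:
    """Decode space-separated hex bytes as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


def set_positions(byte: int) -> list[int]:
    """Return the 1-based positions of the set bits, counting from the LSB."""
    return [i + 1 for i in range(8) if (byte >> i) & 1]


print(le_int("80 00 00 00"))      # 128
print(set_positions(0b10000111))  # [1, 2, 3, 8]
```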

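    • Preparation

      On Ubuntu 22.04, prepare two file system images: a clean one, fs1m, and a copy with data written to it for comparison, fs1m-with-data.
      Create a 1 MiB file: dd if=/dev/zero of=fs1m count=256 bs=4K
      Format it as ext2: mke2fs -b 1024 -I 128 fs1m
      Inspect the formatted image: dumpe2fs fs1m

      Copy fs1m to fs1m-with-data (for example, cp fs1m fs1m-with-data)
      Mount it: mount -o loop fs1m-with-data /mnt
      Enter the mount point and write data: cd /mnt && mkdir 12302 && cd 12302 && echo 'something' > Hello.java
      Unmount: umount /mnt

      Inspect the raw bytes with xxd: xxd -a -u -g1 -s 1024 -l 16 fs1m   (-a collapses runs of all-zero lines into *, -u prints uppercase hex, -g1 groups by single bytes, -s 1024 is the offset, -l 16 is the length)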
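For readers who would rather script these reads, a rough Python counterpart of `xxd -u -g1 -s OFFSET -l LENGTH` might look like the sketch below. `read_region` and `hex_line` are hypothetical helper names, and the demo runs on a throwaway temporary file instead of the real fs1m image.

```python
import os
import tempfile


def read_region(path: str, offset: int, length: int) -> bytes:
    """Return `length` bytes starting at byte `offset` (like xxd -s / -l)."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def hex_line(data: bytes) -> str:
    """Format bytes as uppercase single-byte groups (like xxd -u -g1)."""
    return " ".join(f"{b:02X}" for b in data)


# Demo image: block 0 is all zeros, block 1 starts with 80 00 00 00.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(1024) + bytes.fromhex("80 00 00 00") + bytes(1020))
    demo_path = tmp.name

first4 = read_region(demo_path, 1024 * 1, 4)
print(hex_line(first4))
os.remove(demo_path)
```

On the real image, `read_region("fs1m", 1024, 16)` would correspond to the xxd call above.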

    • Parsing

      • Parsing the basic structures

        # Read the boot block: it is all 00.
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 0' | bc` -l `echo '1024' | bc` fs1m
        00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        000003f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        
        
        
        
        
        # Read the first 264 bytes of the super block.
        # https://www.nongnu.org/ext2-doc/ext2.html#superblock-structure
        # s_inodes_count = (0x80) = 128 inodes
        # s_blocks_count = (0x400) = 1024 blocks
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 1' | bc` -l `echo '264' | bc` fs1m
        00000400: 80 00 00 00 00 04 00 00 33 00 00 00 DA 03 00 00  ........3.......
        00000410: 75 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  u...............
        00000420: 00 20 00 00 00 20 00 00 80 00 00 00 00 00 00 00  . ... ..........
        00000430: 4B 8D 0C 67 00 00 FF FF 53 EF 01 00 01 00 00 00  K..g....S.......
        00000440: 4B 8D 0C 67 00 00 00 00 00 00 00 00 01 00 00 00  K..g............
        00000450: 00 00 00 00 0B 00 00 00 80 00 00 00 38 00 00 00  ............8...
        00000460: 02 00 00 00 03 00 00 00 F4 24 E0 D7 33 DF 41 49  .........$..3.AI
        00000470: 9B E9 F5 4B 47 00 86 29 00 00 00 00 00 00 00 00  ...KG..)........
        00000480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        000004c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 00  ................
        000004d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        000004e0: 00 00 00 00 00 00 00 00 00 00 00 00 77 9A 23 0D  ............w.#.
        000004f0: 64 91 41 16 93 8C 22 CF FB 46 9E 62 01 00 00 00  d.A..."..F.b....
        00000500: 0C 00 00 00 00 00 00 00                          ........
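As a cross-check of the hand decoding, here is a minimal sketch that unpacks the first 64 bytes of the dump with Python's `struct` module. The field order follows the superblock layout in the ext2 documentation linked above; the `NAMES` list and the choice to stop at 64 bytes are assumptions of this sketch.

```python
import struct

# First 64 bytes of the super block, copied from the dump at 0x400.
raw = bytes.fromhex(
    "80 00 00 00 00 04 00 00 33 00 00 00 DA 03 00 00"
    "75 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00"
    "00 20 00 00 00 20 00 00 80 00 00 00 00 00 00 00"
    "4B 8D 0C 67 00 00 FF FF 53 EF 01 00 01 00 00 00"
)

NAMES = [
    "s_inodes_count", "s_blocks_count", "s_r_blocks_count",
    "s_free_blocks_count", "s_free_inodes_count", "s_first_data_block",
    "s_log_block_size", "s_log_frag_size", "s_blocks_per_group",
    "s_frags_per_group", "s_inodes_per_group", "s_mtime", "s_wtime",
    "s_mnt_count", "s_max_mnt_count", "s_magic", "s_state", "s_errors",
    "s_minor_rev_level",
]

# 13 little-endian 32-bit fields followed by 6 little-endian 16-bit fields.
sb = dict(zip(NAMES, struct.unpack("<13I6H", raw)))
block_size = 1024 << sb["s_log_block_size"]

print(sb["s_inodes_count"], sb["s_blocks_count"], block_size)
print(hex(sb["s_magic"]), sb["s_first_data_block"], sb["s_blocks_per_group"])
```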
        
        
        
        
        
        # Read the first 32 bytes of the GDT (block group descriptor table).
        # bg_block_bitmap[4] = 06 00 00 00 = 6
        # bg_inode_bitmap[4] = 07 00 00 00 = 7
        # bg_inode_table[4] = 08 00 00 00 = 8
        # bg_free_blocks_count[2] = DA 03 = 986
        # bg_free_inodes_count[2] = 75 00 = 117
        # bg_used_dirs_count[2] = 02 00 = 2
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 2' | bc` -l `echo '32' | bc` fs1m
        00000800: 06 00 00 00 07 00 00 00 08 00 00 00 DA 03 75 00  ..............u.
        00000810: 02 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
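The same 32 bytes can be unpacked in one call. This is only a sketch: the format string assumes the ext2 group descriptor layout (three 32-bit block numbers, four 16-bit fields, 12 reserved bytes), and the variable names are chosen here for readability.

```python
import struct

# The 32-byte group descriptor copied from the dump at 0x800.
raw = bytes.fromhex(
    "06 00 00 00 07 00 00 00 08 00 00 00 DA 03 75 00"
    "02 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00"
)

(bg_block_bitmap, bg_inode_bitmap, bg_inode_table,
 bg_free_blocks_count, bg_free_inodes_count, bg_used_dirs_count,
 _bg_pad, _bg_reserved) = struct.unpack("<3I4H12s", raw)

print(bg_block_bitmap, bg_inode_bitmap, bg_inode_table)                # 6 7 8
print(bg_free_blocks_count, bg_free_inodes_count, bg_used_dirs_count)  # 986 117 2
```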
        
        
        
        
        
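        # Read the reserved GDT blocks: they are all 00. The GDT read above is the 3rd block (index 2) and the block bitmap starts at index 6, so blocks 3, 4 and 5 are reserved GDT blocks. This agrees with the 03 00 at 0x4CE in the super block dump, which appears to be s_reserved_gdt_blocks.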
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 3' | bc` -l `echo '1024 * 3' | bc` fs1m
        00000c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        000017f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        
        
        
        
        
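        # Read the block bitmap. The super block says `s_blocks_count = 1024`, so although the block bitmap occupies a whole block, only 1024 bits (128 bytes, exactly eight rows) are used. Print ten rows to see what follows.
        # The rows after the first eight contain FF padding and can be ignored.
        # In the first row, FF FF FF FF 1F means the first 37 blocks are in use (32 + 5 bits; note that 1F = 0b00011111 is counted from right to left).
        # The 80 (0b10000000) at the end of the eighth row sets one more bit. Since s_first_data_block = 1 (01 00 00 00 at 0x414), bit i appears to map to block i + 1, so this last bit lies past the final block (1023) and is most likely padding that is simply marked as used.
        # Total 1024, marked 37 + 1 = 38, free 986, equal to bg_free_blocks_count.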
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 6' | bc` -l `echo '16 * 10' | bc` fs1m
        00001800: FF FF FF FF 1F 00 00 00 00 00 00 00 00 00 00 00  ................
        00001810: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        00001870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80  ................
        00001880: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ................
        00001890: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ................
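The bit counting above can be reproduced with a short sketch. The 128 bitmap bytes are rebuilt from the dump (five leading bytes, 122 zero bytes, one trailing 80). Mapping bit i to block i + 1 is an assumption based on s_first_data_block = 1, and `bit` is a hypothetical helper.

```python
# The 128 meaningful bytes of the block bitmap, rebuilt from the dump at 0x1800.
bitmap = bytes.fromhex("FF FF FF FF 1F") + bytes(122) + bytes.fromhex("80")
total_bits = len(bitmap) * 8  # 1024


def bit(i: int) -> int:
    """Value of bit i, counting from the LSB of byte 0."""
    return (bitmap[i // 8] >> (i % 8)) & 1


used = sum(bit(i) for i in range(total_bits))
free = total_bits - used
first_clear = next(i for i in range(total_bits) if not bit(i))

print(used, free)       # 38 986
print(first_clear + 1)  # 38: first free block, assuming bit i maps to block i + 1
```

Under that assumption the first free block is 38, which is the block that directory 12302 is given in the "with data" walk later in this post.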
        
        
        
        
        
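        # Read the inode bitmap. The super block says `s_inodes_count = 128`, so although the inode bitmap occupies a whole block, only 128 bits (16 bytes, exactly one row) are used. Print two rows to see what follows.
        # As with the block bitmap, only the first row matters; the rest can be ignored.
        # In the first row, FF 07 means the first 11 inodes are in use (8 + 3 bits). The first 10 inodes are reserved by ext2; inode 2 is the root directory and inode 11 is the lost+found directory.
        # Total 128, used 11, free 117, equal to bg_free_inodes_count.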
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 7' | bc` -l `echo '16 * 2' | bc` fs1m
        00001c00: FF 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        00001c10: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ................
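Listing the set bits of the first row gives the used inode numbers directly. A minimal sketch, assuming inode numbers start at 1 so that bit i stands for inode i + 1:

```python
# The 16 meaningful bytes of the inode bitmap, copied from the dump at 0x1C00.
bitmap = bytes.fromhex("FF 07") + bytes(14)

used_inodes = [
    i + 1  # inode numbers are 1-based
    for i in range(len(bitmap) * 8)
    if (bitmap[i // 8] >> (i % 8)) & 1
]
free_inodes = len(bitmap) * 8 - len(used_inodes)

print(used_inodes)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(free_inodes)  # 117
```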
        
        
        
        
        
        # Read the inode table. With `s_inodes_count = 128`, `s_inode_size = 128B` and block = 1024B, one block holds 1024B / 128B = 8 inodes, so 128 inodes need 128 / 8 = 16 blocks, i.e. index [8,24).
        # To read inode 2 (the second slot, at byte offset 1024 * 8 + 128), print as follows:
        # See https://www.nongnu.org/ext2-doc/ext2.html#inode-table
        # i_mode[2] = ED 41 = 0o40755
        # i_uid[2] = 00 00 = 0
        # i_size[4] = 00 04 00 00 = 1024
        # i_atime[4] = 4B 8D 0C 67 = 1728875851 = 2024-10-14 11:17:31 (UTC+8)
        # i_ctime[4] = 4B 8D 0C 67
        # i_mtime[4] = 4B 8D 0C 67
        # i_dtime[4] = 00 00 00 00
        # i_gid[2] = 00 00
        # i_links_count[2] = 03 00
        # i_blocks[4] = 02 00 00 00
        # i_flags[4] = 00 00 00 00
        # i_osd1[4] = 00 00 00 00
        # i_block[15 * 4] = (18 00 00 00 ....... 00 00): fifteen 32-bit integers locate the inode's data blocks; the first 12 are direct blocks, the 13th is singly indirect, the 14th doubly indirect, the 15th triply indirect.
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 8 + 128' | bc` -l `echo '128' | bc` fs1m
        00002080: ED 41 00 00 00 04 00 00 4B 8D 0C 67 4B 8D 0C 67  .A......K..gK..g
        00002090: 4B 8D 0C 67 00 00 00 00 00 00 03 00 02 00 00 00  K..g............
        000020a0: 00 00 00 00 00 00 00 00 18 00 00 00 00 00 00 00  ................
        000020b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        000020f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
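The field list above maps onto a single `struct` format. The sketch below decodes the first 100 bytes of inode 2 and also shows the offset arithmetic used in the xxd call. `inode_offset` is a hypothetical helper, and the UTC+8 timezone is an assumption inferred from the timestamp quoted above.

```python
import stat
import struct
from datetime import datetime, timedelta, timezone

BLOCK_SIZE, INODE_SIZE, INODE_TABLE_BLOCK = 1024, 128, 8


def inode_offset(n: int) -> int:
    """Byte offset of inode n (1-based) inside the image."""
    return INODE_TABLE_BLOCK * BLOCK_SIZE + (n - 1) * INODE_SIZE


# Inode 2 as dumped at 0x2080; the collapsed rows are all zeros.
raw = bytes.fromhex(
    "ED 41 00 00 00 04 00 00 4B 8D 0C 67 4B 8D 0C 67"
    "4B 8D 0C 67 00 00 00 00 00 00 03 00 02 00 00 00"
    "00 00 00 00 00 00 00 00 18 00 00 00 00 00 00 00"
) + bytes(80)

NAMES = ["i_mode", "i_uid", "i_size", "i_atime", "i_ctime", "i_mtime",
         "i_dtime", "i_gid", "i_links_count", "i_blocks", "i_flags", "i_osd1"]
values = struct.unpack_from("<2H5I2H3I15I", raw)
inode = dict(zip(NAMES, values[:12]))
i_block = values[12:]  # 12 direct + 3 indirect block pointers

tz = timezone(timedelta(hours=8))  # assumed local timezone
atime = datetime.fromtimestamp(inode["i_atime"], tz).strftime("%Y-%m-%d %H:%M:%S")

print(hex(inode_offset(2)), oct(inode["i_mode"]), stat.S_ISDIR(inode["i_mode"]))
print(inode["i_size"], inode["i_links_count"], i_block[0], atime)
```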
        
        # Look at block[0] = (18 00 00 00) = block 24.
        # See: https://www.nongnu.org/ext2-doc/ext2.html#linked-directory-entry-structure
        # inode[4] = 02 00 00 00 = 2
        # rec_len[2] = 0C 00 = 12
        # name_len[1] = 01 = 1
        # file_type[1] = 02 = 2
        # name[0-255] = 2E = 46 ==> ascii [.]
        # So this block holds three directory entries (.|..|lost+found). Only [.] is decoded here; the other two work the same way.
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 24' | bc` -l `echo '1024' | bc` fs1m
        00006000: 02 00 00 00 0C 00 01 02 2E 00 00 00 02 00 00 00  ................
        00006010: 0C 00 02 02 2E 2E 00 00 0B 00 00 00 E8 03 0A 02  ................
        00006020: 6C 6F 73 74 2B 66 6F 75 6E 64 00 00 00 00 00 00  lost+found......
        00006030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        000063f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
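All three entries of block 24 can be decoded by following rec_len from one entry to the next. The sketch below assumes the linked directory entry layout from the ext2 documentation (4-byte inode, 2-byte rec_len, 1-byte name_len, 1-byte file_type, then the name). `walk` is a hypothetical helper.

```python
import struct

# Block 24 of the clean image; everything after the first three rows is zero.
block = bytes.fromhex(
    "02 00 00 00 0C 00 01 02 2E 00 00 00 02 00 00 00"
    "0C 00 02 02 2E 2E 00 00 0B 00 00 00 E8 03 0A 02"
    "6C 6F 73 74 2B 66 6F 75 6E 64 00 00 00 00 00 00"
) + bytes(1024 - 48)


def walk(block: bytes) -> list[tuple[int, int, int, str]]:
    """Return (inode, rec_len, file_type, name) for each live entry."""
    entries, off = [], 0
    while off + 8 <= len(block):
        inode, rec_len, name_len, file_type = struct.unpack_from("<IHBB", block, off)
        if rec_len < 8:  # malformed entry: stop rather than loop forever
            break
        if inode != 0:   # inode 0 marks an unused entry
            name = block[off + 8 : off + 8 + name_len].decode("ascii")
            entries.append((inode, rec_len, file_type, name))
        off += rec_len
    return entries


for entry in walk(block):
    print(entry)
```

The last entry's rec_len of 1000 stretches to the end of the block (12 + 12 + 1000 = 1024), which is how the free space in a directory block is accounted for.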
        
      • Parsing with data

        # Directory structure:
        root@12302:~# tree /mnt
        /mnt
        ├── 12302
        │   └── Hello.java
        └── lost+found
        
        
        
        # Start from the root directory: / = inode 2
        # block = block[0] = 18 00 00 00 = 24
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 8 + 128' | bc` -l `echo '128' | bc` fs1m-with-data
        00002080: ED 41 00 00 00 04 00 00 1B B0 0C 67 1A B0 0C 67  .A.........g...g
        00002090: 1A B0 0C 67 00 00 00 00 00 00 04 00 02 00 00 00  ...g............
        000020a0: 00 00 00 00 01 00 00 00 18 00 00 00 00 00 00 00  ................
        000020b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        000020f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        
        # Look at the data in block 24. The new entry (the fourth, starting at 0x602C) is:
        # inode[4] = 0C 00 00 00 = 12
        # rec_len[2] = D4 03 = 980
        # name_len[1] = 05 = 5
        # file_type[1] = 02 = 2, i.e. a directory
        # name[0-255] = 31 32 33 30 32 ==> ascii [12302]
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 24' | bc` -l `echo '1024' | bc` fs1m-with-data
        00006000: 02 00 00 00 0C 00 01 02 2E 00 00 00 02 00 00 00  ................
        00006010: 0C 00 02 02 2E 2E 00 00 0B 00 00 00 14 00 0A 02  ................
        00006020: 6C 6F 73 74 2B 66 6F 75 6E 64 00 00 0C 00 00 00  lost+found......
        00006030: D4 03 05 02 31 32 33 30 32 00 00 00 00 00 00 00  ....12302.......
        00006040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        000063f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        
        
        
        
        # Continue with directory 12302 = inode 12
        # Block of 12302 = block[0] = 26 00 00 00 = 38
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 8 + 128 * 11' | bc` -l `echo '128' | bc` fs1m-with-data
        00002580: ED 41 00 00 00 04 00 00 7C B0 0C 67 7A B0 0C 67  .A......|..gz..g
        00002590: 7A B0 0C 67 00 00 00 00 00 00 02 00 02 00 00 00  z..g............
        000025a0: 00 00 00 00 0A 00 00 00 26 00 00 00 00 00 00 00  ........&.......
        000025b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        000025e0: 00 00 00 00 E3 A9 A0 96 00 00 00 00 00 00 00 00  ................
        000025f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        
        # Look at the data in block 38.
        # Three entries: [.|..|Hello.java]
        # inode[4] = 0E 00 00 00 = 14
        # rec_len[2] = E8 03 = 1000
        # name_len[1] = 0A = 10
        # file_type[1] = 01 = 1, i.e. a regular file
        # name[0-255] = 48 65 6C 6C 6F 2E 6A 61 76 61 ==> ascii [Hello.java]
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 38' | bc` -l `echo '1024' | bc` fs1m-with-data
        00009800: 0C 00 00 00 0C 00 01 02 2E 00 00 00 02 00 00 00  ................
        00009810: 0C 00 02 02 2E 2E 00 00 0E 00 00 00 E8 03 0A 01  ................
        00009820: 48 65 6C 6C 6F 2E 6A 61 76 61 00 00 00 00 00 00  Hello.java......
        00009830: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        00009bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        
        
        
        
        # Continue with file Hello.java = inode 14
        # Block of Hello.java = block[0] = 2F 02 00 00 = 559
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 8 + 128 * 13' | bc` -l `echo '128' | bc` fs1m-with-data
        00002680: A4 81 00 00 6F 00 00 00 68 B0 0C 67 7A B0 0C 67  ....o...h..gz..g
        00002690: 68 B0 0C 67 00 00 00 00 00 00 01 00 02 00 00 00  h..g............
        000026a0: 00 00 00 00 01 00 00 00 2F 02 00 00 00 00 00 00  ......../.......
        000026b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        000026e0: 00 00 00 00 98 DB 7D D7 00 00 00 00 00 00 00 00  ......}.........
        000026f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        
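        # Look at the data in block 559. The inode's i_size = 6F 00 00 00 = 111, which matches the 111 bytes of content before the zero padding.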
        ➜  data git:(master) ✗ xxd -a -u -g1 -s `echo '1024 * 559' | bc` -l `echo '1024' | bc` fs1m-with-data
        0008bc00: 70 75 62 6C 69 63 20 63 6C 61 73 73 20 48 65 6C  public class Hel
        0008bc10: 6C 6F 7B 0A 20 20 20 20 70 75 62 6C 69 63 20 73  lo{.    public s
        0008bc20: 74 61 74 69 63 20 76 6F 69 64 20 6D 61 69 6E 28  tatic void main(
        0008bc30: 53 74 72 69 6E 67 5B 5D 20 61 72 67 73 29 7B 0A  String[] args){.
        0008bc40: 09 20 20 20 53 79 73 74 65 6D 2E 6F 75 74 2E 70  .   System.out.p
        0008bc50: 72 69 6E 74 6C 6E 28 22 48 65 6C 6C 6F 20 57 72  rintln("Hello Wr
        0008bc60: 6F 6C 64 22 29 3B 0A 20 20 20 20 7D 0A 7D 0A 00  old");.    }.}..
        0008bc70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        *
        0008bff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
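The whole walk (root inode, directory block, next inode, and so on) can be condensed into a small path resolver. This is a sketch over data copied from the dumps above, not a general ext2 reader: `DIR_BLOCKS` holds the two directory blocks, `FIRST_BLOCK` hard-codes each inode's i_block[0] as read by hand, and all helper names are hypothetical.

```python
import struct

BLOCK_SIZE = 1024
ROOT_INODE = 2


def pad(hex_rows: str) -> bytes:
    """Rebuild a full block from its non-zero leading rows."""
    data = bytes.fromhex(hex_rows)
    return data + bytes(BLOCK_SIZE - len(data))


DIR_BLOCKS = {
    24: pad("02 00 00 00 0C 00 01 02 2E 00 00 00 02 00 00 00"
            "0C 00 02 02 2E 2E 00 00 0B 00 00 00 14 00 0A 02"
            "6C 6F 73 74 2B 66 6F 75 6E 64 00 00 0C 00 00 00"
            "D4 03 05 02 31 32 33 30 32 00 00 00 00 00 00 00"),
    38: pad("0C 00 00 00 0C 00 01 02 2E 00 00 00 02 00 00 00"
            "0C 00 02 02 2E 2E 00 00 0E 00 00 00 E8 03 0A 01"
            "48 65 6C 6C 6F 2E 6A 61 76 61 00 00 00 00 00 00"),
}
FIRST_BLOCK = {2: 24, 12: 38, 14: 559}  # i_block[0] per inode, from the dumps


def entries(block: bytes) -> dict[str, int]:
    """Map entry name to inode number for one directory block."""
    found, off = {}, 0
    while off + 8 <= len(block):
        inode, rec_len, name_len, _ftype = struct.unpack_from("<IHBB", block, off)
        if rec_len < 8:
            break
        if inode:
            found[block[off + 8 : off + 8 + name_len].decode("ascii")] = inode
        off += rec_len
    return found


def resolve(path: str) -> int:
    """Follow each path component from the root inode."""
    inode = ROOT_INODE
    for part in path.strip("/").split("/"):
        inode = entries(DIR_BLOCKS[FIRST_BLOCK[inode]])[part]
    return inode


target = resolve("/12302/Hello.java")
data_offset = FIRST_BLOCK[target] * BLOCK_SIZE
print(target, FIRST_BLOCK[target], hex(data_offset))  # 14 559 0x8bc00
```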
        

        [Figure: structure diagram of the example]

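      • Multiple block groups

        Note

        Some documents describe the super block along these lines: "It describes the file system of the whole partition, such as the block size, the file system version, and the last mount time. A copy of the super block sits at the start of every block group."
        This raises questions: what kind of copy is it? Are only some fields copied, or is it a complete copy?

        To pin down what "copy" means, check whether the super blocks in different block groups are identical. That requires a file system with more than one block group.
        With block = 1024B, the block bitmap of a group is one block, i.e. 1024 * 8 = 8192 bits, so a block group can hold at most 8192 blocks. The file system therefore has to be larger: a 10M image, still with block = 1024B, has 10M / 1024B = 10240 blocks > 8192 and so gets a second block group.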
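The arithmetic in this paragraph, plus the location of the second super block, can be written out as below. The sketch assumes s_first_data_block = 1 (the value seen at 0x414 in the 1M image) and the usual rule that groups are counted from the first data block; the variable names are illustrative.

```python
import math

image_bytes = 10 * 1024 * 1024
block_size = 1024
first_data_block = 1               # assumed, as in the 1M image's super block
blocks_per_group = 8 * block_size  # one bitmap block = 8192 bits

total_blocks = image_bytes // block_size
groups = math.ceil((total_blocks - first_data_block) / blocks_per_group)

# Group 1 starts one full group after the first data block.
group1_start = first_data_block + 1 * blocks_per_group
group1_sb_offset = group1_start * block_size

print(total_blocks, groups)                 # 10240 2
print(group1_start, hex(group1_sb_offset))  # 8193 0x800400
```

Block 8193 is the offset passed to xxd for the second super block in the comparison that follows.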

        Prepare the data:
        dd if=/dev/zero of=fs10m count=10 bs=1024K
        mke2fs -b 1024 -I 128 fs10m

        Use diff for a quick comparison:
        diff --color <(xxd -p -u -c 16 -s `echo '1024 * 1' | bc` -l `echo '264' | bc` fs10m) <(xxd -p -u -c 16 -s `echo '1024 * 8193' | bc` -l `echo '264' | bc` fs10m)



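        The output shows that it is not a byte-for-byte copy. Checked against the super block structure, the differences within the 264 bytes compared are:
        1).: s_state[2]: 01 00 in the first copy, 00 00 in the second
        2).: s_block_group_nr[2]: 00 00 in the first copy, 01 00 in the second

        So apart from these two fields, which are specific to each copy, it can be regarded as a complete copy.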

        # Super block of Block Group 0
        ➜  data git:(master) ✗ xxd -u -g1 -s `echo '1024 * 1' | bc` -l `echo '264' | bc` fs10m
        00000400: 00 0A 00 00 00 28 00 00 00 02 00 00 5B 26 00 00  .....(......[&..
        00000410: F5 09 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
        00000420: 00 20 00 00 00 20 00 00 00 05 00 00 00 00 00 00  . ... ..........
        00000430: F6 C3 0D 67 00 00 FF FF 53 EF 01 00 01 00 00 00  ...g....S.......
        00000440: F6 C3 0D 67 00 00 00 00 00 00 00 00 01 00 00 00  ...g............
        00000450: 00 00 00 00 0B 00 00 00 80 00 00 00 38 00 00 00  ............8...
        00000460: 02 00 00 00 03 00 00 00 C8 CC 95 B3 98 60 40 B5  .............`@.
        00000470: A3 AC 66 22 AC 81 23 B2 00 00 00 00 00 00 00 00  ..f"..#.........
        00000480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        00000490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        000004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        000004b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        000004c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 27 00  ..............'.
        000004d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        000004e0: 00 00 00 00 00 00 00 00 00 00 00 00 C8 7F C3 0C  ................
        000004f0: 07 50 46 64 86 E5 1F F8 EC C1 C7 9F 01 00 00 00  .PFd............
        00000500: 0C 00 00 00 00 00 00 00                          ........
        
        # Super block of Block Group 1
        ➜  data git:(master) ✗ xxd -u -g1 -s `echo '1024 * 8193' | bc` -l `echo '264' | bc` fs10m
        00800400: 00 0A 00 00 00 28 00 00 00 02 00 00 5B 26 00 00  .....(......[&..
        00800410: F5 09 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
        00800420: 00 20 00 00 00 20 00 00 00 05 00 00 00 00 00 00  . ... ..........
        00800430: F6 C3 0D 67 00 00 FF FF 53 EF 00 00 01 00 00 00  ...g....S.......
        00800440: F6 C3 0D 67 00 00 00 00 00 00 00 00 01 00 00 00  ...g............
        00800450: 00 00 00 00 0B 00 00 00 80 00 01 00 38 00 00 00  ............8...
        00800460: 02 00 00 00 03 00 00 00 C8 CC 95 B3 98 60 40 B5  .............`@.
        00800470: A3 AC 66 22 AC 81 23 B2 00 00 00 00 00 00 00 00  ..f"..#.........
        00800480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        00800490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        008004a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        008004b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        008004c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 27 00  ..............'.
        008004d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        008004e0: 00 00 00 00 00 00 00 00 00 00 00 00 C8 7F C3 0C  ................
        008004f0: 07 50 46 64 86 E5 1F F8 EC C1 C7 9F 01 00 00 00  .PFd............
        00800500: 0C 00 00 00 00 00 00 00                          ........
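The comparison can also be done byte by byte. The sketch below copies only the three rows at super block offsets 0x30 to 0x5F from each dump (the other rows shown above are identical) and reports the offsets that differ. The `FIELD_AT` mapping of offsets to field names follows the super block structure referenced earlier and is filled in by hand.

```python
# Rows 0x...430 to 0x...45F of each super block, copied from the two dumps.
g0 = bytes.fromhex(
    "F6 C3 0D 67 00 00 FF FF 53 EF 01 00 01 00 00 00"
    "F6 C3 0D 67 00 00 00 00 00 00 00 00 01 00 00 00"
    "00 00 00 00 0B 00 00 00 80 00 00 00 38 00 00 00"
)
g1 = bytes.fromhex(
    "F6 C3 0D 67 00 00 FF FF 53 EF 00 00 01 00 00 00"
    "F6 C3 0D 67 00 00 00 00 00 00 00 00 01 00 00 00"
    "00 00 00 00 0B 00 00 00 80 00 01 00 38 00 00 00"
)

BASE = 0x30  # offset of the first copied row within the super block
FIELD_AT = {58: "s_state", 90: "s_block_group_nr"}  # hand-filled lookup

diffs = [BASE + i for i, (a, b) in enumerate(zip(g0, g1)) if a != b]
for off in diffs:
    print(off, FIELD_AT.get(off, "?"), f"{g0[off - BASE]:02X} -> {g1[off - BASE]:02X}")
```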
        
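    • Summary

      1).: An inode points to data blocks. For a directory inode, the blocks hold directory entries; for a file inode, the blocks hold the file's content.
      2).: The super blocks of different block groups are not exact copies: a few fields differ, namely the block group number (s_block_group_nr) and the state (s_state). Everything else in the bytes compared is identical, so it is broadly fair to call them copies.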
