Digging into ext2 - The Blavatnik School of Computer Science


Digging into ext2
Nezer J. Zaidenberg
Agenda
 Linux kernel module programming (summary from TLDP)
 Abstract (Virtual) in C
 Initial code review and work on ext2
 How to start working on ex 3 + more digging methods
 Some clues on ex3
References
 The Linux kernel - www.kernel.org (or local mirror)
 The Linux Documentation Project – Kernel module programming – 2.6
www.tldp.org (http://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html)
 Under KERNEL tree
 Documentation/kbuild
 fs/*.c
 fs/ext2/*.c
 UNIX Filesystems
 Understanding The Linux Kernel
 The Linux Kernel (from tldp.org) has a fairly reasonable description of
ext2.h and file systems in general, but it refers to version 2.0 of Linux
(so the concepts are correct, but some files have different names!)
Hello world module (1/2) in kernel 2.6
#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_INFO */
#include <linux/init.h>     /* Needed for the macros */

static int __init hello_2_init(void)
{
    printk(KERN_INFO "Hello, world 2\n");
    return 0;
}
Hello world module (2/2)
static void __exit hello_2_exit(void)
{
    printk(KERN_INFO "Goodbye, world 2\n");
}

module_init(hello_2_init);
module_exit(hello_2_exit);
Makefile for kernel module
obj-m += hello-2.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
Makefile for multiple objects
obj-m += startstop.o
startstop-objs := start.o stop.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
Implementing abstract functions in C
class base
{
public:
    virtual void foo();
    virtual void bar();
    Widget W;
};

class derived : public base // Extends class base
{
public:
    void foo(); // overrides base::foo
};
Calling base and derived functions
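base b;
derived d;
base *ptr = &b;
(*ptr).foo();   // calls base::foo
(*ptr).bar();   // calls base::bar
ptr = &d;
(*ptr).foo();   // calls derived::foo
(*ptr).bar();   // calls base::bar (derived does not override it)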
How does the compiler know which function to call?
The Virtual implementation
 When implementing virtual functions, every object keeps a pointer to a
“virtual table” shared by all objects of its class
 The virtual table holds pointers to the class's virtual functions
 When we use ptr->foo(), the call goes through the virtual table and
looks up the entry for foo.
 Whatever function that entry points to is the one that gets called.
What does it have to do with C?
Abstract functions in C

The fact that C has no virtual table built for us by the compiler does not mean
we cannot build one ourselves. (If you think it through, everything in C++/Java
compiles down to assembler (close to C), which has no classes either.)

We just have to build the virtual table ourselves…

For every struct widget we can add a struct widget_operations * which holds
pointers to the functions that operate on the widget (and which can be
implemented however we want).

Since the C compiler lacks the class and inheritance mechanisms, we have to do
lots of things ourselves, such as:

Deliver the this pointer to the functions
Initialize the struct widget_operations
Call the function in a slightly awkward way (b.foo() becomes b.ops->foo(&b))
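A minimal user-space sketch of the hand-built virtual table described above. struct widget and struct widget_operations follow the names used on the slide; the functions base_foo, base_bar, derived_foo, call_foo and call_bar and the value field are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct widget;  /* forward declaration so the table can mention it */

/* Hand-built "virtual table": one pointer per overridable operation. */
struct widget_operations {
    int (*foo)(struct widget *self);
    int (*bar)(struct widget *self);
};

struct widget {
    const struct widget_operations *ops;  /* selects the implementation */
    int value;
};

/* "Base" implementations; self plays the role of the this pointer. */
static int base_foo(struct widget *self) { return self->value; }
static int base_bar(struct widget *self) { return self->value + 1; }

/* "Derived" overrides foo only and reuses base_bar. */
static int derived_foo(struct widget *self) { return self->value * 10; }

static const struct widget_operations base_ops = {
    .foo = base_foo,
    .bar = base_bar,
};

static const struct widget_operations derived_ops = {
    .foo = derived_foo,
    .bar = base_bar,
};

/* The caller does not know (or care) which implementation it gets. */
static int call_foo(struct widget *w)
{
    assert(w != NULL && w->ops != NULL && w->ops->foo != NULL);
    return w->ops->foo(w);
}

static int call_bar(struct widget *w)
{
    assert(w != NULL && w->ops != NULL && w->ops->bar != NULL);
    return w->ops->bar(w);
}
```

A widget initialized with .ops = &derived_ops gets derived_foo from call_foo but still base_bar from call_bar, which mirrors the base/derived example earlier in the slides.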
The abstract interface
struct file_operations {
    struct module *owner;
    loff_t (*llseek) (struct file *, loff_t, int);
    ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
    ssize_t (*aio_read) (struct kiocb *, char __user *, size_t, loff_t);
    ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
    ssize_t (*aio_write) (struct kiocb *, const char __user *, size_t,
    …
The complete struct (and others) can be found in the kernel source tree under include/linux/fs.h
Filling the abstract interface (GCC)
struct file_operations fops = {
    read: device_read,
    write: device_write,
    open: device_open,
    release: device_release
};
Filling the abstract interface (C99)
struct file_operations fops = {
    .read = device_read,
    .write = device_write,
    .open = device_open,
    .release = device_release
};
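One detail worth seeing in isolation: members that a designated initializer does not name are zero-initialized, so the operations you leave out are null pointers and the caller has to check before calling them. The sketch below is plain user-space C; every toy_ name and the error value are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

struct toy_file;  /* hypothetical stand-in for struct file */

struct toy_file_operations {
    long (*read)(struct toy_file *f, char *buf, size_t len);
    long (*write)(struct toy_file *f, const char *buf, size_t len);
    int (*open)(struct toy_file *f);
    int (*release)(struct toy_file *f);
};

struct toy_file {
    const struct toy_file_operations *f_op;
};

#define TOY_EINVAL 22  /* illustrative error code */

/* Our only implemented operation: fill the buffer with 'x'. */
static long toy_read(struct toy_file *f, char *buf, size_t len)
{
    (void)f;
    for (size_t i = 0; i < len; i++)
        buf[i] = 'x';
    return (long)len;
}

/* Only .read is named; .write, .open and .release are left NULL. */
static const struct toy_file_operations toy_fops = {
    .read = toy_read,
};

/* Caller side: check for a missing operation before dispatching. */
static long toy_vfs_read(struct toy_file *f, char *buf, size_t len)
{
    assert(f != NULL && f->f_op != NULL);
    if (!f->f_op->read)
        return -TOY_EINVAL;
    return f->f_op->read(f, buf, len);
}

static long toy_vfs_write(struct toy_file *f, const char *buf, size_t len)
{
    assert(f != NULL && f->f_op != NULL);
    if (!f->f_op->write)
        return -TOY_EINVAL;
    return f->f_op->write(f, buf, len);
}
```

This is why a file system only has to fill in the operations it actually supports.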
fs/ext2/super.c

static void __exit exit_ext2_fs(void)
{
    unregister_filesystem(&ext2_fs_type);
    destroy_inodecache();
    exit_ext2_xattr();
}

module_init(init_ext2_fs)
module_exit(exit_ext2_fs)
fs/ext2/super.c
static int __init init_ext2_fs(void)
{
    int err = init_ext2_xattr();
    if (err)
        return err;
    err = init_inodecache();
    if (err)
        goto out1;
    err = register_filesystem(&ext2_fs_type);
    if (err)
        goto out;
    return 0;
out:
    destroy_inodecache();
out1:
    exit_ext2_xattr();
    return err;
}
What have we just seen
 Init the module
 Exit
 Register/unregister the file system
 It would be perfectly reasonable if your module just calls
register_filesystem and unregister_filesystem in its init
and exit functions (you don’t REALLY need the inode cache,
xattr, etc.)
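The init function uses the usual goto-based unwinding: each setup step that succeeded is undone in reverse order when a later step fails. Here is a user-space sketch of the same control flow. setup, teardown, fail_at and live are hypothetical test scaffolding, not kernel code; only the shape of toy_init follows init_ext2_fs.

```c
#include <assert.h>

static int fail_at;  /* which step should fail; 0 means none */
static int live;     /* how many steps are currently set up */

/* Pretend setup step: fails with -step if asked to. */
static int setup(int step)
{
    if (fail_at == step)
        return -step;
    live++;
    return 0;
}

/* Pretend cleanup for one step. */
static void teardown(void)
{
    assert(live > 0);
    live--;
}

static int toy_init(void)
{
    int err = setup(1);   /* plays the role of init_ext2_xattr() */
    if (err)
        return err;       /* nothing to undo yet */
    err = setup(2);       /* plays the role of init_inodecache() */
    if (err)
        goto out1;
    err = setup(3);       /* plays the role of register_filesystem() */
    if (err)
        goto out;
    return 0;
out:
    teardown();           /* undo step 2 */
out1:
    teardown();           /* undo step 1 */
    return err;
}
```

Whichever step fails, toy_init returns that step's error and leaves nothing set up.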
struct ext2_fs_type (fs/ext2/super.c)
static struct file_system_type ext2_fs_type = {
    .owner    = THIS_MODULE,
    .name     = "ext2",
    .get_sb   = ext2_get_sb,
    .kill_sb  = kill_block_super,
    .fs_flags = FS_REQUIRES_DEV,
};
What have we just seen
 We have 5 fields
 .owner = THIS_MODULE (so while this file system is mounted we
cannot unload this module)
 .name = "ext2" (the name that will be passed to mount
with -t)
 .get_sb = what to do to get the super block (calls an ext2 function)
 .kill_sb = how to free the super block (calls a Linux function)
 .fs_flags = FS_REQUIRES_DEV (this file system lives on a block device)
ext2_get_sb (from fs/ext2/super.c)
static int ext2_get_sb(struct file_system_type *fs_type,
        int flags, const char *dev_name, void *data, struct vfsmount *mnt)
{
    return get_sb_bdev(fs_type, flags, dev_name, data, ext2_fill_super, mnt);
}
What have we just seen
 We call the standard Linux kernel function that gets the super
block for a file system based on a BLOCK device. This function
initializes the block device
 We pass this function, as an argument, a function that fills the
private file system super block (reads it from disk), called
ext2_fill_super
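The "pass a function as an argument" step can be sketched in user space like this. toy_get_sb_bdev stands in for the generic Linux side and the uxfs_ functions for our side; struct toy_super_block, UXFS_MAGIC and the convention that data points at a magic number are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct super_block. */
struct toy_super_block {
    unsigned long s_magic;
    int s_ready;
};

/* Type of the file-system-specific callback, shaped like fill_super. */
typedef int (*toy_fill_super_t)(struct toy_super_block *sb,
                                void *data, int silent);

/* "Linux" side: generic work, then hand control to the callback. */
static int toy_get_sb_bdev(struct toy_super_block *sb, void *data,
                           toy_fill_super_t fill_super)
{
    int error;

    assert(sb != NULL && fill_super != NULL);
    sb->s_magic = 0;   /* generic setup we do not care about */
    sb->s_ready = 0;

    error = fill_super(sb, data, 0);
    if (error)
        return error;  /* generic cleanup would go here */

    sb->s_ready = 1;
    return 0;
}

#define UXFS_MAGIC 0x55584653UL  /* made-up magic number */

/* "Our" side: fill the private parts and validate the magic number. */
static int uxfs_fill_super(struct toy_super_block *sb, void *data, int silent)
{
    (void)silent;
    if (data == NULL)
        return -1;     /* nothing to read the super block from */
    sb->s_magic = *(const unsigned long *)data;
    return sb->s_magic == UXFS_MAGIC ? 0 : -1;
}

/* Our entry point just delegates, passing our callback along. */
static int uxfs_get_sb(struct toy_super_block *sb, void *data)
{
    return toy_get_sb_bdev(sb, data, uxfs_fill_super);
}
```

Control goes from our code into the generic code and back into our code, which is the pattern the rest of the dig keeps running into.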
Continuing the dig… fs/super.c

You’ll find get_sb_bdev at lines 751-813 (note it’s a different file)

This function does a lot we don’t really care about (initializing the block device, etc.), but in lines 795-800 you will find:

795         error = fill_super(s, data, flags & MS_SILENT ? 1 : 0);
796         if (error) {
797                 up_write(&s->s_umount);
798                 deactivate_super(s);
799                 goto error;
800         }

Let’s take it from there
Back to fs/ext2/super.c
static int ext2_fill_super(struct super_block *sb, void *data, int silent)
{
    struct buffer_head *bh;
    …
    sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
    if (!sbi)
        return -ENOMEM;
    sb->s_fs_info = sbi;
    sbi->s_sb_block = sb_block;
    …
fs/ext2/super.c
    if (blocksize != BLOCK_SIZE) {
        logic_sb_block = (sb_block*BLOCK_SIZE) / blocksize;
        offset = (sb_block*BLOCK_SIZE) % blocksize;
    } else {
        logic_sb_block = sb_block;
    }
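A worked example of this arithmetic as a user-space sketch. It assumes BLOCK_SIZE is 1024, as in the kernel headers; locate_super, struct sb_location and TOY_BLOCK_SIZE are names invented for the illustration.

```c
#include <assert.h>

#define TOY_BLOCK_SIZE 1024UL  /* assumed to match the kernel's BLOCK_SIZE */

/* Where the super block lives, in units of the device block size. */
struct sb_location {
    unsigned long logic_sb_block;  /* device block to read */
    unsigned long offset;          /* byte offset inside that block */
};

static struct sb_location locate_super(unsigned long sb_block,
                                       unsigned long blocksize)
{
    struct sb_location loc = { sb_block, 0 };

    assert(blocksize > 0);
    if (blocksize != TOY_BLOCK_SIZE) {
        /* Convert "1 KiB block number" into "device block + offset". */
        loc.logic_sb_block = (sb_block * TOY_BLOCK_SIZE) / blocksize;
        loc.offset = (sb_block * TOY_BLOCK_SIZE) % blocksize;
    }
    return loc;
}
```

With sb_block = 1 and a 4096-byte device block, this gives logic_sb_block = 0 and offset = 1024: the super block starts 1024 bytes into device block 0. With a 1024-byte device block it is simply block 1 at offset 0.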
fs/ext2/super.c
    if (!(bh = sb_bread(sb, logic_sb_block))) {
        printk ("EXT2-fs: unable to read superblock\n");
        goto failed_sbi;
    }
    /*
     * Note: s_es must be initialized as soon as possible because
     *       some ext2 macro-instructions depend on its value
     */
    es = (struct ext2_super_block *) (((char *)bh->b_data) + offset);
    sbi->s_es = es;
fs/ext2/super.c
…
    if (sb->s_magic != EXT2_SUPER_MAGIC)
        goto cantfind_ext2;
…
    sb->s_op = &ext2_sops;
…
    root = ext2_iget(sb, EXT2_ROOT_INO);
fs/ext2/super.c
static const struct super_operations ext2_sops = {
    .alloc_inode    = ext2_alloc_inode,
    .destroy_inode  = ext2_destroy_inode,
    .write_inode    = ext2_write_inode,
    .delete_inode   = ext2_delete_inode,
    .put_super      = ext2_put_super,
    .write_super    = ext2_write_super,
    .statfs         = ext2_statfs,
    .remount_fs     = ext2_remount,
    .clear_inode    = ext2_clear_inode,
    .show_options   = ext2_show_options,
#ifdef CONFIG_QUOTA
    .quota_read     = ext2_quota_read,
    .quota_write    = ext2_quota_write,
#endif
};
What have we just seen
 The function finds the exact block to read, based on the block
device's block size and the super block's block size
 The function reads the super block via the sb_bread API for
block devices (the data is read into a struct buffer_head)
 The function stores private data
 The function checks the magic number (you should do this too)
 The function sets the super block ops
 The function reads the root directory's inode
Saving private data
 Linux struct super_block has a void * called s_fs_info
 Linux struct inode has a void * called i_private
 Those pointers can be used to save private data (your file
system structure)
 You can get your data back through those pointers later
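A user-space sketch of the private-data idea: the generic structure carries an untyped pointer, and only our code knows what it really points to. struct toy_super_block stands in for the kernel's struct super_block; struct uxfs_sb_info, UXFS_SB and the attach/detach helpers are hypothetical names, and calloc/free stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct super_block: just the private pointer. */
struct toy_super_block {
    void *s_fs_info;
};

/* Our private, in-memory file system data (made-up fields). */
struct uxfs_sb_info {
    unsigned long s_nblocks;
    unsigned long s_nfree;
};

/* Accessor: turn the untyped pointer back into our own type. */
static struct uxfs_sb_info *UXFS_SB(struct toy_super_block *sb)
{
    return (struct uxfs_sb_info *)sb->s_fs_info;
}

/* Allocate our private data and hang it on the super block. */
static int uxfs_attach_info(struct toy_super_block *sb, unsigned long nblocks)
{
    struct uxfs_sb_info *sbi;

    assert(sb != NULL);
    sbi = calloc(1, sizeof(*sbi));
    if (!sbi)
        return -1;
    sbi->s_nblocks = nblocks;
    sbi->s_nfree = nblocks;
    sb->s_fs_info = sbi;
    return 0;
}

/* Release the private data again (what a put_super would do). */
static void uxfs_detach_info(struct toy_super_block *sb)
{
    free(sb->s_fs_info);
    sb->s_fs_info = NULL;
}
```

Any later operation that receives the super block can call UXFS_SB(sb) to reach the private structure without the generic code knowing its layout.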
Include files and structs

include/linux/fs.h has most struct definitions, including

struct super_block (lines 1106-1176)

struct inode (lines 623-688)

struct block_device (lines 550-580)

struct buffer_head is found at line 60 of buffer_head.h

Operations

super_operations: fs.h:1358
inode_operations: fs.h:1311
file_operations: fs.h:1281
address_space_operations: fs.h:486 (for mmap)
Lots of times we will go from our file system code to Linux code and back
 Our uxfs_get_sb -> calls Linux get_sb_bdev -> calls our
uxfs_fill_sb
 Our inode ops -> are filled with Linux do_sync_read -> which calls our
get_block
 This makes sense, so don’t be surprised when you see it
How to start working (from scratch)
on ex3
 Learn :
 Build hello world module
 Build a blank (do nothing) file system module
 Think :
 What are you going to hold in your file system? Think about
super block, Inodes
 Implement :
 Write mkfs.uxfs (a user program)
 Write fsdebug (a user program that prints/manipulates your file
system)
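A possible starting point for mkfs.uxfs and fsdebug, sketched against an in-memory image so the layout is easy to inspect. Everything here is an assumption for illustration: the field list, the magic value, the 1024-byte block size and the choice of block 1 for the super block. A real tool would write the buffer to the device or image file afterwards, and would need to settle on a byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define UXFS_BLOCK_SIZE 1024u
#define UXFS_MAGIC      0x55584653u  /* made-up magic number */

/* Hypothetical on-disk super block. */
struct uxfs_disk_super {
    uint32_t s_magic;
    uint32_t s_nblocks;
    uint32_t s_ninodes;
    uint32_t s_root_ino;
};

/* mkfs side: zero the image and write a super block into block 1. */
static int uxfs_format(unsigned char *image, size_t size, uint32_t ninodes)
{
    struct uxfs_disk_super sb;

    assert(image != NULL);
    if (size < 2 * UXFS_BLOCK_SIZE)
        return -1;  /* too small to hold block 1 */
    memset(image, 0, size);
    sb.s_magic = UXFS_MAGIC;
    sb.s_nblocks = (uint32_t)(size / UXFS_BLOCK_SIZE);
    sb.s_ninodes = ninodes;
    sb.s_root_ino = 1;
    memcpy(image + UXFS_BLOCK_SIZE, &sb, sizeof(sb));
    return 0;
}

/* fsdebug side: read the super block back and check the magic number. */
static int uxfs_read_super(const unsigned char *image, size_t size,
                           struct uxfs_disk_super *out)
{
    assert(image != NULL && out != NULL);
    if (size < 2 * UXFS_BLOCK_SIZE)
        return -1;
    memcpy(out, image + UXFS_BLOCK_SIZE, sizeof(*out));
    return out->s_magic == UXFS_MAGIC ? 0 : -1;
}
```

Getting this pair right in user space first means the kernel-side fill_super only has to repeat a read and a magic check you have already debugged.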
Ex 3
 Move to kernel
 Make your file system module mountable (mount your
floppy)
 (make sure you read the SB correctly)
 Implement directories (so we can ls the root of the file system)
 Implement files (via read/write/lseek)
 Implement mmap
 You’re done; you can work on the bonus now
More digging methods + some advice
 Install the kernel-devel-2.6.27.5 RPM from
ftp://mirror.isoc.org.il/fedora/releases/10/Fedora/i386/os/Packages/kernel-devel-2.6.27.5-117.fc10.i686.rpm
 rpm -ihv --force kernel-devel-2.6.27.5-117.fc10.i686.rpm
 Download the complete kernel source from www.kernel.org
 You can add a printk to each function that ext2 calls in each
operation as you trace through the file
Some clues on EX3
 You can use the UNIX Filesystems solutions (but some porting is required)
 You can use the ext2 solution (which is a complete solution + all bonuses)
but it is “encumbered” with many performance optimizations you don’t
need
 If ext2 implements a function you need -> you should
probably implement it as well.
 If ext2 uses the Linux default implementation -> it should be OK for you too.
 All ext2 functions are prefixed with ext2_
 Use something to help you navigate the kernel (I use vi with ctags, grep
and cscope)
minix
 Minix is a “mini-Unix” (an OS built by Andrew Tanenbaum for
educational purposes, which later inspired Linus Torvalds to
write Linux). It has a Unix-like file system MUCH simpler than
ext2.
 I am NOT familiar enough with minix to recommend it, but
from what I browsed it has all the features you need for
the ex. (+ some extra) and is much simpler than ext2
 The implementation of minixfs under Linux was authored
by Linus Torvalds in 1991-1992
Amount of lines of code to understand
Fs name                    Lines of code
ext4                       26404
ext3                       16
ext2                       8431
minix                      2243
What is required of you    ~1000 (+500 bonus)
For those of you who want to base your work on MINIX FS
 Ignore everything that has to do with minix versions (for
simplicity assume minix version 1)
 Ignore everything that has to do with aio (your file system
is not required to support aio)
 Ignore everything that has to do with inode_cache
(performance optimization that was not required)
 The rest is pretty much your homework… (with some
bonuses such as bitmaps)
More minix
 Zones = blocks
 Zmap = block map
 Minix is documented (including the file system) in Andrew
Tanenbaum’s book Operating Systems: Design and Implementation
(latest edition with Albert Woodhull). This documentation describes
the MINIX code for the minix fs (which is different from the Linux code);
still, it can help if you don’t understand the structure
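For the bitmap bonus, a block (zone) map boils down to one bit per block plus a first-fit search. The sketch below is a generic user-space bitmap, not the minix or ext2 code; the function names and the bit ordering within a byte are choices made for the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Return 1 if the given bit is set (block in use), 0 otherwise. */
static int bitmap_test(const unsigned char *map, size_t bit)
{
    return (map[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1;
}

/* Mark a block as in use. */
static void bitmap_set(unsigned char *map, size_t bit)
{
    map[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
}

/* Mark a block as free again. */
static void bitmap_clear(unsigned char *map, size_t bit)
{
    map[bit / CHAR_BIT] &= (unsigned char)~(1u << (bit % CHAR_BIT));
}

/* First-fit allocation: claim the lowest free bit, or return -1 if full. */
static long bitmap_alloc(unsigned char *map, size_t nbits)
{
    assert(map != NULL);
    for (size_t bit = 0; bit < nbits; bit++) {
        if (!bitmap_test(map, bit)) {
            bitmap_set(map, bit);
            return (long)bit;
        }
    }
    return -1;
}
```

The same four helpers work for an inode map; only the meaning of the bit index changes.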
Some VMWARE and Linux admin clues
 Install packages with rpm -ihv (package you downloaded) (may require
--force if you are installing an older package)
 yum install emacs (or eclipse or anything)