CS 471 - Lecture 8 File Systems Ch. 10,11 George Mason University Fall 2009


File-System Interface

• File Concept
• File Operations
• Access Methods
• Directory Structure
• Access control

GMU – CS 571 10.2

Files

A file is a named collection of related information that is recorded on secondary storage

Several information storage media (magnetic/optical disks)

The operating system provides a uniform logical view of information storage

Files

Files
• are mapped onto physical storage devices.
• represent programs (both source and object forms) and data.
• have a certain structure that may be considered as a sequence of bits, bytes, lines, records…
  • meaning defined by file’s creator
  • logically contiguous
• have attributes that are recorded by the O.S. (name, size, type, location, protection info, time info, etc.)

Information about files is kept in the directory structure, which is also maintained on secondary storage.

Basic File Operations

• Create
• Write
• Read
• Delete
• Others: reposition within the file, append, rename, truncate, ...

For read/write operations, the operating system needs to keep a file position pointer for each process, and must update it after every read, write, or reposition.
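The file position pointer can be observed from a user program. The following is a minimal sketch, assuming a POSIX system; the helper name position_after_partial_read and the scratch-file path are made up for illustration. It writes 10 bytes, repositions to the start, reads 4 bytes, and asks the OS where the pointer now stands.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the file position reported by the OS after
 * writing 10 bytes, rewinding, and reading 4 bytes. Returns -1 on error. */
long position_after_partial_read(void)
{
    char path[] = "/tmp/fpos_demo_XXXXXX";   /* assumed scratch location */
    char buf[4];
    off_t pos;
    int fd = mkstemp(path);                  /* create and open a scratch file */

    if (fd < 0)
        return -1;
    unlink(path);                            /* file vanishes once fd is closed */

    /* After this write the position pointer sits at offset 10. */
    if (write(fd, "0123456789", 10) != 10) { close(fd); return -1; }

    /* "Reposition within the file": move the pointer back to offset 0. */
    if (lseek(fd, 0, SEEK_SET) != 0) { close(fd); return -1; }

    /* Each read advances the pointer by the number of bytes read. */
    if (read(fd, buf, sizeof buf) != (ssize_t)sizeof buf) { close(fd); return -1; }

    pos = lseek(fd, 0, SEEK_CUR);            /* query the pointer without moving it */
    close(fd);
    return (long)pos;
}
```

Calling position_after_partial_read() should report offset 4: the pointer advanced by exactly the number of bytes read after the reposition.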

File Operations

• To avoid searching the directory entries repeatedly, many systems require that an open() system call be issued before that file is first used actively.
• Operating System keeps
  • a system-wide open-file table containing information about all open files
  • per-process open-file tables containing information about all open files of each process
• The open operation takes a file name and searches the directory, copying the directory entry into the open-file table. It returns a pointer to the entry in the open-file table.

File Operations

• The per-process open table contains info about
  • Position pointer (current location within file)
  • Access rights
  • Accounting
  • Pointer to the system-wide open-file table entry
• The system-wide open table includes info about
  • File location on the disk
  • File size
  • File open count (the number of processes using this file)
• A process that completes its operations on a given file will issue a close() system call.
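The two kinds of tables can be sketched as plain data structures. This is a toy model under stated assumptions, not how any particular kernel lays them out: the names sim_open and sim_close, the table sizes, and the use of the disk location to identify a file are all made up for illustration.

```c
#include <stddef.h>

#define MAX_SYS_OPEN  16   /* assumed table sizes, for illustration only */
#define MAX_PROC_OPEN 8

/* One entry in the system-wide open-file table. */
struct sys_entry {
    int  in_use;
    long disk_location;    /* file location on the disk */
    long size;             /* file size */
    int  open_count;       /* number of processes using this file */
};

/* One entry in a per-process open-file table. */
struct proc_entry {
    int  in_use;
    long position;         /* current location within the file */
    int  access_rights;
    struct sys_entry *sys; /* pointer to the system-wide entry */
};

struct process {
    struct proc_entry files[MAX_PROC_OPEN];
};

static struct sys_entry sys_table[MAX_SYS_OPEN];

/* Simulated open(): returns an index into the process's table, or -1. */
int sim_open(struct process *p, long disk_location, long size, int rights)
{
    struct sys_entry *se = NULL;
    int fd = -1;

    for (int i = 0; i < MAX_PROC_OPEN && fd < 0; i++)
        if (!p->files[i].in_use)
            fd = i;
    if (fd < 0)
        return -1;                          /* per-process table is full */

    /* Share an existing system-wide entry if the file is already open. */
    for (int i = 0; i < MAX_SYS_OPEN && se == NULL; i++)
        if (sys_table[i].in_use && sys_table[i].disk_location == disk_location)
            se = &sys_table[i];
    for (int i = 0; i < MAX_SYS_OPEN && se == NULL; i++)
        if (!sys_table[i].in_use) {
            se = &sys_table[i];
            se->in_use = 1;
            se->disk_location = disk_location;
            se->size = size;
            se->open_count = 0;
        }
    if (se == NULL)
        return -1;                          /* system-wide table is full */

    p->files[fd] = (struct proc_entry){1, 0, rights, se};
    se->open_count++;
    return fd;
}

/* Simulated close(): free the system-wide entry when the count hits zero. */
void sim_close(struct process *p, int fd)
{
    if (fd < 0 || fd >= MAX_PROC_OPEN || !p->files[fd].in_use)
        return;
    if (--p->files[fd].sys->open_count == 0)
        p->files[fd].sys->in_use = 0;
    p->files[fd].in_use = 0;
}
```

When two processes open the same file, each gets its own position pointer, but both per-process entries point to one shared system-wide entry whose open count is 2.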

File Operations (Cont.)

[Figure: Process A’s Open-File Table, Process B’s Open-File Table, System-Wide Open-File Table]

An Example Program Using File System Calls (1/3)

/* File copy program. Error checking and reporting is minimal. */
/* "myfilecopy oldfile newfile" will copy the contents of "oldfile" to "newfile" */
/* The program will read blocks of 4K from the "oldfile" to a buffer, and store them to "newfile" sequentially */

#include <sys/types.h>              /* include necessary header files */
#include <fcntl.h>                  /* open(), creat(), O_RDONLY */
#include <stdlib.h>                 /* exit() */
#include <unistd.h>                 /* read(), write(), close() */

int main(int argc, char *argv[]);   /* ANSI prototype */

#define BUF_SIZE 4096               /* use a buffer size of 4096 bytes */
#define OUTPUT_MODE 0700            /* protection bits for output file */

An Example Program Using File System Calls (2/3)

int main(int argc, char *argv[])
{
    int in_fd, out_fd, rd_count, wt_count;
    char buffer[BUF_SIZE];

    if (argc != 3) exit(1);                 /* error if argc is not 3 */

    /* Open the input file and create the output file */
    in_fd = open(argv[1], O_RDONLY);        /* open the source file */
    if (in_fd < 0) exit(2);                 /* if it cannot be opened, exit */
    out_fd = creat(argv[2], OUTPUT_MODE);   /* create the destination file */
    if (out_fd < 0) exit(3);                /* if it cannot be created, exit */

An Example Program Using File System Calls (3/3)

    /* Copy loop */
    while (1) {
        rd_count = read(in_fd, buffer, BUF_SIZE);     /* read a block of data */
        if (rd_count <= 0) break;                     /* if end of file or error, exit loop */
        wt_count = write(out_fd, buffer, rd_count);   /* write data */
        if (wt_count <= 0) exit(4);                   /* wt_count <= 0 is an error */
    }

    /* Close the files */
    close(in_fd);
    close(out_fd);
    if (rd_count == 0)                                /* no error on last read */
        exit(0);
    else
        exit(5);                                      /* error on last read */
}

File Types

• Most operating systems associate a type with a file
• File type can be used to operate on files in reasonable ways
  • ex: Windows – file type (i.e. suffix) used to determine what program to open a file with
  • ex: Unix – info stored in file (‘magic number’) can be used for differentiation – suffix not always used
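A magic-number check can be sketched in a few lines. The function name guess_type is made up for illustration, and only a handful of well-known signatures are listed; real tools such as file(1) recognize many more.

```c
#include <stddef.h>
#include <string.h>

/* Guess a file's type from its first bytes rather than from its suffix. */
const char *guess_type(const unsigned char *head, size_t len)
{
    if (len >= 4 && memcmp(head, "\x7f" "ELF", 4) == 0)
        return "ELF executable";
    if (len >= 4 && memcmp(head, "%PDF", 4) == 0)
        return "PDF document";
    if (len >= 8 && memcmp(head, "\x89PNG\r\n\x1a\n", 8) == 0)
        return "PNG image";
    if (len >= 2 && memcmp(head, "#!", 2) == 0)
        return "script with interpreter line";
    return "unknown";
}
```

A caller would read the first few bytes of the file into a buffer and pass them in; the file name never enters the decision.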

File Types – Name, Extension


File Structure

• None - sequence of words, bytes
• Simple record structure
  • Lines
  • Fixed length
  • Variable length
• Complex Structures
  • Formatted document
  • Relocatable load file
• Can simulate last two methods with first method by inserting appropriate control characters
• Who decides:
  • Operating system
  • Program

Internal File Structure

• Disk systems have a well-defined block size determined by the size of a sector.
• All disk I/O is performed in units of one block (physical record).
  • Each block is one or more sectors
  • A sector can hold 32 – 4096 bytes
• Files are made of logical records.
• Often, a number of logical records will be packed into physical records.
• Operating System will perform translation from logical records to physical records.
• Internal fragmentation
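The packing arithmetic can be made concrete. This sketch assumes fixed-length records that never span two blocks; the names pack_records and blocks_needed are made up for illustration.

```c
#include <stddef.h>

/* Result of packing fixed-length logical records into one physical block. */
struct packing {
    size_t records_per_block;
    size_t wasted_bytes;      /* unused tail of each block */
};

struct packing pack_records(size_t block_size, size_t record_size)
{
    struct packing p = {0, block_size};

    if (record_size == 0 || record_size > block_size)
        return p;                                   /* nothing fits */
    p.records_per_block = block_size / record_size;
    p.wasted_bytes = block_size % record_size;
    return p;
}

/* Blocks needed to hold n_records when records never span blocks. */
size_t blocks_needed(size_t n_records, size_t block_size, size_t record_size)
{
    struct packing p = pack_records(block_size, record_size);

    if (p.records_per_block == 0)
        return 0;
    return (n_records + p.records_per_block - 1) / p.records_per_block;
}
```

With 512-byte blocks and 100-byte records, five records fit per block and 12 bytes of every block go unused; 12 records need three blocks, the last of which is mostly empty.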

File Access Methods

Sequential Access

Information is processed in order, one record after the other (tape model)

Example: editors and compilers

read next
write next
reset (rewind)

File Access Methods

Direct Access
• The file is made up of fixed-length logical records that allow programs to read and write records rapidly in any order
    read n
    write n
  or alternatively:
    position to n
    read next
    write next
  n = relative block number
• A request to read block N is translated into physical address B*N + start (for block size B)
  • ex: database
• Other access methods are often built on top of direct access
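The "read n / write n" operations map naturally onto positioned I/O. A minimal sketch, assuming a POSIX system and a record length of 64 bytes chosen only for illustration; read_record, write_record and record_offset are made-up names.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 64   /* assumed fixed record length B, in bytes */

/* Byte offset of record n: B*n + start. */
off_t record_offset(off_t start, long n)
{
    return start + (off_t)RECORD_SIZE * n;
}

/* "read n": fetch record n; the file position pointer is left untouched.
 * Returns 0 on success, -1 on error or short read. */
int read_record(int fd, long n, char rec[RECORD_SIZE])
{
    ssize_t got = pread(fd, rec, RECORD_SIZE, record_offset(0, n));
    return got == RECORD_SIZE ? 0 : -1;
}

/* "write n": store record n in place.
 * Returns 0 on success, -1 on error or short write. */
int write_record(int fd, long n, const char rec[RECORD_SIZE])
{
    ssize_t put = pwrite(fd, rec, RECORD_SIZE, record_offset(0, n));
    return put == RECORD_SIZE ? 0 : -1;
}
```

The alternative form on the slide ("position to n", then "read next") corresponds to an lseek() to the same offset followed by an ordinary read().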

Directory Structure

The directory acts as a symbol table that translates file names into their directory entries.

Operations on a directory
• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• …

Organize the Directory (Logically) to Obtain

• Efficiency – locating a file quickly
• Naming – convenient to users
  • Two users can have the same name for different files
  • The same file can have several different names
• Grouping – logical grouping of files by properties (e.g., all Java programs, all games, …)

Single-Level Directory

A single directory for all users

• Naming problem
• Grouping problem

Two-Level Directory

Separate directory for each user

• Path name
• Can have the same file name for different users
• Efficient searching
• No grouping capability

Tree Directory Structure

Tree-structured directories extend the structure to a tree of arbitrary height

User-imposed structure
• Relative paths vs. absolute paths
• Directory deletion policy
• Concept of a ‘current directory’

Acyclic-Graph Directories

• Allows shared subdirectories and files.
• A shared file will “exist” in multiple directories at once.

Achieving File Sharing

Option 1: Duplicate all information about the shared file in both directories (Problem?)

Option 2: Create a new directory entry called a link
• The link is effectively a pointer to another file or directory
• When the directory entry of a referred file is a link, we resolve the link by using the path name (symbolic link in Unix)
• “ln -s reports/report1.txt myreport”

Achieving File Sharing (Cont.)

Option 3: Each entry in a directory can point to a little data structure (File Control Block [FCB], or “i-node”) that keeps information about the file

The directory entries corresponding to a shared file will all point to the same file control block
• Non-symbolic or “hard” links in Unix
• “ln reports/report1.txt myreport”

[Figure: the entry myreport in the “root” directory and the entry report1.txt in the “reports” directory, with the FCB of the file]

Achieving File Sharing (Cont.)

• What to do when a shared file is deleted by a user?
  • The deletion of a link should not affect the original file
  • If the original file is deleted, we may be left with dangling pointers.
• Solutions
  • Using backpointers, delete also all links. The search may be expensive.
  • Alternatively, leave the links intact until an attempt is made to use them (Unix symbolic links). May lead to infrequent but subtle problems.
  • In case of non-symbolic (or in Unix, “hard”) links: preserve the file until all references are deleted. Keep a count of the number of references, and delete the file when the count reaches zero.
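The reference count and the dangling symbolic link can both be observed with standard POSIX calls. The sketch below is an illustration only: link_demo and the file names inside its scratch directory are made up, and error paths skip cleanup to keep it short.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Fills counts[0..2] with the i-node link count of a file (0) after creation,
 * (1) after adding a hard link and a symbolic link, (2) after deleting the
 * original name. Sets *dangling to 1 if the symbolic link can no longer be
 * followed. Returns 0 on success, -1 on error. */
int link_demo(long counts[3], int *dangling)
{
    char dir[] = "/tmp/linkdemo_XXXXXX";     /* assumed scratch location */
    char orig[64], hard[64], soft[64];
    struct stat st;
    FILE *f;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(orig, sizeof orig, "%s/report1.txt", dir);
    snprintf(hard, sizeof hard, "%s/hardlink", dir);
    snprintf(soft, sizeof soft, "%s/symlink", dir);

    f = fopen(orig, "w");
    if (f == NULL)
        return -1;
    fputs("data\n", f);
    fclose(f);

    if (stat(orig, &st) != 0) return -1;
    counts[0] = (long)st.st_nlink;            /* one directory entry so far */

    if (link(orig, hard) != 0) return -1;     /* like: ln report1.txt hardlink */
    if (symlink(orig, soft) != 0) return -1;  /* like: ln -s report1.txt symlink */
    if (stat(orig, &st) != 0) return -1;
    counts[1] = (long)st.st_nlink;            /* symbolic links are not counted */

    if (unlink(orig) != 0) return -1;         /* delete the original name */
    if (stat(hard, &st) != 0) return -1;      /* file survives via the hard link */
    counts[2] = (long)st.st_nlink;
    *dangling = (stat(soft, &st) != 0);       /* stat() follows the symlink */

    unlink(hard);                             /* count reaches zero: file is freed */
    unlink(soft);
    rmdir(dir);
    return 0;
}
```

On a typical Unix file system the counts come out as 1, 2, 1, and the symbolic link is left dangling once the name it refers to is gone.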

File Protection

File owner/creator should be able to control:
• what can be done
• by whom

Types of access
• Read
• Write
• Execute
• Append
• Delete
• List

Access Lists and Groups

• Mode of access: read, write, execute
• Three classes of users
                            RWX
    a) owner access    7    1 1 1
    b) group access    6    1 1 0
    c) public access   1    0 0 1
• Ask manager to create a group (unique name), say G, and add some users to the group.
• For a particular file (say game) or subdirectory, define an appropriate access.
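The three digits from the table combine into a single nine-bit mode. A small sketch; make_mode and mode_to_string are made-up helper names.

```c
#include <string.h>

/* Combine the per-class digits (owner, group, public) into one mode value. */
unsigned make_mode(unsigned owner, unsigned group, unsigned pub)
{
    return ((owner & 7u) << 6) | ((group & 7u) << 3) | (pub & 7u);
}

/* Render the low nine permission bits in "rwxrwxrwx" style.
 * out must have room for 10 bytes. */
void mode_to_string(unsigned mode, char out[10])
{
    strcpy(out, "---------");
    for (int i = 0; i < 9; i++)
        if (mode & (1u << (8 - i)))     /* owner r is bit 8, public x is bit 0 */
            out[i] = "rwx"[i % 3];
}
```

For the values on the slide, make_mode(7, 6, 1) yields octal 761, which renders as rwxrw---x. On a Unix system the same octal value is what one would pass to chmod for the file game.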

Windows XP Access-control List Management


A Sample UNIX Directory Listing


File System Implementation

• File System Structure
• File System Implementation
• Allocation Methods
• File System Performance

File System Structure

An operating system may allow multiple file systems.

Once the user interface is determined, the file system must be implemented to map the logical file system to the physical secondary-storage devices.

File control block – storage structure that keeps information about a given file (Unix “i-nodes”).
• Ownership, size, permissions, access date info, location of data blocks

Schematic View of Virtual File System

Operating System Concepts – 7 th Edition, Jan 1, 2005 11.33

Silberschatz, Galvin and Gagne ©2005


Layered File System

File system is organized into layers
• Logical File System Layer manages the file-system structure (through directories and FCBs).
• File-Organization Module performs mapping between logical blocks and physical blocks. It also includes free space manager and block allocation manager.
• Basic File System Layer issues generic commands to the appropriate device driver (I/O Control Layer) to read and write physical blocks on the disk

Storage Structure

A disk is a physical storage device that can be used:
• for a single file system (in its entirety)
• for multiple file systems
• in part for file systems, in part for other purposes (e.g. for swap space or unformatted (raw) disk space)

These parts are known as partitions, slices or minidisks.

Storage Structure (Cont.)

Each partition can be either “raw” (containing no file system), or “cooked” (with a file system)

Raw disk contains a large sequential array of logical blocks, without any file-system data
• can be used as swap space
• can be used for special (e.g. database) applications

Storage Structure (Cont.)

• Each partition that contains a file system has a device directory
• The device directory keeps information (name, location, size, type, owner) for files on that partition.

Accessing Disk Sub-system

Disks allow direct access to stored data

Disk access time has two components
• Random access time (positioning) that includes seek time and rotational latency (5-10 ms)
• Transfer time (at a transfer rate of about 10 MB/s)

Compare to the memory access time of 10-100 nanoseconds
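The two components can be combined into a back-of-the-envelope estimate. In this sketch the function name disk_read_ms is made up, and 1 MB is taken as 10^6 bytes.

```c
/* Estimated time in milliseconds to read `bytes` from disk: positioning
 * time plus transfer time at the given rate (1 MB = 10^6 bytes here). */
double disk_read_ms(double positioning_ms, double rate_mb_per_s, double bytes)
{
    double transfer_ms = bytes / (rate_mb_per_s * 1e6) * 1e3;
    return positioning_ms + transfer_ms;
}
```

For one 4096-byte block with 5 ms of positioning at 10 MB/s, the transfer itself takes only about 0.41 ms, so the total of roughly 5.41 ms is dominated by positioning. That is tens of thousands of times slower than a 100 ns memory access.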

Accessing Disk Sub-system

• When a process needs I/O, it issues a system call to the OS
  • input or output
  • from what disk address
  • to what memory address
  • how many sectors
• At any point in time, the disk may have several pending requests that must be scheduled:
  • FCFS
  • SSTF (shortest seek time first)
  • SCAN
  • …
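The difference between FCFS and SSTF shows up in the total head movement. A minimal sketch: the function names are made up, head positions are cylinder numbers, and the SSTF version is limited to 64 pending requests to keep it simple.

```c
#include <stdlib.h>

/* Total head movement (in cylinders) when requests are served in arrival order. */
long fcfs_movement(int head, const int *req, int n)
{
    long total = 0;

    for (int i = 0; i < n; i++) {
        total += labs((long)req[i] - head);
        head = req[i];
    }
    return total;
}

/* Total head movement when the closest pending request is always served next.
 * Handles at most 64 requests; returns -1 beyond that. */
long sstf_movement(int head, const int *req, int n)
{
    int done[64] = {0};
    long total = 0;

    if (n > 64)
        return -1;
    for (int served = 0; served < n; served++) {
        int best = -1;
        for (int i = 0; i < n; i++) {
            if (done[i])
                continue;
            if (best < 0 ||
                labs((long)req[i] - head) < labs((long)req[best] - head))
                best = i;
        }
        total += labs((long)req[best] - head);
        head = req[best];
        done[best] = 1;
    }
    return total;
}
```

For an example queue of requests for cylinders 98, 183, 37, 122, 14, 124, 65, 67 with the head starting at cylinder 53, FCFS moves the head 640 cylinders in total while SSTF moves it 236.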

Implementation of “Open” and “Read”


Figure (a) refers to opening a file.

Figure (b) refers to reading a file.


Allocation Methods

The allocation method refers to how disk blocks are allocated for files:
• Contiguous allocation
• Linked allocation
• Indexed allocation

Contiguous Allocation

• Each file occupies a set of contiguous blocks on the disk.
• Simple – only starting location (block #) and length (number of blocks) are required.

Contiguous Allocation

• Efficient access to multiple blocks of a file
• Both sequential and direct access can be supported.
• A major problem is determining how much space is needed for a new file.
• How to let files grow?
• Finding space for a new file: first-fit and best-fit …
• These algorithms suffer from external fragmentation: free space is broken into multiple chunks.
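First-fit and best-fit differ only in which free hole they pick. A small sketch over an array of free holes; struct hole and the function names are made up for illustration.

```c
#include <stddef.h>

/* A free hole: `length` contiguous free blocks starting at block `start`. */
struct hole {
    long start;
    long length;
};

/* First fit: index of the first hole with at least `need` blocks, or -1. */
int first_fit(const struct hole *holes, size_t n, long need)
{
    for (size_t i = 0; i < n; i++)
        if (holes[i].length >= need)
            return (int)i;
    return -1;
}

/* Best fit: index of the smallest hole that is still large enough, or -1. */
int best_fit(const struct hole *holes, size_t n, long need)
{
    int best = -1;

    for (size_t i = 0; i < n; i++)
        if (holes[i].length >= need &&
            (best < 0 || holes[i].length < holes[best].length))
            best = (int)i;
    return best;
}
```

With holes of 10, 4 and 6 blocks and a request for 5 blocks, first-fit takes the 10-block hole and best-fit takes the 6-block hole. A request for 11 blocks fails even though 20 blocks are free in total, which is external fragmentation at work.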

Extent-Based Systems

Many newer file systems (e.g. Veritas File System) use a modified contiguous allocation scheme

Extent-based file systems allocate disk blocks in extents
• An extent is a contiguous run of disk blocks
  • Extents are allocated for file allocation
  • A file consists of one or more extents.

Linked Allocation

Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.

[Figure: layout of a disk block, showing its pointer field]

Linked Allocation

• Each file is a linked list of disk blocks: blocks may be scattered anywhere on the disk.
• Each block contains a pointer to the next block.
• Each directory entry has a pointer to the first and last disk blocks of the file.

Linked Allocation

• External fragmentation is eliminated.
• The size of a file does not need to be declared at the time of creation.
• However, it can be used effectively only for sequential-access files; it is inefficient for direct-access files.
• Another disadvantage is the space required for the pointers.
• One solution is to collect blocks into multiples (clusters) and to allocate the clusters rather than blocks.
• Another problem of linked allocation is reliability: what will happen if a pointer is lost or damaged?

File-Allocation Table (FAT)

• A variation of the linked allocation method
• A section of the disk at the beginning of each partition is used as the File Allocation Table.
• The table entries give the block number of the next block in the file.
• The scheme can result in a significant number of disk head seeks, unless the FAT is cached.
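Following a file through the table is a simple loop. In this sketch the table is an in-memory array, FAT_EOF is an assumed end-of-chain marker, and fat_chain is a made-up name; real FAT variants use their own reserved values.

```c
#include <stddef.h>

#define FAT_EOF (-1)   /* assumed end-of-chain marker for this sketch */

/* Follow a file's chain through the table. fat[b] holds the number of the
 * block that follows block b. Writes up to `max` block numbers into `out`
 * and returns the chain length, or -1 if the chain leaves the table or is
 * longer than the table itself (a damaged pointer or a cycle). */
int fat_chain(const int *fat, int fat_len, int first, int *out, int max)
{
    int count = 0;

    for (int b = first; b != FAT_EOF; b = fat[b]) {
        if (b < 0 || b >= fat_len || count >= fat_len)
            return -1;
        if (count < max)
            out[count] = b;
        count++;
    }
    return count;
}
```

Reaching the k-th block of a file still takes k table lookups, but with the table cached in memory those lookups cost no disk seeks. The bounds and length checks stand in for the reliability concern: a damaged entry is detected instead of being followed blindly.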

Indexed Allocation

• Indexed allocation supports direct access, without suffering from external fragmentation or size-declaration problems.
• However, wasted space may be a problem.
• How large should the index block be?
  • To reduce the wasted space, we want to keep the index block small
  • If the index block is too small, it will not be able to hold pointers for a large file.
    • Linked scheme
    • Multilevel scheme
    • Combined scheme

[Figure: index table]

Indexed Allocation – Mapping (Cont.)

[Figure: outer-index, index table, file]

Combined Scheme (Unix)

• Keep the first N pointers of the index block in the file’s i-node (FCB).
• The first 12 of these pointers point to direct blocks
• The next three pointers point to indirect blocks
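The combined scheme determines the largest file that can be addressed. The sketch below assumes that the three indirect pointers are one single-, one double- and one triple-indirect pointer, as in classic Unix file systems; the block and pointer sizes used in the example are also assumptions, and max_file_blocks is a made-up name.

```c
#include <stdint.h>

/* Number of data blocks addressable with `direct` direct pointers plus one
 * single-, one double- and one triple-indirect pointer, when each index
 * block holds block_size / ptr_size pointers. */
uint64_t max_file_blocks(uint64_t direct, uint64_t block_size, uint64_t ptr_size)
{
    uint64_t per_block = block_size / ptr_size;

    return direct
         + per_block                           /* single indirect */
         + per_block * per_block               /* double indirect */
         + per_block * per_block * per_block;  /* triple indirect */
}
```

With 12 direct pointers, 4096-byte blocks and 4-byte block pointers, each index block holds 1024 pointers, giving 12 + 1024 + 1024^2 + 1024^3 = 1,074,791,436 blocks, or a little over 4 TiB. Small files need only the direct pointers, so they pay nothing for the extra levels.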

File System Performance

Disk access is the bottleneck for the file system performance

Caching
• Most disk controllers have an on-board cache that can store entire tracks at a time
  • Subsequent requests can be served through the on-board cache
• Most systems maintain a separate section of main memory for a disk cache (block cache, or buffer cache), where blocks are kept under the assumption that they will be re-used in the near future

Caching

A page cache caches pages rather than disk blocks using virtual memory techniques

Memory-mapped I/O uses a page cache

Routine I/O through the file system uses the buffer (disk) cache

Memory-mapped I/O

• Memory-mapped I/O uses the same address bus to address both memory and I/O devices, and the CPU instructions used to access the memory are also used for accessing devices.
• Port-mapped I/O uses a special class of CPU instructions specifically for performing I/O.
• A device's direct memory access (DMA) is a memory-to-device communication method that bypasses the CPU.

Unified Buffer Cache

A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O


File System Performance (Cont.)

LRU is a reasonable block replacement policy

BUT: if a critical block (such as a File Control Block, or i-node) is read into the cache and modified, but not re-written to the disk, a crash will leave the file system in an inconsistent state.
• Critical blocks must be written immediately.

Avoiding inconsistency
• Write-through cache: write every modified block to disk as soon as it has been written

UNIX solution
• The system call sync forces all the modified blocks out onto the disk immediately.
• A program, usually called update, is invoked in the background to call sync every 30 seconds.
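LRU replacement in a block cache can be sketched with a logical clock. This is a toy model: the three-slot capacity, struct block_cache and cache_access are made up for illustration, and no write-back of modified blocks is modelled.

```c
#define CACHE_SLOTS 3   /* assumed cache capacity, in blocks */

struct block_cache {
    int  block[CACHE_SLOTS];       /* cached block numbers */
    long last_used[CACHE_SLOTS];   /* logical time of the most recent access */
    int  used;                     /* number of occupied slots */
    long clock;                    /* incremented on every access */
    long hits, misses;
};

/* Access one block: returns 1 on a cache hit, 0 on a miss. On a miss with
 * a full cache, the least recently used block is replaced. */
int cache_access(struct block_cache *c, int block_no)
{
    int victim = 0;

    c->clock++;
    for (int i = 0; i < c->used; i++) {
        if (c->block[i] == block_no) {
            c->last_used[i] = c->clock;    /* refresh on every hit */
            c->hits++;
            return 1;
        }
        if (c->last_used[i] < c->last_used[victim])
            victim = i;                    /* oldest slot seen so far */
    }
    if (c->used < CACHE_SLOTS)
        victim = c->used++;                /* free slot: nothing to evict */
    c->block[victim] = block_no;
    c->last_used[victim] = c->clock;
    c->misses++;
    return 0;
}
```

Starting from a zero-initialized cache, the access sequence 1, 2, 3, 1, 4, 2 gives one hit and five misses: block 4 evicts block 2 (the least recently used at that point), so the later access to block 2 misses again.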

File System Performance

Block read-ahead: when reading block k into the cache in memory, also read block k+1

Reduce disk arm motion through
• Putting blocks that are likely to be accessed in sequence close to each other
• Disk scheduling algorithms that serve pending disk access requests in an order that reduces the delay

Distributed File Sharing

Sharing of files on multi-user systems is desirable

On distributed systems, files may be shared across a network
• Manually via programs like FTP
• Automatically, seamlessly using distributed file systems
• Semi-automatically via the world wide web

Network File System (NFS) is a common distributed file-sharing method

File Sharing – Remote File Systems

• Client-server model allows clients to mount remote file systems from servers
  • Server can serve multiple clients
  • Client and user-on-client identification is insecure or complicated
  • NFS is standard UNIX client-server file sharing protocol
  • CIFS is standard Windows protocol
  • Standard operating system file calls are translated into remote calls
• Distributed Information Systems (distributed naming services) such as LDAP, DNS, NIS, Active Directory implement unified access to information needed for remote computing

File Sharing – Failure Modes

• Remote file systems add new failure modes, due to network failure, server failure
• Recovery from failure can involve state information about status of each remote request
• Stateless protocols such as NFS include all information in each request, allowing easy recovery but less security

File Sharing – Consistency Semantics

• Consistency semantics specify how multiple users are to access a shared file simultaneously
  • Similar to process synchronization algorithms
    • Tend to be less complex due to disk I/O and network latency (for remote file systems)
  • Andrew File System (AFS) implemented complex remote file sharing semantics
  • Unix file system (UFS) implements:
    • Writes to an open file visible immediately to other users of the same open file
    • Sharing file pointer to allow multiple users to read and write concurrently
  • AFS has session semantics
    • Writes only visible to sessions starting after the file is closed

The Sun Network File System (NFS)

• An implementation and a specification of a software system for accessing remote files across LANs (or WANs)
• The implementation is part of the Solaris and SunOS operating systems running on Sun workstations using an unreliable datagram protocol (UDP/IP protocol and Ethernet)

NFS (Cont.)

• Interconnected workstations viewed as a set of independent machines with independent file systems, which allows sharing among these file systems in a transparent manner
  • A remote directory is mounted over a local file system directory
    • The mounted directory looks like an integral subtree of the local file system, replacing the subtree descending from the local directory
  • Specification of the remote directory for the mount operation is nontransparent; the host name of the remote directory has to be provided
    • Files in the remote directory can then be accessed in a transparent manner
  • Subject to access-rights accreditation, potentially any file system (or directory within a file system) can be mounted remotely on top of any local directory

NFS (Cont.)

• NFS is designed to operate in a heterogeneous environment of different machines, operating systems, and network architectures; the NFS specification is independent of these media
• This independence is achieved through the use of RPC primitives built on top of an External Data Representation (XDR) protocol used between two implementation-independent interfaces
• The NFS specification distinguishes between the services provided by a mount mechanism and the actual remote-file-access services

Three Independent File Systems


Mounting in NFS

[Figure: Mounts: S1:/usr/shared over U:/usr/local/]

[Figure: Cascading mounts: S2:/usr/dir2 over U:/usr/local/dir1]

NFS Mount Protocol

• Establishes initial logical connection between server and client
• Mount operation includes name of remote directory to be mounted and name of server machine storing it
  • Mount request is mapped to corresponding RPC and forwarded to mount server running on server machine
  • Export list – specifies local file systems that server exports for mounting, along with names of machines that are permitted to mount them
• Following a mount request that conforms to its export list, the server returns a file handle, a key for further accesses
• File handle – a file-system identifier, and an inode number to identify the mounted directory within the exported file system
• The mount operation changes only the user’s view and does not affect the server side

NFS Protocol

• Provides a set of remote procedure calls for remote file operations. The procedures support the following operations:
  • searching for a file within a directory
  • reading a set of directory entries
  • manipulating links and directories
  • accessing file attributes
  • reading and writing files
• NFS servers are stateless; each request has to provide a full set of arguments (NFS V4 is just coming available – very different, stateful)
• Modified data must be committed to the server’s disk before results are returned to the client (lose advantages of caching)
• The NFS protocol does not provide concurrency-control mechanisms

Three Major Layers of NFS Architecture

• UNIX file-system interface (based on the open, read, write, and close calls, and file descriptors)
• Virtual File System (VFS) layer – distinguishes local files from remote ones, and local files are further distinguished according to their file-system types
  • The VFS activates file-system-specific operations to handle local requests according to their file-system types
  • Calls the NFS protocol procedures for remote requests
• NFS service layer – bottom layer of the architecture
  • Implements the NFS protocol

Schematic View of NFS Architecture


NFS Path-Name Translation

• Performed by breaking the path into component names and performing a separate NFS lookup call for every pair of component name and directory vnode
• To make lookup faster, a directory name lookup cache on the client’s side holds the vnodes for remote directory names
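Component-by-component translation can be sketched as a loop over the path. Nothing here is the real NFS client code: integer handles stand in for vnodes, lookup_fn is a hypothetical callback standing in for one NFS lookup call, and toy_lookup is a made-up two-entry directory tree used only to exercise the loop.

```c
#include <string.h>

#define MAX_COMPONENT 256

/* Hypothetical lookup callback: given a directory handle and one component
 * name, return the handle of that entry, or -1 if it does not exist. */
typedef int (*lookup_fn)(int dir_handle, const char *name);

/* Resolve `path` one component at a time, starting from `root`.
 * Returns the final handle, or -1 if a component is missing or too long.
 * *calls receives the number of lookups performed. */
int translate_path(const char *path, int root, lookup_fn lookup, int *calls)
{
    int handle = root;

    *calls = 0;
    while (*path != '\0') {
        char name[MAX_COMPONENT];
        size_t len;

        while (*path == '/')               /* skip separators */
            path++;
        len = strcspn(path, "/");          /* length of the next component */
        if (len == 0)
            break;                         /* trailing slash or empty path */
        if (len >= sizeof name)
            return -1;
        memcpy(name, path, len);
        name[len] = '\0';

        handle = lookup(handle, name);     /* one lookup per (directory, name) pair */
        (*calls)++;
        if (handle < 0)
            return -1;
        path += len;
    }
    return handle;
}

/* Toy directory tree: handle 0 is the root, "usr" is 1, "usr/local" is 2. */
int toy_lookup(int dir, const char *name)
{
    if (dir == 0 && strcmp(name, "usr") == 0)
        return 1;
    if (dir == 1 && strcmp(name, "local") == 0)
        return 2;
    return -1;
}
```

Resolving /usr/local against the toy tree takes two lookups. A client-side name cache would sit in front of the lookup callback and answer repeated (directory, name) pairs without contacting the server.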

NFS Remote Operations

• Nearly one-to-one correspondence between regular UNIX system calls and the NFS protocol RPCs (except opening and closing files)
• NFS adheres to the remote-service paradigm, but employs buffering and caching techniques for the sake of performance
• File-blocks cache – when a file is opened, the kernel checks with the remote server whether to fetch or revalidate the cached attributes
  • Cached file blocks are used only if the corresponding cached attributes are up to date
• File-attribute cache – the attribute cache is updated whenever new attributes arrive from the server
• Clients do not free delayed-write blocks until the server confirms that the data have been written to disk