Chapter One - Bucks County Community College

Chapter Four
UNIX File Processing
Lesson A
Extracting Information from Files
2
Objectives
Explain the UNIX approach to file
processing
Use basic file manipulation commands
Extract characters and fields from a file
using the cut command
3
Objectives
Rearrange fields inside a record using the
paste command
Merge files using the sort command
Create a new file by combining cut, paste,
and sort
4
UNIX Approach to
File Processing
Based on the approach that files should be
treated as nothing more than character
sequences
Because you can directly access each
character, you can perform a range of
editing tasks – this offers flexibility in terms
of file manipulation
5
Understanding UNIX File
Types
Regular files, also known as ordinary files
– Contain information that you maintain and manipulate,
and include ASCII and binary files
Directories
– System files for maintaining file system structure
Special files
– Character special files relate to serial I/O devices
and communicate one character at a time
– Block special files relate to devices such as disks
and communicate using blocks of data
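The file types above can be told apart from the command line. The following is a minimal sketch, assuming a system where /dev/null exists as a character special file (true on Linux and most UNIX variants); regular.txt and subdir are hypothetical names created only for the demonstration.

```shell
# Work in a scratch directory so nothing important is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "text" > regular.txt
mkdir subdir

# The first character of each ls -l line shows the file type:
# '-' regular file, 'd' directory, 'c' character special, 'b' block special.
ls -ld regular.txt subdir /dev/null

# The test operators check the same thing inside scripts.
[ -f regular.txt ] && echo "regular.txt is a regular file"
[ -d subdir ] && echo "subdir is a directory"
[ -c /dev/null ] && echo "/dev/null is a character special file"
```

Block special files (tested with -b) are usually disk devices whose names vary from system to system, so none is named here.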
6
File Structures
Files can be structured in many ways
depending on the kind of data they store
UNIX stores data, such as letters and
product records, as flat ASCII files
Three kinds of regular files are
– Unstructured ASCII characters
– Unstructured ASCII records
– Unstructured ASCII trees
8
Processing Files
When you run a UNIX command, it
receives input from a standard input device
(e.g., keyboard) and sends output to a
standard output device (e.g., monitor)
System administrators and programmers
refer to standard input as stdin, standard
output as stdout
A third standard device is called standard
error, or stderr. When a command detects an
error, it sends the error message to stderr,
which defaults to the monitor
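Although stdout and stderr both appear on the monitor by default, they are separate streams. The sketch below illustrates this with a hypothetical shell function named report, which writes one line to each stream; the function and its messages are invented for the example.

```shell
# A hypothetical function that writes one line to each output stream.
report() {
    echo "result: ok"          # goes to stdout
    echo "warning: demo" >&2   # goes to stderr
}

# Command substitution captures stdout only; stderr still reaches the screen.
captured=$(report)
echo "captured: $captured"

# Discarding stderr leaves only the stdout line visible.
report 2> /dev/null
```

Because the two streams are independent, normal output can be saved or piped onward while error messages remain visible.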
9
Using Input and Error
Redirection
You can use redirection operators to retrieve
input from something other than the standard
input device and send output to something other
than the standard output device
Examples of redirection:
– Redirect the ls command output to a file, instead of to
the monitor (or screen)
– Redirect a program that receives input from the
keyboard to receive input from a file instead
– Redirect error messages to a file, instead of to the
screen (the default)
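The three redirection examples above might look like this at the shell prompt. This is a sketch, not the textbook's own exercise: the file names (alpha.txt, listing.txt, count.txt, errors.txt, no_such_file) are placeholders chosen for illustration.

```shell
# Work in a scratch directory so nothing important is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch alpha.txt beta.txt

# 1. Send the ls output to a file instead of the screen.
ls > listing.txt

# 2. Make a command read from a file instead of the keyboard.
wc -l < listing.txt > count.txt

# 3. Send error messages to a file instead of the screen.
#    ls fails here on purpose; "|| true" keeps a strict script running.
ls no_such_file 2> errors.txt || true

cat listing.txt
```

The operators are > for standard output, < for standard input, and 2> for standard error. Note that listing.txt appears in its own listing, because the shell creates the output file before ls runs.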
10
Using Input and Error
Redirection
Create a file by
typing in all the
commands, or by
redirecting the cat
command output
to a file
11
Manipulating Files
When you manipulate files, you work with the
files themselves, as well as their contents
Create files using output redirection
– cat command - concatenate text via output
redirection
– touch command - used to create empty files
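A short sketch of both ways to create a file follows. The here-document stands in for typing lines at the keyboard and ending input with Ctrl-D; products.txt and empty.txt are hypothetical file names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# cat with output redirection: lines read from stdin go into the new file.
cat > products.txt <<'EOF'
hammer
wrench
EOF

# touch creates an empty file when the name does not exist yet.
touch empty.txt

ls -l products.txt empty.txt
```

If empty.txt already existed, touch would leave its contents alone and only update its time stamps.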
12
Manipulating Files
Delete files when they are no longer needed
– rm command - permanently removes a file; use the
rmdir command to remove an empty directory
– The -r option of the rm command will remove a
directory and everything it contains
Copy files as a means of back-up or as a means
to assist with new file creation
– cp command - copies the file(s) specified by the
source path to the location specified by the
destination path
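The delete and copy commands above can be tried safely in a scratch directory. The sketch below uses made-up names (report.txt, report.bak, archive) and is only one possible sequence.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "original data" > report.txt
mkdir -p archive/old
echo "stale" > archive/old/notes.txt

# cp: copy the source file to the destination path (a backup here).
cp report.txt report.bak

# rm: remove a single file permanently (there is no undo).
rm report.txt

# rm -r: remove a directory and everything beneath it.
rm -r archive

ls
```

Copying before deleting, as done here, is a cheap safeguard because rm offers no way to recover a file.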
13
Manipulating Files
Moving a file in order to change the
directory that contains it
– mv command - moves a file out of one directory
and places it in another
Finding a file helps you locate it in the
directory structure
– find command - searches for the file that has
the name you specify
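Moving and then locating a file can be sketched as follows, assuming hypothetical directories inbox and projects/2024 and a file named memo.txt.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir -p inbox projects/2024
echo "draft" > inbox/memo.txt

# mv: the file leaves inbox and appears in projects/2024.
mv inbox/memo.txt projects/2024/

# find: search from the current directory (.) downward for a file by name.
find . -name memo.txt
```

The first argument to find is the directory where the search starts; every subdirectory below it is searched as well.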
15
Manipulating Files
Combining files using output redirection
– cat command - concatenate text of two different files
via output redirection
– paste command - joins text of different files in side by
side fashion
Extracting fields of a file using output redirection
– cut command - extracts specific columns or fields
from a file
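The difference between cat, paste, and cut is easiest to see on two small files. This is an illustrative sketch; names.txt and prices.txt and their contents are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'hammer\nwrench\n' > names.txt
printf '9.99\n14.50\n' > prices.txt

# cat: place one file after the other.
cat names.txt prices.txt > stacked.txt

# paste: join corresponding lines side by side (tab-separated by default).
paste names.txt prices.txt > catalog.txt

# cut -f: pull out field 2 (tab is the default field delimiter).
cut -f2 catalog.txt > only_prices.txt

# cut -c: pull out character positions 1 through 3 of each line.
cut -c1-3 names.txt > prefixes.txt

cat catalog.txt
```

Here cat produces four lines, paste produces two lines of two fields each, and cut recovers the price column from the pasted file. For files that use another separator, cut accepts -d to name the delimiter.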
17
Manipulating Files
Re-arranging the contents of a file
– sort command - sorts a file’s contents
alphabetically or numerically
– The sort command offers many options:
You can sort the contents of a file and redirect the
output to another file
Utilizing a sort key which provides the option of
sorting on a field position within each line
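A sort key can be sketched on a small colon-separated file. The file stock.txt and its name:quantity layout are assumptions made for this example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Colon-separated records: name:quantity
printf 'wrench:12\nhammer:100\npliers:7\n' > stock.txt

# Alphabetical sort of whole lines, redirected to a new file.
sort stock.txt > by_name.txt

# Numeric sort on field 2 only, using ':' as the field separator.
sort -t: -k2,2n stock.txt > by_quantity.txt

cat by_name.txt
cat by_quantity.txt
```

The n modifier matters: without it, field 2 would be compared as text and 100 would sort before 12 and 7.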
19
Lesson B
Assembling Extracted Information
20
Objectives
Create a script file
Use the join command to link files using a
common field
Use the awk command to create a
professional-looking report
21
Using Script Files
UNIX users create shell script files to contain
commands that can be run sequentially as a set
– this helps with the issues of command
automation and re-use of command actions
UNIX users use the vi editor to create script files,
then make the script executable using the
chmod command with the x (execute) permission
22
Using Script Files
Type out the script
and then make it
executable using
the chmod
command.
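Creating and running a script might look like the sketch below. A here-document replaces the interactive vi session so the example can run unattended; the script name show_info.sh and its contents are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a two-command script; in class this text would be typed in vi.
cat > show_info.sh <<'EOF'
#!/bin/sh
echo "Files in this directory:"
ls
EOF

# Add the execute permission, then run the script as a program.
chmod +x show_info.sh
./show_info.sh
```

The ./ prefix tells the shell to look for the script in the current directory, which is usually not on the command search path.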
23
Using the Join Command
The join command is used in relational database
processing
Relational databases consider files as tables
and records as rows
Relational databases also consider fields as
columns that can be joined to create new
records
The UNIX join command lets you extract
information from files sharing a common field
25
Using the Join Command to
Create the Vendor Report
Use the join
command to
create reports
showing the
relationship
between two files
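A minimal join sketch follows. The two files and their layouts (vendor_id:vendor_name and vendor_id:product) are invented for illustration and are not the textbook's vendor data; the one real requirement is that both inputs are sorted on the common field.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data, already sorted on the first field (the vendor ID).
printf '100:Acme Tools\n200:Best Hardware\n' > vendors.txt
printf '100:hammer\n100:wrench\n200:pliers\n' > products.txt

# join matches lines whose first field is equal in both files.
# -t: sets ':' as the field separator for input and output.
join -t: vendors.txt products.txt > vendor_report.txt

cat vendor_report.txt
```

Each output line holds the common field once, followed by the remaining fields from the first file and then from the second. Unsorted input should be passed through sort first.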
26
A Brief Introduction to the
Awk Program
Awk, a pattern-scanning and processing
language, helps to produce professional-looking reports
The awk command lets you do the same
things as the cat command (in conjunction
with the join command), but more quickly
and easily
27
A Brief Introduction to the
Awk Program
Awk uses a print
formatting function
(printf) from the C
programming
language to
achieve a more
professional-looking report
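The effect of awk's printf can be sketched on a small price list. The file prices.txt, its name:price layout, and the column widths are assumptions chosen for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'hammer:9.99\nwrench:14.5\n' > prices.txt

# -F: splits each line on ':'.
# %-10s left-aligns the name in 10 columns;
# %8.2f right-aligns the price in 8 columns with two decimal places.
awk -F: '{ printf "%-10s %8.2f\n", $1, $2 }' prices.txt > report.txt

cat report.txt
```

Fixed column widths keep the columns aligned regardless of how long each name is, which is what gives the report its tidy look.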
28
Using the awk Command to
Refine the Vendor Report
To refine and automate the vendor report,
create a shell script that includes only the
awk command, not a series of separate
commands. To have awk perform the
automation properly, redirect its input to
come from a disk file and not from the
keyboard.
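One way to package a single awk command in a script and feed it from a disk file is sketched below. The script name vendor_report.sh, the data file, and its vendor_id:vendor_name:product layout are hypothetical stand-ins for the textbook's vendor files.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical joined data: vendor_id:vendor_name:product
printf '100:Acme Tools:hammer\n200:Best Hardware:pliers\n' > vendor_report.txt

# The script holds one awk command; BEGIN prints the header once.
cat > vendor_report.sh <<'EOF'
#!/bin/sh
awk -F: '
BEGIN { printf "%-16s %s\n", "Vendor", "Product" }
      { printf "%-16s %s\n", $2, $3 }
'
EOF
chmod +x vendor_report.sh

# Redirect the script's input so awk reads the disk file, not the keyboard.
./vendor_report.sh < vendor_report.txt
```

Because no file name is given to awk inside the script, it reads standard input, so the same script can format any file redirected into it.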
29
Using the awk Command to
Refine the Vendor Report
Awk has many
features that let
you manage
your report
output to your
specification
30
Chapter Summary
UNIX supports regular files, directories, and
character and block special files
File structures depend on the data being stored;
the three kinds of regular files are unstructured
ASCII characters, records, and trees
When running commands, UNIX receives input from the
standard input device (keyboard), also known
as stdin, and sends output to the standard
output device (monitor), also known as stdout.
Another standard device, stderr, receives
error messages and defaults to the monitor
31
Chapter Summary
The touch command updates a file’s time and
date stamps and creates empty files
The rmdir command removes empty directories
The cut command extracts specific columns or
fields from a file
To combine two or more files, use the paste
command
Use the sort command to sort a file’s contents
alphabetically or numerically
32
Chapter Summary
To automate command processing, include
commands in a script file that you can later
execute as a program
Use the join command to extract data from two
files sharing a common field and use this field
to join the two files
Awk is a pattern-scanning and processing
language useful for creating a formatted report
with a professional look