Engr 691-10: Special Topics in Engineering Science


Csci 490 / Engr 596
Special Topics / Special Projects
Software Design and Scala Programming
Spring Semester 2010
Lecture Notes
Pipes and Filters
Architectural Pattern
Created: 14 September 2004, Revised: 13 April 2010
Definition
The Pipes and Filters architectural pattern
provides a structure for systems that process a
stream of data.
Each processing step is encapsulated in a filter
component.
Data are passed through pipes between adjacent
filters.
Recombining filters allows you to build families of related systems.
Context
Programs that must process streams of data
Problem
Build a system that
Must be built by several developers
Decomposes naturally into several
independent processing steps
For which the requirements are likely to
change
Forces
Possible to substitute new filters for existing ones or
recombine steps into different communication structure
Components implementing small processing steps easier
to reuse than components implementing large steps
Two steps share no information if they are not adjacent
Different sources of input data exist
Possible to display or store final results of computation
in various ways
If user stores intermediate results in files, then likelihood
of errors increases and the file system becomes cluttered
Possible for parallel execution of steps
Solution
Divide task into a sequence of processing steps
Implement each step by a filter program that
consumes data from its input and produces data on
its output incrementally
Connect output of one step as input to
succeeding step by means of pipe
Enable filters to execute concurrently
Connect input of sequence to some data source
Connect output of sequence to some data sink
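A minimal sketch of this solution in Scala, assuming a word-processing task (the object and filter names are illustrative only): each filter consumes from its input and produces output incrementally, and composing the filters forms the pipeline from data source to data sink. For brevity this sketch uses the pull-style passive variant running in the caller's thread rather than concurrent filters.

import scala.io.Source

object WordPipeline {
  // Each filter consumes items from its input stream and produces items
  // on its output stream incrementally (Iterator gives element-by-element flow).
  def toWords(lines: Iterator[String]): Iterator[String] =
    lines.flatMap(_.split("""\W+""").iterator).filter(_.nonEmpty)

  def toLowerCase(words: Iterator[String]): Iterator[String] =
    words.map(_.toLowerCase)

  def main(args: Array[String]): Unit = {
    val source = Source.fromFile(args(0))                   // data source
    val pipeline = toLowerCase(toWords(source.getLines()))  // pipes are the nested calls
    pipeline.foreach(println)                               // data sink: standard output
    source.close()
  }
}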
Structure
[Diagram: Data Source → pipe → Filter1 → pipe → Filter2 → pipe → Data Sink]
Structure (cont.)
Filter
 Processing units of the pipeline
enrich data by computing new information from input data and adding it to
output data stream
refine data by concentrating or extracting information from input data
stream and passing only that information to output stream
transform input data to new form before passing it to output stream
do some combination of enrichment, refinement, and transformation
 Active filter
separate process or thread
pulls data from the input data stream
pushes the transformed data onto the output data stream
 Passive filter
called as a function, a pull of output data from the filter
called as a procedure, a push of input data into the filter
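A hedged sketch of an active filter in Scala (the Pipe alias, the end-of-stream sentinel, and the uppercase transformation are assumptions made for illustration): the filter runs on its own thread, pulls items from its input pipe, and pushes transformed items onto its output pipe.

import java.util.concurrent.{ArrayBlockingQueue, BlockingQueue}

object ActiveFilterSketch {
  type Pipe = BlockingQueue[String]     // a pipe modeled as a bounded blocking queue
  val EndOfStream = "<EOS>"             // assumed sentinel marking the end of the stream

  // Active filter: a separate thread that pulls from its input pipe,
  // transforms each item, and pushes the result onto its output pipe.
  class UpperCaseFilter(in: Pipe, out: Pipe) extends Thread {
    override def run(): Unit = {
      var item = in.take()
      while (item != EndOfStream) {
        out.put(item.toUpperCase)       // the processing step
        item = in.take()
      }
      out.put(EndOfStream)              // propagate end of stream downstream
    }
  }

  def main(args: Array[String]): Unit = {
    val pipeIn, pipeOut = new ArrayBlockingQueue[String](16)   // buffer size 16
    new UpperCaseFilter(pipeIn, pipeOut).start()
    List("pipes", "and", "filters", EndOfStream).foreach(pipeIn.put(_))
    Iterator.continually(pipeOut.take()).takeWhile(_ != EndOfStream).foreach(println)
  }
}

A passive filter, by contrast, is simply called by a neighboring element, as sketched under Implementation below.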
Structure (cont.)
Pipes
 Connectors between data source and first filter, between
filters, and between last filter and data sink
Data source
 Entity (e.g., a file or input device) that provides input data
to system.
 Either actively push data down pipeline or passively supply
data when requested
Data sink
 Entity that gathers data at end of pipeline
 Either actively pull data from last filter element or
passively respond when requested by last filter element
Implementation
Divide functionality into sequence of processing
steps
 Each step depends upon outputs of previous step and
becomes filter in system
Define type and format of data to be passed along
each pipe
Determine how to implement each pipe connection
 Pipe connecting to passive filter might be implemented as
direct call of adjacent filter
push connection as call of downstream filter as procedure
pull connection as call of upstream filter as function
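A small Scala sketch of these two connection styles for passive filters (the trait and class names are hypothetical): in a pull connection the downstream element calls the upstream filter as a function; in a push connection the upstream element calls the downstream filter as a procedure.

object PassiveConnections {
  // Pull connection: the pipe is a function call on the upstream filter.
  trait PullFilter[A] { def next(): Option[A] }          // None signals end of stream

  class UpperCasePull(upstream: PullFilter[String]) extends PullFilter[String] {
    def next(): Option[String] = upstream.next().map(_.toUpperCase)
  }

  // Push connection: the pipe is a procedure call on the downstream filter.
  trait PushFilter[A] { def put(item: A): Unit }

  class UpperCasePush(downstream: PushFilter[String]) extends PushFilter[String] {
    def put(item: String): Unit = downstream.put(item.toUpperCase)
  }
}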
Implementation (cont.)
Design and implement filters
 Active filter needs to run with its own thread of control
heavyweight operating system process
 having its own address space
lightweight thread
 sharing an address space with other threads
 Passive filter does not require a separate thread of control
 Selection of the size of the buffer
large buffers use up much available memory but involve less
synchronization and context-switching overhead
small buffers conserve memory at the cost of increased overhead
 Different processing options
Implementation (cont.)
Design for robust handling of errors
 Example: Unix programs use the stderr channel to report errors
 Recover from errors
discard bad input and resynchronize at some well-defined point
later in input data.
back up input to some well-defined point and restart processing,
using different processing method for bad data
Configure pipes-and-filters system and initiate
processing
 Use standardized main program to create, connect, and
initiate needed pipe and filter elements of pipeline
 Use end-user tool to create, connect, and initiate needed
pipe and filter elements of pipeline
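A sketch of such a standardized main program in Scala (the filter names and the choice of standard input and output are assumptions): the pipeline configuration is just a list of filters that the main program connects in order between the data source and the data sink.

object PipelineMain {
  type Filter = Iterator[String] => Iterator[String]     // assumed filter signature

  // Hypothetical filters; any transformation with this signature can be plugged in.
  val splitWords: Filter = lines => lines.flatMap(_.split("""\s+""").iterator)
  val dropEmpty: Filter = words => words.filter(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    val source: Iterator[String] = scala.io.Source.stdin.getLines()   // data source
    val filters: List[Filter] = List(splitWords, dropEmpty)           // pipeline configuration
    val pipeline = filters.foldLeft(source)((stream, f) => f(stream)) // create and connect
    pipeline.foreach(println)                                          // data sink
  }
}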
Example
A retargetable compiler for a programming language
[Diagram: Program text → Source → stream of characters → Lexical Analyzer → lexical tokens → Parser → AST → Semantic Analyzer → augmented AST → …]
Source element reads program text from file (or sequence of files) as
stream of characters
Lexical analyzer converts stream of characters into stream of lexical
tokens for language – keywords, identifiers, operators, etc.
Parser recognizes sequence of tokens that conforms to language
grammar and translates sequence to abstract syntax tree
Semantic analyzer reads abstract syntax tree and writes appropriately
augmented abstract syntax tree
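These front-end phases could be sketched in Scala as filters composed into a pipeline (the Token, Ast, and AugmentedAst types and the phase bodies are placeholders, not a real compiler):

object CompilerFrontEnd {
  // Placeholder representations for the data flowing between phases.
  type Token = String
  case class Ast(nodes: List[Token])
  case class AugmentedAst(ast: Ast, symbols: Map[String, String])

  // Each phase is a filter from one representation to the next.
  val lexicalAnalyzer: Iterator[Char] => Iterator[Token] =
    chars => chars.mkString.split("""\s+""").iterator.filter(_.nonEmpty)

  val parser: Iterator[Token] => Ast =
    tokens => Ast(tokens.toList)

  val semanticAnalyzer: Ast => AugmentedAst =
    ast => AugmentedAst(ast, Map.empty)

  // The front end is the composition of the phases (pipes are function calls here).
  val frontEnd: Iterator[Char] => AugmentedAst =
    lexicalAnalyzer andThen parser andThen semanticAnalyzer
}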
Example (cont.)
[Diagram: … augmented AST → Global optimizer → optimized AST → Intermediate code generator → instruction sequence for VM → Local optimizer → efficient sequence → …]
Global optimizer reads augmented syntax tree and outputs
an equivalent tree that is more efficient in space and time usage
Intermediate code generator translates augmented syntax
tree to sequence of instructions for virtual machine
Local optimizer converts sequence of intermediate code
instructions into more efficient sequence
Example (cont.)
[Diagram: … efficient sequence → Backend code generator → instruction sequence for RM → Assembler → relocatable binary module → Linker → single executable module → Sink → File]
Backend code generator translates sequence of virtual machine
instructions into sequence for some real platform
 for some hardware processor augmented by operating system and runtime library calls
Assembler needed to translate symbolic instruction sequence into
relocatable binary module if previous step generated assembly code
Linker needed to bind separate modules with library modules to form
single executable (i.e., object code) module if previous steps generated
sequence of binary modules
Sink element outputs resulting binary module into file
Example (cont.)
[Diagram: complete pipeline: Program text → Source → stream of characters → Lexical Analyzer → lexical tokens → Parser → AST → Semantic Analyzer → augmented AST → Global optimizer → optimized AST → Intermediate code generator → instruction sequence for VM → Local optimizer → efficient sequence → Backend code generator → instruction sequence for RM → Assembler → relocatable binary module → Linker → single executable module → Sink → File]
Example (cont.)
Pipeline supports different variations
If source code preprocessing is to be supported
 Preprocessor filter inserted in front of lexical analyzer
If language to be interpreted rather than translated into object code
 Backend code generator (and all components after it) replaced by
interpreter for virtual machine
If compiler to be retargeted to different platform
 Backend code generator (and assembler and linker) for new platform
substituted for old one
If compiler to be modified to support a different language with
same lexical structure
 Parser, semantic analyzer, global optimizer, and intermediate code
generator replaced
If a load-and-go compiler desired
 File-output sink replaced by loader that loads executable module into
main memory and starts module executing
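A sketch of how such a substitution might look in Scala (all names are hypothetical): because the alternative back ends share a filter interface, retargeting or interpreting amounts to plugging a different element into the tail of the pipeline.

object BackendVariants {
  type VmCode = List[String]            // placeholder for the virtual machine instruction sequence

  // Two interchangeable pipeline tails with the same interface.
  val backendCodeGenerator: VmCode => Unit =
    code => code.foreach(instr => println(s"emit native code for: $instr"))

  val vmInterpreter: VmCode => Unit =
    code => code.foreach(instr => println(s"interpret: $instr"))

  // Choosing a variant is just substituting one filter for another.
  def pipelineTail(interpret: Boolean): VmCode => Unit =
    if (interpret) vmInterpreter else backendCodeGenerator
}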
Example (cont.)
To make the system more efficient or convenient
System of filters may directly share global state
Combine adjacent active filters and replace
pipe by an upstream function call or
downstream procedure call
Make new information available in a filter
Example: symbol table information for runtime
debugging tools
Variants
A generalization allows filters with multiple input or output pipes
to be connected in any directed graph structure
Restrict to directed acyclic graph structures
 tee filter in Unix
provides mechanism to split stream into two streams, named pipes provide
mechanisms for constructing network connections, and filters with
multiple input files/streams provide mechanisms for joining two streams
# create two named pipes
mknod pipeA p
mknod pipeB p
# set up side chain computation
#(running in the background)
cat pipeA >pipeB &
# set up main pipeline computation
cat filename | tr -cs "[:alpha:]" "[\n*256]" \
| tr "[:upper:]" "[:lower:]" | sort | tee pipeA | uniq \
| comm -13 - pipeB | uniq
Consequences
Benefits
Intermediate files unnecessary, but possible.
Flexibility by filter exchange.
Flexibility by recombination.
Reuse of filter elements.
Rapid prototyping of pipelines.
Efficiency by parallel processing.
Consequences
Liabilities
Sharing state information is expensive or inflexible
Efficiency gain by parallel processing is often an
illusion
Data transformation overhead
Error handling
References
Frank Buschmann, Regine Meunier, Hans
Rohnert, Peter Sommerlad, and Michael Stal.
Pattern-Oriented Software Architecture: A
System of Patterns, Wiley, 1996.
Mary Shaw and David Garlan. Software
Architecture: Perspectives on an Emerging
Discipline, Prentice-Hall, 1996.
Acknowledgement
This work was supported by a grant from
Acxiom Corporation titled “The Acxiom
Laboratory for Software Architecture and
Component Engineering (ALSACE).”