
CSCE 431:
Scripting Languages and
Rapid Prototyping
What are Scripting Languages?
tr.v. script·ed, script·ing, scripts
1. To prepare (a text) for filming or broadcasting.
2. To orchestrate (behavior or an event, for
example) as if writing a script: “the brilliant,
charming, judicial moderate scripted by his White
House fans” (Ellen Goodman).
- http://www.thefreedictionary.com/scripting
CSCE 431 Scripting Languages & Rapid Prototyping
Scripting Language (cont.)
• Programming language that supports scripts - programs written for
a special run-time environment that can interpret (rather than
compile) and automate the execution of tasks which could
alternatively be executed one-by-one by a human operator
• Can be viewed as a domain-specific language for a particular
environment; in the case of scripting an application, this is also
known as an extension language
• High-level language, ~10x less code than system programming
language
• Often implies “small” (≤ few thousand lines of code)
• Can be domain-specific, e.g. Awk, Sed string processing
languages
• Can be environment-specific, e.g. UNIX Shell, Visual Basic for
Applications
- http://en.wikipedia.org/wiki/Scripting_language
Scripting Languages
• Used for one-time tasks
• Customizing administrative tasks
• Simple or repetitive tasks
• Example: run an application with a sequence of
different parameters
• Extension language
• LISP in EMACS an early example
• Controls execution of an application
• Programmatic interface to graphical application
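A minimal sketch of the "run an application with a sequence of different parameters" task. The application here is only a stand-in (the Python interpreter squaring its argument), and `run_with_params` is a hypothetical helper name; a real script would name its own tool.

```python
import subprocess
import sys

# Stand-in "application": the Python interpreter running a one-line program
# that squares its command-line argument. This is an assumption for illustration.
APP = [sys.executable, "-c", "import sys; print(int(sys.argv[1]) ** 2)"]

def run_with_params(params):
    """Run the application once per parameter and collect its output."""
    results = {}
    for p in params:
        done = subprocess.run(APP + [str(p)], capture_output=True,
                              text=True, check=True)
        results[p] = done.stdout.strip()
    return results

if __name__ == "__main__":
    for param, output in run_with_params([1, 2, 3]).items():
        print(param, "->", output)
```

The script itself does almost no work; the time goes to the component it drives.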
Classes of Scripting
• Web – JavaScript (browser), PHP (server),…
• Extension language – Lua, Tcl, VBA,…
• GUI – JavaFX, Tcl/Tk,…
• String processing – Awk, Sed,…
• OS scripting – Shell, Cshell,…
• General languages – Perl, Python, Ruby,…
• Overlap among these
System Programming Languages
vs. Scripting Languages
• System programming languages
• C, C++, Java,…
• Designed for building data structures and
algorithms from scratch
• Concerns: efficiency, expressiveness, strong typing,
design support, read-only code, compiled
• Scripting languages
• Tcl, Perl, Python, Ruby, Awk, Sed, Shell, Lua,…
• Designed for gluing components together
• Concerns: rapid prototyping, typeless, higher level
programming, interpreted, code-on-fly
System vs. Scripting
Language Uses
• System programming languages
• Component (e.g. library) creation
• Machine interfaces (e.g. device drivers)
• Scripting languages
• Component gluing
• Use component as a primitive
• System integration
• Extension languages
System vs. Scripting
Language Level
[Figure, Ousterhout 1998: languages plotted by instructions per statement (log scale, 1 to 1000) against degree of typing (none to strong). Labeled languages: Assembly, C, C++, Java, Tcl/Perl, Visual Basic. Labeled regions: system programming, scripting.]
Typeless Scripting
• Facilitates connecting components
• Variables are containers
• Usually string-oriented for uniform representation
• Can generate and then execute code on fly
• Code is a string
• Allows code reuse
• Example: UNIX filter programs read and write byte
streams, can create pipelines
• select | grep scripting | wc
• select reads text selected on display, grep finds all lines
containing “scripting”, wc counts them
• Can reuse in different situations
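The filter idea can be sketched in Python with generators. Every stage takes and returns a stream of text lines, so stages connect in any order. The list `selected` stands in for the output of `select`, and the function names are illustrative only.

```python
def grep(pattern, lines):
    """Pass through only the lines that contain the pattern."""
    return (line for line in lines if pattern in line)

def wc(lines):
    """Count the lines in the stream."""
    return sum(1 for _ in lines)

# Stand-in for the text that `select` would read from the display.
selected = [
    "scripting glues components",
    "system languages build components",
    "Tcl is a scripting language",
]

count = wc(grep("scripting", selected))
print(count)
```

Because each stage only agrees on "a stream of lines", `grep` and `wc` can be reused in other pipelines without change.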
Typing in System
Programming Languages
• Finds errors at compile-time
• Permits optimizations
• But makes it difficult to reuse code
• Might have to do type conversion
• Might have to recompile, but may not have
source
Tcl Example
button .b -text Hello! -font {Times 16} -command {puts hello}
• Tcl command creates button control
• “Hello!” in 16-pt Times on button
• Prints “hello” when it is clicked
• Mixes 6 things in one statement:
• Command name (button)
• Button control (.b)
• Property names (-text, -font, -command)
• Strings (Hello!, hello)
• Font name (Times 16)
• Typeface name (Times)
• Typeface size (16)
• Tcl script (puts hello)
• Tcl represents all as strings
• Arguments can be specified in any order, defaults for >20
unspecified properties
Tcl Example (cont.)
• Java - takes 7 lines in 2 methods
• C++/MFC – 25 lines in 3 procedures
• Font in MFC:
CFont *fontPtr = new CFont();
fontPtr->CreateFont(16, 0, 0, 0, 700, 0,
    0, 0, ANSI_CHARSET,
    OUT_DEFAULT_PRECIS,
    CLIP_DEFAULT_PRECIS,
    DEFAULT_QUALITY,
    DEFAULT_PITCH|FF_DONTCARE,
    "Times New Roman");
buttonPtr->SetFont(fontPtr);
Tcl Example (cont.)
• Most extra code for strong typing
• SetFont needs CFont object that must be created
and initialized
• Must call CreateFont to initialize object
• Has 14 parameters
Error Checking?
• Strong typing helps find errors
• At static analysis/compile time
• Efficiency – no need for runtime checks
• Scripting languages check values when used
• Cannot have font size xyz
• But must pay cost of runtime checks
• Must thoroughly test code to find errors
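A small sketch of checking a value only when it is used. The font is passed around as a plain string, and a bad size such as "xyz" is rejected only when the string is parsed. `parse_font` is a hypothetical helper, not part of any toolkit.

```python
def parse_font(spec):
    """Parse a font given as a string such as "Times 16"."""
    family, size = spec.rsplit(" ", 1)
    return family, int(size)  # int() raises ValueError for a size like "xyz"

print(parse_font("Times 16"))

try:
    parse_font("Times xyz")
except ValueError as err:
    # The mistake surfaces only when this line actually runs.
    print("rejected at run time:", err)
```

Nothing flags the bad call before execution, which is why thorough testing matters.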
Interpretation
• Most scripting languages are “interpreted”
• This could be “compiled on the fly” or “quickly
compile and then execute” for performance
• Speeds up development loop
• Flexibility for users to program apps at runtime
• Example: Tcl interface (extension language) on
Synopsys Design Compiler for logic synthesis
• Generate and execute code on fly
• E.g. Reformat HTML as Tcl, execute to display page
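A minimal sketch of generating code and executing it on the fly, using Python's built-in `exec`. The generated function names (`add_1`, `add_10`) are made up for the example.

```python
def make_adder_source(n):
    """Build the source code of a function as an ordinary string."""
    return f"def add_{n}(x):\n    return x + {n}\n"

namespace = {}
for n in (1, 10):
    # Compile and run the generated source; the new function lands in namespace.
    exec(make_adder_source(n), namespace)

print(namespace["add_1"](5), namespace["add_10"](5))
```

The same mechanism carries a risk: a string from an untrusted source should never be executed this way.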
Efficiency
• Scripting languages less efficient
• Interpretation rather than compilation
• Run-time “type” checking
• Power and ease-of-use rather than running on
“bare metal” of processor
• Example
• Scripting – variable-length string
• System – binary value in machine word
• Scripting – hash table
• System – indexed array
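The string versus machine-word contrast can be made concrete with Python's standard `struct` module. This is only an illustration of the two representations, not a benchmark.

```python
import struct

n = 123456789

as_text = str(n)                # scripting style: variable-length string
as_word = struct.pack("<i", n)  # system style: fixed 32-bit binary value

print(len(as_text), len(as_word))

# Adding 1 to the text form means parse, add, and re-format on every use.
as_text = str(int(as_text) + 1)

# The binary form needs no parsing; in a compiled language this is one add.
(as_int,) = struct.unpack("<i", as_word)
as_word = struct.pack("<i", as_int + 1)
```

The text form takes one byte per digit and must be converted for arithmetic, while the binary form always occupies four bytes.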
Is Efficiency an Issue?
• Usually not
• Smaller scripted apps
• Time spent in components, not scripting
Higher-Level Programming
• Scripting statements execute 100s-1000s of
machine instructions
• System PL statements execute ~5 machine
instructions
• Example
• Perl regular expression substitution as easy as
integer addition
• Tcl variable trace triggers updates when variable
set
• Scripting 5-10x more productivity
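The "as easy as integer addition" point carries over to other scripting languages. A sketch in Python, using the standard `re` module in place of Perl:

```python
import re

total = 2 + 3                                         # integer addition
text = re.sub(r"colou?r", "hue", "color and colour")  # regex substitution

print(total, text)
```

Both are single statements, although the substitution runs far more machine instructions than the addition.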
Productivity
[J. K. Ousterhout 1998]
Application | Comparison | Code ratio | Effort ratio | Comments
DB app | C++: 2 mo; Tcl: 1 day | - | 60 | Tcl more functionality
Comp. sys. test & install | C: 272k lines, 120 mo; C FIS app: 90k lines, 60 mo; Tcl/Perl: 7.7k lines, 8 mo | 47 | 22 | -
DB library | C++: 2-3 mo; Tcl: 1 wk | - | 8-12 | -
Security scanner | C: 3k lines; Tcl: 300 lines | 10 | - | Tcl more functionality
Display oil well prod. curves | C: 3 mo; Tcl: 2 wk | - | 6 | Tcl version first
Query dispatcher | C: 1.2k lines, 4-8 wk; Tcl: 500 lines, 1 wk | 2.5 | 4-8 | Tcl more functionality
Spreadsheet | C: 1460 lines; Tcl: 380 lines | 4 | - | Tcl version first
Sim. & GUI | Java: 3.4k lines, 3-4 wk; Tcl: 1.6k lines, <1 wk | 2 | 3-4 | Tcl first, more functionality
Code ratio = ratio of #lines of two implementations
Effort ratio = ratio of development times
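The two ratio definitions can be checked against the figures quoted for the computer system test and install application and for the spreadsheet. Summing the two C applications before dividing is an assumption; it reproduces the listed code ratio of 47, and gives an effort ratio of 22.5 against the listed 22.

```python
def ratio(system, scripting):
    """Ratio of a system-language figure to the scripting-language figure."""
    return system / scripting

# Computer system test and install: two C applications vs. one Tcl/Perl version.
# Adding the two C applications together is an assumption (see above).
code_ratio = ratio(272_000 + 90_000, 7_700)  # lines of code
effort_ratio = ratio(120 + 60, 8)            # months

# Spreadsheet: C 1460 lines vs. Tcl 380 lines.
spreadsheet_code_ratio = ratio(1460, 380)

print(round(code_ratio), effort_ratio, round(spreadsheet_code_ratio))
```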
System PL Benefits
• Scripting best for gluing, system integration,
extension languages
• System PL best for complex data structures
and algorithms
• 10-20x faster execution time
When to Use Scripting?
• Is app’s main task to connect preexisting
components?
• Will app manipulate variety of different things?
• Does app include GUI?
• Does app do lot of string manipulation?
• Will app’s functions evolve rapidly over time?
• Does app need to be extensible?
• Does app need to be highly portable?
When to Use System PL?
• Does app implement complex data structures
or algorithms?
• Does app manipulate large datasets, so
execution speed is critical?
• Are app’s functions well-defined and slow to
change?
• Are there a small number of target platforms?
Not Either/Or
• Most platforms provide scripting and system PL
• IBM
• Job Control Language (JCL) - sequence jobs on OS/360
• ~1st scripting PL
• Jobs ran in FORTRAN, PL/1, Algol, COBOL, Assembler
• UNIX
• Shell – sh/csh
• C
• PC
• Visual Basic
• C/C++
• Web
• JavaScript, Perl, Tcl
• Java
Why Scripting’s Popularity?
• GUIs
• Often half of development effort
• Fundamentally gluing components
• Most scripting had origins in GUI development
• Internet
• Gluing
• Many platforms
• Component frameworks
• ActiveX, JavaBeans
• Manipulate components
Why Scripting’s Popularity?
• Better scripting technology
• More advanced scripting languages
• Faster machines
• Compile-on-fly
• Garbage collection
• More casual programmers
• Quickly learn language
• “whip up” a script for few-times use
• E.g. DB queries in spreadsheet
• Speed of development and use, not execution
Scripting and OOP
• Key benefits of OOP
• Encapsulation – information hiding
• Interface inheritance – same methods and APIs for
different implementations
• Some OO scripting languages
• Python, Perl 5+, Object Rexx, Incr Tcl
• Typeless objects
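A sketch of "typeless objects" providing the same interface through different implementations. The class and function names are hypothetical; the point is that `report` never declares what type `log` must be.

```python
class ListLog:
    """Keeps every message in memory."""
    def __init__(self):
        self.lines = []

    def write(self, msg):
        self.lines.append(msg)

class NullLog:
    """Accepts messages and discards them."""
    def write(self, msg):
        pass

def report(log, msg):
    # No declared type: any object with a write() method will do,
    # and the check happens only when the call is made.
    log.write(msg)

kept = ListLog()
report(kept, "started")
report(NullLog(), "started")
print(kept.lines)
```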
Extensibility
• Many scripting languages provide facility to
add to language
• New commands in Tcl
• Ruby open classes
• Key for extension language use
• Hook scripting language to internals of application
components
• Tk, incr Tcl implemented as Tcl extensions
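A toy sketch of the extension-language idea: an application keeps a table of commands, and new commands can be registered without touching the interpreter loop. `App`, `register`, and the one-command-per-line script syntax are all invented for this example and are far simpler than Tcl.

```python
class App:
    """A toy application exposing a command table to scripts."""

    def __init__(self):
        self.commands = {}
        self.output = []
        self.register("echo", lambda *args: self.output.append(" ".join(args)))

    def register(self, name, func):
        """Add a new command to the language."""
        self.commands[name] = func

    def run(self, script):
        """Execute a script: one command per line, words separated by spaces."""
        for line in script.splitlines():
            words = line.split()
            if words:
                self.commands[words[0]](*words[1:])

app = App()
# Extend the language with a command that hooks into the application state.
app.register("shout", lambda *args: app.output.append(" ".join(args).upper()))
app.run("echo hello\nshout hello")
print(app.output)
```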
Language Comparison
• Lutz Prechelt, An Empirical Comparison of
Seven Programming Languages, IEEE
Computer, 2000
• C, C++, Java, Tcl, Rexx, Perl, Python
• 80 implementations of same program
• Convert phone numbers to mnemonic strings
• Based on number to string mapping
• z1000 – 1000 non-empty random phone numbers
• m1000 – 1000 arbitrary (could be empty) random
phone numbers
• z0 – no phone numbers
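A simplified sketch of the benchmark task: find every way to spell a phone number as a sequence of dictionary words. The digit-to-letter mapping below is the ordinary telephone keypad, which is an assumption and not the mapping used in the study, and the study's additional encoding rules are not modeled.

```python
# Assumed mapping: standard telephone keypad (not the study's mapping).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}

def word_to_digits(word):
    """Translate a word into the digit string that spells it."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())

def encodings(number, words):
    """Return all word sequences whose digits concatenate to the number."""
    by_digits = {}
    for w in words:
        by_digits.setdefault(word_to_digits(w), []).append(w)

    def solve(rest):
        if not rest:
            return [[]]
        results = []
        for end in range(1, len(rest) + 1):
            for w in by_digits.get(rest[:end], []):
                for tail in solve(rest[end:]):
                    results.append([w] + tail)
        return results

    return [" ".join(parts) for parts in solve(number)]

print(encodings("43556", ["hello", "he", "llo", "world"]))
```

The program is dominated by dictionary lookups on strings, which fits the later observation that this is a string-oriented application.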
[Plots not reproduced in this transcript, one per slide: Runtime; Load/Preprocess Dictionary; Search Runtime Only; Memory Consumption; Program Length; Programming Time; Productivity]
Observations
• People write similar LOC/h
• Scripting takes 2-3x less code
• So 2-3x less development time
• Scripting memory consumption somewhat higher
• Java outlier
• Lot of variation
• Scripting ~10-20x longer load and preprocess
• Scripting search times similar
• String-oriented application
• C/C++ code had more bugs
Case Study: Pluto System
• Swedish pension system
• Perl connecting Java systems
• All fund transactions
• All payments
• One account per citizen
• On-line since 2000
• Manages $40B+
[Lundborg, Lemonnier 2007, http://erwan.lemonnier.se/talks/pluto.html]
System Details
• 320k lines Perl
• 68k lines SQL
• 27k lines shell script
• 26k lines HTML
• 750 GB Oracle DB
• 500M entries in some tables
• 5.5M users
• Daily batch processing
Used Simple Perl
• C-like code
• No advanced Perl constructs
• Simple run-time type checking code
• Lots of cross-checking
• Defensive programming
• Nothing tricky
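A sketch of this style in Python rather than Perl: explicit run-time type checks, validation of every input, and a cross-check after the update. The `transfer` function and the account layout are hypothetical and are not taken from the Pluto code.

```python
def transfer(accounts, src, dst, amount):
    """Move amount between accounts with explicit run-time checks."""
    # Simple run-time type check (bool is excluded because it subclasses int).
    if not isinstance(amount, int) or isinstance(amount, bool):
        raise TypeError("amount must be an integer number of minor units")
    if amount <= 0:
        raise ValueError("amount must be positive")
    if src not in accounts or dst not in accounts:
        raise KeyError("unknown account")
    if accounts[src] < amount:
        raise ValueError("insufficient funds")

    total_before = sum(accounts.values())
    accounts[src] -= amount
    accounts[dst] += amount
    # Cross-check: a transfer must never create or destroy money.
    assert sum(accounts.values()) == total_before
    return accounts

print(transfer({"a": 100, "b": 0}, "a", "b", 30))
```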
Why Perl?
• Integrates w/UNIX, Oracle
• Can focus on algorithms
• Fast development cycle to cope with changing
requirements
• Remember, a government project!
But
• Hard to read
• Little typing
• Poor integration with Java
• Slow, but not as slow as DB
• Hard to parallelize (DB server is parallel)
Type Checking Experience
• Most bugs found during unit test
• Very few type-related crashes in 7 years
• Typing is maybe better for efficiency (help the
compiler) rather than safety
What is Rapid Prototyping?
1. Quick assembly of a partial or complete system for
experimentation prior to full requirements,
specification and implementation
2. Quick assembly of a tool or system for temporary or
permanent use, using special-purpose languages and
existing tools
• Goals
• Rapid! - <10% of time for traditional C/C++ implementation
• Functionality – implement functions to permit experimentation
or use
• Okay performance – system only needs to be fast enough to
try it out or to be acceptable
• Easily modified – for iteration during experimentation, or for
maintainability
Relationship to Agile
• Agile
• Build system using “final” technology
• Each iteration is a working system, gradually
adding features
• User stories more than requirements/specs
• Rapid prototyping
• May never be a “production” system
• Need the system to elicit user stories
• What does user want?
• “I know it when I see it” – Potter Stewart
Rapid Prototyping Tools
• Shells
• Bourne shell (sh), C shell (csh), Korn shell (ksh), Bourne-again shell (bash),
PowerShell
• Pattern languages
• Awk, gawk, sed, grep, perl
• Extension languages
• Emacs LISP
• Scripting languages
• Tcl, Python, Ruby,…
• Parser generators
• Lex, yacc
• Utilities
• UNIX: comm, diff, ed, find, sort, uniq, wc
• Existing tools
• Reuse, don’t code!
Tool Characteristics
• They exist!
• Lots of needed functionality already built in
• Avoid coding
• Quick edit-compile-debug loop
• Interpreted, compile-on-fly, pre-compiled
• Simple I/O
• Mostly just ASCII text, not binary
• Frequently stream I/O – easier interfacing
• Easily controlled from other programs
• Command line interface, extension language, streams,
configuration files
• Often used in combination
• E.g. awk scripts in a shell script
Shell Languages
• Command Interpreters
• Programming language access to UNIX commands
• sh, csh are “standard” and portable
• Applications
• Need general control constructs
• File testing, directory access
• Need functionality of many UNIX tools
• “Central control” of application
• Performance
• Commands dominate runtime, not shell
Pattern Languages
• Domain
• Scan text input, do computation, pass result to output
• Key Ideas
• Regular expression matching
• Multiple matching patterns
• State machine transformations
• Conditional expressions
• Equations
• Performance
• Stream/file I/O dominates
• CPU efficiency less important
Awk
• Language
• Set of <pattern, action> pairs
• For each input line, execute actions of matching
pairs
• Examples
• {print $2} – print second field of every line
• length > 72 {i++} END {print i} – count # lines > 72 chars
• Applications
• Good for simple computations on input stream
• E.g. Take average of a column of numbers
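The pattern-action model can be sketched in a few lines of Python. Each rule is a (pattern, action) pair of functions, and every rule whose pattern matches a line has its action run. The rule shown averages the second field; all names are illustrative, and this is not how awk itself is implemented.

```python
def run_rules(lines, rules):
    """Apply every (pattern, action) pair whose pattern matches each line."""
    for line in lines:
        fields = line.split()
        for pattern, action in rules:
            if pattern(line, fields):
                action(line, fields)

state = {"sum": 0.0, "count": 0}

def add_second_field(line, fields):
    """Action: accumulate the numeric value of field 2."""
    state["sum"] += float(fields[1])
    state["count"] += 1

rules = [
    # Pattern: the line has at least two fields.
    (lambda line, fields: len(fields) >= 2, add_second_field),
]

run_rules(["a 10", "b 20", "c 30"], rules)
average = state["sum"] / state["count"]  # plays the role of awk's END block
print(average)
```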
Sed
• Language
• Stream editor
• Commands are [addr [,addr]] function [args]
• Apply commands w/matching addresses to each line
• Can read/write files, test, branch, use buffer space, etc.
• Buffer size is not infinite
• Examples
• sed -e 's/   //' file – delete the first run of 3 spaces on each line
• sed -e 'r file1' file – insert contents of file1 after each line
• Applications
• Good for local transformations on text streams
• E.g. capitalize first word of each sentence
• Most common use is regular expression query-replace
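For comparison, the same kinds of local transformations can be sketched in Python with the standard `re` module. The function names are hypothetical, and the sentence rule (a lowercase letter at the start of the text or after ".", "!" or "?") is a simplification.

```python
import re

def capitalize_sentences(text):
    """Upper-case the first letter of each sentence."""
    return re.sub(
        r"(^|[.!?]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )

def strip_three_spaces(line):
    """Remove the first run of three spaces on a line."""
    return re.sub(r" {3}", "", line, count=1)

print(capitalize_sentences("hello world. this is sed."))
print(strip_three_spaces("   indented   text"))
```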
Grep
• Language
• Regular expression pattern search
• Print all lines that do/do not match pattern
• grep – limited RE, egrep – full RE, fgrep – fixed strings
• Example
• grep '^[abc]h' file – print all lines beginning with “ah”, “bh” or “ch”
• grep -v 'Defect' file – print all lines except those containing “Defect”
• grep -n foo * – print all lines (w/filename and line number) of all files with “foo”
• Applications
• General tool for simple text file searches
• egrep usually fastest and most general
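A rough Python equivalent of the three grep examples, operating on a list of lines instead of files. The `invert` and `number` parameters are made-up names that loosely mirror `-v` and `-n`.

```python
import re

def grep(pattern, lines, invert=False, number=False):
    """Return lines that match (or, with invert, do not match) the pattern."""
    regex = re.compile(pattern)
    out = []
    for lineno, line in enumerate(lines, start=1):
        if bool(regex.search(line)) != invert:
            out.append(f"{lineno}:{line}" if number else line)
    return out

lines = ["ah yes", "oh no", "Defect 12", "chart"]
print(grep(r"^[abc]h", lines))
print(grep("Defect", lines, invert=True))
print(grep("no", lines, number=True))
```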
Perl
• Language
• Practical extraction and report language
• Combination of awk, sed, sh, csh, etc. functionality
• C-like syntax, operators, semantics
• Faster, more powerful than awk,…
• Features
• Fast text, binary scan
• Command line switch parsing
• Math, system, file testing, I/O control, messaging functions
• Subroutines
• Awk (a2p), sed (s2p), find (find2perl) translators
• Still evolving – Perl 5, Perl 6 (not backward compatible)
• Applications
• Facilities management – more secure than C/C++
• Networking
• Web
• General-purpose
• For simple things, use awk, sed,…
Ancient Perl Example
• Archie
• Program to access database of useful files available via
anonymous FTP
• Forerunner of Gopher, the forerunner of the Web
• C version
• 7889 lines of C code and scripts
• 1822 lines for VMS operating system support
• 230 lines for MS DOS support
• Told you this was an ancient example
• Perl version
• 1146 lines of Perl code and scripts
• Runs on any platform supporting Perl
• Almost as fast and more reliable
Extension Languages
• Tcl
• General-purpose
• Small and efficient
• Easily linked to applications – its original purpose
• Emacs LISP
• Write programs to process text
• Programs to interact with other processes, e.g. shell
windows
• Used to create mail, news group readers, SW
development environments
• Advantages – can watch things happen, capture and
replay keystrokes, can run in batch
• Disadvantages – slow startup time, large memory
Parser Generators
• Lex, Yacc
• Generate lexical analyzer and parser w/C callbacks
• Many relatives
• Applications
• Parse interchange languages of some complexity
• E.g. HTML, Verilog, VHDL,…
• Use pattern languages for simple format languages
• Slow compared to compiled parser
UNIX Utilities
• Think of them as special-purpose tools for program
construction
• Combine not-quite-ideal tools rather than write a new one
• Examples
• How many words in a word list?
#!/bin/csh -f
sort $1 | uniq | wc | awk '{print $1}'
• Change files “foo.dat” to “foo.d”
#!/bin/csh -f
foreach i (`ls *.dat`)
    mv $i ${i:r}.d
end
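The same two tasks can be sketched in Python with the standard `pathlib` module, for platforms without csh. The function names are hypothetical, and the word list is assumed to be plain text with words separated by whitespace.

```python
from pathlib import Path

def count_unique_words(path):
    """Count distinct words in a word list, like sort | uniq | wc."""
    words = Path(path).read_text().split()
    return len(set(words))

def rename_dat_to_d(directory):
    """Rename every foo.dat in the directory to foo.d."""
    renamed = []
    for src in sorted(Path(directory).glob("*.dat")):
        dst = src.with_suffix(".d")
        src.rename(dst)
        renamed.append(dst.name)
    return renamed
```

For example, `count_unique_words("words.txt")` returns the number of distinct words, and `rename_dat_to_d(".")` returns the new file names.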
Existing Tools
• Use your suite of programs as a toolbox
• Follow “tool building” and software reuse practices
• See Software Tools, Kernighan, Plauger, 1976
• Identify common needs in multi-person project
• Parser for common language
• Display routines
• Special-purpose widgets
• Build on your past, recycle, don’t dispose
Portability
• Similar tools on different platforms
• Windows, Apple OS, UNIX/Linux
• Don’t forget iOS (built on Darwin, a UNIX derivative) and Android (built on the Linux kernel)
• Challenge is now the GUI and Apple’s restrictive
development environment
Conclusions
• Your first reaction to any new programming
task should not be to start writing C or C++
• For many applications, the prototype is the final
system
• Specifications change a lot, so you want to
minimize up-front investment and/or
maintenance effort
• Scripting languages can be used to build
systems with existing components
• Best of both worlds: combine scripting and system
programming languages