Transcript sql1.ppt

SQL - Structured Query Language
For database analyses
SQL – Structured Query Language
• SQL is designed to do variable selection and
manipulation on data stored in multiple tables
Why multiple tables? (aka database)
• Avoid duplication of information
• Ease of updating information
• Linked by unique identifiers/codes
Where would we use? A single entity with many
different operations with common information
University
• Many functions: students taking classes &
getting grades, scheduling classes in rooms,
students paying tuition & fees, faculty
teaching classes, faculty getting paid, etc
Tables of
• Students, faculty with demographics
• Students*classes, faculty*classes
• Tuition payments, faculty payroll, etc.
• Linked by studentID, facultyID, classID, etc
Business
• Multiple functions including selling, buying,
production, and paying employees
Tables of orders, suppliers, customers, products,
accounts payable & receivable, payroll, ….
Linked by productID, customerID, …
Baseball
Lahman database – season summaries
• Tables of teams, batting (player*season),
pitching (player*season), salaries,
demographics (Master)
• Linked by playerID, teamID, etc.
• Why baseball? Nonproprietary, not small,
interesting? … maybe
In SAS
• PROC SQL
• Designed to run interactively (end with QUIT;)
• Tables are stored as SAS datasets
Let’s look at the baseball data in SAS….