SQL Reporting
Download
Report
Transcript SQL Reporting
Geoff Kimber
Realize the importance of learning SQL
Recognize SQL platform differences
Identify relational database concepts
Apply SQL syntax to a real-life example
Review SQL references
Data organized in tree structure with parent
and child segments
Parent to child is a one to many relationship
Implies repeating information, generally in
child segments
Fileman
Prescription file with refill multiple field
Data consists of records stored in tables
Each row is unique
Column values are of the same data type
Sequence of rows and columns is insignificant
Structured Query Language [SQL]
Is an ANSI computer language that is used to
interact with many relational databases
▪ Oracle
▪ MS-SQL Server
▪ MySQL, PostgreSQL and others
SQL is a set-based language
SQL is structured like English
▪ SEQUEL (Simple English QUEry Language)
VA is truly data rich
Each site has between 2-4 TB of data†
▪ non-imaging data
VA FileManager [FileMan]
Only within VistA
Limited functionality for non-programmers
Runs in the production environment
▪ Workflow and resource concerns
† - Pham R. The VA Data Lifecycle (Internals, Data Flows, and Business Intelligence)
Data repositories (warehouses)
Moves the analysis off-line
Provides accessible, yet very secure, data
▪ ODBC, ADO and LINQ
Greater functionality and access control
▪ The “official” data is secured within VistA
▪ Independent of VistA
▪ Microsoft Active Directory based access
▪ Maintain the front-line stance of CPRS/VistA
Various VA relational databases
Corporate Data Warehouse (CDW)
Regional Data Warehouse (RDW)
VISN Data Warehouse (VDW)
Potential for local data warehouses
▪ Extract raw data from VistA off-hours
▪ UPSERT the data into an SQL server automatically
▪ Users run their own reports and access is secure
Data warehouses = relational databases
SQL = the language of relational databases
Relational databases are everywhere
Learn a portable skill
Take a systems approach to collaboration
Build reusable code
Work more efficiently
VA has an enterprise license
Microsoft SQL Server is functionally superior
to Microsoft Access
Secure
Backups
Procedural code
No database size limit (software)
Efficiency
Two of the different SQL flavors
Transact SQL (Microsoft SQL Server)
▪ Manage databases/services, procedural code
Jet SQL (Microsoft Access)
▪ Information retrieval, program component
Each program has its own SQL flavor
Core keywords and functions follow standards
ANSI guidelines are “considered”
In SQL:
Something is either part of a set or it is not
Order is meaningless
SQL Server supports procedural code
▪ VBA can be used in MS Access for procedures
Dimension tables
Fact tables
Contain attributes that describe records in
fact tables
Each record contains unique identifier
Generally more stable
Drug name, provider name, patient name and
address
▪ Similar to Drug file (#50), New Person file (#200),
Patient file (#2)
Capture operational data
Contain multiple unique identifiers
One identifier for each record
One identifier for each data element in dimension
tables
Similar to Fileman IEN
Usually contain numerical and date fields
Prescription fills, Outpatient visits, lab test results
Similar to prescription file (#52), V POV
(#9000010.07)
Dim.LocalDrug
LocalDrugSID
Sta3n (3 digit station number)
LocalDrugNameWithDose
VaClassification
PricePerDispenseUnit
NationalDrugSID
LocalDrug
SID
LocalDrugNameWithDose
PricePer
VaClassifi Dispense National LocalDrug
cation
Unit
DrugSID SID
735055
ATENOLOL 50MG TAB
CV100
0.0079
2223484 735055
735056
ATENOLOL 100MG TAB
CV100
0.0075
2221541 735056
739331
ATENOLOL 25MG TAB
CV100
0.0068
2223485 739331
741066
ATENOLOL 12.5MG (1/2 OF 25MG) TAB CV100
0.0042
2223485 741066
748289
STUDY: ATENOLOL TAB
NULL
-1
IN000
748289
Dim.NationalDrug
NationalDrugSID
DrugNameWithoutDose
DosageForm
Strength
NationalDrugSID
DrugNameWithoutDose
DosageForm Strength
2221541
ATENOLOL
TAB
100
2221542
ATENOLOL
INJ
500
2221543
ATENOLOL
TAB
50
2221544
ATENOLOL/CHLORTHALIDONE
TAB
NULL
2223485
ATENOLOL
TAB
25
sPatient.sPatient
PatientSID
PatientName
PatientSSN
AddressLine1
City
State
ZipCode
RxOut.RxOutpat
RxOutpatSID
PatientSID
ProviderSID
LocalDrugSID
IssueDate
RxNumber
Rxout.RxoutpatFill
RxoutpatFillSID
RxOutpatSID
FillType
ReleaseDateTime
Qty
DaysSupply
UnitPrice
Metadata
Information about data
Includes CDW to VistA field crosswalk
http://vaww.cdw.r02.med.va.gov/metadata/Reports/Form
s/AllItems.aspx
Clauses perform set-based calculations
CLAUSE
SELECT
Definition
Defines what you will see. You can identify columns,
calculations, literal text, and even perform branching.
FROM
Defines the origin of the data. Tables can be joined.
WHERE
Defines data filters. Filters data before it is queried.
GROUP BY
HAVING
ORDER BY
Defines how data is grouped. Works hand in hand with
aggregate functions, i.e. how many in each group
Defines data filters. Filters data after the query is run. An
example: return only those items having a count > 5.
Defines the field(s) used to sort the data – ASC or DESC
Function
Definition
COUNT
Aggregate Function: Counts the number in a group.
SUM
Aggregate Function: Sums the numbers in a group
AVG
Aggregate Function: Averages numbers in a group
MIN and MAX
Aggregate Function: Finds the minimum and maximum
numeric values in a group.
UCASE and LCASE
Scalar Function: Returns text in uppercase or lowercase,
respectively
LEN
Scalar Function: Returns the length of a string
Microsoft Access supports the aggregate functions FIRST and LAST
SQL Server has a COUNT(DISTINCT...) aggregate function
Operator
AND
OR
NOT
=, >, <, >=, <=
Definition
Logical Operator: A and B are both true to equal TRUE
Logical Operator: A or B is true to equal TRUE
Logical Operator: Switches TRUE to FALSE and vice versa
Equality Operators: equals, greater than, etc.
BETWEEN
Equality Operator: TRUE if value is between the two criteria.
Values that equal the criteria are included.
Like
Equality Operator: TRUE if text contains the pattern defined.
LIKE allows for wildcard characters (see below)
+, -, *, /
+ or &
Mathematical Operators
Concatenation Operators: Join two text strings. (see below)
Access uses “*” as a wildcard and double quotes; SQL Server uses “%” and single quotes
Access uses the ampersand to concatenate “&”, SQL Server the plus sign “+”
Select LocalDrugSID
,NationalDrugSID
,LocalDrugNameWithDose
from vdw.dim.localdrug
Where LocalDrugNameWithDose like
'%atenolol%'
Case is not normally important, but helps
with readability
Spaces, tabs, and hard returns have limited
importance
Important in text fields
Not important between commands
Help improve readability
Microsoft query analyzer color codes key
words for readability
SQL databases have many different data types
Varchar
Int
▪ Bigint
▪ Smallint
Decimal
Float
Datetime
Money
Select
Main command used to retrieve data from SQL
tables
Followed by list of fields to return
‘*’ (without quotes) returns all fields in table
▪ Normally, avoid using ‘*’ unless you need all the fields in
a table
From vdw.dim.localdrug
Indicates source of data from which you are
selecting
Vdw.dim.localdrug
▪ VDW is database name
▪ VDW => VISN Data Warehouse
▪ RDW => Regional Data Warehouse
▪ CDW => Corporate Data Warehouse
▪ Dim is schema
▪ Localdrug is table name
Where
Allows you to restrict or filter the table contents
to just the results you want
Extremely versatile
Like
‘%’ is a multicharacter wildcard
‘%atenolol%’ - contains ‘atenolol’
‘atenolol%’ – begins with ‘atenolol’
‘%atenolol’ – ends with ‘atenolol’
Case is not normally important, but helps
with readability
Spaces, tabs, and hard returns have limited
importance
Important in text fields
Not important between commands
Help improve readability
Microsoft query analyzer color codes key
words for readability
Select LocalDrugSID
,NationalDrugSID
,LocalDrugNameWithDose
from vdw.dim.localdrug
Where localdrugnamewithdose like
'%atenolol%'
LocalDrugSID
NationalDrugSID
LocalDrugNameWithDose
735055
2223484
ATENOLOL 50MG TAB
735056
2221541
ATENOLOL 100MG TAB
739331
2223485
ATENOLOL 25MG TAB
741066
2223485
ATENOLOL 12.5MG (1/2 OF 25MG) TAB
1200004446
2223484
STUDY: ATENOLOL 50MG TAB
Select VA FileMan Option: PRINT File Entries
OUTPUT FROM WHAT FILE: DRUG//
SORT BY: GENERIC NAME// 'GENERIC
NAME["ATENOLOL"
WITHIN GENERIC NAME["ATENOLOL", SORT BY:
FIRST PRINT FIELD: NUMBER
THEN PRINT FIELD: PSNDF:NUMBER
THEN PRINT FIELD: GENERIC NAME
THEN PRINT FIELD:
DRUG LIST
PAGE 1
NUMBER NUMBER
GENERIC NAME
-----------------------------------------------------------------------3217 4331 ATENOLOL 25MG TAB
4068 4329 ATENOLOL 50MG TAB
4069 4328 ATENOLOL 100MG TAB
9751 4331 ATENOLOL 12.5MG (1/2 OF 25MG) TAB
12975 4329 STUDY: ATENOLOL 50MG TAB
Used to connect two or more tables together
using data elements that are shared by
individual tables
Similar to fileman jumps, with more power
Inner joins, outer joins and full joins
Produces a result that includes only the
records that are the same in both tables
Select a.localdrugnamewithdose
, b.strength
from vdw.dim.localdrug as a
inner join vdw.dim.nationaldrug b
on a.NationalDrugSID = b.NationalDrugSID
where LocalDrugNameWithDose like
‘%atenolol%’
localdrugsid
localdrugnamewithdose
strength
735055
ATENOLOL 50MG TAB
50
735056
ATENOLOL 100MG TAB
100
739331
ATENOLOL 25MG TAB
25
741066
ATENOLOL 12.5MG (1/2 OF 25MG) TAB
25
Results contain all records in one table and
only matching records in another table
Left joins – return all results in table specified in
‘from’ statement and all matching tables from
‘joined’ table
Right joins – return all results in table specified in
‘joined’ table ‘from’ statement and all matching
tables from ‘from’ statement
Select a.localdrugnamewithdose
, b.strength
from vdw.dim.localdrug as a
left outer join vdw.dim.nationaldrug b
on a.NationalDrugSID = b.NationalDrugSID
where LocalDrugNameWithDose like
‘%atenolol%’
localdrugsid localdrugnamewithdose
strength
735055
ATENOLOL 50MG TAB
50
735056
ATENOLOL 100MG TAB
100
739331
ATENOLOL 25MG TAB
25
741066
ATENOLOL 12.5MG (1/2 OF 25MG) TAB
25
1200004446 STUDY: ATENOLOL 50MG TAB
NULL
Indicates lack of data
Does *not* mean zero
Math operations with null values have null results
▪ 2*null = null
▪ 0 * null = null
Text and logic operations on null values return null
Several functions exist to deal specifically with
null values
Returns all values in both tables
Not particularly applicable in our setting
http://www.w3schools.com/sql/
w3schools is an excellent learning resource
http://msdn.microsoft.com
Microsoft Developer Network is the “go to” place
for all coding questions for Microsoft
SAMS Teach Yourself SQL in 10-Minutes
Excellent starting point (uses older SQL style)
SAMS Teach Yourself Microsoft SQL Server T-SQL in
10-Minutes (good intro for Microsoft SQL Server)
\\vhacdwa10\BIPLTrainingContent\CDW_Data_101
Recorded Live Meetings
http://vaww.cdw.r02.med.va.gov/Pages/
CDW Home
CDW Best practices
MetaData
Relationship Diagrams
Additional Training
VA TMS
Books 24x7
Forta, Ben. SAMS Teach Yourself Microsoft SQL Server T-SQL in 10-Minutes. Indianapolis, IN:
SAMS, 2008. Print.
Forta, Ben. SAMS Teach Yourself SQL in 10-Minutes –Third Edition. Indianapolis, IN: SAMS,
2004. Print.
Kline, Kevin, Hunt, Brand, and Kline, Daniel. SQL in a Nutshell –Third Edition. Cambridge,
MA: O’Reilly Media, 2008.
Molinaro, Anthony. SQL Cookbook. Cambridge, MA: O’Reilly Media, 2005.
Henderson, Ken. The guru's guide to Transact-SQL. Boston: Addison-Wesley, 2000. Print.
Roman Steven. Access Database Design & Programming –Third Edition. Cambridge, MA:
O’Reilly Media, 2002. 11-124. Print
Bluttman Ken, and Freeze Wayne S. Access Data Analysis Cookbook. Cambridge, MA:
O’Reilly Media, 2007. 1-106. Print
Getz, Ken, Litwin, Paul, and Baron, Andy. Access Cookbook Cambridge, MA: O’Reilly Media,
2002. 1-56. Print
Date CJ. SQL and Relational Theory. Cambridge, MA: O’Reilly Media, 2009.
http://www.w3schools.com/sql/
http://msdn.microsoft.com