Transcript

Brian Mitchell
http://brianwmitchell.com (#1 blog on SQL PDW)
@brianwmitchell (woeful mid-200s followers)

Microsoft's DW Business Momentum
• #1 unit share for DW all-up (SQL Server)
• PDW volume grew 6x year over year
• SQL Server Enterprise revenue growing > 30%
• Strategic PDW customer wins in Fortune 1000 companies
Upgrading SQL Server to PDW Gains 100x Improvement
"…basic queries that previously took 20 minutes only took seconds using the SQL Server 2008 R2 Parallel Data Warehouse."
– Tom Settle, Assistant VP, Data Warehousing, Hy-Vee
Benefits
Leader in Gartner MQ for DW and BI
(Magic Quadrant charts, as of February 2013: Microsoft appears in the Leaders quadrant of both the Business Intelligence and Data Warehousing quadrants; axes are Completeness of Vision and Ability to Execute, quadrants are Leaders, Challengers, Visionaries, and Niche Players.)
"Microsoft exhibits one of the best value propositions on the market with a low cost and a highly favourable price/performance ratio" – Gartner, February 2012
How do I optimize my fleet based on weather and traffic patterns?

Big Data sources: click stream, wikis/blogs, sensors/RFID/devices, social sentiment, audio/video, log files, spatial & GPS coordinates, data market feeds, eGov feeds, weather, text/image.

Each traditional answer carries a trade-off:
• Keep legacy investment – limited scalability
• Buy a new tier-one hardware appliance – high costs
• Acquire a "Big Data" solution – significant training
• Acquire a Business Intelligence solution – complexity
Insights on Any Data of Any Size
Compared with V1, the V2 appliance delivers:
• Pure hardware costs that are ~50% lower
• Price per raw TB close to 70% lower due to higher capacity
• 70% more disk I/O bandwidth

(Diagram: Control Node, Mgmt. Node, LZ and Backup Node connected over Infiniband & Ethernet; V1 storage over Fiber Channel.)

V1 appliance:
• 160 cores on 10 compute nodes
• 1.28 TB of RAM on compute
• Up to 30 TB of tempdb
• Up to 150 TB of user data
• Estimated total HW component list price: $1MM

V2 appliance:
• 128 cores on 8 compute nodes
• 2 TB of RAM on compute
• Up to 168 TB of tempdb
• Up to 1 PB of user data
• Estimated total HW component list price: $500K
One standard node type
• Doubled memory to 256 GB
• Updated to the newest Infiniband
(Diagram: Hosts 0–3 attached to a JBOD over direct attached SAS, plus IB & Ethernet.)

Moving from SAN to JBODs
• Significant reduction in costs
• Moves away from dependency on a handful of key SAN vendors
• Leverages Windows Server 2012 technologies to achieve the same level of reliability and robustness
• Easy path to other DAS architectures and potentially different types

Backup and LZ are now reference architectures that are not in the basic topology
• Customers can use their own hardware*
• Customers can use more than 1 BU or LZ

Scale unit concept
• Base scale unit: minimum populated rack w/ networking
• Capacity scale unit: adds 2/3 compute nodes and related storage
• Spare scale unit
HP topologies:
• 2 – 56 compute nodes
• 1 – 7 racks
• 1, 2, or 3 TB drives
• 15.1 – 1268.4 TB raw
• 53 – 6342 TB user data
• Up to 7 spare nodes available across the entire appliance

(Figure: HP rack sizes from ¼ rack at 15 TB raw through 3 racks at 181.2 TB raw; Dell rack sizes from ⅓ rack at 22.6 TB raw through full rack at 67.9 TB raw. Full details appear in the tables below.)
Dell topologies:
• 3 – 54 compute nodes
• 1 – 6 racks
• 1, 2, or 3 TB drives
• 22.65 – 1223.1 TB raw
• 79 – 6116 TB user data
• Up to 6 spare nodes available across the entire appliance
HP topologies (node counts, raw TB by drive size, and user-data capacity):

Topology            Base  Active  Compute  Capacity incr.  Spare  Total  Raw TB (1TB disks)  Raw TB (3TB disks)  User data (TB)
Quarter rack        1     0       2        N/A             1      4      15.1                45.3                53–227
Half rack           1     1       4        100%            1      6      30.2                90.6                106–453
Three-quarter rack  1     2       6        50%             1      8      45.3                135.9               159–680
Full rack           1     3       8        33%             1      10     60.4                181.2               211–906
One & a quarter     2     3       10       25%             2      13     75.5                226.5               264–1133
One & a half        2     4       12       20%             2      15     90.6                271.8               317–1359
Two racks           2     6       16       33%             2      19     120.8               362.4               423–1812
Two and a half      3     7       20       25%             3      24     151                 453                 529–2265
Three racks         3     9       24       20%             3      28     181.2               543.6               634–2718
Four racks          4     12      32       33%             4      37     241.6               724.8               846–3624
Five racks          5     15      40       25%             5      46     302                 906                 1057–4530
Six racks           6     18      48       20%             6      55     362.4               1087.2              1268–5436
Seven racks         7     21      56       17%             7      64     422.8               1268.4              1480–6342

Dell topologies:

Topology            Base  Active  Compute  Capacity incr.  Spare  Total  Raw TB (1TB disks)  Raw TB (3TB disks)  User data (TB)
Quarter rack        1     0       3        N/A             1      5      22.65               67.95               79–340
Two-thirds rack     1     1       6        100%            1      8      45.3                135.9               159–680
Full rack           1     2       9        50%             1      11     67.95               203.85              238–1019
One and a third     2     2       12       33%             2      15     90.6                271.8               317–1359
One and two-thirds  2     3       15       25%             2      18     113.25              339.75              396–1699
Two racks           2     4       18       20%             2      21     135.9               407.7               476–2039
Two and a third     3     4       21       17%             3      25     158.55              475.65              555–2378
Two and two-thirds  3     5       24       14%             3      28     181.2               543.6               634–2718
Three racks         3     6       27       13%             3      31     203.85              611.55              713–3058
Four racks          4     8       36       33%             4      41     271.8               815.4               951–4077
Five racks          5     10      45       25%             5      51     339.75              1019.25             1189–5096
Six racks           6     12      54       20%             6      61     407.7               1223.1              1427–6116
In short: 2 to 56 compute nodes, 15 TB to 1.3 PB raw, up to 6 PB of user data, with 2- or 3-node increments for small topologies.
Agility Due to Virtualization
(Diagram: fabric VMs – CTL, MAD, FAB, AD, VMM – on Host 0, with Compute VMs on Hosts 1–3 attached to the JBOD over direct attached SAS, plus IB & Ethernet.)
• Storage Spaces manages the physical disks on the JBOD(s): 33 logical mirrored drives (66 physical drives) and 4 hot spares.
• Clustered Shared Volumes (CSV) allow all nodes to access the LUNs on the JBOD as long as at least one of the attached nodes is active.
• One cluster spans the whole appliance.
• VMs are automatically migrated on failure; failback continues to be through CSS use of Windows Failover Cluster Manager.
* 3 nodes per JBOD in the Dell configuration
• Each LUN is composed of 2 drives in a RAID1 mirroring configuration.
• Distributions are now split into 2 files.
• TempDB and Log are spread across all 16 LUNs; there is no fixed TempDB or log size allocation.
• VHDXs are on JBODs to ensure high availability.
• Disk I/O is further parallelized relative to V1: bandwidth increases by ~70% in V2 RTM.

Disk layout across a JBOD:
Disks 1–2     Node 1: Distribution A – file 1
Disks 3–4     Node 1: Distribution A – file 2
Disks 5–6     Node 1: Distribution B – file 1
Disks 7–8     Node 1: Distribution B – file 2
…
Disks 29–30   Node 1: Distribution H – file 1
Disks 31–32   Node 1: Distribution H – file 2
Disks 33–34   Node 2: Distribution A – file 1
Disks 35–36   Node 2: Distribution A – file 2
…
Disks 65–66   Fabric storage (VHDXs for the node)
Disks 67–70   Hot spares
TempDB and Log files are striped across the same LUNs.
Performance Improvement
• 10x – 100x: memory-optimized columnstore index for improved query times
• 7x data compression, reducing storage costs and data scans

"By restructuring our warehouse with xVelocity, data loading and reporting are significantly faster. The large report that used to take 17 minutes to render now takes only 3 seconds."

Other / Previous Versions
• SQL Server 2008 R2 PDW up to 100x faster than scale-up DW (in some cases)
• SQL Server 2008 R2 PDW handles extreme parallelization for query complexity and concurrency
Updateable Columnstore
• A Clustered ColumnStore index is comprised of 2 parts: the ColumnStore and the Delta Store.
• Data is compressed into Segments, ideally ~1 million rows each (subject to system resource availability).
• A collection of segments representing a set of entire rows is called a Row Group.
• The minimum unit of I/O between disk and memory is a Segment.
• Batch Mode execution (as opposed to traditional row mode) moves multiple rows between iterators, ~1000 rows at a time.
• Dictionaries (primary and secondary) are used to store additional metadata about segments.
(Diagram: columns C1–C6 divided into compressed ColumnStore row groups alongside a Delta (Row) Store.)
Row Mode vs. Batch Mode
• SQL Server 2012 implements Batch Mode processing to handle rows batch-at-a-time in addition to row-at-a-time; SQL Server 2008 and before only had row processing.
• Typically batches of about 1000 rows are moved between iterators; significantly less CPU is required because the average number of instructions per row decreases.
• Batch Mode processing is only available for some operators: Hash Join / Hash Aggregate are supported; Merge Join, Nested Loop Join and Stream Aggregate are not.

Scan examples:
SELECT COUNT(*) FROM FactInternetSales_Row     -- row mode scan: 6704 ms
SELECT COUNT(*) FROM FactInternetSales_Column  -- batch mode scan: 352 ms
DML on the Updateable Columnstore
• INSERTs are added to the Delta Store table (note: the Delta Store table is a page-compressed heap).
• DELETEs from the ColumnStore are logical; data is not physically removed until a REBUILD is issued. DELETE from the Delta Store is physical, as it is row based.
• UPDATE is an INSERT + DELETE.
• The Delta Store is automatically converted to ColumnStore at ~1M rows by the background "Tuple Mover" process; this can also be forced with REORGANIZE at ~1M rows.
• Converting the Delta Store to ColumnStore with REORGANIZE is an ONLINE operation.
• ADD / DROP / ALTER COLUMN is supported; partition switching is also supported.
• Changes to data can be applied directly to a Clustered ColumnStore Index.
(Diagram: segments and row groups in the ColumnStore for columns C1–C6, alongside a Delta (Row) Store filling toward ~1M rows.)
Single-Row INSERT Example
• INSERTED a single record into a table with a Clustered ColumnStore Index; screenshots were taken from the DMV sys.pdw_nodes_column_store_row_groups (a subset of total rows returned).
• Before the row was inserted: no Delta Store rows.
• After the single row was inserted: a Delta Store has been created and the single row is represented.
• After REBUILD: the segment row count increased by 1.
• REORGANIZE would only move Delta Store records into a segment; REBUILD affects the entire index (or an entire partition of the index).
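The DMV screenshots did not survive this transcript; below is a minimal T-SQL sketch of the same walkthrough, using an illustrative table name (CCIDemo) and the column names the deck describes (state_description, total rows), so verify the exact DMV columns on your appliance.

-- Illustrative table; any distributed table with a clustered columnstore index works.
CREATE TABLE CCIDemo (Col1 int, Col2 int)
WITH (DISTRIBUTION = HASH (Col1), CLUSTERED COLUMNSTORE INDEX);

INSERT INTO CCIDemo VALUES (1, 1);  -- a single row lands in an OPEN Delta Store

SELECT row_group_id, state_description, total_rows
FROM sys.pdw_nodes_column_store_row_groups;
-- expect a new OPEN row group with total_rows = 1

ALTER INDEX ALL ON CCIDemo REBUILD;
-- after REBUILD the Delta Store row is compressed and the segment row count grows by 1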
• The state_description field has 3 states: COMPRESSED / OPEN / CLOSED.
  • COMPRESSED represents a row group that is stored in ColumnStore format.
  • OPEN represents a Delta Store that is accepting new rows.
  • CLOSED represents a full Delta Store ready for REORGANIZE.
• When inserting 102,400 rows or more in a single batch into a ColumnStore index distribution, the data is compressed automatically; batches of 102,399 rows or fewer are stored in the Delta Store.
• The actual maximum number of rows per Delta Store is 1,048,576, at which point it is CLOSED.
  • This is also the ideal segment size that SQL Server tries to create when first building a ColumnStore index from a table.
  • When the index build encounters memory pressure, DOP is reduced first, then segment size.
• Only the REBUILD statement can compress a Delta Store that is not in the CLOSED state; neither REORGANIZE nor the Tuple Mover process has any effect on an OPEN Delta Store.
Partitioning Example
• Consider a table with a single partition range boundary of 100, therefore 2 partitions.
• INSERT the partition key values 50 and 150 into the table.
• REBUILD the index on partition 1.
• INSERT 3 more records into partition 1 (note that partition 2 is still in Delta Store form).
• REBUILD the entire index (as opposed to the single partition above).

Note: when attempting a partition MERGE or SPLIT operation, be aware that a table with a ColumnStore index will return an error and fail to execute, whereas a traditional row-based Clustered Index / Heap will perform the operation successfully.
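A minimal sketch of that sequence, assuming the PARTITION clause of PDW's CREATE TABLE; the table, column and index names are illustrative, not from the original deck.

-- One boundary value (100) => 2 partitions.
CREATE TABLE PartDemo (Col1 int, Col2 int)
WITH (DISTRIBUTION = HASH (Col1),
      PARTITION (Col2 RANGE LEFT FOR VALUES (100)));

CREATE CLUSTERED COLUMNSTORE INDEX cci_PartDemo ON PartDemo;

INSERT INTO PartDemo VALUES (1, 50);   -- partition 1 (Delta Store)
INSERT INTO PartDemo VALUES (2, 150);  -- partition 2 (Delta Store)

ALTER INDEX cci_PartDemo ON PartDemo REBUILD PARTITION = 1;  -- compress partition 1 only

INSERT INTO PartDemo VALUES (3, 60);
INSERT INTO PartDemo VALUES (4, 70);
INSERT INTO PartDemo VALUES (5, 80);   -- 3 more rows into partition 1; partition 2 is still Delta Store

ALTER INDEX cci_PartDemo ON PartDemo REBUILD;  -- rebuild the entire index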
Creating a Clustered ColumnStore Index
• Single CREATE TABLE statement:
CREATE TABLE <TableName> (Col1 int, Col2 varchar(10), Col3 datetime)
WITH (DISTRIBUTION = HASH (Col1), CLUSTERED COLUMNSTORE INDEX)

• CREATE TABLE statement followed by a CREATE INDEX statement; this approach allows the user to explicitly define the name of the index:
CREATE TABLE <TableName> (Col1 int, Col2 varchar(10), Col3 datetime)
CREATE CLUSTERED COLUMNSTORE INDEX <IndexName> ON <TableName>

• CTAS:
CREATE TABLE <TableName>
WITH (DISTRIBUTION = HASH (Col1), CLUSTERED COLUMNSTORE INDEX)
AS
SELECT * FROM <TableName2>

Note: when CTAS is performed, the ColumnStore index is created before the data is inserted (see below).
• Non-clustered indexes need to be dropped before a Clustered ColumnStore index can be created.
• ColumnStore index creation is memory intensive where data already exists in the table; be aware of Workload Management Resource Classes.

Example CTAS query (total rows: 60,398):
CREATE TABLE FactInternetSales_Copy
WITH (CLUSTERED COLUMNSTORE INDEX, DISTRIBUTION = HASH (SalesOrderNumber))
AS
SELECT * FROM FactInternetSales

Order of operations for the query:
1. Create table FactInternetSales_Copy
2. Create Clustered ColumnStore index on FactInternetSales_Copy
3. Insert into FactInternetSales_Copy from FactInternetSales

• Because the Clustered ColumnStore index is created BEFORE data is inserted, an INSERT of under 16,384 rows per distribution will populate the Delta Store; after the initial CTAS, no records in the resulting table are stored in ColumnStore.
• To compress the Delta Store after the load, issue: ALTER INDEX <IndexName> ON FactInternetSales_Copy REBUILD
• Note: Q_Tables (internal temp tables) never create a Clustered ColumnStore index during data movement.
• Existing SQL Server 2012 DMVs:
  sys.pdw_nodes_column_store_segments
  sys.pdw_nodes_column_store_dictionaries
  sys.pdw_nodes_column_store_row_groups (shown earlier in the INSERT example; also exposes the DELETE row count)
• sys.indexes and sys.index_columns have also been modified to carry ColumnStore information.

ColumnStore in PDW V2 vs. SQL Server 2012
• ColumnStore indexes are now clustered:
  • Only a single index exists for a table.
  • Clustered Index + Non-Clustered ColumnStore is not supported.
  • Clustered ColumnStore + Non-Clustered row store index is not supported.
• Non-Clustered ColumnStore indexes are not supported in PDW V2.
• Full DML (INSERT / UPDATE / DELETE / SELECT) is now supported directly on the ColumnStore; the previous workaround involved maintaining a separate secondary row store table with UNION ALL.
• All existing PDW data types are now supported in ColumnStore indexes:
  • decimal with precision greater than 18 was not supported in SQL Server 2012
  • binary / varbinary was not supported in SQL Server 2012
  • datetimeoffset with scale greater than 2 was not supported in SQL Server 2012
• Query Processing improvements:
  • Batch Mode hash join spill is now implemented; previously this would revert to row mode.
  • Aggregations without GROUP BY are now supported.
• Some limitations still apply:
  • Avoid string data types for filtering or join conditions.
  • Some SQL clauses remain unsupported, for example ROW_NUMBER() / RANK() etc. with OVER (PARTITION BY … ORDER BY …).
• 2 factors contribute to exceptional query performance in PDW V2 (the first is sketched below):
  1. ColumnStore index segment elimination
  2. Batch Mode execution
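A hedged look at segment elimination: each compressed segment carries per-column minimum and maximum metadata, so scans can skip segments whose range cannot match a predicate. The column names below follow SQL Server's sys.column_store_segments catalog view, which this PDW DMV surfaces per node; verify them on your appliance.

-- Inspect per-segment row counts and value ranges.
SELECT column_id, segment_id, row_count, min_data_id, max_data_id
FROM sys.pdw_nodes_column_store_segments;
-- min_data_id / max_data_id bound the (encoded) values stored in each segment;
-- a filter such as WHERE OrderDateKey BETWEEN 20130101 AND 20130131 only reads
-- the segments whose range overlaps the predicate.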
• ColumnStore indexing in PDW V2 is significantly advanced from SQL Server 2012:
  • Single copy of the data
  • Full DML supported
  • Query Processor enhancements
• Be aware of the circumstances in which data is automatically compressed (rather than placed in the Delta Store).
• Server Roles in PDW V2 allow users to be categorized for Workload Management.
• XLargeRC should generally be reserved for memory-intensive maintenance tasks.
Server Roles in PDW V2 allow users to be categorized for Workload Management
•
XLargeRC should generally be reserved for memory intensive maintenance tasks
Introduction of pre-built resource classes in
PDW
A given resource class represents pre-defined
group of appliance resources:
• PDW concurrency slots in use
• Memory utilization
• Priority
User (DBA) controls how different requests are
be mapped to different resource classes.
PDW honors resource class at run-time
Default Behavior
• PDW Concurrency slots in use: 1
• Memory: V1 HW ~200MB, V2 HW ~400MB
• Priority: Medium
MediumRC
• PDW Concurrency slots in use: 3
• Memory: V1 HW – 600MB, V2 HW – 1.2GB
• Priority: Medium
LargeRC
• PDW Concurrency slots in use: 7
• Memory: V1 HW – 1.4GB, V2 HW ~2.8GB
• Priority: High
XLargeRC
• PDW Concurrency slots in use: 21
• Memory: V1 HW ~4.2GB, V2 HW ~8.4GB
• Priority: High
• Resource Governor DMVs were added in PDW V2.
• The existing DMV sys.dm_pdw_exec_requests is extended to include the resource_class the request ran under.
• New DMVs added to PDW V2:
  sys.dm_pdw_resource_waits
  sys.dm_pdw_nodes_resource_governor_resource_pools
  sys.dm_pdw_nodes_resource_governor_workload_groups
• Assigning the correct resource class is vital to the quality of the ColumnStore index created: compare the difference in segment size between the Default and XLargeRC resource classes.
• Members are added to a class with, for example:
… ALTER SERVER ROLE XLargeRC ADD MEMBER charlesf
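Putting those pieces together, a minimal sketch; the login name comes from the slide above, and resource_class on sys.dm_pdw_exec_requests is the extension just noted.

ALTER SERVER ROLE XLargeRC ADD MEMBER charlesf;

-- Index builds that charlesf now runs get the XLargeRC memory grant,
-- yielding segments near the ideal ~1M rows:
ALTER INDEX ALL ON FactInternetSales_Copy REBUILD;

-- Confirm which resource class each request ran under:
SELECT request_id, command, resource_class
FROM sys.dm_pdw_exec_requests;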
Query Relational and Non-Relational Data with Familiar SQL
• Use familiar T-SQL to query data in PDW and data in Hadoop natively.
• Removes the need to learn new query/MapReduce languages.

Other/Previous Versions
• Microsoft Windows-based Hadoop distribution
• SQL Server 2008 R2 PDW had data connectors between PDW and Hadoop

PDW V2 (enhanced PDW query engine):
• External Tables and full SQL query access to data stored in HDFS, using regular T-SQL.
• HDFS bridge for direct & fully parallelized access to data in HDFS: structured data lives in PDW, unstructured data on the HDFS Data Nodes, mapped through External Tables.
• Joining 'on-the-fly' PDW data with data from HDFS.
• Parallel import of data from HDFS into PDW tables for persistent storage.
• Parallel export of PDW data into HDFS, including 'round-tripping' of data.
• Direct parallel data access between PDW Compute Nodes and Hadoop Data Nodes, with parallel data transfers between the PDW appliance and the Hadoop cluster.
• Support for all HDFS file formats; introduces "structure" on the "unstructured" data.
• SQL in, results out; or SQL in, results stored in HDFS.
• Cost-based decision on how much data needs to be pushed to PDW.
• SQL operations on HDFS data are pushed into Hadoop as MapReduce jobs.
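To make the "SQL in, results out" flow concrete, a hedged sketch joining the ClickStream external table defined below with a PDW-resident table; the Customer table and its columns are assumptions, not from the original deck.

-- Join HDFS-resident clicks with a PDW dimension on the fly.
SELECT c.CustomerName, COUNT(*) AS clicks
FROM ClickStream AS cs
JOIN Customer AS c ON c.IPAddress = cs.user_IP   -- Customer is hypothetical
GROUP BY c.CustomerName;

-- Parallel import for persistent storage (CTAS from the external table):
CREATE TABLE ClickStream_PDW
WITH (DISTRIBUTION = HASH (user_IP), CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM ClickStream;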
External tables are mapped to HDFS files on Hadoop. Fields in the file are defined as columns in the PDW external table, and file characteristics are also provided during definition:

CREATE EXTERNAL TABLE ClickEvent
(
    url varchar(50),
    event_date date,
    user_IP varchar(50)
)
WITH
    (LOCATION = 'hdfs://MyHadoop:5000/clickstream/click.txt',
     FORMAT_OPTIONS (FIELD_TERMINATOR = '|'));

• Internal representation of data residing in Hadoop/HDFS.
• Introducing new T-SQL:

CREATE EXTERNAL TABLE table_name ({<column_definition>} [,...n ])
{WITH (LOCATION = '<URI>', [FORMAT_OPTIONS = (<VALUES>)])}
[;]
1. Indicates an 'External' table.
2. Required location of the Hadoop cluster and file (only delimited text files are supported in PDW V2).
3. Optional format options associated with data import from HDFS (e.g. arbitrary field delimiters & reject-related thresholds).
Example – creating an external table. The table is linked to external data located in Hadoop:

CREATE EXTERNAL TABLE ClickStream (url varchar(50), event_date date, user_IP varchar(50))
WITH (LOCATION = 'hdfs://MyHadoop:5000/tpch1GB/employee.tbl',
      FORMAT_OPTIONS (FIELD_TERMINATOR = '|'));

Example – creating an external table as select (CETAS). The data is exported to Hadoop and linked to PDW (the target here is given a distinct name to avoid colliding with the table above):

CREATE EXTERNAL TABLE ClickStreamExport
WITH (LOCATION = 'hdfs://MyHadoop:5000/tpch1GB/employee.tbl',
      FORMAT_OPTIONS (FIELD_TERMINATOR = '|'))
AS SELECT url, event_date, user_IP FROM ClickStream;
HDFS Bridge
• Direct HDFS access, as a functional part of the Data Movement Service (DMS).
• Hides HDFS complexity.
• HDFS file types are supported by use of the appropriate RecordReader interface.
(Diagram: SQL Server + DMS on each PDW node inside the appliance talking directly to HDFS on the Hadoop cluster.)
Query flow against an external table:
1. A query against the external table is executed in PDW; the Control Node (PDW Engine, DMS Manager, Load Manager) distributes the work to the Compute Nodes (SQL Server plus DMS on each).
2. The HDFS Bridge reads data blocks from the Hadoop cluster by using the Hadoop RecordReader interface.
3. Each row is converted for bulk insert (DMS converter) and hashed based on the distribution column.
4. The hashed row is sent (DMS sender) to the appropriate node's receiver for loading.
5. The row is bulk inserted into the destination table (DMS writer).
• Java runtime libraries have to be installed on all Compute Nodes and the Control Node.
  • The JRE requires additional redistribution rights.
  • Installation can be automated by placing the JRE package on the PDW installation media.
• Hadoop connectivity needs to be enabled in PDW (option values: 1 – HDInsight, 2 – Hortonworks, 3 – Cloudera); a sketch follows.
• A list of external tables can be retrieved from the appliance's catalog views.
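A sketch of enabling that connectivity; the 'hadoop connectivity' sp_configure option name is an assumption based on how later PolyBase releases expose it, so verify the exact name and values for your appliance version.

-- Value selects the distribution: 1 = HDInsight, 2 = Hortonworks,
-- 3 = Cloudera (per the list above).
EXEC sp_configure 'hadoop connectivity', 2;
RECONFIGURE;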
Interactively Visualize Petabytes of Data

Other/Previous Versions
• Microsoft released PowerView with SQL 2008 R2
• Microsoft will release in-memory BI by default in Excel 2013 (soon)

PDW V2 adds Direct Query with PowerView.
T-SQL additions to increase compatibility with SQL Server Data Tools, Microsoft BI tools, and 3rd-party tools like Tableau:
• Catalog SP examples: sp_tables_rowset;2, sp_catalogs_rowset, sp_executesql
• SET options: SET ROWCOUNT, SET FMTONLY
• Configuration functions: @@LANGUAGE, @@SPID
• Built-in function examples: db_id, db_name, object_id
• General T-SQL improvement examples: cross/outer apply, sp_prepare, sp_execute
(A short example follows the full lists below.)
Session behavior:
• LOCK_TIMEOUT
• FMTONLY
• ROWCOUNT
• Mixed parameters to be supported
• Calling SPs w/o EXEC
• Named parameter support (@app = 'blah')

System stored procedures:
• SP_EXECUTESQL
• SP_TABLES_ROWSET;2
• SP_CATALOGS_ROWSET
• SP_CATALOGS_ROWSET;2
• SP_VIEW_ROWSET
• SP_PRIMARY_KEYS_ROWSET
• SP_SPECIAL_COLUMNS_100
• SP_COLUMNS_100
• SP_COLUMNS
• SP_STATISTICS
• SP_DATATYPE_INFO_90
• SP_FKEYS
• SP_PROCEDURE_PARAMS_ROWSET
• SP_PROCEDURE_PARAMS_100_MANAGED
• SP_SPROC_COLUMNS
• SP_FOREIGN_KEYS_ROWSET;3
• SP_FOREIGN_KEYS_ROWSET2
• SP_OLEDB_RO_USRNAME
• SP_OLEDB_LANGUAGE
• SP_OLEDB_DEFLANG

Metadata and built-in functions:
• DB_NAME, DB_ID
• OBJECT_ID, OBJECT_NAME
• SCHEMA_ID, SCHEMA_NAME
• COLLATIONNAME, COLLATIONPROPERTY
• SERVERPROPERTY (**), DATABASEPROPERTYEX (**)
• OBJECTPROPERTY, OBJECTPROPERTYEX
• INDEXPROPERTY, COL_NAME
• HAS_DBACCESS, USER_NAME, USER, SUSER_SNAME
• @@LANGUAGE, @@SPID, STR

System views:
• MASTER_FILES
• SYSTYPES, SYSOBJECTS, SYSUSERS
• ASSEMBLIES, ASSEMBLY_MODULES, ASSEMBLY_TYPES
• NUMBERED_PROCEDURES

General T-SQL:
• CROSS/OUTER APPLY
• SP_PREPARE, SP_EXECUTE, SP_UNPREPARE
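A few of these additions in action, as a hedged sketch; PDW's surface area is narrower than box SQL Server, and the fact table and its columns are illustrative.

-- sp_executesql with named parameters:
EXEC sp_executesql
    N'SELECT db_name() AS current_db, @@SPID AS session_id, object_id(@t) AS obj_id',
    N'@t nvarchar(128)',
    @t = N'FactInternetSales';

-- CROSS APPLY, newly supported:
SELECT f.SalesOrderNumber, x.half_qty
FROM FactInternetSales AS f
CROSS APPLY (SELECT f.OrderQuantity / 2 AS half_qty) AS x;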
Targeted driver compatibility by PDW tool:
• SSRS/Reporting Services: SS2012 (Report Builder and SSDT), SS2008 R2 (Report Builder)
• SSAS/Analysis Services: SS2012, SS2008 R2
• Linked Server (DQ): SS2008
• SSIS/Integration Services: SS2012, SS2008 R2
• PowerPivot for Excel 2010
• Power View: SS2012, with and without direct query
• MS BI Direct Query: Excel 2010 (Direct Query), Access 2010
• Master Data Services: SS2012, SS2008
• Quality Services: SS2012, SS2008
Drivers covered: SNAC11 (OLE DB, ODBC), SNAC10 (OLE DB, ODBC) and .NET (SqlClient), each in 32-bit and 64-bit.
(The per-cell X / n/a support matrix did not survive this transcript.)
Lowest List Price; Lowest Compressed Price / Performance
• Lower hardware cost (removal of the SAN)

Other/Previous Versions
• SQL Server 2008 R2 PDW was already one of the lower-cost options
Any data, any size, anywhere
• Any Size
  • xVelocity (memory-optimized columnstore index) for 10-100x improvements
  • Data compression (7x) reducing storage costs and data scans
  • Start at a quarter rack for departmental deployments
  • Linear scale up to 56 nodes and 6 petabytes of storage
  • Fault tolerance for ETL and disaster recovery
• Any Data
  • Run queries using regular T-SQL on both structured and unstructured data
  • Query "big data" (Hadoop) directly from within PDW
  • Interactively visualize petabytes of data (PowerView direct query)

Agility and Simplicity at Lowest Cost
• Agility due to virtualization: the entire appliance is now virtualized and runs Windows Server 2012
• Management simplicity: Workload Manager; integrates with System Center Operations Manager
• Lowest cost: lowest list price and lowest compressed price / performance; SQL Server compatibility (SQL Server Data Tools, increased T-SQL support, SSIS 2012)
Related sessions:
• Wednesday June 5 @ 3:15 PM / Artin Avanes
• Wednesday June 5 @ 3:15 PM / Jeff Spiller
• Tuesday June 4 @ 8:30 AM / Brian Mitchell

Resources:
http://channel9.msdn.com/Events/TechEd
www.microsoft.com/learning
http://microsoft.com/technet
http://microsoft.com/msdn