Transcript SQL Server
SQL Server: A Data Platform for Large-Scale Applications José Blakeley, Software Architect Database Systems Group, Microsoft Corporation SQL Server Design Philosophy To solve 90+% of the problems automatically Self Tuning Dynamic Optimization Self Configuration Self Management Self Healing The remaining percentage takes higher touch Requires careful design for scalability Good knowledge of how the RDBMS platform works This talk describes how we are codifying our solution for large-scale DW into SQL Server 2 UCI ISG Seminar 1/8/2010 SQL Data Platform Any Place, Any Type, Any Scale Enterprise Data Platform Beyond Relational Dynamic Development Pervasive Insight 4 UCI ISG Seminar 1/8/2010 Outline SQL Server Data Platform Basic concepts – common ground Assumptions Workloads SQL Server Parallel DW DBMS Philosophy System Architecture Software Architecture Summary 5 UCI ISG Seminar 1/8/2010 Workload Types Online Transaction Processing (OLTP) Balanced read-update ratio (60%-40%) Fine-grained inserts and updates OLTP =throughput Day-to-day e.g., 10s K/s High transaction business Usually very short transactions e.g., 1-3 tables Sometimes multi-step e.g., financial Relatively small data sizes e.g., few TBs Data Warehousing and Business Analysis (DW) 7 Read-mostly (90%-10%) Few updates in place, high-volume bulk inserts Concurrent query throughput e.g., 10sDW K / hr = Analysis over Per query response time < 2 s recorded data Snowflake, star schemas are common e.g., 5-10 tables Complex queries (filter, join, group-by, aggregation) Very large data sizes e.g., 10s TB - PB UCI ISG Seminar 1/8/2010 436,892,631 rows Star Join Schema TECSPURSL00 (dbo) (Read-only) FTxlatOrgId CTxlatOrgId TRCreditedSubsidiaryId BillingCurrencyID 34 rows Region (dbo) (Read-only) ProductId DataSourceId FiscalYear (dbo) (Read-only) FiscalYearId FiscalYearName ManagementReportingId RecordTypeId RegionId SalesDateId AreaId BillingMonthSalesDateID RegionCode SELECT FE0.RegionName, RegionName FL0.FiscalYearName, SUM (A.ActualRevenueAmt) FROM TECSPURSL00 A JOIN SalesDate L ON A.SalesDateID = L.SalesDateID UpperGeography (dbo) (Read-only) JOIN UpperGeography UG CreditedSubsidiaryID CreditedSubRegionID ON A.TRCreditedSubsidiaryId = CreditedRegionID UG.CreditedSubsidiaryID CreditedAreaID CreditedBigAreaID JOIN Region FE0 ON UG.CreditedRegionID = FE0.RegionID 118 rows JOIN FiscalYear FL0 ON L.FiscalYearID = FL0.FiscalYearID GROUP BY FE0.RegionName, FL0.FiscalYearName 41 rows ActualQuantityCnt ActualLicenseCnt SecondaryLicenseCnt AdjustedLicenseCnt ActualRevenueAmt AdjustedGrossRevenueAmt JointVentureRevenueAmt LicenseTransactionItemId GeographyId JointVentureActualQuantityCnt JointVentureActualLicenseCnt JointVentureSecondaryLicenseCnt BillDocTypeID SalesOfficeID ReasonID BillingStatusID TopParentProductID ParentProductID BundleFlagID ExtractListID ServiceRevenueAmt ServiceActualQuantityCnt ServiceAdjustedRevenueAmt SalesDate (dbo) (Read-only) SalesDateId FiscalWeekID FiscalMonthID FiscalQuarterID FiscalYearID FiscalQuarterName FiscalMonthName FiscalYearName CalendarMonthName FWBeginDate FWEndDate CalendarDate CalendarDateName MonthRelativeID FMBeginDate FMEndDate TransactionDataPopulated FiscalSemesterID FiscalSemesterName AggregatedSalesDateID RevenueCommitmentMultiplier LicenseCommitmentMultiplier CommitmentPeriod DataSourcePlanningFlag AdvisorID 8 UCI ISG Seminar OrderStatusID 13,517 rows 1/8/2010 7/7/2015 Why look at workloads? $B 8 6 OLTP Mixed DW 4 2 1995 9 2000 2005 2010 DBMS architectures of 30 years ago designed for mixed workloads OLTP and DW workloads have grown into big market segments Workload-specific engines can be profitable Innovations are driven by deeper understanding of workloads UCI ISG Seminar 1/8/2010 Changing TPC-H Landscape 68x perf 12% cost New DW startups disrupting TPC-H Column-store, compression, and scale-out 10 UCI ISG Seminar 1/8/2010 Data Warehousing SQL Server 2008 Provides out-of-the-box SMP scale for ~10 TB Many successful apps for 10-100 TB running today via custom scale-out Data compression – row and page Star join, bitmap filters, partitioned table parallelism Minimally Logged INSERT MERGE Resource governor – CPU and memory E.g., Danske Bank, Clalit Health, US Dep. of Agriculture, PanSTARRS Building out-of-the-box scale-out for 10s-100s of TB This is the focus of this talk 11 UCI ISG Seminar 1/8/2010 Outline SQL Server Data Platform Basic concepts – common ground Assumptions Workloads SQL Server Parallel DW DBMS Design Philosophy Hardware Architecture Software Architecture Summary 12 UCI ISG Seminar 1/8/2010 Design Philosophy Shared-nothing, distributed, parallel DBMS Appliance concept Implicit data and function partitioning Software + hardware solution Optimized for DW workloads Scalable to Petabytes 13 UCI ISG Seminar 1/8/2010 Appliance Concept Parallel DW DBMS System Software INDUSTRY STANDARD SERVERS Reference Hardware Platforms INDUSTRY STANDARD NETWORKING INDUSTRY STANDARD STORAGE 14 UCI ISG Seminar 1/8/2010 Hardware Architecture Compute Nodes Control Nodes SQL Active / Passive SQL SQL SQL SQL SQL SQL ETL Load Interface Corporate Backup Solution 15 SQL Dual Fiber Channel Data Center Monitoring SQL Dual Infiniband Client Drivers (ODBC, OLEDB, ADO.NET) Spare Compute Node 1 Rack Server Appliance UCI ISG Seminar 1/8/2010 Node Types Control node: Where clients apps connect Parallel engine runs here Contains system metadata Store user data Perform query execution Not accessible to outside world Landing Zone: Staging place for data loading Accessible to outside world Can be augmented with 3rd party HW and SW Backup node: Compute nodes: Backup file storage Accessible to outside world Can be augmented with 3rd party HW and SW Management node: Windows domain controller (Active Directory) SW upgrades staging place Holds SW images in case a node needs reimaging Software Architecture Query Tool MS BI (AS, RS) DWSQL Other 3rd Party Tools Internet Explorer IIS Data Access (OLEDB, ODBC, ADO.NET, JDBC) Admin Console Compute Node Compute Nodes Compute Nodes Data Movement Service Data Movement Service MPP Engine Coordinator User Data SQL Server Landing Zone Node DW Authentication DW Configuration DW Schema TempDB Data Movement Service SQL Server Control 17 Node UCI ISG Seminar 1/8/2010 Key Components MPP Engine Coordinator Provides single system image SQL compilation Global metadata and appliance configuration Global query optimization and plan generation Global query execution coordination Global transaction coordination Authentication and authorization Supportability (HW and SW status info) Parallel Loader Run from the Landing Zone SSIS or command line tool Parallel Database Copy High performance data export Enables Hub-Spoke scenarios Parallel Backup/Restore Backup files stored on Backup Nodes Backup files may be archived into external device/system Data Movement Service 18 Data movement across the appliance Distributed query execution operators UCI ISG Seminar 1/8/2010 Query Processing SQL statement compilation Builds an MPP execution plan A set of parallel QE steps Executes the plan Parsing, validation, optimization Coordinates workflow among steps Assembles the resultset Returns resultset to client Example – Schema TPCH --------------------------------------------------------------------------------- Customer Table -- distributiond on c_custkey -------------------------------------------------------------------------------CREATE TABLE customer ( c_custkey bigint, c_name varchar(25), c_address varchar(40), c_nationkey integer, c_phone char(15), c_acctbal decimal(15,2), c_mktsegment char(10), c_comment varchar(117)) WITH (distribution=hash(c_custkey)) ; --------------------------------------------------------------------------------- Orders Table -------------------------------------------------------------------------------CREATE TABLE orders ( o_orderkey bigint, o_custkey bigint, o_orderstatus char(1), o_totalprice decimal(15,2), o_orderdate date, o_orderpriority char(15), o_clerk char(15), o_shippriority integer, o_comment varchar(79)) WITH (distribution=hash(o_orderkey)) ; 20 --------------------------------------------------------------------------------- LineItem Table -- distributiond on l_orderkey -------------------------------------------------------------------------------CREATE TABLE lineitem ( l_orderkey bigint, l_partkey bigint, l_suppkey bigint, l_linenumber bigint, l_quantity decimal(15,2), l_extendedprice decimal(15,2), l_discount decimal(15,2), l_tax decimal(15,2), l_returnflag char(1), l_linestatus char(1), l_shipdate date, l_commitdate date, l_receiptdate date, l_shipinstruct char(25), l_shipmode char(10), l_comment varchar(44)) WITH (distribution=hash(l_orderkey)) ; UCI ISG Seminar 1/8/2010 Example - Query SELECT FROM WHERE GROUP ORDER TOP 10 L_ORDERKEY, SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE, O_ORDERDATE, O_SHIPPRIORITY CUSTOMER, ORDERS, LINEITEM C_MKTSEGMENT = 'BUILDING' AND C_CUSTKEY = O_CUSTKEY AND L_ORDERKEY = O_ORDERKEY AND O_ORDERDATE < '1995-03-05' AND O_ORDERDATE >= '1994-09-15 00:00:00.000' BY L_ORDERKEY, O_ORDERDATE, O_SHIPPRIORITY BY REVENUE DESC, O_ORDERDATE ; 21 UCI ISG Seminar 1/8/2010 Example – Execution Plan ------------------------------- Step 1: create temp table at control node -----------------------------CREATE TABLE [tempdb].[dbo].[Q_[TEMP_ID_664]] ( [l_orderkey] BIGINT, [REVENUE] DECIMAL(38, 4), [o_orderdate] DATE, [o_shippriority] INTEGER ); ------------------------------- Step 2: create temp tables at all compute nodes -----------------------------CREATE TABLE [tempdb].[dbo].[Q_[TEMP_ID_665]_[PARTITION_ID]] ( [l_orderkey] BIGINT, [l_extendedprice] DECIMAL(15, 2), [l_discount] DECIMAL(15, 2), [o_orderdate] DATE, [o_shippriority] INTEGER, [o_custkey] BIGINT, [o_orderkey] BIGINT ) WITH ( DISTRIBUTION = HASH([o_custkey]) ); -------------------------------- Step 3: SHUFFLE_MOVE -------------------------------SELECT [l_orderkey], [l_extendedprice], [l_discount], [o_orderdate], [o_shippriority], [o_custkey], [o_orderkey] FROM [dwsys].[dbo].[orders] JOIN [dwsys].[dbo].[lineitem] ON ([l_orderkey] = [o_orderkey]) WHERE ([o_orderdate] < '1995-03-05' AND [o_orderdate] >= '1994-09-15 00:00:00.000') INTO Q_[TEMP_ID_665]_[PARTITION_ID] SHUFFLE ON (o_custkey); 22 ------------------------------- Step 4: PARTITION_MOVE -----------------------------SELECT [l_orderkey], sum(([l_extendedprice] * (1 - [l_discount]))) AS REVENUE, [o_orderdate], [o_shippriority] FROM [dwsys].[dbo].[customer] JOIN tempdb.Q_[TEMP_ID_665]_[PARTITION_ID] ON ([c_custkey] = [o_custkey]) WHERE [c_mktsegment] = 'BUILDING' GROUP BY [l_orderkey], [o_orderdate], [o_shippriority] INTO Q_[TEMP_ID_664]; ------------------------------- Step 5: Drop temp tables at all compute nodes -----------------------------DROP TABLE tempdb.Q_[TEMP_ID_665]_[PARTITION_ID]; -------------------------------- Step 6: RETURN result to client -------------------------------SELECT TOP 10 [l_orderkey], sum([REVENUE]) AS REVENUE, [o_orderdate], [o_shippriority] FROM tempdb.Q_[TEMP_ID_664] GROUP BY [l_orderkey], [o_orderdate], [o_shippriority] ORDER BY [REVENUE] DESC, [o_orderdate] ; -------------------------------- Step 7: Drop temp table at control node -------------------------------DROP TABLE tempdb.Q_[TEMP_ID_664]; UCI ISG Seminar 1/8/2010 Other Important Functionality Fault tolerance All HW components have redundancy: CPUs, Disks, networks, power, storage processors Control and Compute nodes use failover clustering Management nodes have active & standby Integration with SQL Server BI tools 23 SS Integration Services (ETL) has PDW as a destination SS Analysis Services (OLAP) has PDW as a source SS Reporting Services Excel UCI ISG Seminar 1/8/2010 Future Challenges Richer DW analysis Richer hub-and-spoke configurations Execution strategies (DW) Optimization techniques (DW) Integration with other data services Isolation levels, deadlocks, logs Distributed, parallel query processing MPP to MPP Distributed transactions Map-reduce-like functionality inside the cluster Data mining, embedded analytics Streaming Increased HW architecture choices 24 Low-power clusters UCI ISG Seminar 1/8/2010 Summary SQL Server Data Platform overview Workload types SQL Server Parallel DW DBMS Design Philosophy System Architecture Software Architecture Query Example Future challenges 25 UCI ISG Seminar 1/8/2010 Resources Microsoft SQL Server 2008 case studies SQL Server best practices http://technet.microsoft.com/en-us/sqlserver/bb671430.aspx Dave Salch, Eric Kraemer, Umair Waheed, Paul Dyke, Fast Track Data Warehouse 2.0 Architecture, SQL Server Technical Article, MSDN, Nov. 2009. SQL Server Customer Advisory Team http://www.microsoft.com/sqlserver/2008/en/us/case-studies.aspx http://sqlcat.com/Default.aspx Research on balanced HW architectures 26 Yogesh Simmhan, et. al. GrayWulf: Scalable Software Architecture for Data Intensive Computing, HICSS pp.1-10, 42nd Hawaii International Conference on System Sciences, 2009 Alexander S. Szalay, et al., Low Power Amdahl-Balanced Blades for Data Intensive Computing, SC 2009. UCI ISG Seminar 1/8/2010 THANKS! 27 UCI ISG Seminar 1/8/2010