Transcript SQL201 - Microsoft SQL Server 2008 R2
Mark Souza Director Microsoft SQL Server
SQL201 - Microsoft SQL Server 2008 R2
SQL Server 2008 – Strong Release
The SQL Server 2008 R2 Journey
The origins of Kilimanjaro Self-service Business Intelligence Application & Multi-server Management Scaling for the next generation enterprise High End Scale out Data Warehouses CEP – Complex Event Processing Reaching the summit
Project “Gemini” Excel Add-in Report Builder 3.0
StreamInsight, Complex Event Processing Master Data Services SharePoint Publishing Application & Multi-Server Management Project “Gemini” SharePoint Management Console StreamInsight .Net Extensions Enterprise-level security, scalability Supports up to 256 Logical Processors SQL Server System Preparation Enhanced Data Compression
Solid Foundation for Enterprise Workloads Project “Madison”
MPP support for 100+ terabyte data warehouses Appliance-like data warehouse on industry standard hardware
Better Together with Windows Server
Hyper-V™ Live Migration Support for largest Windows Server hardware
The SQL Server 2008 R2 Journey
The origins of Kilimanjaro Self-service Business Intelligence Application & Multi-server Management Scaling for the next generation enterprise High End Scale out Data Warehouses CEP – Complex Event Processing Reaching the summit
What's in a name… Gemini - Gemini (pronounced /ˈgɛmɪnaɪ/ , Latin :
twins
, symbol
♊
) is one of the constellations of the zodiac known as "the twins"
The corporate Twins: IT Pro/End User
A widening gap between end user and IT needs
I’m not exactly sure what I need but I know I need it now… If I help this time I’m stuck maintaining it forever…
•
End Users
: – Access to corporate data – Mix in their own data – – Aggregate, augment data Organize, present solutions – Share insights with others • IT Professionals: – Know data is secure – Know data is consistent – – Keep systems running Keep the cost down – Track data access & usage
There need not be an end-user versus IT conflict or gap in meeting user needs The gap is caused by lack of enabling technology, heavy “app lifecycle” costs
The Challenges
Data warehouses do not cover all data or all users New formal BI solutions need time and resources Diverse users have diverse data needs Bottleneck Ad-hoc requests stress I.T. capacity Chaos Power users bypass I.T. with unsanctioned sources
Gemini: Uniting the Twins Re-draws the line between I.T. and end-user roles
• Directly model • Analyze • Personalize • Share data Empowered to create without IT dependence • Provision • Administer • Secure • Track data Managing compliance and resources without user obstruction
Excel is key for IW/Users
“It has to be Excel” “We don’t get OLAP & dimensional models” “What is data modeling anyway?” “Just make my Excel better" Use Excel as a catch all tool to Collect data Clean, prepare and integrate it Enrich and Analyze Create reports and visualizations Share them with others Easy sharing of insights is critical Each power user publishes data to 10’s-100’s consumers IT needs to know !
SNEAK PEAK
NICHOLAS DRITSAS PROGRAM MANAGER SQL SERVER PRODUCT TEAM
IT manage the "Spreadmarts"
• Excel is the IW tool of choice, but for IT: • Excel is a problem - “unmanageable” • Excel is an addiction – users “can’t quit it” • Why not make Excel part of the solution?
• Include Excel as part of a complete BI solution • Structured and manageable • Give IT insight into its usage • Provide IT with the technology to • Have insight and management • Become a strategic differentiator • Without being a bottleneck • Enable managed Self-Service
The SQL Server 2008 R2 Journey
The origins of Kilimanjaro Self-service Business Intelligence Application & Multi-server Management Scaling for the next generation enterprise High End Scale out Data Warehoues CEP – Complex Event Processing Reaching the summit
Challenges: People vs. Hardware
1990 2000 Underutilized hardware Overburdened Administrators 2010 Trends Database apps increasing at a higher rate than DBAs Overburdened DBAs Hardware computing capacity exploding Underutilized hardware
Introducing a better way
Today Tomorrow Control server sprawl with 1 to many management – setup is fast and easy Manage capacity through policies – save time , optimize resources Single unit of deployment – increase deployment and upgrade efficiency
Key Concepts
Data-Tier Application Component (DAC)
Think of this as the new unit of deployment for T-SQL apps and providing similar benefits of a MSI in a very general sense. There is a definition of all the parts that make up the app along with services such as Install, Uninstall, Upgrade, and eventually Repair.
Logical
Tables, Views, Constraints, SProcs, UDFs Users, Logins
DAC Physical
Indexes, Partitions FileGroups …
DAC Deployment Profile
Deployment Requirements, Management Policies, Failover Policies
Unit of Deployment
• • •
Data-Tier Application Unit (DAU)
Think of this as the overall unit of management. Or the deployed instance of a DAC Maps to a plain database in KJ. In SQL 11, a CDB - a more self-contained database (with additional dependent objects). Provides namespace and resource isolation.
SQL Server Manageability
DAU – (C)DB Schema
Tables, Views, Constraints, SProcs, UDFs, Users, Logins Indexes, Partitions, FileGroups DAC Confidential – Internal Use Only
Properties & Metadata
Deployment Requirements, Management Policies, Failover Policies
Unit of Management
17
Key Concepts (continued..)
• •
Utility Control Point (UCP)
Think of this as the central reasoning point of the utility. From here operations such as policy evaluation, discovery, deployment, impact, and what if analysis can be performed.
• • •
Connection Virtualization (Medusa)
Think of this as DNS for connection strings Decouples application from the physical location of DAU (CDB) Uses Active Directory (KJ). DBA Management Studio
UCP
Managed Instances SQL Server Manageability Confidential – Internal Use Only 18
Key Benefits
Control • Optimization • Efficiencies
Gain Visibility and Control
New wizards in SSMS – fast and easy setup Create a Control Point Enroll instances Insights refreshed every 15 minutes Management Studio Database Administrator Microsoft Confidential—Preliminary Information Subject to Change SQL Server Control Point
Key Benefits
Control • Optimization • Efficiencies
Improve Resource Optimization
At-a-glance views for insights ID consolidation opportunities Quickly drill-down to detailed views Simple UI for policy adjustments Microsoft Confidential—Preliminary Information Subject to Change
•
Application & Multi-Server Management
• •
Creating the UCP Insights – Health Check
Key Benefits
Control • Optimization • Efficiencies
Improve Efficiencies
Single unit of deployment Integration with Visual Studio Streamlined deployments & upgrades Client “Finance” Data-Tier Developer Management Studio Database Administrator Central management Microsoft Confidential—Preliminary Information Subject to Change
Application & Multi-Server Management
•
Creating the DAC
•
Migrating the DAC
Application & Multi-server Management
Productive database application development and management via Introduction of new Database Application Components (DAC) Application of Policy Based Administration to DACs Intellisense integration with Visual Studio Ability to version, deploy and reverse engineer a DAC Multi-server Management made easier through DAC experiences integrated with Management Studio and Visual Studio Import and Export of database application artifacts Support for reverse engineering a DAC from down-level systems Deployment to one or more target systems Monitoring of multiple instances of a database application on several servers via Management Studio
The SQL Server 2008 R2 Journey
The origins of Kilimanjaro Self-service Business Intelligence Application & Multi-server Management Scaling for the next generation enterprise High End Scale out Data Warehouses CEP – Complex Event Processing Reaching the summit
The Data Warehouse scale journey
Project Madison Massive scale-out to 100’s TB
FastTrack Reference Architecture – 10s TB Easier, predictable and cost effecient
Massive Scale-out
10s of TB
Fast Track DW
Appliance-like time to value Flexibility through choice of HW platforms Low TCO through commodity hardware and value pricing. Reduced risk through pre-tested and pre-tuned configurations Provides a clear upgrade path to “Madison” via Hub/Spoke Microsoft Confidential—Preliminary Information Subject to Change 27
Scale out Data Warehousing
Massively Parallel Processing
MPP True MPP, Shared Nothing Architecture Server/CPU’s have their own dedicated resources Secret Sauce is MPP Query Optimizer supporting Parallel operations Lightning-fast Queries, Data Loads And Updates Linear Scalability Lower TCO- Reduced DBA time
High-Level Madison Architecture
Control Rack Control Node
Active/Passive
Client Drivers ETL Load Interface Corp. Backup Solution Landing Zone Backup Node Management Node
Active/Passive
Data Rack Database Server Nodes Storage Nodes Compute Node Spare Node
30
Customer C-CUSTOMER_SK C_CUSTOMER_ID C_CURRENT_ADDR … Customer Demographics CD_DEMO_SK CD_GENDER CD_MARITAL_STATUS CD_EDUCATION …
Database Tables
Date Dim D_DATE_SK D_DATE_ID D_DATE D_MONTH … Item I_ITEM_SK I_ITEM_ID I_REC_START_DATE I_ITEM_DESC …
Store Sales Ss_sold_date_sk Ss_item_sk Ss_customer_sk Ss_cdemo_sk Ss_store_sk Ss_promo_sk Ss_quantity …
Promotion P_PROMO_SK P_PROMO_ID P_START_DATE_SK P_END_DATE_SK … Store S_STORE_SK S_STORE_ID S_REC_START_DATE S_REC_END_DATE S_STORE_NAME …
Madison Appliance Nodes
Large Tables Are Hash Distributed Smaller Tables Are Replicated D C I
SS
CD P S D C I
SS
CD P S D C I
SS
CD P S D C I
SS
CD P S D C I
SS
CD P D S C I
SS
CD P S
DBA Work Made Easy
Create Database
The SQL Server 2008 R2 Journey
The origins of Kilimanjaro Self-service Business Intelligence Application & Multi-server Management Scaling for the next generation enterprise High End Scale out Data Warehouses CEP – Complex Event Processing Reaching the summit
What Is CEP?
Complex Event Processing (CEP) is the continuous and incremental processing of event streams from multiple sources based on declarative query and pattern specifications with near-zero latency. Query Paradigm Latency
Database Applications
Ad-hoc queries or requests Seconds, hours, days
Event-driven Applications
Continuous standing queries Milliseconds or less Data Rate Hundreds of events/sec Tens of thousands of events/sec or more
response request Event input stream output stream
Microsoft’s CEP Solution
Data Sources, Operations, Assets, Feeds, Sensors, Devices
Input Data Streams Monitor & Record Mine & Design
CEP Engine
f(x) f'(x) g(y) h(x,y) Input Data Streams Output Data Streams Manage & Benefit
Operational Data Store & Archive Results CEP Engine
f(x) f'(x) g(y) h(x,y)
CEP Deployment Alternatives
Devices CEP Web servers
Data Sources
CEP
Aggregation & Correlation
CEP CEP Sensors CEP Feeds CEP
Event processing engines are deployed at multiple places on different scales • At the edge – close to the data source • In the mid-tier – consolidate related data sources • In the data center – historical archive, mining, large scale correlation
CEP CEP CEP
Complex Analytics & Mining
CEP CEP CEP CEP for lightweight processing and filtering CEP for aggregation and correlation of in-flight events CEP for complex analytics including historical data
LINQ Query Examples
LINQ Example – JOIN, PROJECT, FILTER:
from join e1 in e2 in on e1.ID equals e2.ID
where e1.f2 = “foo” select MyStream1 MyStream2 new { e1.f1, e2.f4 };
LINQ Example – GROUP&APPLY, WINDOW:
from group e3 in e3 by MyStream3 e3.i into SubStreams from s4 in SubStreams from e4 in s4.SlidingWindow(FiveMinutes,ThreeSeconds) select new { pl = new MyNewPayload (e4.i, e4.f)};
Join Filter Project Grouping Window
Recap: CEP Platform from Microsoft
CEP Application
Development experience with .NET, C#, LINQ and Visual Studio 2008
Development
Event sources CEP platform from Microsoft to build event driven applications Event targets Event Event traditional database Event Event Event continuous, consume and Event Event results incrementally Flexible adapter SDK with high performance to connect to different event sources and sinks C_ID C_NAME C_ZIP
`
The CEP platform does the heavy lifting for you to deal with temporal characteristics of event stream data
Static reference data