Data Management in the Cloud
Agenda
• Cloud fundamentals
> Market view
> Cloud vendors
• Getting data in and out
• Cloud configurations
• A Day in the Clouds
• Security
2 > 7/17/2015
It’s “Good Enough”
What is Cloud Computing?
• Essential characteristics
> On-demand self-service
> Broad network access
> Resource pooling [virtual]
> Rapid elasticity
> Measured Service [pay per use]
• Service models
> Software-as-a-Service (SaaS)
> Platform-as-a-Service (PaaS)
> Infrastructure-as-a-Service (IaaS)
• Deployment models
> Private cloud
> Public cloud
> Hybrid cloud
Source: Draft NIST Working Definition of Cloud Computing, 8-21-09, version 15
http://csrc.nist.gov/groups/SNS/cloud-computing/index.html
Primary Cloud Computing Services
• Infrastructure-as-a-Service
> Rent-a-server
> Rent-a-disk
• Software-as-a-Service
> Rent-a-seat
> Or free
• Platform-as-a-Service
> Middleware stacks
> Web services for hire
> Developer tools
Public and Private Clouds
Public Cloud
Internet
Line of Business users
SaaS
Elasticity and no CAPEX
Company LAN
Firewall
Private Cloud
Virtualization
SLAs
Cloud User Surveys – Adoption Areas
Q: Rate your likelihood to pursue the cloud model for the following
Collaboration applications: 67.3%
Web applications/Web serving: 66.9%
Data Back-up or Archive services: 59.4%
Business apps (CRM, HR, ERP): 55.6%
Personal productivity apps: 55.1%
Data/Content Distribution services: 54.8%
Storage capacity on demand: 52.9%
IT Management software: 51.3%
Server capacity on demand: 50.6%
Business Intelligence/Analytics: 49.8%
Application dev/test/deploy platform: 49.1%
IT/Information Security: 48.6%
Source: IDC Enterprise Panel, 3Q09, n = 263, September 2009
Cloud User Surveys - Benefits
Q: Rate the benefits commonly ascribed to the 'cloud'/on-demand model
Pay only for what you use: 77.9%
Easy/fast to deploy to end-users: 77.7%
Monthly payments: 75.3%
Encourages standard systems: 68.5%
Requires less in-house IT staff, costs: 67.0%
Always offers latest functionality: 64.6%
Sharing systems with partners simpler: 63.9%
Seems like the way of the future: 54.0%
(Scale: 1 = Not at all important 5 = Very Important)
Source: IDC Enterprise Panel, 3Q09, n = 263, September 2009
Cloud User Surveys - Challenges
Q: Rate the challenges/issues of the 'cloud'/on-demand model
Security: 87.5%
Availability: 83.3%
Performance: 82.9%
On-demand payment model may cost more: 81.0%
Lack of interoperability standards: 80.2%
Bringing back in-house may be difficult: 79.8%
Hard to integrate with in-house IT: 76.8%
Not enough ability to customize: 76.0%
Source: IDC Enterprise Panel, 3Q09, n = 263, September 2009
Cloud Vendors
Magic Quadrant for Web Hosting and Hosted Cloud
System Infrastructure Services (On Demand), 2009
Public Cloud Vendors at a Glance
Company | Hypervisor | OS | Service Contract | SLA
Flexiscale (EMEA) | Xen | Windows, CentOS, Debian, Ubuntu | TBD | 100%
Amazon | Xen | Windows, Red Hat, Fedora, OpenSolaris, OpenSUSE, Debian, Ubuntu, Gentoo | None | 99.95%
AT&T Synaptic | VMware | Windows, Red Hat | Annual | 99.7%
GNi Hosting | Xen, VMware, Microsoft | Windows, Red Hat, CentOS, Debian, Gentoo, Ubuntu | Monthly | 100%
IBM | Xen, VMware | Windows, Red Hat, CentOS, SUSE, AIX | Annual | Depends on location
RackSpace | Xen | Red Hat, Fedora, CentOS, Debian, Ubuntu, Gentoo | None | 100%
Savvis | VMware | Windows, Red Hat, Solaris 10 and x86 | Monthly | 99.9% up to 99.99%
GoGrid | Xen | Windows, CentOS, Red Hat | None | 100% with paybacks
High Availability is vastly different between vendors
Source: InfoWeek, From Amazon To IBM, What 12 Cloud Computing Vendors Deliver, Sep 5, 2009
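The SLA percentages in the vendor comparison are easier to compare once they are converted into permitted downtime. The following is a minimal sketch of that arithmetic, assuming a 365-day year and an SLA measured over the whole year; vendors may define measurement windows and exclusions differently, and the function name is illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, SLA measured annually


def allowed_downtime_hours(sla_percent: float) -> float:
    """Return the hours of downtime per year that an uptime SLA still permits."""
    if not 0.0 <= sla_percent <= 100.0:
        raise ValueError("SLA must be a percentage between 0 and 100")
    return HOURS_PER_YEAR * (1.0 - sla_percent / 100.0)


if __name__ == "__main__":
    # SLA figures that appear in the vendor comparison
    for sla in (100.0, 99.99, 99.95, 99.9, 99.7):
        hours = allowed_downtime_hours(sla)
        print(f"{sla:6.2f}% uptime -> {hours:6.2f} hours/year of downtime")
```

Under these assumptions a 99.7% SLA permits roughly 26 hours of downtime a year, against about 4.4 hours at 99.95%.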
Cloud Formations
Public Cloud Vendor Configurations
HP/Dell X64 4-16 CPUs, 64GB RAM
static IP address with VM, VLAN
launch instance
save bundle
Virtual Machine Images
Load balancer
“Ephemeral”
GigE LAN switches/routers
SAN
NAS
SATA disks
RackSpace Files
RackSpace Network Attached Storage
Amazon Elastic Block Storage
Amazon Simple Storage Service
Server Consolidation using VMware
Virtual Machines
Linux
Large Intel Server
Storage Area Network
Private Clouds through Virtualization
users
Virtual Control Layer
Internal Private Cloud (My Data Center A)
Internal Private Cloud (My Data Center B)
Public Cloud
Clouds Use Any Commodity Hardware
My momma always said,
“Clouds are like a box of chocolates.
You never know what you're gonna get.”
Forrest Gump (1994)
Public and Private Clouds
• You get any available server and storage
> Mostly SANs
> Selectable CPU speed and memory size
> Often unknown
– Memory bus speed
– RAID configuration
– Disk rotation speeds and capacities
• Performance will vary – a lot!
> But is it good enough?
• In contrast, EDW appliances are carefully balanced
> Ratio of IOPS to memory size/speed to CPU size/speed
See http://www.cloudsleuth.net
New Skills Needed for Clouds
• “Any hardware” configuration planning
• Strong performance analysis and problem isolation
> Is it the commodity disk subsystem?
> Is it the virtual machine tax?
> Who else is sharing these resources?
> Is it the BI Tools server or database server?
BI-DW Configurations
Generic DI/EDW/BI Data Flow
Operational Data (OLTP, ERP) -> ETL/ELT (staging) -> Database (EDW/mart) -> BI Tools/Applications
System Management, Metadata, Security, Developer Tools
On premises, inside the firewall
Data Mart in the Cloud Data Flow
Operational Data (OLTP, ERP) -> ETL/ELT (staging) -> Database (mart) -> BI Tools/Applications
System Management, Metadata, Security, Developer Tools
Either public or private cloud
On premises, inside the firewall
ETL and Data Cleansing in the Cloud
Operational Data (OLTP, ERP) -> ETL/ELT (staging) -> Database (mart) -> BI Tools/Applications
System Management, Metadata, Security, Developer Tools?
Either public or private cloud
On premises, inside the firewall
It Can Happen!
Operational Data (OLTP, ERP) -> ETL/ELT (staging) -> Database (mart) -> BI Tools/Applications
System Management, Metadata, Security, Developer Tools?
On premises, inside the firewall
Co-location Minimizes Latencies
Operational Data (OLTP, ERP) -> ETL/ELT (staging) -> Database (mart) -> BI Tools/Applications
1 public cloud vendor
System Management, Metadata, Security, Developer Tools?
On premises, inside the firewall
Getting Data In and Out of a Cloud
Data Transfer To/From Cloud
Amazon Web Services
> $0.100 per GB: data transfer in
> $0.150 per GB: first 10 TB data transfer out
> $0.110 per GB: next 40 TB data transfer out
> $0.090 per GB: next 100 TB data transfer out
> $0.080 per GB: data transfer out over 150 TB
RackSpace
> $0.08/GB: Bandwidth in
> $0.22/GB: Bandwidth out
As of July 2010
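The Amazon transfer-out prices are tiered, so a monthly bill is computed by filling each tier in turn. The sketch below walks the tiers listed on this slide; the 1,024 GB per TB conversion and the function name are assumptions, since the slide does not state which unit convention applies.

```python
GB_PER_TB = 1024  # assumption: binary terabytes; the slide does not say which

# (tier size in TB, price per GB) for data transfer out, as of July 2010
AWS_TRANSFER_OUT_TIERS = [
    (10, 0.150),
    (40, 0.110),
    (100, 0.090),
    (None, 0.080),  # everything over 150 TB
]


def transfer_out_cost(gb: float) -> float:
    """Price one month's outbound transfer by filling the tiers in order."""
    if gb < 0:
        raise ValueError("transfer volume cannot be negative")
    remaining = gb
    total = 0.0
    for tier_tb, price_per_gb in AWS_TRANSFER_OUT_TIERS:
        # The last tier is unbounded; earlier tiers are capped at their size.
        tier_gb = remaining if tier_tb is None else min(remaining, tier_tb * GB_PER_TB)
        total += tier_gb * price_per_gb
        remaining -= tier_gb
        if remaining <= 0:
            break
    return total
```

For example, 300 GB in a month stays inside the first tier and costs about $45, while 11 TB pays the first-tier rate on 10 TB and the second-tier rate on the remaining 1 TB.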
What Does this Mean to BI?
Report Users | MB/day | work days | GB/month | Monthly
20 | 10 | 23 | 4.6 | $0.69
100 | 10 | 23 | 23 | $3.45
500 | 10 | 23 | 115 | $17.25
500 | 50 | 23 | 575 | $86.25
Batch | GB/day | work days | GB/month | Monthly
Extracts | 10 | 30 | 300 | $45.00
Extracts | 50 | 30 | 1500 | $225.00
Redo log backup | 2 | 30 | 60 | $9.00
Full backup | 500 | 4 | 2000 | $300.00
Assumptions: 500GB data mart; Transfer-out at $0.15 per GB/month
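The monthly figures on this slide follow from simple multiplication, which the sketch below reproduces. It assumes that MB/day is a per-user figure and that 1 GB = 1,000 MB, because those readings are the ones consistent with the slide's numbers (20 users x 10 MB x 23 days = 4.6 GB); the function names are illustrative.

```python
from typing import Tuple

PRICE_PER_GB_OUT = 0.15  # transfer-out price from the slide's stated assumption
MB_PER_GB = 1000         # assumption: decimal units, implied by 4,600 MB = 4.6 GB


def report_traffic(users: int, mb_per_user_day: float, work_days: int) -> Tuple[float, float]:
    """Return (GB per month, monthly cost) for interactive report users."""
    gb = users * mb_per_user_day * work_days / MB_PER_GB
    return gb, gb * PRICE_PER_GB_OUT


def batch_traffic(gb_per_run: float, runs_per_month: int) -> Tuple[float, float]:
    """Return (GB per month, monthly cost) for extracts and backups."""
    gb = gb_per_run * runs_per_month
    return gb, gb * PRICE_PER_GB_OUT


if __name__ == "__main__":
    for users, mb in ((20, 10), (100, 10), (500, 10), (500, 50)):
        gb, cost = report_traffic(users, mb, 23)
        print(f"{users:4d} users x {mb:2d} MB/day: {gb:7.1f} GB/month, ${cost:.2f}")
    for label, gb_run, runs in (("Extracts", 10, 30), ("Full backup", 500, 4)):
        gb, cost = batch_traffic(gb_run, runs)
        print(f"{label}: {gb:7.1f} GB/month, ${cost:.2f}")
```

The arithmetic shows why report traffic is cheap while a weekly full backup of the 500GB data mart dominates the transfer bill.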
Data Transfer with Public Clouds
Corporate Data Center
De-duplicated
Compressed
Encrypted
Secure
ETL
RDBMS
SANs
No fees for data transfer inside the same cloud
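Because transfer into and out of a public cloud is billed per GB, the "De-duplicated" and "Compressed" steps directly reduce cost. The following is a rough sketch of those two steps using only the Python standard library; the chunking scheme and function names are assumptions, and encryption is deliberately left out because the standard library has no symmetric cipher (in practice it would be applied to each payload before sending).

```python
import hashlib
import zlib
from typing import Dict, Iterable, List, Tuple


def prepare_for_transfer(chunks: Iterable[bytes]) -> Tuple[List[str], Dict[str, bytes]]:
    """De-duplicate chunks by SHA-256 digest and compress each unique chunk once.

    Returns the ordered list of digests (the recipe for rebuilding the stream)
    and a map from digest to compressed payload (what actually gets sent).
    """
    recipe: List[str] = []
    payloads: Dict[str, bytes] = {}
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        recipe.append(digest)
        if digest not in payloads:  # identical chunks are sent only once
            payloads[digest] = zlib.compress(chunk, 6)
    return recipe, payloads


def restore(recipe: List[str], payloads: Dict[str, bytes]) -> bytes:
    """Rebuild the original byte stream on the receiving side."""
    return b"".join(zlib.decompress(payloads[digest]) for digest in recipe)
```

Only the unique compressed payloads plus the short recipe cross the network, so repeated blocks in an extract or backup are paid for once.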
Informatica Cloud – ETL and Replication
“We’re using Informatica Cloud Services to replicate millions of rows of data from Salesforce to a centralized database running on Amazon EC2.”
Source: Informatica, Breakfast in the Cloud, May 24, 2010, Slideshare
Data Integration Considerations
• Initial loading
> Just send the tapes (sneaker-net)
• ETL
> Are all files available?
> RDBMS lookups
• Minimizing data movement
> Informatica push-down
> MicroStrategy ROLAP push-down
> SAS in-database
> Minimize movement across domains
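The push-down idea in the bullets above is that work is done where the data lives, so fewer rows cross the network. The sketch below illustrates the general principle with SQLite standing in for the RDBMS; it is not how Informatica, MicroStrategy, or SAS implement push-down, and the table and function names are made up for the example.

```python
import sqlite3
from typing import Dict, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory sales table to query against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    rows = [("east", 10.0), ("east", 5.0), ("west", 7.5), ("west", 2.5), ("west", 1.0)]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def totals_without_pushdown(conn: sqlite3.Connection) -> Tuple[Dict[str, float], int]:
    """Pull every row out of the database, then aggregate in the client."""
    fetched = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals: Dict[str, float] = {}
    for region, amount in fetched:
        totals[region] = totals.get(region, 0.0) + amount
    return totals, len(fetched)


def totals_with_pushdown(conn: sqlite3.Connection) -> Tuple[Dict[str, float], int]:
    """Let the database aggregate; only one row per region comes back."""
    fetched = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(fetched), len(fetched)
```

Both functions return the same totals along with the number of rows moved, which makes the difference in data movement visible: every detail row in the first case, one row per region in the second.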
A Day in the Clouds
In the Cloud: Monday Morning 10am
Data mart
Corporate Data Center
Inspirational source: Steve Dine, TDWI, BI in the Cloud, Nov 5, 2009
In the Cloud: Monday Night 10pm ETL
Data mart
Corporate Data Center
In the Cloud: Tuesday Morning 3am Reports
Month end surge capacity
Data mart
Corporate Data Center
In the Cloud: Tuesday Morning 6am Backup
instance
bundle
rsync
snapshot
S3
Data mart
Corporate Data Center
Load Balancing in VMware and Amazon EC2
Live migration
Virtual machines (APP, OS) on VMware or Xen
resource pool
Servers
High Availability in VMware and Amazon EC2
Virtual machines (APP, OS) on VMware or Xen
resource pool
Servers
Security
Security
• Physical security
> Retinal scans, motion sensors, etc.
• Database security
> Encryption, LDAP
> No network clear text
• Amazon network security
> Default: all ports are closed
> You create login key pairs
> Manages man-in-the-middle and denial-of-service attacks
> Xen: Instances cannot access hardware directly
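The "all ports are closed" default in the Amazon network security bullets amounts to a default-deny policy: traffic is refused unless a rule explicitly authorizes it. The following is a hypothetical model of that idea, not the Amazon API; the class and method names are invented for illustration.

```python
import ipaddress
from typing import List, Tuple


class IngressRules:
    """Hypothetical default-deny rule set; an empty rule list allows nothing."""

    def __init__(self) -> None:
        self._rules: List[Tuple[int, object]] = []

    def authorize(self, port: int, cidr: str) -> None:
        """Open one port to one source network, given in CIDR notation."""
        self._rules.append((port, ipaddress.ip_network(cidr)))

    def is_allowed(self, port: int, source_ip: str) -> bool:
        """Allow only if some rule matches both the port and the source address."""
        addr = ipaddress.ip_address(source_ip)
        return any(p == port and addr in net for p, net in self._rules)


if __name__ == "__main__":
    rules = IngressRules()
    print(rules.is_allowed(22, "203.0.113.5"))   # nothing authorized yet
    rules.authorize(22, "203.0.113.0/24")        # open SSH to one office network
    print(rules.is_allowed(22, "203.0.113.5"))
    print(rules.is_allowed(3306, "203.0.113.5"))  # database port stays closed
```

Starting from an empty rule list means a forgotten rule fails closed rather than open, which is the property the slide highlights.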
"This building is like a secure bunker,
and the campus is like a military base,"
Terremark SVP Norm Laudermilch
Inside Terremark's Secure Government Data Center
http://www.informationweek.com/story/showArticle.jhtml?articleID=218700118
Amazon Virtual Private Cloud
subnets
IP addresses not exposed to Internet
On premises network
router
Secure VPN
VPN gateway
EC2
S3
Source: http://news.cnet.com/8301-19413_3-10318114-240.html?tag=mncol;posts
Source: Mike Culver, Amazon Web Services, Data Warehousing in the Public Cloud, Smart Data Collective
In Summary
“Good Enough” Workloads for Clouds
Workload
Public
Private
Small-medium data marts
X
X
Sand box / data labs
X
BI tools, ETL tools
X
X
Development
X
X
Workload isolation
Partners, Systems Integrators
Non-core HR, CRM, collaboration, eMail, occasional use applications
X
X
Major applications except highest availability, highest performance
X
X
Short term projects
X
X
Proof-of-concept, prototypes
X
X
Quality assurance, software testing
X
X
Summary
• Cloud adoption and maturity will happen fast
> Most technology isn’t new
> ISVs gold-rush to clouds
• Many challenges, many opportunities
> Elastic scale up and down
> New workflows, designs
• Helping Teradata clients get into the cloud
> Teradata Express for VMware and Amazon Web Services