“The Cloud” Failing in the Cloud a “how to” guide Boston Code Camp November 22nd, 2014 (1:40 – 2:50) In the room with broken video Boston.


“The Cloud”
Failing in the Cloud
a “how to” guide
Boston Code Camp
November 22nd, 2014
(1:40 – 2:50)
In the room with broken video
Boston Azure User Group
http://www.bostonazure.org
@bostonazure
Bill Wilder
http://blog.codingoutloud.com
@codingoutloud
My name is Bill Wilder
[email protected]
blog.codingoutloud.com
@codingoutloud
www.devpartners.com
www.cloudarchitecturepatterns.com
Who is Bill Wilder?
www.bostonazure.org
www.devpartners.com
Boston Code Camp 22 - Thanks to
our Sponsors!
• Gold
• Silver
• Bronze
• In-Kind
Donations
I will ass-u-me…
1. You know what “the cloud” is
2. You have an inkling about Amazon Web Services
and Windows Azure cloud platforms
3. You understand that such cloud platforms
include compute services [like hosted virtual
machines (VMs), in both IaaS and PaaS modes],
data and database services, messaging, DNS,
management, etc.
4. You are interested in understanding how to
absolutely fail in your next cloud project 
Classic Blunders
Ha ha! You fool! You fell victim to one of the classic
blunders - The most famous of which is "never get
involved in a land war in Asia" - but only slightly
less well-known is this: "Never go in against a
Sicilian when death is on the line"
Vizzini
The Princess Bride
Classic Blunders building Cloud Apps
Categories of Blunders
1. Fighting the Last War: The Cloud is Hosting
2. Mad Science: Physics doesn’t apply
3. Apollo 13: Failure is not an option
4. Unicorns are Real: Security is free
The term “cloud” is nebulous…
Cloud == Traditional hosting
“The Cloud” – practitioner viewpoint
Using the public cloud (for anything) means:
• Taking a dependency on the public Internet
• Taking a dependency on a Cloud Vendor
• Software resources supported by someone else, running
in one or more data centers “in the cloud”
• Replaces or augments resources we’d otherwise own
• Some loss of control
But there are many practical uses that work (stay tuned)
As professionals:
• New concepts & skills to be learned & applied
Pets vs. Livestock
• Scaling exists
• Scaling is dynamic
• Scaling is bi-directional
– Cost engineering
• VMs can fail and be replaced (sometimes)
• VMs can go away
– Don’t get emotionally attached!
• Gather logs
• Automate All the Things
Create Ubuntu VM
Create Key
openssl req -x509 -nodes -days 365 \
-newkey rsa:2048 \
-keyout example1.key \
-out example1.pem
chmod 600 example1.key
azure vm create --location "West US" --ssh 22 foo \
b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu13_04-amd64-server-20130824-en-us-30GB \
azureusername 'secretPa$$Here123'
azure vm list
putty foo.cloudapp.net
call azure vm delete --quiet foo
download_blob_to_file.py
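# Assumes the five names used below (account name/key, container, blob, file path)
# are defined earlier, e.g. loaded from configuration.
from azure.storage import BlobService
blob_service = BlobService(
    account_name=az_storage_account_name,
    account_key=az_storage_account_key)
# get_blob returns the blob's contents
stream = blob_service.get_blob(
    blob_container_name, blob_name)
# Binary mode keeps the downloaded bytes intact
with open(file_path, 'wb') as f:
    f.write(stream)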
Automate all the things
• Creating custom VM templates
• Managing VM templates as a team (across accounts)
• Advanced Automation Scripting
• Continuous Deployment
• Cross-region Considerations
• … and many many more …
Optimize Costs
NIST – Cloud Platform Taxonomy
http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
Deployment Models
• Private Cloud
• Community Cloud
• Public Cloud
• Hybrid Cloud
Service Models
• Infrastructure as a Service (IaaS)
• Platform as a Service (PaaS)
• Software as a Service (SaaS)
Essential Characteristics
• Rapid Elasticity
• Broad network access
• Resource Pooling
• On-demand self-service
• Measured service
How do I control costs?
• Deallocate resources when you don’t need them
• Delete test VMs at night, rehydrate in morning
– VM $ >> Storage Cost $
– Even on IaaS VM running a database
That. Is. The. Easiest. Way. To. Save. Money.
There are also cost monitoring tools.
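To make the savings concrete, here is a rough sketch in Python. It uses two of the example rates quoted in this deck (a Small VM at $0.09 per hour, storage at $0.07 per GB per month). The 744-hour month, the 10-hour day, the 22 workdays, and the 30 GB disk are assumptions picked for illustration, not figures from a pricing page.

```python
# Rough comparison: leave a test VM on all month vs. run it only during work hours.
HOURS_PER_MONTH = 744            # assumption: a 31-day month

def monthly_vm_cost(rate_per_hour, hours=HOURS_PER_MONTH):
    """Cost of a VM that stays allocated for `hours` in a month."""
    return rate_per_hour * hours

def monthly_storage_cost(size_gb, rate_per_gb_month):
    """Cost of keeping the VM's disk at rest for the month."""
    return size_gb * rate_per_gb_month

small_vm_rate = 0.09             # $/hour, example rate from this deck
storage_rate = 0.07              # $/GB/month, example rate from this deck

always_on = monthly_vm_cost(small_vm_rate)
workdays_only = monthly_vm_cost(small_vm_rate, hours=10 * 22)   # assumed schedule
disk = monthly_storage_cost(30, storage_rate)                   # assumed 30 GB disk

print(f"always on:     ${always_on + disk:.2f}")
print(f"workdays only: ${workdays_only + disk:.2f}")
print(f"disk alone:    ${disk:.2f}")
```

The disk is paid for either way, which is the point of "VM $ >> Storage Cost $": keeping the disk and releasing the VM costs very little.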
What are the costs?
• All platforms have pricing calculators (e.g., http://www.windowsazure.com/en-us/pricing/calculator/)
A FEW EXAMPLE PRICES from Windows Azure
• Storage (at rest):
– $0.07 per GB per month → $0.12 per GB per month
• XS (shared core, 768 MB) Windows Server or Linux
– $0.02 per hour ($15 per month IYLIOAM)
• Small (1 core, 1.75 GB) Windows Server VM:
– $0.09 per hour ($67 per month IYLIOAM)
• Large (4 cores, 7 GB) Windows Server VM with SQL Server:
– $0.405 per hour ($301 per month IYLIOAM)
– Licensing is baked into the rental charge
• A7 (8 cores, 56 GB) running Oracle Database EE on Linux
– $13.92 per hour (~$10,300 per month IYLIOAM)
→ This is non-discounted pricing for “regular” accounts (max price)
Costs are not fixed
• Free Trials are available
– Azure Free Trial: http://aka.ms/IaaS
– AWS Free Trial:
• Some services have a free tier
– e.g., Web Sites
• Discounts available:
– Long-term Pricing (Azure, AWS)
– Spot Pricing (Amazon Web Services)
• Enterprise Agreement discounts for Azure
• MSDN has 25-40% discounts for Test-Dev for Azure
– http://www.windowsazure.com/en-us/pricing/member-offers/msdn-benefits-details/
– http://www.windowsazure.com/en-us/offers/ms-azr-0060p
Physics does not apply
Speed of Light Matters
• Network speed is not infinite
– But it is pretty fast inside the data center!
• Co-location matters
– Lots of data centers to choose from
– http://azuremap.blob.core.windows.net/apps/work/bingmap-geojson-display.html
• Customers vs. Compute + Data
– Mobile applications
• “Hybrid Cloud” applications are also popular
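A back-of-the-envelope sketch of why co-location matters. It computes only the floor that physics puts on a network round trip. The 2/3 factor for light in fiber is a common rule of thumb, and the distances are illustrative assumptions, not measurements between any real data centers.

```python
# Lower bound on round-trip time imposed by the speed of light in fiber.
SPEED_OF_LIGHT_KM_S = 299_792      # in a vacuum
FIBER_FRACTION = 2 / 3             # assumption: light in fiber moves at roughly 2/3 c

def min_round_trip_ms(distance_km):
    """Best-case round trip over `distance_km` of fiber, ignoring routing and queuing."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FRACTION)
    return 2 * one_way_seconds * 1000

# Illustrative distances only.
for label, km in [("same metro area", 50), ("cross-continent", 4000), ("cross-ocean", 8000)]:
    print(f"{label:>16}: at least {min_round_trip_ms(km):.1f} ms per round trip")
```

Real latency is higher than this floor, and a chatty application pays it once per round trip, so putting compute next to its data (and both near the customers) adds up quickly.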
The Cloud is Infinite
• Well…
• Is the [public] cloud [platform that I’m using]
actually infinite?
• Reality is that it is pretty expansive – but not infinite
“all at once”
• Vertical Scaling vs. Horizontal Scaling
• Absolute Limits to Vertical Scale
– More pronounced in the cloud
– Horizontal scale is limited too, but *much* less limiting
Failure is not an option
Level 0: Cloudpocolypse-level failure
Disaster Recovery and Business Continuity
1. Multi-cloud solutions
2. On-prem backups
3. Private “clouds”
Level 1: Cloud Platform failure
Level 2: Cloud Isolated failure
Queue-centric Workflow Pattern
(more in a moment)
Level 3: Transient failure
Busy Signal Pattern
1. Code for them
– Not an exceptional situation!!!
2. Built into Azure SDKs
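A minimal sketch of what "code for them" can look like: retry a transient failure with exponential backoff and jitter. In practice the retry policies built into the SDKs are the first choice. `TransientError`, `call_with_retry`, and `flaky_read` are hypothetical names invented for this illustration, not part of any Azure library.

```python
import random
import time

class TransientError(Exception):
    """Hypothetical stand-in for a throttling or 'server busy' response."""

def call_with_retry(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call `operation`, retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise                                  # out of attempts: surface the error
            delay = base_delay * 2 ** (attempt - 1)    # 0.1s, 0.2s, 0.4s, ...
            sleep(delay + random.uniform(0, delay))    # jitter avoids synchronized retries

# Usage: an operation that gets a busy signal twice, then succeeds.
attempts = {"count": 0}

def flaky_read():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("server busy")
    return "ok"

result = call_with_retry(flaky_read, sleep=lambda seconds: None)  # no real sleeping in the demo
print(result, "after", attempts["count"], "attempts")
```

The caller never sees the two busy signals. That is the sense in which a transient failure is routine rather than exceptional.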
Building Resilient, Reliable Apps
Extend www.pageofphotos.com
example into Service Tier
• QCW enables applications where the UI and
back-end services are Loosely Coupled
QCW Example: User Uploads Photo
[Diagram: www.pageofphotos.com Web Server, Reliable Queue, Compute Service, Reliable Storage]
QCW
WE NEED:
• Compute (VM) resources to run our code
• Reliable Queue to communicate
• Durable/Persistent Storage
Where does Azure fit?
QCW [on Azure]
WE NEED:
• Compute (VM) resources to run our code
– Web Roles (IIS) and Worker Roles (w/o IIS)
• Reliable Queue to communicate
– Azure Storage Queues
• Durable/Persistent Storage
– Azure Storage Blobs & Tables; WASD
QCW on Azure: User Uploads a Photo
[Diagram: www.pageofphotos.com Web Role (IIS) pushes to Azure Queue; Worker Role pulls from Azure Queue; Azure Blob]
UX implications: user does not wait for thumbnail
(architecture!)
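The push/pull shape can be sketched without any cloud services at all. Here a `deque` and two dicts stand in for the queue and blob storage. All function names are hypothetical, and the "thumbnail" is a placeholder slice rather than real image resizing.

```python
from collections import deque

work_queue = deque()   # stand-in for a reliable queue
blob_store = {}        # stand-in for blob storage: name -> bytes
thumbnails = {}        # finished work: name -> thumbnail bytes

def handle_upload(photo_name, photo_bytes):
    """Web tier: persist the photo, enqueue a work request, return right away."""
    blob_store[photo_name] = photo_bytes
    work_queue.append(photo_name)      # push
    return "202 Accepted"

def make_thumbnail(photo_bytes):
    return photo_bytes[:4]             # placeholder for real image resizing

def worker_step():
    """Service tier: pull one request, if any, and do the slow work."""
    if not work_queue:
        return False
    name = work_queue.popleft()        # pull
    thumbnails[name] = make_thumbnail(blob_store[name])
    return True

status = handle_upload("cat.png", b"\x89PNG...data")
print(status, "| thumbnail ready:", "cat.png" in thumbnails)
worker_step()
print("after worker runs | thumbnail ready:", "cat.png" in thumbnails)
```

The user gets a response as soon as the request is persisted. The thumbnail shows up later, which is the UX implication noted above.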
QCW enables Responsive UX
• Response to interactive users is as fast as a
work request can be persisted
• Time consuming work done asynchronously
• Comparable total resource consumption,
arguably better subjective UX
• UX challenge – how to express Async to users?
– Communicate Progress
– Display Final results
– Long Polling/Web Sockets (e.g., SignalR or Node.io)
QCW enables Scalable App
• Decoupled front/back provides insulation
– Blocking is Bane of Scalability
– Order processing partner doing maintenance
– Twitter down
– Email server unreachable
– Internet connectivity interruption
• Loosely coupled, concern-independent scaling
– (see next slide)
– Get Scale Units right
– Key to optimizing operational CO$T$
General Case:
Many Roles, Many Queues
[Diagram: Web Role (Admin) and Web Role (Public), both on IIS; Queue Type 1, Queue Type 2, Queue Type 3; multiple Worker Role Type 1 and Worker Role Type 2 instances]
• Scaling works best when Investment ∝ Benefit
• Optimize for CO$T EFFICIENCY
• Logical vs. Physical Architecture depends on current scale
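Reliable Queue & 2-step Delete
// Web Role (IIS): enqueue a message that points at the uploaded photo
var url = "http://pageofphotos.blob.core.windows.net/up/<guid>.png";
queue.AddMessage( new CloudQueueMessage( url ) );
// Worker Role, step 1: get the message; it is hidden from other readers, not removed
var invisibilityWindow = TimeSpan.FromSeconds( 10 );
CloudQueueMessage msg =
queue.GetMessage( invisibilityWindow );
// (… do some processing then …)
// Worker Role, step 2: delete the message only after processing succeeded
queue.DeleteMessage( msg );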
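To see why the delete is a separate step, here is a toy in-memory queue in Python that mimics the get-then-delete behavior. It is a simulation written for this example, not the Azure Storage API. `InMemoryQueue` and its methods are hypothetical, and time is passed in explicitly so the run is deterministic.

```python
import itertools

class InMemoryQueue:
    """Toy queue with get-then-delete semantics; the caller supplies the clock."""
    def __init__(self):
        self._messages = {}                 # id -> [body, visible_at, dequeue_count]
        self._ids = itertools.count(1)

    def add_message(self, body):
        self._messages[next(self._ids)] = [body, 0.0, 0]

    def get_message(self, now, invisibility_window):
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + invisibility_window   # step 1: hide, do not remove
                entry[2] += 1
                return msg_id, entry[0], entry[2]
        return None

    def delete_message(self, msg_id):
        del self._messages[msg_id]                     # step 2: remove after success

queue = InMemoryQueue()
queue.add_message("up/photo-1.png")

first = queue.get_message(now=0, invisibility_window=10)    # worker A takes it, then crashes
hidden = queue.get_message(now=5, invisibility_window=10)   # still invisible to worker B
second = queue.get_message(now=11, invisibility_window=10)  # window expired: it reappears
queue.delete_message(second[0])                             # worker B finishes the job
print(first, hidden, second, queue.get_message(now=30, invisibility_window=10))
```

Worker A never deleted the message, so the work was not lost. The price is that the same message can be processed more than once.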
QCW requires Idempotent Operations
• Perform an idempotent operation more than
once, and the end result is the same as if we did it once
• Example with Thumbnailing (easy case)
• App-specific concerns dictate approaches
– Compensating action, Last write wins, etc.
• PARTNERSHIP: division of responsibility
between cloud platform & app
– Far cry from database transaction
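A small sketch of the thumbnailing "easy case" next to an operation that is not idempotent. The dict stands in for blob storage, the slice stands in for resizing, and every name here is hypothetical.

```python
thumbnails = {}                      # stand-in for blob storage
counter = {"processed": 0}

def thumbnail_name(photo_name):
    return "thumbs/" + photo_name    # output name depends only on the input

def make_thumbnail(photo_name, photo_bytes):
    """Idempotent: a repeat simply overwrites the same blob with the same content."""
    thumbnails[thumbnail_name(photo_name)] = photo_bytes[:4]   # placeholder for resizing

def count_photo(photo_name):
    """NOT idempotent: every duplicate delivery changes the result."""
    counter["processed"] += 1

for _ in range(2):                   # the same queue message delivered twice
    make_thumbnail("cat.png", b"\x89PNG...data")
    count_photo("cat.png")

print(len(thumbnails), "thumbnail;", counter["processed"], "counted")
```

The thumbnail survives a duplicate delivery untouched. The counter is now wrong, and fixing it takes an app-specific approach such as a compensating action or tracking which messages were already applied.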
QCW expects Poison Messages
• A Poison Message cannot be processed
– Error condition for non-transient reason
– Use dequeue count property
• Be proactive
– Falling off the queue may kill your system
• Determine a Max Retry policy per queue
– Delete, put on “bad” queue, alert human, …
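One way a max-retry policy based on the dequeue count might look, sketched in Python. The limit of 3, the "bad" queue as a plain list, and all function names are assumptions for illustration. A real worker would read the dequeue count from the message it received.

```python
MAX_DEQUEUE_COUNT = 3     # assumption: retry limit chosen per queue by the app

def process_or_quarantine(body, dequeue_count, handler, bad_queue):
    """Return True if the message should now be deleted from the main queue."""
    if dequeue_count > MAX_DEQUEUE_COUNT:
        bad_queue.append(body)     # park it on the "bad" queue for a human to inspect
        return True
    try:
        handler(body)
        return True
    except Exception:
        return False               # leave it; it reappears after the invisibility window

def thumbnail_handler(body):
    if not body.endswith(".png"):
        raise ValueError("unsupported file type")   # fails every time: a poison message

bad_queue = []
# The same poison message comes back with dequeue counts 1, 2, 3, 4.
outcomes = [process_or_quarantine("up/notes.txt", n, thumbnail_handler, bad_queue)
            for n in range(1, 5)]
print(outcomes, bad_queue)
```

Checking the count before doing any work means a message that crashes the worker outright still gets quarantined on a later attempt.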
QCW requires “Plan for Failure”
• VM restarts will happen
– Hardware failure, O/S patching, crash (bug)
• Bake in handling of restarts into our apps
– Restarts are routine: system “just keeps working”
– Idempotent support is important
– Event Sourcing (commonly seen with CQRS) may
help
• Not an exception case! Expect it!
• Consider N+1 Rule
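The N+1 Rule reduces to simple arithmetic: size for peak load, then add one instance so a single restart does not push the rest over capacity. The load and capacity numbers below are made up for illustration.

```python
import math

def instances_to_deploy(peak_requests_per_sec, capacity_per_instance):
    """N instances to carry peak load, plus one spare to absorb a single restart."""
    n = math.ceil(peak_requests_per_sec / capacity_per_instance)
    return n + 1

# Hypothetical numbers: 450 req/s at peak, each instance handles 100 req/s.
print(instances_to_deploy(450, 100))
```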
What’s Up? Reliability as EMERGENT PROPERTY
Columns: Typical Site | Any 1 Role Instance | Overall System
• Operating System Upgrade
• Application Code Update
• Scale Up, Down, or In
• Hardware Failure
• Software Failure (Bug)
• Security Patch
What about the DATA?
• You: Azure Web Roles and Azure Worker Roles
– Taking user input, dispatching work, doing work
– Follow a decoupled queue-in-the-middle pattern
– Stateless compute nodes
• Cloud: “Hard Part”: persistent, scalable data
– Azure Queue & Blob Services
– Three copies of each byte
– Geo-replicated to sister data center
– Busy Signal Pattern
Security is free
“The Cloud”
Copyright © 2013 Elizabeth B. O’Connor • used with permission • www.elizabethboconnor.com
Reality is Resource-Constrained
“Security is always a
tradeoff; it must be
balanced with the cost.”
- Bruce Schneier
http://www.schneier.com/essay-207.html
@Bill Wilder
Reality is Resource-Constrained
“_______is always a
tradeoff; it must be
balanced with the cost.”
- Bruce Schneier
http://www.schneier.com/essay-207.html
Members of
Microsoft Azure
Security Team
Defense in Depth Approach
Layer | Defense-in-Depth
• Data
– Strong storage keys for access control
– SSL support for data transfers between all parties
• Application*
– Front-end .NET framework code running under partial trust
– Windows account with least privileges
• Host
– Hardened version of Windows Server 2008 OS for both VM Host and VM Guest operating systems
– Host boundaries enforced by external hypervisor
• Network
– Host firewall limiting traffic to VMs
– VLANs and packet filters in routers
• Physical
– World-class physical security
– ISO 27001 and SAS 70 Type II certifications for datacenter processes
Defenses Inherited by Azure Applications
Threat categories:
• Spoofing
• Tampering/Disclosure
• Repudiation
• Denial of Service
• Elevation of Privilege
Defenses:
• VM switch hardening
• VLANs
• Top of Rack Switches
• Custom packet filtering
• Partial Trust Runtime
• Certificate Services
• Monitoring
• Shared Access Signatures
• Diagnostics Service
• Configurable scale-out
• Hypervisor custom sandboxing
• Virtual Service Accounts
• HTTPS
• Side-channel protections
Unsecure Code is Unsecure Code
• SQL Injection, Click Jacking (hey Robert!), XSS, …
• POODLE/SSL 3.0
• ShellShock
• Denial of Service – DoS
• O/S issues
– Platform as a Service – PaaS
Cloud Platform Characteristics
• Scaling – or “resource allocation” – is horizontal
– and ∞ (“illusion of infinite resources”)
• Resources are easily added or released
– self-service portal or API; cloud scaling is automatable
• Pay only for currently allocated resources
– costs are operational, granular, controllable, and transparent
• Optimized for cost-efficiency
– cloud services are MT, hardware is commodity
– MTTR over MTTF
• Rich, robust functionality is simply accessible
– like an iceberg
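"MTTR over MTTF" in the list above can be made concrete with the standard steady-state availability formula, availability = MTTF / (MTTF + MTTR). The hour figures below are invented to show the shape of the tradeoff. They describe no real hardware or service.

```python
def availability(mttf_hours, mttr_hours):
    """Fraction of time a component is up, given mean time to failure and to repair."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Hypothetical numbers for illustration only.
premium = availability(mttf_hours=10_000, mttr_hours=24)     # rarely fails, slow to repair
commodity = availability(mttf_hours=1_000, mttr_hours=0.1)   # fails 10x as often, replaced in minutes

print(f"premium hardware:   {premium:.4%}")
print(f"commodity hardware: {commodity:.4%}")
```

With fast, automated replacement the commodity node comes out ahead even though it fails far more often. That is why the platform optimizes recovery time rather than trying to prevent every failure.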
Questions?
Comments?
More information?
any final questions?
Developer Resources
• www.windowsazure.com/develop/ is
LOADED with Dev Libraries, Training Kits,
How To Guides across:
– Mobile (iOS, Android, Win Phone, Win 8 SDKs)
– .NET, Node.js, Java, PHP, Python, REST
– PowerShell, CLI
• Example: Create Node.js web site from Mac CLI
https://www.windowsazure.com/en-us/develop/nodejs/tutorials/create-a-website-(mac)/
• Example: Create Linux (CentOS) VM from CLI
(Node-based CLI – Windows not required)
https://www.windowsazure.com/en-us/develop/php/how-to-guides/command-line-tools/
https://www.windowsazure.com/en-us/develop/nodejs/how-to-guides/command-line-tools/
• Example: Install Couchbase + VNet on VM
http://blogs.msdn.com/b/jimoneil/archive/2012/06/16/couchbase-on-azure-a-tour-of-new-windows-azure-features.aspx
Azure Services
• Compute: Virtual Machines, Cloud Services, Websites, Mobile Services, Batch
• Network Services: ExpressRoute, Virtual Network, Traffic Manager
• Data Services: Storage, SQL Database, HDInsight, Cache, Backup, Site Recovery, Machine Learning, StorSimple, DocumentDB, Azure Search, Data Factory, Stream Analytics, Operational Insights
• App Services: Media Services, Service Bus, Push Notifications, Scheduler, BizTalk Services, Active Directory, Multi-Factor Authentication, Automation, CDN, API Management, RemoteApp, Application Insights
Cloud Computing
BostonAzure.org
• Boston Azure cloud user group
• Focused on Microsoft’s Public Cloud Platform
• Monthly, 6:00-8:30 PM in Boston area
– Food; wifi; free; great topics; growing community
• Follow on Twitter: @bostonazure
• More info or to join our Meetup.com group:
http://www.bostonazure.org
Contact Me
Bill Wilder
@codingoutloud
http://blog.codingoutloud.com
community inquiries:
[email protected]
business inquiries: www.devpartners.com
book: www.cloudarchitecturepatterns.com
http://bit.ly/billbook