Cloud Architecture Anti-Patterns
A concise overview of some bad ideas
.NET Architecture Group, 20-May-2015
Bill Wilder, Finomial CTO
@codingoutloud • [email protected] • blog.codingoutloud.com • linkedin.com/in/billwilder
Except where noted, contents © 2014 Development Partners Software Corporation
Cloud Micro-Service AntiPatterns for the Internet of Things written in Go
A certifiably buzzworthy presentation
• Except where noted contents © 2014 Development Partners Software Corporation • http://www.devpartners.com •

Who is Bill Wilder?
• www.cloudarchitecturepatterns.com
• www.bostonazure.org
• www.devpartners.com
Lots of ♥ to all the clouds etc…

Architect Skills
• Technology Skills
• Technical Business Decisions
• Business Awareness
• Ability to Communicate

Famous Architect: Aristotle
On Properties:
• Essential property = must have
• Accidental property = happens to have but could lack
For an effective software architect, Technology Skills, Business Awareness, and Ability to Communicate are all Essential Properties.

To cloud or not to cloud?
control vs. cost
[Diagram: Ctrl vs. €$¥, shown against Technology Skills, Business Awareness, Ability to Communicate]

Cloud Services … in the Cloud
“who would’ve thought”

Cloud is a business innovation
• technology services + flexible rental model
• new types and combinations of services

Services: TTM & Sleeping well
1/9th above water

Treating your ops team as equivalent to the cloud vendor’s ops team
(They are not. Let the cloud vendor handle service operations. Use services. You focus on your app.)
What is an Anti-Pattern?
Wikipedia says (http://en.wikipedia.org/wiki/Anti-pattern):
“A common response to a recurring problem that is usually ineffective and risks being highly counterproductive.”
Bill’s amplification:
“An anti-pattern approach may seem reasonable, or actually be reasonable in other contexts. There may be problems that are not yet apparent.”
Often depends on the situation.

Tells: Traditional vs Cloud-Native

TELLS/CLUES
  Traditional                    Cloud-Native
  • 2-tier                       • N-tier, SOA, μSvcs
  • Single data center           • Multi-data center
  • Vertical scaling             • Horizontal scaling
  • Ignores failure              • Expects failure
  • Transactional consistency    • Eventual consistency

CONSEQUENCES
  Traditional                    Cloud-Native
  • Less flexible                • Agile/faster TTM
  • More manual/attention        • Auto-scaling
  • Less reliable (SPoF)         • Self-healing
  • Maintenance window           • HA
  • Less scalable, more $$       • Geo-LB/FO

Which is “best” architecture?
There is no “best” architecture: it is situational, a Technical Business Decision.
Cloud-native popularity is growing in proportion to the shrinking cost and competitive benefits.

One-size-fits-all architecture

[Cloud] Anti-Pattern Causes
• Abstraction misalignment
• Not reading the fine print
• Insufficient ongoing attention to cost
• Insufficient ongoing attention to automation

www.pageofphotos.com (PoP)

Move Simple PoP App to Cloud
WHAT NOW? Scalability & Performance & Cost & Automation

Time passes… PoP has lots of photos
www.pageofphotos.com

One-size-fits-all data storage
(perf, scalability, cost)

Upgrade to scenario-specific storage
Some $, Perf, Scale benefits

PoP uses Valet Key Pattern
Even more $, Perf, Scale benefits

CDN for public content

Many, many other storage options also available: NoSQL varieties, caches, etc.

Always access raw data (regardless of distance, cost)
(performance, scalability, cost)

PoP web tier goes multi-instance…
Users experiencing login issues*
*Depending on configuration …

Are Cloud Resources Infinite?
“We often hear that public cloud platforms offer the illusion of infinite resources. … This does not mean each resource has infinite capacity, just that you can request as many instances of the type of resource that you need.”
Page 21, my (Bill Wilder’s) Cloud Architecture Patterns book

Running stateful VMs in web / service tiers
(Limits horizontal scalability & complicates autoscale, but is sometimes a reasonable option)

I don’t have a slide on this, but … sharding

Reliability

PoP Adding Video Support (uh oh!)

Let’s extend PoP with a Service Tier
[Diagram: Current; Web Tier, Services Tier, Data Tier]

OPTION 1: Request/Response Services
[Diagram: web browser; Stateless; Stateless Services; REQUEST / RESPONSE (http + json)]

Coupling Between Tiers
(reliability, scalability, cost)
(Situational: I frequently violate! Also relates to microservices.)

OPTION 2: Async Services
[Diagram: web browser; Web Tier (Stateless) as Work Producers, which push to a Reliable Queue; Services Tier (Stateless Services) as Work Consumers, which pull; Data Tier]

Notice anything “missing”?
There is no transaction
Get used to the idea of eventual consistency

Enables Responsive UX
• Response to interactive users is as fast as a work request can be persisted
• UX challenge due to async processing
  - Eventual consistency
  - Eventual satisfaction for users

Enables More Reliable Service
• Decoupled front/back provides insulation
• Blocking is the bane of scalability

General Case: Many Queue Types
[Diagram: Web Tier with Web Roles (IIS) for Admin and Public; Queue Types 1, 2, and 3; Service Tier with Worker Roles of Type 1 and Type 2]

Enables Cost-Efficient Scaling
• Loosely coupled, concern-independent scaling
• Get Scale Units right
• Optimize for CO$T EFFICIENCY
• GOAL: cost ∝ benefit

How about the queue API?
A reliable queue works just like any other queue, right?
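The Option 2 flow above (work producers push to a reliable queue, work consumers pull from it) can be sketched in a few lines. This is a minimal Python simulation, not a cloud SDK: `InMemoryReliableQueue`, `web_tier_upload`, `worker_process_one`, the example URL, and the 10-second window are all hypothetical names and values chosen for illustration. It assumes the queue hides a message that has been read rather than removing it, so a message whose consumer fails before deleting it becomes visible again.

```python
import time
import uuid


class InMemoryReliableQueue:
    """Hypothetical in-memory stand-in for a cloud queue service."""

    def __init__(self):
        # message id -> [body, visible_at, dequeue_count]
        self._messages = {}

    def add_message(self, body):
        msg_id = str(uuid.uuid4())
        self._messages[msg_id] = [body, 0.0, 0]
        return msg_id

    def get_message(self, invisibility_seconds, now=None):
        """Step 1: hide the next visible message instead of removing it."""
        now = time.monotonic() if now is None else now
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + invisibility_seconds
                entry[2] += 1
                return msg_id, entry[0], entry[2]
        return None

    def delete_message(self, msg_id):
        """Step 2: remove the message once processing has succeeded."""
        self._messages.pop(msg_id, None)


def web_tier_upload(queue, url):
    """Work producer: persist the work request, then answer the user."""
    queue.add_message(url)
    return "accepted"


def worker_process_one(queue, thumbnails, now):
    """Work consumer: pull one message, process it, then delete it."""
    msg = queue.get_message(invisibility_seconds=10, now=now)
    if msg is None:
        return False
    msg_id, url, _dequeue_count = msg
    thumbnails[url] = "thumb-of-" + url  # keyed by URL: repeating is harmless
    queue.delete_message(msg_id)
    return True


queue = InMemoryReliableQueue()
thumbnails = {}
status = web_tier_upload(queue, "http://example.invalid/up/photo1.png")

# A consumer that fails after step 1 never deletes the message...
crashed = queue.get_message(invisibility_seconds=10, now=100.0)
hidden = queue.get_message(invisibility_seconds=10, now=105.0)

# ...so the message reappears after the window and another consumer finishes.
done = worker_process_one(queue, thumbnails, now=111.0)
```

Because the same message can be delivered more than once, the worker in this sketch writes its result keyed by URL, so repeating the work leaves the same end state.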
(Beware the abstraction mismatch.)

Reliable Queue & 2-step Delete
Web Tier → Queue → Service Tier

    // Web Tier: enqueue a work request that points at the uploaded blob
    var url = "http://pageofphotos.blob.core.windows.net/up/<guid>.png";
    queue.AddMessage( new CloudQueueMessage( url ) );

    // Service Tier, step 1: read the message; it stays hidden for the invisibility window
    var invisibilityWindow = TimeSpan.FromSeconds( 10 );
    CloudQueueMessage msg = queue.GetMessage( invisibilityWindow );
    // (… do some processing then …)
    // Service Tier, step 2: delete the message so that it is not delivered again
    queue.DeleteMessage( msg );

Idempotent Processing
An idempotent operation can be performed more than once without changing the end result.

Poison Message Detection
A poison message is a flawed message that can never be successfully processed.

Tiers of Cloud Failure
• Transient API/DB connection failures
• Temporary/Ephemeral drive loss
• DC outage (or smoking hole)
• Zone/Region outage (or smoking hole)
• Global outage

“Failure is not an option”
(Failure is routine, at least at lower tiers.)

Programming against Cloud Services as though they were reliable
(Transient Failures handled using Busy Signal Pattern)

Security
• A1-Injection
• A2-Broken Authentication and Session Management
• A3-Cross-Site Scripting (XSS)
• A4-Insecure Direct Object References
• A5-Security Misconfiguration
• A6-Sensitive Data Exposure
• A7-Missing Function Level Access Control
• A8-Cross-Site Request Forgery (CSRF)
• A9-Using Components with Known Vulnerabilities
• A10-Unvalidated Redirects and Forwards

unicorn cloud security for apps
[Image: Copyright © 2013 Elizabeth B. O’Connor • used with permission • www.elizabethboconnor.com]

Belief in cloud app security unicorns
Reality: your app’s vulnerabilities will port very cleanly to your favorite cloud platform

Little Bobby Tables (still a problem)

Conflating App & Platform security
secure, compliant

Cloud News from June 2014
• DDoS
• Security Breach
• Ransom / Extortion
• Fighting Back
• Malicious Destruction of Assets
• Business Failure
ELAPSED TIME: 12 HOURS
• http://www.codespaces.com/
• A cautionary tale…

1FA single-factor auth
(2FA/MFA is widely available)

Service Level Agreements (SLA)

PoP (pageofphotos.com) adds paid plans for corporate partners and wants to offer an SLA
What is “the SLA” for storage?

SLA Responsibilities
• From Google Storage (https://cloud.google.com/storage/sla): “"Back-off Requirements" means, when an error occurs, the Application is responsible for waiting for a period of time before issuing another request. This means that after the first error, there is a minimum back-off interval of 1 second and for each consecutive error, the back-off interval increases exponentially up to 32 seconds.”

SLA Math
• All required: 99.99% ^ 4 ≈ 99.96%
• All required: 99.95% × 99.9% ^ 2 × 99.99% ≈ 99.74%
• Period of time over which an SLA applies matters

SLA Penalties
• Limited to the service costs
  - Service costs != your business losses
• Multiple instances might be needed to be eligible

Passing along the SLA
The cloud SLA becomes my service’s SLA

Compose to boost reliability

The architecture of a cloud-native application is aligned with the architecture of the underlying cloud platform.

Hiring!
HIRING at Finomial Corporation
• Are you a talented senior engineer/architect interested in financial services in the Boston area?
• Technology stack is ASP.NET on Azure + SPA
• Downtown Boston (startup space)
• [email protected] (or grab a biz card)

And….
Find this slide deck here
See you at Boston Azure: bostonazure.org
Bill Wilder • @codingoutloud • [email protected] • blog.codingoutloud.com • linkedin.com/in/billwilder
• Except where noted, slide deck is © 2014 Development Partners Software Corporation • http://www.devpartners.com •

Questions?
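The SLA Responsibilities and SLA Math slides can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, the availability figures are read as 99.99 raised to the fourth power and 99.95 × 99.9 squared × 99.99, and doubling is assumed as the exponential back-off factor (the quoted SLA text gives only the 1-second minimum and the 32-second cap).

```python
def composite_availability(percentages):
    """Availability in percent when every listed service is required."""
    fraction = 1.0
    for p in percentages:
        fraction *= p / 100.0
    return fraction * 100.0


def backoff_intervals(consecutive_errors, first=1.0, cap=32.0):
    """Wait times in seconds: start at `first`, double each time, stop at `cap`.

    Doubling is an assumption; the quoted text only says "exponentially".
    """
    return [min(first * 2 ** i, cap) for i in range(consecutive_errors)]


# Four required services at 99.99% each.
four_nines_each = round(composite_availability([99.99] * 4), 2)  # 99.96

# One service at 99.95%, two at 99.9%, one at 99.99%.
mixed = round(composite_availability([99.95, 99.9, 99.9, 99.99]), 2)  # 99.74

# Waits before retries one through seven.
waits = backoff_intervals(7)  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 32.0]
```

Each required dependency multiplies in, so the composite figure is always lower than the weakest individual SLA. That is one way to see why passing the cloud SLA along unchanged as your own service's SLA is risky.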