Amazon Web Services and Eucalyptus

Darshan R. Kapadia
Gregor von Laszewski
http://grid.rit.edu
What is Amazon Web Services?
• Amazon Web Services (AWS) is a
collection of remote computing services (also
called web services) offered over the Internet
by Amazon.com.
• Amazon Web Services (AWS) provides
companies of all sizes with an infrastructure
web services platform in the cloud.
http://en.wikipedia.org/wiki/Amazon_Web_Services
http://aws.amazon.com/what-is-aws/
What does AWS offer?
• With AWS you can requisition compute power,
storage, and other services, gaining access to
a suite of elastic IT infrastructure services as
your business demands them.
• With AWS you have the flexibility to choose
whichever development platform or
programming model makes the most sense for
the problems you’re trying to solve.
http://aws.amazon.com/what-is-aws/
AWS Services
• Amazon Elastic Compute Cloud (Amazon EC2)
• Amazon SimpleDB
• Amazon Simple Storage Service (Amazon S3)
• Amazon CloudFront
• Amazon Simple Queue Service (Amazon SQS)
• Amazon Elastic MapReduce
Advantages of AWS
• Cost-effective
• Dependable
• Flexible
• Comprehensive
Amazon Simple Storage Service (Amazon S3™)
• Amazon S3 is storage for the Internet.
• Amazon S3 provides a simple web services interface that can
be used to store and retrieve any amount of data, at any time,
from anywhere on the web.
AWS S3 Functionalities
• Write, read, and delete objects containing from 1 byte to 5
gigabytes of data each. The number of objects you can
store is unlimited.
• Each object is stored in a bucket and retrieved via a unique,
developer-assigned key.
• A bucket can be located in the United States or in Europe.
All objects within the bucket will be stored in the bucket’s
location, but the objects can be accessed from anywhere.
• Authentication mechanisms are provided to ensure that
data is kept secure from unauthorized access. Objects can
be made private or public, and rights can be granted to
specific users.
• Uses standards-based REST and SOAP interfaces designed
to work with any Internet-development toolkit.
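The bucket/key model described above can be illustrated with a small in-memory sketch. This is not the S3 API: the `Bucket` class and its method names are hypothetical, and reading "5 gigabytes" as 5 × 2^30 bytes is an assumption.

```python
from dataclasses import dataclass, field

# Per-object size limit quoted on the slide; 1 GB = 2**30 bytes is an assumption.
MAX_OBJECT_BYTES = 5 * 1024**3


@dataclass
class Bucket:
    """Toy in-memory stand-in for an S3 bucket (hypothetical, not the real service)."""
    name: str
    location: str = "US"  # the slide lists the United States or Europe
    objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        """Write an object under a developer-assigned key."""
        if not 1 <= len(data) <= MAX_OBJECT_BYTES:
            raise ValueError("object must be between 1 byte and 5 GB")
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        """Read an object back by its key."""
        return self.objects[key]

    def delete(self, key: str) -> None:
        """Delete an object; deleting a missing key is a no-op in this sketch."""
        self.objects.pop(key, None)

    def keys(self) -> list:
        """List the keys stored in this bucket, sorted."""
        return sorted(self.objects)


bucket = Bucket("course-slides", location="EU")
bucket.put("aws/intro.txt", b"Amazon S3 is storage for the Internet.")
print(bucket.keys())
```

Every object lives in exactly one bucket and is addressed by its key, which is why the sketch keeps a single dictionary per bucket.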
Properties of AWS S3
• Scalable: Amazon S3 can scale in terms of storage, request
rate, and users to support an unlimited number of web-scale
applications. It uses scale as an advantage: Adding
nodes to the system increases, not decreases, its
availability, speed, throughput, capacity, and robustness.
• Reliable: Store data durably, with 99.99% availability. There
can be no single points of failure. All failures must be
tolerated or repaired by the system without any downtime.
• Fast: Amazon S3 must be fast enough to support high-performance
applications. Server-side latency must be
insignificant relative to Internet latency. Any performance
bottlenecks can be fixed by simply adding nodes to the
system.
Properties of AWS S3 (contd.)
• Inexpensive: Amazon S3 is built from inexpensive commodity
hardware components. As a result, frequent node failure is the norm
and must not affect the overall system. It must be hardware-agnostic,
so that savings can be captured as Amazon continues to drive down
infrastructure costs.
• Simple: Building highly scalable, reliable, fast, and inexpensive
storage is difficult. Doing so in a way that makes it easy to use for any
application anywhere is more difficult. Amazon S3 must do both.
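To put the 99.99% availability figure from the "Reliable" property in concrete terms, the short calculation below converts an availability fraction into the downtime it still permits. The function name is hypothetical, and the 365-day year is a simplifying assumption.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored (assumption)


def max_downtime_minutes(availability: float,
                         period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Downtime allowed in a period at the given availability (0.0 to 1.0)."""
    return (1.0 - availability) * period_minutes


print(round(max_downtime_minutes(0.9999), 2))
```

At 99.99% this works out to a little under an hour of unavailability per year.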
Pricing
Storage (Linux Based)
• $0.150 per GB – first 50 TB / month of storage used
• $0.140 per GB – next 50 TB / month of storage used
• $0.130 per GB – next 400 TB / month of storage used
• $0.120 per GB – storage used / month over 500 TB
Data Transfer
• $0.170 per GB – first 10 TB / month data transfer out
• $0.130 per GB – next 40 TB / month data transfer out
• $0.110 per GB – next 100 TB / month data transfer out
• $0.100 per GB – data transfer out / month over 150 TB
Requests
• $0.01 per 1,000 PUT, COPY, POST, or LIST requests
• $0.01 per 10,000 GET and all other requests
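The tiered rates above can be turned into a rough monthly estimate. The following is only a sketch using the prices listed on this slide. The function names are hypothetical, 1 TB is assumed to be 1,024 GB, and inbound data transfer is left out because the slide does not price it.

```python
TB = 1024  # GB per TB (assumption; the slide does not define the unit)

# (tier size in GB, price per GB); None marks the final, unbounded tier.
STORAGE_TIERS = [(50 * TB, 0.150), (50 * TB, 0.140), (400 * TB, 0.130), (None, 0.120)]
TRANSFER_OUT_TIERS = [(10 * TB, 0.170), (40 * TB, 0.130), (100 * TB, 0.110), (None, 0.100)]


def tiered_cost(gb: float, tiers) -> float:
    """Charge each slice of usage at the rate of the tier it falls into."""
    cost = 0.0
    remaining = gb
    for size, rate in tiers:
        if remaining <= 0:
            break
        billed = remaining if size is None else min(remaining, size)
        cost += billed * rate
        remaining -= billed
    return cost


def request_cost(writes: int, reads: int) -> float:
    """$0.01 per 1,000 PUT/COPY/POST/LIST and $0.01 per 10,000 GET and other requests."""
    return writes / 1000 * 0.01 + reads / 10000 * 0.01


def monthly_bill(stored_gb: float, transfer_out_gb: float, writes: int, reads: int) -> float:
    """Sum of storage, outbound transfer, and request charges for one month."""
    return (tiered_cost(stored_gb, STORAGE_TIERS)
            + tiered_cost(transfer_out_gb, TRANSFER_OUT_TIERS)
            + request_cost(writes, reads))


print(f"${monthly_bill(200, 50, 20_000, 500_000):.2f}")
```

Each tier applies only to the usage that falls inside it, so crossing a threshold lowers the rate for the additional gigabytes, not for the whole bill.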
Amazon CloudFront
• Amazon CloudFront is a web service for
content delivery.
• It integrates with other Amazon Web Services
to give developers and businesses an easy way
to distribute content to end users with low
latency, high data transfer speeds, and no
commitments.
How to use Amazon CloudFront?
• Store the original versions of your files in an
Amazon S3 bucket.
• Create a distribution to register that bucket with
Amazon CloudFront through a simple API call.
• Use your distribution’s domain name in your web
pages or application. When end users request an
object using this domain name, they are
automatically routed to the nearest edge location
for high performance delivery of your content.
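The third step above, using the distribution's domain name in your pages, can be sketched as a small URL helper. The function name and the example domain `d1234example.cloudfront.net` are hypothetical. A real distribution's domain name is assigned when the distribution is created.

```python
from urllib.parse import quote


def cloudfront_url(distribution_domain: str, object_key: str, scheme: str = "http") -> str:
    """Build the URL for an S3 object key served through a distribution's domain."""
    # quote() percent-encodes unsafe characters but keeps "/" path separators.
    return f"{scheme}://{distribution_domain}/{quote(object_key.lstrip('/'))}"


# Hypothetical distribution domain and object key, for illustration only.
print(cloudfront_url("d1234example.cloudfront.net", "images/logo.png"))
```

The object key stays the same as in the S3 bucket. Only the host part changes, which is what lets requests reach an edge location.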
Amazon Elastic Compute Cloud (Amazon EC2™)
• Amazon Elastic Compute Cloud (Amazon EC2)
is a web service that provides resizable
compute capacity in the cloud.
Amazon EC2 Service Highlights
• Elastic - Amazon EC2 enables you to increase or
decrease capacity within minutes, not hours or
days. You can commission one, hundreds or even
thousands of server instances simultaneously.
• Flexible - You have the choice of multiple instance
types, operating systems, and software packages.
• Designed for use with other Amazon Web
Services - Amazon EC2 works in conjunction with
Amazon Simple Storage Service (Amazon S3),
Amazon SimpleDB and Amazon Simple Queue
Service (Amazon SQS) to provide a complete
solution for computing, query processing and
storage across a wide range of applications.
How to use EC2
• Create an Amazon Machine Image (AMI) containing your applications, libraries,
data and associated configuration settings.
• Upload the AMI into Amazon S3. Amazon EC2 provides tools that make storing the
AMI simple. Amazon S3 provides a safe, reliable and fast repository to store your
images.
• Use Amazon EC2 web service to configure security and network access.
• Choose which instance type(s) and operating system you want, then start,
terminate, and monitor as many instances of your AMI as needed, using the web
service APIs or the variety of management tools provided.
• Determine whether you want to run in multiple locations, utilize static IP
endpoints, or attach persistent block storage to your instances.
• Pay only for the resources that you actually consume, like instance-hours or data
transfer.
Operating Systems
• Red Hat Enterprise Linux
• Windows Server 2003
• Oracle Enterprise Linux
• OpenSUSE
• Linux
• Ubuntu Linux
• Fedora
• Gentoo Linux
• Debian
Software
• Databases
– IBM DB2
– IBM Informix Dynamic Server
– Microsoft SQL Server Standard 2005
– MySQL Enterprise
– Oracle 11g
• Batch Processing
– Hadoop
– Condor
– Open MPI
Software (contd.)
• Web Hosting
– Apache HTTP
– IIS/ASP.NET
– IBM Lotus Web Content Management
– IBM WebSphere Portal Server
Pricing
• Standard On-Demand Instances
– Small (Default) $0.10 per hour
– Large $0.40 per hour
– Extra Large $0.80 per hour
• Standard Reserved Instances
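As a worked example of the on-demand rates above, the sketch below estimates what a group of instances costs. The function name and dictionary keys are hypothetical, and rounding partial hours up to a full instance-hour is an assumption the slide does not state. Reserved Instance pricing is not modeled because the slide gives no figures for it.

```python
import math

# On-demand hourly rates from the slide, in US dollars.
HOURLY_RATE = {"small": 0.10, "large": 0.40, "extra_large": 0.80}


def on_demand_cost(instance_type: str, instances: int, hours: float) -> float:
    """Estimated on-demand charge for running `instances` servers for `hours`."""
    billed_hours = math.ceil(hours)  # assumption: a partial hour counts as a full hour
    return HOURLY_RATE[instance_type] * instances * billed_hours


# Ten small instances for one day.
print(f"${on_demand_cost('small', 10, 24):.2f}")
```

Because billing is per instance-hour, ten instances for one hour cost the same as one instance for ten hours, which is what makes scaling out for a short burst affordable.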
Amazon EC2 Workflow
http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/
Creating an AMI
• Select an AMI
• Generate a Key Pair
• Launch the Instance
• Get Administrator Password
• Authorize Network Access
• Connect to the Instance
• Load Software and Make Changes
SOAP and Query API
• http://docs.amazonwebservices.com/AWSEC2/2007-08-29/DeveloperGuide/
Amazon Elastic MapReduce
• Amazon Elastic MapReduce is a web service
that enables businesses, researchers, data
analysts, and developers to easily and
cost-effectively process vast amounts of data.
• It utilizes a hosted Hadoop framework running
on the web-scale infrastructure of Amazon
Elastic Compute Cloud (Amazon EC2) and
Amazon Simple Storage Service (Amazon S3).
Amazon Elastic MapReduce (contd.)
• Develop your data processing application authored in your choice
of Java, Ruby, Perl, Python, PHP, R, or C++.
• Upload your data and your processing application into Amazon S3.
Amazon S3 provides reliable, scalable, easy-to-use storage for your
input and output data.
• Log in to the AWS Management Console to start an Amazon Elastic
MapReduce “job flow.” Simply choose the number and type of
Amazon EC2 instances you want, specify the location of your data
and/or application on Amazon S3, and then click the “Create Job
Flow” button.
• Monitor the progress of your job flow(s) directly from the AWS
Management Console, Command Line Tools or APIs. And, after the
job flow is done, retrieve the output from Amazon S3.
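For a sense of what the "data processing application" in the first step might look like, here is a minimal word-count sketch in Python that runs locally. It only imitates the map, sort, and reduce phases in memory. The function names are hypothetical, and it does not use Hadoop or the Elastic MapReduce service itself.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce phase: sum the counts for each word.

    Sorting by key stands in for the shuffle step that a MapReduce
    framework performs between the map and reduce phases.
    """
    ordered = sorted(pairs, key=itemgetter(0))
    for word, group in groupby(ordered, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_job(lines):
    """Run the whole pipeline on an in-memory list of input lines."""
    return dict(reducer(mapper(lines)))


print(run_job(["the cloud is elastic", "the cloud scales"]))
```

In a real job flow the input lines would come from files stored in Amazon S3 and the map and reduce phases would run on many EC2 instances in parallel. The logic of each phase stays the same.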
Eucalyptus
• Eucalyptus - Elastic Utility Computing
Architecture for Linking Your Programs To Useful
Systems - is an open-source software
infrastructure for implementing "cloud
computing" on clusters.
• The current interface to Eucalyptus is compatible
with Amazon's EC2, S3, and EBS interfaces, but
the infrastructure is designed to support multiple
client-side interfaces.
• Eucalyptus is implemented using commonly
available Linux tools and basic Web-service
technologies making it easy to install and
maintain.
Features of EUCALYPTUS
• Interface compatibility with EC2 (both Web service and
Query interfaces)
• Simple installation and deployment using Rocks
cluster-management tools
• Secure internal communication using SOAP with
WS-Security
• Overlay functionality requiring no modification to the
target Linux environment
• Basic "Cloud Administrator" tools for system
management and user accounting
• The ability to configure multiple clusters, each with
private internal network addresses, into a single Cloud.
References
• http://aws.amazon.com/
• http://open.eucalyptus.com/