World Wide Web Infrastructure and Evolution


Representational State Transfer: An Architectural Style for Distributed Hypermedia Interaction

Roy T. Fielding
University of California, Irvine
http://www.ics.uci.edu/~fielding/

Network Application Architecture

• Software architecture of a network-based app
  – abstract system view and model for comparison
  – communication restricted to message passing
• Defines
  – how system components are allocated and identified
  – how the components interact to form a system
  – the amount and granularity of communication needed for interaction
  – interface protocols

Network Performance Measures

• latency
  – latent period: time between stimulus and first indication of a response
  – minimum latency = ping/echo time
• throughput
  – rate of data transfer
• round trips
  – number of interactions per user action

Network Performance Measures

• overhead
  – setup: time to enable application-level interaction
  – message control
• amortization
  – spreading overhead across many interactions
• completion = setup / amortization + (roundtrips * latency) + (control + data) / throughput
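The completion formula above can be evaluated directly. The sketch below is illustrative only: the function name, the units, and every number in the usage lines are assumptions, not figures from the talk. It treats amortization as the number of interactions that share one setup cost.

```python
def completion_time(setup, amortization, round_trips, latency,
                    control, data, throughput):
    """Estimate completion time for one user action using the slide's formula.

    Assumed units: seconds for setup and latency, bytes for control and data,
    bytes per second for throughput. `amortization` is the number of
    interactions over which the setup cost is spread.
    """
    return (setup / amortization
            + round_trips * latency
            + (control + data) / throughput)


# Hypothetical numbers: 0.3 s setup, 2 round trips at 0.1 s each,
# 500 bytes of control data plus 9500 bytes of payload at 10,000 bytes/s.
one_shot = completion_time(0.3, 1, 2, 0.1, 500, 9500, 10_000)
reused = completion_time(0.3, 10, 2, 0.1, 500, 9500, 10_000)
print(f"setup paid on every interaction: {one_shot:.2f} s")
print(f"setup amortized over 10 interactions: {reused:.2f} s")
```

With these made-up inputs, amortizing setup changes only the first term. Round trips and transfer time are unaffected, so latency-bound interactions gain little from connection reuse alone.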

User-perceived Performance

• user-perceived latency impacted by
  – setup overhead
  – network distance x round trips
  – blocking/multithreading
  – collisions
• user-perceived throughput impacted by
  – available network bandwidth
  – message control overhead
  – message buffering, layer mismatches
  – loss of synchronization

Network Application Performance

application, style, architecture, implementation

• Network performance is bound by
  – application requirements
  – pattern of communication
  – infrastructure used to communicate
  – implementation of components
• The best network application performance is obtained by not using the network
  – disconnected operation

Architectural Styles

• Common patterns within system architectures
• One system may be composed of multiple styles
• Some styles are hybrids of other styles
• An architecture is an instantiation of a style
• We could equally talk about
  – computer architecture
  – network architecture
  – software architecture [Shaw/Garlan, 1993]
• network-based application architecture

Client/Server

• most hyped, least meaningful style
• emphasizes separation of concerns
• initiators (clients) and listeners (servers)

Remote Session

• each client initiates a session on server
• application state kept on server
• commands are used to exchange data or change session state
• flexible, interactive, easy to extend services
• scalability is a problem

Remote Data Access (RDA)

• Clients send database queries to remote server
• Server maintains per-client state for joins/transactions
• Client must know enough about data structure to build structure-dependent queries
• SQL commonly used to define query
• Results transferred by (proprietary) protocol

Pipe-and-Filter

• a.k.a., one-way data flow
• data stream is filtered through a sequence of components
• components do not need to know identity of peers
• components are transitive
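A minimal sketch of the pipe-and-filter style using Python generators. The filter names and the sample input are hypothetical. Each filter sees only the stream it is handed and never the identity of its neighbours, which is the property the slide emphasizes.

```python
def read_lines(text):
    """Source: emit the data stream one line at a time."""
    for line in text.splitlines():
        yield line


def strip_blank(lines):
    """Filter: drop empty lines; knows nothing about its peers."""
    for line in lines:
        if line.strip():
            yield line


def uppercase(lines):
    """Filter: transform each item as it flows past."""
    for line in lines:
        yield line.upper()


def pipeline(source, *filters):
    """Connect filters in order; data flows one way, from source to sink."""
    stream = source
    for stage in filters:
        stream = stage(stream)
    return stream


result = list(pipeline(read_lines("get\n\nhead\n"), strip_blank, uppercase))
print(result)
```

Because every stage has the same stream-in, stream-out shape, stages can be added, removed, or reordered without changing the others.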

Event-based Integration

• components listen to a message bus or register with a broker
• components do not need to know identity of peers
• separation of concerns, independent evolution
• usually not designed for high-latency networks
• poor scalability
  – collisions or single point of failure
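The broker variant of event-based integration can be sketched in a few lines. The `MessageBus` class, topic name, and handlers below are hypothetical and run in a single process. The sketch shows only the decoupling: publishers and subscribers share a topic, not each other's identity. It also makes the single point of failure visible, since every message passes through one bus object.

```python
from collections import defaultdict


class MessageBus:
    """In-process broker: components register handlers per topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver to every handler for the topic; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = MessageBus()
log, index = [], []
bus.subscribe("page-changed", log.append)     # one listener records events
bus.subscribe("page-changed", index.append)   # another reacts independently
delivered = bus.publish("page-changed", "/talks/")
print(delivered, log, index)
```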

Distributed Objects

• Components interact as peers
• Emphasizes object management, data hiding, state distribution
• Strong typing is usually assumed
  – identity of peer is required
• Uses EBI or brokered client/server
• Streaming not supported in general

Distributed Process Paradigms

[Gregory Andrews, 1991]

• Heartbeat
• Probe/Echo
• Broadcast
• Token-passing
• Replicated/shared server, blackboard
• Replicated workers, bag of tasks

Web Architectural Style

• Web architectural style revolves around five fundamental notions:
  – resource
  – representation of a resource
  – communication to obtain/modify representations
  – web “page” as an instance of application state
  – engines to move from one state to the next
    • browser
    • spider
    • any media type handler

What is a Resource?

• A resource can be anything that has identity
  – a document or image
  – a service, e.g., “today’s weather in Seattle”
  – a collection of other resources
  – non-networked objects (e.g., people)
• The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity that corresponds to that mapping at any particular point in time!

Representations of a Resource

• The Web is designed to manipulate and transfer representations of a resource
• A single resource may be associated with multiple representations (content negotiation)
• A representation is in the form of a media type
  – provides information for this resource
• Hypermedia-aware media types
  – provide potential state transitions
• Most representations are cachable
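Content negotiation, as mentioned above, means picking one of several representations of the same resource. The sketch below is a deliberately simplified illustration: the function names and sample representations are hypothetical. It honors `q` values and `*/*` but ignores `type/*` ranges and the specificity rules of the real HTTP specification.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((parts[0], q))
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def select_representation(available, accept_header):
    """Return (media_type, body) for the best acceptable representation, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        if media_type in available:
            return media_type, available[media_type]
        if media_type == "*/*" and available:
            first = next(iter(available))
            return first, available[first]
    return None


# One resource ("today's weather in Seattle"), two hypothetical representations.
weather = {"text/html": "<p>rain</p>", "text/plain": "rain"}
print(select_representation(weather, "text/plain;q=0.9, text/html;q=0.5"))
```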

Representational State Transfer

• optimized for transfer of typed data streams
• caching of representations allows application interaction to proceed without using the network
• all components can be pipe-and-filter
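The point about caching can be made concrete with a small sketch. `CachingClient` and `fake_origin` are hypothetical names, and the sketch ignores freshness and validation entirely. It shows only that a repeated request for the same URI is answered without touching the network.

```python
class CachingClient:
    """Wrap a fetch function and reuse representations already retrieved."""

    def __init__(self, fetch):
        self._fetch = fetch          # callable that performs the network request
        self._cache = {}             # URI -> cached representation
        self.network_requests = 0

    def get(self, uri):
        if uri not in self._cache:
            self.network_requests += 1
            self._cache[uri] = self._fetch(uri)
        return self._cache[uri]


def fake_origin(uri):
    """Stand-in for an origin server; a real client would issue an HTTP GET."""
    return f"representation of {uri}"


client = CachingClient(fake_origin)
first = client.get("/talks/")
second = client.get("/talks/")   # served from the cache, no network use
print(client.network_requests)
```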

Origin Server Model

• server provides interface to services as a resource hierarchy
• implementation details hidden from clients
• stateless interaction for scalability
• application interaction can be spread across multiple servers
• replaceable by a gateway pipe

Gateway Model

• appears as a normal origin server to client
• provides an interface encapsulation of other services
  – data flow translation in both directions
• also used for high-speed caching

Agent Model

• holds all application state
  – which allows user to manipulate it (history)
  – or anticipate changes to it (link maps)
• application details hidden from server
  – browser, spider, index robot, personal agent
• replaceable by a proxy pipe

Proxy Model

• translate multiple services into HTTP
• transform data streams according to client limitations (e.g., image translation)
• enforce security policies
• enable shared caching

Web Architecture Evolution

• Uniform Resource Identifiers
  – http://www.ics.uci.edu/~fielding/talks/
  – mailto:[email protected]
• Access protocols
  – HTTP, FTP, Gopher, ...
• Media types
  – HTML, XML, applet languages
• Some architectural misfits
  – HTTP cookies, HTML frames

Uniform Resource Identifiers

• A simple and extensible means of identifying resources
• Uniformity allows
  – different types of resource identification within a single protocol element
  – uniform semantic interpretation of common syntactic elements
  – relative syntactic interpretation independent of scheme
• Few changes since 1991
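The claim that relative references are interpreted the same way regardless of scheme can be checked with Python's standard `urllib.parse` module. The FTP base URI and the relative path below are made up for illustration; only the HTTP base comes from the slides.

```python
from urllib.parse import urljoin, urlsplit

relative = "../pubs/index.html"                        # hypothetical relative reference
http_base = "http://www.ics.uci.edu/~fielding/talks/"
ftp_base = "ftp://ftp.example.org/pub/talks/"          # hypothetical FTP base

# The same relative reference resolves by the same syntactic rules
# under two different schemes.
http_resolved = urljoin(http_base, relative)
ftp_resolved = urljoin(ftp_base, relative)
print(http_resolved)
print(ftp_resolved)

# Common syntactic elements (scheme, authority, path) are split uniformly.
parts = urlsplit(http_base)
print(parts.scheme, parts.netloc, parts.path)
```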

Hypertext Transfer Protocol

• A protocol (syntax and semantics) for transferring representations of resources
  – usually across the Internet using TCP
• Design goals
  – speed (stateless, cachable, few round-trips)
  – simplicity
  – extensibility
  – data (payload) independence
• A true network-based API

HTTP/0.9 (pre-1993)

• Absolute Simplicity

GET /url-path

Hello World Hello World

• No Extensibility
  – only one method (GET)
  – no request modifiers
  – no response metadata

HTTP/1.0 (1993-present)

• Simple and (mostly) Extensible

GET /Test/hello.html HTTP/1.0
Accept: text/html
User-Agent: GET/5 libwww-perl/0.40

HTTP/1.0 200 OK
Date: Fri, 12 Jan 1996 01:02:49 GMT
Server: Apache/1.0.5
Content-type: text/html
Content-length: 38
Last-modified: Wed, 10 Jan 1996 01:

Hello Hello out there!
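The response metadata that HTTP/1.0 added is a status line followed by `Name: value` header fields and a blank line. The sketch below splits such a message head into its parts. The function name is hypothetical, the sample reuses the header fields from the slide that are legible in full, and real parsers must also handle details ignored here, such as repeated fields and folded lines.

```python
def parse_response_head(raw):
    """Split a response head into (version, status, reason, headers)."""
    lines = raw.split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:
            break                      # blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers


raw_head = ("HTTP/1.0 200 OK\r\n"
            "Date: Fri, 12 Jan 1996 01:02:49 GMT\r\n"
            "Server: Apache/1.0.5\r\n"
            "Content-type: text/html\r\n"
            "Content-length: 38\r\n"
            "\r\n")
version, status, reason, headers = parse_response_head(raw_head)
print(version, status, reason)
print(headers["content-type"], headers["content-length"])
```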

HTTP/1.0 Deficiencies

• No complete specification until end of '94
• No minimum standard for compliance
• Poor network behavior
  – one request per connection
  – no reliable transfer of dynamic content
  – no control over response caching
  – failed to anticipate proxies and gateways
  – created huge demand for vanity addresses
  – misuse/misunderstanding of MIME

HTTP/1.1

• Culmination of two years' work, RFC 2068
  – with Henrik Frystyk, Jim Gettys, Jeff Mogul
  – designed at UCI and W3C; expanded in IETF
• Improved Reliability
  – chunked transfer of dynamic content
  – recognition of proxy and gateway requirements
  – explicit cachability of responses
• Improved Network Behavior
  – persistent connections
  – virtual hosts (many names, one address)
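Chunked transfer lets a server send dynamic content reliably without knowing its length in advance: each chunk is prefixed with its size in hexadecimal, and a zero-size chunk marks the end. The sketch below covers only that basic framing. The function names are hypothetical, chunk extensions are skipped, and trailers are not handled.

```python
def encode_chunked(chunks):
    """Frame an iterable of byte strings using chunked transfer coding."""
    out = b""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would be read as end-of-body
            out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    return out + b"0\r\n\r\n"          # last chunk, no trailers


def decode_chunked(body):
    """Reassemble the payload from a chunked body (trailers ignored)."""
    data = b""
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        size_field = body[pos:eol].split(b";")[0]   # drop any chunk extension
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            return data
        start = eol + 2
        data += body[start:start + size]
        pos = start + size + 2                      # skip the CRLF after the chunk


wire = encode_chunked([b"Hello ", b"out there!"])
print(wire)
print(decode_chunked(wire))
```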

HTTP/1.1 (1997-????)

• Less Simple, More Extensible, but Compatible

GET /Test/hello.html HTTP/1.1
Host: kiwi.ics.uci.edu:8080
User-Agent: GET/7 libwww-perl/5.40

HTTP/1.1 200 OK
Date: Fri, 07 Jan 1997 15:40:09 GMT
Server: Apache/1.2b6
Content-type: text/html
Transfer-Encoding: chunked
Etag: "a797cd-465af"
Cache-control: max-age=3600
Vary: Accept-Language
…
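The `Cache-control: max-age=3600` field in the response above is an example of explicit cachability: the server states how long the representation may be reused. A rough sketch of how a cache might act on it follows. The function names are hypothetical, and real caches also weigh other directives, the `Age` and `Expires` fields, and validators such as `Etag`.

```python
def max_age(cache_control):
    """Return the max-age value in seconds, or None if the directive is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(cache_control, age_seconds):
    """True if a stored response of the given age may be reused without revalidation."""
    limit = max_age(cache_control)
    return limit is not None and age_seconds < limit


print(is_fresh("max-age=3600", 600))    # stored ten minutes ago
print(is_fresh("max-age=3600", 7200))   # stored two hours ago
```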

HTTP/1.x Deficiencies

• MIME is too verbose (overhead per message)
• Control mixed with metadata
• Metadata restricted to header or trailer
• Meta-metadata requires encapsulation of entire message
• Fixed request/response ordering can block progress
• Lack of multiplexing prevents getting important part of multiple representations first

HTTP/2.x

• Tokenized transfer of common fields
  – reducing bandwidth usage, latency
  – removal of MIME syntax limitations
  – self-descriptive for extensions
• Multiplexing control, data, metadata streams
  – reducing desire for multiple connections
  – enabling multi-protocol connections
  – per-stream priority or credit mechanism
• Layered streams for meta-metadata, encryption...
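To illustrate why multiplexing helps, the sketch below interleaves fixed-size frames from several streams over one connection, so the first part of every representation arrives before any single one completes. The frame layout (a stream identifier paired with a payload) and the round-robin policy are assumptions made for illustration. The slides do not define a wire format, and a priority or credit mechanism is not modeled.

```python
def multiplex(streams, frame_size):
    """Interleave streams round-robin into (stream_id, payload) frames."""
    offsets = {sid: 0 for sid in streams}
    frames = []
    while offsets:
        for sid in list(offsets):            # copy keys: entries are removed below
            start = offsets[sid]
            frames.append((sid, streams[sid][start:start + frame_size]))
            offsets[sid] = start + frame_size
            if offsets[sid] >= len(streams[sid]):
                del offsets[sid]             # stream fully sent
    return frames


def demultiplex(frames):
    """Reassemble each stream from the interleaved frames."""
    out = {}
    for sid, payload in frames:
        out[sid] = out.get(sid, b"") + payload
    return out


streams = {1: b"aaaaaa", 2: b"bb"}           # hypothetical stream contents
frames = multiplex(streams, 2)
print(frames)
print(demultiplex(frames) == streams)
```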

Media Types

• Web architecture is designed to be media type independent
  – but we can only use what agents will consume
  – leading to a chicken-and-egg adoption problem
• HTML is still the lingua franca
  – difficult to extend semantics, rendering
  – wasteful to extend syntactically
  – no mechanism for alternatives
• ECMAScript, Dynamic HTML, applets

XML to the rescue?

• “X” for extensible:
  – self-descriptive syntax
  – semantics by reference (doctype, namespaces)
  – rendering by reference (style sheets)
• An XML representation is an object turned inside-out, with behavior-by-reference
• However, network application performance will demand standards for domain-specific doctypes and style sheets

Conclusions

• Web architectural style inherits from
  – client/server: separation of concerns, scalability
  – pipe-and-filter: streams, intermediaries, encapsulation
  – distributed objects: methods, message structure
• Advantages of representational state transfer:
  – application state controlled by the user agent
  – composed of representations from multiple servers
  – representations can be cached, shared
  – matches hypermedia interaction model of combining information and control

Future Work

• Dynamic application architectures
• Architectural analysis and performance bounds
• Impact of future network architectures (ATM)
• Balancing secure transfer with firewall visibility
• Protocol for manipulating resource mappings
• HTTP-NG (W3C/Xerox PARC)
• rHTTP (UCI)

Questions?

• Slides available late next week:
  – http://www.ics.uci.edu/~fielding/talks/webarch_9805/