World Wide Web Infrastructure and Evolution
Representational State Transfer: An Architectural Style for Distributed Hypermedia Interaction
Roy T. Fielding
University of California, Irvine
http://www.ics.uci.edu/~fielding/
Network Application Architecture
Software architecture of a network-based app
  – abstract system view and model for comparison
  – communication restricted to message passing
Defines
  – how system components are allocated and identified
  – how the components interact to form a system
  – the amount and granularity of communication needed for interaction
  – interface protocols
Network Performance Measures
latency
  – latent period: time between stimulus and first indication of a response
  – minimum latency = ping/echo time
throughput
  – rate of data transfer
round trips
  – number of interactions per user action
Network Performance Measures
overhead
  – setup: time to enable application-level interaction
  – message control
amortization
  – spreading overhead across many interactions
completion = setup / amortization + (roundtrips * latency) + (control + data) / throughput
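The completion formula above can be worked through numerically. What follows is a minimal sketch, not something from the talk: the function name and every sample value are assumptions chosen for illustration, with times in seconds, sizes in bytes, and throughput in bytes per second.

```python
def completion_time(setup, amortization, roundtrips, latency,
                    control, data, throughput):
    """Estimate completion time using the formula from the slide.

    setup        -- seconds to enable application-level interaction
    amortization -- number of interactions the setup cost is spread over
    roundtrips   -- interactions per user action
    latency      -- seconds per round trip
    control/data -- bytes of message control and of payload
    throughput   -- bytes per second
    """
    return (setup / amortization
            + roundtrips * latency
            + (control + data) / throughput)


# Hypothetical values: setup cost shared by 10 interactions vs. paid by one.
reused = completion_time(0.2, 10, 2, 0.05, 500, 10_000, 100_000)
single = completion_time(0.2, 1, 2, 0.05, 500, 10_000, 100_000)
```

With these assumed numbers only the setup term differs between the two calls, which is the effect amortization is meant to capture.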
User-perceived Performance
user-perceived latency impacted by
  – setup overhead
  – network distance x round trips
  – blocking/multithreading
  – collisions
user-perceived throughput impacted by
  – available network bandwidth
  – message control overhead
  – message buffering, layer mismatches
  – loss of synchronization
Network Application Performance
[Diagram labels: application, style, architecture, implementation]
Network performance is bound by
  – application requirements
  – pattern of communication
  – infrastructure used to communicate
  – implementation of components
The best network application performance is obtained by not using the network
  – disconnected operation
Architectural Styles
Common patterns within system architectures
One system may be composed of multiple styles
Some styles are hybrids of other styles
An architecture is an instantiation of a style
We could equally talk about
  – computer architecture
  – network architecture
  – software architecture [Shaw/Garlan, 1993]
  – network-based application architecture
Client/Server
most hyped, least meaningful style
emphasizes separation of concerns
initiators (clients) and listeners (servers)
Remote Session
each client initiates a session on server
application state kept on server
commands are used to exchange data or change session state
flexible, interactive, easy to extend services
scalability is a problem
Remote Data Access (RDA)
Clients send database queries to remote server
Server maintains per-client state for joins/transactions
Client must know enough about data structure to build structure-dependent queries
SQL commonly used to define query
Results transferred by (proprietary) protocol
Pipe-and-Filter
a.k.a., one-way data flow
data stream is filtered through a sequence of components
components do not need to know identity of peers
components are transitive
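The one-way data flow described above can be sketched with Python generators. This is only an illustration under assumed names (read_lines, strip_blank, upper, and pipeline are all hypothetical): each filter sees just the stream handed to it and has no knowledge of which component comes before or after.

```python
def read_lines(text):
    """Source: emit the stream one line at a time."""
    for line in text.splitlines():
        yield line


def strip_blank(lines):
    """Filter: drop empty lines; it knows nothing about its neighbours."""
    for line in lines:
        if line.strip():
            yield line


def upper(lines):
    """Filter: transform each item as it flows past."""
    for line in lines:
        yield line.upper()


def pipeline(source, *filters):
    """Compose filters left to right into a single one-way data flow."""
    stream = source
    for f in filters:
        stream = f(stream)
    return stream


result = list(pipeline(read_lines("get\n\nput\n"), strip_blank, upper))
```

Because every filter has the same stream-in, stream-out interface, filters can be reordered, removed, or inserted without changing the others.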
Event-based Integration
components listen to a message bus or register with a broker
components do not need to know identity of peers
separation of concerns, independent evolution
usually not designed for high-latency networks
poor scalability
  – collisions or single point of failure
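A small in-process sketch of the style above, assuming a hypothetical MessageBus class and topic name that are not part of the talk. Publishers hand messages to the bus and never learn who is listening.

```python
from collections import defaultdict


class MessageBus:
    """Hypothetical in-process bus; publishers never see their listeners."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to be invoked for every message on a topic."""
        self._listeners[topic].append(callback)

    def publish(self, topic, message):
        # Every delivery passes through this one object, which is why a
        # central bus can become a bottleneck or single point of failure.
        for callback in self._listeners[topic]:
            callback(message)


bus = MessageBus()
received = []
bus.subscribe("page-updated", received.append)
bus.subscribe("page-updated", lambda m: received.append(m.upper()))
bus.publish("page-updated", "index.html")
```

The two subscribers evolve independently of the publisher, which matches the separation of concerns noted on the slide; the single shared bus is also where the scalability concern comes from.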
Distributed Objects
Components interact as peers
Emphasizes object management, data hiding, state distribution
Strong typing is usually assumed
  – identity of peer is required
Uses EBI or brokered client/server
Streaming not supported in general
Distributed Process Paradigms
[Gregory Andrews, 1991]
Heartbeat
Probe/Echo
Broadcast
Token-passing
Replicated/shared server, blackboard
Replicated workers, bag of tasks
Web Architectural Style
Web architectural style revolves around five fundamental notions:
  – resource
  – representation of a resource
  – communication to obtain/modify representations
  – web “page” as an instance of application state
  – engines to move from one state to the next
      browser
      spider
      any media type handler
What is a Resource?
A resource can be anything that has identity
  – a document or image
  – a service, e.g., “today’s weather in Seattle”
  – a collection of other resources
  – non-networked objects (e.g., people)
The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity that corresponds to that mapping at any particular point in time!
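The distinction above between the mapping and the entity it currently maps to can be sketched as follows. The Resource class, the identifier path, and the forecast data are all assumptions made up for illustration.

```python
from datetime import date


class Resource:
    """A resource modelled as a mapping from time to an entity."""

    def __init__(self, identifier, mapping):
        self.identifier = identifier   # stable identity
        self._mapping = mapping        # callable: date -> entity

    def entity_at(self, when):
        """Return whatever entity the mapping selects at the given time."""
        return self._mapping(when)


# Hypothetical forecast data keyed by day.
forecasts = {date(1998, 5, 1): "rain", date(1998, 5, 2): "sun"}
weather = Resource("/weather/seattle/today", lambda d: forecasts[d])

day_one = weather.entity_at(date(1998, 5, 1))
day_two = weather.entity_at(date(1998, 5, 2))
```

The identifier stays the same on both days while the entity behind it changes, so the resource is the mapping itself rather than either forecast.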
Representations of a Resource
The Web is designed to manipulate and transfer representations of a resource
A single resource may be associated with multiple representations (content negotiation)
A representation is in the form of a media type
  – provides information for this resource
Hypermedia-aware media types
  – provide potential state transitions
Most representations are cachable
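Content negotiation, mentioned above, can be sketched as choosing among a resource's representations by media type. This is a simplified illustration with hypothetical function names: it handles exact media types, q-values, and the `*/*` wildcard, and leaves out `type/*` ranges and other Accept parameters.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending preference."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((parts[0], q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(representations, accept_header):
    """Pick the representation whose media type the client prefers most."""
    for media_type, q in parse_accept(accept_header):
        if q == 0:
            continue  # q=0 means "not acceptable"
        if media_type == "*/*" and representations:
            return next(iter(representations.items()))
        if media_type in representations:
            return media_type, representations[media_type]
    return None


# One resource, two hypothetical representations.
weather = {"text/html": "<p>Rain</p>", "text/plain": "Rain"}
chosen = negotiate(weather, "text/plain;q=0.9, text/html;q=0.5")
```

When nothing matches, the sketch returns None; a real server would decide between an error response and a default representation.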
Representational State Transfer
optimized for transfer of typed data streams
caching of representations allows application interaction to proceed without using the network
all components can be pipe-and-filter
Origin Server Model
server provides interface to services as a resource hierarchy
implementation details hidden from clients
stateless interaction for scalability
application interaction can be spread across multiple servers
replaceable by a gateway pipe
Gateway Model
appears as a normal origin server to client
provides an interface encapsulation of other services
  – data flow translation in both directions
also used for high-speed caching
Agent Model
holds all application state
  – which allows user to manipulate it (history)
  – or anticipate changes to it (link maps)
application details hidden from server
  – browser, spider, index robot, personal agent
replaceable by a proxy pipe
Proxy Model
translate multiple services into HTTP
transform data streams according to client limitations (e.g., image translation)
enforce security policies
enable shared caching
Web Architecture Evolution
Uniform Resource Identifiers
  – http://www.ics.uci.edu/~fielding/talks/
  – mailto:[email protected]
Access protocols
  – HTTP, FTP, Gopher, ...
Media types
  – HTML, XML, applet languages
Some architectural misfits
  – HTTP cookies, HTML frames
Uniform Resource Identifiers
A simple and extensible means of identifying resources
Uniformity allows
  – different types of resource identification within a single protocol element
  – uniform semantic interpretation of common syntactic elements
  – relative syntactic interpretation independent of scheme
Few changes since 1991
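The points above about uniform syntax and scheme-independent relative interpretation can be illustrated with Python's standard urllib.parse module. The relative reference "../pubs/" and the example.org FTP address are assumptions for illustration and do not appear in the talk.

```python
from urllib.parse import urljoin, urlsplit

base = "http://www.ics.uci.edu/~fielding/talks/"

# A relative reference is resolved against the base using only syntax;
# no network access and no knowledge of HTTP semantics is involved.
resolved = urljoin(base, "../pubs/")

# The same generic components are extracted regardless of scheme.
http_parts = urlsplit(base)
ftp_parts = urlsplit("ftp://example.org/pub/file.txt")

# Relative resolution works the same way for a different scheme.
ftp_sibling = urljoin("ftp://example.org/pub/file.txt", "other.txt")
```

Both identifiers split into the same scheme, authority, and path components, which is what allows one protocol element to carry several kinds of resource identification.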
Hypertext Transfer Protocol
A protocol (syntax and semantics) for transferring representations of resources
  – usually across the Internet using TCP
Design goals
  – speed (stateless, cachable, few round-trips)
  – simplicity
  – extensibility
  – data (payload) independence
A true network-based API
HTTP/0.9 (pre-1993)
Absolute Simplicity
GET /url-path
No Extensibility
  – only one method (GET)
  – no request modifiers
  – no response metadata
HTTP/1.0 (1993-present)
Simple and (mostly) Extensible
GET /Test/hello.html HTTP/1.0
Accept: text/html
User-Agent: GET/5 libwww-perl/0.40

HTTP/1.0 200 OK
Date: Fri, 12 Jan 1996 01:02:49 GMT
Server: Apache/1.0.5
Content-type: text/html
Content-length: 38
Last-modified: Wed, 10 Jan 1996 01:
HTTP/1.0 Deficiencies
No complete specification until end of '94
No minimum standard for compliance
Poor network behavior
  – one request per connection
  – no reliable transfer of dynamic content
  – no control over response caching
  – failed to anticipate proxies and gateways
  – created huge demand for vanity addresses
  – misuse/misunderstanding of MIME
HTTP/1.1
Culmination of two years' work, RFC 2068
  – with Henrik Frystyk, Jim Gettys, Jeff Mogul
  – designed at UCI and W3C; expanded in IETF
Improved Reliability
  – chunked transfer of dynamic content
  – recognition of proxy and gateway requirements
  – explicit cachability of responses
Improved Network Behavior
  – persistent connections
  – virtual hosts (many names, one address)
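Chunked transfer, listed above under improved reliability, lets a server send dynamic content without knowing its total length in advance. The following is a minimal sketch of the framing, assuming no chunk extensions and no trailers; encode_chunked and decode_chunked are illustrative names, not part of any library.

```python
def encode_chunked(pieces):
    """Yield HTTP/1.1 chunked-encoded bytes for an iterable of byte strings."""
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would end the body early
        # Each chunk: size in hex, CRLF, the data, CRLF.
        yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk followed by an empty trailer section


def decode_chunked(data):
    """Reassemble a complete chunked body (no trailers, no extensions)."""
    body, pos = b"", 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol], 16)
        if size == 0:
            return body
        start = eol + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF after the chunk data


wire = b"".join(encode_chunked([b"Hello, ", b"world"]))
```

Because the end of the body is marked explicitly by the zero-size chunk, the connection does not have to be closed to signal completion, which is what makes the feature compatible with persistent connections.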
HTTP/1.1 (1997-????)
Less Simple, More Extensible, but Compatible
GET /Test/hello.html HTTP/1.1
Host: kiwi.ics.uci.edu:8080
User-Agent: GET/7 libwww-perl/5.40

HTTP/1.1 200 OK
Date: Fri, 07 Jan 1997 15:40:09 GMT
Server: Apache/1.2b6
Content-type: text/html
Transfer-Encoding: chunked
Etag: "a797cd-465af"
Cache-control: max-age=3600
Vary: Accept-Language
…
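The Cache-control and Etag fields in the response above are what make cachability explicit. Here is a rough sketch of how a cache might use them, with hypothetical helper names; it deliberately ignores Expires, no-cache, Vary, and every other rule a real cache must apply.

```python
def max_age(cache_control):
    """Extract max-age seconds from a Cache-Control value, or None."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(headers, age_seconds):
    """True if a cached response may be reused without contacting the server."""
    limit = max_age(headers.get("Cache-Control", ""))
    return limit is not None and age_seconds < limit


def revalidation_headers(headers):
    """Request headers for a conditional GET once the cached copy is stale."""
    etag = headers.get("ETag")
    return {"If-None-Match": etag} if etag else {}


# Header values taken from the example response; dict keys are normalized.
cached = {"Cache-Control": "max-age=3600", "ETag": '"a797cd-465af"'}
fresh_now = is_fresh(cached, 120)
fresh_later = is_fresh(cached, 7200)
conditional = revalidation_headers(cached)
```

While the copy is fresh the interaction proceeds without any network use; once it is stale, the entity tag lets the server answer with a short "not modified" response instead of resending the body.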
HTTP/1.x Deficiencies
MIME is too verbose (overhead per message)
Control mixed with metadata
Metadata restricted to header or trailer
Meta-metadata requires encapsulation of entire message
Fixed request/response ordering can block progress
Lack of multiplexing prevents getting important part of multiple representations first
HTTP/2.x
Tokenized transfer of common fields
  – reducing bandwidth usage, latency
  – removal of MIME syntax limitations
  – self-descriptive for extensions
Multiplexing control, data, metadata streams
  – reducing desire for multiple connections
  – enabling multi-protocol connections
  – per-stream priority or credit mechanism
Layered streams for meta-metadata, encryption...
Media Types
Web architecture is designed to be media type independent
  – but we can only use what agents will consume
  – leading to a chicken-and-egg adoption problem
HTML is still the lingua franca
  – difficult to extend semantics, rendering
  – wasteful to extend syntactically
  – no mechanism for alternatives
ECMAScript, Dynamic HTML, applets
XML to the rescue?
“X” for extensible:
  – self-descriptive syntax
  – semantics by reference (doctype, namespaces)
  – rendering by reference (style sheets)
An XML representation is an object turned inside-out, with behavior-by-reference
However, network application performance will demand standards for domain-specific doctypes and style sheets
Conclusions
Web architectural style inherits from
  – client/server: separation of concerns, scalability
  – pipe-and-filter: streams, intermediaries, encapsulation
  – distributed objects: methods, message structure
Advantages of representational state transfer:
  – application state controlled by the user agent
  – composed of representations from multiple servers
  – representations can be cached, shared
  – matches hypermedia interaction model of combining information and control
Future Work
Dynamic application architectures
Architectural analysis and performance bounds
Impact of future network architectures (ATM)
Balancing secure transfer with firewall visibility
Protocol for manipulating resource mappings
HTTP-NG (W3C/Xerox PARC)
rHTTP (UCI)
Questions?
Slides available late next week:
  – http://www.ics.uci.edu/~fielding/talks/webarch_9805/