Transcript Slide 1

SCOPE
Professor: Dr. Kyumars Sheykh Esmaili
Lecturers: Kayvan Zarei, Shahed Mahmoodi
Azad University of Sanandaj
1
1. About SCOPE
2. Platform Overview
   2.1 Cosmos Storage System
   2.2 Cosmos Execution Environment
3. SCOPE Scripting Language
   3.1 Input and Output
   3.2 Select and Join
   3.3 Expressions and Functions
   3.4 User-Defined Operators
4. SCOPE Execution
   4.1 SCOPE Compilation
   4.2 SCOPE Optimization
   4.3 Example Query Plan
   4.4 Runtime Optimization
5. Experimental Evaluation
   5.1 Experimental Setup
   5.2 TPC-H Queries
   5.3 Scalability
6. Related Work
SCOPE (Structured Computations Optimized for Parallel Execution)
- A new declarative and extensible scripting language
- Targeted at this type of massive data analysis
- Amenable to efficient parallel execution on large clusters
- Data is modeled as sets of rows composed of typed columns
SCOPE
Structured Computations Optimized for Parallel Execution
- A declarative scripting language
- Easy to use: SQL-like syntax plus MapReduce-like extensions
- Modular: provides a rich class of runtime operators
- Highly extensible:
  - Fully integrated with the .NET framework
  - Provides interfaces for customized operations
- Flexible programming style: nested expressions or a series of simple transformations
- Users focus on problem solving as if on a single machine
- System complexity and parallelism are hidden
Note:
Users can easily define their own functions and implement their own versions of operators:
- Extractors (parsing and constructing rows from a file)
- Processors (row-wise processing)
- Reducers (group-wise processing)
- Combiners (combining rows from two inputs)
Large-scale Distributed Computing
[Figure: a cluster of many machines, captioned "How to program the beast?"]
- Large data centers (x1000 machines): storage and computation
- Key technology for search (Bing, Google, Yahoo)
- Web data analysis, user log analysis, relevance studies, etc.
Internet companies store and analyze massive data sets, for example:
- search logs
- web content collected by crawlers
- click streams collected from a variety of web services
Such analysis is becoming increasingly valuable for business in a variety of ways:
- to improve service quality and support novel features
- to detect changes in patterns over time
- to detect fraudulent activity
Parallel Processing
Matrix Multiplication
Parallel Processing Architecture in Database Systems
- Inter-Query
- Inter-Operation
- Intra-Operation
Parallel Processing in Business?
Massively Parallel Processing
Database for Business Intelligence
http://www.computerworld.com/pdfs/mpp_wp.pdf
Companies have developed distributed data storage and processing systems on large clusters of shared-nothing commodity servers, including:
- Google's File System, Bigtable, Map-Reduce
- Hadoop
- Yahoo!'s Pig system
- Ask.com's Neptune
- Microsoft's Dryad
A typical cluster consists of hundreds or thousands of commodity machines connected via a high-bandwidth network. It is challenging to design a programming model that enables users to easily write programs that can efficiently and effectively utilize all resources in such a cluster and achieve the maximum degree of parallelism.
Map-Reduce / GFS
- GFS / Bigtable provide distributed storage
- The Map-Reduce programming model
  - Good abstraction of group-by-aggregation operations
  - Map function -> grouping
  - Reduce function -> aggregation
- Very rigid: every computation has to be structured as a sequence of map-reduce pairs
- Not completely transparent: users still have to use a parallel mindset
- Error-prone and suboptimal: writing map-reduce programs is equivalent to writing physical execution plans in a DBMS
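The grouping/aggregation split described above can be sketched in a few lines of Python. This is a minimal, single-machine illustration of the programming model, not any real framework API; the query-count example at the bottom is hypothetical.

```python
from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn):
    """Minimal map-reduce skeleton: map emits (key, value) pairs,
    the framework groups them by key, and reduce aggregates each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):          # map -> grouping
            groups[key].append(value)
    return {key: reduce_fn(key, values)            # reduce -> aggregation
            for key, values in groups.items()}

# Hypothetical example: count how often each query appears in a log.
log = ["sports", "news", "sports", "weather", "sports"]
counts = map_reduce(log,
                    map_fn=lambda q: [(q, 1)],
                    reduce_fn=lambda k, vs: sum(vs))
```

Note how even this tiny computation forces the "parallel mindset" the slide mentions: the author must phrase counting as emit-ones-then-sum rather than simply writing a GROUP BY.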
Pig Latin / Hadoop
- Hadoop: distributed file system and map-reduce execution engine
- Pig Latin: a dataflow language using a nested data model
- Imperative programming style
- Relational data manipulation primitives and plug-in code to customize processing
- New syntax: users need to learn a new language
- Queries are mapped to the map-reduce engine
An Example: QCount
Compute the popular queries that have been requested at least 1000 times

SELECT query, COUNT(*) AS count
FROM "search.log" USING LogExtractor
GROUP BY query
HAVING count > 1000
ORDER BY count DESC;
OUTPUT TO "qcount.result"

The same computation written as a series of simple transformations:

e  = EXTRACT query
     FROM "search.log" USING LogExtractor;
s1 = SELECT query, COUNT(*) AS count
     FROM e GROUP BY query;
s2 = SELECT query, count
     FROM s1 WHERE count > 1000;
s3 = SELECT query, count
     FROM s2 ORDER BY count DESC;
OUTPUT s3 TO "qcount.result"

Data model: a relational rowset with a well-defined schema
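For intuition, the same QCount pipeline can be traced step by step in plain Python over an in-memory list of queries (the log contents and threshold below are made up for illustration):

```python
from collections import Counter

# Stand-in for rows extracted from "search.log": one query string per row.
search_log = ["q1"] * 1500 + ["q2"] * 800 + ["q3"] * 2000

# SELECT query, COUNT(*) AS count ... GROUP BY query
counts = Counter(search_log)

# HAVING count > 1000
popular = {q: c for q, c in counts.items() if c > 1000}

# ORDER BY count DESC
result = sorted(popular.items(), key=lambda qc: qc[1], reverse=True)
```

Each comment maps one SCOPE clause to one transformation, mirroring the e/s1/s2/s3 decomposition above.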
SCOPE / Cosmos
Microsoft has developed a distributed computing platform, called Cosmos.
Figure 1: Cosmos Software Layers
Microsoft has developed a distributed computing platform, called Cosmos, for storing and analyzing massive data sets. Cosmos is designed to run on large clusters consisting of thousands of commodity servers. Disk storage is distributed, with each server having one or more direct-attached disks.
- Cosmos Storage System
  - Append-only distributed file system for storing petabytes of data
  - Optimized for sequential I/O
  - Data is compressed and replicated
- Cosmos Execution Environment
  - Flexible model: a job is a DAG (directed acyclic graph)
    - Vertices -> processes
    - Edges -> data flows
  - The job manager schedules and coordinates vertex execution
  - Provides runtime optimization, fault tolerance, resource management
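The vertices-and-edges job model can be sketched as a toy scheduler: run each vertex once all of its inputs are available, as a job manager would. This is an illustrative simplification (single-threaded, no fault tolerance); the vertex names and functions are invented.

```python
def run_dag(vertices, edges):
    """vertices: name -> function(list_of_inputs) -> output.
    edges: (src, dst) pairs representing data flows.
    Runs each vertex after all its predecessors have produced output."""
    preds = {v: [s for s, d in edges if d == v] for v in vertices}
    done, order = {}, []
    while len(done) < len(vertices):
        for v in vertices:
            if v not in done and all(p in done for p in preds[v]):
                done[v] = vertices[v]([done[p] for p in preds[v]])
                order.append(v)
    return done, order

# Tiny hypothetical job: two extract vertices feed one aggregate vertex.
vertices = {
    "extract1": lambda ins: [1, 2],
    "extract2": lambda ins: [3],
    "aggregate": lambda ins: sum(x for rows in ins for x in rows),
}
outputs, order = run_dag(vertices,
                         [("extract1", "aggregate"), ("extract2", "aggregate")])
```

In the real system each vertex is a separate process and the edges are data files or channels between machines, but the scheduling constraint is the same.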
High-level design objectives for the Cosmos platform include:
1. Availability: Cosmos is resilient to multiple hardware failures to avoid whole-system outages.
2. Reliability: Cosmos is architected to recognize transient hardware conditions to avoid corrupting the system.
3. Scalability: Cosmos is designed from the ground up to be a scalable system, capable of storing and processing petabytes of data.
4. Performance: Cosmos runs on clusters comprising thousands of individual servers.
5. Cost: Cosmos is cheaper to build, operate, and expand, per gigabyte, than traditional approaches to the same problem.
Cosmos Storage System
- An append-only file system that reliably stores petabytes of data
- The system is optimized for large sequential I/O
- All writes are append-only
- Data is distributed
A Cosmos Store provides a directory with a hierarchical namespace and stores sequential files of unlimited size. A file is physically composed of a sequence of extents. Extents are the unit of space allocation and are typically a few hundred megabytes in size. A unit of computation generally consumes a small number of collocated extents.
Cosmos Execution Environment
The lowest-level primitives of the Cosmos execution environment provide only the ability to run arbitrary executable code on a server. Clients upload application code and resources onto the system via a Cosmos execution protocol. A recipient server assigns the task a priority and executes it at an appropriate time. Programming at this lowest level to build an efficient and fault-tolerant application is difficult, tedious, error-prone, and time-consuming.
Input and Output
- SCOPE works on both relational and nonrelational data sources
- The EXTRACT and OUTPUT commands provide a relational abstraction of the underlying data sources

EXTRACT column[:<type>] [, …]
FROM <input_stream(s)>
USING <Extractor> [(args)]
[HAVING <predicate>]

OUTPUT [<input>]
TO <output_stream>
[USING <Outputter> [(args)]]

- Built-in/customized extractors and outputters (C# classes)

public class LineitemExtractor : Extractor
{
    …
    public override Schema Produce(string[] requestedColumns, string[] args)
    { … }
    public override IEnumerable<Row> Extract(StreamReader reader, Row outputRow, string[] args)
    { … }
}
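The job of an Extractor, turning a raw byte stream into a rowset, can be illustrated with a short Python analogue. The tab-separated log format and column names below are hypothetical, not the real LogExtractor format:

```python
def line_extractor(lines, columns=("query", "time")):
    """Extractor analogue: parse each tab-separated line into a row,
    yielding rows one at a time the way Extract() streams them out."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        yield dict(zip(columns, fields))

# Feed it two raw log lines; each becomes a typed-ish row.
rows = list(line_extractor(["foo\t10:00", "bar\t10:01"]))
```

In SCOPE the schema half of this job is handled separately by Produce(), which tells the compiler what columns the extractor will emit.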
Select and Join

SELECT [DISTINCT] [TOP count] select_expression [AS <name>] [, …]
FROM { <input stream(s)> USING <Extractor> |
       {<input> [<joined input> […]]} [, …]
     }
[WHERE <predicate>]
[GROUP BY <grouping_columns> [, …]]
[HAVING <predicate>]
[ORDER BY <select_list_item> [ASC | DESC] [, …]]

joined input: <join_type> JOIN <input> [ON <equijoin>]
join_type: [INNER | {LEFT | RIGHT | FULL} OUTER]

- Supports different aggregate functions: COUNT, COUNTIF, MIN, MAX, SUM, AVG, STDEV, VAR, FIRST, LAST
- No subqueries (but the same functionality is available via outer join)
Deep Integration with .NET (C#)
- SCOPE supports C# expressions and built-in .NET functions/library
- User-defined scalar expressions
- User-defined aggregation functions

R1 = SELECT A+C AS ac, B.Trim() AS B1
     FROM R
     WHERE StringOccurs(C, "xyz") > 2

#CS
public static int StringOccurs(string str, string ptrn)
{ … }
#ENDCS
User-Defined Operators
- SCOPE supports three highly extensible commands: PROCESS, REDUCE, and COMBINE
- They complement SELECT for complicated analysis
- Easy to customize by extending built-in C# components
- Easy to reuse code in other SCOPE scripts
Process
- The PROCESS command takes a rowset as input, processes each row, and outputs a sequence of rows

PROCESS [<input>]
USING <Processor> [(args)]
[PRODUCE column [, …]]
[WHERE <predicate>]
[HAVING <predicate>]

public class MyProcessor : Processor
{
    public override Schema Produce(string[] requestedColumns, string[] args, Schema inputSchema)
    { … }
    public override IEnumerable<Row> Process(RowSet input, Row outRow, string[] args)
    { … }
}
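The row-wise contract, each input row may yield zero, one, or many output rows, can be sketched in Python. The term-splitting processor is an invented example:

```python
def process(rows, processor):
    """PROCESS analogue: apply a row-wise processor to each input row;
    the processor may emit zero, one, or many output rows."""
    for row in rows:
        yield from processor(row)

# Hypothetical processor: split a query string into one row per term.
split_terms = lambda row: ({"term": t} for t in row["query"].split())

out = list(process([{"query": "cheap flights"}], split_terms))
```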
Reduce
- The REDUCE command takes a grouped rowset, processes each group, and outputs zero, one, or multiple rows per group

REDUCE [<input> [PRESORT column [ASC|DESC] [, …]]]
ON grouping_column [, …]
USING <Reducer> [(args)]
[PRODUCE column [, …]]
[WHERE <predicate>]
[HAVING <predicate>]

public class MyReducer : Reducer
{
    public override Schema Produce(string[] requestedColumns, string[] args, Schema inputSchema)
    { … }
    public override IEnumerable<Row> Reduce(RowSet input, Row outRow, string[] args)
    { … }
}

- Map/Reduce can be easily expressed by Process/Reduce
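Group-wise processing can be sketched with Python's itertools.groupby, which, like REDUCE with PRESORT, assumes the input arrives sorted on the grouping column. The per-query summing reducer is an invented example:

```python
from itertools import groupby

def reduce_rows(rows, key_col, reducer):
    """REDUCE analogue: group consecutive rows on key_col (input assumed
    sorted on it, as PRESORT would ensure) and let the reducer emit
    zero or more output rows per group."""
    for key, group in groupby(rows, key=lambda r: r[key_col]):
        yield from reducer(key, list(group))

rows = [{"q": "a", "n": 1}, {"q": "a", "n": 2}, {"q": "b", "n": 5}]
sums = list(reduce_rows(
    rows, "q",
    lambda k, g: [{"q": k, "total": sum(r["n"] for r in g)}]))
```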
Combine
- The COMBINE command takes two matching input rowsets, combines them in some way, and outputs a sequence of rows

COMBINE <input1> [AS <alias1>] [PRESORT …]
WITH <input2> [AS <alias2>] [PRESORT …]
ON <equality_predicate>
USING <Combiner> [(args)]
PRODUCE column [, …]
[HAVING <expression>]

COMBINE S1 WITH S2
ON S1.A==S2.A AND S1.B==S2.B AND S1.C==S2.C
USING MyCombiner
PRODUCE D, E, F

public class MyCombiner : Combiner
{
    public override Schema Produce(string[] requestedColumns, string[] args,
        Schema leftSchema, string leftTable, Schema rightSchema, string rightTable)
    { … }
    public override IEnumerable<Row> Combine(RowSet left, RowSet right, Row outputRow, string[] args)
    { … }
}
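A combiner's essential job, pairing up the left and right rows that satisfy the equality predicate and handing each pair to custom code, can be sketched in Python (using a hash on the key rather than the presorted merge a real implementation might use; the data is invented):

```python
from collections import defaultdict

def combine(left, right, key, combiner):
    """COMBINE analogue: match left and right rows sharing the same key
    value and let the combiner emit output rows for each matching pair."""
    by_key = defaultdict(list)
    for r in right:
        by_key[r[key]].append(r)
    for l in left:
        for r in by_key.get(l[key], []):
            yield from combiner(l, r)

left = [{"a": 1, "x": "L1"}]
right = [{"a": 1, "y": "R1"}, {"a": 2, "y": "R2"}]
out = list(combine(left, right, "a",
                   lambda l, r: [{"x": l["x"], "y": r["y"]}]))
```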
Importing Scripts

IMPORT <script_file>
[PARAMS <par_name> = <value> [, …]]

- Combines the benefits of virtual views and stored procedures in SQL
- Enables modularity and information hiding
- Improves reusability and allows parameterization
- Provides a security mechanism

MyView.script:
E = EXTRACT query
    FROM @@logfile@@
    USING LogExtractor;
EXPORT
R = SELECT query, COUNT(*) AS count
    FROM E
    GROUP BY query
    HAVING count > @@limit@@;

Q1 = IMPORT "MyView.script"
     PARAMS logfile="Queries_Jan.log",
            limit=1000;
Q2 = IMPORT "MyView.script"
     PARAMS logfile="Queries_Feb.log",
            limit=1000;
…
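The parameterization mechanism amounts to textual substitution of the @@name@@ placeholders before compilation, which a few lines of Python make concrete (the abbreviated view text below is an invented stand-in, not the full MyView.script):

```python
def expand_params(script, params):
    """IMPORT ... PARAMS analogue: substitute @@name@@ placeholders
    in a script template with the caller's values."""
    for name, value in params.items():
        script = script.replace(f"@@{name}@@", str(value))
    return script

view = 'EXTRACT query FROM @@logfile@@ ... HAVING count > @@limit@@;'
q1 = expand_params(view, {"logfile": '"Queries_Jan.log"', "limit": 1000})
```

Importing the same view twice with different parameters, as Q1 and Q2 do, is then just two calls with different dictionaries.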
Life of a SCOPE Query
[Figure: SCOPE queries pass through the parser / compiler / security layer and are dispatched to the cluster for execution]
Optimizer and Runtime
[Figure: SCOPE queries (logical operator trees) enter a transformation engine that draws on logical operators, physical operators, cardinality estimation, optimization rules, and cost estimation to produce optimal query plans (a vertex DAG)]
- SCOPE optimizer
  - Transformation-based optimizer
  - Reasons about plan properties (partitioning, grouping, sorting, etc.)
  - Chooses an optimal plan based on cost estimates
  - Vertex DAG: each vertex contains a pipeline of operators
- SCOPE Runtime
  - Provides a rich class of composable physical operators
  - Operators are implemented using the iterator model
  - Executes a series of operators in a pipelined fashion
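The iterator model mentioned above composes operators so that rows stream through one at a time with no intermediate materialization. A minimal Python sketch, with generators playing the role of iterators and invented scan/filter/project operators:

```python
def scan(rows):
    """Leaf operator: produce rows from a source."""
    yield from rows

def filter_op(child, pred):
    """Each operator pulls rows from its child on demand."""
    return (r for r in child if pred(r))

def project(child, cols):
    return ({c: r[c] for c in cols} for r in child)

# A vertex is a pipeline of operators; nothing runs until the
# consumer pulls, and rows flow through the whole chain one by one.
plan = project(
    filter_op(scan([{"q": "a", "n": 5}, {"q": "b", "n": 0}]),
              lambda r: r["n"] > 0),
    ["q"])
result = list(plan)
```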
Example Query Plan (QCount)

SELECT query, COUNT(*) AS count
FROM "search.log" USING LogExtractor
GROUP BY query
HAVING count > 1000
ORDER BY count DESC;
OUTPUT TO "qcount.result"

1. Extract the input cosmos file
2. Partially aggregate at the rack level
3. Partition on "query"
4. Fully aggregate
5. Apply filter on "count"
6. Sort results in parallel
7. Merge results
8. Output as a cosmos file
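The partial-then-full aggregation in the plan above (steps 2 and 4) is worth making concrete: each rack first collapses its own rows to per-key counts, and only those small partial results cross the network to be merged. A Python sketch with invented rack data:

```python
from collections import Counter

def partial_counts(partition):
    """Step 2 analogue: partially aggregate within one rack/partition."""
    return Counter(partition)

def final_counts(partials):
    """Step 4 analogue: merge the partial aggregates into full counts."""
    total = Counter()
    for p in partials:
        total += p
    return total

racks = [["a", "b", "a"], ["a", "c"]]          # rows as seen on two racks
counts = final_counts(partial_counts(r) for r in racks)
```

This works because COUNT is decomposable: counts of counts equal the global count, so the expensive repartition in step 3 moves far less data.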
TPC-H Query 2

// Extract region, nation, supplier, partsupp, part …
RNS_JOIN =
  SELECT s_suppkey, n_name FROM region, nation, supplier
  WHERE r_regionkey == n_regionkey
    AND n_nationkey == s_nationkey;
RNSPS_JOIN =
  SELECT p_partkey, ps_supplycost, ps_suppkey, p_mfgr, n_name
  FROM part, partsupp, rns_join
  WHERE p_partkey == ps_partkey AND s_suppkey == ps_suppkey;
SUBQ =
  SELECT p_partkey AS subq_partkey,
         MIN(ps_supplycost) AS min_cost
  FROM rnsps_join GROUP BY p_partkey;
RESULT =
  SELECT s_acctbal, s_name, p_partkey,
         p_mfgr, s_address, s_phone, s_comment
  FROM rnsps_join AS lo, subq AS sq, supplier AS s
  WHERE lo.p_partkey == sq.subq_partkey
    AND lo.ps_supplycost == min_cost
    AND lo.ps_suppkey == s.s_suppkey
  ORDER BY acctbal DESC, n_name, s_name, partkey;
OUTPUT RESULT TO "tpchQ2.tbl";
Sub-Execution Plan for TPC-H Q2

1. Join on suppkey
2. Partially aggregate at the rack level
3. Partition on group-by column
4. Fully aggregate
5. Partition on partkey
6. Merge corresponding partitions
7. Partition on partkey
8. Merge corresponding partitions
9. Perform join
A Real Example
Current/Future Work
- Language enhancements
- Sharing, data mining, etc.
- Query optimization
- Auto tuning physical storage design
- Materialized view optimization
- Common subexpression exploitation
- Progressive query optimization
- Runtime optimization
- New execution strategies
- Self-adaptive dynamic query plans
Conclusions
- SCOPE: a new scripting language for large-scale analysis
- Strong resemblance to SQL: easy to learn and to port existing applications
- Very extensible
  - Fully benefits from the .NET library
  - Supports built-in C# templates for customized operations
- Highly composable
  - Supports a rich class of physical operators
  - Great reusability with views and user-defined operators
- Improves productivity
  - High-level declarative language
  - Implementation details (including parallelism and system complexity) are transparent to users
- Allows sophisticated optimization
- Good foundation for performance study and improvement