www.spatial.cs.umn.edu

SQL & Hadoop for big data
CSCI 4707
G39
Serena Li
Jasmine Joseph
What is Big Data?
○ Velocity - social media updates
○ Volume - huge (whole genome sequencing)
○ Variety - music, pictures, videos, structured and unstructured data
Cannot be handled by traditional databases
What is Hadoop?
● Open-source software framework
-- Hadoop Common
-- Hadoop Distributed File System (HDFS)
-- Hadoop YARN
-- Hadoop MapReduce
● Built to handle Big Data (storage and large-scale processing)
How Hadoop Works
Hadoop Distributed File System (HDFS)
1) High-throughput access to large amounts (terabytes or petabytes) of data
2) Files are stored redundantly across multiple machines to ensure durability under failure and high availability to highly parallel applications
http://vardhan-java2java.blogspot.com/2013/01/hadoop-interview-questions.html
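The redundancy idea in point 2 can be sketched in a few lines of Python. This is a toy model, not HDFS code: the block size, the replication count, the node names, and the round-robin placement rule are all assumptions chosen for illustration, and real HDFS placement is more involved.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024   # bytes per block (assumed for illustration)
REPLICATION = 3                  # copies kept of each block (assumed)


def place_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return {block_index: [node, ...]} using a simple round-robin rule."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = math.ceil(file_size / block_size)
    placement = {}
    for block in range(num_blocks):
        # Pick consecutive nodes, wrapping around, so the replicas of one
        # block always land on distinct machines.
        placement[block] = [
            nodes[(block + r) % len(nodes)] for r in range(replication)
        ]
    return placement


def readable_after_failure(placement, failed_node):
    """True if every block still has at least one replica on a live node."""
    return all(
        any(node != failed_node for node in replicas)
        for replicas in placement.values()
    )


nodes = ["node1", "node2", "node3", "node4"]      # hypothetical machine names
placement = place_blocks(file_size=1024 ** 3, nodes=nodes)  # a 1 GiB file
print(len(placement), "blocks")
print("block 0 lives on", placement[0])
print("still readable if node2 fails:", readable_after_failure(placement, "node2"))
```

Because each block exists on several machines, losing any single machine still leaves a copy of every block, which is the durability and availability property the slide describes.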
How Hadoop Works
MapReduce
1) Splits the problem into smaller chunks
2) Chunks are solved individually
3) Results are consolidated
http://vardhan-java2java.blogspot.com/2013/01/hadoop-interview-questions.html
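The three steps above can be sketched as a word count in plain Python. This is only an in-process illustration of the split / solve / consolidate pattern, not the Hadoop MapReduce API: the function names and the sample lines are made up for this example, and on a real cluster each chunk would be processed on a different machine.

```python
from collections import defaultdict


def split_input(lines, num_chunks):
    """Step 1: split the problem into smaller chunks."""
    return [lines[i::num_chunks] for i in range(num_chunks)]


def map_chunk(chunk):
    """Step 2: solve one chunk on its own, emitting (word, 1) pairs."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def reduce_pairs(mapped_chunks):
    """Step 3: consolidate the per-chunk results into one total per word."""
    totals = defaultdict(int)
    for pairs in mapped_chunks:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)


lines = ["big data", "Hadoop handles big data", "SQL and Hadoop"]
chunks = split_input(lines, num_chunks=2)
# Each chunk is mapped independently; only the reduce step sees all results.
counts = reduce_pairs(map_chunk(chunk) for chunk in chunks)
print(counts)
```

The map step never needs to see another chunk's data, which is what lets the work be spread across many machines.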
SQL vs Hadoop
• SQL: structured data. Hadoop: unstructured data.
• Batch processing: Hadoop is good for tasks that run for hours, not for small batches.
• Cost for Hadoop: virtual machines, different operating systems, shared hardware.
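The structured versus unstructured contrast can be illustrated with a small Python sketch. It uses the standard-library sqlite3 module as a stand-in for a SQL database and a plain loop as a stand-in for a Hadoop-style job; the table name, column names, and log lines are invented for this example.

```python
import sqlite3

# Structured data: the schema is declared up front and SQL queries it directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plays (listener TEXT, song TEXT)")
conn.executemany(
    "INSERT INTO plays VALUES (?, ?)",
    [("ann", "a"), ("bob", "a"), ("ann", "b")],
)
sql_counts = dict(
    conn.execute("SELECT listener, COUNT(*) FROM plays GROUP BY listener")
)
conn.close()

# Unstructured data: raw text with no schema. Structure is imposed while
# reading, and lines that do not fit the expected shape are skipped.
raw_log = ["ann played a", "bob played a", "garbled line", "ann played b"]
text_counts = {}
for line in raw_log:
    parts = line.split(" played ")
    if len(parts) != 2:
        continue
    listener = parts[0]
    text_counts[listener] = text_counts.get(listener, 0) + 1

print(sql_counts)
print(text_counts)
```

Both halves produce the same per-listener totals; the difference is where the structure comes from, the table definition in one case and the parsing code in the other.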
SQL and Hadoop
-- Hadoop Common
-- Hadoop HDFS
-- Hadoop YARN
-- Hadoop MapReduce
Big data trends in 2014: SQL
● SQL Holds Biggest Promise for Big Data
● Yet SQL Poses Challenges (delay)
Big data trends in 2014: Hadoop
• Authentication, Data Errors
• Operational Hadoop
• Data Hubs
• Data-Centric Apps
• Focus: Unstructured Query Language
References
Apache Hadoop. In Wikipedia. Retrieved April 26, 2014, from http://en.wikipedia.org/wiki/Apache_Hadoop
Comparing SQL databases and Hadoop. HadoopIntroduction. Retrieved April 28, 2014, from https://sites.google.com/site/hadoopintroduction/home/comparing-sql-databases-and-hadoop
Preimesberger, C. (2014, January 6). Enterprise Big Data Analytics: 10 Prominent Trends to Look for in 2014. eWEEK. Retrieved April 26, 2014, from http://www.eweek.com/enterpriseapps/slideshows/enterprise-big-data-analytics-10-prominent-trends-to-look-for-in-2014.html
Vardhan. (2013, January 1). Hadoop interview questions. Retrieved April 26, 2014, from http://vardhan-java2java.blogspot.com/2013/01/hadoop-interview-questions.html