Cassandra Meetup Nice
Jean Armel Luce
Orange France
Thursday, June 19 2014
Summary

Short description of PnS
Why did we choose C* ?

Some key features of C*

After the migration …

Analytics with Hadoop/Hive over Cassandra

Some conclusions about the project PnS3.0
Jean Armel Luce - Orange-France
Cassandra Meetup Nice – June 19 2014
Short description of PnS3
PnS – Short description

PnS means Profiles and Syndication : PnS is a highly available service for collecting and serving live data about Orange customers.

End users of PnS are :
– Orange customers (logged in to the portal www.orange.fr)
– Sellers in Orange shops
– Some services in Orange (advertisements, …)
PnS – The Big Picture

[Diagram] End users send millions of HTTP requests (REST or SOAP) to PnS, a fast and highly available web service to get or set the data stored by PnS, with post-processing steps (postProcessing(data1), postProcessing(data2), postProcessing(data3), …). Data providers feed PnS through scheduled injection of thousands of files (CSV or XML). PnS performs read/write operations (DB queries) on the database.
PnS – Architecture

2-DC architecture (Bagnolet and Sophia Antipolis) for high availability : web services for reads and writes, plus batch updates.

Until 2012, data were stored in 2 different backends :
– MySQL Cluster (for volatile data)
– PostgreSQL "cluster" (sharding and replication)
Timeline – Key dates of PnS 3.0

• Study phase (2010 to 2012) : we did a large study about a few NoSQL databases (Cassandra, MongoDB, Riak, HBase, Hypertable, …) and chose Cassandra as the single backend for PnS
• Design phase (06/2012) : we started the design phase of PnS 3.0
• Proof of concept (09/2012) : we started a 1st (small) Cassandra cluster in production for a non-critical application : 1 table, key-value access
• Production phase (04/2013) : migration of the 1st subset of PnS data from MySQL Cluster to Cassandra in production
• Complete migration (05/2013 to 12/2013) : migration of all other subsets of data from MySQL Cluster and PostgreSQL to Cassandra; new nodes added to the cluster (from 8 to 16 nodes in each DC)
• Analytics (04/2014) : a 3rd datacenter added
PnS – Why did we choose Cassandra ?

Cassandra fits our requirements :
– Very high availability : PnS2 = 99.95% availability, and we want to improve it !
– Low latency : 20 ms < RT of the PnS2 web service < 150 ms, and we want to improve it !
– Scalability : higher load and higher volume in the next years, partly unpredictable, and better scalability brings new business

And also :
– Ease of use : Cassandra is easy to administer and operate
– Some features that I like (rack awareness, consistency level per request, …)
– Cassandra is designed to work naturally in a multi-datacenter architecture
Some key features of C*
What is Cassandra ?
Cassandra is a NoSQL database, developed in Java.

Cassandra was created at Facebook in 2007, was open-sourced and then incubated at Apache, and is nowadays a Top-Level Project.

2 distributions of Cassandra :
– Community edition : http://cassandra.apache.org/ → distributed under the Apache License
– Enterprise distribution : http://www.datastax.com/download → distributed by DataStax
Cassandra : architecture/topology

The main characteristic of Cassandra : all the nodes play the same role.
– No master, no slave, no configuration server → no SPOF
– Rows are sharded among the nodes in the cluster
– Rows are replicated among the nodes in the cluster

The replication strategy of a keyspace defines how/where its rows are replicated (single datacenter, multiple datacenters, …).
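The replica placement described above can be sketched in a few lines of Python (an illustrative, heavily simplified model, not Cassandra's actual code; all names are hypothetical) : in a SimpleStrategy-like scheme, the node owning the row's token gets the first copy, and the next RF-1 distinct nodes walking clockwise around the ring get the others.

```python
from bisect import bisect_right

# Illustrative sketch (simplified, hypothetical names): map a row's token to
# its RF replica nodes by walking clockwise around the token ring.

def replicas(ring, key_token, rf):
    """ring: list of (token, node) pairs sorted by token."""
    tokens = [t for t, _ in ring]
    start = bisect_right(tokens, key_token) % len(ring)
    chosen, i = [], start
    while len(chosen) < rf:
        node = ring[i % len(ring)][1]
        if node not in chosen:        # skip tokens of already-chosen nodes
            chosen.append(node)
        i += 1
    return chosen

# 4 nodes, one token each; a row with token 30 is replicated on 3 nodes
ring = [(0, "n1"), (25, "n2"), (50, "n3"), (75, "n4")]
print(replicas(ring, 30, rf=3))   # ['n3', 'n4', 'n1']
```

NetworkTopologyStrategy refines this idea by running a similar walk per datacenter, which is what lets a keyspace define a replication factor per DC.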
Cassandra : how requests are executed

– The application sends a request to one of the nodes (not always the same; try to balance the load among the nodes). This node is called the coordinator.
– The coordinator routes the query to the datanode(s).
– The datanodes execute the query and return the result to the coordinator.
– The coordinator returns the result to the application.
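The four steps above can be sketched as follows (an illustrative Python model with hypothetical names, not the real driver or server code) : the client picks any node as coordinator, the coordinator fans the query out to the replica datanodes, and returns the result to the application.

```python
import random

# Illustrative sketch of the coordinator flow (hypothetical names).

class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}                      # this node's local storage

    def read(self, key):                    # step 3: the datanode executes
        return self.data.get(key)

def coordinate(replica_nodes, key):
    """Steps 2-4: route the query to the datanodes and answer the client."""
    results = [node.read(key) for node in replica_nodes]
    return results[0]                       # return the result to the app

cluster = [Node(f"n{i}") for i in range(4)]
for node in cluster[:3]:                    # RF = 3: the key lives on 3 replicas
    node.data["user42"] = "Alice"

coordinator = random.choice(cluster)        # step 1: any node can coordinate
print(coordinate(cluster[:3], "user42"))    # -> Alice
```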
Cassandra : what happens when a node crashes ?

Read query, case 1 : a READ query is executed while a datanode is down.
– The coordinator has already received the information (via Gossip) that the node is down and does not send any request to this node.
Cassandra : what happens when a node crashes ?

Read query, case 2 : the coordinator crashes while a READ query is being executed.
– The application receives an error (or a timeout), then re-sends the request to another node, which acts as the new coordinator.
Cassandra : what happens when a node crashes ?

Write query, case 1 : a WRITE query is executed while a datanode is down and there are enough replicas up.
– A "hinted handoff" is stored on the coordinator.
Cassandra : what happens when a node crashes ?

Write query, case 1 (continued) : a WRITE query is executed while a datanode is down.
– The write is executed on all the replicas which are available.
– A hinted handoff is stored on the coordinator, and the query will be replayed when the datanode comes back (within 3 hours).

3 tips for keeping consistency between nodes :
– Hinted handoffs (repair when a node comes back in the ring after a failure)
– Read repairs (automatic repair in the background for 10% of the read queries)
– Anti-entropy repairs (manual repair of all the data)
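The hinted handoff mechanism can be sketched as follows (an illustrative, heavily simplified Python model with hypothetical names) : the coordinator writes to every replica that is up, keeps a hint for the one that is down, and replays the hint when that node rejoins the ring.

```python
# Illustrative sketch of hinted handoff (hypothetical names, not real code).

class Replica:
    def __init__(self, name):
        self.name, self.up, self.data = name, True, {}

def write(hints, replica_nodes, key, value):
    for r in replica_nodes:
        if r.up:
            r.data[key] = value
        else:
            hints.append((r, key, value))   # node down: keep a hint instead

def replay_hints(hints):
    """Run when a node rejoins (Cassandra keeps hints for up to 3 hours)."""
    for r, key, value in hints:
        if r.up:
            r.data[key] = value
    hints.clear()

r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
r3.up = False                               # one datanode crashes
hints = []
write(hints, [r1, r2, r3], "user42", "Alice")
print(r3.data)                              # {} : the write missed r3

r3.up = True                                # the node comes back in the ring
replay_hints(hints)
print(r3.data)                              # {'user42': 'Alice'}
```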
The eventual consistency with Cassandra

We can specify the consistency level for each read/update/insert/delete request.

CLs mostly used, ordered from weak consistency (lower latency, higher availability) to strong consistency (higher latency, lower availability) :
– ANY
– ONE / LOCAL_ONE
– QUORUM / LOCAL_QUORUM
– ALL
– SERIAL

Strong consistency : W + R > RF
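The rule above can be checked with a minimal sketch : with replication factor RF, a write acknowledged by W replicas and a read contacting R replicas must overlap on at least one up-to-date replica whenever W + R > RF.

```python
# Minimal sketch of the strong-consistency rule W + R > RF.

def is_strongly_consistent(w, r, rf):
    """True if every read set must intersect every write set."""
    return w + r > rf

# QUORUM writes + QUORUM reads with RF = 3 : 2 + 2 > 3 -> strong consistency
print(is_strongly_consistent(w=2, r=2, rf=3))   # True
# ONE writes + ONE reads with RF = 3 : 1 + 1 <= 3 -> eventual consistency only
print(is_strongly_consistent(w=1, r=1, rf=3))   # False
```

This is why LOCAL_QUORUM reads combined with LOCAL_QUORUM writes give strong consistency within a datacenter, while LOCAL_ONE trades that guarantee for lower latency.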
Cassandra : sharding with virtual nodes (versions 1.2+)

Virtual nodes are available since C* 1.2.

With virtual nodes, adding a new node (or many nodes) to the cluster is easy :
– data are moved from ALL the old nodes to the new node → few data to move between nodes
– after the move of data, the cluster is still well balanced
– the procedure is totally automated

Adding a new node to the cluster is a normal operation which is done online, without interruption of service.

When adding nodes, replicas are also moved between nodes.
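The rebalancing behaviour described above can be demonstrated with a small consistent-hashing sketch (illustrative only, not Cassandra's real partitioner; names and the vnode count are assumptions) : because each physical node owns many tokens, a new node takes a small slice of data from all existing nodes, and only those keys move.

```python
import hashlib
from bisect import bisect_right

# Illustrative sketch: a consistent-hash ring with virtual nodes.

VNODES = 64   # tokens per physical node (Cassandra 1.2 defaults to 256)

def token(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(nodes):
    return sorted((token(f"{n}-{i}"), n) for n in nodes for i in range(VNODES))

def owner(ring, key):
    tokens = [t for t, _ in ring]
    idx = bisect_right(tokens, token(key)) % len(ring)
    return ring[idx][1]

keys = [f"user{i}" for i in range(10000)]
ring4 = build_ring(["node1", "node2", "node3", "node4"])
ring5 = build_ring(["node1", "node2", "node3", "node4", "node5"])

# Adding a 5th node: only about 1/5 of the keys change owner, and every
# moved key lands on the new node -- the old nodes each give up a little.
moved = [k for k in keys if owner(ring4, k) != owner(ring5, k)]
print(f"{len(moved) / len(keys):.0%} of the keys moved")
print(all(owner(ring5, k) == "node5" for k in moved))
```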
Cassandra : adding a node using vnodes

[Diagram] Example : adding a 5th node to a 4-node cluster; Node 5 takes a few token ranges from each of Node 1 to Node 4.
CQL (Cassandra Query Language)

DDL :
– CREATE KEYSPACE, CREATE TABLE, CREATE INDEX
– ALTER KEYSPACE, ALTER TABLE
– DROP KEYSPACE, DROP TABLE, DROP INDEX

DML :
– SELECT
– INSERT/UPDATE (INSERT is equivalent to UPDATE, which improves performance)
– DELETE (delete a row or delete columns in a row)
CQL (Cassandra Query Language)

Example :
jal@jal-VirtualBox:~/cassandra200beta1/apache-cassandra-2.0.0-beta1-src/bin$ ./cqlsh -u cassandra -p cassandra
Connected to Test Cluster at localhost:9160.
[cqlsh 4.0.0 | Cassandra 2.0.0-beta1-SNAPSHOT | CQL spec 3.1.0 | Thrift protocol 19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE fr
  ... WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> use fr;

Notes : a keyspace is used instead of a database; CREATE KEYSPACE specifies the replication strategy and the replication factor; USE connects to the keyspace.
CQL (Cassandra Query Language)

Example (INSERT is equivalent to UPDATE) :
cqlsh:fr> CREATE TABLE customer (
      ... custid int,
      ... firstname text,
      ... lastname text,
      ... PRIMARY KEY (custid) );
cqlsh:fr>
cqlsh:fr> UPDATE customer SET firstname = 'Bill', lastname = 'Azerty' WHERE custid = 1;
cqlsh:fr> INSERT INTO customer (custid, firstname, lastname) VALUES (2, 'Steve', 'Qwerty');
cqlsh:fr>
CQL (Cassandra Query Language)

Example (SELECT with a WHERE clause on the primary key) :
cqlsh:fr> SELECT firstname, lastname FROM customer WHERE custid = 1;

 firstname | lastname
-----------+----------
      Bill |   Azerty

(1 rows)
cqlsh:fr> SELECT * FROM customer WHERE custid = 2;

 custid | firstname | lastname
--------+-----------+----------
      2 |     Steve |   Qwerty

(1 rows)
CQL (Cassandra Query Language)

Example (SELECT rejected) :
cqlsh:fr> SELECT * FROM customer WHERE lastname = 'Azerty';
Bad Request: No indexed columns present in by-columns clause with Equal operator

No operator other than = is accepted in the WHERE clause (<, >, != are rejected), and the column in the WHERE clause must be indexed : this request requires an index on the column 'lastname'.
After the migration …

The latency

Comparison before/after the migration to Cassandra : some graphs about the latency of the web services are very explicit.

[Graphs] Latency of the push mail and push webxms services, with the dates of the migration to C* marked.
The latency

Read and write latencies are now in microseconds on the datanodes.

This latency will be improved by leveled compaction (tests in progress) :
ALTER TABLE syndic WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : ?? };
The availability

• We got a few hardware failures and network outages
• No impact on QoS :
  • no error returned by the application
  • no real impact on latency
The scalability

• Thanks to vnodes (available since Cassandra 1.2), it is easy to scale out
• With NetworkTopologyStrategy, make sure to distribute the nodes evenly across the racks
Analytics with Hadoop/Hive over Cassandra

Basic architecture of the Cassandra cluster
– Cluster without Hadoop : 2 datacenters, 16 nodes in each DC
– RF (DC1, DC2) = (3, 3)
– Web servers in DC1 send queries to C* nodes in DC1
– Web servers in DC2 send queries to C* nodes in DC2

[Diagram] A pool of web servers in each DC queries the Cassandra nodes of its local DC.
Architecture of the Cassandra cluster with the datacenter for analytics
– Cluster with Hadoop : 3 datacenters, 16 nodes in DC1, 16 nodes in DC2, 4 nodes in DC3
– RF (DC1, DC2, DC3) = (3, 3, 1)
– Because RF = 1 in DC3, we need less storage space in this datacenter
– We favor cheaper disks (SATA) in DC3 rather than SSDs or FusionIO cards
Architecture of the Cassandra cluster with the datacenter for analytics

[Diagram] The pools of web servers in DC1 and DC2 query their local Cassandra datacenters; DC3 is the datacenter for analytics.
Some conclusions about the project PnS3

Conclusions

– With Cassandra, we have improved our QoS :
  – lower response time
  – higher availability
  – more flexibility for the operations teams
– We are able to open our service to new opportunities
– There is a large ecosystem around C* (Hadoop, Hive, Pig, Storm, Shark, Titan, …), which offers more capabilities
– The Cassandra community is very active and helps a lot; there are a lot of resources available : mailing lists, blogs, …
Thank you
for your attention
Questions
A few answers about hardware / OS version / Java version / Cassandra version / Hadoop version

Hardware :
– 16 nodes in DC1 and DC2 at the end of 2013 :
  – 2 CPUs with 6 cores each, Intel® Xeon® 2.00 GHz
  – 64 GB RAM
  – FusionIO 320 GB MLC
– 4 nodes in DC3 :
  – 2 CPUs with 6 cores each, Intel® Xeon® 2.00 GHz
  – 24 GB RAM
  – SATA disks, 15K rpm

OS : Ubuntu Precise (12.04 LTS)
Cassandra version : 1.2.13
Hadoop version : CDH 4.5 (with Hive 0.10) : Hadoop 2 with MRv1
Hive handler : https://github.com/cscetbon/hive/tree/hive-cdh4-auth
Java version : Java 7u45 (CMS GC with the option CMSClassUnloadingEnabled)
A few answers about data and requests

Data :
– Volume : 6 TB at the end of 2013
– elementary types : boolean, integer, string, date
– collection types
– complex types : JSON, XML (between 1 and 20 KB)

Requests :
– 10,000 requests/sec at the end of 2013
– 80% get
– 20% set

Consistency levels used by PnS for online queries and batch updates :
– LOCAL_ONE (95% of the queries)
– LOCAL_QUORUM (5% of the queries)