PowerPoint Presentation


Milestone 1
Workshop in Information Security –
Distributed Databases Project
By: Yosi Barad, Ainat Chervin and Ilia Oshmiansky
Project web site: http://infosecdd.yolasite.com
Access Control Security vs. Performance
1
Milestone 1:
Our Plan:
Install and run Cassandra
Install and run YCSB++
Initial testing of Cassandra
Run benchmark tests
Install Accumulo
2
Plan Step:
Install and run Cassandra
We have installed Cassandra on the lab computers
3
Plan Step:
Install and run Cassandra
The Cassandra database is configured to run in two
different modes:
1) One cluster consisting of one node, which manages all
the keys and values in the database.
2) One cluster consisting of two nodes, which share the
keys and values; each node manages and stores 50% of the
database.
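As an illustration of the 50/50 split in mode 2, here is a minimal sketch of hash-based key placement on a token ring. It assumes an MD5-based partitioner with evenly spaced tokens; the slides do not say which partitioner or tokens were used, and all function names are hypothetical. It is a simplified model, not Cassandra's actual code.

```python
import hashlib
from bisect import bisect_left

# Assumed ring size for an MD5-based partitioner (not stated in the slides).
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Evenly spaced tokens, one per node (hypothetical helper)."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def key_token(key):
    """Map a key to a position on the ring using its MD5 digest."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner(key, tokens):
    """Index of the first node whose token is >= the key's token, wrapping around."""
    return bisect_left(tokens, key_token(key)) % len(tokens)


def ownership_share(keys, node_count):
    """Fraction of the given keys that lands on each node."""
    tokens = balanced_tokens(node_count)
    counts = [0] * node_count
    for key in keys:
        counts[owner(key, tokens)] += 1
    return [count / len(keys) for count in counts]
```

Calling ownership_share with a large set of keys and node_count=2 should give shares close to one half each; with node_count=1 the single node owns everything.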
4
Plan Step:
Install and run YCSB++
• We have built and installed YCSB++ from its source code
• We used YCSB++ with the supplied "basic" database configuration
in order to test the benchmark framework
5
Plan Step:
Initial testing of Cassandra
• We used the Cassandra client shell to create keyspaces and
column families, to add a column within a family, and to store
and retrieve key names and values.
• Cassandra reports statistics for these manual operations, so
we could get an idea of how much time each operation
takes.
6
Plan Step:
Connect YCSB++ to Cassandra and run benchmark tests
• We used the Cassandra-10 client binding supplied with YCSB++
in order to connect to the Cassandra database.
• We ran some of the core benchmark tests; the results are
detailed later in this document.
7
Plan Step:
Connect YCSB++ to Cassandra and run benchmark tests
• First, we ran the tests from one client PC against a Cassandra server
consisting of a single node.
• Next, we added another Cassandra node and reran the
same tests.
8
Plan Step:
Connect YCSB++ to Cassandra and run benchmark tests
We ran the tests for these reasons:
1. Establish a baseline against which future results (after implementing
cell-level ACL) will be judged.
2. Establish the maximal throughput of Cassandra on a single node.
3. Compare the performance of Cassandra with one node to
Cassandra with two nodes.
9
Plan Step:
Automate the testing procedure
• We created several scripts to automate the tests.
For example, a script that would:
1) Run all the different workloads YCSB++ offers with different
numbers of threads
2) Create an output file with the relevant results
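The two script steps above can be sketched as follows. This is not the team's actual script: the command layout follows the upstream YCSB launcher (bin/ycsb run with -P, -threads, -p) and YCSB++ may differ, the "hosts" property and the thread counts are assumptions, and the parser assumes YCSB's "[SECTION], Metric, value" summary lines.

```python
# Core workload file names shipped with YCSB.
WORKLOADS = ["workloada", "workloadb", "workloadc",
             "workloadd", "workloade", "workloadf"]

# Hypothetical thread counts; the slides only say 8 runs with increasing threads.
THREAD_COUNTS = [1, 2, 4, 8, 16, 32, 64, 128]


def build_commands(db="cassandra-10", hosts="localhost"):
    """Build one benchmark command line per (workload, thread count) pair."""
    commands = []
    for workload in WORKLOADS:
        for threads in THREAD_COUNTS:
            commands.append([
                "bin/ycsb", "run", db,
                "-P", f"workloads/{workload}",
                "-threads", str(threads),
                "-p", f"hosts={hosts}",  # assumed property name
            ])
    return commands


def parse_summary(output):
    """Collect numeric summary lines into {(section, metric): value}."""
    results = {}
    for line in output.splitlines():
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue  # skip log lines that are not summary rows
        try:
            results[(parts[0].strip("[]"), parts[1])] = float(parts[2])
        except ValueError:
            continue  # skip rows whose last field is not a number
    return results
```

Each command list could be passed to subprocess.run with capture_output=True, and the captured text fed to parse_summary before the values are written to a results file.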
10
Plan Step:
Automate the testing procedure
11
Plan Step:
We used the core workloads that are included with the YCSB
installation and ran each of them 8 times,
increasing the number of threads each time.
Workload A: Update heavy workload - mix of 50/50 reads and writes.
12
Plan Step:
Workload B: Read mostly workload -
This workload has a 95/5 read/write mix.
13
Plan Step:
Workload C: Read only - This workload is 100% read.
14
Plan Step:
Workload D: Read latest workload - In this workload, new records are
inserted, and the most recently inserted records are the most
popular.
15
Plan Step:
Workload E: Short ranges - In this workload, short ranges of records
are queried, instead of individual records.
16
Plan Step:
Workload F: Read-modify-write - In this workload, the client will read
a record, modify it, and write back the changes.
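To make the read/write mixes concrete, here is a small sketch that samples operations for workloads A, B and C using the proportions quoted above. The names are hypothetical and the sampler is a simplified stand-in, not the YCSB implementation; workloads D to F are left out because the slides do not state their proportions.

```python
import random

# Operation mixes for the workloads whose proportions appear in the slides.
WORKLOAD_MIX = {
    "A": {"read": 0.50, "write": 0.50},
    "B": {"read": 0.95, "write": 0.05},
    "C": {"read": 1.00},
}


def choose_operations(workload, count, seed=0):
    """Draw `count` operation names according to the workload's mix."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    mix = WORKLOAD_MIX[workload]
    return rng.choices(list(mix), weights=list(mix.values()), k=count)


def read_fraction(operations):
    """Share of operations that are reads."""
    return operations.count("read") / len(operations)
```

For a large sample, read_fraction(choose_operations("B", 10000)) should land near 0.95.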
17
Plan Step:
Connect YCSB++ to Cassandra and run benchmark tests
• We noticed a general degradation in performance in the
two-node Cassandra configuration
• We assume it is due to the synchronization overhead between the
two nodes.
• More work has to be done in order to explain these results (see
the plans ahead).
18
Plan Step:
Install Accumulo
• We have installed, configured and run Apache ZooKeeper and
Apache Hadoop, as they are prerequisites for the Accumulo
database.
• We verified that Accumulo works by performing several basic operations
using the client shell
19
Milestone 1
Progress Compared to Plan:
Plan Step                                              Status
Install and run Cassandra                              Complete
Install and run YCSB++                                 Complete
Run some initial manual testing of Cassandra           Complete
Connect YCSB++ to Cassandra and run benchmark tests    Complete
Install Accumulo                                       Complete
20
Milestone 1
Plans ahead
1. Extend our Accumulo and Cassandra setups to include several
clusters. This stage is critical for getting meaningful test results
and for finding security holes in the later stages.
21
Milestone 1
Plans ahead
2. Improve our testing environment. This stage includes the following:
a) Write our own workloads (with ACL)
b) Run several clients simultaneously
c) Edit the test configurations according to our test plan
(technical details)
22
Milestone 1
Plans ahead
d) Run diverse tests to understand the limiting factor in
each test (it might be the testing equipment, CPU time, disk
I/O, network limitations, synchronization overhead
between nodes, and more) and, if possible, change
the setup to eliminate this limiting factor.
e) Analyze the CPU and disk usage of the machines to
understand the results better.
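One possible way to approach the CPU part of step e) is sketched below. It assumes the machines run Linux and that two snapshots of /proc/stat are taken, one before and one after a benchmark run; the function names are hypothetical, and this is an illustration rather than the tooling the team plans to use.

```python
def parse_cpu_line(stat_text):
    """Return the first eight counters of the aggregate 'cpu' line.

    Order: user, nice, system, idle, iowait, irq, softirq, steal.
    Guest counters are skipped because they are already part of user/nice.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def busy_fraction(before, after):
    """Fraction of CPU time spent non-idle between two snapshots."""
    deltas = [new - old for old, new in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        return 0.0
    # idle and iowait are the 4th and 5th counters (iowait may be absent)
    idle = sum(deltas[3:5])
    return (total - idle) / total
```

On a Linux machine the snapshots could be read with open("/proc/stat").read() immediately before and after a test run.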
23
Milestone 1
Plans ahead
3. Get into the Cassandra code and start the cell-level ACL
implementation. There are two main options:
a) Sending JSON strings as part of the HTTP requests, then
storing them in Cassandra.
24
Milestone 1
Plans ahead
b) Adding simple strings like "(Alice, rx) (Bob, rwxo) (Charlie,
rx) ...", which we can store in Cassandra as is; when Alice
tries to read a file from Cassandra, we will check that the
ACL allows her to do so.
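A minimal sketch of how both ACL representations might be handled on the client side. It assumes each letter in an entry is an independent permission flag and that "r" means read; the slides do not define the letters, and the function names and JSON layout are hypothetical. It does not touch Cassandra itself.

```python
import json
import re

# Matches entries such as "(Alice, rx)"; the format is taken from the slide.
ENTRY_PATTERN = re.compile(r"\(\s*([^,()]+?)\s*,\s*([a-z]+)\s*\)")


def parse_acl(acl_text):
    """Turn "(Alice, rx) (Bob, rwxo)" into {"Alice": {"r", "x"}, ...}."""
    return {user: set(flags) for user, flags in ENTRY_PATTERN.findall(acl_text)}


def is_allowed(acl, user, permission):
    """True if the user is listed and holds the given permission flag."""
    return permission in acl.get(user, set())


def acl_to_json(acl):
    """Serialize the ACL as a JSON string (one possible layout for option a)."""
    return json.dumps(
        {user: "".join(sorted(flags)) for user, flags in sorted(acl.items())}
    )
```

With the example string from the slide, is_allowed(acl, "Alice", "r") returns True, while is_allowed(acl, "Alice", "w") returns False, as does any query for a user who is not listed.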
25
Milestone 1
Overall
• We managed to complete the milestone as planned
• Moreover, we succeeded in extending the system to two
nodes.
This is quite a breakthrough given the difficulties we
experienced with the installations, and it brings us that much
closer to achieving the goal of Milestone 2, which is running a
system consisting of several clusters.
26