Milestone 2
Workshop in Information Security –
Distributed Databases Project
By: Yosi Barad, Ainat Chervin and Ilia Oshmiansky
Project web site: http://infosecdd.yolasite.com
Access Control Security vs. Performance
Milestone 2:
Our Plan:
• Examine the Cassandra source code and research different implementation options
• Write some initial code
• Test our implementation using YCSB++ with basic configuration
• Compare the test results to the results of our initial tests
• Try to implement several different solutions to be tested against each other
• Set up a more advanced configuration of Cassandra
• Test the performance of Cassandra with and without the added Cell-level Security
• Evaluate how to further improve the implementation
• Run more advanced set-ups of YCSB++ (custom workloads)
• Create a final version of the implementation based on the test results
Plan Step 1:
Examine the Cassandra source code and research
different implementation options
Steps:
• Fully understand the Cassandra data structure
• Cassandra terminology: a "Cell" is actually called a "Column". It is the smallest increment of data and is represented as a triplet that contains a name, a value and a timestamp.
- Next comes the "Column Family", which is a container for rows and can be thought of as a table in a relational system. Each row in a column family can be referenced by its key.
- Finally, a "Keyspace" is a container for column families. It is roughly the same as a schema or database, that is, a logical collection of tables.
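The three levels described above can be pictured with a minimal Python sketch. The `Column` class, the `get_column` helper and the sample data are illustrative assumptions, not Cassandra's own types or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Smallest unit of data: a (name, value, timestamp) triplet."""
    name: str
    value: str
    timestamp: int


# Nesting assumed for illustration:
# keyspace -> column family -> row key -> column name -> Column
keyspace1 = {
    "Users": {
        "key1": {
            "City": Column("City", "Haifa", 1),
        },
    },
}


def get_column(keyspace, column_family, row_key, column_name):
    """Walk the nesting from the keyspace down to a single column."""
    return keyspace[column_family][row_key][column_name]


city = get_column(keyspace1, "Users", "key1", "City")
print(city.name, city.value, city.timestamp)
```

Each row is reached through its key inside a column family, which is why the access rules later in the slides are written as paths such as Keyspace1.Users.key1.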
• Understand the code flow of get and set operations.
• The security logic is separate from the rest of the code and has two main
interfaces:
1) IAuthenticator – responsible for authenticating the user that logs in.
2) IAuthority – responsible for authorizing the access of an authenticated user to a specific resource.
• It is possible to write our own code that implements these interfaces.
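The split between the two roles can be sketched roughly as follows. This is a Python analogue only: Cassandra's real interfaces are written in Java, and every class name, method signature and credential below is a hypothetical stand-in chosen for illustration.

```python
from abc import ABC, abstractmethod


class Authenticator(ABC):
    """Hypothetical stand-in for the authentication role."""

    @abstractmethod
    def authenticate(self, username: str, password: str) -> str:
        """Return the authenticated user name or raise PermissionError."""


class Authority(ABC):
    """Hypothetical stand-in for the authorization role."""

    @abstractmethod
    def authorize(self, user: str, resource: tuple) -> set:
        """Return the permissions the user holds on the resource."""


class PasswordAuthenticator(Authenticator):
    def __init__(self, passwords: dict):
        self.passwords = passwords

    def authenticate(self, username, password):
        if self.passwords.get(username) != password:
            raise PermissionError(f"login failed for {username}")
        return username


class PermitEveryoneAuthority(Authority):
    def authorize(self, user, resource):
        # Grants everything to everyone, like an "allow all" default.
        return {"read", "write"}


authenticator = PasswordAuthenticator({"yosi": "secret"})
authority = PermitEveryoneAuthority()
user = authenticator.authenticate("yosi", "secret")
permissions = authority.authorize(user, ("Keyspace1", "Users", "key1"))
print(user, sorted(permissions))
```

Because the two roles are independent, the authorization class can be swapped for a stricter one without touching the login logic.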
• The initial code had the following classes:
"AllowAllAuthenticator.java" - implements IAuthenticator.
"AllowAllAuthority.java" - implements IAuthority.
• With some research we found:
"SimpleAuthenticator.java"
"SimpleAuthority.java"
These allow simple user authentication and a basic ACL.
Some information about these classes:
• They use two additional configuration files:
password.properties – a list of users and their passwords.
access.properties – contains a set of permissions.
There are several problems with this implementation:
• It is very inefficient.
• It needs a lot of maintenance.
Despite these issues, we decided to use this code as a starting point, with the full intention of improving it.
Plan Step 2:
Write some initial code
The initial code we wrote:
• Implemented the RowKey access control.
At this point we could limit read/write access to a specific row within a column family:
Keyspace1.Users.key1.<rw>=yosi
• Implemented the Column access control.
This works like the RowKey access control, but one level lower:
Keyspace1.Users.key1.column1<rw>=yosi
• We added a new syntax to the Cassandra client: <column value>:[<user1,user2,...> <permission>] [...].
For example:
Set Users['key1']['City'] = 'Haifa:scott,yosi rw:Andrew ro';
This command does the following:
1. Creates a new column named "City" with the value "Haifa" in the column family "Users".
2. Writes the permissions to the access.properties file in the correct format.
In this example, the following two lines are added to the access.properties file:
Keyspace1.Users.key1.City<rw>=scott,yosi
Keyspace1.Users.key1.City<ro>=Andrew
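The translation from the client syntax to permission-file lines can be sketched as below. This is a minimal illustration, not the project's code: the function names are hypothetical, and it assumes the column value itself never contains a ":" character.

```python
def parse_value_with_acl(raw):
    """Split '<value>:<users> <perm>[:<users> <perm>...]' into (value, acl)."""
    value, *entries = raw.split(":")
    acl = {}
    for entry in entries:
        users, perm = entry.rsplit(" ", 1)
        for user in users.split(","):
            acl[user.strip()] = perm
    return value, acl


def access_lines(keyspace, column_family, row_key, column, acl):
    """Build one permission-file line per permission level."""
    by_perm = {}
    for user, perm in acl.items():
        by_perm.setdefault(perm, []).append(user)
    prefix = f"{keyspace}.{column_family}.{row_key}.{column}"
    return [f"{prefix}<{perm}>={','.join(users)}" for perm, users in by_perm.items()]


value, acl = parse_value_with_acl("Haifa:scott,yosi rw:Andrew ro")
lines = access_lines("Keyspace1", "Users", "key1", "City", acl)
print(value)
for line in lines:
    print(line)
```

With the example command from the slide, the sketch yields the stored value "Haifa" and the same two permission lines listed above.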
Plan Step 3:
Test our implementation using YCSB++ with basic configuration
• We ran one basic test on it to get an idea of where we stand in terms of performance.
• The results were very poor (as expected): an average of under 40 ops/sec, and throughput dropped the more tests we ran, because every new entry meant another line in the file that we need to scan.
• Our next step was to improve the implementation so that it won't rely on configuration files.
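The slowdown follows from how a file-based check has to work. The sketch below assumes the permission lines are scanned one by one for every operation, which is an assumption based on the description above; the function names and the generated data are hypothetical.

```python
def parse_line(line):
    """Split 'Keyspace1.Users.key1.City<rw>=scott,yosi' into its parts."""
    left, users = line.strip().split("=", 1)
    path, perm = left.rstrip(">").split("<", 1)
    return path.rstrip("."), perm, users.split(",")


def check_by_scanning(lines, resource, user, operation):
    """Return (allowed, lines_scanned) for a read ('r') or write ('w')."""
    scanned = 0
    for line in lines:
        scanned += 1
        path, perm, users = parse_line(line)
        if path == resource and user in users:
            # "ro" allows reads only, "rw" allows reads and writes.
            allowed = operation == "r" or perm == "rw"
            return allowed, scanned
    return False, scanned


# One permission line per inserted column: the file grows with every insert.
acl_file = [f"Keyspace1.Users.key{i}.City<rw>=yosi" for i in range(1000)]
allowed, scanned = check_by_scanning(
    acl_file, "Keyspace1.Users.key999.City", "yosi", "w"
)
print(allowed, scanned)
```

Checking the most recently inserted column touches every line in the list, so the cost of each operation grows with the number of entries already written.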
Plan Step 4:
Try to implement several different solutions to be tested
against each other
• Our two implementations were:
1) Writing the permissions to a file.
2) Storing the permissions within the values of the columns in
Cassandra.
Plan Step 5:
Storing the permissions within the values of the columns in
Cassandra.
This stage included the following:
• Parse the value returned from Cassandra.
• Add a new "get" function to grab and separate the ACL from the actual value
of a Column.
For example:
• Yosi wants to run a set command on an existing column.
This command now works as follows:
1) Get the value and check the ACL in the value
2) Perform the validation
3) Perform the actual insert.
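The three steps above can be sketched with an in-memory dictionary standing in for Cassandra. Everything here is hypothetical illustration: the `store` dictionary, the function names, and the rule that a column that does not exist yet may be created by anyone are assumptions, not the project's actual behavior.

```python
store = {}  # (row key, column name) -> raw string "<value>:<users> <perm>..."


def split_value(raw):
    """Separate the actual value from the ACL embedded after it."""
    value, *entries = raw.split(":")
    acl = {}
    for entry in entries:
        users, perm = entry.rsplit(" ", 1)
        acl.update({user.strip(): perm for user in users.split(",")})
    return value, acl


def get_column(user, key):
    """Return the value only if the user appears in the embedded ACL."""
    value, acl = split_value(store[key])
    if user not in acl:
        raise PermissionError(f"{user} may not read {key}")
    return value


def set_column(user, key, raw):
    if key in store:
        _, acl = split_value(store[key])   # 1) get the value and its ACL
        if acl.get(user) != "rw":          # 2) perform the validation
            raise PermissionError(f"{user} may not modify {key}")
    store[key] = raw                       # 3) perform the actual insert


key = ("key1", "Secret")
set_column("Yosi", key, "VerySecretValue:Yosi,Ainat rw:Ilia ro")
print(get_column("Ilia", key))
set_column("Ainat", key, "NewSecretValue:Yosi,Ainat rw:Ilia ro")
print(get_column("Yosi", key))
```

A user listed with "ro" can read the value but is rejected at step 2 when trying to overwrite it, and a user missing from the ACL cannot read it at all.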
Try to implement several different solutions to be tested against each other
Story to be told, version 1:
• Yosi the Strict sets VerySecretValue:Yosi,Ainat rw
• Ainat the Cool can access and modify VerySecretValue.
• Ilia the Curious won't be able to access VerySecretValue.
Try to implement several different solutions to be tested against each other
Story to be told, version 2:
• Yosi the Strict sets VerySecretValue:Yosi,Ainat rw:Ilia ro
• Ainat the Cool can access and modify VerySecretValue.
• Ilia the Curious can access but not modify VerySecretValue.
Demonstration
To further improve performance:
• We removed the "SimpleAuthority" related functions.
• We changed the implementation of the "authorize" function in the SimpleAuthority class to read the ACL from the value stored in the Cassandra DB rather than from the access.properties file.
Plan Step 5:
Run more advanced set-ups of YCSB++ (custom workloads)
• To test our implementation we had to:
- Add ACL entries to the values YCSB sent to Cassandra
- Get YCSB to log in to Cassandra with a user and password
• This included the following steps:
- Compile YCSB (this was a challenge, since we could not find any documentation for this code)
- Edit the YCSB code to connect to Cassandra with our user.
- Change the way YCSB generates values to fit our custom format (<val>:<users> rw:<users> ro).
- Recompile it with a different number of ACL entries for each run of our "increasing ACL" test.
• With these changes we got much better results:
Workload A: Update-heavy workload, a 50/50 mix of reads and writes.
Workload B: Read-mostly workload, a 95/5 read/write mix.
Workload C: Read-only workload, 100% reads.
Workload D: Read-latest workload. New records are inserted, and the most recently inserted records are the most popular.
Workload F: Read-modify-write workload. The client reads a record, modifies it, and writes back the changes.
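The custom value format used for these runs can be sketched as follows. YCSB itself is written in Java, so this Python version only illustrates the format; the function names, the user naming scheme and the payload generator are assumptions made for the example.

```python
import random
import string


def acl_users(count, prefix="user"):
    """Generate hypothetical user names for an ACL of a given size."""
    return [f"{prefix}{i}" for i in range(count)]


def make_value(payload_len, rw_users, ro_users, rng):
    """Build '<val>:<users> rw:<users> ro' with a random payload."""
    payload = "".join(rng.choice(string.ascii_lowercase) for _ in range(payload_len))
    parts = [payload]
    if rw_users:
        parts.append(",".join(rw_users) + " rw")
    if ro_users:
        parts.append(",".join(ro_users) + " ro")
    return ":".join(parts)


rng = random.Random(0)  # fixed seed so repeated runs generate the same values
for entries in (1, 2, 4):
    value = make_value(8, acl_users(entries), acl_users(entries, "reader"), rng)
    print(entries, value)
```

Varying the number of users per value is one way to produce the growing ACL sizes that an "increasing ACL" test needs.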
Plan Step 5:
Compare the test results to the results of our initial tests
• The current results are not satisfying, because they do not match what we expected: we expected the throughput to decrease with each additional entry.
Plan Step 6:
Set up a more advanced configuration of Cassandra
(consisting of several nodes)
• We decided that we would rather wait for a local HD allocation before running several Cassandra nodes, because:
- A shared hard drive would be the bottleneck and would not increase the performance.
- It is very hard to benchmark this remote storage.
- Setting up the clusters is time consuming, and if we get the local HD allocation we might have to spend more time building this configuration again.
• We sent a request to the system admin for local HD allocations so we could install Cassandra and test performance running on a local HD.
Plan Step 7:
Test the performance of the advanced configuration of
Cassandra with and without the added Cell-level Security
Once we finish the more advanced set-up, we will be able to run both Cassandra implementations (with and without the added security) and thus get the desired results.
Plan Step 10:
Create a final version of the implementation
based on the test results
• We would like to further analyze the code and find ways to improve it (see our plans for the next stage).
• Furthermore, we cannot rely on the tests we ran so far, as they do not accurately assess the performance.
Progress Compared to Plan:
Plan Step
• Examine the Cassandra source code and research different implementation options
• Write some initial code
• Test our implementation using YCSB++ with basic configuration
• Compare the test results to the results of our initial tests
• Try to implement several different solutions to be tested against each other
• Set up a more advanced configuration of Cassandra (consisting of several nodes)
• Test the performance of the advanced configuration of Cassandra with and without the added Cell-level Security
• Evaluate how to further improve the implementation
• Run more advanced set-ups of YCSB++ (custom workloads)
• Create a final version of the implementation based on the test results
Status
Overall
We completed the goal of implementing cell-level ACL security, but there is still some work to be done on the performance testing, and the code can perhaps be further improved.
Plans ahead
Expand the Cassandra setup:
• Once we receive the local HD allocation we requested, we can continue with the more advanced testing and create a setup consisting of several nodes/clusters.
Expand the tests, search for limiting factors:
• We plan on expanding our tests in several directions.
Evaluate how to further improve the implementation:
At this point we do not see any major issues in our implementation. However, we will have to run better tests to understand its actual performance penalty, and we may need some guidance to see where we can improve.
Start analyzing security holes due to inconsistencies:
We need to figure out how to measure the inconsistencies using YCSB++ and assess the security threats that might arise from them.
Questions?