Measuring Performance
About this Guide
This guide shows how to measure the performance of Cassandra running on Portworx volumes. We use Docker directly on EC2 instances. You may choose a different way of starting Cassandra and creating Portworx volumes depending on your orchestration environment (Kubernetes, Mesosphere, or Swarm). Running Cassandra in Docker containers is one of the most common uses of Portworx.
Testing Cassandra on Portworx
Below are instructions to test and verify Cassandra's performance with Portworx volumes in a Docker environment without a scheduler. We will create three Cassandra Docker containers on three machines, and each Cassandra container will expose its ports. The test is conducted in AWS on three r4.2xlarge instances, each with 60GB of RAM and a 128GB disk for the Portworx cluster.
Setup for the test
On each of the AWS instances, launch the Portworx container, specifying the etcd IP (e.g., 172.31.45.219) and the disk volume (e.g., /dev/xvdb):
docker run --restart=always --name px -d --net=host \
--privileged=true \
-v /run/docker/plugins:/run/docker/plugins \
-v /var/lib/osd:/var/lib/osd:shared \
-v /dev:/dev \
-v /etc/pwx:/etc/pwx \
-v /opt/pwx/bin:/export_bin:shared \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /var/cores:/var/cores \
-v /usr/src:/usr/src \
--ipc=host \
portworx/px-enterprise -daemon -k etcd://172.31.45.219:4001 -c PXCassTest001 -s /dev/xvdb
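Before creating volumes, it is worth confirming that all three nodes joined the cluster. A quick check, using the same pxctl path as the rest of this guide:
/opt/pwx/bin/pxctl status
The output should show the cluster status and list all three nodes.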
Once the Portworx cluster is created, create three Portworx volumes of size 60GB, each local to its node. On each of the AWS instances, run the following pxctl command:
/opt/pwx/bin/pxctl volume create CVOL-`hostname` --size 60 --nodes LocalNode
Verify the new Portworx volumes:
/opt/pwx/bin/pxctl volume list
ID NAME SIZE HA SHARED ENCRYPTED IO_PRIORITY SCALE STATUS
999260470129557090 CVOL-ip-172-31-32-188 60 GiB 1 no no LOW 1 up - detached
973892635505817385 CVOL-ip-172-31-45-219 60 GiB 1 no no LOW 1 up - detached
446982770798983273 CVOL-ip-172-31-47-121 60 GiB 1 no no LOW 1 up - detached
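You can also inspect an individual volume to confirm it was provisioned on the local node, for example:
/opt/pwx/bin/pxctl volume inspect CVOL-`hostname`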
Create the Cassandra containers using the new Portworx volumes. In our test case, we have three AWS instances. Set IP address variables for the three nodes on each instance:
NODE_1_IP=172.31.32.188
NODE_2_IP=172.31.45.219
NODE_3_IP=172.31.47.121
Download the cassandra_conf.tar file, which includes a cassandra.yaml file that uses TCP port 17000 instead of 7000. A custom cassandra.yaml is needed because, as of this writing, Portworx occupies port 7000, which is also Cassandra's default storage port. Use this custom Cassandra configuration to avoid the conflict. Extract the cassandra_conf.tar file into the /etc folder. This step is not required if you are running with Kubernetes or Mesosphere.
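A sketch of the extract-and-verify step, assuming the archive unpacks a cassandra/ directory so the files land under /etc/cassandra (the path bind-mounted into the containers below):
tar -xf cassandra_conf.tar -C /etc
grep storage_port /etc/cassandra/cassandra.yaml
If the custom configuration is in place, the grep should show storage_port set to 17000.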
On each AWS instance, use docker run to launch the latest Cassandra Docker container.
On Node 1:
docker run --name cass-`hostname` -e CASSANDRA_BROADCAST_ADDRESS=`hostname -i` \
-p 17000:17000 -p 7001:7001 -p 9042:9042 -p 9160:9160 -p 7199:7199 \
-v /etc/cassandra:/etc/cassandra \
-v CVOL-`hostname`:/var/lib/cassandra \
-d cassandra:latest
On Node 2:
docker run --name cass-`hostname` -e CASSANDRA_BROADCAST_ADDRESS=`hostname -i` \
-e CASSANDRA_SEEDS=${NODE_1_IP} \
-p 17000:17000 -p 7001:7001 -p 9042:9042 -p 9160:9160 -p 7199:7199 \
-v /etc/cassandra:/etc/cassandra \
-v CVOL-`hostname`:/var/lib/cassandra \
-d cassandra:latest
On Node 3:
docker run --name cass-`hostname` -e CASSANDRA_BROADCAST_ADDRESS=`hostname -i` \
-e CASSANDRA_SEEDS=${NODE_1_IP},${NODE_2_IP} \
-p 17000:17000 -p 7001:7001 -p 9042:9042 -p 9160:9160 -p 7199:7199 \
-v /etc/cassandra:/etc/cassandra \
-v CVOL-`hostname`:/var/lib/cassandra \
-d cassandra:latest
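Before checking the ring, you can quickly confirm the container is up on each instance:
docker ps --filter name=cass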
After all three Cassandra containers have started, verify the cluster status from one of the nodes by running nodetool status:
$ docker exec -it cass-ip-172-31-32-188 sh -c 'nodetool status'
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 172.31.32.188 108.65 KiB 256 65.7% 7aa8a83e-1378-4aa1-b9d2-3008b3550b69 rack1
UN 172.31.45.219 84.33 KiB 256 66.0% b455c82e-9649-4724-adf1-dae09ec2c616 rack1
UN 172.31.47.121 108.29 KiB 256 68.3% 26ffac02-2975-4921-b5d0-54f3274bfe84 rack1
Running the test
Run a Cassandra stress write test with 10K inserts into the target keyspace TestKEYSPACE01 and 4 threads (-rate threads=4). On each node, start the Cassandra stress test at about the same time. Below, each node's Cassandra container inserts objects into the same keyspace but over a different sequence range (e.g., 1..10000).
On Node 1
docker exec -it cass-`hostname` cassandra-stress write n=10000 \
cl=quorum -mode native cql3 -rate threads=4 -schema keyspace="TestKEYSPACE01" \
"replication(factor=2)" -pop seq=1..10000 -log file=~/Test_10Kwrite_001.log \
-node ${NODE_1_IP},${NODE_2_IP},${NODE_3_IP}
On Node 2
docker exec -it cass-`hostname` cassandra-stress write n=10000 \
cl=quorum -mode native cql3 -rate threads=4 -schema keyspace="TestKEYSPACE01" \
"replication(factor=2)" -pop seq=10001..20000 -log file=~/Test_10Kwrite_002.log \
-node ${NODE_1_IP},${NODE_2_IP},${NODE_3_IP}
On Node 3
docker exec -it cass-`hostname` cassandra-stress write n=10000 \
cl=quorum -mode native cql3 -rate threads=4 -schema keyspace="TestKEYSPACE01" \
"replication(factor=2)" -pop seq=20001..30000 -log file=~/Test_10Kwrite_003.log \
-node ${NODE_1_IP},${NODE_2_IP},${NODE_3_IP}
The output on each node should be similar to the following:
docker exec -it cass-`hostname` cassandra-stress write n=10000 cl=quorum -mode native cql3 -rate threads=4 -schema keyspace="TestKEYSPACE" "replication(factor=2)" -pop seq=1..10000 -log file=~/Test_10Kwrite_001.log -node ${NODE_1_IP},${NODE_2_IP},${NODE_3_IP}
******************** Stress Settings ********************
Command:
Type: write
Count: 10,000
No Warmup: false
Consistency Level: QUORUM
Target Uncertainty: not applicable
Key Size (bytes): 10
Counter Increment Distibution: add=fixed(1)
Rate:
Auto: false
Thread Count: 4
OpsPer Sec: 0
Population:
Sequence: 1..10000
Order: ARBITRARY
Wrap: true
Insert:
Revisits: Uniform: min=1,max=1000000
Visits: Fixed: key=1
Row Population Ratio: Ratio: divisor=1.000000;delegate=Fixed: key=1
Batch Type: not batching
Columns:
Max Columns Per Key: 5
Column Names: [C0, C1, C2, C3, C4]
Comparator: AsciiType
Timestamp: null
Variable Column Count: false
Slice: false
Size Distribution: Fixed: key=34
Count Distribution: Fixed: key=5
Errors:
Ignore: false
Tries: 10
Log:
No Summary: false
No Settings: false
File: /root/Test_10Kwrite_001.log
Interval Millis: 1000
Level: NORMAL
Mode:
API: JAVA_DRIVER_NATIVE
Connection Style: CQL_PREPARED
CQL Version: CQL3
Protocol Version: V4
Username: null
Password: null
Auth Provide Class: null
Max Pending Per Connection: 128
Connections Per Host: 8
Compression: NONE
Node:
Nodes: [172.31.32.188, 172.31.45.219, 172.31.47.121]
Is White List: false
Datacenter: null
Schema:
Keyspace: TestKEYSPACE
Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
Replication Strategy Pptions: {replication_factor=2}
Table Compression: null
Table Compaction Strategy: null
Table Compaction Strategy Options: {}
Transport:
factory=org.apache.cassandra.thrift.TFramedTransportFactory; truststore=null; truststore-password=null; keystore=null; keystore-password=null; ssl-protocol=TLS; ssl-alg=SunX509; store-type=JKS; ssl-ciphers=TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA;
Port:
Native Port: 9042
Thrift Port: 9160
JMX Port: 7199
Send To Daemon:
*not set*
Graph:
File: null
Revision: unknown
Title: null
Operation: WRITE
TokenRange:
Wrap: false
Split Factor: 1
Connected to cluster: Test Cluster, max pending requests per connection 128, max connections per host 8
Datatacenter: datacenter1; Host: /172.31.32.188; Rack: rack1
Datatacenter: datacenter1; Host: /172.31.45.219; Rack: rack1
Datatacenter: datacenter1; Host: /172.31.47.121; Rack: rack1
Created keyspaces. Sleeping 3s for propagation.
Sleeping 2s...
Warming up WRITE with 7500 iterations...
Failed to connect over JMX; not collecting these stats
Running WRITE with 4 threads for 10000 iteration
Failed to connect over JMX; not collecting these stats
type total ops, op/s, pk/s, row/s, mean, med, .95, .99, .999, max, time, stderr, errors, gc: #, max ms, sum ms, sdv ms, mb
total, 1303, 1303, 1303, 1303, 1.8, 1.2, 4.9, 11.0, 17.0, 21.7, 1.0, 0.00000, 0, 0, 0, 0, 0, 0
total, 4324, 3021, 3021, 3021, 1.3, 1.1, 2.3, 6.1, 12.8, 14.9, 2.0, 0.27888, 0, 0, 0, 0, 0, 0
total, 7819, 3495, 3495, 3495, 1.1, 1.1, 1.3, 2.0, 11.0, 16.7, 3.0, 0.20770, 0, 0, 0, 0, 0, 0
total, 10000, 2692, 2692, 2692, 1.5, 1.1, 3.3, 9.2, 12.1, 13.5, 3.8, 0.15468, 0, 0, 0, 0, 0, 0
Results:
Op rate : 2,624 op/s [WRITE: 2,624 op/s]
Partition rate : 2,624 pk/s [WRITE: 2,624 pk/s]
Row rate : 2,624 row/s [WRITE: 2,624 row/s]
Latency mean : 1.3 ms [WRITE: 1.3 ms]
Latency median : 1.1 ms [WRITE: 1.1 ms]
Latency 95th percentile : 2.2 ms [WRITE: 2.2 ms]
Latency 99th percentile : 7.5 ms [WRITE: 7.5 ms]
Latency 99.9th percentile : 12.9 ms [WRITE: 12.9 ms]
Latency max : 21.7 ms [WRITE: 21.7 ms]
Total partitions : 10,000 [WRITE: 10,000]
Total errors : 0 [WRITE: 0]
Total GC count : 0
Total GC memory : 0.000 KiB
Total GC time : 0.0 seconds
Avg GC time : NaN ms
StdDev GC time : 0.0 ms
Total operation time : 00:00:03
END
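To compare runs across nodes, you can pull the headline numbers out of each node's log file. A minimal example, run inside the container and assuming the Results summary is written to the log file as well as stdout (the path matches the File: setting shown above):
docker exec -it cass-`hostname` grep "Op rate" /root/Test_10Kwrite_001.log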
If the above Cassandra test completes without any issues, the number of inserted objects and threads can be adjusted to produce more representative results.
Below is an example that inserts 10 million objects into the target keyspace with threads>=72. When using threads>=72, Cassandra stress runs several cycles, stepping the thread count through 72, 108, 162, 243, 364, 546, and 819.
docker exec -it cass-`hostname` cassandra-stress write n=10000000 \
cl=quorum -mode native cql3 -rate threads\>=72 -schema keyspace="TestKEYSPACE01" \
"replication(factor=2)" -pop seq=1..10000000 -log file=~/Test_10Mwrite_001.log \
-node ${NODE_1_IP},${NODE_2_IP},${NODE_3_IP}
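A 10M-object run takes much longer than the 10K smoke test, so it can be useful to keep an eye on the Portworx volumes while it runs. One simple option, assuming the watch utility is installed on the host:
watch /opt/pwx/bin/pxctl volume list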
After a write test, you can run a mixed (write/read) test; however, a write test must be completed before any mixed test so that there is data to read:
docker exec -it cass-`hostname` cassandra-stress mixed n=10000000 \
cl=quorum -mode native cql3 -rate threads\>=72 -schema keyspace="TestKEYSPACE01" \
"replication(factor=2)" -pop seq=1..10000000 -log file=~/Test_10Mmixed_001.log \
-node ${NODE_1_IP},${NODE_2_IP},${NODE_3_IP}
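cassandra-stress also accepts an explicit read/write ratio for the mixed workload. A variant that weights reads 3:1 over writes; the log file name here is just an illustrative placeholder:
docker exec -it cass-`hostname` cassandra-stress mixed ratio\(write=1,read=3\) n=10000000 \
cl=quorum -mode native cql3 -rate threads\>=72 -schema keyspace="TestKEYSPACE01" \
"replication(factor=2)" -pop seq=1..10000000 -log file=~/Test_10Mmixed_ratio.log \
-node ${NODE_1_IP},${NODE_2_IP},${NODE_3_IP}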
In general, the Cassandra stress test should be run on every Cassandra container at about the same time to increase the load. When using the same keyspace in the same test run, each container must use a different sequence range (e.g., 1..10000, 10001..20000, and so on) to keep the containers' operations on the same keyspace separate, as in the sketch below.
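One way to kick off the runs at roughly the same time is to background them from a single host over SSH. A minimal sketch, assuming passwordless SSH between the instances and the NODE_*_IP variables set as above; adjust the counts and ranges to your test:
# Start the write test on all three nodes at once, one sequence range per node.
seq_start=1
for ip in ${NODE_1_IP} ${NODE_2_IP} ${NODE_3_IP}; do
  seq_end=$((seq_start + 9999))
  ssh $ip "docker exec cass-\$(hostname) cassandra-stress write n=10000 \
    cl=quorum -mode native cql3 -rate threads=4 -schema keyspace=TestKEYSPACE01 \
    'replication(factor=2)' -pop seq=${seq_start}..${seq_end} \
    -node ${NODE_1_IP},${NODE_2_IP},${NODE_3_IP}" &
  seq_start=$((seq_end + 1))
done
wait   # block until all three stress runs finish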
Discussion Forum
If you have more questions about this application, please head over to our discussion forum and ask them there.