Performance
All tests were performed on a 2.4 GHz Intel Core 2 Duo with 4 GB of 1067 MHz DDR3 memory. The utility used can be found here.
The JVM running the server was not restarted between runs or between test types (Thrift vs. Smile).
The test does not exercise the Hadoop code: the collector simply processes the events and buffers them locally. Maximum in-memory queue size is 200,000.
We simulated 5 Scribe clients, sending buckets of 60 messages:
Number of buckets sent per Scribe | Total number of messages sent | Time (run 1) | Time (run 2) |
---|---|---|---|
2000 | 600,000 | 0:34 | 0:34 |
2500 | 750,000 | 0:36 | 0:39 |
3000 | 900,000 | 0:48 | 0:47 |
3500 | 1,050,000 | 0:57 | 0:57 |
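The totals in the second column follow directly from the setup described above (5 clients, 60 messages per bucket). A minimal sketch of that arithmetic; the function and parameter names are illustrative and not part of the test utility:

```python
def total_messages(buckets_per_client: int,
                   clients: int = 5,
                   messages_per_bucket: int = 60) -> int:
    """Total messages sent when every client sends the same number of buckets."""
    return clients * buckets_per_client * messages_per_bucket


# Reproduce the "Total number of messages sent" column.
for buckets in (2000, 2500, 3000, 3500):
    print(buckets, total_messages(buckets))
```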
The collector fell over (queue full) when we tried to run all four tests consecutively. For the 3000 and 3500 tests, we had to wait for the queue to flush between the two.
At 4000 buckets per client, the collectors reject events (queue full).
In this configuration, the capacity of the collectors is roughly 17,500 Thrift events per second.
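The "queue full" rejections above can be pictured with a bounded in-memory queue that refuses new events once it reaches its limit. The sketch below is only an illustration of that behavior using Python's standard `queue` module; it is not the collector's actual code, the helper names are hypothetical, and nothing drains the queue here, whereas the real collector flushes it continuously.

```python
import queue

MAX_QUEUE_SIZE = 200_000  # maximum in-memory queue size quoted in the text


def offer(buffer: queue.Queue, event) -> bool:
    """Try to enqueue without blocking; return False when the queue is full."""
    try:
        buffer.put_nowait(event)
        return True
    except queue.Full:
        return False


def simulate(burst_size: int, capacity: int = MAX_QUEUE_SIZE):
    """Send a burst into an undrained queue; return (accepted, rejected)."""
    buffer = queue.Queue(maxsize=capacity)
    accepted = sum(1 for i in range(burst_size) if offer(buffer, i))
    return accepted, burst_size - accepted


# Small demonstration: a burst of 10 events into a queue of capacity 4.
print(simulate(10, capacity=4))  # (4, 6)
```

In a real run the outcome depends on how fast the queue drains relative to the incoming rate, which is why waiting for a flush between runs made the larger tests pass.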
Similar tests were performed with Smile on the same collector JVM. Note that the collector easily handled the four tests consecutively, but we waited for the queue to flush between runs 1 and 2:
Number of buckets sent per Scribe | Total number of messages sent | Time (run 1) | Time (run 2) |
---|---|---|---|
2000 | 600,000 | 0:25 | 0:25 |
2500 | 750,000 | 0:32 | 0:32 |
3000 | 900,000 | 0:39 | 0:37 |
3500 | 1,050,000 | 0:45 | 0:48 |
5000 | 1,500,000 | 1:06 | 1:05 |
20000 | 6,000,000 | 4:23 | |
At 20,000 buckets, the collector's queue was almost full. Hence, in this configuration, capacity is roughly 22,800 Smile events per second.
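The capacity figure can be checked from the 20000-bucket row: 6,000,000 events in 4:23 (263 seconds). A small sketch of that calculation, assuming the times in the tables are in `minutes:seconds` format; the helper names are illustrative:

```python
def parse_mss(text: str) -> int:
    """Convert an 'm:ss' duration such as '4:23' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def events_per_second(total_events: int, elapsed: str) -> float:
    """Average throughput over a run, in events per second."""
    return total_events / parse_mss(elapsed)


# 20000 buckets per client, run 1: 6,000,000 events in 4:23.
print(round(events_per_second(6_000_000, "4:23")))  # 22814
```

The same helper can be applied to any other row of either table to compare runs.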
Notes:
- A nice side effect of Smile is that we can drop the Base64 encoding/decoding, which reduces CPU usage from 80% to 50% or lower.
- The String payload seems larger with Smile, though (74 bytes for Smile, 68 for Base64-encoded Thrift).
- One reason Smile performs better is that we never fully deserialize the payload.
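To put the Base64 figure from the notes in context: Base64 turns every 3 raw bytes into 4 characters, so a 68-character encoded payload corresponds to 49 to 51 raw bytes. The sketch below uses Python's standard `base64` module with a placeholder payload of zero bytes; the actual size of the raw Thrift payload is not stated in these notes, so 51 is only an illustrative value.

```python
import base64


def base64_length(raw_bytes: int) -> int:
    """Length of the padded Base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


# Placeholder payload (all zero bytes), NOT a real Thrift event.
raw = bytes(51)
encoded = base64.b64encode(raw)

print(len(raw), len(encoded))  # 51 68

# Decoding restores the original bytes; this encode/decode round trip is
# the work that is skipped once Base64 is dropped.
assert base64.b64decode(encoded) == raw
```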
For 600,000 events sent:
Serialization | Acceptance time TP99 (millis) | Extraction time TP99 (millis) | Write time TP99 (millis) |
---|---|---|---|
Thrift | 0.074 | 0.026 | 0.257 |
Smile | 0.083 | 0.027 | 0.193 |
Acceptance time is the total time spent in memory processing (deserializing, …) the events. Extraction time is the time spent deserializing the original payload into an Event. Write time is the time spent writing events to disk.
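TP99 in the table above is the 99th percentile: 99% of the measured operations completed within that time. The notes do not say how the collector computes it, so the following is just one common definition (nearest rank), sketched with a hypothetical helper:

```python
def tp99(samples):
    """99th percentile by the nearest-rank method (an assumed definition)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Smallest rank r such that at least 99% of samples are <= ordered[r - 1];
    # integer arithmetic for ceil(0.99 * n) avoids floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# With the latencies 1..100 (in any unit), the TP99 is 99.
print(tp99(list(range(1, 101))))  # 99
```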
The Smile payload is much more compact on disk.