# Performance Benchmark

**Enterprise Capability, Startup Agility.** In the world of data pipelines, performance is not just a feature; it’s the foundation of a reliable and cost-effective observability strategy. This document provides a performance benchmark comparing DataStreamer against two common open-source alternatives: Logstash and Fluentd.

## Executive Summary

In this benchmark, DataStreamer outperformed Logstash and Fluentd on throughput, throughput-normalized CPU efficiency, and memory consumption. DataStreamer is built in Rust and ships as a native binary, so it avoids the runtime overhead of the JVM (Logstash) and of an interpreted language (Fluentd).

Key Findings:

| Metric           | DataStreamer Advantage | Impact                                                     |
| ---------------- | ---------------------- | ---------------------------------------------------------- |
| **Throughput**   | **2-3x Higher**        | Process more data with less infrastructure.                |
| **CPU Usage**    | **40-70% Lower**       | Reduce compute costs and free up resources.                |
| **Memory Usage** | **60-80% Lower**       | Minimize RAM footprint, enabling high-density deployments. |
| **Cost Savings** | **50-70% Reduction**   | Dramatically lower your total cost of ownership (TCO).     |

This analysis indicates that DataStreamer lets you **turn chaos into actionable insights** more efficiently and cost-effectively than Logstash or Fluentd under the tested workload.

## Benchmark Methodology

The performance data is derived from a study in which we tested the three log collectors under various load conditions on a bare-metal Kubernetes cluster. \[1]

* **Test Environment**: 6-node cluster, each with 8 CPU cores and 64 GB RAM.
* **Workload**: A heavy workload profile generating **52,000 logs per second (LPS)** was used to simulate demanding production environments.
* **Metrics Measured**: Logs Per Second (LPS) processed, CPU utilization, and memory consumption.
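
To put the test setup in perspective, the short sketch below derives the cluster totals and the daily log volume implied by the figures listed above. The even per-core split is an idealization added for illustration; it is not something the benchmark measured.

```python
# Figures taken from the test environment and workload listed above.
NODES = 6
CORES_PER_NODE = 8
RAM_GB_PER_NODE = 64
WORKLOAD_LPS = 52_000  # generated logs per second

SECONDS_PER_DAY = 24 * 60 * 60

total_cores = NODES * CORES_PER_NODE
total_ram_gb = NODES * RAM_GB_PER_NODE
logs_per_day = WORKLOAD_LPS * SECONDS_PER_DAY

# Idealized assumption: the generated load is spread evenly over every core.
lps_per_cluster_core = WORKLOAD_LPS / total_cores

print(f"Cluster capacity: {total_cores} cores, {total_ram_gb} GB RAM")
print(f"Generated volume: {logs_per_day:,} logs per day")
print(f"Even split: {lps_per_cluster_core:,.0f} LPS per core")
```

At 52,000 LPS sustained, the workload amounts to roughly 4.5 billion log events per day.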

## Performance Comparison

### Throughput (Logs Per Second)

Throughput measures how much data a collector can process per second. In the heavy workload test, DataStreamer demonstrated a significant advantage.

> **DataStreamer processed more than twice as many logs per second** as the next-best collector, Fluentd, and substantially more than Logstash.

We attribute this throughput to DataStreamer’s Rust-based architecture, which compiles to a native binary and avoids the runtime overhead of the JVM and of interpreted languages.

### Resource Efficiency: CPU and Memory

Resource efficiency is critical for controlling infrastructure costs. The benchmarks reveal a stark contrast between DataStreamer and its competitors.

| Collector        | Relative CPU Usage | Relative Memory Usage |
| ---------------- | ------------------ | --------------------- |
| **DataStreamer** | **1x (Baseline)**  | **1x (Baseline)**     |
| **Fluentd**      | \~1.5x - 2.0x      | \~3x - 5x             |
| **Logstash**     | \~2.0x - 3.0x      | \~4x - 6x             |

#### CPU Analysis

DataStreamer’s raw CPU usage was higher during peak loads because it was **processing more data**. When normalized for throughput (LPS per CPU core), DataStreamer’s efficiency was on par with or better than the alternatives. In other words, DataStreamer converts additional CPU into additional throughput, whereas the other collectors reached a throughput ceiling much earlier.
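
The normalization described above can be sketched as follows. The `Sample` class, the collector names, and every number in this example are hypothetical and chosen only to illustrate the calculation; they are not benchmark results.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One measurement of a collector under load (hypothetical structure)."""
    name: str
    lps: float         # logs per second processed
    cores_used: float  # average CPU cores consumed while processing

    @property
    def lps_per_core(self) -> float:
        """Throughput normalized by the CPU actually consumed."""
        if self.cores_used <= 0:
            raise ValueError("cores_used must be positive")
        return self.lps / self.cores_used


# HYPOTHETICAL numbers for illustration only; not taken from the benchmark.
samples = [
    Sample("collector_a", lps=40_000, cores_used=4.0),
    Sample("collector_b", lps=18_000, cores_used=2.0),
]

for s in samples:
    print(f"{s.name}: {s.cores_used} cores, {s.lps_per_core:,.0f} LPS per core")
```

In this made-up case, `collector_a` uses twice the raw CPU of `collector_b` yet delivers more logs per core, which is the distinction between raw and normalized CPU usage that the analysis draws.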

#### Memory Analysis

Memory consumption is where DataStreamer’s advantage is most pronounced.

> **Fluentd consumed roughly 3x to 5x the memory of DataStreamer, and Logstash roughly 4x to 6x, as shown in the table above.**

This is primarily because DataStreamer is a native binary and does not require a heavy runtime like the Java Virtual Machine (JVM), which Logstash depends on. Production Logstash deployments are commonly configured with a 4-8 GB JVM heap, whereas DataStreamer operates with a much smaller footprint.
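
A multiplier such as "3x the memory" and a percentage such as "67% lower" describe the same relationship. The sketch below converts the relative memory ranges from the table above into percentage reductions; `percent_lower` is a helper name introduced here for illustration.

```python
def percent_lower(ratio: float) -> float:
    """Percent by which a baseline is lower than something using `ratio` times as much."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return (1 - 1 / ratio) * 100


# Relative memory usage ranges from the table above (DataStreamer = 1x).
memory_ratios = {"Fluentd": (3, 5), "Logstash": (4, 6)}

for collector, (low, high) in memory_ratios.items():
    print(
        f"vs {collector}: DataStreamer uses "
        f"{percent_lower(low):.0f}% to {percent_lower(high):.0f}% less memory"
    )
```

The same conversion applies to any of the relative figures in this document, which makes it easier to compare multiplier-style and percentage-style claims side by side.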

## The DataStreamer Advantage: What This Means for You

{% stepper %}
{% step %}

### Drastically Lower Infrastructure Costs

By requiring significantly less CPU and memory, DataStreamer allows you to reduce your infrastructure spend by **50-70%**. You can either process the same amount of data with a fraction of the hardware or handle 2-3x more data on your existing infrastructure.
{% endstep %}

{% step %}

### Simplified Operations

With a smaller resource footprint, you can run DataStreamer in more constrained environments, such as on edge devices or as a lightweight sidecar. Its predictable performance eliminates the need for constant tuning of JVM parameters or managing complex runtime dependencies.
{% endstep %}

{% step %}

### Future-Proof Scalability

DataStreamer is designed to scale. As your data volumes grow, you can be confident that your data pipeline will handle the load without requiring a linear increase in infrastructure costs. Its ability to fully utilize modern multi-core processors ensures you get the most out of your hardware.
{% endstep %}
{% endstepper %}
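
As a rough capacity-planning sketch for the first point above, the snippet below estimates how many nodes a fleet would need if each node handled 2-3x the throughput. It assumes throughput scales linearly with node count and that node count is the only cost driver; both are simplifications, and `nodes_needed` is a hypothetical helper, not part of DataStreamer.

```python
import math


def nodes_needed(current_nodes: int, throughput_factor: float) -> int:
    """Nodes required for the same load if each node is `throughput_factor` times faster."""
    if current_nodes < 1 or throughput_factor <= 0:
        raise ValueError("current_nodes must be >= 1 and throughput_factor > 0")
    return math.ceil(current_nodes / throughput_factor)


# Example fleet size; 6 matches the benchmark cluster but is otherwise arbitrary.
CURRENT_NODES = 6

for factor in (2.0, 3.0):
    needed = nodes_needed(CURRENT_NODES, factor)
    saving = (1 - needed / CURRENT_NODES) * 100
    print(f"{factor:.0f}x throughput: {needed} nodes instead of {CURRENT_NODES} "
          f"({saving:.0f}% fewer nodes)")
```

Real savings depend on factors this sketch ignores, such as storage, network, and headroom for traffic spikes, so treat the output as an upper-level estimate rather than a forecast.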

## Conclusion

The data is clear: in this benchmark, DataStreamer delivered substantially higher performance and efficiency than Logstash and Fluentd. Its Rust-based architecture is purpose-built for the demands of modern, high-volume data environments.

By choosing DataStreamer, you are not just selecting a data pipeline tool; you are investing in a scalable, cost-effective, and high-performance platform that will serve as the foundation of your observability and security strategy for years to come.