
Performance Observations & Execution Characteristics

The DataForge engine has been validated across local and cloud execution environments using real-world datasets and production-representative configurations. All results reflect sustained throughput under controlled conditions and are reproducible with comparable inputs.

Performance Summary

Validated results across execution environments and target systems.

  • 2,516,818 rows/sec · 86.0 MB/s: Local PostgreSQL ingestion (75.8M rows · 30.1s elapsed · API execution path · warmed system conditions)
  • 883,017 rows/sec: Cloud SQL (Enterprise Plus · 8 vCPU / 64 GB), executed via Cloud Run Jobs
  • ~15,000 rows/sec: Same Cloud SQL configuration, executed via Cloud Run Services (CPU throttled in the post-response lifecycle)
  • ~300,000 rows/sec: Local SQL Server ingestion, C# native implementation (SqlBulkCopy)
  • 249,353 rows/sec: Local SQL Server ingestion, Go implementation
  • ~8× throughput advantage: PostgreSQL vs SQL Server under comparable conditions and identical source data

Verified Benchmark Record

Sustained end-to-end ingestion over the full dataset under warmed system conditions.

  • 75,814,101 rows inserted
  • 30.1s elapsed time
  • 2,516,818 rows/sec
  • 86.0 MB/s throughput
  • Local PostgreSQL ingestion
  • API execution path
  • Sustained run over full dataset — no synthetic batching
  • Executed under warmed system conditions
  • Previous recorded run: ~2,160,869 rows/sec · ~35.1s elapsed
  • Current result: 2,516,818 rows/sec · 30.1s elapsed
  • Delta: +16.5% throughput · −14% elapsed time
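The stated deltas follow directly from the two recorded runs; a quick arithmetic check using the figures above:

```python
# Recorded runs from the benchmark record above (rows/sec, elapsed seconds).
prev_rate, prev_elapsed = 2_160_869, 35.1
curr_rate, curr_elapsed = 2_516_818, 30.1

throughput_gain = (curr_rate / prev_rate - 1) * 100       # percent change in rate
elapsed_change = (curr_elapsed / prev_elapsed - 1) * 100  # percent change in time (negative = faster)

print(f"{throughput_gain:+.1f}% throughput")  # +16.5%
print(f"{elapsed_change:.0f}% elapsed time")  # -14%
```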

The observed improvement is consistent with enhanced write-ahead logging (WAL) efficiency and improved buffer pool utilization due to warmed execution state. These factors influence the destination system behavior, not the DataForge engine itself.

Performance gains are realized as the target system becomes more efficient. The ingestion engine maintains a stable execution profile.

This benchmark reflects sustained, end-to-end ingestion performance under realistic conditions and is representative of achievable throughput on properly configured systems.

Key Observations

01

The System Is Not Compute-Bound

Across all environments, high throughput is sustained while CPU utilization remains low relative to output and disk I/O pressure stays minimal under streaming conditions.

The limiting factor is not processing capacity, but data movement and destination system behavior.
02

Execution Model Determines Realized Throughput

Cloud results demonstrate a significant divergence between execution modes against identical infrastructure:

  • 883K rows/sec: Cloud Run Jobs (sustained CPU allocation)
  • ~15K rows/sec: Cloud Run Services (lifecycle-throttled after the 202 response)
Throughput is governed by execution model constraints, not infrastructure alone.
03

Destination System Characteristics Define Upper Bounds

The engine's performance ceiling is determined by the characteristics of the target system, not the ingestion pipeline itself.

  • PostgreSQL supports 2.5M+ rows/sec under warmed, optimized conditions via COPY FROM STDIN — single streaming protocol call per batch
  • SQL Server throughput is constrained by TDS bulk copy protocol overhead and UTF-16 encoding at the driver layer
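Part of why COPY FROM STDIN is cheap per batch: the client renders rows into PostgreSQL's tab-delimited COPY text format and streams the whole buffer in a single protocol call. The sketch below (names are illustrative, not from the DataForge codebase) builds that buffer with the standard library; the one-call-per-batch step would then be a driver call such as psycopg2's `copy_expert`.

```python
import io

def rows_to_copy_buffer(rows):
    """Render rows into PostgreSQL COPY text format: tab-delimited
    columns, newline-terminated records, \\N for NULL, with embedded
    backslashes, tabs, and newlines escaped."""
    buf = io.StringIO()
    for row in rows:
        fields = []
        for value in row:
            if value is None:
                fields.append(r"\N")
            else:
                s = str(value)
                s = (s.replace("\\", "\\\\")
                       .replace("\t", "\\t")
                       .replace("\n", "\\n"))
                fields.append(s)
        buf.write("\t".join(fields) + "\n")
    buf.seek(0)
    return buf

# One streaming protocol call per batch, e.g. with psycopg2 (table and
# column names hypothetical):
#   cur.copy_expert("COPY opinions (id, body) FROM STDIN", rows_to_copy_buffer(batch))
```

The contrast with row-at-a-time INSERTs is the point: the per-row cost collapses to formatting bytes into a stream.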
04

Implementation Layer Impacts Throughput Efficiency

Observed differences between implementations on SQL Server under equivalent conditions:

  • ~300K rows/sec: C# · SqlBulkCopy · native protocol integration
  • 249K rows/sec: Go · go-mssqldb · driver and batching overhead
The engine remains constant. The execution layer determines how efficiently its performance is expressed.
05

Local vs Cloud Performance Characteristics

Local execution provides minimal network overhead, direct I/O access, and no container lifecycle constraints — producing the highest achievable throughput ceiling.

Cloud execution introduces network transfer overhead and service model constraints. Despite these differences, properly configured cloud execution remains within the same order of magnitude as local performance.

The 59× gap between Cloud Run Services and Jobs at constant hardware is attributable entirely to CPU allocation policy — not schema, data, or network.
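The multiplier falls directly out of the two measured rates:

```python
jobs_rate = 883_017     # Cloud Run Jobs, rows/sec (measured)
services_rate = 15_000  # Cloud Run Services, rows/sec (approximate)

gap = jobs_rate / services_rate
print(f"~{gap:.0f}x")  # ~59x, on identical hardware
```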
06

Concurrency and Scaling Behavior

Performance improves with concurrency: overhead is amortized across workloads, throughput increases without proportional infrastructure expansion, and per-unit cost decreases as utilization rises.

  • Each Cloud Run Job handles one file at full throughput
  • The concurrency ladder scales linearly against the storage write ceiling
  • Infrastructure footprint remains stable as concurrency increases
Scaling is achieved through concurrency, not infrastructure growth.
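One way to picture the scaling claim, under the illustrative assumptions of a fixed per-job rate and a hard storage write ceiling (the ceiling value below is hypothetical): aggregate throughput grows linearly with concurrent jobs until the storage layer saturates, while the infrastructure footprint stays constant.

```python
PER_JOB_RATE = 883_017        # rows/sec per Cloud Run Job (from the cloud benchmark above)
STORAGE_CEILING = 10_000_000  # rows/sec, hypothetical storage write ceiling

def aggregate_throughput(concurrent_jobs):
    """Linear scaling against a fixed write ceiling: each job runs at
    full rate until the storage layer, not the compute layer, binds."""
    return min(concurrent_jobs * PER_JOB_RATE, STORAGE_CEILING)

for n in (1, 4, 8, 16):
    print(n, aggregate_throughput(n))
# Throughput rises with n; past roughly 11 concurrent jobs the storage
# ceiling, not infrastructure, becomes the limit.
```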

A different approach to throughput.

The distinction is architectural, not incremental.

Traditional systems

Increase throughput by adding infrastructure — more nodes, more services, more coordination layers. Each addition introduces latency, failure surface, and operational cost. The infrastructure is the answer to every throughput question.

vs
DataForge

Increases throughput by increasing concurrency within a stable execution boundary — amortizing overhead, not multiplying it. The infrastructure is fixed; the throughput scales with utilization.

Operational Implications

Prefer sustained compute allocation

Use execution models that allow full CPU access for the duration of the job. Jobs over Services. Bare-metal over throttled containers. The mode is as important as the hardware.

Minimize unnecessary data movement

Network overhead is real and measurable. Colocate the execution environment with the target system where possible. In same-region deployments, testing confirmed no throughput difference between public IP and VPC private IP connectivity.

Align target system selection with requirements

PostgreSQL and SQL Server carry different throughput ceilings under the same workload. System selection is a performance decision. The 8× gap exists at the protocol layer before DataForge is in the picture.

Treat runtime as an optimization layer

Language and driver selection affect how efficiently the engine's output is expressed. Native protocol integration outperforms abstraction layers at scale. The engine is constant; the delivery mechanism is a variable.

All results:

  • Derived from real-world datasets (CourtListener public corpus)
  • Executed under observable, repeatable conditions
  • Reproducible with comparable configurations and hardware class

DataForge consistently delivers high-throughput data movement across environments, with performance bounded primarily by external system constraints rather than internal processing limits.

Ready to talk throughput?

Pilot discussions, investor conversations, enterprise architecture review, or technical deep-dives.