I recently built a sorting playground, and one question kept coming up:
How do you compare and evaluate sorting algorithms?
Not just theoretically, but in practice.
The problem
A simple benchmark sounds easy:
- run the same algorithm
- measure time
- compare
But it quickly becomes complicated:
- Python, Rust, and C behave very differently
- large inputs can break CI pipelines
- some algorithms are not even benchmarkable
The approach
Instead of chasing perfect accuracy, I focused on something else:
consistency and reproducibility
Key decisions
- run benchmarks in CI (GitHub Actions)
- use fixed datasets
- run multiple iterations
- average results
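The decisions above can be sketched in a few lines of Python. This is a minimal illustration, not the playground's actual harness: the names `make_dataset` and `benchmark`, the seed value, and the iteration count are all assumptions made for the example.

```python
import random
import statistics
import time


def make_dataset(size, seed=42):
    """Build a reproducible list of integers (hypothetical helper).

    A private Random instance with a fixed seed yields the same data on every run.
    """
    rng = random.Random(seed)
    return [rng.randint(0, 1_000_000) for _ in range(size)]


def benchmark(sort_fn, data, iterations=5):
    """Time sort_fn over several iterations and return the mean in seconds."""
    timings = []
    for _ in range(iterations):
        work = list(data)  # fresh copy, so every iteration sorts the same unsorted input
        start = time.perf_counter()
        sort_fn(work)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


if __name__ == "__main__":
    data = make_dataset(10_000)
    mean_seconds = benchmark(sorted, data)
    print(f"mean over 5 runs: {mean_seconds:.6f}s")
```

Copying the input before each iteration matters for in-place sorts: without it, every run after the first would be timing an already sorted list.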
Input sizes
- small
- medium
- large
But not every language runs every size: some combinations simply take too long to finish.
The reality of cross-language benchmarking
One important thing I learned:
Not all languages should run the same workload.
For example:
- Rust / C / C++ can handle large datasets easily
- Python can become extremely slow on large inputs
- running everything “fairly” is not practical
Practical constraints
So I introduced constraints:
- Python skips large datasets
- heavy algorithms are limited
- some algorithms opt out entirely
This makes the system:
- fast enough to run in CI
- stable
- still useful for comparison
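One way to express constraints like these is a small lookup that decides which sizes a given algorithm and language pair should run. The sketch below is hypothetical: the element counts, the algorithm names, and the reading of "limited" as a size cap are assumptions for illustration, not the project's real configuration.

```python
# Hypothetical size names and element counts, ordered from smallest to largest.
SIZES = {"small": 1_000, "medium": 10_000, "large": 100_000}

# Sizes each language is allowed to run (assumed values).
LANGUAGE_SIZES = {
    "rust": ["small", "medium", "large"],
    "c": ["small", "medium", "large"],
    "python": ["small", "medium"],  # Python skips large datasets
}

# Per-algorithm overrides: a size cap, or None to opt out entirely.
# Both algorithm names are made up for this example.
ALGORITHM_LIMITS = {
    "bubble_sort": "medium",  # heavy algorithm, capped at medium
    "bogo_sort": None,        # opts out of benchmarking
}


def plan_runs(algorithm, language):
    """Return the list of size names to benchmark for this pair."""
    if algorithm in ALGORITHM_LIMITS and ALGORITHM_LIMITS[algorithm] is None:
        return []  # algorithm opted out
    allowed = LANGUAGE_SIZES.get(language, [])
    cap = ALGORITHM_LIMITS.get(algorithm)
    if cap is not None:
        order = list(SIZES)  # dicts keep insertion order, so this is small -> large
        allowed = [s for s in allowed if order.index(s) <= order.index(cap)]
    return allowed


if __name__ == "__main__":
    for algo in ("quick_sort", "bubble_sort", "bogo_sort"):
        for lang in ("rust", "python"):
            print(algo, lang, plan_runs(algo, lang))
```

Keeping the rules in data rather than scattered `if` statements makes it easy to see, and to review in a pull request, exactly which workloads each language runs.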
Incremental benchmarking
Another key idea:
don’t re-run everything
- new algorithms → benchmarked
- existing ones → reused
This keeps CI time under control.
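A rough sketch of that idea, assuming results live in a single JSON file keyed by algorithm name. The file layout and the function names here are illustrative assumptions, not the repository's actual structure.

```python
import json
from pathlib import Path


def load_results(path):
    """Load previously stored results, or an empty dict if none exist yet."""
    p = Path(path)
    if not p.exists():
        return {}
    return json.loads(p.read_text(encoding="utf-8"))


def pending_algorithms(all_algorithms, existing_results):
    """Return only the algorithms that have no stored result."""
    return [name for name in all_algorithms if name not in existing_results]


def run_incremental(all_algorithms, path, run_benchmark):
    """Benchmark new algorithms only, reuse the rest, and write everything back.

    run_benchmark is any callable taking an algorithm name and returning
    JSON-serializable data (a placeholder for the real harness).
    """
    results = load_results(path)
    for name in pending_algorithms(all_algorithms, results):
        results[name] = run_benchmark(name)
    Path(path).write_text(
        json.dumps(results, indent=2, sort_keys=True), encoding="utf-8"
    )
    return results
```

One trade-off worth noting: reused numbers come from an earlier CI run, so they are only comparable to new ones as long as the runner environment stays roughly the same.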
What the system produces
Each algorithm gets:
- per-language results
- per-size measurements
- aggregated data
The output is stored as static JSON and rendered in the UI.
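To make that concrete, here is one possible shape for such a record, built from raw timings. The field names (`languages`, `mean_seconds`, `min_seconds`, `runs`) and the choice of aggregates are assumptions for this sketch; the real schema may differ.

```python
import json
import statistics


def build_record(algorithm, samples):
    """Aggregate raw timings into a JSON-ready record.

    samples maps language -> size name -> list of timings in seconds.
    """
    record = {"algorithm": algorithm, "languages": {}}
    for language, by_size in samples.items():
        record["languages"][language] = {
            size: {
                "runs": len(timings),
                "mean_seconds": statistics.mean(timings),
                "min_seconds": min(timings),
            }
            for size, timings in by_size.items()
        }
    return record


if __name__ == "__main__":
    # Made-up timings, purely to show the output shape.
    samples = {
        "rust": {"small": [0.001, 0.002], "large": [0.10, 0.12]},
        "python": {"small": [0.02, 0.03]},
    }
    print(json.dumps(build_record("quick_sort", samples), indent=2))
```

Because the record is plain JSON, a static UI can fetch and render it without any backend.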
Why build this?
Because combining:
- visualization
- comparison
- and reproducible benchmarks
makes algorithms much easier to understand.
Future ideas
- more languages
- better scoring models
- workload-specific comparisons
Whatever gets added, it has to stay simple enough to run.
If you’re interested, you can explore it here (open source under the MIT license):
https://sorting.1234567890.dev/benchmark
https://github.com/T-1234567890/sort-playground