Introduction to Computational Complexity, Part 3
Understanding the Limits of Computation, One Step at a Time
Randomized Algorithms and Space Complexity
Throughout the evolution of computer science, randomness has always hovered at the edge of determinism like a mischievous spirit: invisible, elusive, and surprisingly powerful.
We don’t usually think of algorithms as creatures of chance. We like them neat, predictable, and efficient. Input goes in, output comes out, every step accounted for.
But what if we allow a little luck?
What if, instead of trying to outmaneuver the worst-case input, we lean into the unknown? Flip a coin. Roll a die. Pick a random pivot.
This is the world of randomized algorithms: programs that don’t just tolerate uncertainty, but try to exploit it. They might not always give the right answer, or at least not with 100% certainty, but they do so quickly, efficiently, and, with high probability, correctly.
And in parallel, another axis of complexity emerges: space. If time is how long an algorithm takes, space is how much memory it needs. And, more often than not, space is the real constraint. Especially in today’s world of massive data, streaming inputs, and edge devices with microscopic RAM.
So today, we turn the microscope on two beautiful twists of traditional computation: randomness and space.
This is the third part in our journey through computational complexity theory, and in my view, the most fascinating one.
Why? Because it challenges the very intuitions we formed in the first two parts. We began with time complexity: how long does a computation take? Then we talked about how problem difficulty stratifies into neat hierarchies: P, NP, PSPACE, EXPTIME.
But now, we have to ask a series of deeper, messier questions:
What happens when an algorithm is allowed to make mistakes, but only rarely?
Can randomness make a hard problem easier?
If we can verify a solution quickly, can we also find it quickly, if we throw in a few coin flips?
Is randomness just a hack? Or is it a fundamental resource like time and space?
Can we always simulate randomness deterministically?
What if our memory budget is so tight that we can barely remember where we are?
Can we still solve problems with only logarithmic space? With only a few bits of randomness?
How does computation change when we can only scan the data once, and can’t store more than a few counters?
Does randomness become more powerful when space is scarce, or less?
These are not just idle curiosities. They cut to the core of what it means to compute in a resource-constrained world.
Historically speaking, the study of randomized algorithms emerged in the 1970s and 1980s, as computer scientists began exploring the power of probabilistic computation.
The first surprise came with primality testing, one of the oldest questions in number theory. Suddenly, a randomized algorithm could check whether a number is prime orders of magnitude faster than any known deterministic method.
This wasn’t just theory: it had real, cryptographic implications.
Meanwhile, the exploration of space complexity had already begun. In the 1960s, researchers like Juris Hartmanis and Richard Stearns had started defining time and space complexity classes.
Later, the introduction of logarithmic space (L) and nondeterministic logspace (NL) offered a way to talk about computations that could run in minuscule memory.
But it wasn’t until decades later that breakthrough results like Reingold’s theorem showed that even problems like undirected connectivity could be solved in deterministic logspace, shattering long-held assumptions about randomness and space.
Today, the interplay between randomness and space sits at the frontier of complexity theory. It’s a land full of open problems, wild ideas, and hidden structure.
It asks not just whether something can be done, but how cleverly we can do it with the fewest possible bits.
So, in this post, we’ll explore amplification and derandomization, by examining how space and randomness intersect, sometimes awkwardly, sometimes elegantly.
And we’ll ask whether we truly need randomness at all; or whether, like magic, it’s just a clever illusion we haven’t yet fully understood.
Because if computation is about limits, then randomness and space are two of its most beautiful, most paradoxical ones.
The BPP Class: When You Let the Algorithm Flip a Coin
In the world of computation, randomness isn’t just noise, it’s a strategic tool. To understand how, we turn to one of the foundational complexity classes in randomized computation: BPP, short for Bounded-error Probabilistic Polynomial time.
What Is BPP?
A language L ⊆ {0,1}* is in BPP if there exists a probabilistic polynomial-time Turing machine M such that for every input string x:
If x ∈ L, then Pr[M(x) = 1] ≥ 2/3
If x ∉ L, then Pr[M(x) = 1] ≤ 1/3
This definition allows M to make some kinds of “mistakes”, but only with bounded probability. The constants 1/3 and 2/3 aren’t special at all: they can be replaced by any constants a < 1/2 < b, as long as b − a ≥ ε for some fixed ε > 0.
Through probability amplification, we can repeat M independently and take a majority vote. This reduces the error exponentially with the number of repetitions, thanks to bounds like the Chernoff bound.
So the point of BPP isn't absolute certainty: it's confidence bounded away from 50%, with the option to make that confidence arbitrarily close to 100% through repetition.
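To make amplification concrete, here is a minimal sketch in Python, assuming a hypothetical base decider that answers correctly with probability 2/3; the decider, its error rate, and the repetition count are illustrative stand-ins, not any particular real algorithm.

```python
import random
from collections import Counter

def base_decider(x, correct_answer, success_prob=2/3):
    """A stand-in for a BPP machine: answers correctly with probability
    2/3 and incorrectly otherwise (purely illustrative)."""
    if random.random() < success_prob:
        return correct_answer
    return 1 - correct_answer

def amplified_decider(x, correct_answer, repetitions=51):
    """Run the base machine independently and take a majority vote.
    By Chernoff-type bounds, the error drops exponentially in `repetitions`."""
    votes = Counter(base_decider(x, correct_answer) for _ in range(repetitions))
    return votes.most_common(1)[0][0]

# Empirically compare the error rate of a single run vs. the amplified run.
trials = 10_000
single_errors = sum(base_decider(None, 1) != 1 for _ in range(trials))
amplified_errors = sum(amplified_decider(None, 1) != 1 for _ in range(trials))
print(f"single-run error: {single_errors / trials:.3f}")    # around 0.333
print(f"amplified error:  {amplified_errors / trials:.4f}") # a few tenths of a percent, and shrinking with more repetitions
```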
Why Use Randomness at All?
Randomized algorithms often achieve efficiency, simplicity, or elegance unattainable by deterministic ones. Here are some classic examples:
1. Primality Testing
Before the deterministic AKS primality test (2002), we had the Miller–Rabin and Solovay–Strassen tests, two randomized algorithms that could efficiently determine whether a number was probably prime with high confidence.
These tests run in polynomial time and formed the basis of secure cryptographic protocols for decades.
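As a concrete illustration, here is a compact Miller–Rabin sketch; the round count and the small-prime shortcut are illustrative choices, and production libraries tune witnesses and rounds more carefully.

```python
import random

def is_probably_prime(n, rounds=20):
    """Miller–Rabin: returns False if n is definitely composite,
    True if n is prime with high probability (error <= 4^-rounds)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):       # handle small cases quickly
        if n % p == 0:
            return n == p
    # write n - 1 = d * 2^s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)    # random base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                  # a witnesses compositeness
    return True

print(is_probably_prime(2**61 - 1))  # True: a Mersenne prime
print(is_probably_prime(2**61 + 1))  # False: divisible by 3
```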
2. Polynomial Identity Testing (PIT)
Given two polynomials p(x1,…,xn) and q(x1,…,xn), represented implicitly by arithmetic circuits or black boxes, is p≡q ? That is, do they compute the same function over all inputs?
This problem basically reduces to checking whether h:=p−q≡0. We don’t evaluate h symbolically; we randomly sample points from the domain using a large enough set S⊆F, and evaluate h(x) at random values x∈S^n. Thanks to the Schwartz–Zippel Lemma, if h≢ 0 and its total degree is d, then:
Pr_{x ∈ S^n}[h(x) = 0] ≤ d / |S|
So by choosing |S| > 2d, the algorithm errs with probability less than 1/2 when h is not equivalent to zero. This places PIT in the class RP, a one-sided-error version of BPP.
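Here is a minimal black-box sketch of this test, assuming the polynomials are given only as callables over a large prime field; the prime, the example polynomials, and the trial count are illustrative.

```python
import random

P = 2**61 - 1   # work over the field F_P for a large prime P (illustrative)

def probably_identical(p, q, num_vars, trials=20):
    """Schwartz–Zippel test: evaluate p - q at random points of F_P^n.
    If the polynomials differ and have total degree d, one trial misses
    the difference with probability at most d / P."""
    for _ in range(trials):
        point = [random.randrange(P) for _ in range(num_vars)]
        if (p(*point) - q(*point)) % P != 0:
            return False      # definitely not identical
    return True               # identical with high probability

# Illustrative black boxes: (x + y)^2 vs. x^2 + 2xy + y^2 vs. x^2 + y^2.
f = lambda x, y: pow(x + y, 2, P)
g = lambda x, y: (x * x + 2 * x * y + y * y) % P
h = lambda x, y: (x * x + y * y) % P

print(probably_identical(f, g, num_vars=2))  # True
print(probably_identical(f, h, num_vars=2))  # False, with overwhelming probability
```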
3. Undirected s–t Connectivity
Is there a path from vertex s to t in an undirected graph? This is the Undirected Path problem. It's a classic example in randomized logspace (RL) computation.
Aleliunas, Karp, Lipton, Lovász, and Rackoff (1979) showed that a random walk of polynomial length starting from s visits every vertex of a connected graph with high probability. So to decide whether there's a path from s to t, do the following:
Start at s
Take O(n^3) random steps
Accept if you reach t; otherwise reject
The key lemma is that a random walk of length O(n^3) visits all nodes with high probability. This places the problem in RL: a space-constrained analog of BPP.
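A minimal sketch of this random-walk decider on an adjacency-list graph; the walk-length constant and the tiny example graph are illustrative. Note that the only state carried across steps is the current vertex and a step counter, which is what keeps the real algorithm in logspace.

```python
import random

def randomly_connected(adj, s, t):
    """One-sided-error test for undirected s-t connectivity:
    walk randomly for O(n^3) steps and accept iff t is ever visited.
    Never accepts when no path exists; rejects a reachable t only rarely."""
    n = len(adj)
    current = s
    for _ in range(8 * n ** 3):    # the constant 8 is an arbitrary safety factor
        if current == t:
            return True
        if not adj[current]:       # isolated vertex: the walk is stuck
            return False
        current = random.choice(adj[current])
    return current == t

# A path graph 0-1-2-3 plus an isolated vertex 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
print(randomly_connected(adj, 0, 3))  # True with high probability
print(randomly_connected(adj, 0, 4))  # always False
```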
4. Hashing and Bloom Filters
Probabilistic data structures like Bloom filters use hash functions with random-looking properties to allow space-efficient set membership queries. They occasionally give false positives, but never false negatives. This probabilistic tradeoff enables tremendous space and time savings.
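Here is a minimal Bloom-filter sketch, using Python's built-in hash with per-function salts; the array size, the number of hash functions, and the hashing scheme are illustrative rather than tuned.

```python
class BloomFilter:
    """A tiny Bloom filter: k salted hash functions over an m-slot bit array.
    May report false positives, never false negatives."""
    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k
        self.bits = bytearray(m)        # one byte per bit, for simplicity

    def _positions(self, item):
        for salt in range(self.k):
            yield hash((salt, item)) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
for word in ["random", "space", "logspace"]:
    bf.add(word)
print(bf.might_contain("space"))     # True
print(bf.might_contain("chernoff"))  # False (almost certainly)
```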
Can We Eliminate Randomness?
This leads to a deep and tantalizing question in theoretical computer science:
Is randomness truly necessary for efficient computation?
Or more precisely,
Is BPP = P?
Many complexity theorists believe so. The belief is that randomness can be simulated deterministically using pseudorandom generators (PRGs), which are algorithms that stretch short truly random seeds into long sequences that are "random enough" for any efficient computation.
Derandomization: Conditional Evidence
We don’t yet have a proof that BPP = P. But we do have conditional results:
If strong pseudorandom generators exist, then BPP = P.
If certain hard problems exist in EXP, or if one-way functions exist, then we can construct PRGs that fool all BPP algorithms.
In short: hardness implies pseudorandomness, and pseudorandomness lets us dispense with true randomness.
BPP and Circuit Complexity
Let’s now consider the class P/poly, the class of languages that can be decided by a polynomial-size circuit family (i.e., nonuniform polynomial time).
It’s known that:
BPP ⊆ P/poly
This means that for any BPP language L, there exists a polynomial-size Boolean circuit family that decides L.
Here's the intuition in more practical terms:
Let M be a BPP algorithm with exponentially small error, say ε = 2^(−2n) for inputs of length n, achieved via majority amplification.
For every x ∈ {0,1}^n, call a random string r “bad” for x if M(x; r) gives the wrong answer; at most a 2^(−2n) fraction of strings are bad for any fixed x.
By the union bound, the total fraction of random strings that are bad for some x is at most 2^n · 2^(−2n) = 2^(−n) < 1.
Therefore, there exists a single r^* that works for all inputs of length n.
This string r^* becomes the nonuniform advice for input length n, making M effectively deterministic. Hence, L∈P/Poly.
This is Adleman's Theorem (1978):
BPP ⊆ P/poly
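The counting step is worth seeing with numbers: there are 2^n inputs of length n, each ruling out at most a 2^(−2n) fraction of random strings, so the union bound leaves a 2^(−n) fraction unaccounted for, strictly less than 1. A quick sanity check of that arithmetic with exact fractions:

```python
from fractions import Fraction

def bad_fraction_bound(n):
    """Union bound: 2^n inputs, each with at most a 2^(-2n) fraction of
    bad random strings, rules out at most a 2^(-n) fraction in total,
    so some string r* works for every input of length n."""
    per_input_error = Fraction(1, 2 ** (2 * n))
    return (2 ** n) * per_input_error      # equals 2^(-n)

for n in (1, 5, 20):
    bound = bad_fraction_bound(n)
    print(n, bound, bound < 1)             # always strictly below 1
```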
Known Relationships Between Randomized Classes
The relationships among classes like BPP, RP, ZPP, and NP are rich but not fully settled:
P ⊆ ZPP ⊆ RP ⊆ BPP ⊆ P/poly
ZPP = RP ∩ coRP
RP ⊆ NP, but whether BPP ⊆ NP is unknown
Derandomization would collapse some of these inclusions: if BPP = P, many distinctions vanish
A Short Thought
Randomized algorithms are powerful not because they guarantee certainty, but because they let us efficiently reach high confidence. They’ve transformed fields from primality testing to graph algorithms to error-correcting codes.
But whether this power is fundamental or just a crutch we haven’t yet learned to throw away, that’s still one of the great open questions in theoretical computer science.
Space Complexity: How Much Can You Remember?
Most of computational complexity theory is obsessed with time, relentlessly asking: how fast can you solve a problem? But in many real-world systems, space, not time, is the bottleneck.
Streaming algorithms must operate with limited memory. Embedded systems live on kilobytes. In low-latency data processing, storing too much isn't just expensive, it’s practically infeasible.
So let’s shift focus. Let’s now talk about space and ask how much a computation needs to remember.
The Core Space Classes
We begin with the basics:
L (Logspace): Problems solvable by a deterministic Turing machine using only O(log n) memory. That’s just enough to index a position in the input, not to copy it (see the sketch after this list).
NL (Nondeterministic Logspace): Like L, but the machine can make nondeterministic choices. A natural example is graph reachability: does a path exist from node s to node t?
PSPACE: The class of all problems solvable using polynomial space, regardless of time. It contains NP, co-NP, and much more.
EXPSPACE: Like PSPACE, but with exponential space bounds. Few practical algorithms live here, but the class is structurally rich.
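To make “just enough memory to index a position” concrete, here is a minimal sketch of a logspace-flavored computation: deciding whether a bit string contains equally many 0s and 1s while keeping only an input-head position and two counters, each O(log n) bits. The problem choice is purely illustrative.

```python
def balanced(bits):
    """Decide whether a bit string has equally many 0s and 1s.
    The only working memory is a position index and two counters,
    each O(log n) bits; the input itself is treated as read-only."""
    zeros = ones = 0
    for i in range(len(bits)):    # i plays the role of the input head
        if bits[i] == "0":
            zeros += 1
        else:
            ones += 1
    return zeros == ones

print(balanced("0101"))  # True
print(balanced("0111"))  # False
```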
Unlike time classes, space classes exhibit unusual stability. In time complexity, nondeterminism often blows up the power of a class. But in space complexity, things are subtler.
Savitch’s Theorem
In 1970, Savitch proved a surprising result:
For any function s(n) ≥ log n,
NSPACE(s(n)) ⊆ DSPACE(s(n)^2)
This means that every problem solvable in nondeterministic space s(n) is also solvable in deterministic space s(n)^2. So while nondeterminism may still add power, it doesn’t create exponential space gaps.
A classic example: graph reachability is in NL, but by Savitch's algorithm (recursive divide-and-conquer), we can solve it in deterministic O(log^2 n) space.
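Here is a minimal sketch of the recursion behind Savitch's algorithm, asking "is there a path from u to v of length at most 2^k?" and recursing through a midpoint; the adjacency-matrix toy graph is illustrative. The point is the O(log n) recursion depth with O(log n) bits per frame, not the (terrible) running time.

```python
import math

def savitch_reachable(adj, u, v, k):
    """True iff there is a path from u to v of length at most 2^k.
    Recursion depth is k = O(log n); each frame keeps only a few vertex
    indices, mirroring the O(log^2 n) space bound."""
    if k == 0:
        return u == v or adj[u][v]
    return any(
        savitch_reachable(adj, u, mid, k - 1) and
        savitch_reachable(adj, mid, v, k - 1)
        for mid in range(len(adj))
    )

def reachable(adj, s, t):
    n = len(adj)
    return savitch_reachable(adj, s, t, math.ceil(math.log2(max(n, 2))))

# Directed graph: 0 -> 1 -> 2, and 3 is unreachable from 0.
adj = [[False, True,  False, False],
       [False, False, True,  False],
       [False, False, False, False],
       [False, False, False, False]]
print(reachable(adj, 0, 2))  # True
print(reachable(adj, 0, 3))  # False
```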
This result is quite rare in complexity theory: we almost never get such tight control over nondeterminism.
Reingold’s Theorem: A Landmark Collapse
One of the most elegant results in modern complexity theory came in 2004:
Omer Reingold proved that:
SL = L
Here, SL (symmetric logspace) is the class of problems solvable by symmetric nondeterministic logspace machines; its canonical complete problem is undirected graph reachability.
Previously, the best efficient approach to that problem used randomness: it was known to be in RL (randomized logspace) via random walks, but no deterministic logspace algorithm was known. Reingold’s work showed that no randomness or nondeterminism is needed: a deterministic, logspace algorithm exists.
This was a shock. It dismantled the long-standing belief that undirected connectivity intrinsically required randomness. And it showed that even with tiny memory, we could simulate behaviors previously thought to require probabilistic shortcuts.
It was also a technical masterpiece, relying on expander graphs, zig-zag product constructions, and ideas from derandomization theory.
Randomness and Space: An Uneasy Alliance
When you’re constrained in memory, using randomness becomes trickier. The natural analogs to BPP in logspace are:
RL (Randomized Logspace): Logspace machines that can err with one-sided bounded probability.
BPL: Two-sided error, like BPP, but under a logspace constraint.
ZPL: Zero-error (Las Vegas) probabilistic logspace.
But space-bounded derandomization is much harder than time-bounded derandomization.
Why?
Because a logspace machine reads its coin flips off a one-way tape and has no room to store them: once a random bit has been used, it cannot be written down, re-read, or revisited.
You can still repeat a logspace algorithm with fresh randomness and keep a small majority counter, but you cannot remember the random strings you have already consumed, and a brute-force deterministic simulation that enumerates every possible coin-flip sequence blows the space budget entirely.
This leads to fundamental techniques in space-bounded derandomization:
Randomness recycling: Reusing random bits via carefully designed pseudorandom walks.
Pseudorandom generators (PRGs) for logspace: Explicit constructions that stretch a short random seed into a long bitstream indistinguishable from true randomness by any logspace machine.
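As a toy illustration of the shape such generators take, here is a sketch of the recursive-doubling idea behind Nisan-style space-bounded PRGs: one seed block plus a few pairwise-independent hash functions get stretched into many output blocks. This toy version comes with no fooling guarantee; the prime, the hash family, and all parameters are illustrative assumptions.

```python
import random

P = (1 << 61) - 1   # blocks are integers mod a Mersenne prime (illustrative)

def random_hash():
    """Sample h(x) = a*x + b mod P, a standard pairwise-independent family."""
    a = random.randrange(1, P)
    b = random.randrange(P)
    return lambda x: (a * x + b) % P

def stretch(seed_block, hashes):
    """Recursive doubling: each hash function doubles the output length,
    so k hash functions turn one seed block into 2^k output blocks.
    The short seed is (seed_block, hashes); everything else is derived."""
    blocks = [seed_block]
    for h in hashes:
        blocks = blocks + [h(b) for b in blocks]
    return blocks

seed_block = random.randrange(P)
hashes = [random_hash() for _ in range(10)]    # 10 hash descriptions
output = stretch(seed_block, hashes)
print(f"{len(output)} pseudorandom blocks from a short seed")  # 1024 blocks
```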
Shachar Lovett’s lectures emphasize the subtlety here: even generating pseudorandom bits requires memory. So PRGs must be designed to be computable within logspace.
And unlike in BPP, where derandomization could hinge on assuming cryptographic hardness, space-bounded derandomization often depends on combinatorial constructions, expanders, extractors, and pseudorandom walks.
Is RL = L?
This is the big open question in space-bounded randomness:
Is every problem solvable in randomized logspace also solvable in deterministic logspace?
In other words: is randomness eliminable in logspace computation?
Reingold’s result, showing Undirected Connectivity ∈ L, gives us hope. It was a problem previously complete for SL (and in RL), but now known to be in L.
But the general question remains unresolved.
Other problems, like Directed s–t Connectivity, remain complete for NL. Whether RL = L, or even NL = L, remains a major open frontier. And derandomizing RL in full may require new forms of PRGs: stronger, simpler, and tighter than we’ve yet found.
Randomness, Memory, and Tradeoffs
Randomized algorithms are powerful because they offer simplicity and speed in exchange for a tolerable level of uncertainty. That’s often a great deal.
But when space is tight, when memory is scarce, the cost of randomness increases. It’s harder to amplify. Harder to recycle. Harder to control.
So space-bounded computation teaches us something fundamental: not all randomness is free.
In time complexity, you can spend time to control error. In space complexity, you might not have the room to do so.
The View from Here
If P vs NP is the Mount Everest of complexity theory, then the terrain of space complexity, randomized computation, and derandomization is a vast mountain range.
Not as tall, perhaps, but far more intricate. Narrow passes. Hidden valleys. Technical cliffs where even small improvements feel like summits.
We don’t yet know the full shape of this terrain.
But with every coin flip we eliminate, and every bit of memory we save, we get a little closer.
One step, and one logspace walk, at a time.
Toward a Deeper Understanding of Computation
Randomness and space, two forces at the edges of computation, reveal a side of algorithms that’s less mechanical and more philosophical. They challenge the classical ideal of predictable machines grinding out exact answers.
Instead, they invite us to embrace uncertainty, approximation, and constraint: not as weaknesses, but as creative tools.
We’ve seen that randomness can make hard problems easier, that coin flips can guide algorithms to elegant and efficient solutions.
We’ve also seen that space limitations impose a deeper kind of discipline: when memory is scarce, every bit matters, and cleverness replaces brute force.
But we’ve also glimpsed a deeper unity. The ongoing quest to derandomize computation, by eliminating randomness without sacrificing efficiency, hints that what looks like chance may in fact be structure in disguise.
And the push to solve problems in logarithmic space shows that what looks like a limitation might actually be a path to deeper understanding.
So the study of randomized and space-bounded algorithms isn’t just about performance. It’s about the essence of computation; what it means to compute, to decide, to remember, and to reason under constraint.
And as with many frontiers in theoretical computer science, we are far from having all the answers.
But in asking these questions, about BPP and RL, about pseudorandomness and memory limits, we are carefully mapping the terrain. Slowly. Inevitably. Sometimes even randomly.
Because complexity theory, at its heart, isn’t just a science of what’s possible. It’s a philosophy of what’s necessary.
And perhaps, in that interplay between luck and limitation, we’ll discover something essential: not just about algorithms, but about thought itself.