One of my guilty timewasters is playing pipe puzzles on my phone. The rules are fairly simple: you get a grid of pipes and you need to rotate them such that all of the pipes connect with no broken connections and no cycles. I vibecoded a JavaScript implementation to embed here so you can play around to understand the mechanics. If you want to play for real go to puzzle-pipes.com.

Solving these individually is certainly a waste of my time. A marginally better use of my time is writing a solver. At least this way I might learn something.

Solving Pipes Puzzles

One approach we could take is to just check every possible configuration to see whether it’s a solution. This is clearly a terrible idea (an n×n grid has up to \(4^{n^2}\) configurations) but let’s run with it conceptually. The biggest issue is figuring out how to generate all of these configurations efficiently. Enter depth-first search (sketched in this stylized Julia code).

function dfs_solve(initial_state)
    stack = [initial_state]
    while !isempty(stack)
        state = pop!(stack)

        is_solved(state) && return state

        branch_pt = next_branch_point(state)

    for next_state in make_branch(state, branch_pt)
            push!(stack, next_state)
        end
    end
end

Our initial state allows all possible assignments for each pipe. Each branching point selects a definite assignment for a single pipe. If it turns out we were correct on all assignments we’re done; otherwise we continue iterating through the search.
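To make the branching concrete, here’s one possible representation where each cell maps to its set of candidate orientations. This is my own sketch: the helper names mirror the pseudocode above, but the encoding is a guess, not the repo’s actual one.

```julia
# A cell's state is the set of orientations (quarter-turns 0-3) still
# possible; a full state maps each cell index to such a set.
const State = Dict{Int, Set{Int}}

# Branch on some cell that still has more than one candidate;
# `nothing` means every cell is already decided.
function next_branch_point(state::State)
    for (idx, candidates) in state
        length(candidates) > 1 && return idx
    end
    return nothing
end

# One successor state per remaining candidate: copy the state and
# pin the branch cell to a single orientation.
function make_branch(state::State, branch_pt::Int)
    map(collect(state[branch_pt])) do orientation
        next = deepcopy(state)
        next[branch_pt] = Set([orientation])
        next
    end
end
```

Note that each branch deep-copies the whole state; that allocation cost is exactly what the mutable-state section later attacks.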

Short-Circuiting

We can obviously make this more efficient. Notably, we can start short-circuiting our search when we know the remaining solution is infeasible. If we’ve set up our pipes in such a way that we already have a cycle or two incompatible pipes are connected then there’s no point in further examining states from that tree. Thus we can augment our search with one further line.

function dfs_solve_with_stopping(initial_state)
    stack = [initial_state]
    while !isempty(stack)
        state = pop!(stack)

        !is_valid_state(state) && continue  # added to stop early
        is_solved(state) && return state

        branch_pt = next_branch_point(state)

    for next_state in make_branch(state, branch_pt)
            push!(stack, next_state)
        end
    end
end
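As a toy illustration of what `is_valid_state` needs to catch, here’s a sketch on a wrapped 1-D row of cells where each cell only decides whether it opens toward its left and right neighbors (`missing` = not yet decided). The real grid version does the same checks over four directions; this simplified model is mine, not the repo’s.

```julia
# Toy model: a wrapped row of n cells. opens_right[i] says whether cell i
# opens toward its right neighbor, opens_left[j] whether cell j opens
# toward its left neighbor; `missing` means not yet decided.
function is_valid_state(opens_right, opens_left)
    n = length(opens_right)
    parent = collect(1:n)                       # tiny DSU for cycle detection
    find(x) = parent[x] == x ? x : find(parent[x])
    for i in 1:n
        j = mod1(i + 1, n)                      # wraps: cell n neighbors cell 1
        r, l = opens_right[i], opens_left[j]
        (ismissing(r) || ismissing(l)) && continue  # undecided: can't rule out yet
        r != l && return false                  # half-open connection: incompatible
        if r                                    # both open toward each other
            a, b = find(i), find(j)
            a == b && return false              # already connected: a cycle
            parent[a] = b
        end
    end
    return true
end
```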

Constraint Propagation

This is getting much better but it’s still fairly inefficient. I might introduce an inevitable contradiction towards the start of my search but we won’t know it until the very end. It would be nice if we could catch it earlier: we can do this via constraint propagation and just never generate these assignments in the first place.

function dfs_with_constraint_propagation(initial_state)
    stack = [(initial_state, all_indices(initial_state))]  # pair each state with its propagation queue

    while !isempty(stack)
        (states, queue) = pop!(stack)

        propagate_constraints!(states, queue) # added to remove impossible next states

        !is_valid_state(states) && continue
        is_solved(states) && return states

        branch_pt = next_branch_point(states)

        for next_state in make_branch(states, branch_pt)
            push!(stack, (next_state, [branch_pt]))
        end
    end
end

function propagate_constraints!(states, queue)
    while !isempty(queue)
        idx = pop!(queue)

        possible_states = states[idx]

        for (neighbor_idx, dir) in neighbors(idx, states)
            existing = states[neighbor_idx]
            allowed = possible_neighboring_states(possible_states, dir)
            new_set = intersect(existing, allowed)

            if length(new_set) < length(existing)
                states[neighbor_idx] = new_set
                push!(queue, neighbor_idx)
            end
        end
    end
end

This starts with an initial queue of cells to consider: let’s just use the pipe whose state we just branched on. It then looks at all of that pipe’s neighbors. For each neighbor, it calculates which neighbor states are consistent with the current cell’s possibilities and keeps only those. If we removed any states in this process then we’ve successfully propagated a constraint; but now we might have follow-on effects! So we add that neighbor to the queue and see if we can propagate any further constraints, continuing until the queue empties.
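A sketch of the filtering step, under the assumption that each candidate state is just the set of directions a pipe opens toward (the real encoding may differ). Here the intersection from the pseudocode is folded into a filter over the neighbor’s candidates: a neighbor state survives if at least one of our candidates agrees with it across the shared edge.

```julia
# Flip a direction across the shared edge.
opposite(dir) = Dict(:north => :south, :south => :north,
                     :east => :west, :west => :east)[dir]

# Keep only the neighbor states compatible with at least one of our
# candidate states: we open toward the neighbor exactly when it opens back.
function possible_neighboring_states(candidates, dir, neighbor_candidates)
    filter(neighbor_candidates) do neighbor
        any(candidates) do ours
            (dir in ours) == (opposite(dir) in neighbor)
        end
    end
end
```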

Going faster with mutable state

This already gets us a reasonably fast solver. However, it’s not going to be the fastest; in particular, we allocate a lot of memory by passing around copies of our states as next_state. It would be nicer if we just manipulated a single mutable state data structure.

This only works, however, if there’s an easy way to backtrack and undo certain transformations. Fortunately this is relatively easy for us: we can maintain an undo log which stores the state changes, so undoing an operation is simply assigning the old value back to a field. If we find we’ve hit a contradiction, we iterate back through our propagated constraints, restoring the old sets.
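The undo log can be sketched as follows, assuming `states` is a flat vector of candidate sets; `narrow!` and `backtrack!` are hypothetical names of my own.

```julia
# Every in-place narrowing records the old set before overwriting it.
function narrow!(states, undo_log, idx, new_set)
    push!(undo_log, (idx, states[idx]))  # remember the old value
    states[idx] = new_set
end

# Pop entries off the log until we're back at the marked depth.
function backtrack!(states, undo_log, mark)
    while length(undo_log) > mark
        (idx, old_set) = pop!(undo_log)
        states[idx] = old_set
    end
end
```

Before each branch we record `mark = length(undo_log)`; on a contradiction, `backtrack!` restores everything down to that mark.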

But interestingly the constraint propagation is the easy part: it’s is_valid_state that benefits the most from this mutability.

Incrementally Detecting Invalid States

The connectivity check is easy enough to define with a Disjoint Set Union (DSU): we go through connected tiles, and if we ever try to join two tiles that are already in the same component then we’ve got a cycle and the state is invalid.

This, however, requires us to iterate through all of the tiles each time as we rebuild the DSU from scratch. Note that there’s a lot of repeated work here: the DSU structure is basically the same for states that share history; we should be able to keep the same data structure around and just update it incrementally.

The only problem is with backtracking: we need to be able to undo a previous merge in our DSU data structure. With the default implementation this isn’t possible once path compression is applied: the flattened parent pointers lose the information needed to reverse the union.

We could follow the same approach as the state constraints and maintain an undo log. But path compression makes this expensive: a single find can rewrite every node along the parent chain, each of which must be logged. In the worst case this is \(O(n^2)\).

Fortunately we can modify the usual DSU to drop path compression entirely. Without it, find walks the parent chain one step at a time, making every mutation to parents trivially reversible. With union by size, finds become \(O(\log n)\), which we can easily live with.
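A sketch of such a backtrackable DSU with union by size (the names here are my own, not the repo’s): each merge mutates exactly one parent pointer and one size entry, so undoing is a pop plus two assignments.

```julia
# Parent pointers plus subtree sizes; `log` records (child root, old size
# of the new root) for each merge so it can be undone exactly.
struct RevDSU
    parent::Vector{Int}
    size::Vector{Int}
    log::Vector{Tuple{Int, Int}}
end

RevDSU(n::Int) = RevDSU(collect(1:n), ones(Int, n), Tuple{Int, Int}[])

function find(d::RevDSU, x::Int)
    while d.parent[x] != x
        x = d.parent[x]        # walk up one step at a time, no compression
    end
    return x
end

# Merge the components of x and y; returns false if they were already
# connected (i.e. this edge would close a cycle).
function link!(d::RevDSU, x::Int, y::Int)
    a, b = find(d, x), find(d, y)
    a == b && return false
    if d.size[a] > d.size[b]
        a, b = b, a            # attach the smaller tree under the larger
    end
    push!(d.log, (a, d.size[b]))
    d.parent[a] = b
    d.size[b] += d.size[a]
    return true
end

# Undo the most recent merge.
function undo!(d::RevDSU)
    (a, old_size) = pop!(d.log)
    d.size[d.parent[a]] = old_size
    d.parent[a] = a
end
```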

Benchmarking

And with these changes we’re done! The solver handles these puzzles quite quickly:

| Puzzle | Rebuild DSU | Incremental DSU | Speedup |
|--------|-------------|-----------------|---------|
| 7×7    | 33.1 µs     | 13.5 µs         | 2.5×    |
| 15×15  | 1.3 ms      | 76.9 µs         | 16.3×   |
| 30×30  | 10.0 ms     | 295.8 µs        | 33.7×   |

I’ve vibecoded a bot to submit to puzzle-pipes.com. You would think we could blow the current first place out of the water, as it sits at only 5 ms. Instead we’re currently in 2nd place on the robot’s leaderboard: the majority of the time is just latency to the server!

Now that we can efficiently solve these puzzles, what else can we do?

Can we generate these puzzles? (and how many are there?)

One question I pondered about was how the original website generates all of these puzzles. Could I create my own puzzles?

A simple approach is to use the same mechanism as the solver to generate puzzles. Instead of starting with a board of particular tiles we instead start with the full set of tiles. We then do our DFS but once we find a solution we keep going and explore the rest of the tree. By definition this should explore every single possible board configuration. Let’s use this to see how many there are!
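In the same stylized register as the solver above (so the helpers are the same placeholders), the change is just to record each solution and keep exploring rather than returning:

```julia
function dfs_enumerate(initial_state)
    solutions = []
    stack = [initial_state]
    while !isempty(stack)
        state = pop!(stack)

        !is_valid_state(state) && continue
        if is_solved(state)
            push!(solutions, state)
            continue           # keep exploring instead of returning
        end

        branch_pt = next_branch_point(state)

        for next_state in make_branch(state, branch_pt)
            push!(stack, next_state)
        end
    end
    return solutions
end
```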

We’re by necessity only going to be able to answer this for small boards since we’re just going to generate all of them!

| Grid | Solutions (unwrapped) | Solutions (wrapped) |
|------|-----------------------|---------------------|
| 2×2  | 4                     | 32                  |
| 3×3  | 176                   | 9936                |
| 4×4  | 76464                 | 27409728            |

You can see that this grows very quickly! And yeah I don’t think I’m ever going to solve all of them.

Duplicates

The solutions obviously include a lot of duplicates due to symmetry. For instance, the four unwrapped 2×2 solutions are all really the same: two end points connected by two corners, rotated through the four different ways. For larger grids you can also find non-distinct solutions through flipping, and of course for wrapped puzzles you can arbitrarily select any pipe as the top left.

The natural question is whether there are any duplicates that are not due to symmetry. Are there initial conditions that admit two genuinely different solutions? I didn’t find any in my enumerations, but at such small sizes that’s not surprising: you need sufficient structure to enable duplication.

We can actually test this empirically now that we have a fast solver. We can sample puzzles (see next section) at various sizes and check whether the solver finds a second solution:

| Grid  | Duplicate rate (unwrapped) | Duplicate rate (wrapped) |
|-------|----------------------------|--------------------------|
| 5×5   | 17.6% [16.9%, 18.4%]       | 24.6% [23.7%, 25.4%]     |
| 7×7   | 24.6% [23.8%, 25.5%]       | 31.8% [30.9%, 32.8%]     |
| 10×10 | 44.6% [43.6%, 45.6%]       | 52.8% [51.8%, 53.8%]     |
| 25×25 | 98.5% [98.3%, 98.7%]       | 99.0% [98.8%, 99.2%]     |

The trend is clear: almost all puzzles end up having duplicate solutions at high enough sizes. The intuition for this becomes obvious when we look at an example of how this happens in a 5×5 wrapped puzzle:

Solution 1
Solution 2

The highlighted cells are the only ones that differ: this 2×2 block of Ts and end pieces can be rotated either way while maintaining the correct external connectivity. As you scale up, there are more and more places for this or analogous substructures to show up. Of course there are always puzzles with unique solutions (the trivial “spiral” pattern comes to mind) but they become sparser and sparser.

Interestingly, puzzle-pipes.com uses a hash to verify submitted solutions which wouldn’t work if multiple solutions were valid, suggesting they actively filter these out during generation. I’m curious if they achieve this through smarter generation or, like I’ve done, just throwing a bunch of compute at it.

How hard are these puzzles? (and can we sample them?)

Another question we might have is how much “harder” it is to solve the wrapped puzzles compared to the unwrapped versions. We can quantify this with the number of backtracks that it takes to solve each puzzle.

Now this gets much more interesting when we start increasing the size of the board. From how fast the numbers of solutions grew in the previous section, we know that generating all of the solutions is going to be infeasible. But we don’t need all of the solutions: a uniform random sample should suffice to answer most of our questions. Unfortunately getting this uniform random sample is quite hard!

Random DFS

The difficulty is not the randomization: we can easily accomplish that by randomly choosing cells and orientations during our DFS. The difficulty lies in the uniformity. Suppose we’re near the end of our DFS, randomly selecting between two orientations. If we choose one option, constraint propagation kicks in and only one solution remains. If we choose the other, there are two possible choices and thus one more level of DFS exploration. With uniformly random orientation selection we’ll be biased towards the solution stemming from the first choice, since it shows up 50% of the time while the other two solutions each show up only 25% of the time.

We could correct for this by biasing our selection probability by the total number of solutions left below this node in the tree. But that requires us to traverse the entire tree which grows too quickly! So we’re stuck.
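To make the correction concrete: if we somehow knew `counts[i]`, the number of solutions beneath each branch, we’d sample a branch with probability proportional to its count. A sketch (the counts themselves are exactly the part we can’t afford to compute):

```julia
# Pick index i with probability counts[i] / sum(counts).
function weighted_branch(counts)
    total = sum(counts)
    r = rand() * total
    acc = 0.0
    for (i, c) in enumerate(counts)
        acc += c
        r <= acc && return i
    end
    return length(counts)      # guard against float rounding
end
```

In the example above, `counts = [1, 2]` would pick the second branch 2/3 of the time, giving each of the three solutions an equal 1/3 probability.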

Wilson’s Algorithm

Unless!! Maybe we’re going about this all wrong. What if, instead of trying to shoehorn generation into our solving approach, we started from scratch?

If you take a step back and look at each solution it turns out they’re just spanning trees: a subgraph that connects all vertices with no cycles!

You can actually generate these uniformly using Wilson’s algorithm, modified to reject any solutions with cross pieces (all four connections active). Wilson’s algorithm is just a random walk that erases loops. Start with a single seed cell in the tree. While unvisited cells remain, pick one at random and walk randomly until you reach the tree; any loop formed during the walk is immediately erased. Commit the resulting loop-erased path to the tree and repeat. See Jamis Buck’s excellent write-ups on Wilson’s algorithm (and the alternative Aldous-Broder algorithm) for detailed walkthroughs with visualizations.
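Here’s a sketch of Wilson’s algorithm on an n×n unwrapped grid, ignoring the cross-piece rejection for brevity; the returned edge set is a uniformly random spanning tree of the grid graph.

```julia
function wilson(n)
    cells = [(r, c) for r in 1:n for c in 1:n]
    neighbors(rc) = [(rc[1] + dr, rc[2] + dc)
                     for (dr, dc) in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 1 <= rc[1] + dr <= n && 1 <= rc[2] + dc <= n]
    in_tree = Set([(1, 1)])                  # seed the tree with one cell
    edges = Set{Tuple{NTuple{2, Int}, NTuple{2, Int}}}()
    for start in cells
        start in in_tree && continue
        # Random walk until we hit the tree. Storing only each cell's
        # latest exit direction implicitly erases any loop the walk makes.
        next_step = Dict{NTuple{2, Int}, NTuple{2, Int}}()
        cur = start
        while !(cur in in_tree)
            nxt = rand(neighbors(cur))
            next_step[cur] = nxt
            cur = nxt
        end
        # Commit the loop-erased path from start into the tree.
        cur = start
        while !(cur in in_tree)
            push!(in_tree, cur)
            push!(edges, (cur, next_step[cur]))
            cur = next_step[cur]
        end
    end
    return edges
end
```

The “latest exit direction” trick is what makes loop erasure automatic: if the walk revisits a cell, its old exit is overwritten, so following `next_step` from the start yields exactly the loop-erased path.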

Note that these are uniform over the graph structure: if we find that there are puzzles with non-unique solutions we’d end up oversampling that particular puzzle: we could check after generation however and reweight appropriately.

Unfortunately we start running into a bigger problem: the probability that a random spanning tree won’t have any cross pieces drops precipitously as the size goes up: at some point we reject almost all proposals. So we need a little hack: instead of rejecting the entire tree we reject the individual walks. That way we don’t throw away our progress. The only tradeoff is that we no longer have uniformity: the mass of trees which were close to rejected trees gets inflated since we push the rejected mass onto the remaining trees.

Which method is better?

We can verify this bias empirically on a small grid. For a 3×3 unwrapped puzzle we can enumerate all solutions and compare how often each solution appears when sampling via our random DFS versus Wilson’s algorithm. KL divergence from uniform measures how far each method strays: it is zero for a perfectly uniform sampler and grows as the distribution diverges.
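The measurement itself is a one-liner once we have per-solution sample counts (a sketch; `counts` holds how often each enumerated solution was sampled):

```julia
# KL(p || uniform) in nats: sum of p_i * log(N * p_i) over nonzero bins.
function kl_from_uniform(counts)
    total = sum(counts)
    n = length(counts)
    sum(c / total * log((c / total) * n) for c in counts if c > 0)
end
```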

| Method     | KL divergence from uniform |
|------------|----------------------------|
| Wilson’s   | 0.0291                     |
| DFS-random | 0.2202                     |

Wilson’s lands much closer to zero than DFS-random so we’ll use it going forward. We won’t have a truly uniform sampler but it’ll have to do.

Quantifying Difficulty

Armed with our random tree generator we can quantify solve difficulty by counting how many dead-end branches the backtracking solver has to prune. This effectively measures the size of the search tree and is thus a reasonable measure of the effort required to solve. We’ll randomly draw wrapped and unwrapped puzzles and count how many backtracks they require.

| Grid  | Backtracks (unwrapped) | Backtracks (wrapped) |
|-------|------------------------|----------------------|
| 5×5   | 1.5 [1.4, 1.6]         | 1.4 [1.3, 1.5]       |
| 7×7   | 2.1 [2.0, 2.2]         | 1.9 [1.8, 2.1]       |
| 10×10 | 3.5 [3.3, 3.7]         | 3.2 [2.9, 3.5]       |
| 25×25 | 25.3 [19.1, 35.6]      | 29.3 [21.6, 41.1]    |

We can clearly see the increased difficulty as the size increases; the pattern between wrapped and unwrapped difficulty is less clear, which surprised me.

Conclusion

You can find the full code at https://github.com/taylorpospisil/pipes.

The development process was pretty enjoyable; I coded up the original DFS with constraint propagation myself. Then with Claude Code I started improving things. Claude was proficient with tactical optimizations: making the code type stable, adding pre-allocation. Algorithmic improvements were hit or miss; it implemented a horrendous incremental DSU when it could have just added a field to my existing implementation. But I could generally walk it through the changes, and it was much faster than writing things out myself. I could take advantage of “garbage time”: specify a prompt when I didn’t have time to code and come back for review later. The bot submission code was certainly something I could not have done myself: sure, I could have used the project to learn more about JavaScript and web parsing, but that wouldn’t have met my internal cost-benefit analysis.

I still have an open question: is there any way to generate even bigger puzzles? The daily puzzles are 30×30 and the weekly and monthly are 40×40. It’d be cool to have a 100×100 puzzle or larger, but when you get that high the rate of puzzles with unique solutions continues to drop precipitously. Worse yet, the computation time required to check for uniqueness starts to bite. I tried generating even a 60×60 grid and gave up after an hour of computing. I’ll likely need a way to guarantee uniqueness at generation time rather than rejection sampling.

And of course I’d love to be #1. There’s no point in optimizing my solver code any further: it’s literally my ping which is holding me back now. I guess I could figure out where the server is located and spin up a VPS as close as possible but that seems a little excessive (for now). I’ll have to settle for #2 and having a fun time. Hopefully that kicks my little habit: though they do have many other puzzle types…