Problem solving and software design are essentially graph searches in a solution space. Yet most generative AI interfaces force us into linear conversations.
Chat is approachable and intuitive, so it makes sense that it became the default interface. But for complex work, it falls short. Most implementations have acknowledged this by adding basic support for conversation branching, but that support is still very limited.
To get a better idea of what we are missing, let’s look at some problem solving models and strategies. A tree is a type of graph, and the distinction doesn’t matter for this post, so I may use graph search and tree traversal terms interchangeably.
Introduction to Tree-based problem solving
Fred Brooks describes decision trees in The Design of Design (2010). From Chapter 2: How Engineers Think of Design—The Rational Model:
Now, so the Rational Model goes, the designer makes a design decision. Then, within the design space narrowed by that decision, he makes another. At each node he could have taken one or more other paths, so one can think of the process of design as the systematic exploration of a tree-structured design space.
In this model, design is conceptually (at least) very simple. One searches the tree-structured design space, testing each option against the constraints for feasibility and choosing so as to optimize the utility function. The search algorithms are well known and can be cleanly described.
Many engineers seem to approximate some sort of depth-first search strategy, choosing at each node the most promising or attractive option and exploring it to the end. At dead ends, one backtracks and takes another path. Hunches, experience, consistency, and aesthetic taste guide each option selection.
He also quotes Herbert Simon from The Sciences of the Artificial, 1969:
… [F]or the theory of design is that general theory of search … through large combinatorial spaces.
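The depth-first strategy Brooks describes can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `DesignNode` class, the numeric `appeal` score (a stand-in for hunches and taste), and the example tree are all hypothetical. It also stops at the first feasible design rather than optimizing a utility function.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class DesignNode:
    """One design decision; children are the options it opens up."""
    name: str
    appeal: float = 0.0  # hypothetical stand-in for hunch, experience, taste
    children: list["DesignNode"] = field(default_factory=list)


def depth_first_design(
    node: DesignNode,
    is_feasible: Callable[[DesignNode], bool],
    path: Optional[list[str]] = None,
) -> Optional[list[str]]:
    """Return the first feasible root-to-leaf path, or None at a dead end."""
    path = (path or []) + [node.name]
    if not is_feasible(node):
        return None  # dead end: the caller backtracks to its next option
    if not node.children:
        return path  # a complete design that passed every constraint check
    # Try the most attractive option first and follow it to the end.
    for option in sorted(node.children, key=lambda c: c.appeal, reverse=True):
        result = depth_first_design(option, is_feasible, path)
        if result is not None:
            return result
    return None  # every option under this node failed


# Hypothetical design space for a storage layer.
tree = DesignNode("storage", children=[
    DesignNode("custom engine", appeal=0.9, children=[
        DesignNode("write own B-tree", appeal=0.5),
    ]),
    DesignNode("off-the-shelf database", appeal=0.6, children=[
        DesignNode("single node", appeal=0.7),
    ]),
])

# Hypothetical constraint: writing a custom B-tree is out of budget.
found = depth_first_design(tree, lambda n: n.name != "write own B-tree")
print(found)  # ['storage', 'off-the-shelf database', 'single node']
```

The more attractive "custom engine" branch is explored first, hits a dead end, and the search backtracks to the second option.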
Tree-based problem management (2018) describes the strategy in practical terms:
When I am working on a problem, I almost always use the same strategy – I think of all the potential solutions to the problem and then start to think about what problems I might have trying to implement those solutions (repeated recursively). It becomes a dependency tree with a problem depending on solutions, which themselves depend on problems, etc.
Let’s say that Solution A depends on solving Problem X and Problem Y. As soon as you discover that Problem X is not viable, then there is no reason to work on Problem Y. You can also give up on any higher up problem that depended on Solution A. Similarly you can have solutions that depend on solving Problem X or Problem Y, in which case solving either means you do not have to look at the other or anything below it… and any higher or’s that depend on solution are also completed.
This sounds almost trivial, but for actual hard problems these trees can get very complicated, and it is easy to get lost in trying to solve a sub-problem and forget all the other contingencies on that solution.
I think you could build something like this in JS or even Bubble pretty quickly. It could be totally graphical, and I love the idea of seeing that there is actually one critical problem down towards that bottom of the tree that, once resolved as either solved or not viable, would immediately shrink the problem space in otherwise unexpected ways.
I can’t tell you how much time this would save me, and hopefully promote this problem solving strategy to others.
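The dependency tree in this quote behaves like an AND/OR tree, and the pruning it describes falls out of a small status function. Here is a minimal sketch, assuming a hypothetical `Node` class where `needs="any"` means one child is enough and `needs="all"` means every child is required. The names and the three status values are my own, not from the quoted post.

```python
from dataclasses import dataclass, field
from typing import Optional

OPEN, SOLVED, NOT_VIABLE = "open", "solved", "not viable"


@dataclass
class Node:
    name: str
    needs: str = "any"  # "any": one child suffices; "all": every child is required
    children: list["Node"] = field(default_factory=list)
    verdict: Optional[str] = None  # set by hand once you know the answer


def status(node: Node) -> str:
    """Derive a node's status from its own verdict or from its children."""
    if node.verdict is not None:
        return node.verdict
    if not node.children:
        return OPEN
    states = [status(child) for child in node.children]
    if node.needs == "all":
        if NOT_VIABLE in states:
            return NOT_VIABLE  # one blocked requirement sinks the whole node
        return SOLVED if all(s == SOLVED for s in states) else OPEN
    if SOLVED in states:
        return SOLVED  # any one working alternative is enough
    return NOT_VIABLE if all(s == NOT_VIABLE for s in states) else OPEN


def open_work(node: Node) -> list[str]:
    """Names of the leaves that are still worth working on."""
    if status(node) != OPEN:
        return []  # resolved either way: nothing below it matters anymore
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in open_work(child)]


x, y, z = Node("Problem X"), Node("Problem Y"), Node("Problem Z")
goal = Node("Goal", needs="any", children=[
    Node("Solution A", needs="all", children=[x, y]),
    Node("Solution B", needs="all", children=[z]),
])

before = open_work(goal)   # ['Problem X', 'Problem Y', 'Problem Z']
x.verdict = NOT_VIABLE     # Solution A is now dead, so Problem Y drops out
after = open_work(goal)    # ['Problem Z']
z.verdict = SOLVED
final = status(goal)       # 'solved'
```

Marking one leaf as not viable removes its sibling from the to-do list without anyone touching it, which is the effect the author wants to see graphically.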
Pruning and Task graphs
Another description, from an HN commenter, is pruning by building a Tree of failures:
My assumption was that humans don’t try a breadth-first approach. Instead, we split a task into a short-step (instinct and intuition selected), and long-step that summarizes/stores the next steps. The key idea is to recursively evaluate a task as a short-step (high-res - gets executed) and a long-step (lower-res - is just stored), until it succeeds or fails. If it fails, we must walk back keeping a summarized tree of failures in state so that we can exclude them in future selections.
A reply puts it in other words, calling it “Getting to No” for a problem:
It’s sort of a combination of “getting to know” and the opposite of the salesman’s “getting to Yes”. When it works, it’s the fastest way to prune off obligations. The goal is to figure out why some particular problem: isn’t really a problem, doesn’t need to be solved, can’t be solved that way, can’t really be solved (because of physics or it’s really a different problem).
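A loose interpretation of the tree of failures from the first comment, sketched in Python. It assumes a finite, acyclic set of steps given as a plain dictionary, and it treats the first listed option as the intuitively preferred "short step". The function name, the `failures` dictionary, and the example steps are hypothetical. The high-res versus low-res distinction from the comment is not modeled beyond that ordering.

```python
from typing import Optional


def attempt(
    state: str,
    goal: str,
    steps: dict[str, list[str]],
    failures: dict[str, list[str]],
    attempted: list[str],
) -> Optional[list[str]]:
    """Depth-first attempt that records failed states and never retries them."""
    attempted.append(state)
    if state == goal:
        return [state]
    tried = []
    for nxt in steps.get(state, []):  # options are pre-ranked, best guess first
        if nxt in failures:
            continue  # already known to fail: excluded from selection
        tried.append(nxt)
        rest = attempt(nxt, goal, steps, failures, attempted)
        if rest is not None:
            return [state] + rest
    # Walking back: summarize what failed under this state.
    failures[state] = tried
    return None


# Hypothetical step graph; "shared cache" is reachable from two parents.
steps = {
    "start": ["refactor", "rewrite"],
    "refactor": ["shared cache", "split module"],
    "rewrite": ["shared cache", "new service"],
    "shared cache": ["lock-free queue"],
    "new service": ["done"],
}

failures: dict[str, list[str]] = {}
attempted: list[str] = []
path = attempt("start", "done", steps, failures, attempted)
print(path)      # ['start', 'rewrite', 'new service', 'done']
print(failures)  # maps each failed state to the options that failed under it
```

"shared cache" fails under "refactor", so when the search later reaches it again under "rewrite" it is skipped without being re-explored.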
Finally, Steve Yegge describes it as a task graph and suggests how to divide tasks between humans and LLMs:
As any old-salt Project Manager knows all too well, every nontrivial project implicitly has a task graph consisting of multiple tasks, complete with subtasks and dependencies.
Here’s the rub: As of about May, LLMs can now execute most of the leaf tasks and even some higher-level interior tasks, even on large software projects. Which is great. But what’s left over for humans is primarily the more difficult planning and coordination nodes. Which are not the kind of task that you typically give junior developers.
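To make the task graph concrete, here is a small sketch using Python's standard `graphlib` module. The task names are made up, and the assignment rule (leaf tasks go to an LLM, interior tasks go to a human) is a deliberate oversimplification of the quote, which says LLMs handle most leaf tasks and some interior ones.

```python
from graphlib import TopologicalSorter

# Hypothetical project: each task maps to the subtasks it depends on.
subtasks = {
    "ship search feature": ["write API schema", "build indexer", "build endpoint"],
    "build indexer": ["write tokenizer", "write index writer"],
    "build endpoint": ["write request handler", "write API schema"],
}

# static_order() yields every task after all of its dependencies.
order = list(TopologicalSorter(subtasks).static_order())

# Assumed rule: tasks with no subtasks are leaves and go to the LLM;
# planning and coordination nodes stay with a human.
assignee = {task: ("human" if subtasks.get(task) else "LLM") for task in order}

for task in order:
    print(f"{assignee[task]:5}  {task}")
```

Even in this tiny example, the human-owned nodes are exactly the ones that require holding several subtasks in mind at once.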
UIs beyond chat
Tree-based problem solving is a useful approach in general, but it becomes even more relevant in the era of generative AI. Managing LLM context well is crucial for getting good results. A tree-based UI would enable each branch to maintain its own context window and avoid token bloat from other branches.
For example, suppose I describe a problem and an LLM offers five potential solutions. I immediately see that two of them are not viable and exclude them to prune the search. I can then explore and drill down into the remaining solutions separately, without polluting the context with the non-viable ones.
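One way to picture the underlying data model is a conversation tree where the context sent to the model is just the path from the root to the current turn. The sketch below is an assumption about how such a tool might work, not a description of any existing product. The `Turn` class, the `pruned` flag, and the message format are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare turns by identity; the tree links are cyclic
class Turn:
    role: str
    text: str
    parent: Optional["Turn"] = field(default=None, repr=False)
    children: list["Turn"] = field(default_factory=list, repr=False)
    pruned: bool = False

    def reply(self, role: str, text: str) -> "Turn":
        """Start or continue a branch below this turn."""
        child = Turn(role, text, parent=self)
        self.children.append(child)
        return child

    def context(self) -> list[dict[str, str]]:
        """Only the root-to-here path; sibling branches never leak in."""
        turns = []
        node: Optional[Turn] = self
        while node is not None:
            turns.append({"role": node.role, "content": node.text})
            node = node.parent
        return list(reversed(turns))


def open_leaves(node: Turn) -> list[Turn]:
    """Branch tips that are still worth exploring."""
    if node.pruned:
        return []
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in open_leaves(child)]


root = Turn("user", "Our API is slow. What are the options?")
answer = root.reply("assistant", "Five options: cache, index, shard, rewrite, bigger server.")
options = ["cache", "index", "shard", "rewrite", "bigger server"]
branches = {name: answer.reply("user", f"Let's explore: {name}") for name in options}

# Two options are obviously not viable, so prune them right away.
branches["rewrite"].pruned = True
branches["bigger server"].pruned = True

detail = branches["cache"].reply("assistant", "Start with a read-through cache.")
context = detail.context()  # 4 messages, none from the other branches
remaining = [leaf.text for leaf in open_leaves(root)]
```

Here the "cache" branch carries four messages of context regardless of how deep the "index" or "shard" branches grow.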
Do we need to create entirely new products for this? I am not sure. Perhaps a good implementation within existing thinking or knowledge tools, like mind maps or outliners, would be enough, although it would need much deeper integration with the underlying data model than just slapping a chatbox panel on the side. Canvas apps are another kind of existing app where I have seen some experiments. Those look cool visually, but they seem to waste a lot of screen space.
I am not sure what the solution would look like. I don’t imagine a tree or graph interface replacing chat for simple everyday tasks. It will likely have a learning curve, which would make it worthwhile only for more complex tasks. My hope is to see more exploration and experiments with interfaces beyond chat that are specifically designed for more efficient problem solving.