mitchell vitez

dark mode

blog about music art media

resume email github

Algebraic Dialogue Trees

If you’ve ever played a video game where you talked to a character, you’ve probably interacted with a dialogue tree. Dialogue trees allow both players and characters to say things (hence the “dialogue”) and have a concept of branching possibilities (hence the “tree”). For example, if a character asks you whether you want to go forward with Plan A or Plan B, their response should reasonably differ between the two. Of course, it’s often possible to pull the magician’s trick of “multiple outs”. Consider these two conversations:

Character: Do you think we should attack at dawn?
Player: Yes, seems like the best plan.
Character: Alright, I’ll send out the orders.

Character: Do you think we should attack at dawn?
Player: No, it seems too risky.
Character: Ah, but it’s a risk we must take. I’ll send out the orders.

While there may be the appearance of a choice here (if the player only gets to pick one option before moving on), that choice is an illusion. No matter what decision is made, the outcomes are the exact same. Sometimes, this is a necessary feature to cut down on adding an exponential number of possible scenarios to a game. However, it’s better to be a bit more subtle than this. Being given an obviously false or inconsequential choice can be more frustrating than not being given a choice at all.

Overuse of multiple outs can also lead to the other issues that come along with lazy writing. For example, if your game has “multiple endings” but the fundamental difference between them is the color of the big explosion at the end, that’s probably not very satisfying for players versus feeling that their actions throughout the game really changed the outcome, affecting the game’s world in an important way. The ending of a game is probably the place where game designers have the most freedom to branch off in wildly different directions, since there’s no need to tie the story back together afterwards.

I could talk about the consequences of various dialogue designs all day, but let’s get into some actual code implementing one.

Actual ADTs

There are a few ways we might straightforwardly implement dialogue trees, without having to think about them too much. We could just write a bunch of branching if statements, but it’s hard to think of a better recipe for spaghetti code. From a quick search, it seems like in C++ (one of the most common languages in game programming) we’d probably want to create a network of pointers, with each dialogue line pointing to the next, and choices pointing to multiple options. Finally, the one that we’ll talk about here, we could create an algebraic data type representing dialogue, and then base the rest of our design around that.

Conversation has a sequential, unbounded nature, so let’s make our DialogueTree an ordered container for many DialogueSteps. At each step there are multiple things that might happen, which we can enumerate. Perhaps the character just says a line, which we’ll call a Line, containing the actual text that shows up on screen as what the character said. Or maybe the player is offered the chance to say something, which we can call a Choice. This Choice is where we get the branching nature of the tree, since different text the player chooses leads to different subtrees of dialogue for what is said next.

Encoding this all in Rust might look like this:

type DialogueTree = Vec<DialogueStep>;

enum DialogueStep {
    Line(String),
    Choice(HashMap<String, DialogueTree>),
}

One great feature of using this structure is that it’s immediately extensible. If, for example, we wanted to add lines the player is forced to say, but has no choice about, we could add a PlayerLine(String), and the type system will catch every instance in the code where we need to make changes. We could even go more complicated and add conditionals. For example, we can add a case If(bool, String, String) that looks at the boolean, and picks between the two Strings based on its value. Algebraic data types can be used to encode entire parse trees for programming languages, so it’s no surprise that we can get similar capabilities here.

(De)serializing

Another great thing about our DialogueTree definition is that we have an incredibly easy way to get a parser for it: just add #[derive(Deserialize)]. With this derive, and a Rust Object Notation (RON) file, we can go from files representing dialogue to the corresponding actual Rust objects with a ron::de::from_reader. For example, we might write out a dialogue tree like this:

[ Line("Hello!")
, Line("Do you think this will be a good chat?")
, Choice(
    { "Yes, seems interesting.":
        [ Line("I think so too.") ]
    , "No, this is boring.":
        [ Line("Let's hope you change your mind.") ]
    })
]

Then parsing a full dialogue tree is no harder than opening the file and calling from_reader on it.

let f = File::open("path/to/dialogue.ron")
    .expect("Failed opening dialogue file");
let dialogue_tree: DialogueTree = from_reader(f)
    .expect("Failed to load dialogue");

This kind of simple deserialization makes it easier to separate data from code, and to write out long, sprawling dialogue trees in separate files where all we have to care about is what the characters are actually saying to each other. This helps us keep code as code, and data as data, a bit more easily.

Traversing the Tree and Awaiting Input

I typically find that once I have a nicely typed design, implementation just kind of falls into place. Here, all we need to do to run through a dialogue tree is iterate over it, pattern match on what kind of DialogueStep we’re currently dealing with, and do the right thing. On a Line you might change the text of a UI element on screen to show that line, or play a sound file representing that line. On a Choice you might hand over (limited) control to the player, letting them select from a menu of options.

pub fn start_dialogue(
    player: &Player,
    dialogue_tree: DialogueTree,
) {
    // this might block other interaction, e.g. movement
    player.dialoguing = true;

    for step in dialogue_tree {
        match step {
            DialogueStep::Line(line) => _,
            DialogueStep::Choice(map) => _,
        }
    }

    // return back control
    player.dialoguing = false;
}

However, there’s one problem with this in the context of running code from a game engine. Typically we’d call start_dialogue from within an update function that’s running many times per second, once per frame of the game. But we obviously don’t want all the dialogue to take place in the span of one frame (unless you’re an incredibly fast reader).

Instead, we want a way to “pick up where we left off”, letting the player take their time to read the line and then provide an input to move on. When I think of code that wants to “save its place” in this way, I usually think of generators, or async/await. While you do have to be careful when mixing asynchronous and synchronous code (especially from within a complex multithreaded environment like a game engine), code complexity can be much lower when you await user input rather than decide to deal with the “place-saving” code yourself.