Deriving State

So, one of the established “Best Practices” with React is to always store the minimum amount of data in state. If something can be derived from state, it should be.

But what does that actually mean? How does it work in practice?

Let's walk through an example, from the Wordle project we saw earlier.

Video Summary

I filmed this video about a month after the course initially launched. Since then, hundreds of people have gone through this project, and submitted their solutions. I've noticed some trends, some places where there seems to be some confusion. One of them is around how to structure state.

In the Wordle project, the main piece of state is the guesses array. It holds an array of strings:

const [guesses, setGuesses] = React.useState([]);
// ['HELLO', 'WORLD']

Now, this isn't all of the data we need in order to render the UI. We also need to know the status of every letter in the guess (whether a particular letter is incorrect, misplaced, or correct).

I calculate that status in the Guess component:

function Guess({ value, answer }) {
  const result = checkGuess(value, answer);

  return (
    <p className="guess">
      {range(5).map((num) => (
        <Cell
          key={num}
          letter={result ? result[num].letter : undefined}
          status={result ? result[num].status : undefined}
        />
      ))}
    </p>
  );
}

The checkGuess function turns a string like "HELLO" into an array of objects:

console.log(result);
/*
  [
    { letter: 'H', status: 'incorrect' },
    { letter: 'E', status: 'incorrect' },
    { letter: 'L', status: 'incorrect' },
    { letter: 'L', status: 'incorrect' },
    { letter: 'O', status: 'misplaced' },
  ]
*/
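If you're curious what a function like this might look like on the inside, here's a minimal sketch. To be clear, this is not the course's actual checkGuess implementation: the two-pass approach and the null return for an empty guess are my assumptions (the null return mirrors the `result ? ... : undefined` check in Guess), and 'WORMY' is a hypothetical answer I picked because it happens to produce the output shown above.

```javascript
// Hypothetical sketch of a checkGuess-style helper.
// The real implementation is not shown in this lesson and may differ.
function checkGuess(guess, answer) {
  // No guess yet means there is nothing to check.
  if (!guess) {
    return null;
  }

  const guessChars = guess.toUpperCase().split('');
  // Letters from the answer that haven't been matched yet.
  const remaining = answer.toUpperCase().split('');

  // Start by assuming every letter is incorrect.
  const result = guessChars.map((letter) => ({
    letter,
    status: 'incorrect',
  }));

  // First pass: right letter in the right position.
  guessChars.forEach((letter, index) => {
    if (remaining[index] === letter) {
      result[index].status = 'correct';
      remaining[index] = null; // this answer letter is now used up
    }
  });

  // Second pass: right letter in the wrong position.
  guessChars.forEach((letter, index) => {
    if (result[index].status === 'correct') {
      return;
    }
    const foundAt = remaining.indexOf(letter);
    if (foundAt !== -1) {
      result[index].status = 'misplaced';
      remaining[foundAt] = null;
    }
  });

  return result;
}

// 'WORMY' is a made-up answer, used only for this example.
const helloStatuses = checkGuess('HELLO', 'WORMY').map((cell) => cell.status);
console.log(helloStatuses);
// ['incorrect', 'incorrect', 'incorrect', 'incorrect', 'misplaced']
```

The two passes are there so that a letter in the answer can only be "claimed" once, which matters when a guess repeats a letter.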

Many students have wondered: wouldn't it make more sense to do this calculation on submit?

It feels wasteful to have to recalculate result on every single render. Whenever the user submits a new guess, we repeat this work for every previous guess!

What if we did this work on submit, and stored this array in state directly?

function handleSubmitGuess(tentativeGuess) {
  // Instead of storing the guess itself,
  // store the calculated result:
  const result = checkGuess(tentativeGuess, answer);

  const nextGuesses = [...guesses, result];
  setGuesses(nextGuesses);
}

This approach works just fine, but there are two things I want to dig into:

1. Whether the performance concern is valid
2. Which approach is more maintainable

For the performance, I set up a benchmark, as discussed in Module 2. It looked something like this:

const start = Date.now();
const numberOfIterations = 100_000;

let result;
for (let i = 0; i < numberOfIterations; i++) {
  result = checkGuess(tentativeGuess, answer);
}

const totalTimeTaken = Date.now() - start;
console.log(totalTimeTaken / numberOfIterations);

This tells me the number of milliseconds it takes to generate that array of objects from a guess. I repeat the calculation many times (100k) to get a realistic estimate for a single calculation (otherwise, it will always round to 0).

The result: on my machine, the calculation takes 0.5 microseconds.

For context, 100 milliseconds is the generally-accepted threshold for how much time has to pass for it to be noticeable. If the user submits their guess, and the result is shown in 50ms, or 75ms, or 90ms, it feels more-or-less instantaneous. If it takes 110ms, that will feel like a very short delay.

And so, when we do the math: the checkGuess function would need to be 200,000x slower for it to become noticeable by the user.
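Here's that math spelled out, using the two numbers from above (the 100ms threshold and the 0.5 microsecond measurement). The variable names are just ones I made up for this snippet:

```javascript
// Work in microseconds so both numbers use the same unit.
const thresholdInMicroseconds = 100 * 1000; // 100ms perceptibility threshold
const timePerCheckInMicroseconds = 0.5; // measured cost of one checkGuess call

// How many times slower could checkGuess get before a single call
// hits the threshold?
const slowdownFactor = thresholdInMicroseconds / timePerCheckInMicroseconds;

console.log(slowdownFactor); // 200000
```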

Granted, my computer is more powerful than a budget Android smartphone, but it's not 200,000 times faster. I feel confident that there are no circumstances in which this "guess calculation" process will affect the performance in any sort of noticeable way.

But what if it did affect the performance? Well, in that case, we could use the memoization techniques discussed in Module 3!

For example, we could memoize the component:

const Guess = React.memo(
  function Guess({ value, answer }) {
    const result = checkGuess(value, answer);

    return (
      <p className="guess">
        {range(5).map((num) => (
          <Cell
            key={num}
            letter={result ? result[num].letter : undefined}
            status={result ? result[num].status : undefined}
          />
        ))}
      </p>
    );
  }
);

By turning Guess into a pure component, we guarantee that it will only re-render when the value or answer props change. This means that the checkGuess calculation will only happen when it needs to happen.
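To make the idea concrete, here's a framework-free sketch of "skip the work when the inputs haven't changed". This is not how React.memo is implemented; memoizeLast and expensiveCheck are hypothetical names, and the string-returning function is only a stand-in for checkGuess so that we can count how often it runs.

```javascript
// Illustration only: remembers the most recent arguments and result,
// and re-runs `fn` only when an argument actually changes.
function memoizeLast(fn) {
  let lastArgs = null;
  let lastResult;

  return function memoized(...args) {
    const isSameAsLastTime =
      lastArgs !== null &&
      args.length === lastArgs.length &&
      args.every((arg, index) => Object.is(arg, lastArgs[index]));

    if (!isSameAsLastTime) {
      lastResult = fn(...args);
      lastArgs = args;
    }

    return lastResult;
  };
}

// Stand-in for checkGuess that counts its own calls.
let calls = 0;
function expensiveCheck(value, answer) {
  calls += 1;
  return `${value} checked against ${answer}`;
}

const memoizedCheck = memoizeLast(expensiveCheck);

memoizedCheck('HELLO', 'WORMY'); // runs the calculation
memoizedCheck('HELLO', 'WORMY'); // same inputs: reuses the last result
memoizedCheck('WORLD', 'WORMY'); // new value: runs again

console.log(calls); // 2
```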

Alright, so hopefully I've convinced you that we don't have to worry about performance in cases like this. We can focus on what the best approach is in terms of maintainability. Which approach is easiest for us to work with?

Well, let's suppose we're brand-new to this codebase. Which state shape is easier for you to understand?

// Option 1
['HELLO', 'WORLD', 'THIRD'];

// Option 2
[
  [
    { letter: 'H', status: 'incorrect' },
    { letter: 'E', status: 'incorrect' },
    { letter: 'L', status: 'incorrect' },
    { letter: 'L', status: 'incorrect' },
    { letter: 'O', status: 'misplaced' },
  ],
  [
    { letter: 'W', status: 'correct' },
    { letter: 'O', status: 'correct' },
    { letter: 'R', status: 'correct' },
    { letter: 'L', status: 'incorrect' },
    { letter: 'D', status: 'incorrect' },
  ],
  [
    { letter: 'T', status: 'incorrect' },
    { letter: 'H', status: 'incorrect' },
    { letter: 'I', status: 'incorrect' },
    { letter: 'R', status: 'misplaced' },
    { letter: 'D', status: 'incorrect' },
  ],
]

By storing the minimum amount of data in state, we keep that state easier to conceptualize. There's something really elegant about the state being a “pure” representation of the data.

Over time, apps tend to grow more and more complex, as we add more and more features. If we had 10 state variables in the Game component, wouldn't it be nicer if they were easier to reason about?

This also applies when it comes to state updates.

Let's suppose we're working on Wordle as our full-time job. Our manager barges into our cubicle and says "We have a problem. The users are getting frustrated. We want to let users edit their previous guesses".

How would we tackle this? Well, if our state is an array of strings, we could do something like this:

function updateGuess(guessIndex, newValue) {
  const nextGuesses = [...guesses];
  nextGuesses[guessIndex] = newValue;

  setGuesses(nextGuesses);
}

We would expect this function to change the state like this:

console.log(guesses);
// ['HELLO', 'WORLD'];

updateGuess(1, 'WORKS');
// Next state: ['HELLO', 'WORKS'];

Because our state is in a relatively simple shape — an array of strings — it's a pretty straightforward transformation.

But in the alternative approach, where the state is in a much more complicated shape, the updates become much more challenging. We'd either need to do a complex data manipulation (editing the objects within an array within an array), or we'd have to duplicate the work to generate that data (the checkGuess function).
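Here's a rough sketch of what the "duplicate the work" version of that update might look like. I've written it as a plain function that takes the current state and returns the next one, so it runs outside of React. The checkGuess below is a simplified stand-in (it ignores repeated-letter rules), updateDerivedGuess is a hypothetical name, and 'WORMY' is a made-up answer.

```javascript
// Simplified stand-in for checkGuess (an assumption, not the course's code).
// It ignores the repeated-letter rules a real Wordle checker needs.
function checkGuess(guess, answer) {
  return guess.split('').map((letter, index) => {
    if (answer[index] === letter) {
      return { letter, status: 'correct' };
    }
    if (answer.includes(letter)) {
      return { letter, status: 'misplaced' };
    }
    return { letter, status: 'incorrect' };
  });
}

// Hypothetical update for state that stores the derived results.
function updateDerivedGuess(guesses, guessIndex, newValue, answer) {
  const nextGuesses = [...guesses];
  // We can't just drop the new string in. The update has to know about
  // the answer, and it has to rebuild the whole array of objects.
  nextGuesses[guessIndex] = checkGuess(newValue, answer);
  return nextGuesses;
}

const answer = 'WORMY'; // made-up answer for this example
const guesses = [checkGuess('HELLO', answer), checkGuess('WORLD', answer)];

const nextGuesses = updateDerivedGuess(guesses, 1, 'WORKS', answer);
console.log(nextGuesses[1].map((cell) => cell.letter).join('')); // WORKS
```

Notice that the update now depends on both checkGuess and answer, two things the array-of-strings version never had to think about.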

When we store derived data in state, it means we have to repeat those calculations whenever we want to change the state. This can become a significant burden.

But if we store the minimum representation in state, we can change the state much more easily. When the state changes, the component re-renders, and any calculations to derive data will re-run, generating the new values automatically.

There are exceptions (which we'll talk about shortly), but as a general rule, our life becomes much easier when we don't store derived data in state.

You can check out the alternative approach, without deriving state, on GitHub. But please keep in mind, I don't recommend this approach.

More info: