Writing Self-Documenting Code

Writing clear, understandable code for complex systems can be challenging. In this post, we’ll explore the concept of self-documenting code as a technique for writing clearer code.

What is self-documenting code?

Self-documenting code is code that uses descriptive method and variable names that read like natural language. You may recognize the concept even if you haven’t heard the term before. Synonyms include “human-readable code” and “self-describing code.”
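As a small illustration, separate from the tic-tac-toe example later in this post, compare two versions of the same check. The method names, parameter names, and the `SECONDS_PER_DAY` constant are all hypothetical, and timestamps are assumed to be plain numbers of seconds:

```ruby
# Terse version: the reader has to decode what d, t, and 86_400 stand for.
def chk(d, t)
  t - d > 86_400
end

# Self-documenting version: the names and the constant state the intent.
SECONDS_PER_DAY = 86_400

def older_than_one_day?(created_at, now)
  now - created_at > SECONDS_PER_DAY
end
```

Both methods compute the same result; only the second one tells the reader what question is being asked.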

Why is it useful?

Writing understandable code is hard:

– People think and solve problems in different ways
– Systems are complex and sometimes require workarounds or unintuitive solutions
– Not everyone contributing to a codebase has the same context about the entire system
– Comments and documentation can become outdated

Self-documenting code aims to solve these issues by encouraging code that clearly states its intent in common language.

Tic-tac-toe: A refactoring example

The good news is that writing self-documenting code can be very intuitive, because it mimics the way that we naturally think. To give an example of how we can refactor code to be more self-documenting, let’s use the childhood game of tic-tac-toe.

Intro

For those unfamiliar with the game, it works as follows: There is a 3 x 3 grid and two players, ‘X’ and ‘O’. The two players take turns marking spots on the grid. The first player to mark three spots in a straight line wins the game.

I’ve written a preliminary implementation below. The code is pretty short, and the game is fairly simple. However, upon first glance, you might wonder, “What the heck is happening here?” Even for such a straightforward program, the code takes a moment to grok.

class TicTacToe
 LENGTH = 3
 PLAYER_1 = :x
 PLAYER_2 = :o

 def initialize
   @grid = Array.new(LENGTH) { Array.new(LENGTH) }
   @current_player = PLAYER_1
   @empty_spaces = LENGTH * LENGTH
   @game_over = false
 end

 def take_turn(x, y)
   raise 'Game over!' if @game_over
   raise 'Grid space out of bounds!' unless (x >= 0 && x < LENGTH && y >= 0 && y < LENGTH)
   raise 'This space has already been taken!' if @grid[x][y]
   
   @grid[x][y] = @current_player
   @empty_spaces -= 1
   @current_player = (@current_player == PLAYER_1) ? PLAYER_2 : PLAYER_1

   case
   when game_won?(x, y)
     @game_over = true
     :game_won
   when @empty_spaces == 0
     @game_over = true
     :game_over_no_winner
   else
     :next_turn
   end
 end

 def game_won?(x, y)
   max_index = LENGTH - 1

   LENGTH.times.all? { |z| @grid[x][z] == @grid[x][y] } ||
     LENGTH.times.all? { |z| @grid[z][y] == @grid[x][y] } ||
     LENGTH.times.all? { |z| @grid[z][z] == @grid[x][y] } ||
     LENGTH.times.all? { |z| @grid[z][max_index - z] == @grid[x][y] }
 end
end

Refactoring strategy

So how can we make this code easier to read, understand, and maintain? I’ll use the following refactoring technique:

1. Identify functionally significant, independent pieces of code
2. Extract these pieces of code into methods with human-readable names
3. Repeat until all code is sufficiently human-readable

Let’s start here with the `take_turn` method:

def take_turn(x, y)
  raise 'Game over!' if @game_over
  raise 'Grid space out of bounds!' unless (x >= 0 && x < LENGTH && y >= 0 && y < LENGTH)
  raise 'This space has already been taken!' if @grid[x][y]

  @grid[x][y] = @current_player
  @empty_spaces -= 1
  @current_player = (@current_player == PLAYER_1) ? PLAYER_2 : PLAYER_1

  case
  when game_won?(x, y)
    @game_over = true
    :game_won
  when @empty_spaces == 0
    @game_over = true
    :game_over_no_winner
  else
    :next_turn
  end
end

Specifically, let’s take a look at the first three lines of code above. This chunk of code appears to deal with error checking. Great! Let’s extract that out into its own method and name it something descriptive:

def check_valid_move(x, y)
  raise 'Game over!' if @game_over
  raise 'Grid space out of bounds!' unless (x >= 0 && x < LENGTH && y >= 0 && y < LENGTH)
  raise 'This space has already been taken!' if @grid[x][y]
end

Looking better already!

def take_turn(x, y)
  check_valid_move(x, y)

  @grid[x][y] = @current_player
  @empty_spaces -= 1
  @current_player = (@current_player == PLAYER_1) ? PLAYER_2 : PLAYER_1

  case
  when game_won?(x, y)
    @game_over = true
    :game_won
  when @empty_spaces == 0
    @game_over = true
    :game_over_no_winner
  else
    :next_turn
  end
end

But we’re not quite done with the error checking code yet. We can make this even more readable. Let’s try some more refactoring:

def in_bounds?(x, y)
  x >= 0 && x < LENGTH && y >= 0 && y < LENGTH
end    

def space_taken?(x, y)
  !@grid[x][y].nil?
end

With a couple of simple extractions, the error checking code is now even more readable:

def check_valid_move(x, y)
  raise 'Game over!' if @game_over
  raise 'Grid space out of bounds!' unless in_bounds?(x, y)
  raise 'This space has already been taken!' if space_taken?(x, y)
end

There isn’t much more code that can be pulled out of here, so let’s return to `take_turn`:

def take_turn(x, y)
  check_valid_move(x, y)

  @grid[x][y] = @current_player
  @empty_spaces -= 1
  @current_player = (@current_player == PLAYER_1) ? PLAYER_2 : PLAYER_1

  case
  when game_won?(x, y)
    @game_over = true
    :game_won
  when @empty_spaces == 0
    @game_over = true
    :game_over_no_winner
  else
    :next_turn
  end
end

The next chunk of code has a lot going on. What exactly is it doing? It appears to be marking the spot with the current player’s symbol, decrementing the number of empty spaces, and swapping players in preparation for the next turn. So, let’s update the code to say just that:

def mark_spot!(x, y)
  @grid[x][y] = @current_player
end

def decrement_spaces_left!
  @empty_spaces -= 1
end

def switch_players!
  @current_player = (@current_player == PLAYER_1) ? PLAYER_2 : PLAYER_1
end

Now:

def take_turn(x, y)
  check_valid_move(x, y)

  mark_spot!(x, y)
  decrement_spaces_left!
  switch_players!

  case
  when game_won?(x, y)
    @game_over = true
    :game_won
  when @empty_spaces == 0
    @game_over = true
    :game_over_no_winner
  else
    :next_turn
  end
end

Let’s tackle the final chunk of code:

def get_game_status(x, y)
  case
  when game_won?(x, y)
    @game_over = true
    :game_won
  when all_spots_taken?
    @game_over = true
    :game_over_no_winner
  else
    :next_turn
  end
end

def all_spots_taken?
  @empty_spaces == 0
end

For fun, let’s also go ahead and refactor the `game_won?` method:

def vertical_win?(x, y)
  LENGTH.times.all? { |z| @grid[x][z] == @grid[x][y] }
end

def horizontal_win?(x, y)
  LENGTH.times.all? { |z| @grid[z][y] == @grid[x][y] }
end

def main_diagonal_win?(x, y)
  LENGTH.times.all? { |z| @grid[z][z] == @grid[x][y] }
end

def antidiagonal_win?(x, y)
  max_index = LENGTH - 1
  LENGTH.times.all? { |z| @grid[z][max_index - z] == @grid[x][y] }
end

def diagonal_win?(x, y)
  main_diagonal_win?(x, y) || antidiagonal_win?(x, y)
end

def game_won?(x, y)
  vertical_win?(x, y) || horizontal_win?(x, y) || diagonal_win?(x, y)
end
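To see how these predicates read in practice, here is a minimal sketch that runs them against a hard-coded board. The `WinCheckDemo` class and its constructor are hypothetical scaffolding that exist only so the methods can be called in isolation. The sketch also assumes that `@grid[x]` holds column x, which is what the name `vertical_win?` implies.

```ruby
# Hypothetical scaffolding: holds a fixed grid so the win predicates
# from the post can be called directly.
class WinCheckDemo
  LENGTH = 3

  def initialize(grid)
    @grid = grid
  end

  def vertical_win?(x, y)
    LENGTH.times.all? { |z| @grid[x][z] == @grid[x][y] }
  end

  def horizontal_win?(x, y)
    LENGTH.times.all? { |z| @grid[z][y] == @grid[x][y] }
  end

  def main_diagonal_win?(x, y)
    LENGTH.times.all? { |z| @grid[z][z] == @grid[x][y] }
  end

  def antidiagonal_win?(x, y)
    max_index = LENGTH - 1
    LENGTH.times.all? { |z| @grid[z][max_index - z] == @grid[x][y] }
  end

  def diagonal_win?(x, y)
    main_diagonal_win?(x, y) || antidiagonal_win?(x, y)
  end

  def game_won?(x, y)
    vertical_win?(x, y) || horizontal_win?(x, y) || diagonal_win?(x, y)
  end
end

# Assumption: @grid[x] is column x, so :x fills column 0 on this board.
board = WinCheckDemo.new([
  [:x, :x, :x],    # column 0
  [:o, :o, nil],   # column 1
  [nil, nil, nil]  # column 2
])

# Pretend the last move was made at x = 0, y = 2.
puts board.vertical_win?(0, 2)    # prints "true"
puts board.horizontal_win?(0, 2)  # prints "false"
puts board.game_won?(0, 2)        # prints "true"
```

Note that every predicate compares against `@grid[x][y]`, so the checks only make sense when called with the spot that was just marked.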

Before:

def take_turn(x, y)
  raise 'Game over!' if @game_over
  raise 'Grid space out of bounds!' unless (x >= 0 && x < LENGTH && y >= 0 && y < LENGTH)
  raise 'This space has already been taken!' if @grid[x][y]

  @grid[x][y] = @current_player
  @empty_spaces -= 1
  @current_player = (@current_player == PLAYER_1) ? PLAYER_2 : PLAYER_1

  case
  when game_won?(x, y)
    @game_over = true
    :game_won
  when @empty_spaces == 0
    @game_over = true
    :game_over_no_winner
  else
    :next_turn
  end
end

After:

def take_turn(x, y)
  check_valid_move(x, y)

  mark_spot!(x, y)
  decrement_spaces_left!
  switch_players!
  get_game_status(x, y)
end

Not bad! The new code is much easier to read and understand. If we want to know the specifics of what the methods are doing, we can visit the definitions. We now also have a better idea of what the developer intended for the methods to do, which makes it easier to spot bugs or make modifications.
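For reference, here is one way the extracted pieces could be assembled into a single runnable class. This is a sketch rather than a definitive final version. It assumes that `check_valid_move` takes `(x, y)`, that `space_taken?` reads the `@grid` instance variable, and that every helper stays public. The sample game at the bottom is made up for illustration.

```ruby
class TicTacToe
  LENGTH = 3
  PLAYER_1 = :x
  PLAYER_2 = :o

  def initialize
    @grid = Array.new(LENGTH) { Array.new(LENGTH) }
    @current_player = PLAYER_1
    @empty_spaces = LENGTH * LENGTH
    @game_over = false
  end

  def take_turn(x, y)
    check_valid_move(x, y)
    mark_spot!(x, y)
    decrement_spaces_left!
    switch_players!
    get_game_status(x, y)
  end

  def check_valid_move(x, y)
    raise 'Game over!' if @game_over
    raise 'Grid space out of bounds!' unless in_bounds?(x, y)
    raise 'This space has already been taken!' if space_taken?(x, y)
  end

  def in_bounds?(x, y)
    x >= 0 && x < LENGTH && y >= 0 && y < LENGTH
  end

  def space_taken?(x, y)
    !@grid[x][y].nil?
  end

  def mark_spot!(x, y)
    @grid[x][y] = @current_player
  end

  def decrement_spaces_left!
    @empty_spaces -= 1
  end

  def switch_players!
    @current_player = (@current_player == PLAYER_1) ? PLAYER_2 : PLAYER_1
  end

  # Besides returning a status, this also records when the game has ended.
  def get_game_status(x, y)
    case
    when game_won?(x, y)
      @game_over = true
      :game_won
    when all_spots_taken?
      @game_over = true
      :game_over_no_winner
    else
      :next_turn
    end
  end

  def all_spots_taken?
    @empty_spaces == 0
  end

  def game_won?(x, y)
    vertical_win?(x, y) || horizontal_win?(x, y) || diagonal_win?(x, y)
  end

  def vertical_win?(x, y)
    LENGTH.times.all? { |z| @grid[x][z] == @grid[x][y] }
  end

  def horizontal_win?(x, y)
    LENGTH.times.all? { |z| @grid[z][y] == @grid[x][y] }
  end

  def main_diagonal_win?(x, y)
    LENGTH.times.all? { |z| @grid[z][z] == @grid[x][y] }
  end

  def antidiagonal_win?(x, y)
    max_index = LENGTH - 1
    LENGTH.times.all? { |z| @grid[z][max_index - z] == @grid[x][y] }
  end

  def diagonal_win?(x, y)
    main_diagonal_win?(x, y) || antidiagonal_win?(x, y)
  end
end

# Made-up sample game: X fills (0, 0), (0, 1), (0, 2) and wins on the fifth move.
game = TicTacToe.new
moves = [[0, 0], [1, 0], [0, 1], [1, 1], [0, 2]]
statuses = moves.map { |x, y| game.take_turn(x, y) }
p statuses  # prints [:next_turn, :next_turn, :next_turn, :next_turn, :game_won]
```

Any further call to `take_turn` on this game raises a `RuntimeError` with the message 'Game over!'.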

An added bonus: better tests

We’ve now seen how self-documenting code can make programs easier to read and understand. However, self-documenting code can also make programs easier to test. Self-documenting code encourages developers to write many small, single-function methods, rather than larger, multi-functional monolithic methods. In addition to being easier to understand, these smaller methods are also easier to test. This is because the input and output spaces for these smaller methods are more constrained than those of the larger, more complex methods.

As an example, try to imagine all of the tests that you would need to write to comprehensively test `take_turn`. It’s pretty difficult to enumerate all of the possible input and output conditions that you’d need to test for to ensure full coverage. Now, for comparison, imagine the tests that you’d need to write to fully test `mark_spot!` or `switch_players!`. Much easier, right? Since the purpose of each of these methods is very narrowly defined, the input and output spaces are also much simpler to enumerate. This pattern results not only in better test coverage, but also in clearer tests that serve as additional documentation regarding the intent of the code.
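To make that concrete, here is a minimal sketch of what tests for `mark_spot!` and `switch_players!` might look like using Minitest, which ships with standard Ruby installations. The class below is a trimmed-down copy containing only the two methods under test. The `attr_reader` line is an assumption added so the tests can observe state, since the original class does not expose its instance variables. The test names are hypothetical as well.

```ruby
require 'minitest/autorun'

# Trimmed-down copy of the class with only the two methods under test.
class TicTacToe
  LENGTH = 3
  PLAYER_1 = :x
  PLAYER_2 = :o

  # Assumption: readers added for testing; not part of the original class.
  attr_reader :grid, :current_player

  def initialize
    @grid = Array.new(LENGTH) { Array.new(LENGTH) }
    @current_player = PLAYER_1
  end

  def mark_spot!(x, y)
    @grid[x][y] = @current_player
  end

  def switch_players!
    @current_player = (@current_player == PLAYER_1) ? PLAYER_2 : PLAYER_1
  end
end

class TicTacToeTest < Minitest::Test
  def setup
    @game = TicTacToe.new
  end

  def test_switch_players_alternates_between_the_two_players
    @game.switch_players!
    assert_equal TicTacToe::PLAYER_2, @game.current_player

    @game.switch_players!
    assert_equal TicTacToe::PLAYER_1, @game.current_player
  end

  def test_mark_spot_writes_the_current_player_to_the_chosen_spot_only
    @game.mark_spot!(1, 2)

    assert_equal TicTacToe::PLAYER_1, @game.grid[1][2]
    assert_equal 1, @game.grid.flatten.compact.size
  end
end
```

Each test fits in a few lines because each method does exactly one thing, and the test names double as a plain-language description of the intended behavior.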

Conclusion

Writing understandable code is hard. Even the smallest programs, as the original tic-tac-toe code above shows, can be difficult to follow, which does not bode well for building large, complex, multi-contributor systems. However, using the principles of self-documenting code can help! Even if you already have code that isn’t self-documenting, the good news is that it is relatively painless to transform it: iteratively extract portions of code into small, single-purpose methods with descriptive names. In this way, even complex code can be made clearer and easier to understand, one extraction at a time.