Dec 23 2007

perl: Word Search Generator

Published by at 11:26 am under Programming

Control

We want our program to be able to generate different variations of puzzles: different sizes of grids, some with easy ‘left-to-right only’ words, and others with words that are hidden in all directions. Plus, of course, the list of words needs to be different for each puzzle. So we need a method of supplying all of these parameters to the program. One way to do this is to create a configuration file that controls the program. This is just a plain text file, one that could be created with Notepad or any other text editor.

#
# file: animals.txt
# a configuration file for ws.pl
#
title=Animal Word Search
author=Steve Browning
pdf_filename=animal_word_search.pdf
#
size_x=10
size_y=10
directions=E,S
#
max_attempts=100
#
pdf_offset_x=100
pdf_offset_y=700
#
dog
cat
pig
cow
horse
fox
chicken
snake

Figure 3 – Configuration File

Looking at this configuration file you’ll probably recognize how it is defined. Any line starting with the # sign will be treated as a comment. Lines of the format <keyword>=<value> will be used to set those named parameters. And any other line in the file will be treated as a word to be placed into the grid of our puzzle.

OK, now for some perl code to read and parse the configuration file. This is from ws.pl:

# get the config filename from the command line
$config_filename=shift;
    
# if the config file is not readable, print a message and quit
if(! -r $config_filename) {
  print 'cannot read config file' . "\n";
  print 'usage: ws.pl [config.txt]' . "\n";
  exit;
}

# parse the config file
open(CONFIGFILE, "<$config_filename");
while(<CONFIGFILE>) {
  chomp;
  $line=strip_blanks($_);
  if($line=~m/^\#/ || $line=~m/^$/) {
  } elsif ($line=~m/^(.*)\=(.*)$/) {
    $$1=$2;
  } else {
    push(@words, $line);
  }
}
close(CONFIGFILE);

Figure 4 – Read/Parse Configuration File

When we run our program, we will do it by opening a Windows command window, and typing:

perl ws.pl animals.txt

The perl program starts by obtaining the name of the config file with shift, and making sure that file is readable with the -r test. It then loops through each line in the file with the while loop.

chomp is used to take the trailing newline off of each line.

strip_blanks is not a native perl function; it is a subroutine defined at the end of the source code to remove leading and trailing whitespace.
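The source of strip_blanks is not shown on this page. As a rough idea of what such a subroutine could look like, here is a minimal sketch; the body is an assumption and not the actual code from ws.pl:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the article's strip_blanks subroutine:
# returns a copy of its argument with leading and trailing whitespace removed.
sub strip_blanks {
    my ($text) = @_;
    $text =~ s/^\s+//;   # drop leading whitespace
    $text =~ s/\s+$//;   # drop trailing whitespace
    return $text;
}

print '[', strip_blanks("  size_x=10 \t"), "]\n";   # prints [size_x=10]
```

Whitespace inside the string is left alone, so a value such as "Animal Word Search" keeps its spaces.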

We then use regular expressions to interpret each line. If it starts with a # or is blank, we skip it (the first, empty branch of the if).

Then, in the elsif branch, we see if the line matches the format: some_text=some_other_text. If so, we “capture” the text to the left of the equal sign as $1 and the text to the right of the equal sign as $2. The parentheses in the regular expression cause this capture to happen. We create a new variable using the string in $1 as the variable name, and set it equal to $2. If our config file had title=A puzzle, then our program now has a variable $title set equal to ‘A puzzle’. The statement $$1=$2; in the code above does the magic. It is a symbolic reference, so it only works when ‘use strict’ is not in effect.
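If you would rather not create variables by name, the same parsing can be done with a hash. The sketch below is an alternative and not what ws.pl does; parse_config_lines is a hypothetical helper, and it takes the lines as a list so the example runs without a file:

```perl
use strict;
use warnings;

# Sketch: keep settings in a hash instead of creating variables by name.
# parse_config_lines is a hypothetical helper, not part of ws.pl.
sub parse_config_lines {
    my @lines = @_;
    my (%config, @words);
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;                 # trim both ends
        next if $line eq '' || $line =~ /^#/;    # skip blanks and comments
        if ($line =~ /^([^=]+)=(.*)$/) {
            $config{$1} = $2;                    # keyword=value, split at the first =
        } else {
            push @words, $line;                  # anything else is a word
        }
    }
    return (\%config, \@words);
}

my ($config, $words) = parse_config_lines(
    '# sample', 'title=A puzzle', 'size_x=10', '', 'dog', 'cat',
);
print "$config->{title}: @$words\n";   # prints "A puzzle: dog cat"
```

With this approach the title is read as $config->{title} rather than $title, and a typo in the config file cannot overwrite an unrelated variable in the program.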

For any line that doesn’t match our regular expressions, we simply treat it as a new word to be placed on our wordlist. We are building an array called @words which will contain all the words for our puzzle.

What happens if our configuration file doesn’t have all the data in it, or if some of the data is wrong? A little experience shows that our final PDF output is not going to work with a grid larger than 22×22 cells, and trying to place more than 30 words into the grid will likely not work. We’ll do a little error checking, although not all that is really needed. Since this is our own program we’ll trust ourselves to give good input.

# a little input checking
if($size_x>22 || $size_y>22) {
  print 'error: size_x or size_y cannot exceed 22' . "\n";
  exit;
}
if(scalar(@words)>30) {
  print 'error: maximum of 30 words allowed on list' . "\n";
  exit;
}

Figure 5 – A Little Input Checking
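The check above covers values that are too large, but not values that are missing. One possible way to handle a config file that leaves out a setting is to fill in defaults first. This is only a sketch: apply_defaults and the default values are assumptions chosen for illustration, not part of ws.pl.

```perl
use strict;
use warnings;

# Hypothetical defaults; the values are illustrative assumptions.
my %defaults = (size_x => 10, size_y => 10, directions => 'E,S', max_attempts => 100);

# Return a copy of the settings with any missing keys filled in.
sub apply_defaults {
    my (%settings) = @_;
    for my $key (keys %defaults) {
        $settings{$key} = $defaults{$key} unless defined $settings{$key};
    }
    return %settings;
}

my %settings = apply_defaults(size_x => 15);
print "$settings{size_x} x $settings{size_y}\n";   # prints "15 x 10"
```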

To move in any of the 8 compass directions on our grid, we’ll define hashes that tell us which way to step in the X and Y directions. Y increases as we move down the grid, so ‘South’ is a positive step in y. For example, ‘East’ means move +1 in the x direction and 0 in the y direction. ‘Southwest’ is -1 x, +1 y.

# define hashes for the increments to use for the 8 directions
$inc_x{'E'}=1;   $inc_y{'E'}=0;
$inc_x{'S'}=0;   $inc_y{'S'}=1;
$inc_x{'W'}=-1;  $inc_y{'W'}=0;
$inc_x{'N'}=0;   $inc_y{'N'}=-1;
$inc_x{'NE'}=1;  $inc_y{'NE'}=-1;
$inc_x{'SE'}=1;  $inc_y{'SE'}=1;
$inc_x{'SW'}=-1; $inc_y{'SW'}=1;
$inc_x{'NW'}=-1; $inc_y{'NW'}=-1;

Figure 6 – Helping Hashes
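To see how these hashes get used, here is a small sketch that lists the cells a word would cover from a given starting point. The word_cells helper and the starting coordinates are assumptions for illustration; the hashes are the same as in Figure 6, written in list form.

```perl
use strict;
use warnings;

my %inc_x = (E => 1, S => 0, W => -1, N => 0,  NE => 1,  SE => 1, SW => -1, NW => -1);
my %inc_y = (E => 0, S => 1, W => 0,  N => -1, NE => -1, SE => 1, SW => 1,  NW => -1);

# Hypothetical helper: list the [x, y] cell each letter of $word would occupy
# when the word starts at ($x, $y) and runs in direction $dir.
sub word_cells {
    my ($word, $x, $y, $dir) = @_;
    my @cells;
    for my $i (0 .. length($word) - 1) {
        push @cells, [ $x + $i * $inc_x{$dir}, $y + $i * $inc_y{$dir} ];
    }
    return @cells;
}

# A config value such as directions=E,S splits into a list of hash keys.
my @directions = split /,/, 'E,S';

for my $dir (@directions) {
    my @cells = word_cells('cat', 2, 3, $dir);
    print "$dir: ", join(' ', map { "($_->[0],$_->[1])" } @cells), "\n";
}
# prints:
# E: (2,3) (3,3) (4,3)
# S: (2,3) (2,4) (2,5)
```

The sketch does no bounds checking; a real placement routine would also need to reject cells that fall outside the grid.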

The preliminary work is all done. Let’s move ahead to the guts of the algorithm.
