#!/usr/bin/perl -w

# Boris - Generate symbol sequences
# John Fisher Ver.1 Nov 96
# Do what you like with it but leave these three lines

# Jim Henry 2000/8
# cleaning up some inefficient regexes

# 2009/7
# changing to use command-line parameters instead of interactively asking for
# param file and number of words

# $Log: boris.pl,v $
# Revision 1.28  2010-09-23 17:07:01  jim
# Fix defect with -S separator argument.
#
# Revision 1.27  2010-09-23 16:55:30  jim
# Fix sleeptime defect.
#
# Revision 1.26  2009-08-09 04:15:42  jim
# Move more file-scope variables into the %global_vars hash so the user
# can set them at runtime.  Improvements to cmd_set() incl. error checks
# on var names and values and parsing of \t \n \s for separator.
#
# Revision 1.25  2009-08-08 05:43:25  jim
# Implement 'save' command.  Add 'file=' argument to 'dump' command.
# Let user get help on several commands at once with serial args to
# 'help' command.
#
# Revision 1.24  2009-08-08 04:34:51  jim
# Start using %global_vars hash for lang and wordcount values that the user can
# set with 'set' command to affect default behavior of other cmds when given no
# arguments
#
# Revision 1.23  2009-08-08 03:33:24  jim
# Implement interactive non-meta mode, -s sleeptime and -S separator command-line
# arguments.
#
# Revision 1.22  2009-08-07 07:39:27  jim
# More tracing, and more fixes for mutate; works more often but not always.
#
# Revision 1.21  2009-08-06 05:27:59  jim
# More trace functions.  Fix memory leak problem with original language being mangled
# and its saved_vals hash growing indefinitely as we created mutated daughter langs from
# it.  Partial fix of problem with parenthesized tokens; but still doesn't work right
# more often than not.
#
# Revision 1.20  2009-08-05 02:18:42  jim
# Finished refactoring interactive_mode() to use a hash of subroutine refs for
# functions per specific commmands.   Further work on mutate_tree(): it now
# works intermittently, about a third of the time, which is better than before,
# but it also crashes sometimes due to stringify() encountering a node with
# no 'value'.
#
# Revision 1.19  2009-08-05 00:36:06  jim
# Working on fixes to mutate_tree() to handle parenthesized tokens correctly;
# still doesn't work right.   Start work on rewriting interactive_mode()
# with more flexible architecture; not finished, but it's stable and works
# as it is.
#
# Revision 1.18  2009-08-04 00:57:08  jim
# Better tracing, writing to file if in interactive mode.  More work on user interface,
# including 'show' and 'words' commands.  Fix bug in UI's use of mutate and get_sequence.
# But mutate function itself still doesn't work right; can't figure out why but it's
# making identical copies instead of mutated ones.
#
# Revision 1.17  2009-08-03 02:56:46  jim
# Converting SymbolGenerator to use instance variables instead of package variables
# for most purposes.   Not finished yet; code is untestable at the moment.
#
# Revision 1.16  2009-08-02 08:11:30  jim
# first draft, untested, of mutate and mutate_tree functions
# can't test this until I rewrite the whole package as a proper
# class moving package variables into the anon hash etc.
#
# Revision 1.15  2009-08-02 04:49:59  jim
# a bit more work on interactive meta mode, far from finished
#
# Revision 1.14  2009-08-02 03:26:40  jim
# Start working on meta mode and interactive mode
# Can't do much more until I write the mutate function in SymbolGenerator.pm
#
# Revision 1.13  2009-08-02 02:30:55  jim
# Further work on representing symbol string as tree.  Mostly works now, except
# for some spurious spaces inserted when working with metaphonology.boris; haven't
# yet figured out how to reproduce it with a simpler test format.   Added -r
# option to seed randomizer with specific seed for consistency in debugging.
# Added code to strip spaces surrounding : operator, as we do with + and -.
#
# Revision 1.12  2009-08-01 11:40:04  jim
# Partially finished factoring out generator class from boris.pl to SymbolGenerator.pm, and rewriting
# expand_string to work on tree rather than a flat result string.  Not finished, doesn't work yet.
#
# Revision 1.11  2009-08-01 04:26:52  jim
# fix defect where wasn't slurping whole paragraphs so statements split across lines were broken
#
# Revision 1.10  2009-07-31 21:09:41  jim

# Modify display_usage behavior, add credits and URL, don't read stdin
# if parameter file not specified; deep recursion check; probability
# bias error checking

# Revision 1.9  2009-07-31 02:33:13  jim
# handle slash-escaping of parentheses
#
# Revision 1.8  2009-07-31 02:17:51  jim
# implemented parenthesizing of tokens for retrieval of earlier values
#
# Revision 1.7  2009-07-31 00:06:02  jim
# Handle escaped colons, and simplify handling of escaped semicolons
#
# Revision 1.6  2009-07-30 21:03:56  jim
# Handle slash-escaped plus and minus
#
# Revision 1.5  2009-07-27 19:36:52  jim
# handle escaping of semicolons.  next: escaping of other metachars
#
# Revision 1.4  2009-07-27 19:17:43  jim
# added comment handling and started working on slash-escape handling
#
# Revision 1.3  2009-07-25 19:51:42  jim
# windows newline fix
#
# Revision 1.2  2009-07-25 19:49:58  jim
# adding command line options and 'use strict' and '-w' cleanup


use strict;
use SymbolGenerator;

# bitwise values for $run_mode
use constant DIRECT => 0;
use constant META => 2;
use constant INTERACTIVE => 4;

my $debug = 0;
my $param_file;
my $output_file;
my $rand_seed;
my $run_mode = DIRECT;

my %global_vars;
$global_vars{wordcount} = 1;
$global_vars{langcount} = 2;
$global_vars{percent} = 1;
my $sleep_time; 
my $default_sleep_time = 0.2;


my $credits = "Boris - a flexible word / sentence generator
Original version by John Fisher, current version by Jim Henry
http://www.pobox.com/~jimhenry/conlang/conlang.htm\n";


###TODO put help_brief and help_long info in this hash, then make
# cmd_help print all help_brief msgs one per line when user doens't
# ask about a specific cmd

my %command_info = (
    'mutate' => { function => \&cmd_mutate,
		  help =>
"m, mutate: Make one or more mutated versions of the current language or another specified language.
Arguments (all optional): 
	lang=<language ID>
	n=<number of daughter languages to make>
	pct=<percentage mutation rate>" },

    'recreate' => { function => \&cmd_recreate,
		    help =>
"r, recreate: Create one or more new languages from scratch.
Argument (optional):
	n=<number of new unrelated languages to make>" },

    'words' => { function => \&cmd_words,
		 help =>
"w, words: Output more words in the current or a specified languages.
Arguments (all optional):
	lang=<language ID>
	n=<number of words to output>
	file=<name of file to write words to>" },

    'help' => {  function => \&cmd_help,
		 help =>
"h, help: Display help, in general or on a specified command or variable
Argument: name of command or variable to get help on" },

    'quit' => { function => \&cmd_quit,
		help => 
"q, quit, exit:  Quit Boris (immediately, without saving anything; see 'save' and 'words')" },

    'show' => { function => \&cmd_show,
		help => 
"show: Show the phonology format of the current or a specified language
Argument:
	lang=<language ID>" },

    'save' => { function => \&cmd_show,
		help => 
"save: Save the format of the current or a specified language to a file
Arguments:
	lang=<language ID>		(optional)
	file=<name of file to save format to>	(required)

See also 'words', which writes words of a given language to the console
or to a specified file, and 'show', which displays the format of a language." },

    'load' => { function => \&cmd_unimplemented,
		help => "load: not implemented yet" },

    'list' => { function => \&cmd_unimplemented,
		help => "ls, list: not implemented yet" },
    
    'set' => { function => \&cmd_set,
	       help =>
"set:  Set one or more variables (which control default behavior of other commands), or 
	display current variable settings." },

    'dump' => { function => \&cmd_dump,
		help =>
"dump: print internal debug information about specified or current language
Arguments (optional):
	lang=<language ID>
	file=<name of file to write debug info to> 	... write to console if not specified"},
    );



my %command_aliases = (
    'm' => 'mutate',
    'r' => 'recreate',
    'w' => 'words',
    's' => 'save',
    'h' => 'help',
    '?' => 'help',
    'l' => 'list',
    'q' => 'quit',
    'exit' => 'quit',
    'view' => 'show',
    );

my %var_info = (
    'langcount' => { pattern => "^[1-9][0-9]*\$",
		     format_err => "must be a positive integer",
		     help => "Number of languages to create or mutate at a time" },

    'wordcount' => { pattern => "^[1-9][0-9]*\$",
		     format_err => "must be a positive integer",
		     help => "Number of words of a given language to print at a time" },

    'percent' => { pattern => "^[.0-9]*\$",
		     format_err => "must be a positive real number between 0 and 100",  ###TODO regex wont' check that fully, need eval{} code maybe...
		     help => "Percentage rate of mutation (probability of mutating from root of phonology format tree)" },

    'separator' => { pattern => "", 
		     format_err => "",
		     help => "Character or string to separate words in output; can use \\n \\t \\s to specify newline, tab, or space" },

    'lang' => { pattern => "[^\\s]",
		     format_err => "must be nonempty",
		     help => "Name/ID of language to work on by default" },
    );


my $general_help = "Commands:

m, mutate              Mutate format
r, recreate            Recreate format from scratch
save                   Save format to file
words                  Output more words in the current or specified language
set                    Display or set variable settings
show                   Show the format of the current or specified language
dump                   Dump internal debug information about the current or specified language
q, quit                Quit
h, help <command>      Get help on specified command";



#============================================================================


&parse_args (@ARGV);
# this is necessary so the command line args won't be interpreted as 
# input filenames when we go into interactive mode
@ARGV = ();

&trace ("\$run_mode == $run_mode\n");

# in debug mode, we want the output to be consistent and predictable from one run to the next
# so we seed the randomizer only if we're not in debug mode
if ( (not $debug) and (not defined $rand_seed) ) {
    srand;
}

if ( defined $rand_seed ) {
    srand $rand_seed;
}


if ( $run_mode & META ) {
    if ( not defined $global_vars{separator} ) {
	$global_vars{separator} = "\t";
    }
    if ( $run_mode & INTERACTIVE ) {
	&interactive_mode;
    } else {
	my $generator = SymbolGenerator->new( 'paramfile' => $param_file, 'debug' => $debug, 'run_mode' => $run_mode );
	if ( not $generator->{valid} ) {
	    die "Creation of SymbolGenerator based on $param_file failed\n";
	}
	my $params = $generator->generate_sequence;
	my $wordmaker = SymbolGenerator->new( 'paramstring' => $params, 'debug' => $debug );
	if ( not $wordmaker->{valid} ) {
	    die "Creation of SymbolGenerator using format generated by generator based on $param_file failed\n";
	}
	
	for ( 1..$global_vars{wordcount} ) {
	    print $wordmaker->generate_sequence, $global_vars{separator};
	}
	print "\n";
    }
} else {
    if ( not defined $global_vars{separator} ) {
	$global_vars{separator} = "\n";
    }

    my $generator = SymbolGenerator->new( 'paramfile' => $param_file, 'debug' => $debug, 'run_mode' => $run_mode );
    if ( not $generator->{valid} ) {
	die "Creation of SymbolGenerator based on $param_file failed\n";
    }

    if ( (not $run_mode & INTERACTIVE) and defined $output_file ) {
	open OUTPUT, ">" . $output_file		or die "can't open $output_file for writing\n";
	select OUTPUT;
    } else {
	$| = 1;		# force autoflush on stdout
    }

    for ( 1..$global_vars{wordcount} ) {
	print $generator->generate_sequence . $global_vars{separator};
    }

    if ( $run_mode & INTERACTIVE ) {
	while ( 1 ) {
	    # keep going until the user presses ENTER

	    ###TODO would like to make this work until user presses any key, and not allow their keypresses
	    # to be echoed to STDOUT while we're busy; not sure how to do that with Perl yet...

	    my ($rin, $win, $ein) = ("", "", "");
	    vec( $rin, fileno(STDIN), 1) = 1;
	    if ( select( $rin, $win, $ein, $sleep_time ) ) {
		my $keypress = <STDIN>;
		exit(0);
	    }
	    print $generator->generate_sequence . $global_vars{separator};
	}
    }
}

exit(0);

#=====================================================


my $init_trace = 0;

sub trace {

    my $msg = $_[0];
    my $level = defined $_[1] ? $_[1] : 1;
    if ( $debug < $level ) {
	return;
    }
    if ( $run_mode & INTERACTIVE ) {
	if ( not $init_trace ) {
	    my $debug_file = "boris_log.txt";	
	    open DEBUG, ">>" . $debug_file	or die "can't append to " . $debug_file . "\n";
	    print DEBUG "\n========================================\n\nTraced run started at " . localtime() . "\n\n";
	    $init_trace = 1;
	}
	print DEBUG $msg;
    } else {
	print STDERR $msg;
    }
}




sub parse_args {
    my $arg;
    while ( $arg = shift ) {
	&trace ("processing \$arg == $arg\n");

	# stop processing arguments if find '--' or an argument that 
	# doesn't start with '-'; in the latter case, first put the arg
	# back for processing by <ARGV>

	if ( $arg eq "--" ) {
	    last;
	}
	
	if ( $arg =~ m/^[^-]/ ) {
	    unshift @_, $arg;
	    last;
	}
	
	if ( $arg eq "-d" ) {
	    $debug = shift;
	    if ( not defined $debug or $debug !~ m/[0-9]+/ ) {
		unshift @_, $debug;
		$debug = 1;
	    }
	    next;
	}

	if ( $arg eq "-r" ) {
	    $rand_seed = shift;
	    if ( not defined $rand_seed  or  $rand_seed !~ m/^\-?[0-9]+$/ ) {
		die "Argument to -r option must be an integer\n";
	    }
	    next;
	}

	if ( $arg eq "-n" ) {
	    $global_vars{wordcount} = shift;
	    if ( $global_vars{wordcount} !~ /^[0-9]+$/ ) {
		die "Argument to -n option must be a positive integer\n";
	    }
	    next;
	}
	
	if ( $arg eq "-i" ) {
	    $param_file = shift;
	    if ( not defined $param_file ) {
		die "-i option requires a filename argument\n"
	    }
	    next;
	}

	if ( $arg eq "-o" ) {
	    $output_file = shift;
	    if ( not defined $output_file ) {
		die "-o option requires a filename argument\n"
	    }
	    next;
	}

	if ( $arg eq "-m" ) {
	    $run_mode |= META;
	    next;
	}

	if ( $arg eq "-I" ) {
	    $run_mode |= INTERACTIVE;
	    next;
	}

	if ( $arg eq "-s" ) {
	    $sleep_time = shift;
	    if ( not defined $sleep_time or $sleep_time !~ m/^[0-9.]+$/ 
		 and $sleep_time ne 'e' and $sleep_time ne 'pi' ) {
		die "Argument to -s option must be a positive real number\n"
	    }
	    if ( $sleep_time eq 'e' ) {
		$sleep_time = 2.71828182845904523536;
	    } elsif ( $sleep_time eq 'pi' ) {
		$sleep_time = 3.14159265358979323846;
	    }
    
	    next;
	}

###TODO here and in cmd_set, also support \0 notation for
### null-delimited output, in case user wants to output symbol strings
### that include spaces and newlines.  Other modules of the 
### conlang generation software suite should have an arg that tells
### them their input is going to be delimted witih \0 null.

	if ( $arg eq "-S" ) {
	    $global_vars{separator} = shift;
	    if ( not defined $global_vars{separator} 
		 or $global_vars{separator} eq '' ) 
	    {
		die "-S argument for word output separator requires an argument\n";
	    }
	    if ( $global_vars{separator} =~ m/\\t/ ) 
	    {
		$global_vars{separator} =~ s/\\t/\t/g;
	    } 
	    if ( $global_vars{separator} =~ m/\\n/ ) 
	    {
		$global_vars{separator} =~ s/\\n/\n/g;

	    }
	    next;
	}

	# unrecognized option or -h or -?
	&display_usage;
	exit;
    }
    
    if ( not defined $param_file ) {
	&display_usage;
	exit;
    }

    # consistency checks
    if ( defined $output_file and $run_mode & INTERACTIVE ) {
	warn "Setting -o output filename has no effect if in -I interactive mode\n";
    }

    if ( defined $sleep_time and not ( $run_mode & INTERACTIVE ) ) {
	warn "Setting -s sleep time has no effect if not in -I interactive mode\n";
    }

    if ( not defined $sleep_time and ( $run_mode & INTERACTIVE ) ) {
	$sleep_time = $default_sleep_time;
    }
}




sub display_usage {
    # strip the path from the filename by which we were called
    my $boris_filename = $0;
    $boris_filename =~ s/.*\///;

    print $credits;

    print STDERR <<HELP;

Usage: $boris_filename  -i <filename> [-n N] [-o <filename>] [-d]

-i <filename>   parameter file
-o <filename>   output file (if not specified, write to standard output; has no effect in interactive
                mode)
-n N            output N words, default 1
-m              meta mode: -i parameter file is a metaformat which generates formats for Boris
                In meta mode, -n argument apples to number of words to generate using 
                the format(s) generated by the metaformat
-I              interactive mode - generate -n N words all at once, then generate more words while pausing
                -s seconds after each, until user presses ENTER
                  OR, if -m meta option also given, generate N words with the first phonology generated by
                the -i meta-format, and then accept interactive commands to make new phonologies or mutate
                an existing phonology, etc.
-s N            sleep N seconds between output words when in interactive mode (no effect in meta mode)
-S <sep>        Use <sep> as the separator between output words.  Default is tab in meta mode, otherwise newline.
                You can use \\t or \\n to specify tab or newline.
-h              Get help (this message)
-d N            turn on debugging, with bitmask N; prints debug trace messages and refrains from seeding
                the random random number generator unless -r option is also given
-r N            set seed number for random number generator (possibly useful for consistency of behavior 
                when debugging)

HELP

}



my $base_lang_idx;
my %daughter_counter;
my %phonology_maker;
my %word_maker;


sub interactive_mode {
    print $credits;
  
    $base_lang_idx = 1;
    $phonology_maker{ $base_lang_idx } = SymbolGenerator->new( 'paramfile' => $param_file, 'debug' => $debug, 'run_mode' => $run_mode );
    if ( not $phonology_maker{ $base_lang_idx }->{valid} ) {
	die "Creation of SymbolGenerator based on $param_file failed\n";
    }
    my $params = $phonology_maker{ $base_lang_idx }->generate_sequence;
    $word_maker{ $base_lang_idx } = SymbolGenerator->new( 'paramstring' => $params, 'debug' => $debug );
    if ( not $word_maker{ $base_lang_idx }->{valid} ) {
	die "Creation of SymbolGenerator using format generated by generator based on $param_file failed\n";
    }

    print "\nWords of language $base_lang_idx:\n\n";
    for ( 1..$global_vars{wordcount} ) {
	print $word_maker{ $base_lang_idx }->generate_sequence, $global_vars{separator};
    }

    print "\nType h for help\n";

    my @command_history;		###TODO do something useful with this
    $global_vars{lang} = $base_lang_idx;
    while ( 1 ) {
	print "Command:";
	my $input = <>;
	if ( $input =~ m/^\s*([\?a-z]+)\s*(.*)/ ) {
	    my ($command, $argstr ) = ($1, $2);
	    if ( defined $command ) {
		my $dispatcher;
		if ( defined $command_info{ $command } ) {
		    $dispatcher = $command_info{ $command };
		} elsif ( defined $command_aliases{ $command } ) {
		    $dispatcher = $command_info{ $command_aliases{ $command } };
		} else {
		    print "Unrecognized command '$command'\n";
		    # print "commands:\n" . (join "\n", (keys %command_info));
		    &cmd_help;
		    print "\n\n";
		    next;
		}

		push @command_history, $input;
		
		$argstr =~ s/\s+$//;		# strip trailing spaces so the split will work right
		$argstr =~ s/\s*=\s*/=/g;	# strip whitespace surrounding equal signs
		my @pairs = split /\s+/, $argstr;
		my %vars;
		foreach ( @pairs ) {
		    my ($var, $val) = split "=", $_;
		    $vars{ $var } = $val; 
		}

		&{ $$dispatcher{function} }( $input, %vars );
	    } else {
		print "Unrecognized command '$command'\n";
		&cmd_help;
	    }
	} else { 
	    ###TODO on a blank line, repeat the last command (if it's safe to do so)
	    # else complain and display help
	    print "Unrecognized command '$input'\n";
	    &cmd_help;
	}
	print "\n\n";
    } # end while
} # end sub interactive_mode()

=pod

	    &show_help( $1 );

	} elsif  ( $command =~ m/^ *m *(.*)/ ) { # mutate 
	    ###TODO better err checking on args, and generalize this code for use with other cmds
	    my %local_vars;
	    if ( defined $1 && $1 ne '' ) {
		my $args = $1;
		$args =~ s/\s+$//;
		my @pairs = split /\s+/, $args;
		foreach ( @pairs ) {
		    my ($var, $val) = split "=", $_;
		    $local_vars{ $var } = $val; 
		}
	    }

	    # m command with lang=ID option resets the default language for later commands
	    if ( defined $local_vars{lang} ) {
		$lang_id = $local_vars{lang};
	    }
	    my $mutation_pct = defined $local_vars{pct} ?  $local_vars{pct} : 10;
	    my $mutate_count = defined $local_vars{n} ? $local_vars{n} : 2;

	    #my $start = defined ( $daughter_counter{ $lang_id } )  ?  $daughter_counter{ $lang_id } + 1  :  1;
	    my $start = ++$daughter_counter{ $lang_id };
	    $daughter_counter{ $lang_id } += $mutate_count;

	    for ( my $n = $start;  $n < $start + $mutate_count;  $n++ ) {
		my $this_lang = $lang_id . "." . $n;
		if ( grep { $_ eq $this_lang } keys %phonology_maker ) {
		    die "language $this_lang already exists";
		}

		#$wordmaker[ $format_idx ] = $wordmaker[ $lang ]->mutate( $mutation_pct );
		$phonology_maker{ $this_lang } = $phonology_maker{ $lang_id }->mutate( $mutation_pct, $this_lang );
		$params = $phonology_maker{ $this_lang }->get_sequence;
		$word_maker{ $this_lang } = SymbolGenerator->new( 'name' => $this_lang, 'paramstring' => $params );
		if ( not $word_maker{ $this_lang }->{valid} ) {
		    die "Creation of SymbolGenerator using format generated by generator based on $param_file failed\n";
		}
		print "\nWords of language $this_lang:\n\n";
		for ( 1..$global_vars{wordcount} ) {
		    print $word_maker{ $this_lang }->generate_sequence, "\t";
		}
		print "\n\n";
	    }
	} elsif  ( $command =~ m/^ *show *([^\s]+)*/ ) { # show format
	    my %local_vars;
	    if ( defined $1 && $1 ne '' ) {
		my $args = $1;
		$args =~ s/\s+$//;
		my @pairs = split /\s+/, $args;
		foreach ( @pairs ) {
		    my ($var, $val) = split "=", $_;
		    $local_vars{ $var } = $val; 
		}
	    }
	    my $lang =  $local_vars{lang} ?  $local_vars{lang}  :  $lang_id ;
	    if ( defined $phonology_maker{ $lang } ) {
		my $params = $phonology_maker{ $lang }->get_sequence;
		print "Format of language $lang:\n" . $params . "\n";
	    } else {
		print "Language $lang not found\n";
	    }
	    
	} elsif  ( $command =~ m/^ *save *([^\s]+)*/ ) { # save format and tree for later use
	    if ( not defined $1 ) {
		print "filename argument is required for s (save) command\n";
	    } else {
		my $filename = $1;
		print "Sorry, save command not implemented yet\n";
		###TODO write code to save format and tree to file
	    }

	} elsif ( $command =~ m/^ *r *$/ ) {
	    $lang_id = ++$base_lang_idx;
	    $phonology_maker{ $lang_id } = SymbolGenerator->new( 'paramfile' => $param_file, 'name' => $lang_id );
	    $params = $phonology_maker{ $lang_id }->generate_sequence;
	    $word_maker{ $lang_id } = SymbolGenerator->new( 'paramstring' => $params, 'name' => $lang_id );
	    print "\nWords of language $lang_id:\n\n";
	    for ( 1..$global_vars{wordcount} ) {
		print $word_maker{ $lang_id }->generate_sequence, "\t";
	    }
	    print "\n\n";
	} elsif ( $command =~ m/^ *ls *$/ ) {
	    print "Sorry, ls (list files) command not implemented yet\n";
	} elsif ( $command =~ m/^ *load/ ) {
	    print "Sorry, load command not implemented yet\n";
	} elsif ( $command =~ m/^ *w *(.*)/ ) {
	    # print more words
	    my %local_vars;
	    if ( defined $1 && $1 ne '' ) {
		my $args = $1;
		$args =~ s/\s+$//;
		my @pairs = split /\s+/, $args;
		foreach ( @pairs ) {
		    my ($var, $val) = split "=", $_;
		    $local_vars{ $var } = $val; 
		}
	    }
	    my $lang = defined $local_vars{ lang } ? $local_vars{lang} : $lang_id;
	    my $n = defined $local_vars{n} ? $local_vars{n} : $global_vars{wordcount};

	    if ( defined $word_maker{ $lang } ) {
		print "\nMore words of language $lang:\n";
		for ( 1..$n ) {
		    ###TODO handle printing to specified file as well as/instead of STDOUT
		    print $word_maker{ $lang }->generate_sequence, "\t";
		}
		print "\n";
	    } else {
		print "Language $lang not found\n";
	    }
	} elsif ( $command =~ m/^ *q *$/ ) {
	    ###TODO query if user wants to save current format or word list or ....?
	    exit(0);
	} else {
	    print "Unrecognized command\n";
	    &show_help;
	}
    } # end while true
=cut


sub cmd_unimplemented {
    print "Command not implemented yet\n";
}


sub cmd_set {
    my $argstr = shift;
    my %vars = @_;

    if ( scalar ( keys %vars ) >= 1 ) {
	foreach ( keys %vars ) {
	    my $var = $_;
	    if ( not ( grep { $_ eq $var } keys %var_info ) ) {
		print qq(Variable "$var" not recognized\n);
		return;
	    } elsif ( $vars{$var} !~ m/$var_info{$var}->{pattern}/ ) {
		print qq(Variable "$var" ) . $var_info{$var}->{format_err} . "\n";
		return;
	    }
	    $vars{ $var } =~ s/\\t/\t/g;
	    $vars{ $var } =~ s/\\n/\n/g;
	    $vars{ $var } =~ s/\\s/ /g;
	    $global_vars{ $var } = $vars{ $var };
	}
    } else { # display
	print "Variable settings:\n";
	foreach ( sort keys %global_vars ) {
	    print $_ . "\t\t[" . $global_vars{ $_ } . "]\n";
	}
    }
}


sub cmd_recreate {
    my $argstr = shift;
    my %local_vars = @_;

    my $lang_count = defined $local_vars{n} ? $local_vars{n} : $global_vars{langcount};
    
    for ( 1..$lang_count ) {
	$global_vars{lang} = ++$base_lang_idx;
	$phonology_maker{ $global_vars{lang} } = SymbolGenerator->new( 'paramfile' => $param_file, 'name' => $global_vars{lang} );
	my $params = $phonology_maker{ $global_vars{lang} }->generate_sequence;
	$word_maker{ $global_vars{lang} } = SymbolGenerator->new( 'paramstring' => $params, 'name' => $global_vars{lang} );
	print "\nWords of language $global_vars{lang}:\n\n";
	for ( 1..$global_vars{wordcount} ) {
	    print $word_maker{ $global_vars{lang} }->generate_sequence, $global_vars{separator};
	}
	print "\n";
    }
}


sub cmd_mutate {
    my $argstr = shift;
    my %local_vars = @_;

    if ( defined $local_vars{lang} ) {
	$global_vars{lang} = $local_vars{lang};
    }

    if ( not defined $phonology_maker{ $global_vars{lang} } ) {
	print "Language $global_vars{lang} not found";
	return;
    }

    my $mutation_pct = defined $local_vars{pct} ?  $local_vars{pct} : $global_vars{percent};
    my $mutate_count = defined $local_vars{n} ? $local_vars{n} : $global_vars{langcount};
    
    my $start = ++$daughter_counter{ $global_vars{lang} };
    $daughter_counter{ $global_vars{lang} } += $mutate_count - 1;
    
    for ( my $n = $start;  $n < $start + $mutate_count;  $n++ ) {
	my $this_lang = $global_vars{lang} . "." . $n;
	if ( grep { $_ eq $this_lang } keys %phonology_maker ) {
	    die "language $this_lang already exists";
	}
	
	$phonology_maker{ $this_lang } = $phonology_maker{ $global_vars{lang} }->mutate( $mutation_pct, $this_lang );
	my $params = $phonology_maker{ $this_lang }->get_sequence;
	$word_maker{ $this_lang } = SymbolGenerator->new( 'name' => $this_lang, 'paramstring' => $params );
	if ( not $word_maker{ $this_lang }->{valid} ) {
	    die "Creation of SymbolGenerator using format generated by generator based on $param_file failed\n";
	}
	print "\nWords of language $this_lang:\n\n";
	for ( 1..$global_vars{wordcount} ) {
	    print $word_maker{ $this_lang }->generate_sequence, $global_vars{separator};
	}
	print "\n";
    }
}


# this implements both 'save' and 'show'
###TODO a command to save the tree which generated the current format, as well...
sub cmd_show {
    my $argstr = shift;
    my %local_vars = @_;
    if ( $argstr =~ m/^\s*save/ ) {
	if ((not defined $local_vars{file}) or (defined $local_vars{file} and $local_vars{file} eq '' )) {
	    print $command_info{save}->{help};
	    return;
	}
    }

    my $lang =  $local_vars{lang} ?  $local_vars{lang}  :  $global_vars{lang} ;
    if ( defined $phonology_maker{ $lang } ) {
	my $params = $phonology_maker{ $lang }->get_sequence;
	if (  defined $local_vars{file}  and  $local_vars{file} ne '' ) {
	    if ( not (open OUT, ">" . $local_vars{file} ) ) {
		print qq(Can't open "$local_vars{file}" for writing);
		return;
	    }
	    print OUT "# Language format $lang generated from $param_file at " . localtime() . "\n\n";
	    print OUT $params . "\n";
	    close OUT;
	    print "Wrote format of language $lang to $local_vars{file}";
	} else {
	    print "Format of language $lang:\n" . $params;
	}
    } else {
	print "Language $lang not found";
    }
}



sub cmd_words {
    my $argstr = shift;
    my %local_vars = @_;

    my $lang = defined $local_vars{ lang } ? $local_vars{lang} : $global_vars{lang};
    my $n = defined $local_vars{n} ? $local_vars{n} : $global_vars{wordcount};

    my $writing_to_file = 0;
    if ( defined $local_vars{file} && $local_vars{file} ne '' ) {
	if ( not open OUT, ">>" . $local_vars{file} ) {
	    print qq(Couldn't open "$local_vars{file}" for appending); 
	    return;
	}
	$writing_to_file = 1;
    }

    if ( defined $word_maker{ $lang } ) {
	print "\nMore words of language $lang:\n";
	for ( 1..$n ) {
	    if ( $writing_to_file ) {
		print OUT $word_maker{ $lang }->generate_sequence, "\n";
	    } else {
		print $word_maker{ $lang }->generate_sequence, $global_vars{separator};
	    }
	}
    } else {
	print "Language $lang not found";
    }

    if ( $writing_to_file ) {
	close OUT;
	print "Wrote $n words to $local_vars{file}";
    }    
}




sub cmd_dump {
    my $argstr = shift;
    my %local_vars = @_;

    my $lang = defined $local_vars{ lang } ? $local_vars{lang} : $global_vars{lang};

    if ( defined $phonology_maker{ $lang } ) {
	$phonology_maker{ $lang }->dump_info( $local_vars{file} );
    } else {
	print "Language $lang not found";
    }
    
}




sub cmd_quit {
    exit(0);
}




# Display help in interactive mode; contrast display_usage() for command line mode
sub cmd_help {
    my $argstr = shift;
    my $got_help = 0;
    
    if ( defined $argstr ) {
	if ( $argstr !~ m/^\s*$/ ) {
	    # strip leading and trailing whitespace
	    $argstr =~ s/^\s*//;
	    $argstr =~ s/\s*$//;
	    my @cmds = split /\s/, $argstr;
	    foreach ( @cmds ) {
		if ( $_ =~ m/^(help|h|\?)$/ ) {
		    next;
		}

		if ( defined $command_info{ $_ } ) {
		    print $command_info{ $_ }->{help} . "\n\n";
		    $got_help = 1;
		} elsif ( defined $command_aliases{ $_ } ) {
		    print $command_info{ $command_aliases{ $_ } }->{help} . "\n\n";
		    $got_help = 1;
		}
	    }
	}
    }

    if ( not $got_help ) {
	print $general_help;
    }
 
}

###TODO - cmd: 'auto' - keep generating words/phonologies/whatever and
#   sleeping $global_vars{sleeptime} seconds after displaying each
#   one, until user presses a key to stop.  Then they can do something
#   with the last one which they presumably like and want to save or
#   mutate

# separate commands to load/save trees as opposed to formats

# and for metaformats vs. direct formats?

# -- tree file format should have a checksum for the format that created said
# tree; if you load a tree when a format with a different checksum is in effect
# you get a warning (it may work fine if only slight changes have been made
# to the metaformat, but will break if any tokens from the old version are
# missing in the new and might break for other reasons...)






__END__


=pod

From owner-conlan-@diku.dk  Tue Jan  7 18:20:06 1997
Return-Path: owner-conlan-@diku.dk
Received: from vidar.diku.dk (vidar.diku.dk [130.225.96.249]) by FindMail.COM (x.x.x/x.x.x) with SMTP id SAA20949 for ; Tue, 7 Jan 1997 18:20:04 -0800
Received: (from daemo-@localhost) by vidar.diku.dk (8.6.12/8.6.12) id DAA02888 for conlang-outgoing; Wed, 8 Jan 1997 03:08:31 +0100
Received: from relay-11.mail.demon.net (relay-11.mail.demon.net [194.217.242.137]) by vidar.diku.dk (8.6.12/8.6.12) with SMTP id DAA02881 for ; Wed, 8 Jan 1997 03:08:28 +0100
Received: from drummond.demon.co.uk ([158.152.10.59]) by relay-10.mail.demon.net
           id aa1004521; 8 Jan 97 1:57 GMT
Message-ID: 
Date: Wed, 8 Jan 1997 00:08:47 +0000
To: conlan-@diku.dk
From: John Fisher 
Subject: CONLANG: Word generator
MIME-Version: 1.0
X-Mailer: Turnpike Version 3.00 
Sender: owner-conlan-@diku.dk
Precedence: bulk
Priority: non-urgent
Reply-To: John Fisher 

I posted this not long ago - please anyone feel free to hack it around
as you wish...

  ------- Forwarded message follows -------
Okay, a couple of people said go ahead so I'm taking that as an excuse
:-) Here's a simple Perl string generator program, which I shall call
Boris, since I know a nice cat called that.

To run it, you need a file of parameters.  Here's a tiny phonology of
a sort of Pacific kind (?):

----------------------------------------

TOP=I+V+F:1 I+V+M+V+F:9;

I=I0:1 I1:9;
M=M1:5 M2:1;
F=F0:1 F1:3;
V=a:1 i:1 u:1;

I0=;
I1=p:10 f:10 m:10 t:10 s:10 n:10 k:10 r:3;
M1=I1:73 ':10;
M2=I1+r;
F0=;
F1=m:5 n:5 ng:5 ':3 r:1;

rr>r;
'a>ha;
----------------------------------------

It always starts with TOP.  In this case, TOP can be one of two
things, I+V+F or I+V+M+V+F.  The numbers after the colons are the
relative chances of these happening.  In this case, you get 1
I+V+F for every 9 I+V+M+V+F's.  If you do it often enough.

Then we have definitions for I, which could be an I0 or an I1, M (M1
or M2), F (F0 or F1), and V.  For V there are 3 possibilities: a, i
and u.
 
The definitions for I0 and F0 indicate that they are replaced by
nothing.  For the others, we have a variety of consonants.  M1 is the
same as I1, except that you can have a glottal stop as well; there's
73 chances of an I1 and only 10 of a ', but that's okay since I1 is a
group of 8 consonants.

After all those there are some rules of a different kind.  "rr>r;"
rewrites "rr" as "r" - this could happen from the M2 rule.  In the
same way "'a" is rewritten as "ha".

The program listing is at the end of this.  Under Unix you will have to
put a line at the beginning like this:

#!/usr/bin/perl

- or wherever your Perl interpreter is.  Put your parameters in a file
and fire up the program:
----------------------------------------
Parameter file: tiny.txt
How many? 10
umun
samrang
mu'ung
sun
sarang
tuma
sapu
tipang
mana'
mumam
----------------------------------------

Here's another example: this time a small syntax. Using a `-' instead
of a `+' makes it put a space between the items.  You can run a rule
over more than one line - just don't forget the `;'

----------------------------------------
TOP=NP-VP;
NP=Det-N;
Det=a:3 the:8;
N=cat:1 mouse:1 rat:1 dog:1 lion:1;
VP=VinP:1 VtrP:3;
VinP=Vin;
Vin=slept:1 cried:1 sat:1 walked:1;
VtrP=Vtr-NP;
Vtr=ate:1 kicked:1 knew:1 kissed:1;
^a>A;
^t>T;
----------------------------------------

I'm only a beginner at Perl so apologies to the native speakers.  All
suggestions, corrections etc gratefully received.

--John

-- 
--John
--joh-@drummond.demon.co  john-@epcc.ed.ac.uk
--Elet Anta homepage: http://www.drummond.demon.co.uk/anta/
--Drummond ro cleshfan merec: fanye litoc inye litoc


=cut
