# Boris - Generate symbol sequences
# John Fisher Ver.1 Nov 96
# Do what you like with it but leave these three lines

srand;
&read_parameters;

print "How many? ";
$num = <>;

for ($n=0; $n<$num; $n++) {
   print &expansion, "\n";
}

sub read_parameters {
   print ("Parameter file: ");
   chop ($filename = <>);
   open (PARAMS, $filename) || die ("Couldn't find parameters\n");
   $/ = ";";
   while ($line = <PARAMS>) {
      chop ($line);
      $line =~ s/\n/ /g;
      if ($line =~ /^\s*(.*)([=>])(.*)\s*$/) {
         $head=$1; $sign=$2; $value=$3;
         if ($value =~ /[=>]/) {
            die "; missing after `$head'\n";
         }
         $value =~ s/\s+/ /g;
         $value =~ s/^ /^/;
         if ($sign eq "=") {
            $value =~ s/\s*([\+\-])\s*/$1/g;
            $total = 0;
            $litstr = "";
            $biasstr = "";
            foreach $item (split (/ /,$value)) {
               ($lit,$bias) = split(/:/,$item);
               $total += $bias;
               $litstr .= "$lit ";
               $biasstr .= "$total ";
            }
            $totals{$head} = $total;
            chop ($lits{$head} = $litstr);
            $biases{$head} = $biasstr;
         } else {
            #must be >
            push (@matches, $head);
            push (@subs, $value);
         }
      }
   }
   $/ = "\n";
}

sub expansion {
   $terminals = &expand_string ("TOP");
   return &substitute ($terminals);
}
  
sub expand_string {
   local ($target) = @_;
   local ($val, $i, $v, $vv);
   $total = $totals{$target};
   if ($total == 0) {
      $val = $lits{$target};
   } else {
      @itemlits = split (/ /,$lits{$target});
      @itembiases = split (/ /,$biases{$target});
      $random = int (rand ($total));
      for ($i=0; $i<@itembiases; $i++) {
         if ($random < $itembiases[$i]) {
            $val = $itemlits[$i];
            last;
         }
      }
   }
   $vv = "";
   while ($val =~ /^([^\+\-]*)([\+\-])(.*)/) {
      $v = $1; $sign = $2; $val = $3;
      $vv .= &expand_string ($v);
      $vv .= " " if $sign eq "-";
   }
   if (defined ($totals{$val})) {
      $val = &expand_string ($val);
   }
   $val = $vv . $val;
   return $val;
}

sub substitute {
   local ($term) = @_;
   local ($i);
   for ($i=0; $i<@matches; $i++) {
      $term =~ s/$matches[$i]/$subs[$i]/g;
   }
   return $term;
}


__END__


=pod

From owner-conlan-@diku.dk  Tue Jan  7 18:20:06 1997
Return-Path: owner-conlan-@diku.dk
Received: from vidar.diku.dk (vidar.diku.dk [130.225.96.249]) by FindMail.COM (x.x.x/x.x.x) with SMTP id SAA20949 for ; Tue, 7 Jan 1997 18:20:04 -0800
Received: (from daemo-@localhost) by vidar.diku.dk (8.6.12/8.6.12) id DAA02888 for conlang-outgoing; Wed, 8 Jan 1997 03:08:31 +0100
Received: from relay-11.mail.demon.net (relay-11.mail.demon.net [194.217.242.137]) by vidar.diku.dk (8.6.12/8.6.12) with SMTP id DAA02881 for ; Wed, 8 Jan 1997 03:08:28 +0100
Received: from drummond.demon.co.uk ([158.152.10.59]) by relay-10.mail.demon.net
           id aa1004521; 8 Jan 97 1:57 GMT
Message-ID: 
Date: Wed, 8 Jan 1997 00:08:47 +0000
To: conlan-@diku.dk
From: John Fisher 
Subject: CONLANG: Word generator
MIME-Version: 1.0
X-Mailer: Turnpike Version 3.00 
Sender: owner-conlan-@diku.dk
Precedence: bulk
Priority: non-urgent
Reply-To: John Fisher 

I posted this not long ago - please anyone feel free to hack it around
as you wish...

  ------- Forwarded message follows -------
Okay, a couple of people said go ahead so I'm taking that as an excuse
:-) Here's a simple Perl string generator program, which I shall call
Boris, since I know a nice cat called that.

To run it, you need a file of parameters.  Here's a tiny phonology of
a sort of Pacific kind (?):

----------------------------------------

TOP=I+V+F:1 I+V+M+V+F:9;

I=I0:1 I1:9;
M=M1:5 M2:1;
F=F0:1 F1:3;
V=a:1 i:1 u:1;

I0=;
I1=p:10 f:10 m:10 t:10 s:10 n:10 k:10 r:3;
M1=I1:73 ':10;
M2=I1+r;
F0=;
F1=m:5 n:5 ng:5 ':3 r:1;

rr>r;
'a>ha;
----------------------------------------

It always starts with TOP.  In this case, TOP can be one of two
things, I+V+F or I+V+M+V+F.  The numbers after the colons are the
relative chances of these happening.  In this case, you get 1
I+V+F for every 9 I+V+M+V+F's.  If you do it often enough.

Then we have definitions for I, which could be an I0 or an I1, M (M1
or M2), F (F0 or F1), and V.  For V there are 3 possibilities: a, i
and u.
 
The definitions for I0 and F0 indicate that they are replaced by
nothing.  For the others, we have a variety of consonants.  M1 is the
same as I1, except that you can have a glottal stop as well; there's
73 chances of an I1 and only 10 of a ', but that's okay since I1 is a
group of 8 consonants.

After all those there are some rules of a different kind.  "rr>r;"
rewrites "rr" as "r" - this could happen from the M2 rule.  In the
same way "'a" is rewritten as "ha".

The program listing is at the end of this.  Under Unix you will have to
put a line at the beginning like this:

#!/usr/bin/perl

- or wherever your Perl interpreter is.  Put your parameters in a file
and fire up the program:
----------------------------------------
Parameter file: tiny.txt
How many? 10
umun
samrang
mu'ung
sun
sarang
tuma
sapu
tipang
mana'
mumam
----------------------------------------

Here's another example: this time a small syntax. Using a `-' instead
of a `+' makes it put a space between the items.  You can run a rule
over more than one line - just don't forget the `;'

----------------------------------------
TOP=NP-VP;
NP=Det-N;
Det=a:3 the:8;
N=cat:1 mouse:1 rat:1 dog:1 lion:1;
VP=VinP:1 VtrP:3;
VinP=Vin;
Vin=slept:1 cried:1 sat:1 walked:1;
VtrP=Vtr-NP;
Vtr=ate:1 kicked:1 knew:1 kissed:1;
^a>A;
^t>T;
----------------------------------------

I'm only a beginner at Perl so apologies to the native speakers.  All
suggestions, corrections etc gratefully received.

--John

-- 
--John
--joh-@drummond.demon.co  john-@epcc.ed.ac.uk
--Elet Anta homepage: http://www.drummond.demon.co.uk/anta/
--Drummond ro cleshfan merec: fanye litoc inye litoc


=cut
