---
created_at: '2015-04-06T16:19:43.000Z'
title: Lingua::Romana::Perligata Perl for the XXI-Imum Century (2000)
url: http://www.csse.monash.edu.au/~damian/papers/HTML/Perligata.html
author: rhythmvs
points: 76
story_text: ''
comment_text:
num_comments: 7
story_id:
story_title:
story_url:
parent_id:
created_at_i: 1428337183
_tags:
- story
- author_rhythmvs
- story_9328907
objectID: '9328907'
year: 2000
---
[Source](http://users.monash.edu/~damian/papers/HTML/Perligata.html "Permalink to Lingua::Romana::Perligata -- Perl for the XXIimum Century")
# Lingua::Romana::Perligata -- Perl for the XXI-imum Century
## Damian Conway
### School of Computer Science and Software Engineering
Monash University
Clayton 3168, Australia
### [damian@csse.monash.edu.au][1] [http://www.csse.monash.edu.au/~damian][2]
* * *
Abstract
This paper describes a Perl module -- Lingua::Romana::Perligata -- that makes it possible to write Perl programs in Latin. A plausible rationale for wanting to do such a thing is provided, along with a comprehensive overview of the syntax and semantics of Latinized Perl. The paper also explains the special source filtering and parsing techniques required to efficiently interpret a programming language in which the syntax is (largely) non-positional.
* * *
# Introduction
Compared to other languages (both modern and ancient), English has a weak lexical structure. Much of the grammatical load of an English sentence is carried by positional cues. A statement such as ``The boy gave the dog the food'' only makes sense because of the convention that the subject precedes the verb, which precedes the indirect object, which precedes the direct object. Changing the order -- ``The food gave the boy the dog'' -- changes the meaning.
Most programming languages use similar positional grammatical cues. The operation `$maximum = $next` is very different in meaning from `$next = $maximum`. Likewise, the function call `push @my_assets, @your_money` is not the same as `push @your_money, @my_assets`.
Generally speaking, older natural languages have richer lexical structures (such as inflexions for noun number and case) and therefore rely less on word order. For example, in Latin the statements _Puer dedit cani escam_ and _Escam dedit puer cani_ both mean ``The boy gave the dog the food''. Indeed, the more usual word order would be reverse Polish, with the verb coming last: _Puer cani escam dedit_.
This flexibility is possible because Latin uses inflexion, not position, to denote lexical roles. The lack of a suffix denotes that the boy (_puer_) is the subject; the _-i_ ending indicates that the dog (_cani_) is the indirect object; whilst the _-am_ ending indicates that the food (_escam_) is the direct object.
To say ``The food gave the boy the dog'', one might write: _Puero canem esca dedit_. Here, the _-o_ ending denotes that the boy is now the indirect object, the _-em_ ending indicates that the dog has become the direct object, whilst the _-a_ ending indicates that the food is the subject.
 
 
# A less positional programming language
There is no reason why programming languages could not also use inflexions, rather than position, to denote lexical roles. Perl already makes some use of this idea by requiring different prefixes to denote differing types of symbols: `$` to denote a scalar, `@` to denote an array, `&` to denote a subroutine, etc.
Indeed, there is no reason why certain built-in functions, such as `bless`, or the block form of `map`, or even `push` could not allow their arguments to be specified in any order, at least in cases where the prefixes (or the lack thereof) make the roles of each argument unambiguous:
        my $obj = bless 'Classname', \%obj;
        @squares = map @numbers {$_**2};
        push ('Moe', 'Larry', 'Curly') => @stooges;
Moreover, since the function names themselves are unambiguous in their role, there is no reason why their position need be fixed either:
        @squares = @numbers map {$_**2};
        ('Moe', 'Larry', 'Curly') => @stooges push;
Perl already allows a modicum of this flexibility in the form of statement modifiers:
        if ($next > $max) { $max = $next }
        # ...is the same as...
        $max = $next if $next > $max;
This paper describes a new module -- Lingua::Romana::Perligata -- that explores an alternative syntactic binding for Perl, using inflexions based on classical Latin grammar. These inflexions subsume the function of the standard Perl `$`/`@`/`%`/`&` prefixes and support the new concept of _semantic roles_, which allows far greater freedom in the specification of functions, operations, and their respective arguments.
 
 
# Semantic roles
Most of Perl's rich variety of operators provide an assignment variant: `+=` for `+`, `.=` for `.`, `||=` for `||`, etc. Thus, nearly half of Perl's operators produce some change in one of their arguments. Likewise, many built-in Perl functions (`push`, `pop`, `open`, etc.) modify one of their arguments.
In both cases, the operand or argument to be modified is denoted positionally -- it is always the left operand or the first argument. Furthermore, this argument is always ``implicitly enreferenced'', as if it had a `\$` or `\@` prototype.
Thus, in Perl, operands and arguments have one of two semantic roles: target or data. A target is passed by reference and is modified during the evaluation of an operation or function. Data are passed by value (though that value itself may be a reference) and they control or ``fuel'' the modification of the target.
In this model, it is possible to recast almost all built-in Perl functions and operations as procedures of exactly two arguments: a single reference (the target) and a single list (the data):
        _sysopen( -target=>\*FILE, -data=>[$filename,$mode,$perms] );
        _push( -target=>\@stooges, -data=>['Moe', 'Larry', 'Curly'] );
        _pop( -target=>\@stack, -data=>[] );

        _assign( -target=>\$max, -data=>[$nextval] )
                if _num_less_than( -target=>undef, -data=>[$max, $nextval] );
        _assign( -target=>\$now, -data=>[ _time(-target=>undef, -data=>[]) ]);
Note that, for many functions, either or both of these two standard arguments may be null.
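The target/data model can be tried out directly in ordinary Perl. The following is a minimal sketch, not code from the module: the helper bodies are assumptions written for illustration, and only the names and the `-target`/`-data` calling convention come from the examples above.

```perl
use strict;
use warnings;

# Hypothetical helpers in the target/data style. The target is always a
# reference (or undef); the data is always an array reference.
sub _assign {
    my %arg = @_;
    my ($target, $data) = @arg{qw(-target -data)};
    if    (ref $target eq 'ARRAY')  { @$target = @$data }
    elsif (ref $target eq 'SCALAR') { $$target = $data->[0] }
    return;
}

sub _push {
    my %arg = @_;
    push @{ $arg{-target} }, @{ $arg{-data} };
    return scalar @{ $arg{-target} };
}

sub _num_less_than {
    my %arg = @_;
    my ($left, $right) = @{ $arg{-data} };   # no target: nothing is modified
    return $left < $right;
}

my @stooges = ('Moe');
_push( -target => \@stooges, -data => ['Larry', 'Curly'] );

my ($max, $nextval) = (3, 7);
_assign( -target => \$max, -data => [$nextval] )
    if _num_less_than( -target => undef, -data => [$max, $nextval] );

print "@stooges\n";   # Moe Larry Curly
print "$max\n";       # 7
```

Because each argument is labelled by its role, the two named arguments can be given in either order without changing the result.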
 
 
# Mapping the model to Latin
To map this simplified model of Perl onto an inflexion-based syntax, it is necessary to choose an inflexion scheme that differentiates the three components of each function: name, target, and data.
Consider the assignment of a list to an array:
        @gunslingers = ( @good, @bad, $Ugly );
In semantic role notation that is:
        _assign( -target=>\@gunslingers, -data=>[@good, @bad, $Ugly] );
In English, this would be expressed:
        Assign gunslingers goodies and baddies and Mr Ugly.
The imperative verb ``assign'' specifies the action to be performed. The noun ``gunslingers'' specifies the indirect (or dative) object of the action. In other words, it is the recipient of the effect of the action -- the target. The phrase ``goodies and baddies and Mr Ugly'' specifies the direct object of the action -- that which is to be assigned. In other words, the data. The direct and indirect objects are only distinguished by the order in which they appear: indirect object first.
The English version also uses a plural inflexion on ``goodies'' and ``baddies'', much in the same way that Perl uses the `@` prefix to indicate the multiplicity of the objects involved.
In Latin, the same instruction would be (loosely) rendered:
        Bonos tum malos tum Foedum pugnatoribus da.
Here the direct objects are _bonos_ (``the good (people)'', accusative plural), _malos_ (``the bad (people)'', accusative plural) and _Foedum_ (``Mr Ugly'', accusative singular). The indirect object is _pugnatoribus_ (``fighters'', dative plural) and the verb is _da_ (``give'', present imperative). The conjunction _tum_ means ``and then'', and conveys the significance of the order of the direct objects.
Unlike the English ``-s'' ending, the various Latin suffixes (_-os_, _-um_, _-ibus_) specify both the number and the role (or ``case'') of the nouns they inflect. This means that the positions of the various objects, and indeed of the verb itself, do not matter. The same sentence could equally well be written:
        Pugnatoribus da bonos tum malos tum Foedum.
or
        Da bonos tum malos tum Foedum pugnatoribus.
Semantically, all of these variants (and any other permutations of the verb and its objects) are equivalent to the same target/data model:
        _assign( -target=>\@gunslingers, -data=>[@good, @bad, $Ugly] );
and hence are equivalent to the standard Perl:
        @gunslingers = ( @good, @bad, $Ugly );
Thus it is possible to write Perl programs in Latin.
 
 
# Lingua::Romana::Perligata
The Lingua::Romana::Perligata module provides the necessary translation services to allow Perl programs to be written using a syntactic binding (_perligatus_) modelled on the ancient _lingua Romana_. To distinguish it from regular Perl, this binding -- and any code specified in it -- is henceforth referred to as ``Perligata''.
 
 
## Variables
To simplify the mind-numbingly complex rules of declension and conjugation that govern inflexions in Latin, Perligata treats all user-defined scalar and array variables as neuter nouns of the second declension -- singular for scalars, plural for arrays. This minimizes the number of suffixes that must be remembered.
Hashes represent something of a difficulty in Perligata, as Latin lacks an obvious way of distinguishing these ``plural'' variables from arrays. The solution that has been adopted is to depart from the second declension and represent hashes as masculine plural nouns of the fourth declension.
Hence, the type and role of all types of variables are specified by their number and case, as indicated in Table 1.
When elements of arrays and hashes are referred to directly in Perl, the prefix of the container changes from `@` or `%` to `$`. So it should not be surprising that Perligata also makes use of a different inflexion to distinguish these cases.
Indexing operations such as `$array[$elem]` or `$hash{$key}` might be translated as ``elem of array'' or ``key of hash''. This suggests that when arrays or hashes are indexed, their names should be rendered in the genitive (or possessive) case. Multi-level indexing operations (`$array[$row][$column]`) mean ``column of row of array'', and hence the first indexing variable must also be rendered in the genitive. Table 1 also summarizes this role.
**Table 1: Perligata variables**
| **Perligata** | **Number, Case, and Declension** | **Perl** | **Role** |
| ----- | ----- | ----- | ----- |
| _nextum_ | accusative singular 2nd | `$next` | scalar data |
| _nexta_ | accusative plural 2nd | `@next` | array data |
| _nextus_ | accusative plural 4th | `%next` | hash data |
| _nexto_ | dative singular 2nd | `$next` | scalar target |
| _nextis_ | dative plural 2nd | `@next` | array target |
| _nextibus_ | dative plural 4th | `%next` | hash target |
| _nexti_ | genitive singular 2nd | `[$next]` | indexed scalar |
| _nextorum_ | genitive plural 2nd | `$next[]` | indexed array |
| _nextuum_ | genitive plural 4th | `$next{}` | indexed hash |
In other words, scalars are always singular nouns, arrays and hashes are always plural (but of different declensions), and the case of the noun specifies its role: accusative for data, dative for target, genitive when being indexed.
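The suffix scheme of Table 1 is regular enough to be decoded mechanically. The sketch below is an illustration only, not the module's actual parser: `classify_noun` and `@INFLEXIONS` are hypothetical names, and the sketch ignores keywords, numerals, and the special forms of `$_` and `@_`.

```perl
use strict;
use warnings;

# Suffix table transcribed from Table 1. Longer suffixes are tried first so
# that "-ibus" is not mistaken for "-us", nor "-uum" for "-um".
my @INFLEXIONS = (
    [ ibus => '%', 'target'  ],
    [ orum => '@', 'indexed' ],
    [ uum  => '%', 'indexed' ],
    [ um   => '$', 'data'    ],
    [ us   => '%', 'data'    ],
    [ is   => '@', 'target'  ],
    [ a    => '@', 'data'    ],
    [ o    => '$', 'target'  ],
    [ i    => '$', 'indexed' ],
);

# Hypothetical helper: returns (sigil, stem, role), or an empty list if the
# word carries none of the Table 1 suffixes.
sub classify_noun {
    my ($word) = @_;
    for my $entry (@INFLEXIONS) {
        my ($suffix, $sigil, $role) = @$entry;
        return ($sigil, $1, $role) if $word =~ /^(\w+?)\Q$suffix\E$/;
    }
    return;
}

for my $word (qw(nextum nextis nextibus nextuum)) {
    my ($sigil, $stem, $role) = classify_noun($word);
    print "$word => $sigil$stem ($role)\n";
}
```

Since the role is recovered from the word itself, the words of a statement can be classified in whatever order they arrive.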
The common punctuation variables `$_` and `@_` are special cases. `$_` is often the value under implicit consideration (e.g. in pattern matches, or `for` loops) and so it is rendered as ``this thing'': _hoc_ in the data role, _huic_ in the target role, _huius_ when indexed.
Similarly, `@_` is implicitly the list of things passed into a subroutine, and so is rendered as ``these things'': _haec_ in the data role, _his_ in the target role, _horum_ when indexed.
Other punctuation variables take the Latin forms of their English.pm equivalents (see Appendix A), often with a large measure of poetic licence. For example, in Perligata, `$/` is rendered as _ianitorem_ or ``gatekeeper''.
The ``numeral'' variables -- `$1`, `$2`, etc. -- are rendered as synthetic compounds: _parprimum_ (``the equal of the first''), _parsecundum_ (``the equal of the second''), etc. When indexed, they take their genitive forms: _parprimi_, _parsecundi_, etc. Since they cannot be directly modified as the target of an action, they have no dative forms.
 
 
### `my`, `our`, and `local`
In Perligata, the `my` modifier is rendered -- not surprisingly -- by the first person possessive pronouns: _meo_ (conferring a scalar context) and _meis_ (for a list context). Note that the modifier is always applied to a dative, and hence is itself declined in that case. Thus:
        meo varo haec da.                # my $var = @_;
        meis varo haec da.               # my ($var) = @_;
        meis varis haec da.              # my @var = @_;
Similarly the `our` modifier is rendered as _nostro_ or _nostris_, depending on the desired context.
The Perl `local` modifier is _loco_ or _locis_ in Perligata:
        loco varo haec da.               # local $var = @_;
        locis varo haec da.              # local ($var) = @_;
        locis varis haec da.             # local @var = @_;
This is particularly felicitous: not only is _loco_ the Latin term from which the word ``local'' derives, it also means ``in place of'' (as in: _in loco parentis_). This meaning is much closer to the actual behaviour of the `local` modifier, namely to temporarily install a new symbol table entry in place of the current one.
* * *
## Subroutines
Functions, operators, and user-defined subroutines are represented as verbs or, in some situations, verbal nouns. Here, the inflexion of the verb determines not only its semantic role, but also its call context.
User-defined subroutines are the simplest group. To avoid ambiguity, they are all treated as verbs of the third conjugation. Table 2 illustrates the various usages for a user-defined subroutine `count()`.
**Table 2: Perligata subroutines**
| **Perligata** | **Number, Mood, etc** | **Perl** | **Role** | **Context** |
| ----- | ----- | ----- | ----- | ----- |
| _countere_ | infinitive | `sub count` | definition | - |
| _counte_ | imperative sing. | `count()` | call | void |
| _countementum_ | acc. sing. resultant | `count()` | call-data | scalar |
| _countementa_ | acc. plur. resultant | `count()` | call-data | list |
| _countemento_ | dat. sing. resultant | `count()` | call-target | scalar |
| _countementis_ | dat. plur. resultant | `count()` | call-target | list |
The use of the infinitive as a subroutine definition is obvious: _accipere_ would tell Perligata how ``to accept''; _spernere_, how ``to reject''. So _countere_ specifies how ``to count''.
The use of the imperative for void context is also straightforward: _accipe_ commands Perligata to ``accept!'', _sperne_ tells it to ``reject!'', and _counte_ bids it ``count!''. In each case, an instruction is being given (and in a void context too, so no backchat is expected).
Handling scalar and list contexts is a little more challenging. The corresponding Latin must still have verbal properties, since an action is being performed upon objects. But it must also have the characteristics of a noun, since the result of the call will itself be used as the object (i.e. target or data) of some other verb. Fortunately, Latin has a rich assortment of verbal nouns -- far more than English -- that could fill this role.
Since it is the result of the subroutine call that is of interest here, the best solution was to use the _-ementum_ suffix, which specifies the (singular, accusative) outcome of an action. This corresponds to the result of a subroutine called in a scalar context and used as data. For a list data context, the plural suffix _-ementa_ is used, and for targets, the dative forms are used: _-emento_ and _-ementis_. Note that these endings are completely consistent with those in Table 1.
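Because every user-defined subroutine follows the same pattern, the forms in Table 2 can be derived from the bare stem. The following sketch illustrates that regularity; `subroutine_forms` is a hypothetical helper and is not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the Table 2 forms of a user-defined
# subroutine from its stem (e.g. "count").
sub subroutine_forms {
    my ($stem) = @_;
    return (
        definition    => "${stem}ere",       # sub count
        void_call     => "${stem}e",         # count() in void context
        scalar_data   => "${stem}ementum",   # result used as scalar data
        list_data     => "${stem}ementa",    # result used as list data
        scalar_target => "${stem}emento",    # result used as scalar target
        list_target   => "${stem}ementis",   # result used as list target
    );
}

my %form = subroutine_forms('count');
print "$_: $form{$_}\n" for sort keys %form;
```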
 
 
## Built-in functions and operators
Built-in operators and functions could have followed the same pattern as subroutines. For example `shift` might have been _shifte_ in a void context, _shiftementa_ when used as data in an array context, _shiftemento_ when used as a target in a scalar context, etc.
However, Latin already has a perfectly good verb with the same meaning as `shift`: _decapitare_ (``to behead''). Unfortunately, this verb is of the first conjugation, not the third, and hence has the imperative form _decapita_, which makes it look like a Perligata array in a data role.
Orthogonality has never been Perl's highest design criterion, so Perligata follows suit by eschewing bland consistency in favour of aesthetics. All Perligata keywords -- including function and operator names -- are therefore specified as correct Latin verbs, of whatever conjugation is required. Table 3 shows a selection of these, whilst Appendix A contains the full list of Perligata keywords.
**Table 3: Sample Perligata built-in functions and operators**
| **Operator/function** | **Rendered as** | **Void context** | **Scalar data** | **List data** |
| ----- | ----- | ----- | ----- | ----- |
| `+` | "add" | _adde_ | _addementum_ | _addementa_ |
| `=` | "give" | _da_ | _damentum_ | _damenta_ |
| `.` | "conjoin" | _sere_ | _serementum_ | _serementa_ |
| `..` | "enlist" | _conscribe_ | _conscribementum_ | _conscribementa_ |
| `shift` | "behead" | _decapita_ | _decapitamentum_ | _decapitamenta_ |
| `push` | "stack" | _cumula_ | _cumulamentum_ | _cumulamenta_ |
| `pop` | "unstack" | _decumula_ | _decumulamentum_ | _decumulamenta_ |
| `grep` | "winnow" | _vanne_ | _vannementum_ | _vannementa_ |
| `print` | "write" | _scribe_ | _scribementum_ | _scribementa_ |
| `write` | "write under" | _subscribe_ | _subscribementum_ | _subscribementa_ |
| `die` | "die" | _mori_ | _morimentum_ | _morimenta_ |
Note, however, that consistency has not been entirely forsaken. The back-formations of inflexions for scalar and list context are entirely regular, and consistent with those for user-defined subroutines (Table 2).
A few Perl built-in functions -- `pos`, `substr`, `keys` -- can be used as lvalues. That is, they can be the target of some other action (typically of an assignment). In Perligata such cases are written in the dative singular (since the lvalues are always scalar). Note too that, because an assignment to an lvalue function modifies its first argument, that argument must be a target too, and hence must be written in the dative as well.
Thus:
        nexto stringum reperimentum da.     # $next = pos $string;
        nextum stringo reperimento da.      # pos $string = $next;
        inserto stringum tum unum tum duo excerpementum da.
                                    # $insert = substr($string,1,2);
        insertum stringo unum tum duo excerpemento da.
                                    # substr($string,1,2) = $insert;
        keyis hashus nominamentum da        # @keys = keys %hash;
        keya hashibus nominamento da        # keys %hash = @keys;
* * *
## Blocks and control structures
Natural languages generally use some parenthetical device -- such as parentheses, commas, or (as here) dashes -- to group and separate collections of phrases or statements.
Some such mechanism would be an obvious choice for denoting Perligata code blocks, but there is a more aesthetically pleasing solution. Perl's block delimiters (`{`..`}`) have two particularly desirable properties: they are individually short, and collectively symmetrical. It was considered important to retain those characteristics in Perligata.
In Latin, the word _sic_ has a sense that means ``as follows''. Happily, its contranym, _cis_, has the meaning (among others) ``to here''. The allure of this kind of wordplay being impossible to resist, Perligata delimits blocks of statements with these two words. For example:
        sic                                     # {
            loco ianitori.                     #   local $/;
            dato fonti perlegementum da.        #   $data = <DATA>;
        cis                                     # }
Control structures in Perligata are rendered as conditional clauses, as they are in Latin, English, and Perl. And as in those other languages, they may precede or follow the code blocks they control. Table 4 summarizes the control structures Perligata provides.
**Table 4: Perligata control structures**
| **Perligata** | **Perl** |
| ----- | ----- |
| _si ... fac_ | `if ...` |
| _nisi ... fac_ | `unless ...` |
| _dum ... fac_ | `while ...` |
| _donec ... fac_ | `until ...` |
| _per (quisque) ... in ... fac_ | `for(each) ...` |
| _posterus_ | `next` |
| _ultimus_ | `last` |
| _reconatus_ | `redo` |
| _confectus_ | `continue` |
The trailing _fac_ is the imperative form of _facere_ (``to do'') and is used as a delimiter on the control statement's condition.
The choice of _dum_ and _donec_ is completely arbitrary, since Latin does not distinguish ``while'' and ``until'' as abstractions in the way English does. _Dum_ and _donec_ each mean both ``while'' and ``until'', and Latin relies on context (i.e. semantics) to distinguish them. This is impractical for Perligata, so it always treats _dum_ as `while` and _donec_ as `until`. This choice was made in order to favour the shorter term for the more common type of loop.
The choice of _confectus_ for `continue` seeks to convey the function of the control structure, not the literal meaning of the English word. That is, a `continue` block specifies how to complete (_conficere_) an iteration.
Perligata only supports the pure iterative form of `for(each)`, not the C-like three-part syntax. Because:
        foreach $var (@list)...
means ``for each variable in the list...'', the scalar variable must be in the accusative (as it is governed by the preposition ``for''), and the list must be in the ablative (denoting inclusion). Fortunately, in the second declension, the inflexion for ablatives is exactly the same as for datives, giving:
        per quisque varum in listis...
This means that no extra inflexions have to be learned just to use the _per_ loop. Better still, the list (_listis_) looks like a Perligata array variable in a target role, which it clearly is, since its contents may be modified within the loop.
 
 
## Miscellaneous other features
 
### Numbers
Numeric literals in Perligata are rendered by Roman numerals -- _I_, _II_, _III_, _IV_..._XV_..._XLII_, etc. However, the first 10 numbers may also be referred to by name: _unum_, _duo_, _tres_, _quattuor_, _quinque_, _sex_, _septem_, _octo_, _novem_, _decem_. Zero, for which there is no Latin numeral, is rendered by _nullum_ (``no-one''). _Nihil_ (``nothing'') might have been a closer rendering, but it is indeclinable and hence indistinguishable in the accusative and genitive.
When a numeric literal is used in an indexing operation, it must be an ordinal (``first of'', ``second of'', etc). The first ten ordinals are named: _primum_, _secundum_, _tertium_, _quartum_, _quintum_, _sextum_, _septimum_, _octavum_, _nonum_, _decimum_ (in the accusative, of course, since they are always data). Ordinals greater than ten are represented by their corresponding numeral with the suffix _-imum_: _XVimum_ (``15th''), _XLIIimum_ (``42nd''), etc. By analogy, ordinal zero is rendered by the invented form _nullimum_.
In a multi-level indexing operation, ordinals may need to be specified in the genitive: _nulli_, _primi_, _secundi_, _tertii_, _quarti_..._XVimi_..._XLIIimi_, etc.
For example:
        $unimatrix[1][3][9][7];
would be:
        septimum noni tertii primi unimatrixorum
        # seventh of ninth of third of first of unimatrix
Note that the order of the genitives is significant here, and is the reverse of that required in Perl.
Floating point numbers are expressed in Perligata as Latin fractions:
        unum quartum                    # 0.25
        MMMCXLI Mimum                 # 3.141
Note that the numerator is always cardinal and the denominator ordinal (``one fourth'', ``3141 1000ths''). Technically, both should also be in the feminine gender -- _una quarta_, _MMMCXLI Mimae_ -- but this Latin rule is not enforced in Perligata.
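The numeral forms are easy to evaluate. The sketch below shows one way to do it and is not taken from the module: `roman_to_int` and `fraction_to_number` are hypothetical helpers, they handle only numerals written with Roman digits (not the named forms such as _unum_ or _quartum_), and they do not validate their input.

```perl
use strict;
use warnings;

my %VALUE = (I => 1, V => 5, X => 10, L => 50, C => 100, D => 500, M => 1000);

# Convert a Roman numeral to an integer using the subtractive rule:
# a symbol smaller than its right-hand neighbour is subtracted.
sub roman_to_int {
    my ($numeral) = @_;
    my @digits = map { $VALUE{$_} } split //, uc $numeral;
    my $total  = 0;
    for my $i (0 .. $#digits) {
        my $next = $i < $#digits ? $digits[$i + 1] : 0;
        $total += $digits[$i] < $next ? -$digits[$i] : $digits[$i];
    }
    return $total;
}

# A fraction is a cardinal numerator followed by an ordinal denominator,
# here assumed to be a numeral carrying the "-imum" suffix.
sub fraction_to_number {
    my ($numerator, $denominator) = @_;
    (my $numeral = $denominator) =~ s/imum$//;
    return roman_to_int($numerator) / roman_to_int($numeral);
}

print roman_to_int('XLII'), "\n";                     # 42
print fraction_to_number('MMMCXLI', 'Mimum'), "\n";   # 3.141
```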
 
 
### Strings
Classical Latin does not use punctuation to denote direct quotation. Instead the verb _inquit_ (``said'') is used to report a direct utterance. Hence in Perligata, a literal character string is constructed, not with quotation marks, but by invoking the verbal noun _inquementum_ (``the result of saying''), with a data list of literals to be interpolated. For example:
        print STDOUT 'Enter next word:';
becomes:
        Enter tum next tum word inquementum tum biguttam egresso scribe.
Note that the arguments to _inquementum_ are special in that they are treated as literals. Punctuation strings have special names, such as _lacunam_ (``a hole'' -> space), _stadium_ (``a stride'' -> tabspace), _novumversum_ (``new verse'' -> newline), or _biguttam_ (``two spots'' -> colon).
Perligata does not provide an interpolated quotation mechanism. Instead, variables must be concatenated into a string. So:
        print STDERR "You entered $word\n";
becomes:
        You tum entered inquementum tum wordum tum novumversum oraculo scribe.
 
### References
To create a reference to a variable in a data role (target role variables are automatically enreferenced), the variable is prefaced with the preposition _ad_ (``to''). To create a reference to a subroutine, the associated verb is inflected with the accusative suffix _-torem_ (``one who...'') to produce the corresponding noun-of-agency.
For example:
        val inquementum datuum ad datum da.       # $dat{val} = \$data;
        arg inquementum datuum ad arga da.        # $dat{arg} = \@arg;
        act inquementum datuum functorem da.      # $dat{act} = \&func;
A special case of this construction is the anonymous subroutine constructor _factorem_ (``one who does...''), which is the equivalent of `sub {...}` in Perl:
        anonymo da factorem sic haec mori cis.    # $anon = sub { die @_ };
As in Perl, such subroutines may be invoked by concatenating a call specifier to the name of the variable holding the reference:
        anonymume nos tum morituri inquementum.   # &$anon('Nos morituri');
Note that the variable holding the reference (_anonymum_) is being used as data, so it is named in the accusative.
In the few cases where a subroutine reference can be the target of an action, the dative suffix (_-tori_) is used instead:
        benedictum functori classum.              # bless \&func, $class;
        benedictum factori sic mori cis classum.  # bless sub{die}, $class;
 
### Boolean logic
Perl's logical conjunctive and disjunctive operators come in two precedences, and curiously, so do those of Latin. The higher precedence Perl operators -- `&&` and `||` -- are represented in Perligata by the emphatic Latin conjunctions _atque_ and _vel_ respectively. The lower precedence operators -- `and` and `or` -- are represented by the unemphatic conjunctive suffixes _-que_ and _-ve_. Hence:
        resulto damentum foundum atque defum.    # $result = $found && $def;
        resulto damentum foundum defumque.       # $result = $found and $def;
        resulto damentum foundum vel defum.      # $result = $found || $def;
        resulto damentum foundum defumve.        # $result = $found or $def;
Note that, as in Latin, the suffix of the unemphatic conjunction is always appended to the first word after the point at which the conjunction would appear in English. Thus:
        $result = $val or max($1,$2);
is rendered as:
        resulto damentum valum parprimumve tum parsecundum maxementum.
Proper Latinate comparisons would be odious in Perligata, because they require their first argument to be expressed in the nominative and would themselves have to be indicative. This would, of course, improve the positional independence of the language even further, allowing:
        si valus praecedit datum...              # if $val < $dat...
        si praecedit datum valus...              # if $val < $dat...
        si datum valus praecedit...              # if $val < $dat...
Unfortunately, it also introduces another set of case inflexions and another verbal suffix. Worse, it would mean that noun suffixes are no longer unambiguous. In the 2nd declension, the nominative plural ends in the same _-i_ as the genitive singular, and the nominative singular ending (_-us_) is the same as the accusative plural suffix for the fourth declension. So if nominatives were used, scalars could no longer always be distinguished from arrays or from hashes, except by context.
To avoid these problems, Perligata represents the equality and simple inequality operators by three pairs of verbal nouns, as described in Table 5.
**Table 5: Perligata comparison operators**
| **Perligata** | **Meaning** | **Perl** |
| ----- | ----- | ----- |
| _aequalitam_ | "equality (of...)" | `==` |
| _aequalitas_ | "equalities (of...)" | `eq` |
| _praestantiam_ | "precedence (of...)" | `<` |
| _praestantias_ | "precedences (of...)" | `lt` |
| _comparitiam_ | "comparison (of...)" | `<=>` |
| _comparitias_ | "comparisons (of...)" | `cmp` |
Each operator takes two data arguments, which it compares:
        si valum tum datum aequalitam...               # if $val == $dat...
        si valum praestantias datum...                 # if $val lt $dat...
        digere sic aum comparitiam bum cis lista.      # sort {$a<=>$b} @list;
Note that although _digere_ looks like an infinitive (i.e. a subroutine definition) it is in fact the imperative of _digerere_ (``to sort'') and is the Perligata keyword for `sort`. The philosophically inclined might choose to think of the confusion this engenders as a form of Instant Justice visited upon those who use `sort` in a void context.
The effects of the other comparison operators -- `>`, `<=`, `!=`, `ne`, `ge`, etc. -- are all achieved by appropriate ordering of the two arguments and combination with the logical negation operator _non_:
        si valum datum non aequalitam...         # if $val != $dat...
        si datum praestantiam valum...           # if $val > $dat...
        si valum non praestantias datum...       # if $val ge $dat...
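That the remaining numeric comparisons really do reduce to the primitives of Table 5 can be checked in plain Perl. This is an illustrative sketch only: the subroutines borrow the Perligata keywords as names, and the derived helpers (`not_equal`, `greater_than`, and so on) are hypothetical.

```perl
use strict;
use warnings;

# The three numeric primitives of Table 5.
sub aequalitam   { my ($x, $y) = @_; return $x == $y }
sub praestantiam { my ($x, $y) = @_; return $x <  $y }
sub comparitiam  { my ($x, $y) = @_; return $x <=> $y }

# Derived operators: swap the arguments and/or negate the result.
sub not_equal        { my ($x, $y) = @_; return !aequalitam($x, $y) }     # !=
sub greater_than     { my ($x, $y) = @_; return praestantiam($y, $x) }    # >
sub greater_or_equal { my ($x, $y) = @_; return !praestantiam($x, $y) }   # >=
sub less_or_equal    { my ($x, $y) = @_; return !praestantiam($y, $x) }   # <=

for my $pair ([3, 5], [5, 3], [4, 4]) {
    my ($x, $y) = @$pair;
    printf "%d vs %d: gt=%d ge=%d le=%d ne=%d\n", $x, $y,
        map { $_ ? 1 : 0 }
            greater_than($x, $y), greater_or_equal($x, $y),
            less_or_equal($x, $y), not_equal($x, $y);
}
```

The string operators (`lt`, `ge`, `ne`, and so on) follow the same pattern with the plural forms in place of the singular ones.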
* * *
### Packages and classes
The Perligata keyword to declare a package is _domus_, literally ``the house of''. In this context, the name of the class follows the keyword and is treated as a literal; as if it were the data argument of an _inquementum_.
To explicitly specify a variable or subroutine as belonging to a package, the preposition _intra_ (``within'') is used. To call a subroutine as a method of a particular package (or of an object), the preposition _apud_ (``of the house of'') is used.
The Perl `bless` function is _benedice_ in Perligata, but almost invariably used in the scalar data role: _benedictum_.
Thus:
        domus Specimen.                        # package Specimen;
        newere                                      # sub new
        sic                                         # {
            meis datibus.                           #   my %data;
            counto intra Specimen
                postincresce.                       #   $Specimen::count++;
            datibus nullum horum benedictum.        #   bless \%data, $_[0];
        cis                                         # }
        printere                                    # sub print
        sic                                         # {
            modus tum indefinitus inquementum mori. #   die 'method undefined';
        cis                                         # }
        domus princeps.                             # package main;
        meo objecto da                              # my $object =
                newementum apud Specimen.      #       Specimen->new;
        printe apud objectum;                       # $object->print;
 
# Putting it all together -- a Greek algorithm in Latin
The Sieve of Eratosthenes is one of the oldest well-known algorithms. As the better part of Roman culture was ``borrowed'' from the Greeks, it is perhaps fitting that the first ever Perligata program should be as well:
        #! /usr/local/bin/perl -w
        use Lingua::Romana::Perligata;
        maximum inquementum tum biguttam egresso scribe.
        meo maximo vestibulo perlegamentum da.
        da duo tum maximum conscribementa meis listis.
        dum listis decapitamentum damentum nexto
            fac sic
                nextum tum novumversum scribe egresso.
                lista sic hoc recidementum nextum cis vannementa da listis.
            cis.
The `use Lingua::Romana::Perligata` statement causes the remainder of the program to be translated into the following Perl:
        print STDOUT 'maximum:';
        my $maxim = <STDIN>;
        my (@list) = (2..$maxim);
        while ($next = shift @list)
            {
                print STDOUT $next, "\n";
                @list = grep {$_ % $next} @list;
            }
Note in the very last Perligata statement (_lista sic hoc...da listis_) that the use of inflexion distinguishes the `@list` that is `grep`'ed (_lista_) from the `@list` that is assigned to (_listis_), even though each is at the ``wrong'' end of the statement, compared with the Perl version.
 
 
# The implementation of the Lingua::Romana::Perligata module
The module itself is a very simple example of a source filter, and makes use of Paul Marquess's Filter::Util::Call module. The Perligata parser is invoked from a single subroutine, which is called as a filter on the source code, as described in the following sections.
 
 
## Filtering
The Filter::Util::Call module greatly simplifies the task of pre-filtering source code. A filtering module that uses Filter::Util::Call simply adds the command `filter_add({})` to its `import` subroutine. Then when the filtering module is itself used in some code, Filter::Util::Call looks in the filtering module's namespace for a subroutine called `filter`, which it calls. That `filter` subroutine can access the source code from the file that called the filtering module, and can modify that code as appropriate. Whatever string is in the variable `$_` when the `filter` subroutine returns is passed to the compiler as the final program source.
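
As an illustration of this mechanism, the following is a minimal sketch of a filtering module built on Filter::Util::Call. The module name `Shout`, the `translate` helper, and the toy rewriting rule (_scribe X._ becomes `print X;`) are all assumptions made for the example; none of this is the Perligata implementation. The sketch passes an explicitly blessed object to `filter_add`, and the `filter` method reads the whole of the calling file before translating it.

```perl
package Shout;    # hypothetical filtering module, not part of Perligata
use strict;
use warnings;
use Filter::Util::Call;

# Toy translation step: "scribe <expr>." becomes "print <expr>;"
sub translate {
    my ($source) = @_;
    $source =~ s/^([ \t]*)scribe[ \t]+(.*?)\.[ \t]*$/${1}print $2;/mg;
    return $source;
}

# Called when some program says "use Shout;": installs the filter
# on the remainder of that program's source.
sub import {
    my ($class) = @_;
    filter_add(bless {}, $class);
}

# Called by Filter::Util::Call whenever the compiler wants more source.
sub filter {
    my ($self) = @_;
    my $source = '';
    my $status;
    # Each filter_read() appends the next line of the caller's source to $_.
    while (($status = filter_read()) > 0) {
        $source .= $_;
        $_ = '';
    }
    return $status if $status < 0;    # propagate a read error
    return 0 if $source eq '';        # nothing left: signal end of file
    $_ = translate($source);          # whatever is in $_ goes to the compiler
    return 1;
}

package main;

# The translation step can be exercised directly, without filtering anything.
print Shout::translate("scribe 'salve munde'.\n");   # prints: print 'salve munde';
```

To use it as a real filter, the `Shout` package would be saved as `Shout.pm` and loaded with `use Shout;` at the top of the program to be translated.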
 
 
## Tokenization
For Lingua::Romana::Perligata, the `filter` subroutine conforms to the usual structure of a grammar-based parser/translator. It first invokes a tokenizer to break the Perligata source code into a sequence of tokens.
The tokenizer is very simple: it just splits the source on whitespace or on a period, and then classifies each word in the resulting list by matching it against a series of increasingly general patterns. Keywords are tested first, followed by numbers and numerals, punctuation variables, user-defined functions in scalar and list contexts, user-defined subroutines in void contexts, variables in a target role (i.e. datives), and finally variables in a data role (accusatives).
As each word is classified, it is converted to an object of the corresponding token type -- Keyword, Number, Var, Sub, etc. Each object stores the original word and its corresponding Perl construct. For example, the sequence _dum maxo maxa maxamentum damentum_ would yield a list equivalent to:
        (
            bless({ raw=>'dum',        perl=>'while' }, 'Conditional'     ),
            bless({ raw=>'maxo',       perl=>'$max'  }, 'Noun_Dative'     ),
            bless({ raw=>'maxa',       perl=>'@max'  }, 'Noun_Accusative' ),
            bless({ raw=>'maxamentum', perl=>'&max'  }, 'Verb_Resultative'),
            bless({ raw=>'da',         perl=>'='     }, 'Verb_Imperative' ),
        )
These objects then form a stream of tokens that is passed to the parser.
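
The classification scheme can be sketched as an ordered table of patterns in which the first match wins. The patterns below are deliberately crude assumptions chosen to handle the example sequence only; the module's real patterns are far more extensive, and this sketch simply discards periods rather than treating them as statement terminators. The last word is classified here as a resultative, which is again an assumption of the sketch.

```perl
use strict;
use warnings;

my %keyword = (dum => 'while', donec => 'until', si => 'if', nisi => 'unless');

# Ordered from most specific to most general: [ pattern, token class, translator ].
my @rules = (
    [ qr/^(dum|donec|si|nisi)$/, 'Conditional',      sub { $keyword{ $_[0] } } ],
    [ qr/^da$/,                  'Verb_Imperative',  sub { '=' }               ],
    [ qr/^damentum$/,            'Verb_Resultative', sub { '=' }               ],
    [ qr/^(\w+?)[aei]mentum$/,   'Verb_Resultative', sub { '&' . $_[0] }       ],
    [ qr/^(\w+)o$/,              'Noun_Dative',      sub { '$' . $_[0] }       ],
    [ qr/^(\w+)um$/,             'Noun_Accusative',  sub { '$' . $_[0] }       ],
    [ qr/^(\w+)a$/,              'Noun_Accusative',  sub { '@' . $_[0] }       ],
);

sub tokenize {
    my ($source) = @_;
    my @tokens;
    WORD: for my $word (grep { length } split /[\s.]+/, $source) {
        for my $rule (@rules) {
            my ($pattern, $class, $to_perl) = @$rule;
            # In list context a successful match returns its captures
            # (or 1 if there are none); a failed match returns nothing.
            if (my @captures = $word =~ $pattern) {
                push @tokens,
                    bless({ raw => $word, perl => $to_perl->(@captures) }, $class);
                next WORD;
            }
        }
        die "Unclassifiable word: $word\n";
    }
    return @tokens;
}

for my $token (tokenize('dum maxo maxa maxamentum damentum')) {
    printf "%-12s %-6s %s\n", $token->{raw}, $token->{perl}, ref($token);
}
```

The ordering matters: without the keyword rule at the top, _dum_ would fall through to the general `-um` pattern and be misread as a scalar variable.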
 
 
## Parsing
The position-independence of much of the Perligata grammar makes the task of parsing it quite challenging when using standard tools, which are typically predicated on lexical components appearing in rigidly defined sequences.
For example, to write a rule that matches Perligata subroutine calls, the following is required:
        Action: Dative AccusativeList Verb
              | AccusativeList Dative Verb
              | AccusativeList Verb Dative
              | Dative Verb AccusativeList
              | Verb Dative AccusativeList
              | Verb AccusativeList Dative
              | Accusative Verb Accusative
              | Dative Verb
              | AccusativeList Verb
              | Verb Dative
              | Verb AccusativeList
              | Verb
The difficulties are further compounded by the fact that targets and data can also be (the results of) other position-independent subroutine calls.
This produces a left-recursive grammar with an unusually large number of shift/reduce and reduce/reduce ambiguities (over 100 of each), which makes the grammar very sensitive to subrule precedence and to the ordering of productions within each rule. Appendix B shows the full grammar.
To cope efficiently with these constraints, an `LALR(1)` parser was built using François Désarménien's excellent Parse::Yapp module.
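
The size of the `Action` rule shown earlier follows directly from the free word order: a positional grammar must list every permitted ordering of the components as a separate production. The sketch below, which is purely illustrative and uses a hypothetical `permutations` helper, generates the orderings of the three roles.

```perl
use strict;
use warnings;

# Return every ordering of the given items as a list of array references.
sub permutations {
    my @items = @_;
    return ([]) unless @items;
    my @result;
    for my $i (0 .. $#items) {
        my @rest = @items;
        my ($picked) = splice @rest, $i, 1;    # remove one item...
        push @result, [ $picked, @$_ ] for permutations(@rest);    # ...and permute the rest
    }
    return @result;
}

my @alternatives = map { join ' ', @$_ } permutations(qw(Dative AccusativeList Verb));
print "Action: ", join("\n      | ", @alternatives), "\n";
```

Three components yield six orderings, which correspond to the first six productions of the rule. The remaining productions cover the binary-operator form and calls in which the target, the data, or both are absent.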
 
 
## Translation and execution
Each rule of the Perligata grammar contains an embedded action. Collectively these actions construct a full parse tree for the source code as the grammar parses it. Each node in the tree is a blessed object belonging to a class that represents the corresponding Perl construct. For example, after it has been parsed, the fragment _dum maxo maxa maxamentum damentum_ will have been converted to the following tree:
        bless( {
            condition =>
              bless( {
                 target =>
                    bless( { raw => 'maxo', perl => '$max' }, 'Var_Target'),
                 data =>
                    bless( {
                       raw  => 'maxamentum', perl => '&max',
                       data => [
                          bless( { raw=>'maxa', perl=>'@max'  }, 'Var_Data'),
                       ],
                    }, 'SubCall'),
                }, 'Assignment'),
             block =>
                undef,
        }, 'WhileLoop');
Once the tree is constructed, the equivalent Perl code is obtained by calling the method `codify()` on the root node of the tree. This recursively invokes the `codify()` methods of all the subnodes in the tree, each of which returns a string containing a Perl code fragment corresponding to the subtree at that node. By concatenating these fragments, a string containing the full Perl program is generated. This string is assigned to `$_` at the end of the `filter()` subroutine, to be compiled and executed automatically by Filter::Util::Call.
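
The recursive code generation can be sketched for the tree shown above. The class names are those of the example tree, but the bodies of the `codify()` methods, and the exact text they emit, are assumptions made for illustration rather than the module's own code.

```perl
use strict;
use warnings;

# Leaf nodes already carry their Perl equivalent.
package Var_Target; sub codify { $_[0]{perl} }
package Var_Data;   sub codify { $_[0]{perl} }

package SubCall;
sub codify {
    my ($self) = @_;
    my @args = map { $_->codify } @{ $self->{data} };
    return $self->{perl} . '(' . join(', ', @args) . ')';
}

package Assignment;
sub codify {
    my ($self) = @_;
    return $self->{target}->codify . ' = ' . $self->{data}->codify;
}

package WhileLoop;
sub codify {
    my ($self) = @_;
    my $body = defined $self->{block} ? $self->{block}->codify : '';
    return 'while (' . $self->{condition}->codify . ') {' . $body . '}';
}

package main;

# The parse tree for "dum maxo maxa maxamentum damentum".
my $tree = bless({
    condition => bless({
        target => bless({ raw => 'maxo', perl => '$max' }, 'Var_Target'),
        data   => bless({
            raw  => 'maxamentum', perl => '&max',
            data => [ bless({ raw => 'maxa', perl => '@max' }, 'Var_Data') ],
        }, 'SubCall'),
    }, 'Assignment'),
    block => undef,
}, 'WhileLoop');

print $tree->codify, "\n";    # prints: while ($max = &max(@max)) {}
```

Each node needs to know only how to render itself and how to ask its children to do the same, so supporting a new construct amounts to adding one class with one method.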
 
 
# Conclusion
Latin is a surprisingly good fit for Perl. The rich case structure provides an abundance of plausible mappings for Perl data types and subroutine calls, especially when Perl's own eclectic syntax and semantics are mapped onto the more regular ``action/target/data'' model.
The use of inflexion to denote semantic roles in a programming language offers an interesting variation from the ubiquity of positional syntax, replacing the requirement to recall syntactic rules with the requirement to remember suffixes. Which of these two tasks is easier will probably vary from programmer to programmer.
With the release of this module on the CPAN, the author looks forward to the advent of truly epic Perl poetry.
 
 
# Acknowledgements
Special thanks to John Crossley, Tom Christiansen, and Bennett Todd, for their invaluable feedback and suggestions. And my enduring gratitude to David Secomb and Deane Blackman for their patience in helping me struggle with the perplexities of the _lingua Romana_.
* * *
# Appendix A: Perligata dictionary
This appendix lists the complete Perligata vocabulary, except for Roman numerals (_I_, _II_, _III_, etc.).
In each of the following tables, the three columns are always the same: ``Perl construct'', ``Perligata equivalent'', ``Literal meaning of Perligata equivalent''.
Generally, only the accusative form is shown for nouns and adjectives, and only the imperative for verbs.
 
 
 
 
Table A1: Values and variables
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| 0 | _nullum_ | "no-one" |
| 1 | _unum_ | "one" |
| 2 | _duo_ | "two" |
| 3 | _tres_ | "three" |
| 4 | _quattuor_ | "four" |
| 5 | _quinque_ | "five" |
| 6 | _sex_ | "six" |
| 7 | _septem_ | "seven" |
| 8 | _octo_ | "eight" |
| 9 | _novem_ | "nine" |
| 10 | _decem_ | "ten" |
| 1/2 | _secundum_ | "second" |
| 1/3 | _tertium_ | "third" |
| 1/4 | _quartum_ | "fourth" |
| 1/5 | _quintum_ | "fifth" |
| 1/6 | _sextum_ | "sixth" |
| 1/7 | _septimum_ | "seventh" |
| 1/8 | _octavum_ | "eighth" |
| 1/9 | _nonum_ | "ninth" |
| 1/10 | _decimum_ | "tenth" |
| $1 | _parprimum_ | "equal of the first" |
| $2 | _parsecundum_ | "equal of the second" |
| $3 | _partertium_ | "equal of the third" |
| $4 | _parquartum_ | "equal of the fourth" |
| $5 | _parquintum_ | "equal of the fifth" |
| $6 | _parsextum_ | "equal of the sixth" |
| $7 | _parseptimum_ | "equal of the seventh" |
| $8 | _paroctavum_ | "equal of the eighth" |
| $9 | _parnonum_ | "equal of the ninth" |
| $10 | _pardecimum_ | "equal of the tenth" |
| $/ | _ianitorem_ | "gatekeeper" |
| $#var | _admeta_ | "measure out" |
| $_ | _hoc/huic_ | "this thing" |
| @_ | _his/horum_ | "these things" |
| ":" | _biguttam_ | "two spots" |
| " " | _lacunam_ | "a gap" |
| "\t" | _stadium_ | "a stride" |
| "\n" | _novumversum_ | "new line" |
| local | _loco_ | "in place of" |
| my | _meo_ | "my" |
| our | _nostro_ | "our" |
| main | _princeps_ | "principal" |
 
 
 
Table A2: Quotelike operators
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| '' | _inque_ | "say" |
| q// | _inque_ | "say" |
| m// | _compara_ | "match" |
| s/// | _substitue_ | "substitute" |
| tr/// | _converte_ | "translate" |
| y/// | _converte_ | "translate" |
 
 
 
Table A3: Mathematical operators and functions
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| + | _adde_ | "add" |
| - | _deme_ | "subtract" |
| - | _nega_ | "negate" |
| * | _multiplica_ | "multiply" |
| / | _divide_ | "divide" |
| % | _recide_ | "lop off" |
| ** | _eleva_ | "raise" |
| ++ | _preincresce_ | "increase beforehand" |
| ++ | _postincresce_ | "increase afterwards" |
| \-- | _predecresce_ | "decrease beforehand" |
| \-- | _postdecresce_ | "decrease afterwards" |
| abs | _priva_ | "strip from" |
| atan2 | _angula_ | "create an angle" |
| sin | _oppone_ | "oppose" |
| cos | _accuba_ | "lie beside" |
| int | _decolla_ | "behead" |
| log | _succide_ | "log a tree" |
| sqrt | _fode_ | "root into" |
| rand | _conice_ | "guess, cast lots" |
| srand | _prosemina_ | "to scatter seed" |
 
 
 
Table A4: Logical and comparison operators
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| ! | _non_ | "general negation" |
| && | _atque_ | "emphatic and" |
| \|\| | _vel_ | "emphatic or" |
| and | _-que_ | "and" |
| or | _-ve_ | "or" |
| < | _praestantiam_ | "precedence of" |
| lt | _praestantias_ | "precedences of" |
| <=> | _comparitiam_ | "comparison of" |
| cmp | _comparitias_ | "comparisons of" |
| == | _aequalitam_ | "equality of" |
| eq | _aequalitas_ | "equalities of" |
 
Table A5: Strings
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| chomp | _morde_ | "bite" |
| chop | _praecide_ | "cut short" |
| chr | _inde_ | "give a name to" |
| hex | _senidemi_ | "sixteen at a time" |
| oct | _octoni_ | "eight at a time" |
| ord | _numera_ | "number" |
| lc | _deminue_ | "diminish" |
| lcfirst | _minue_ | "diminish" |
| uc | _amplifica_ | "increase" |
| ucfirst | _amplia_ | "increase" |
| quotemeta | _excipe_ | "make an exception" |
| crypt | _huma_ | "inter" |
| length | _meta_ | "measure" |
| pos | _reperi_ | "locate" |
| pack | _convasa_ | "pack baggage" |
| unpack | _deconvasa_ | "unpack" |
| split | _scinde_ | "split" |
| study | _stude_ | "study" |
| index | _scruta_ | "search" |
| join | _coniunge_ | "join" |
| substr | _excerpe_ | "extract" |
 
 
 
Table A6: Scalars, arrays, and hashes
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| defined | _confirma_ | "verify" |
| undef | _iani_ | "empty, make void" |
| scalar | _secerna_ | "to distinguish, isolate" |
| reset | _lusta_ | "cleanse" |
| pop | _decumula_ | "unstack" |
| push | _cumula_ | "stack" |
| shift | _decapita_ | "behead" |
| unshift | _capita_ | "crown" |
| splice | _iunge_ | "splice" |
| grep | _vanne_ | "winnow" |
| map | _applica_ | "apply to" |
| sort | _digere_ | "sort" |
| reverse | _retexe_ | "reverse" |
| delete | _dele_ | "delete" |
| each | _quisque_ | "each" |
| exists | _adfirma_ | "confirm" |
| keys | _nomina_ | "name"  |
| values | _argue_ | "to disclose the contents" |
 
 
 
Table A7: I/O related
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| open | _evolute_ | "open a book" |
| close | _claude_ | "close a book" |
| eof | _extremus_ | "end of" |
| read | _lege_ | "read" |
| getc | _sublege_ | "pick up something" |
| <>/readline | _perlege_ | "read through" |
| print | _scribe_ | "write" |
| printf | _describe_ | "describe" |
| sprintf | _rescribe_ | "rewrite" |
| write | _subscribe_ | "write under" |
| format | _pinge_ | "paint" |
| formline | _distingue_ | "intersperse" |
| pipe | _inriga_ | "irrigate" |
| tell | _enunta_ | "tell" |
| seek | _conquire_ | "to seek out" |
| STDIN | _vestibulo_ | "an entrance" |
| STDOUT | _egresso_ | "an exit" |
| STDERR | _oraculo_ | "a place where doom is pronounced" |
| DATA | _fonti_ | "a well-spring" |
 
 
 
Table A8: Control
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| {...} | _sic...cis_ | "as follows...to here" |
| do | _fac_ | "do" |
| sub {...} | _factorem sic...cis_ | "one who does as follows...to here" |
| eval | _aestima_ | "evaluate" |
| exit | _exi_ | "exit" |
| for | _per_ | "for" |
| foreach | _per quisque_ | "for each" |
| goto | _adi_ | "go to" |
| if | _si_ | "if" |
| return | _redde_ | "return" |
| unless | _nisi_ | "if not" |
| until | _donec_ | "until" |
| while | _dum_ | "while" |
| wantarray | _deside_ | "want" |
| last | _ultimus_ | "final" |
| next | _posterus_ | "following" |
| redo | _reconatus_ | "trying again" |
| continue | _confectus_ | "complete" |
| die | _mori_ | "die" |
| warn | _mone_ | "warn" |
 
Table A9: Packages, classes, and modules
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| -> | _apud_ | "of the house of" |
| :: | _intra_ | "within" |
| bless | _benedice_ | "bless" |
| caller | _memora_ | "recount a history" |
| package | _domus_ | "house of" |
| ref | _agnosce_ | "identify" |
| tie | _liga_ | "tie" |
| tied | _exhibe_ | "display something" |
| untie | _solve_ | "to untie" |
| require | _require_ | "require" |
| use | _ute_ | "use" |
 
Table A10: System and filesystem interaction
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| chdir | _demigrare_ | "migrate" |
| chmod | _permitte_ | "permit" |
| chown | _vende_ | "sell" |
| fcntl | _modera_ | "control" |
| flock | _confluee_ | "flock together" |
| glob | _inveni_ | "search" |
| ioctl | _impera_ | "command" |
| link | _copula_ | "link" |
| unlink | _decopula_ | "unlink" |
| mkdir | _aedifica_ | "build" |
| rename | _renomina_ | "rename" |
| rmdir | _excide_ | "raze" |
| stat | _exprime_ | "describe" |
| truncate | _trunca_ | "shorten" |
| alarm | _terre_ | "frighten" |
| dump | _mitte_ | "drop" |
| exec | _commuta_ | "transform" |
| fork | _furca_ | "fork" |
| kill | _interfice_ | "kill" |
| sleep | _dormi_ | "sleep" |
| system | _obsecra_ | "entreat a higher power" |
| umask | _dissimula_ | "mask" |
| wait | _manta_ | "wait for" |
 
Table A11: Miscellaneous operators
 
| Perl construct | Perligata equivalent | Literal meaning |
| ----- | ----- | ----- |
| , | _tum_ | "and then" |
| . | _sere_ | "conjoin" |
| .. | _conscribe_ | "enlist" |
| `\` | _ad_ | "towards" |
| = | _da_ | "give" |
* * *
# Appendix B: Perligata grammar
        Script:           Statement(s)
        Statement:        ( Conditional | Imperative | Data | Target ) '.'
        Conditional:      Control Block
                        | Imperative Control
        Control:          'dum'   Data  'fac'                      # while
                        | 'donec' Data  'fac'                      # until
                        | 'si'    Data  'fac'                      # if
                        | 'nisi'  Data  'fac'                      # unless
                        | /per (quisque)?/ NOUN_ACCUSATIVE(?)
                                           'in' Target 'fac'       # foreach A (B)
        Block:            'sic'  Script  'cis'
        Imperative:       Data Verb Data
                        | Target Datalist Verb
                        | Datalist Target Verb
                        | Datalist Verb Target(?)
                        | Target Verb Datalist(?)
                        | Verb Target Datalist(?)
                        | Verb Datalist Target(?)
                        | Verb
        Target:           NOUN_DATIVE
                        | POSSESSIVE  NOUN_DATIVE               # my $A, local @B, etc.
                        | Block
                        | Data  Resultative_dative  Data
                        | Target  Datalist  Resultative_dative
                        | Datalist  Target  Resultative_dative
                        | Datalist  Resultative_dative  Target(?)
                        | Target  Resultative_dative  Datalist(?)
                        | Resultative_dative  Target  Datalist(?)
                        | Resultative_dative  Datalist(?)  Target(?)
                        | Target  'intra'  Accusative           # B::A
                        | Target  'apud'  Dative                # B->A
                        | Target NOUN_GENITIVE                  # B[A]
                        | 'factori' Block                       # sub {...}
        Accusative:       NOUN_ACCUSATIVE
                        | 'ad'  Accusative                      # \A
                        | Accusative  'intra'  Accusative       # B::A
                        | Accusative  'apud'  Accusative        # B->A
                        | Accusative  NOUN_GENITIVE             # B[A]
                        | 'factorem' Block                      # sub {...}
        Data:             Accusative
                        | 'nega'  Data                          # -A
                        | 'non'  Data                           # !A
                        | Data  Resultative_accusative  Data
                        | Target  Datalist  Resultative_accusative
                        | Datalist  Target  Resultative_accusative
                        | Datalist  Resultative_accusative  Target(?)
                        | Target  Resultative_accusative  Datalist(?)
                        | Resultative_accusative  Target  Datalist(?)
                        | Resultative_accusative  Datalist(?)  Target(?)
        Verb:             VERB
                        | Verb  'intra'  Accusative             # B::A()
                        | Verb  'apud'  Dative                  # B->A()
        Resultative_dative:
                          RESULTATIVE_DATIVE
                        | Resultative_dative  'intra'  Accusative
                        | Resultative_dative  'apud'  Dative
        Resultative_accusative:
                          RESULTATIVE_ACCUSATIVE
                        | Resultative_accusative  'intra'  Accusative
                        | Resultative_accusative  'apud'  Dative
        Datalist:         Datalist  'tum'  Data
                        | Data
[1]: mailto:damian%40csse.monash.edu.au
[2]: http://www.csse.monash.edu.au/~damian