Provisional documentation for lo

This is mainly so I don't forget the features I have added. The version number is now considered to be 1.13, but the code should be considered beta: I ran out of numbers at version 0.99, which I now consider the alpha code, since only the Tapestry group has had true access to it.

Version 1.x represents a complete rewrite of lo, mainly to clean up some outer-level constructs. I made a paradigm shift from approaching it as a buffered filter (line in; keep it for reference until a trigger; buffer out) to a fully buffered system (read a chunk into the buffer; process the buffer; output it). While this results in greater memory requirements, it is conceptually easier to program this way. Also, as lo 0.x was essentially my first significant Perl program, I tended to use +substr+ and +index+ constructs, whereas now, as a true believer, the model favours s///.
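The difference between the two designs can be sketched roughly as follows. This is an illustrative Python sketch, not lo's actual Perl code: the *text* emphasis markup and both function names are hypothetical, and the line filter is deliberately naive (it keeps no lines for reference). It only shows why a construct that spans a line break is easier to match once the whole text sits in one buffer.

```python
import re

# Hypothetical markup for illustration only: *text* becomes <em>text</em>.
EMPHASIS = re.compile(r"\*(.+?)\*", re.DOTALL)

def filter_by_line(lines):
    """Line-filter style: each line is transformed on its own."""
    return [EMPHASIS.sub(r"<em>\1</em>", line) for line in lines]

def filter_whole_buffer(lines):
    """Fully buffered style: join everything, substitute once, split again."""
    buffer = "\n".join(lines)
    buffer = EMPHASIS.sub(r"<em>\1</em>", buffer)
    return buffer.split("\n")

text = ["an *emphasised", "phrase* here"]
print(filter_by_line(text))       # the markup is left untouched
print(filter_whole_buffer(text))  # the markup is matched across the line break
```

The price is the one already noted: the whole chunk has to be held in memory before any output is produced.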

Parsing priority

In a fully buffered process, strict precedence by order of encountered lines may be relaxed somewhat by making multiple passes through the buffer.
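As a rough illustration of relaxed line precedence, the sketch below runs an ordered list of passes over a whole buffer, so that priority is decided by pass order rather than by which line comes first. It is written in Python rather than lo's Perl, and the three pass functions and the rules they apply are invented for the example; they are not lo's actual passes.

```python
import re

def pass_zero(buffer):
    # Hypothetical normalization rule: unify line endings.
    return buffer.replace("\r\n", "\n")

def pass_one(buffer):
    # Hypothetical first-pass rule: drop whole lines that start with '#'.
    return re.sub(r"(?m)^#.*\n?", "", buffer)

def pass_two(buffer):
    # Hypothetical second-pass rule: *text* becomes <em>text</em>.
    return re.sub(r"\*(.+?)\*", r"<em>\1</em>", buffer)

# Order in this list, not order of lines in the text, decides precedence.
PASSES = [pass_zero, pass_one, pass_two]

def process(buffer):
    for run_pass in PASSES:
        buffer = run_pass(buffer)
    return buffer

print(repr(process("*first*\r\n# a *comment* line\r\nlast\r\n")))
```

Here the '#' line on the second line is removed before the emphasis on the first line is ever looked at, which a strict line-at-a-time filter could not do.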

Zero buffer pass (buffer normalization)

This normalizes the buffer in certain ways to ease coding of pattern matching. Notably:

First buffer pass

The elements of the first pass will be seen to be essentially variations on a common theme. With one or two exceptions, directives for the first pass may be identified as single character escapes. This pass is modal and concerned, in rough order of ???, with:

The following text-level filtering is currently done in the second pass but conceptually belongs in this pass and will be moved here after debugging:


Escape literal sequences

Any character or digram (two-character sequence) may be escaped by placing a backslash \ directly before it. To place a literal backslash, use two: \\.

Most digrams may also be escaped by placing the backslash between the two characters, which effectively disables matches on the digram (e.g. [\;). However, backslashing the digram as a whole (e.g. \) is silently converted to the first form, so that works too.
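One common way to implement this kind of escaping in a fully buffered design is to hide each escaped character behind a placeholder before any pattern matching, and to restore it afterwards. The Python sketch below assumes that approach. It is not taken from lo's source, it handles only the backslash-before-a-character form, and the *text* rule and all function names in it are hypothetical.

```python
import re

def protect_escapes(buffer):
    """Replace each backslash-escaped character with a numbered placeholder."""
    saved = []

    def stash(match):
        saved.append(match.group(1))
        return "\x00%d\x00" % (len(saved) - 1)

    return re.sub(r"\\(.)", stash, buffer, flags=re.DOTALL), saved

def restore_escapes(buffer, saved):
    """Put the literal characters back in place of their placeholders."""
    return re.sub(r"\x00(\d+)\x00", lambda m: saved[int(m.group(1))], buffer)

def render(buffer):
    protected, saved = protect_escapes(buffer)
    # Hypothetical markup rule, applied only to text that was not escaped.
    protected = re.sub(r"\*(.+?)\*", r"<em>\1</em>", protected)
    return restore_escapes(protected, saved)

print(render(r"\*not emphasis\* but *this is*, and \\ is a backslash"))
```

Because \\ is stashed like any other escaped character, a doubled backslash comes out as a single literal backslash and never takes part in later matching.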

Long sequences of light may be escaped as a whole by use of the : directive as the only character of a line.

Long sequences of HTML may be escaped as a whole by use of the " directive as the only character of a line.

Note that escaping in this context means escaping the character as it will eventually be displayed. Thus "\>" will probably still be translated into &gt; so that it is displayed properly in the HTML browser.
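Put differently, a backslash protects a character from lo's own pattern matching but not from the final HTML encoding. A minimal Python sketch of that last step follows, using the standard html.escape function as a stand-in for whatever lo does internally (an assumption); the function name to_html_text is hypothetical.

```python
import html

def to_html_text(literal_text):
    """Encode characters that are special in HTML so they display as typed."""
    return html.escape(literal_text, quote=False)

# After escape handling, "\>" has been reduced to the literal character ">".
print(to_html_text("a > b & c"))  # prints: a &gt; b &amp; c
```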

Commenting conventions

The following commenting conventions are implemented:

Mode directives

The following single characters, when they are the only character on a line of text, toggle the processing mode:
  • : escape light directive
  • " escape html directive
  • line for line directive
  • full literal (preformatted) directive
  • + table directive
  • ? form directive
  • = code directive
  • & script directive
  • ! meta-directive (reserved, not implemented in this version)
  • . is reserved, for compatibility with mail etc.
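A modal pass of this kind might be sketched as below. This is Python rather than lo's Perl, the function name tag_modes is hypothetical, and only four of the directive characters from the list above are included. The sketch also assumes that modes are independent toggles and merely tags each line with the active modes instead of transforming it.

```python
# Four directive characters taken from the list above; the rest are omitted.
MODE_CHARS = {"+": "table", "?": "form", "=": "code", "&": "script"}

def tag_modes(lines):
    """Return (active_modes, line) pairs, consuming the toggle lines."""
    active = set()
    tagged = []
    for line in lines:
        if line in MODE_CHARS:
            # A directive that is alone on its line toggles its mode on or off.
            active ^= {MODE_CHARS[line]}
            continue
        tagged.append((frozenset(active), line))
    return tagged

sample = ["intro", "=", "x = 1", "=", "outro"]
for modes, line in tag_modes(sample):
    print(sorted(modes), line)
```

Note that "x = 1" is not mistaken for a directive, because a line only toggles a mode when the directive is its sole character.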
Import / export directives

The following characters (currently implemented only when they occur in column 1 of a line) indicate an import/export/pipe directive:

Second pass

Link creation shorthand

Summary

Most of lo is not specific to hypertext, but is more generally an aesthetically pleasant alternative to more rigid markup schemes. lo is designed to be intuitive, ergonomic, and forgiving.

Text is normalized and the most common text attributes are implemented

  • lists
  • block quotes
  • headings - rulers

A very literal-minded link shorthand is introduced, one which translates directly to the standard HTML HREF and IMG conventions.

Provisions are made for stream processing, including importing and exporting and the creation of dedicated filters to handle forms, tables, and frames.


lo version 0.94 patch 134/026
24 June 1997