LTPL : Loki Text Processing Language Manual
Table of Contents
1 Introduction
1.1 What Is This Exactly ?
LTPL is a declaritive text processing and manipulation utility that specializes in making common text processing tasks as easily and tersely as possible.
1.2 Who Is This For ?
This is a tool mainly for programmers and users who require ridgit formated text output. Specifically targeting those who would traditionally use awk to do so.
1.3 Why Does This Exist ?
I wanted a utility for text processing that was more flexible than Sed but more tearse than awk that that didnt contain the C-isms that was carried over as bagage from the Bell Labs Unix style thought process when designing awk.
1.4 What This Tool Is
1.4.1 It is a Utility For Text Manipulation
LTPL is more a utility along the lines of sed and awk than an actual language you would write in a file. It ment to be embedded in bash scripts and to be used either along side or be more versitle than more traditional tools.
1.5 What This Tool Is Not
1.5.1 It Is Not Complete Replacement For Awk.
I am only comparing my language periodically to awk as the closest parallel of the language that exists and is in relitive popular use today. I am aware that awk is a more traditional language with a much more familar syntax that is ment to write programs and scripts that may be run from a file or piped in from another program. however the purpose of LTPL is explicitly for reading in and displaying data.
1.5.2 It Is Not A Scripting Language
This tool is not for all intensive purposes is not a scrihpting language nor should it ever be used as one. LTPL is a tearse and simple text processing utility and nothing more
1.6 Why The Cryptic Syntax
LTPL's syntax and grammar is very simple aiming and being simple to use in a terminal enviroment. With this consideration taken a APL or Regular expression like syntax was considerd appropriate for a speedy interactive usage that I was trying to achieve.
2 Essentals
There are four core concepts exibit themselves in LTPL. The language its self is rather simple which makes it a very powerful tool for parsing text.
- Reader Directives
- Actions
- Objects
- Regular Expressions
2.1 Reader Directives
2.1.1 What Is A Reader Directive
A Reader Directive is a command to the intepreter that directs the manner in which the input file will be parsed.
There are two main Reader Directive types in LTPL.
- Parsing Directives
- Range Specifiers
- Parsing Directives
Parsing Directives which controls how the file is able to be read. there are a variety of different ways text can be formatted. It is not always appropriate to read file left to right.
and yes im sorry ltpl is 1 indexed throught the language, but there is a very good reason
Reader Directives Parse Description $FS Default Starting Cursor Position == parse left to right " " Space (1,1) | | column by column "\n" Newline (1,1) ^| column by column "\n" Newline (1,n) -- Read a single line " " Space (1,n) <= read right to left " " Space (n,1) where n is the number of elements in that row or column
Here is an example on how reader directives can be used with some formated input
$ ls -l drwxr-xr-x 2 user user 4096 Jan 20 19:42 Desktop drwxr-xr-x 2 user user 4096 Feb 4 00:36 Documents drwxr-xr-x 3 user user 4096 Feb 6 23:16 Downloads drwxr-xr-x 2 user user 4096 Jan 20 19:42 Music drwxr-xr-x 2 user user 4096 Jan 20 19:42 Pictures drwxr-xr-x 2 user user 4096 Jan 20 19:42 Public drwxr-xr-x 2 user user 4096 Jan 20 19:42 Templates drwxr-xr-x 2 user user 4096 Jan 20 19:42 Videos $ ls -l | ltpl "==$5[*:1024][p]." Output: 4194304 4194304 4194304 4194304 4194304 4194304 4194304 4194304
where we are getting the 5th element and multiplying it by 1024 and printing the output
Another way to achieve the same behavior but more efficently would be to do
$ ls -l drwxr-xr-x 2 user user 4096 Jan 20 19:42 Desktop drwxr-xr-x 2 user user 4096 Feb 4 00:36 Documents drwxr-xr-x 3 user user 4096 Feb 6 23:16 Downloads drwxr-xr-x 2 user user 4096 Jan 20 19:42 Music drwxr-xr-x 2 user user 4096 Jan 20 19:42 Pictures drwxr-xr-x 2 user user 4096 Jan 20 19:42 Public drwxr-xr-x 2 user user 4096 Jan 20 19:42 Templates drwxr-xr-x 2 user user 4096 Jan 20 19:42 Videos $ ls -l | ltpl "||$5[*:1024][p]." 4194304 4194304 4194304 4194304 4194304 4194304 4194304 4194304
The only difference here from the example above is the way that the interpter reads the information. instead of reading every single field starting with drwxr-xr-2 and ending when we find $5 which in this case is 4096. we can get entire columns of text just by reading by column.
Examples of the other reader directives being used can be found at …. (havent made a place for it yet)
- Range Specifiers
There are a lot of times we want to ommit certain places where we have junk in a file. By junk I dont really mean garbage in the sense that its not important but I mean that its not applicable for what we need.
you can achieve this by using Range Specifiers which controls what field the given lines of the input will be read.
A Position Specifier may be used in combitation with a Parser Directive to give more flexibilty to the user; detailing what subset of data of the input that will be read.
given the following syntax.
"||0,1"
which reads each column skipping the first column entirely
A practical application to get all of the numbers on line 5 would be
$ cat file.txt John Doe March 21st, 2022 John_The_Doe@hotmail.com (because I feel hotmail is funny) 10 20 30 40 50 hello good bye $ ltpl file.txt "--5,5[p]" 10 20 30 40 50
where it reads only one singular line of text at line 5 and prints it
2.2 Objects
2.2.1 What Are Objects ?
Objects in ltpl really are one of the most complex things in the entire language. but even though they are complex they are also rather simple in structure. Objects are hiearchical and able to be indexed quite simply as well.
\Tree [.S a [.NP b c ] d ] \Tree [.S a [ b c ].NP d ]
2.2.2 Object Primitives
Object Primitives are a what you would call literals in any other language but there is a quirk of all objects in ltpl. every object has children represented as numbers. By default you are adressing the children of $0 when refering to lines.
2.2.3 Fields And Implict Objects
- Fields
Fields are the way that LTPL treats columns of text that are seperated by the $FS implicit object.
- Implicit Objects
Name Description Type $0 The 0th field refering to the whole line of text. That contains an object array of Object Array $n The $nth field that refers to an object that is broken up into fields by the $fs Integer Object $FS the delimiting character(s) that designantes the seperation of new tokens by a user defined or LTPL specified token String Object $CL the current line being read $PL previous line read $NL the next line to be read. $NLR the number of lines that have been read. $NTR the number of lines that need to be read. $RED An object $GREEN $BLUE $BLACK $WHITE $CYAN $MAGENTA $YELLOW $PURPLE $PINK $ORANGE where n is the number of fields
2.2.4 User Defined Objects
Objects In LTPL are able to be created by referancing a nonexistant object by using the $. typing the example nonsensecal case "$bar" is a proper object declaration.
2.2.5 Assignement
Defining new variables along with reassigning existing ones are core parts of any programming language (except for the haskell purists out there) LTPL is no different but has a quirk to do so.
- Examples
LTPL example.txt "==$bar[10]."
LTPL example.txt "==$foo."
both of which are valid instances of objects where foo is assigned to an empty string by default and bar is assigned to 10
$ ls
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Desktop
drwxr-xr-x 2 user user 4096 Feb 4 00:36 Documents
drwxr-xr-x 3 user user 4096 Feb 6 23:16 Downloads
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Music
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Pictures
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Public
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Templates
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Videos
ls -l | LTPL "==$3[~=$3]$foo[p].
Output:
drwxr-xr-x 2 user 4096 Jan 20 19:42 Desktop
drwxr-xr-x 2 user 4096 Feb 4 00:36 Documents
drwxr-xr-x 3 user 4096 Feb 6 23:16 Downloads
drwxr-xr-x 2 user 4096 Jan 20 19:42 Music
drwxr-xr-x 2 user 4096 Jan 20 19:42 Pictures
drwxr-xr-x 2 user 4096 Jan 20 19:42 Public
drwxr-xr-x 2 user 4096 Jan 20 19:42 Templates
drwxr-xr-x 2 user 4096 Jan 20 19:42 Videos
2.2.6 Object Arrays And Subfields
Subfields and object arrays are more or less equvilent. with the only real difference is in what is being refered. to put it simply:
if it is a field it will be refered to as a subfield. if it is a user defined object it is refered to as an object array. The encomapssing term between the two is object array
- Refering To Object Arrays
Below is a dummy file with that we will parse.
Example.txt:
hello world this is a tjest.if you notice there is a spelling error
you can fix such a minute error like so
LTPL Example.txt "–$6$2[~=$6$2]$6$1[>>]$0[p]."
where $6 is the 6th field while refering to the 2nd object field. deleting the object in said field and moving the object on the left next to it over where the previous object resided
2.3 Actions
2.3.1 What Is An Action
An action is the primary enact changes to Objects. they are syntactically represented within [].
- Example
$ ls -l
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Desktop
drwxr-xr-x 2 user user 4096 Feb 4 00:36 Documents
drwxr-xr-x 3 user user 4096 Feb 6 23:16 Downloads
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Music
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Pictures
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Public
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Templates
drwxr-xr-x 2 user user 4096 Jan 20 19:42 Videos
ls -l | LTPL "==$6[p]"
Output:
Jan
Feb
Feb
Jan
Jan
Jan
Jan
Jan
2.3.2 Actions Predefined
2.3.3 Why Can't I Define My Own Actions
well there is a simple answer to that. LTPL is not a scripting language. If you feel you need to define your own actions to make a certain action easier. you should look at some other language. consider using AWK or perl. heck sed can be useful in some circumstances.
2.3.4 Possible Actions
Name | Symbolic Name | Description | Possible Arguments | Examples | program description | |
---|---|---|---|---|---|---|
p | Prints an object to stdout | p -red -green -blue | none | —$1[p 255,0,0]. | |||
write | w | writes objects to a file | w -filename | -filename | —$0[w file.txt]. | ||
filter | ~ | removes if condition is true | ~ -logical operator -object | —$0[~=10][p]. | ||
cast to type | -> | given an object it converts it to the type of a given object | -> -object | —$1[-> 10][p] | ||
ternary | ? | does the next action if true the other if false | ? - logical operator object | —$1[? = 10]Success[p 0 255,0]fail[p 255,0,0]. | ||
italic | i | none | —$0[i][p]. | |||
bold | b | none | ||||
underline | _ | underlines an object | none | —$0[_][p]. | ||
highlight | # | highlights an object | none | —$0[ | ||
shift down line | VV | shfits an object down into the line below it. | ||||
shift up line | ^^ | |||||
Swap lines | ||||||
Move Right | >> | shifts an object right by one field replacing the object that inhabited that location | none | —$1[>>]$2[p]. | ||
Move Left | << | shifts an object left by one field replacing the object that inhabited that location | none |
3 Implentation Details
4 Examples
5 Benchmarks
ure Ideas