Builtin functions (M0 0.3 macro processor)

5 Builtin functions

5.1 Introduction

The builtin functions are not immediately accessible after startup. At startup they must first be linked to a macro name using the 0_define macro. This is the only macro available at startup.

The arguments should be placed on the first stack (stack a) by using a pattern. The pattern 0 is defined at startup, and its syntax is used in the examples below. The first argument is placed at position 1 on the stack; position 0 holds the name of the macro.

In the descriptions below, arguments in square brackets ([...]) are optional. The arguments are not shown in a fixed call syntax, because the syntax is determined by the pattern you use.

5.2 Functions for defining macros

The define function is probably the most complex one to start with, but since it is the first one to be used, it is best to understand it first.

builtin: define name [definition] [builtin function] macro-settings [argument-pattern] [substitute-pattern] [program-instructions] [macro-set]

This function defines a new macro. The 0_define macro can be used for this at startup. For an overview of how the options for the different parts work, see The complete working of macros.

The name is the name of the new macro. It must be given and have at least one character. The name can consist of any character or byte, or a sequence of characters or bytes. Characters can be grouped with square brackets (left bracket [ and right bracket ] are the default) to form a set at a specific position in the name. For example:

[Nn]ame

means the word "name" starting with a lowercase or uppercase n. Special characters defined with specialc (see Functions for setting symbols) can also be used to define sets or strings.
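To make the set notation concrete, here is a minimal Python sketch that lists every literal string a name with simple sets would match. It is only a model of the notation described above, not m0's matcher: the function name expand_name is hypothetical, and ranges, special characters and repetition operators are deliberately left out.

```python
from itertools import product

def expand_name(name, open_ch="[", close_ch="]"):
    """Return every literal string matched by a name with simple character sets."""
    positions = []  # one list of candidate characters per position in the name
    i = 0
    while i < len(name):
        if name[i] == open_ch:
            end = name.index(close_ch, i + 1)  # raises ValueError if the set is unclosed
            positions.append(list(name[i + 1:end]))
            i = end + 1
        else:
            positions.append([name[i]])
            i += 1
    return ["".join(chars) for chars in product(*positions)]

print(expand_name("[Nn]ame"))  # ['Name', 'name']
```

The defaults for open_ch and close_ch mirror the default set delimiters; they are parameters here because m0 lets them be changed.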

The name can have a maximum of 64 positions. This limit is imposed by the implementation and cannot be overcome without substantial changes to the program.

The [definition] replaces the macro name. The [definition] is normally used in user macros and normally not when linking to builtin macros. However it can be used with builtin macros to output a text different from the output of the builtin. Special codes or words (e.g. $1 as in m4) can be used to fill in the arguments of the macro.

The [builtin function] determines the builtin function to be called. This function will be called irrespective of the other options and settings. The pattern for collecting arguments and / or the program should have placed all the necessary arguments on the stack for the correct execution of the builtin function.

The macro-settings should be set, otherwise the default (nn00) will be used. This default is normally not very useful. The macro-settings is a string of a specific length with characters at a specific position having a specific meaning. The positions in the string have the meanings:

1

Is the definition recursively processed (r) or not (n or anything other than r) by the macro processor.

2

Is recursion for macro recognition during processing of the arguments active (r) or not (n or anything other than r).

3

Number (hexadecimal; 0 – f) defining the pre-size of the macro name.

4

Number (hexadecimal) defining the post-size of the macro name or the character "S".

The "S" sets the post-size to the length of the macro minus the pre-size. This can be used to let a macro detect the start of a pattern and include the macro itself in the argument processing, especially when using variable-length macros. In this case, the macro name in argument 0 is as if the post-size is zero.

A warning:

Using the "S" makes the macro have a length of zero when processing the text. This has the consequence that aborting the argument processing will lead to an infinite loop of detecting the macro again. This could be overcome by setting a proper virtual character.

5

Optional virtual character. This character is output to the algorithm but not to the output. This can be used if a pre-size is set in macros so that these macros can be recognised after a preceding macro. Mostly for emulation of macros as words surrounded by white space. An example of the difference (see the difference at the "rr00"):

# A simple define.
0_define:Define:;;define;nr01;0;;stack_d "" putarg3 "rr00"
putarg4 "0" putarg5 "" putarg6 "" putarg7 

Define:test;Hello World!
Define:aa;te
Define:bb;st

aabb

# A simple define with virtual char.
0_define:Define_:;;define;nr01;0;;stack_d "" putarg3 "rr00 "
putarg4 "0" putarg5 "" putarg6 "" putarg7 

Define_:cc;te
Define_:dd;st

ccdd

This example will result in:

# A simple define.


Hello World!

# A simple define with virtual char.


test

An example of macro-settings:

nr01

The meaning of this is: no macro recursion, recursion during argument processing, no (0) pre-size, post-size of 1.
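The positional layout of macro-settings can be sketched in Python as follows. This is a hedged model of the description above, not code from m0: the function name and the dictionary keys are illustrative, and the error handling for short strings is an assumption.

```python
def parse_macro_settings(settings="nn00"):
    """Decode a macro-settings string into a dict (field names are illustrative)."""
    if len(settings) < 4:
        raise ValueError("expected at least 4 characters")
    post = settings[3]
    return {
        "recurse_definition": settings[0] == "r",  # anything other than "r" means no
        "recurse_arguments": settings[1] == "r",
        "pre_size": int(settings[2], 16),
        # "S" means: post-size = length of the macro minus the pre-size
        "post_size": "S" if post == "S" else int(post, 16),
        "virtual_char": settings[4] if len(settings) > 4 else None,
    }

print(parse_macro_settings("nr01"))
```

Calling it with "rr00 " (note the trailing space) would report a space as the virtual character, which corresponds to the second Define_ example above.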

The [argument-pattern] is the name of the pattern to be used for collecting arguments.

The [substitute-pattern] is the name of the pattern to be used for substitution of special characters with the arguments in the definition. If this option is not used the default argument substitution is used. If this option exists but the definition string is empty, then the first argument is used in place of the definition. This can be used for macros such as the format macro of m4.

The [program-instructions] is a program that is executed just after the arguments are collected and before the builtin function is executed or argument substitution takes place. This can be used to set arguments to fixed values.

The optional [macro-set] is the name of a set of macros that will be used to define the macro in. If this option is not set, the macro is defined in the current active set of macros. The default macro set name at start is 0. If the macro set does not yet exist, it will be initialised.

The only macro 0_define available at the start using the pattern 0 for argument collection is defined as:

0_define:0_define:;;define;nr01;0

This definition has the following parts:

0_define: (the second one): The name of the macro. It includes the start (:) of the argument collection.
define: Is the builtin function.
nr01: The macro-settings meaning: no macro recursion (not necessary since no output), recursion during argument processing, no (0) pre-size, post-size of 1 (this is the : of the argument collection).
0: The pattern (named 0) for argument collection.

Output: none.

builtin: push name [definition] [builtin function] macro-settings [argument-pattern] [substitute-pattern] [program-instructions] [macro-set]

Similar to define, but saves the previous definition. The previous definition can be retrieved with pop. It redefines the user macro name using the same arguments as used in define.

Output: none.

builtin: undefine name [macro-set]

Deletes the macro name. The macro does not exist anymore after this.

If the [macro-set] is used then the macro is deleted from this macro set. If [macro-set] is not defined, then the current macro set is used. If the macro set named in [macro-set] does not exist, an error is output.

Output: none.

builtin: pop name [macro-set]

Retrieves the previous definition of the macro name.

If the [macro-set] is used then the macro is retrieved using this macro set. If [macro-set] is not defined, then the current macro set is used. If the macro set named in [macro-set] does not exist, an error is output.

Output: none.
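The relation between define, push and pop can be pictured as a stack of definitions per macro name. The Python sketch below is a toy model under stated assumptions, not m0's data structure: the class and method names are hypothetical, define is assumed to overwrite only the current definition, and popping the last remaining definition is assumed to leave the macro undefined.

```python
class MacroTable:
    """Toy model: each name maps to a stack of definitions, newest last."""

    def __init__(self):
        self._defs = {}

    def define(self, name, definition):
        stack = self._defs.setdefault(name, [])
        if stack:
            stack[-1] = definition  # overwrite the current definition
        else:
            stack.append(definition)

    def push(self, name, definition):
        # keep the previous definition underneath the new one
        self._defs.setdefault(name, []).append(definition)

    def pop(self, name):
        stack = self._defs[name]
        stack.pop()  # drop the current definition
        if not stack:
            del self._defs[name]  # assumption: nothing left means undefined

    def lookup(self, name):
        stack = self._defs.get(name)
        return stack[-1] if stack else None


table = MacroTable()
table.define("greet", "hello")
table.push("greet", "hi")
print(table.lookup("greet"))  # hi
table.pop("greet")
print(table.lookup("greet"))  # hello
```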

builtin: copy from-name to-name [from macro-set] [to macro-set]

Copies the macro or mcall with from-name to the macro with to-name.

If the [from macro-set] is used then the macro is retrieved from this macro set. If the [to macro-set] is used then the macro is copied to this macro set. If the [from macro-set] or the [to macro-set] is not defined, then the current macro set is used.

If the macro set named in [from macro-set] or [to macro-set] does not exist, an error is output.

Output: none.

builtin: pushcopy from-name to-name [from macro-set] [to macro-set]

Similar to copy, but saves the previous definition of the to-name macro. The previous definition can be retrieved with pop.

Output: none.

builtin: set_info name info [macro-set]

With this function a special information text can be set in a macro. This information text can be retrieved with the function info by outputting the ninth argument.

The name is the name of the macro for which to set the information text.

The info is the information text.

If the [macro-set] is used then the macro from this macro set is used. If [macro-set] is not defined, then the current macro set is used. If the macro set named in [macro-set] does not exist, an error is output.

Output: none.

5.3 Macro sets

Macros are defined in a set of macros. This makes it possible to have different macros active at different times while processing an input, without having to split the input and process the parts separately.

Only one set of macros can be active at a time. Macro sets are therefore not flexible enough to activate or deactivate individual macros.

The number of macro sets has no impact on normal performance. There is however more memory used to store the macro sets.

builtin: macroset macro-set

Switches to the macro set with name macro-set. This macro set must already exist, otherwise this is an error and the macro set is not changed.

Only macros available in the new macro set can be used after switching. The function define can for example be used to define macros in a macro set.

Output: none.

builtin: defmcall name [called-macro] [called macro-set] mcall-settings argument-pattern [program-instructions] [macro-set]

Defines a macro that will call a macro (mcall) in a macro set. The mcall enables the call to a macro (or an mcall) in another macro set without first switching to this macro set.

The mcall is similar to a macro, except that it calls another macro instead of a builtin function or a definition. It calls the macro with the name in [called-macro] or, when this is empty, the macro named in the first argument on the stack when the mcall is invoked. It will collect the arguments using the argument-pattern and run the [program-instructions] before the called macro is executed.

The called macro will not perform argument collection since this has already been done by the mcall. This has the implication that the syntax for argument collection as defined in the mcall (by the argument-pattern) is used and not the syntax as defined in the called macro.

Apart from the argument collection, the called macro will execute all the defined settings in the macro set of the called macro (the [called macro-set]). This means that if macros are used in the definition string of the called macro, these macros should exist in the called macro set if they should expand there and not expand in the current macro set.

The name is the name of the new macro call. It must be given and follows the same rules as a macro name.

The [called-macro] is the name of the called macro. The definition of the called macro at the time of the mcall is used. The [called-macro] should be written in a form that would normally be recognized in the processed text. If the [called-macro] is not defined, the first argument on the stack is used as the called macro name. If the called macro does not exist an error will result.

If the first argument on the stack is used as the called macro name, the stack is adapted so that the called macro will see the arguments shifted by one. The first argument for the called macro is the second argument for the mcall. For the called macro, its name is on position zero of the stack.
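The shift described here can be sketched in a few lines of Python. This is a model of the stack as a plain list, with position 0 holding the name; the function name and the sample values are hypothetical.

```python
def shift_for_called_macro(stack):
    """stack[0] is the mcall name, stack[1] names the macro to call."""
    if len(stack) < 2:
        raise ValueError("no argument available to name the called macro")
    return stack[1:]  # the called macro's name lands on position 0

mcall_stack = ["call", "greet", "world", "again"]
print(shift_for_called_macro(mcall_stack))  # ['greet', 'world', 'again']
```

In the result, "world" is argument 1 for the called macro although it was argument 2 for the mcall.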

The called macro is expanded in the [called macro-set]. If the [called macro-set] is not defined the current macro set is used as the macro set for the called macro. If the called macro set does not exist, it will be initialised.

The mcall-settings is the same as the macro-settings in define and should be set. The recursive processing of the definition can be overridden by the called macro using the instruction recur_y or recur_n.

The [argument-pattern] is the name of the pattern to be used for collecting arguments in the same way as for macros.

The [program-instructions] is a program that is executed just after the arguments are collected in the same way as for macros.

The optional [macro-set] is the name of a set of macros that will be used to define the macro call in. If this option is not set, the macro call is defined in the current active set of macros. If the macro set does not yet exist, it will be initialised.

The undefine function can be used to delete the mcall.

Output: none.

The following figure is a symbolic overview of the working of mcall when the [called-macro] holds the name of a macro:

in:  | | | |m|a|c|r|o| |n|a|m|e|(|a|r|g|u|m|e|n|t|s|)| | |
           +-------------------+-+-------------------+
                macro name     : :     arguments
           +-+                 +-+            
         pre-size            post-size     |
                               : :         v
                               : :  
                               | | | | | | | | | | | | |
                               +-+---+-----+-+----------
                           pattern for collecting arguments +
                                     small programs
    
                                           |
                                           |          argument n
                                           +----->    ----------
                                                         ...
       optional program for stack manipulation -->    ----------
                                                      argument 3
                                                      ----------
                                                      argument 2
   [name of macro] --->  called macro  <---           ----------  
                                                      argument 1  
                             |                        ----------
                             v    
          
out: | | | | |m|a|c|r|o| |o|u|t|p|u|t| | | | | | | | | | |
           +-+---------------------------+
           +-+   replacement of macro
       pre-size

The following figure is a symbolic overview of the working of mcall when the [called-macro] is empty:

in:  | | | |m|a|c|r|o| |n|a|m|e|(|a|r|g|u|m|e|n|t|s|)| | |
           +-------------------+-+-------------------+
                macro name     : :     arguments
           +-+                 +-+            
         pre-size            post-size     |
                               : :         v
                               : :  
                               | | | | | | | | | | | | |
                               +-+---+-----+-+----------
                           pattern for collecting arguments +
                                     small programs
                                           |
                                           |          argument n
                                           +----->    ----------
                                                         ...
       optional program for stack manipulation -->    ----------
                                                         ...
                                                      ----------
                    ...               <---               ...
              --------------                          ----------
              new argument 2          <---            argument 3
              --------------                          ----------
              new argument 1          <---            argument 2
              --------------                          ----------  
              new argument 0  <-+- name of macro <--- argument 1  
                     |          |                     ----------
                     v          |                      
               called macro   <-+
                     |
                     v    
out: | | | | |m|a|c|r|o| |o|u|t|p|u|t| | | | | | | | | | |
           +-+---------------------------+
           +-+   replacement of macro
       pre-size

builtin: pshmcall name [called-macro] [called macro-set] mcall-settings argument-pattern [program-instructions] [macro-set]

Push macro call is similar to defmcall, but saves the previous definition. The previous definition can be retrieved with pop. It redefines the macro call name using the same arguments as used in defmcall.

Output: none.

5.4 Functions for defining patterns

builtin: pattern name pattern-part program-instructions [priority]

This function adds pattern-part to the pattern named name. A pattern with this name is automatically initialised if it has not been used before.

See Patterns for the possibilities of the pattern-part.

The program to be executed when the pattern part is triggered is defined in program-instructions. Normally every pattern part has a program, but it can be left out.

The [priority] is used in a special case. Normally a pattern will not be triggered if at the same position a macro is triggered. This is based on the logic that a macro will change the text and that at this position no pattern should also be triggered with the old input. The [priority] can overrule this behaviour by setting a priority of 1 or larger.

An example of a special case where this is used is the pattern for argument collection in the m4 emulation:

0_pattern:m4;[(,][@00- ]*[!-@ff];stack_a begin-1 ;1

This pattern part sets the start of an argument after optional whitespace ([@00- ]*) and has to include a first non-whitespace character ([!-@ff]). It sets the start of the argument directly before the non-whitespace character (the begin-1 instruction). The non-whitespace character can, however, also be the start of a macro, like the quote macro. In this case both the pattern part and the macro should be triggered.

Output: none.

builtin: append_p name pattern-part program-instructions [priority]

This function appends pattern-part to the previous pattern part in the pattern named name. It has the same arguments as pattern.

The pattern-part matching will only start when the previously defined pattern-part has triggered.

Output: none.

builtin: clr_pat name

Clears an existing pattern named name.

The pattern still exists after this function call, but it is empty. This can be used to redefine a syntax.

Output: none.

builtin: copy_pat from to

Copies an existing pattern named from to to.

This can be used to reuse a pattern and extend it further in a new pattern.

Output: none.

builtin: program name program-instructions

Defines a program that can be called from other programs.

The name should have a maximum of 8 characters. This is the name that is used to call this program using @name. The program should already be defined before the call instruction can be used, because the instructions are converted into internal codes when a program is defined. If the program is not yet known, then it is not possible to convert the call to an internal code.

Output: none.

5.5 Functions for setting symbols

builtin: specialc special-character string

This function defines a special character (special-character) that expands to string in the name of macros or in pattern-parts.

This can be used as a shorthand for longer strings that have to be written repeatedly. The special character is used by preceding it with "\" in macro names or pattern parts. This preceding character can be set using the charpat function.

The special-character is a single character and can have any byte value.

Redefining the special character has no influence on earlier uses of it in macro names and pattern parts.

Output: none.

builtin: charpat pattern-settings

This function defines the characters with a special meaning used in macro names and pattern parts.

The pattern-settings is a string of a specific length (9 characters) with characters at a specific position having a specific meaning. The positions in the string have the meanings:

1: Is the character used as opening brace for character sets. This is "[" by default.
2: Is the character used as closing brace for character sets. This is "]" by default.
3: Is the character used to indicate range in a character set. This is "-" by default.
4: Is the character used to indicate a special character. This is "\" by default.
5: Is the character used to input a character by hexadecimal ASCII code. This is "@" by default.
6: Is the character used for the one or more characters in a character set. This is "+" by default.
7: Is the character used for the zero or more characters in a character set. This is "*" by default.
8: Is the character used for the zero or one character in a character set. This is "?" by default.
9: Is the character used for a one time trigger of a character set. This is "~" by default.

The default pattern-settings is thus: []-\@+*?~
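As a reading aid, the nine positions can be decoded with a small Python sketch. The field names are invented for illustration and do not appear in m0; only the order of the positions and the default string come from the text above.

```python
CHARPAT_FIELDS = (
    "set_open", "set_close", "range", "special", "hex",
    "one_or_more", "zero_or_more", "zero_or_one", "one_time",
)

def parse_charpat(settings="[]-\\@+*?~"):
    """Map each of the 9 positions of pattern-settings to an illustrative name."""
    if len(settings) != len(CHARPAT_FIELDS):
        raise ValueError("pattern-settings must be exactly 9 characters")
    return dict(zip(CHARPAT_FIELDS, settings))

defaults = parse_charpat()
print(defaults["special"], defaults["hex"])  # \ @
```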

Output: none.


builtin: chararg argument-settings variable-quote-start variable-quote-end variable-separator [macro-set]

This function is used to define the characters for the argument substitution when using the default argument substitution. Only a single character can be selected for each role. This is similar to m4.

The argument-settings is a string of a specific length (5 or 6 characters) with characters at a specific position having a specific meaning. The positions in the string have the meanings:

1: This is a hexadecimal number for the amount of digits after an argument character. A number from 1 – 5 indicates the maximum number of digits is 1 – 5 with a minimum of 1 digit. A number from a – d sets the number of digits to exactly 2 – 5. The remaining numbers 0, 6 – 9, e and f are reserved.
2: Is the character used for arguments in the definition. This is $ by default. The number after this character in the definition is the entry of the stack.
3: Is the character used to output all arguments. This character is put after the first character in a definition. This is * by default.
4: Is the character used to output all arguments with quotes. This character is put after the first character in a definition. This is @ by default.
5: Is the character used to output the number of arguments. This character is put after the first character in a definition. This is # by default.
6: Is the optional character used for arguments from the other stacks in the definition. This is $ by default. An a – f after this character in the definition selects the stack. The number after this is the entry of the stack. If this is not set, the arguments from the other stacks can not be selected.
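The encoding of position 1 is the least obvious part of argument-settings, so here is a hedged Python sketch of it. The function name is hypothetical, and treating uppercase A – D as reserved is an assumption, since the text only mentions lowercase a – d.

```python
def digit_limits(code):
    """Return (min_digits, max_digits) for position 1 of argument-settings."""
    if len(code) != 1:
        raise ValueError("expected a single character")
    if code in "12345":
        return 1, int(code)  # at least 1 digit, at most the given number
    if code in "abcd":
        exact = int(code, 16) - 8  # a..d (10..13) -> exactly 2..5 digits
        return exact, exact
    raise ValueError(f"{code!r} is reserved")

print(digit_limits("1"))  # (1, 1)
print(digit_limits("c"))  # (4, 4)
```

With the first character of the default argument-settings, 1, an argument character is followed by exactly one digit in this model.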

The variable-quote-start, variable-quote-end and variable-separator are the numbers of the variables where the characters or strings of the start of a quote, the end of a quote and the separator of arguments are stored. See also the function set_var in Functions for using variables and system information for defining these characters or strings.

The optional [macro-set] is the name of a set of macros that will be used to set the argument characters for. If this option is not set, the argument characters are set in the current active set of macros. The default macro set name at start is 0. If the macro set does not yet exist, it will be initialised.

The default is defined as:

0_define:0_chars_args:;;chararg;nr01;0

0_chars_args:1$*@#$;1;2;3

Output: none.

5.6 String functions

builtin: strindex string substring

This function returns the position of substring in the string.

The first position has number 0.

If the substring is not found the number -1 is returned.

Output: position of the substring in string or -1.
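The described behaviour matches Python's str.find, so a one-line model is enough. This is a sketch of the semantics only; how m0 treats an empty substring is not stated above and is not modelled.

```python
def strindex(string, substring):
    """Zero-based position of substring in string, or -1 if it is absent."""
    return string.find(substring)

print(strindex("hello world", "world"))  # 6
print(strindex("hello world", "xyz"))    # -1
```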

builtin: substr string from [length]

Expands to the substring of string, which starts at from, and extends for [length] characters, or to the end of string, if [length] is omitted. The starting index of a string is 0.

Output: the substring
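A minimal Python model of substr, assuming non-negative values for from and [length] (the text does not say what negative values do). The parameter is called start here because from is a reserved word in Python.

```python
def substr(string, start, length=None):
    """Substring from a zero-based start, optionally limited to length characters."""
    if length is None:
        return string[start:]
    return string[start:start + length]

print(substr("hello world", 6))     # world
print(substr("hello world", 0, 5))  # hello
```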

builtin: strtrans string chars [replacement]

Translates the characters in chars in the string to corresponding characters in [replacement]. Corresponding characters have the same position in chars and [replacement].

If a character in chars has no corresponding character in [replacement] it is deleted.

Also a range of characters can be given, e.g.: a-z or z-a.

Output: translated string
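The translation rules, including ascending and descending ranges, can be sketched in Python as below. This is a model under assumptions, not m0's implementation: the helper names are hypothetical, a "-" at either end of chars is treated as a literal character, and when a character appears twice in chars the first occurrence is assumed to win.

```python
def expand_ranges(chars):
    """Expand ranges such as a-z or z-a; a '-' at either end stays literal."""
    out = []
    i = 0
    while i < len(chars):
        if i + 2 < len(chars) and chars[i + 1] == "-":
            lo, hi = ord(chars[i]), ord(chars[i + 2])
            step = 1 if lo <= hi else -1
            out.extend(chr(c) for c in range(lo, hi + step, step))
            i += 3
        else:
            out.append(chars[i])
            i += 1
    return "".join(out)

def strtrans(string, chars, replacement=""):
    """Map each character in chars to the one at the same position in replacement."""
    src, dst = expand_ranges(chars), expand_ranges(replacement)
    table = {}
    for pos, ch in enumerate(src):
        # no counterpart in replacement means the character is deleted
        table.setdefault(ch, dst[pos] if pos < len(dst) else "")
    return "".join(table.get(ch, ch) for ch in string)

print(strtrans("hello", "a-z", "A-Z"))  # HELLO
print(strtrans("hello", "l"))           # heo
```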

builtin: num2chr number [number] ... [number]

This function returns the character with the ASCII code number. The number must therefore be at least 0 and at most 255. If the number is outside this range, a space is output.

Multiple [number] can be used and result in a string with the ASCII characters of the numbers.

Output: single character or string of characters
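A short Python model of num2chr follows. Note that it builds a Python text string, whereas m0 works on bytes, so values from 128 to 255 are only an approximation here.

```python
def num2chr(*numbers):
    """Join the characters for the given codes; out-of-range codes give a space."""
    return "".join(chr(n) if 0 <= n <= 255 else " " for n in numbers)

print(num2chr(72, 105))         # Hi
print(repr(num2chr(65, 300)))   # 'A '
```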

builtin: num2str number [radix] [width]

Converts a number to a string.

By default the number is output as decimal. It can also be output in another base by setting [radix]. The [radix] can be 2 – 36. The radix prefix is not in the output.

The minimum width is set by [width]. The number is padded with zeros.

A custom output can be made by using a definition string and using outputs placed in arguments. The arguments that can be used for this are:

4: The radix prefix.
5: Positive binary output. The number is treated as unsigned.
6: A minus sign if the number is negative.
7: The leading zeros.
8: String of the number as in the output without leading zeros or minus sign.

Output: string of number
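The default output (without a custom definition string) can be modelled in Python as shown below. Several details are assumptions because the text does not settle them: digits above 9 are written in lowercase, and the minus sign is not counted in the width.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def num2str(number, radix=10, width=0):
    """Convert an integer to a string in the given radix, zero-padded to width."""
    if not 2 <= radix <= 36:
        raise ValueError("radix must be in 2..36")
    sign = "-" if number < 0 else ""
    n = abs(number)
    digits = ""
    while True:
        n, rem = divmod(n, radix)
        digits = DIGITS[rem] + digits
        if n == 0:
            break
    # assumption: the minus sign is not counted in the width
    return sign + digits.rjust(width, "0")

print(num2str(255, 16))   # ff
print(num2str(5, 2, 8))   # 00000101
```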

5.7 Functions for using variables and system information

A set of variables to store strings is available. They can be used to store information without using macros. They are also used to store the characters or strings of multiple arguments and quoting for the default argument substitution function.

These variables are referenced by a number from 1 to 255. Negative numbers are used to get certain system or process information.

builtin: set_var number string

Sets the variable with number to the value of string. The number can be from 1 to 255.

Output: none.

builtin: get_var number

Outputs a string from the variable with number.

The number can be:

1 - 255: Gets the string from the variable with number.
-1: Gets the diversion number.
-2: Gets the string of the operating system type e.g. unix.
-3: Gets the current global line number. This counts all new lines starting from the first input file.
-4: Gets the current local line number. This counts all new lines in the current input file.
-5: Gets the current input file name.
-6: Gets the program name.
-7: Gets a string with all the options after the -- on the command line.
-8: Reserved.
-9: Gets the return value from the last command executed by the function shell.

Output: the selected string.

5.8 Functions for getting macro information

builtin: ifmacro? name [yes] [no] [macro-set]

This function checks if the macro with name exists in the macro set [macro-set]. If the macro exists the string in [yes] is output otherwise the string in [no] is output.

If [macro-set] is not defined, then the current macro set is used. If the macro set named in [macro-set] does not exist, an error is output.

Output: the string in [yes] or in [no].

builtin: info name [macro-set]

This function outputs the definition string of the macro with name or the called macro in case of an mcall with name defined in the macro set [macro-set].

If [macro-set] is not defined, then the current macro set is used. If the macro set named in [macro-set] does not exist, an error is output.

The other settings of the macro are put in arguments. They can be used for a customized output by using a definition. The arguments that can be used for this are:

2: Definition string or called macro.
3: The builtin function or macro set of the called macro.
4: macro-settings
5: Pattern for argument collection.
6: Pattern for argument substitution.
7: The optional program.
8: Macro set of the macro.
9: The information text of the macro (see also the function set_info). If the information text is not set, this is "macro" if the macro is a macro or "mcall" if the macro is a macro call.

Output: definition or called macro of name

5.9 Functions for input and output

Diversions are temporary memory buffers for storing the output. The diversions can be retrieved at a later time or are automatically retrieved at the end of the input files.

If a diversion is set, the output that would go to the normal output is stored in a buffer. Thus if a diversion is set multiple times inside a macro, only the last setting of the diversion has an influence on the output.

The memory buffers have a diversion number of one or larger. Diversion number 0 means the normal output. At the end of the input files the remaining diversions are automatically output in ascending order.

builtin: divert [number]

Sets the diversion to [number]. If the [number] is not given, it will be by default 0, the normal output.

A negative [number] will discard all output.

Output: none

builtin: undivert [number] [number]

The undivert will empty the diversion with [number] into the output. Multiple [number]s can be given to undivert. If no number is given, all diversions are undiverted.

The contents of the diversion can still be recursed for macros if the macro-settings is properly set.

Output: diversion buffer
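The interplay of divert and undivert can be pictured with the following Python sketch. It is a simplified model, not m0's implementation: the class and method names are hypothetical, undiverted text is assumed to go straight to the normal output, and recursion of the undiverted text for macros is not modelled.

```python
class Diversions:
    """Toy model of output diversions; diversion 0 is the normal output."""

    def __init__(self):
        self.current = 0
        self.buffers = {}  # diversion number -> list of text chunks
        self.output = []   # the normal output

    def divert(self, number=0):
        self.current = number

    def write(self, text):
        if self.current < 0:
            return  # a negative diversion discards all output
        if self.current == 0:
            self.output.append(text)
        else:
            self.buffers.setdefault(self.current, []).append(text)

    def undivert(self, *numbers):
        # without numbers, every diversion is emptied in ascending order
        for n in (numbers or sorted(self.buffers)):
            # assumption: the text is sent to the normal output
            self.output.extend(self.buffers.pop(n, []))

    def finish(self):
        """End of input: flush the remaining diversions and return the output."""
        self.undivert()
        return "".join(self.output)


d = Diversions()
d.write("a")
d.divert(2)
d.write("c")
d.divert(1)
d.write("b")
d.divert(-1)
d.write("discarded")
d.divert()
print(d.finish())  # abc
```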

builtin: include file

The include will read the file and output its contents.

The macro-settings of the definition will (like every macro) determine if the contents of the file is recursed for macro recognition or not. Using a definition or definition fill pattern does not seem useful.

If file does not exist an error is output and program execution is stopped.

Output: contents of file.

builtin: sinclude file

Same as include but will fail silently if the file does not exist.

Output: contents of file.

builtin: at_last string

Puts the string in a buffer that will be appended to the input after the end of the last file. This is similar to the divert function, but the buffer is automatically undiverted and used for macro recognition.

Each call appends its string after the previously stored strings. The buffer thus works like a FIFO, in the same way as the divert function.

Output: none.

builtin: shell command

Executes the command in a shell. The command is executed in a shell with the stdin of m0 as input and the stdout is redirected as input to m0.

Output: the output of stdout of the command.

builtin: tempfile template

Creates a temporary file using a template. The template should have six X (e.g. tmp/docXXXXXX) at the end for placement of a random string. If these X are not in the template, the template will be expanded to hold the random string with six characters.

The file is created and left empty. The function will output the name of the temporary file for further use.

Output: name of the temporary file.

5.10 Functions for handling of errors

builtin: errprint message [message]

This function outputs message, followed by any further optional [message] arguments, to standard error.

Output: none

builtin: exit errorcode

This will exit the m0 program directly with an errorcode. The errorcode should be between 0 and 255.

Output: none

5.11 Some functions are missing?

Several functions that exist in e.g. m4 are not available in m0. Examples are:

Conditionals like if else.
Arithmetic like decr to decrement a value or eval to evaluate a mathematical expression.
Quoting and comments.
The shift macro.
String functions like len for the length of a string or format to output formatted strings.

These functions are not available because they can be implemented using macros, patterns and programs. Examples of these functions can be found in M4 emulation example.
