uCalc API Version: 2.1.3-preview.2 Released: 6/16/2026

Warning

uCalc API Preview Release Notice: The documentation describes the intended behavior of the API. The current preview build contains incomplete features and unoptimized performance, and is subject to breaking changes.

Tokens = [Tokens]

Property

Product: Transformer Library

Class: Transformer

Provides access to the collection of token definitions, allowing for dynamic customization of the transformer's lexical rules.

Remarks

The Tokens property is your gateway to the Transformer's lexical analysis engine. It returns the Tokens object that defines how an input string is broken down into fundamental units (tokens) before being processed by pattern-matching rules.

Why Customize Tokens?

Modifying the default token set allows you to extend the Transformer's syntax to support new language constructs. Common use cases include:

  • Defining Custom Literals: Add support for C-style hexadecimal (0x...) or binary (0b...) numbers.
  • Extending Identifiers: Change the rules for what constitutes a "word", for example, to allow characters like - or $ in identifiers.
  • Creating Custom Comment Styles: Implement single-line (//...), multi-line (/*...*/), or other language-specific comment formats to be ignored by your rules via SkipOver.
  • Adapting to Data Formats: Tailor the tokenizer to a specific format like CSV or a custom log file structure.
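Before registering a new token pattern, it can help to check the regex on its own. The following is a minimal sketch using the C++ standard library rather than uCalc; the hexadecimal and binary patterns and the helper names `IsWholeMatch` and `CheckCandidatePatterns` are assumptions for illustration, while the hyphenated-word and comment patterns are the ones used in the examples on this page. uCalc's regex dialect may differ from `std::regex` in places.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True when the whole string is a single match for the pattern.
inline bool IsWholeMatch(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// Candidate token patterns. The first two are assumptions for illustration,
// not uCalc defaults; the last two appear in the examples on this page.
const std::string kHexLiteral = "0[xX][0-9a-fA-F]+";
const std::string kBinaryLiteral = "0[bB][01]+";
const std::string kHyphenatedWord = "[a-zA-Z0-9-]+";
const std::string kLineComment = "//.*";

// Quick sanity checks for each candidate pattern.
inline void CheckCandidatePatterns() {
    assert(IsWholeMatch(kHexLiteral, "0xFF"));
    assert(!IsWholeMatch(kHexLiteral, "0xG1"));
    assert(IsWholeMatch(kBinaryLiteral, "0b1010"));
    assert(!IsWholeMatch(kBinaryLiteral, "0b102"));
    assert(IsWholeMatch(kHyphenatedWord, "special-identifier"));
    assert(IsWholeMatch(kLineComment, "// this in a comment"));
}
```

Once a pattern behaves as expected in isolation, it can be passed to the Tokens API as shown in the examples further down.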

Token Precedence (LIFO)

By default, tokens are evaluated in a Last-In, First-Out (LIFO) order. The most recently added token is checked first, giving it the highest precedence. This is crucial for resolving ambiguity. For example, you should define specific keywords (like if or else) after you define the general-purpose alphanumeric token to ensure they are matched first.
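The effect of LIFO ordering can be sketched with a small stand-alone tokenizer. This is an illustration under stated assumptions, not uCalc's implementation: `TokenRule`, `Token`, `LifoTokenizer`, and `DemoLifoPrecedence` are hypothetical names, and the sketch uses `std::regex` from the C++ standard library. Rules are stored in the order they were added and tried newest-first at each position.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not uCalc types.
struct TokenRule {
    std::string name;
    std::regex pattern;
};

struct Token {
    std::string name;  // name of the rule that produced the token
    std::string text;  // matched characters
};

class LifoTokenizer {
public:
    // Appends a rule; rules added later are tried before earlier ones.
    void Add(const std::string& name, const std::string& pattern) {
        rules_.push_back(TokenRule{name, std::regex(pattern)});
    }

    std::vector<Token> Tokenize(const std::string& input) const {
        std::vector<Token> tokens;
        auto pos = input.cbegin();
        while (pos != input.cend()) {
            bool matched = false;
            // Walk the rules newest-first (LIFO).
            for (auto rule = rules_.rbegin(); rule != rules_.rend(); ++rule) {
                std::smatch m;
                // match_continuous anchors the match at the current position.
                if (std::regex_search(pos, input.cend(), m, rule->pattern,
                                      std::regex_constants::match_continuous) &&
                    m.length(0) > 0) {
                    tokens.push_back(Token{rule->name, m.str(0)});
                    pos += m.length(0);
                    matched = true;
                    break;
                }
            }
            if (!matched) ++pos;  // skip a character that no rule covers
        }
        return tokens;
    }

private:
    std::vector<TokenRule> rules_;
};

// Usage: the keyword rule is added last, so it wins over the general rule.
inline void DemoLifoPrecedence() {
    LifoTokenizer tk;
    tk.Add("whitespace", "[ \\t]+");
    tk.Add("alphanumeric", "[a-zA-Z_][a-zA-Z0-9_]*");
    tk.Add("keyword", "(if|else)\\b");

    const auto tokens = tk.Tokenize("if x else iffy");
    assert(tokens[0].name == "keyword");       // "if"
    assert(tokens[2].name == "alphanumeric");  // "x"
    assert(tokens[4].name == "keyword");       // "else"
    assert(tokens[6].name == "alphanumeric");  // "iffy" is not a keyword
}
```

If the keyword rule were added before the alphanumeric rule instead, the general rule would be tried first and would claim `if` and `else` as ordinary words.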

💡 Why uCalc? (Comparative Analysis)

  • vs. Static Lexer Generators (e.g., ANTLR, Lex/Flex): Traditional compiler tools require an external toolchain, a separate grammar file, and a code generation/build step, so the token set is fixed at build time. uCalc's token engine is dynamic and programmatic: you can add, remove, or modify token definitions at runtime through the Tokens API, without external tools or recompilation. This makes it well suited to adaptable, user-extensible DSLs.

  • vs. Manual Regex & String Splitting: Manually parsing input with string functions and regular expressions is complex and error-prone. Standard regex struggles with nested structures (like parentheses) and context (distinguishing an operator from a character inside a string). uCalc's tokenizer is structured, context-aware, and handles these challenges automatically.
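The context problem mentioned above (an operator versus the same character inside a string) can be shown with a short stand-alone sketch. It uses only the C++ standard library, not uCalc, and the function names are hypothetical. A naive character count sees every `+`, while a token-based scan consumes each quoted string as a single unit and so never inspects the `+` inside it.

```cpp
#include <algorithm>
#include <cassert>
#include <regex>
#include <string>

// Naive approach: counts every '+', including those inside string literals.
inline int CountPlusNaive(const std::string& s) {
    return static_cast<int>(std::count(s.begin(), s.end(), '+'));
}

// Token-based approach: the input is split into quoted strings, '+' signs,
// runs of other characters, and stray quotes. A quoted string is one token,
// so a '+' inside it is never counted as an operator.
inline int CountPlusOperators(const std::string& s) {
    static const std::regex token(R"re("[^"]*"|\+|[^"+]+|")re");
    int count = 0;
    for (std::sregex_iterator it(s.begin(), s.end(), token), end; it != end; ++it) {
        if (it->str() == "+") ++count;
    }
    return count;
}

// Usage: the second '+' sits inside a string literal.
inline void DemoContextAwareCount() {
    const std::string expr = "a + \"b + c\" + d";
    assert(CountPlusNaive(expr) == 3);
    assert(CountPlusOperators(expr) == 2);
}
```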

Examples

Succinct: Adds a C-style single-line comment token and categorizes it as whitespace to be ignored by other rules.
				
using uCalcSoftware;

var uc = new uCalc();
var t = new uCalc.Transformer();
t.FromTo("this", "THAT");

string text = "transform this but not // this in a comment";

Console.WriteLine("--- Before --- ");
// Initially, the comment is treated as regular text.
Console.WriteLine(t.Transform(text));

// Add a token for C-style comments and classify it as whitespace.
t.Tokens.Add("//.*", TokenType.Whitespace);

Console.WriteLine("");
Console.WriteLine("--- After --- ");
// Re-run the transform. The comment is now ignored.
Console.WriteLine(t.Transform(text));
				
			
--- Before --- 
transform THAT but not // THAT in a comment

--- After --- 
transform THAT but not // this in a comment
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   uCalc::Transformer t;
   t.FromTo("this", "THAT");

   string text = "transform this but not // this in a comment";

   cout << "--- Before --- " << endl;
   // Initially, the comment is treated as regular text.
   cout << t.Transform(text) << endl;

   // Add a token for C-style comments and classify it as whitespace.
   t.Tokens().Add("//.*", TokenType::Whitespace);

   cout << "" << endl;
   cout << "--- After --- " << endl;
   // Re-run the transform. The comment is now ignored.
   cout << t.Transform(text) << endl;
}
				
			
--- Before --- 
transform THAT but not // THAT in a comment

--- After --- 
transform THAT but not // this in a comment
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      Dim t As New uCalc.Transformer()
      t.FromTo("this", "THAT")
      
      Dim text As String = "transform this but not // this in a comment"
      
      Console.WriteLine("--- Before --- ")
      '// Initially, the comment is treated as regular text.
      Console.WriteLine(t.Transform(text))
      
      '// Add a token for C-style comments and classify it as whitespace.
      t.Tokens.Add("//.*", TokenType.Whitespace)
      
      Console.WriteLine("")
      Console.WriteLine("--- After --- ")
      '// Re-run the transform. The comment is now ignored.
      Console.WriteLine(t.Transform(text))
   End Sub
End Module
				
			
--- Before --- 
transform THAT but not // THAT in a comment

--- After --- 
transform THAT but not // this in a comment
Practical: Modifies the default alphanumeric token to include hyphens, allowing it to match hyphenated identifiers as single words.
				
using uCalcSoftware;

var uc = new uCalc();
var t = new uCalc.Transformer();
t.FromTo("{@Alpha:word}", "[{word}]");
string text = "id-E123 is a special-identifier.";

Console.WriteLine("--- Before --- ");
// By default, 'id-E123' is tokenized as three separate parts: 'id', '-', and 'E123'.
Console.WriteLine(t.Transform(text));

// Get the alphanumeric token item by its name and modify its regex.
var alphaToken = t.Tokens["_token_alphanumeric"];
alphaToken.Regex = "[a-zA-Z0-9-]+";

Console.WriteLine("");
Console.WriteLine("--- After --- ");
// Now, hyphenated words are matched as single alphanumeric tokens.
Console.WriteLine(t.Transform(text));
				
			
--- Before --- 
[id]-[E123] [is] [a] [special]-[identifier].

--- After --- 
[id-E123] [is] [a] [special-identifier].
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   uCalc::Transformer t;
   t.FromTo("{@Alpha:word}", "[{word}]");
   string text = "id-E123 is a special-identifier.";

   cout << "--- Before --- " << endl;
   // By default, 'id-E123' is tokenized as three separate parts: 'id', '-', and 'E123'.
   cout << t.Transform(text) << endl;

   // Get the alphanumeric token item by its name and modify its regex.
   auto alphaToken = t.Tokens()["_token_alphanumeric"];
   alphaToken.Regex("[a-zA-Z0-9-]+");

   cout << "" << endl;
   cout << "--- After --- " << endl;
   // Now, hyphenated words are matched as single alphanumeric tokens.
   cout << t.Transform(text) << endl;
}
				
			
--- Before --- 
[id]-[E123] [is] [a] [special]-[identifier].

--- After --- 
[id-E123] [is] [a] [special-identifier].
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      Dim t As New uCalc.Transformer()
      t.FromTo("{@Alpha:word}", "[{word}]")
      Dim text As String = "id-E123 is a special-identifier."
      
      Console.WriteLine("--- Before --- ")
      '// By default, 'id-E123' is tokenized as three separate parts: 'id', '-', and 'E123'.
      Console.WriteLine(t.Transform(text))
      
      '// Get the alphanumeric token item by its name and modify its regex.
      Dim alphaToken = t.Tokens("_token_alphanumeric")
      alphaToken.Regex = "[a-zA-Z0-9-]+"
      
      Console.WriteLine("")
      Console.WriteLine("--- After --- ")
      '// Now, hyphenated words are matched as single alphanumeric tokens.
      Console.WriteLine(t.Transform(text))
   End Sub
End Module
				
			
--- Before --- 
[id]-[E123] [is] [a] [special]-[identifier].

--- After --- 
[id-E123] [is] [a] [special-identifier].
Transformer: Matching by tokens vs match by character; also whitespace sensitivity
				
using uCalcSoftware;

var uc = new uCalc();
// This example shows the default match by
// token mode, as well as how to reconfigure
// it in order to do match by character
// along with a whitespace variation

var t = uc.NewTransformer();
var txt = "This is an island test, I said.";
t.FromTo("is", "<is>");

Console.WriteLine(t.Transform(txt));
Console.WriteLine("");

t.Tokens.Description = "Match by character";
t.Tokens.Add("."); // This overrides existing tokens
t.FromTo("is", "<is>");
Console.WriteLine(t.Tokens.Description);
Console.WriteLine(t.Transform(txt));
Console.WriteLine("");

// Note: whitespace sensitivity is off by default
// Whitespace token is re-introduced
// (after being overridden in the previous Token Add())
t.Tokens.Description = "By char + whitespace ignored";
t.Tokens.Add("[\\t\\v ]+", TokenType.Whitespace);
t.FromTo("is", "<{@Self}>");
Console.WriteLine(t.Tokens.Description);
Console.WriteLine(t.Transform(txt));
				
			
This <is> an island test, I said.

Match by character
Th<is> <is> an <is>land test, I said.

By char + whitespace ignored
Th<is> <is> an <is>land test, <I s>aid.
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   // This example shows the default match by
   // token mode, as well as how to reconfigure
   // it in order to do match by character
   // along with a whitespace variation

   auto t = uc.NewTransformer();
   auto txt = "This is an island test, I said.";
   t.FromTo("is", "<is>");

   cout << t.Transform(txt) << endl;
   cout << "" << endl;

   t.Tokens().Description("Match by character");
   t.Tokens().Add("."); // This overrides existing tokens
   t.FromTo("is", "<is>");
   cout << t.Tokens().Description() << endl;
   cout << t.Transform(txt) << endl;
   cout << "" << endl;

   // Note: whitespace sensitivity is off by default
   // Whitespace token is re-introduced
   // (after being overridden in the previous Token Add())
   t.Tokens().Description("By char + whitespace ignored");
   t.Tokens().Add("[\\t\\v ]+", TokenType::Whitespace);
   t.FromTo("is", "<{@Self}>");
   cout << t.Tokens().Description() << endl;
   cout << t.Transform(txt) << endl;
}
				
			
This <is> an island test, I said.

Match by character
Th<is> <is> an <is>land test, I said.

By char + whitespace ignored
Th<is> <is> an <is>land test, <I s>aid.
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      '// This example shows the default match by
      '// token mode, as well as how to reconfigure
      '// it in order to do match by character
      '// along with a whitespace variation
      
      Dim t = uc.NewTransformer()
      Dim txt = "This is an island test, I said."
      t.FromTo("is", "<is>")
      
      Console.WriteLine(t.Transform(txt))
      Console.WriteLine("")
      
      t.Tokens.Description = "Match by character"
      t.Tokens.Add(".") '// This overrides existing tokens
      t.FromTo("is", "<is>")
      Console.WriteLine(t.Tokens.Description)
      Console.WriteLine(t.Transform(txt))
      Console.WriteLine("")
      
      '// Note: whitespace sensitivity is off by default
      '// Whitespace token is re-introduced
      '// (after being overridden in the previous Token Add())
      t.Tokens.Description = "By char + whitespace ignored"
      t.Tokens.Add("[\t\v ]+", TokenType.Whitespace)
      t.FromTo("is", "<{@Self}>")
      Console.WriteLine(t.Tokens.Description)
      Console.WriteLine(t.Transform(txt))
   End Sub
End Module
				
			
This <is> an island test, I said.

Match by character
Th<is> <is> an <is>land test, I said.

By char + whitespace ignored
Th<is> <is> an <is>land test, <I s>aid.
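For readers more familiar with plain regular expressions, the three outputs above can be approximated with `std::regex_replace`. This is only an analogy under stated assumptions, not how uCalc works internally: word boundaries stand in for token-level matching, a bare pattern stands in for character-level matching, and an optional whitespace run plus case-insensitive matching stands in for the whitespace token being skipped. The function names are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Approximates match-by-token: "is" must be a whole word.
inline std::string MatchByToken(const std::string& text) {
    return std::regex_replace(text, std::regex(R"(\bis\b)"), "<is>");
}

// Approximates match-by-character: "is" may appear anywhere.
inline std::string MatchByChar(const std::string& text) {
    return std::regex_replace(text, std::regex("is"), "<is>");
}

// Approximates match-by-character with whitespace ignored between characters.
// "$&" inserts the matched text, similar in spirit to {@Self} above.
inline std::string MatchByCharIgnoringWhitespace(const std::string& text) {
    return std::regex_replace(
        text, std::regex(R"(i\s*s)", std::regex::icase), "<$&>");
}

// Usage: reproduces the three output lines shown above.
inline void DemoMatchModes() {
    const std::string txt = "This is an island test, I said.";
    assert(MatchByToken(txt) == "This <is> an island test, I said.");
    assert(MatchByChar(txt) == "Th<is> <is> an <is>land test, I said.");
    assert(MatchByCharIgnoringWhitespace(txt) ==
           "Th<is> <is> an <is>land test, <I s>aid.");
}
```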
Returns the default list of tokens (index, description, name, regex) in a transformer
				
using uCalcSoftware;

var uc = new uCalc();
var t = uc.NewTransformer();
Console.WriteLine($"Token Count: {t.Tokens.Count}");
Console.WriteLine("");
Console.WriteLine("Index  Type  Name: regex");
Console.WriteLine("========================");

foreach(var token in t.Tokens) {
   Console.Write(t.Tokens.IndexOf(token));
   Console.WriteLine($"  {token.Description}  {token.Name}: {token.Regex}");
}
				
			
Token Count: 27

Index  Type  Name: regex
========================
0  generic  _token_line: .*
1  generic  _token_catchall: .
2  generic  _token_catchall_utf8_other: [\xf0-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]
3  generic  _token_punctuation: (--|\.{3}|\xE2\x80\xA6|[!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~]|\xE2\x80[\x90-\x95])
4  generic  _token_quotechar: ("){3}|"|'
5  generic  _token_quotechar_single: '
6  generic  _token_quotechar_double: "
7  generic  _token_quotechar_tripledouble: """
8  memberaccess  _token_memberaccess: \.
9  generic  _token_variableargs: \.\.\.
10  reducible  _token_reducible2: [-:|+/*^&=%@!`\\<>?#$~]+
11  bracket  _token_parenthesis: \(
12  bracketclose  _token_parenthesis_close: \)
13  bracket  _token_curlybrace: \{
14  bracketclose  _token_curlybrace_close: \}
15  bracket  _token_squarebracket: \[
16  bracketclose  _token_squarebracket_close: \]
17  argseparator  _token_argseparator: ,
18  statementseparator  _token_newline: (?:\r?\n)|\r
19  statementseparator  _token_semicolon: ;
20  literal  _token_string_singlequoted: '([^']*(?:''[^']*)*)'
21  literal  _token_string_doublequoted: "([^"]*(?:""[^"]*)*)"
22  literal  _token_string_tripledoublequoted: """([\s\S]*?)"""
23  whitespace  _token_whitespace: [\t\v ]+
24  reducible  _token_reducible: [-:|+/*^&=%@!`\\<>?]+
25  literal  _token_floatnumber: [0-9]*\.?[0-9]+([eE][+-]?[0-9]+)?
26  alphanumeric  _token_alphanumeric: [a-zA-Z_][a-zA-Z0-9_]*
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   auto t = uc.NewTransformer();
   cout << "Token Count: " << t.Tokens().Count() << endl;
   cout << "" << endl;
   cout << "Index  Type  Name: regex" << endl;
   cout << "========================" << endl;

   for(auto token : t.Tokens()) {
      cout << t.Tokens().IndexOf(token);
      cout << "  " << token.Description() << "  " << token.Name() << ": " << token.Regex() << endl;
   }
}
				
			
Token Count: 27

Index  Type  Name: regex
========================
0  generic  _token_line: .*
1  generic  _token_catchall: .
2  generic  _token_catchall_utf8_other: [\xf0-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]
3  generic  _token_punctuation: (--|\.{3}|\xE2\x80\xA6|[!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~]|\xE2\x80[\x90-\x95])
4  generic  _token_quotechar: ("){3}|"|'
5  generic  _token_quotechar_single: '
6  generic  _token_quotechar_double: "
7  generic  _token_quotechar_tripledouble: """
8  memberaccess  _token_memberaccess: \.
9  generic  _token_variableargs: \.\.\.
10  reducible  _token_reducible2: [-:|+/*^&=%@!`\\<>?#$~]+
11  bracket  _token_parenthesis: \(
12  bracketclose  _token_parenthesis_close: \)
13  bracket  _token_curlybrace: \{
14  bracketclose  _token_curlybrace_close: \}
15  bracket  _token_squarebracket: \[
16  bracketclose  _token_squarebracket_close: \]
17  argseparator  _token_argseparator: ,
18  statementseparator  _token_newline: (?:\r?\n)|\r
19  statementseparator  _token_semicolon: ;
20  literal  _token_string_singlequoted: '([^']*(?:''[^']*)*)'
21  literal  _token_string_doublequoted: "([^"]*(?:""[^"]*)*)"
22  literal  _token_string_tripledoublequoted: """([\s\S]*?)"""
23  whitespace  _token_whitespace: [\t\v ]+
24  reducible  _token_reducible: [-:|+/*^&=%@!`\\<>?]+
25  literal  _token_floatnumber: [0-9]*\.?[0-9]+([eE][+-]?[0-9]+)?
26  alphanumeric  _token_alphanumeric: [a-zA-Z_][a-zA-Z0-9_]*
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      Dim t = uc.NewTransformer()
      Console.WriteLine($"Token Count: {t.Tokens.Count}")
      Console.WriteLine("")
      Console.WriteLine("Index  Type  Name: regex")
      Console.WriteLine("========================")
      
      For Each token In t.Tokens
         Console.Write(t.Tokens.IndexOf(token))
         Console.WriteLine($"  {token.Description}  {token.Name}: {token.Regex}")
      Next
   End Sub
End Module
				
			
Token Count: 27

Index  Type  Name: regex
========================
0  generic  _token_line: .*
1  generic  _token_catchall: .
2  generic  _token_catchall_utf8_other: [\xf0-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]
3  generic  _token_punctuation: (--|\.{3}|\xE2\x80\xA6|[!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~]|\xE2\x80[\x90-\x95])
4  generic  _token_quotechar: ("){3}|"|'
5  generic  _token_quotechar_single: '
6  generic  _token_quotechar_double: "
7  generic  _token_quotechar_tripledouble: """
8  memberaccess  _token_memberaccess: \.
9  generic  _token_variableargs: \.\.\.
10  reducible  _token_reducible2: [-:|+/*^&=%@!`\\<>?#$~]+
11  bracket  _token_parenthesis: \(
12  bracketclose  _token_parenthesis_close: \)
13  bracket  _token_curlybrace: \{
14  bracketclose  _token_curlybrace_close: \}
15  bracket  _token_squarebracket: \[
16  bracketclose  _token_squarebracket_close: \]
17  argseparator  _token_argseparator: ,
18  statementseparator  _token_newline: (?:\r?\n)|\r
19  statementseparator  _token_semicolon: ;
20  literal  _token_string_singlequoted: '([^']*(?:''[^']*)*)'
21  literal  _token_string_doublequoted: "([^"]*(?:""[^"]*)*)"
22  literal  _token_string_tripledoublequoted: """([\s\S]*?)"""
23  whitespace  _token_whitespace: [\t\v ]+
24  reducible  _token_reducible: [-:|+/*^&=%@!`\\<>?]+
25  literal  _token_floatnumber: [0-9]*\.?[0-9]+([eE][+-]?[0-9]+)?
26  alphanumeric  _token_alphanumeric: [a-zA-Z_][a-zA-Z0-9_]*