uCalc API Version: 2.1.3-preview.2 Released: 6/16/2026

Warning

uCalc API Preview Release Notice: The documentation describes the intended behavior of the API. The current preview build contains incomplete features, has unoptimized performance, and is subject to breaking changes.

Introduction

Product: Transformer Library

Class: Tokens

Manages the collection of lexical rules (tokens) that define how the parser breaks input strings into meaningful units.

Remarks

⚙️ The Tokens Class: The Lexical Engine

The Tokens class is the configuration object for uCalc's lexical analyzer, or tokenizer. It is the heart of what makes uCalc a token-aware engine, providing a powerful and dynamic alternative to traditional character-based tools like regular expressions. This collection defines the set of rules that the parser uses to break a raw input string into a stream of meaningful units—such as words, numbers, operators, and string literals—before any pattern matching or evaluation occurs.


🧠 The Core of Token-Awareness

The most significant advantage of uCalc's parsing model is its foundation in tokenization. This provides a structural understanding of text that character-based tools lack.

  • Regex is character-aware: It sees rate = "rate" as a flat stream of characters, so a search-and-replace aimed at the identifier rate can also rewrite the text inside the string literal.
  • uCalc is token-aware: The Tokens collection defines rules that first identify rate as an Alphanumeric token, = as a Reducible (operator) token, and "rate" as a single, atomic Literal (string) token. This structural awareness prevents parsing and transformation logic from accidentally corrupting the content of string literals or user-defined constructs like comments.
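
The contrast in the two bullets above can be sketched without uCalc at all. The following is a minimal illustration using only the C++ standard library; the function names `renameNaive` and `renameTokenAware` are hypothetical, and the three-rule tokenizer is a simplification that only mimics the idea, not uCalc's actual token set or implementation. It assumes `from` is a plain identifier with no regex metacharacters.

```cpp
#include <regex>
#include <string>

// Character-aware approach: replace the word wherever it appears,
// including inside string literals.
// Assumption: `from` is a plain identifier (no regex metacharacters).
std::string renameNaive(const std::string& src, const std::string& from,
                        const std::string& to) {
    return std::regex_replace(src, std::regex("\\b" + from + "\\b"), to);
}

// Token-aware approach: split the input into tokens first, then rewrite
// only those tokens whose full text equals the identifier.
std::string renameTokenAware(const std::string& src, const std::string& from,
                             const std::string& to) {
    // Three simplified token rules: a double-quoted string literal,
    // an identifier, or any other single character.
    static const std::regex token(
        R"re("[^"]*"|[A-Za-z_][A-Za-z0-9_]*|[\s\S])re");
    std::string out;
    for (std::sregex_iterator it(src.begin(), src.end(), token), end;
         it != end; ++it) {
        const std::string text = it->str();
        // A string literal token includes its quotes, so it never equals `from`.
        out += (text == from) ? to : text;
    }
    return out;
}
```

Given the input rate = "rate" and a rename from rate to price, `renameNaive` yields price = "price", while `renameTokenAware` yields price = "rate" because the quoted text is consumed as one atomic token.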

🚀 Dynamic and Configurable at Runtime

Unlike static parser generators like ANTLR or Flex/Bison, where token rules are defined in external grammar files and compiled into the application, uCalc's token set is a live, programmable object. Using the methods on this class, you can add, remove, or modify token definitions at runtime, without any external tools or recompilation. This allows your application to adapt its own syntax on the fly, a key feature for building user-configurable Domain-Specific Languages (DSLs).
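
To make the idea of a live, programmable token set concrete, here is a small standalone sketch in standard C++. `MiniLexer` and `Rule` are hypothetical names invented for this illustration; they are not part of the uCalc API, and the matching strategy (newest rule tried first, single-character fallback) is an assumption chosen for simplicity rather than a description of uCalc internals.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical runtime-editable lexer; an illustration, not the uCalc implementation.
struct Rule {
    std::string name;
    std::regex pattern;
    bool skip;  // true: matched text is discarded, like a whitespace token
};

class MiniLexer {
public:
    // Adds a rule while the program is running; no grammar file or rebuild involved.
    void add(const std::string& name, const std::string& pattern, bool skip = false) {
        rules_.push_back(Rule{name, std::regex(pattern), skip});
    }

    // Removes every rule with the given name.
    void remove(const std::string& name) {
        for (auto it = rules_.begin(); it != rules_.end();) {
            it = (it->name == name) ? rules_.erase(it) : it + 1;
        }
    }

    // Splits input into token texts, trying the newest rule first at each position.
    std::vector<std::string> tokenize(const std::string& input) const {
        std::vector<std::string> out;
        auto pos = input.cbegin();
        while (pos != input.cend()) {
            bool matched = false;
            for (auto r = rules_.rbegin(); r != rules_.rend() && !matched; ++r) {
                std::smatch m;
                // match_continuous anchors the match at the current position.
                if (std::regex_search(pos, input.cend(), m, r->pattern,
                                      std::regex_constants::match_continuous) &&
                    m.length(0) > 0) {
                    if (!r->skip) out.push_back(m.str(0));
                    pos += m.length(0);
                    matched = true;
                }
            }
            if (!matched) {
                // No rule applies: emit one character as its own token.
                out.push_back(std::string(1, *pos));
                ++pos;
            }
        }
        return out;
    }

private:
    std::vector<Rule> rules_;
};
```

With rules for numbers, operators, and skipped spaces, the input 10 + 5 // Add 5 produces eight tokens, including // and the individual letters of Add. Calling `add("comment", "//.*", true)` afterwards reduces the result to 10, +, and 5, and `remove("comment")` restores the earlier behavior, all on the same object.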

🥇 Precedence is Key (LIFO)

Token definitions are evaluated in a Last-In, First-Out (LIFO) order. The most recently added token definition is checked first, giving it the highest precedence. This is crucial for resolving ambiguity. For example, to correctly parse a language with keywords, you should define the specific keywords (like if, else) after you define the general-purpose identifier token ({@Alphanumeric}). This ensures that if is matched as a keyword, not just as a generic word.
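
The effect of definition order can be shown with a short standalone sketch in standard C++. `RuleList` is a hypothetical stand-in written for this illustration, not the uCalc Tokens class; it only models the rule that the most recently added definition is checked first.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a token collection; not the uCalc Tokens class.
class RuleList {
public:
    // Later additions take precedence over earlier ones.
    void add(const std::string& name, const std::string& pattern) {
        rules_.emplace_back(name, std::regex(pattern));
    }

    // Returns the name of the first rule, newest to oldest,
    // whose pattern matches the whole word.
    std::string classify(const std::string& word) const {
        for (auto it = rules_.rbegin(); it != rules_.rend(); ++it) {
            if (std::regex_match(word, it->second)) return it->first;
        }
        return "unknown";
    }

private:
    std::vector<std::pair<std::string, std::regex>> rules_;
};
```

If the identifier rule `[a-zA-Z_][a-zA-Z0-9_]*` is added first and the keyword rule `if|else` second, `classify("if")` returns the keyword rule's name. Adding them in the opposite order makes the identifier rule win, so if would be reported as a generic identifier.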


📚 Member Overview

The following methods and properties are available for managing a token collection.

Member          Description
Add             Defines a new lexical token using a regular expression or imports an existing token definition.
At              Retrieves the token definition (as an Item) at a specific zero-based index in the precedence list.
ByName          Retrieves a token definition by its unique name (e.g., _token_alphanumeric).
ByType          Retrieves token definitions based on their lexical category (e.g., TokenType::Literal).
Clear           Removes all token definitions from the collection.
ContextSwitch   Dynamically swaps the active token set when a start pattern is matched, until an end pattern is found.
Count           Gets the total number of token definitions in the collection.
Description     Gets or sets a user-defined text description for the token collection.
IndexOf         Gets the zero-based precedence index of a specific token definition.
Insert          Adds a new token definition at a specific index, providing explicit control over its precedence.
Remove          Removes a specific token definition from the collection at runtime.
uCalc           Retrieves the parent uCalc instance that owns this token collection.

Examples

Succinct: Defines a C-style line comment token (`//...`) and categorizes it as whitespace so it is ignored by the parser.
				
using uCalcSoftware;

var uc = new uCalc();
var t = uc.NewTransformer();
// By default, a comment would cause a syntax error.
Console.Write("Before: ");
Console.WriteLine(uc.EvalStr("10 + 5 // Add 5"));

// Add a new token definition for C-style comments.
// The regex `//.*` matches from '//' to the end of the line.
// We classify it as Whitespace so the parser skips it.
uc.ExpressionTokens.Add("//.*", TokenType.Whitespace);

Console.Write("After:  ");
Console.WriteLine(uc.EvalStr("10 + 5 // Add 5"));
				
			
Before: Undefined identifier
After:  15
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   auto t = uc.NewTransformer();
   // By default, a comment would cause a syntax error.
   cout << "Before: ";
   cout << uc.EvalStr("10 + 5 // Add 5") << endl;

   // Add a new token definition for C-style comments.
   // The regex `//.*` matches from '//' to the end of the line.
   // We classify it as Whitespace so the parser skips it.
   uc.ExpressionTokens().Add("//.*", TokenType::Whitespace);

   cout << "After:  ";
   cout << uc.EvalStr("10 + 5 // Add 5") << endl;
}
				
			
Before: Undefined identifier
After:  15
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      Dim t = uc.NewTransformer()
      '// By default, a comment would cause a syntax error.
      Console.Write("Before: ")
      Console.WriteLine(uc.EvalStr("10 + 5 // Add 5"))
      
      '// Add a new token definition for C-style comments.
      '// The regex `//.*` matches from '//' to the end of the line.
      '// We classify it as Whitespace so the parser skips it.
      uc.ExpressionTokens.Add("//.*", TokenType.Whitespace)
      
      Console.Write("After:  ")
      Console.WriteLine(uc.EvalStr("10 + 5 // Add 5"))
   End Sub
End Module
				
			
Before: Undefined identifier
After:  15
Practical: Demonstrates how removing the token for single-quoted strings causes the parser to treat the quote and its contents as individual generic tokens.
				
using uCalcSoftware;

var uc = new uCalc();
// This example removes the single-quoted string token.
var t = new uCalc.Transformer();
string txt = "This is a test, 'This is a test'";
t.FromTo("{token:1}", "<{@Self}>");

Console.WriteLine("--- Before Removing Token ---");
// Initially, 'This is a test' is treated as a single token.
Console.WriteLine(t.Transform(txt).Text);

// Now, find and remove the token definition for single-quoted strings.
var singleQuoteToken = t.Tokens.ByName("_token_string_singlequoted");
t.Tokens.Remove(singleQuoteToken);

// Re-run the transform. The text must be set again to be re-tokenized.
t.Text = txt;
Console.WriteLine("");
Console.WriteLine("--- After Removing Token ---");
// Now, the single quote is a generic token, as are the words inside it.
Console.WriteLine(t.Transform().Text);
				
			
--- Before Removing Token ---
<This> <is> <a> <test><,> <'This is a test'>

--- After Removing Token ---
<This> <is> <a> <test><,> <'><This> <is> <a> <test><'>
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   // This example removes the single-quoted string token.
   uCalc::Transformer t;
   string txt = "This is a test, 'This is a test'";
   t.FromTo("{token:1}", "<{@Self}>");

   cout << "--- Before Removing Token ---" << endl;
   // Initially, 'This is a test' is treated as a single token.
   cout << t.Transform(txt).Text() << endl;

   // Now, find and remove the token definition for single-quoted strings.
   auto singleQuoteToken = t.Tokens().ByName("_token_string_singlequoted");
   t.Tokens().Remove(singleQuoteToken);

   // Re-run the transform. The text must be set again to be re-tokenized.
   t.Text(txt);
   cout << "" << endl;
   cout << "--- After Removing Token ---" << endl;
   // Now, the single quote is a generic token, as are the words inside it.
   cout << t.Transform().Text() << endl;
}
				
			
--- Before Removing Token ---
<This> <is> <a> <test><,> <'This is a test'>

--- After Removing Token ---
<This> <is> <a> <test><,> <'><This> <is> <a> <test><'>
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      '// This example removes the single-quoted string token.
      Dim t As New uCalc.Transformer()
      Dim txt As String = "This is a test, 'This is a test'"
      t.FromTo("{token:1}", "<{@Self}>")
      
      Console.WriteLine("--- Before Removing Token ---")
      '// Initially, 'This is a test' is treated as a single token.
      Console.WriteLine(t.Transform(txt).Text)
      
      '// Now, find and remove the token definition for single-quoted strings.
      Dim singleQuoteToken = t.Tokens.ByName("_token_string_singlequoted")
      t.Tokens.Remove(singleQuoteToken)
      
      '// Re-run the transform. The text must be set again to be re-tokenized.
      t.Text = txt
      Console.WriteLine("")
      Console.WriteLine("--- After Removing Token ---")
      '// Now, the single quote is a generic token, as are the words inside it.
      Console.WriteLine(t.Transform().Text)
   End Sub
End Module
				
			
--- Before Removing Token ---
<This> <is> <a> <test><,> <'This is a test'>

--- After Removing Token ---
<This> <is> <a> <test><,> <'><This> <is> <a> <test><'>
Lists the default tokens in a transformer, printing each token's index, description, name, and regex.
				
using uCalcSoftware;

var uc = new uCalc();
var t = uc.NewTransformer();
Console.WriteLine($"Token Count: {t.Tokens.Count}");
Console.WriteLine("");
Console.WriteLine("Index  Type  Name: regex");
Console.WriteLine("========================");

foreach(var token in t.Tokens) {
   Console.Write(t.Tokens.IndexOf(token));
   Console.WriteLine($"  {token.Description}  {token.Name}: {token.Regex}");
}
				
			
Token Count: 27

Index  Type  Name: regex
========================
0  generic  _token_line: .*
1  generic  _token_catchall: .
2  generic  _token_catchall_utf8_other: [\xf0-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]
3  generic  _token_punctuation: (--|\.{3}|\xE2\x80\xA6|[!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~]|\xE2\x80[\x90-\x95])
4  generic  _token_quotechar: ("){3}|"|'
5  generic  _token_quotechar_single: '
6  generic  _token_quotechar_double: "
7  generic  _token_quotechar_tripledouble: """
8  memberaccess  _token_memberaccess: \.
9  generic  _token_variableargs: \.\.\.
10  reducible  _token_reducible2: [-:|+/*^&=%@!`\\<>?#$~]+
11  bracket  _token_parenthesis: \(
12  bracketclose  _token_parenthesis_close: \)
13  bracket  _token_curlybrace: \{
14  bracketclose  _token_curlybrace_close: \}
15  bracket  _token_squarebracket: \[
16  bracketclose  _token_squarebracket_close: \]
17  argseparator  _token_argseparator: ,
18  statementseparator  _token_newline: (?:\r?\n)|\r
19  statementseparator  _token_semicolon: ;
20  literal  _token_string_singlequoted: '([^']*(?:''[^']*)*)'
21  literal  _token_string_doublequoted: "([^"]*(?:""[^"]*)*)"
22  literal  _token_string_tripledoublequoted: """([\s\S]*?)"""
23  whitespace  _token_whitespace: [\t\v ]+
24  reducible  _token_reducible: [-:|+/*^&=%@!`\\<>?]+
25  literal  _token_floatnumber: [0-9]*\.?[0-9]+([eE][+-]?[0-9]+)?
26  alphanumeric  _token_alphanumeric: [a-zA-Z_][a-zA-Z0-9_]*
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   auto t = uc.NewTransformer();
   cout << "Token Count: " << t.Tokens().Count() << endl;
   cout << "" << endl;
   cout << "Index  Type  Name: regex" << endl;
   cout << "========================" << endl;

   for(auto token : t.Tokens()) {
      cout << t.Tokens().IndexOf(token);
      cout << "  " << token.Description() << "  " << token.Name() << ": " << token.Regex() << endl;
   }
}
				
			
Token Count: 27

Index  Type  Name: regex
========================
0  generic  _token_line: .*
1  generic  _token_catchall: .
2  generic  _token_catchall_utf8_other: [\xf0-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]
3  generic  _token_punctuation: (--|\.{3}|\xE2\x80\xA6|[!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~]|\xE2\x80[\x90-\x95])
4  generic  _token_quotechar: ("){3}|"|'
5  generic  _token_quotechar_single: '
6  generic  _token_quotechar_double: "
7  generic  _token_quotechar_tripledouble: """
8  memberaccess  _token_memberaccess: \.
9  generic  _token_variableargs: \.\.\.
10  reducible  _token_reducible2: [-:|+/*^&=%@!`\\<>?#$~]+
11  bracket  _token_parenthesis: \(
12  bracketclose  _token_parenthesis_close: \)
13  bracket  _token_curlybrace: \{
14  bracketclose  _token_curlybrace_close: \}
15  bracket  _token_squarebracket: \[
16  bracketclose  _token_squarebracket_close: \]
17  argseparator  _token_argseparator: ,
18  statementseparator  _token_newline: (?:\r?\n)|\r
19  statementseparator  _token_semicolon: ;
20  literal  _token_string_singlequoted: '([^']*(?:''[^']*)*)'
21  literal  _token_string_doublequoted: "([^"]*(?:""[^"]*)*)"
22  literal  _token_string_tripledoublequoted: """([\s\S]*?)"""
23  whitespace  _token_whitespace: [\t\v ]+
24  reducible  _token_reducible: [-:|+/*^&=%@!`\\<>?]+
25  literal  _token_floatnumber: [0-9]*\.?[0-9]+([eE][+-]?[0-9]+)?
26  alphanumeric  _token_alphanumeric: [a-zA-Z_][a-zA-Z0-9_]*
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      Dim t = uc.NewTransformer()
      Console.WriteLine($"Token Count: {t.Tokens.Count}")
      Console.WriteLine("")
      Console.WriteLine("Index  Type  Name: regex")
      Console.WriteLine("========================")
      
      For Each token In t.Tokens
         Console.Write(t.Tokens.IndexOf(token))
         Console.WriteLine($"  {token.Description}  {token.Name}: {token.Regex}")
      Next
   End Sub
End Module
				
			
Token Count: 27

Index  Type  Name: regex
========================
0  generic  _token_line: .*
1  generic  _token_catchall: .
2  generic  _token_catchall_utf8_other: [\xf0-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xc0-\xdf][\x80-\xbf]
3  generic  _token_punctuation: (--|\.{3}|\xE2\x80\xA6|[!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~]|\xE2\x80[\x90-\x95])
4  generic  _token_quotechar: ("){3}|"|'
5  generic  _token_quotechar_single: '
6  generic  _token_quotechar_double: "
7  generic  _token_quotechar_tripledouble: """
8  memberaccess  _token_memberaccess: \.
9  generic  _token_variableargs: \.\.\.
10  reducible  _token_reducible2: [-:|+/*^&=%@!`\\<>?#$~]+
11  bracket  _token_parenthesis: \(
12  bracketclose  _token_parenthesis_close: \)
13  bracket  _token_curlybrace: \{
14  bracketclose  _token_curlybrace_close: \}
15  bracket  _token_squarebracket: \[
16  bracketclose  _token_squarebracket_close: \]
17  argseparator  _token_argseparator: ,
18  statementseparator  _token_newline: (?:\r?\n)|\r
19  statementseparator  _token_semicolon: ;
20  literal  _token_string_singlequoted: '([^']*(?:''[^']*)*)'
21  literal  _token_string_doublequoted: "([^"]*(?:""[^"]*)*)"
22  literal  _token_string_tripledoublequoted: """([\s\S]*?)"""
23  whitespace  _token_whitespace: [\t\v ]+
24  reducible  _token_reducible: [-:|+/*^&=%@!`\\<>?]+
25  literal  _token_floatnumber: [0-9]*\.?[0-9]+([eE][+-]?[0-9]+)?
26  alphanumeric  _token_alphanumeric: [a-zA-Z_][a-zA-Z0-9_]*