uCalc API Version: 2.1.3-preview.2 Released: 6/16/2026

Warning

uCalc API Preview Release Notice: The documentation describes the intended behavior of the API. The current preview build contains incomplete features and unoptimized performance, and is subject to breaking changes.

Structural Awareness (Tokens vs. Regex)

Explains the fundamental difference between uCalc's token-aware parsing and traditional character-aware Regex, highlighting safety and power.

Remarks

🧠 Structural Awareness: Seeing Code, Not Just Text

One of the most important concepts to understand about uCalc is that its Transformer is structurally aware. This is the key advantage it holds over traditional text-processing tools like Regular Expressions (Regex). In short:

  • Regex is character-aware: It sees text as a flat stream of characters.
  • uCalc is token-aware: It sees text as a structured sequence of meaningful units (words, numbers, strings, etc.).

This fundamental difference has profound implications for safety, power, and readability, especially when transforming structured text like source code, configuration files, or markup.


1. The Regex World: A Stream of Characters

Regular expressions are incredibly powerful for finding patterns in character streams. However, their greatest strength is also their greatest weakness: they are "blind" to the structure and context of the text they are processing.

The Classic Refactoring Problem

Imagine you want to refactor a piece of code by renaming the variable rate to annual_rate. Your code looks like this:

rate = 0.05; // Current rate
print("The rate is: ", rate);

(Note: To enable C-style comments as shown, you would first define a comment token, for example, by calling uc.ExpressionTokens().Add("//.*", TokenType.Whitespace);. This demonstrates uCalc's configurable syntax.)

A developer's first instinct might be to use a simple find-and-replace operation for the word "rate". A naive regex like \brate\b would produce this incorrect result:

annual_rate = 0.05; // Current annual_rate
print("The annual_rate is: ", annual_rate);

This is a serious failure. The regex, blind to context, has also modified the text inside a code comment and a string literal. The comment now describes something the author never wrote, and the program prints different output than before.
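The failure described above can be reproduced with any character-level regex engine. The following is a minimal sketch using the C++ standard library's std::regex rather than uCalc; the function name naive_rename is hypothetical and exists only for this illustration.

```cpp
#include <regex>
#include <string>

// Character-level rename: replaces every whole-word occurrence of `from`,
// with no knowledge of comments or string literals.
// Assumption: `from` is inserted into the pattern unescaped, so it must
// contain only letters, digits, and underscores.
std::string naive_rename(const std::string& source,
                         const std::string& from,
                         const std::string& to) {
    const std::regex word("\\b" + from + "\\b");
    return std::regex_replace(source, word, to);
}
```

Calling naive_rename on the two-line snippet with "rate" and "annual_rate" rewrites all four occurrences, including the one in the comment and the one in the string literal, because the pattern only sees word boundaries between characters.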


2. The uCalc World: A Stream of Tokens

The uCalc Transformer avoids this problem by performing a crucial first step: tokenization. Before any pattern matching occurs, the engine's tokenizer (or lexer) breaks the input string into a stream of meaningful Tokens:

  1. rate (Identifier)
  2. = (Reducible / Operator Symbol)
  3. 0.05 (Number Literal)
  4. ; (StatementSeparator)
  5. // Current rate (Comment / Whitespace)
  6. print (Identifier)
  7. ( (Bracket)
  8. "The rate is: " (String Literal)
  9. , (Argument Separator)
  10. rate (Identifier)
  11. ) (Bracket)
  12. ; (StatementSeparator)

This process categorizes each part of the string. Comment tokens can be defined as whitespace with uCalc.Tokens.Add("//.*", TokenType.Whitespace);, which tells the parser to ignore them, while string literals are recognized as atomic units.
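To make the tokenization step concrete, here is a small hand-written tokenizer in plain C++. It is only a sketch of the general idea and not uCalc's tokenizer: the Kind enum, the Token struct, and the tokenize function are hypothetical names, and the rules (no escape sequences, single-character symbols) are simplifying assumptions.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token categories for this sketch; uCalc's own TokenType
// values and tokenizer rules may differ.
enum class Kind { Identifier, Number, String, Comment, Symbol };

struct Token {
    Kind kind;
    std::string text;
};

// Splits `src` into tokens. Spaces and newlines between tokens are dropped.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    const std::size_t n = src.size();
    auto is_word = [](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
    };
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
            continue;
        }
        const std::size_t start = i;
        Kind kind;
        if (c == '/' && i + 1 < n && src[i + 1] == '/') {
            // Line comment: runs to the end of the line.
            while (i < n && src[i] != '\n') ++i;
            kind = Kind::Comment;
        } else if (c == '"') {
            // String literal: runs to the closing quote (no escape handling).
            ++i;
            while (i < n && src[i] != '"') ++i;
            if (i < n) ++i;  // consume the closing quote
            kind = Kind::String;
        } else if (std::isalpha(c) || c == '_') {
            while (i < n && is_word(src[i])) ++i;
            kind = Kind::Identifier;
        } else if (std::isdigit(c)) {
            while (i < n && (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.')) ++i;
            kind = Kind::Number;
        } else {
            ++i;  // any other single character is its own token
            kind = Kind::Symbol;
        }
        out.push_back({kind, src.substr(start, i - start)});
    }
    return out;
}
```

Run on the two-line snippet from section 1, this sketch yields twelve tokens in the same order as the list above, with the comment and the string literal each kept as a single token.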

The Safe Solution

When you define a uCalc rule to replace the identifier rate, the pattern matching engine operates on this stream of tokens. It knows that "The rate is: " is a single String Literal token and (once defined) // Current rate is a Whitespace token. By default, it will not look inside them.

A rule to replace the identifier rate will only match tokens #1 and #10, producing the correct, safe transformation:

annual_rate = 0.05; // Current rate
print("The rate is: ", annual_rate);
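The same outcome can be sketched without uCalc by scanning the text one token at a time and only touching identifiers. The function rename_identifier below is a hypothetical illustration in standard C++. It assumes double-quoted strings without escape sequences and // line comments, and it is not how the uCalc Transformer is implemented.

```cpp
#include <cctype>
#include <string>

// Renames whole identifiers equal to `from`, copying "..." string literals
// and // line comments through unchanged. Illustrative only: escape
// sequences, block comments, and character literals are not handled.
std::string rename_identifier(const std::string& src,
                              const std::string& from,
                              const std::string& to) {
    std::string out;
    std::size_t i = 0;
    const std::size_t n = src.size();
    auto is_word = [](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
    };
    while (i < n) {
        const std::size_t start = i;
        if (src[i] == '"') {
            // String literal: copy verbatim up to and including the closing quote.
            ++i;
            while (i < n && src[i] != '"') ++i;
            if (i < n) ++i;
            out.append(src, start, i - start);
        } else if (src[i] == '/' && i + 1 < n && src[i + 1] == '/') {
            // Line comment: copy verbatim up to the end of the line.
            while (i < n && src[i] != '\n') ++i;
            out.append(src, start, i - start);
        } else if (is_word(src[i])) {
            // Word token: replace it only when the whole token equals `from`.
            while (i < n && is_word(src[i])) ++i;
            const std::string word = src.substr(start, i - start);
            out += (word == from) ? to : word;
        } else {
            out += src[i++];  // spaces, newlines, and symbols pass through
        }
    }
    return out;
}
```

Applied to the original snippet with "rate" and "annual_rate", this changes only the two identifier occurrences and leaves the comment and string literal untouched, matching the result shown above.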

This structural awareness is the core reason why the uCalc Transformer is a superior tool for static analysis, code refactoring, and transpilation.


💡 Why uCalc? (Summary of Advantages)

Feature | Regular Expressions | uCalc Transformer
Safety | 🔴 Unsafe. Easily corrupts string literals, comments, and other structural blocks. | 🟢 Safe by Default. QuoteSensitive and BracketSensitive properties respect code structure.
Readability | 🔴 Low. Patterns can become cryptic and hard to maintain (e.g., (?<=\s)\d+). | 🟢 High. Patterns use readable names and categories (e.g., {@Number}).
Power | 🟡 Medium. Struggles with nested or recursive patterns (like matching balanced parentheses). | 🟢 High. Natively understands nested structures and token categories.
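The balanced-parentheses case is worth a brief illustration. Classic regular expressions without recursion extensions cannot count nesting depth, whereas a structure-aware scanner only needs a counter. The helper below, matching_paren, is a hypothetical sketch in standard C++ and not part of the uCalc API. It ignores quotes for brevity.

```cpp
#include <string>

// Returns the index of the ')' that closes the '(' at position `open`,
// or std::string::npos if `open` is not a '(' or no match exists.
std::size_t matching_paren(const std::string& s, std::size_t open) {
    if (open >= s.size() || s[open] != '(') return std::string::npos;
    int depth = 0;
    for (std::size_t i = open; i < s.size(); ++i) {
        if (s[i] == '(') {
            ++depth;  // entering a nested group
        } else if (s[i] == ')' && --depth == 0) {
            return i;  // depth is back to zero: this closes the group
        }
    }
    return std::string::npos;  // unbalanced input
}
```

For the text f(a, g(b, c), d) + h(e), starting at the first parenthesis, the function skips the inner g(b, c) group and returns the position of the parenthesis after d.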

Examples

A simple demonstration of safely renaming a variable without corrupting a string literal, a common pitfall for Regex.
				
using uCalcSoftware;

var uc = new uCalc();
var t = new uCalc.Transformer();
// This rule only targets the alphanumeric token 'x'.
t.FromTo("x", "value");
var code = """
x = 5; print("The value of x is...");
""";
Console.WriteLine(t.Transform(code));
				
			
value = 5; print("The value of x is...");
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   uCalc::Transformer t;
   // This rule only targets the alphanumeric token 'x'.
   t.FromTo("x", "value");
   auto code = R"(x = 5; print("The value of x is...");)";
   cout << t.Transform(code) << endl;
}
				
			
value = 5; print("The value of x is...");
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      Dim t As New uCalc.Transformer()
      '// This rule only targets the alphanumeric token 'x'.
      t.FromTo("x", "value")
      Dim code = "x = 5; print(""The value of x is..."");"
      Console.WriteLine(t.Transform(code))
   End Sub
End Module
				
			
value = 5; print("The value of x is...");
A real-world refactoring task to rename a function, showing how uCalc correctly ignores matches inside comments and strings by default.
				
using uCalcSoftware;

var uc = new uCalc();
var t = new uCalc.Transformer();
// This rule replaces the ALPHANUMERIC token 'get_data', not just the text.
t.FromTo("get_data", "fetch_records");

var code = """

// Note: 'get_data' is the old function name.
results = get_data(source);
print("The 'get_data' function was called.");

""";

// The default tokenizer recognizes the comment and string literal as separate tokens,
// so the rule to replace the function name doesn't affect them.
Console.WriteLine(t.Transform(code));
				
			
// Note: 'get_data' is the old function name.
results = fetch_records(source);
print("The 'get_data' function was called.");
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   uCalc::Transformer t;
   // This rule replaces the ALPHANUMERIC token 'get_data', not just the text.
   t.FromTo("get_data", "fetch_records");

   auto code = R"(
// Note: 'get_data' is the old function name.
results = get_data(source);
print("The 'get_data' function was called.");
)";

   // The default tokenizer recognizes the comment and string literal as separate tokens,
   // so the rule to replace the function name doesn't affect them.
   cout << t.Transform(code) << endl;
}
				
			
// Note: 'get_data' is the old function name.
results = fetch_records(source);
print("The 'get_data' function was called.");
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      Dim t As New uCalc.Transformer()
      '// This rule replaces the ALPHANUMERIC token 'get_data', not just the text.
      t.FromTo("get_data", "fetch_records")
      
      Dim code = "
// Note: 'get_data' is the old function name.
results = get_data(source);
print(""The 'get_data' function was called."");
"
      
      '// The default tokenizer recognizes the comment and string literal as separate tokens,
      '// so the rule to replace the function name doesn't affect them.
      Console.WriteLine(t.Transform(code))
   End Sub
End Module
				
			
// Note: 'get_data' is the old function name.
results = fetch_records(source);
print("The 'get_data' function was called.");
Internal Test: Contrasts a token-aware uCalc replacement with a character-aware regex replacement to highlight safety.
				
using uCalcSoftware;

var uc = new uCalc();
var code = """
rate = 0.05; print("rate"); // a rate
""";

Console.WriteLine("--- uCalc Transformer (Token-Aware & Correct) ---");
var t = new uCalc.Transformer();
t.Tokens.Add("//.*", TokenType.Whitespace);
// Rule targets only the alphanumeric token 'rate'
t.FromTo("rate", "annual_rate");
Console.WriteLine(t.Transform(code));

Console.WriteLine("");
Console.WriteLine("--- Simulated Regex (Character-Aware & Incorrect) ---");
// This simulates a simple find-and-replace for the word 'rate'
// which incorrectly changes the string literal and comment.
var incorrect_result = """
annual_rate = 0.05; print("annual_rate"); // a annual_rate
""";
Console.WriteLine(incorrect_result);
				
			
--- uCalc Transformer (Token-Aware & Correct) ---
annual_rate = 0.05; print("rate"); // a rate

--- Simulated Regex (Character-Aware & Incorrect) ---
annual_rate = 0.05; print("annual_rate"); // a annual_rate
				
#include <iostream>
#include "uCalc.h"

using namespace std;
using namespace uCalcSoftware;

int main() {
   uCalc uc;
   auto code = R"(rate = 0.05; print("rate"); // a rate)";

   cout << "--- uCalc Transformer (Token-Aware & Correct) ---" << endl;
   uCalc::Transformer t;
   t.Tokens().Add("//.*", TokenType::Whitespace);
   // Rule targets only the alphanumeric token 'rate'
   t.FromTo("rate", "annual_rate");
   cout << t.Transform(code) << endl;

   cout << "" << endl;
   cout << "--- Simulated Regex (Character-Aware & Incorrect) ---" << endl;
   // This simulates a simple find-and-replace for the word 'rate'
   // which incorrectly changes the string literal and comment.
   auto incorrect_result = R"(annual_rate = 0.05; print("annual_rate"); // a annual_rate)";
   cout << incorrect_result << endl;
}
				
			
--- uCalc Transformer (Token-Aware & Correct) ---
annual_rate = 0.05; print("rate"); // a rate

--- Simulated Regex (Character-Aware & Incorrect) ---
annual_rate = 0.05; print("annual_rate"); // a annual_rate
				
Imports System
Imports uCalcSoftware
Public Module Program
   Public Sub Main()
      Dim uc As New uCalc()
      Dim code = "rate = 0.05; print(""rate""); // a rate"
      
      Console.WriteLine("--- uCalc Transformer (Token-Aware & Correct) ---")
      Dim t As New uCalc.Transformer()
      t.Tokens.Add("//.*", TokenType.Whitespace)
      '// Rule targets only the alphanumeric token 'rate'
      t.FromTo("rate", "annual_rate")
      Console.WriteLine(t.Transform(code))
      
      Console.WriteLine("")
      Console.WriteLine("--- Simulated Regex (Character-Aware & Incorrect) ---")
      '// This simulates a simple find-and-replace for the word 'rate'
      '// which incorrectly changes the string literal and comment.
      Dim incorrect_result = "annual_rate = 0.05; print(""annual_rate""); // a annual_rate"
      Console.WriteLine(incorrect_result)
   End Sub
End Module
				
			
--- uCalc Transformer (Token-Aware & Correct) ---
annual_rate = 0.05; print("rate"); // a rate

--- Simulated Regex (Character-Aware & Incorrect) ---
annual_rate = 0.05; print("annual_rate"); // a annual_rate