What's the point?

Here are a few reasons why one might consider using T-Regx. The main ones are:

  • It's bulletproof
  • It's reliable
  • It's readable

What's wrong with PHP Regular Expressions:

The PHP regular expressions API is far from perfect. Here is just a handful of its problems:

PHP is Implicit

You are probably a PHP developer. I would like to get 'Robert likes apples'. Can you tell me which is the correct signature for this task?

preg_replace('/Bob/', 'Robert', 'Bob likes apples'); // pattern, replacement, subject
// or
preg_replace('/Bob/', 'Bob likes apples', 'Robert'); // pattern, subject, replacement
// ??

Another try. Let's say you'd like to limit replacements. But you remember that there's a reference parameter &$count somewhere. Again, which is the correct signature?

$limit = 1;
preg_replace(?, ?, ?, $limit, $count);
// or
preg_replace(?, ?, ?, $count, $limit);
// ??
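For reference, here is a minimal sketch of both answers, using only the built-in preg_replace(). The subject string and variable names are made up for illustration; the argument order is pattern, replacement, subject, limit, and then the &$count reference.

```php
<?php
// Argument order: pattern, replacement, subject, limit, &count.
$limit = 1;
$result = preg_replace('/Bob/', 'Robert', 'Bob likes apples, Bob likes pears', $limit, $count);

echo $result, "\n"; // Robert likes apples, Bob likes pears
echo $count, "\n";  // 1
```

Nothing in the call itself tells the reader which of the last two arguments is the limit and which receives the count.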

PHP is Unintuitive

Programming languages are tools created to solve problems. An experienced programmer should be able to look at the code and tell what it does. With PHP preg_* functions it's just. not. possible.

Someone who doesn't know PHP regular expressions might well ask themselves:

  • preg_replace('//', $r, $s) - will this replace all occurrences? Or just one?
  • preg_match('//', $subject) - will this match the first occurrence? Or all?
  • preg_match_all('//', $subject); Ok, this will find all matches, so preg_match() only finds the first.
  • preg_filter('//', $replacements, $subject) - who needs $replacements in a filter method?

What's more

  • Parameters:

    • Functions with 4 or 5 parameters (3 of which are optional).

      It means that whoever looks at the code has to remember (or look up) what those optional values are and in which order they come.

  • Return types:

    • Array of arrays, which contain either a string, null, or an array of nulls, strings and ints.

PHP is Messy

  • PREG_OFFSET_CAPTURE is a nightmare! It changes return type from "an array of arrays" to "an array of arrays of arrays".
  • PREG_SET_ORDER / PREG_PATTERN_ORDER change return values. It's either "groups of matches" or "matches of groups", depending on the flag.

The worst part? You find yourself looking at this code

return $match[1][0];

having no idea what. it. does. You have to see whether you're using preg_match() or preg_match_all() and whether any of PREG_SET_ORDER/PREG_PATTERN_ORDER/PREG_OFFSET_CAPTURE were used.
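To make this concrete, the sketch below reads the same index, [1][0], from four different calls. The pattern, subject and variable names are invented for the example; only built-in preg_* functions and flags are used.

```php
<?php
$pattern = '/(\w+)=(\d+)/';
$subject = 'ab=1 cd=22';

// Default PREG_PATTERN_ORDER: indexed as [group][occurrence].
preg_match_all($pattern, $subject, $patternOrder);
var_export($patternOrder[1][0]); // 'ab' (group 1 of the first match)
echo "\n";

// PREG_SET_ORDER: indexed as [occurrence][group].
preg_match_all($pattern, $subject, $setOrder, PREG_SET_ORDER);
var_export($setOrder[1][0]); // 'cd=22' (the whole second match)
echo "\n";

// PREG_OFFSET_CAPTURE: every value becomes a [text, offset] pair.
preg_match_all($pattern, $subject, $withOffsets, PREG_PATTERN_ORDER | PREG_OFFSET_CAPTURE);
var_export($withOffsets[1][0]); // array (0 => 'ab', 1 => 0)
echo "\n";

// preg_match(): $single[1] is a plain string, so [0] is its first character.
preg_match($pattern, $subject, $single);
var_export($single[1][0]); // 'a'
echo "\n";
```

Four calls, one index expression, four different meanings.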

PHP is Inconsistent

  • How do you get results and the count of the results?

              preg_match()          preg_replace()
    Count     Return type           Argument reference
    Values    Argument reference    Return type

    $replaced = preg_replace($p, $r, $s, -1, $count);
    $count = preg_match($p, $s, $matched);
  • If you use PREG_OFFSET_CAPTURE and your subject isn't matched by the pattern, these are the results:

    true     ['match', 2]    ['match', 2]
    false    ''              [null, -1]
  • preg_quote() quotes different characters for different PHP versions.

  • PHP documentation promises that

    preg_filter() is identical to preg_replace() except it only returns the (possibly transformed) subjects...

    but preg_filter() and preg_replace() actually return completely different values for the same parameters.

  • The $matches received from preg_match() have a completely different structure than those from preg_replace_callback().
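The preg_filter() point can be checked with a short sketch. The subjects and the pattern are invented for the example; both functions receive exactly the same arguments.

```php
<?php
$subjects = ['apple', 'banana', 'cherry'];

// preg_replace() returns every subject, changed or not.
$replaced = preg_replace('/an/', 'AN', $subjects);
var_export($replaced); // 'apple', 'bANANa', 'cherry'
echo "\n";

// preg_filter() returns only the subjects that matched, keeping their keys.
$filtered = preg_filter('/an/', 'AN', $subjects);
var_export($filtered); // array (1 => 'bANANa')
echo "\n";

// With a single string that doesn't match, the return values differ even more.
$replacedString = preg_replace('/an/', 'AN', 'apple'); // 'apple'
$filteredString = preg_filter('/an/', 'AN', 'apple');  // null
var_export([$replacedString, $filteredString]);
echo "\n";
```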

PHP is Deliberately buggy

  • preg_match and preg_match_all return either:

    • (int) x - a number of matches, if a match is found
    • (int) 0 - if no matches are found
    • (bool) false - if a runtime error occurred

    So if you do just this:

    if (preg_match('//', '')) {
        // matched
    }

there's no way of knowing whether your pattern is incorrect or whether it's correct but your subject isn't matched by your pattern. You need to remember to add an explicit false check each time you use it.

  • All preg_* functions only return false/null/[] on error. You have to remember to call preg_last_error() to get some insight into the nature of your error. Of course it only returns an int! So you have to look up that 4 is "invalid utf8 sequence" and 2 is "backtrack limit exceeded".
  • However, a false check and preg_last_error() can only save you from runtime errors. So-called compile errors don't work that way and require either setting a custom error handler (bad idea) or reading and clearing just one of those errors (good luck with errors in preg_replace_callback(), for example).
  • preg_filter() for arrays returns [] if an error occurred, even though [] is a perfectly valid result for this function. For example, it could have filtered out all values, or its input could have been an empty array right from the beginning.
  • For certain parameter types, some PCRE methods (e.g. preg_filter()) raise fatal errors terminating the application.
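As a minimal sketch of the explicit false check described above: describeMatchResult() is a hypothetical helper written for this example, not part of PHP or T-Regx. It assumes a subject that is invalid UTF-8 combined with the /u modifier, which is one reliable way to trigger a runtime error.

```php
<?php
// Hypothetical helper, for illustration only.
function describeMatchResult(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject);

    if ($result === false) {
        // Runtime error: all we get is an integer code to look up.
        return 'error ' . preg_last_error();
    }

    // From here on, 1 means "matched" and 0 means "not matched".
    return $result === 1 ? 'matched' : 'not matched';
}

echo describeMatchResult('/\d+/', 'abc 123'), "\n"; // matched
echo describeMatchResult('/\d+/', 'abc'), "\n";     // not matched
echo describeMatchResult('/./u', "\xC3\x28"), "\n"; // error 4 (PREG_BAD_UTF8_ERROR)
```

Without the strict === false comparison, the last two calls would be indistinguishable inside a plain if (preg_match(...)) condition.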

T-Regx to the rescue

That's why T-Regx happened. It addresses all of these flaws of PHP regular expressions:

T-Regx is descriptive

What about now? Is the task easier?

pattern('Bob')->replace('Bob likes apples')->first()->with('Robert');
pattern('Bob')->replace('Bob likes apples')->only($limit)->with('Robert');
pattern('Bob')->count('Bob likes apples');

T-Regx is for developers (it's reliable)

If you try to use an invalid regular expression in Java or JavaScript, you would probably get a SyntaxError exception and you'd be forced to handle it. Such things don't happen in PHP regular expressions.

T-Regx always throws an exception and never issues any warnings, fatals, errors or notices.

try {
    return pattern('Foo')->match('Bar')->all();
} catch (CleanRegexException $exception) {
    // handle the error
}

Furthermore, T-Regx throws different exceptions for different errors:

  • SubjectNotMatchedException
  • MalformedPatternException
  • FlagNotAllowedException
  • GroupNotMatchedException
  • NonexistentGroupException
  • InvalidReplacementException
  • InvalidReturnValueException
  • MissingSplitDelimiterGroupException
  • InternalCleanRegexException

They all extend CleanRegexException though.

Further, furthermore, if you pass an invalid data type to any of the T-Regx methods, \InvalidArgumentException is thrown.

T-Regx is explicit

Looking at T-Regx code, everyone can immediately see the author's intentions and recognize exactly what the code does.

// or

Looking at this code is like reading a book.

You will not find arrays of arrays of arrays in T-Regx API. Each functionality has a dedicated set of methods.

pattern($pattern)->match($subject)->first(function (Match $match) {
    $match->offset();           // offset of a matched occurrence
    $match->group(2)->offset(); // offset of a matched capturing group
    $match->hasGroup('uri');    // group validation
    $match->hasGroup('2asd');   // throws \InvalidArgumentException
});

Read more about Match details.

T-Regx is really smart with its exceptions

We really did put a lot of thought into making T-Regx secure, so, for example, code snippets like this one aren't a big deal:

pattern('\w+')->replace($subject)->all()->callback(function (Match $match) {
    try {
        return pattern('intentionally (( invalid {{ pattern ')->match('Foo')->first();
    } catch (MalformedPatternException $ex) {
        // it's all good and dandy with the catching of this exception :)
        return $match;
    }
});

In other words, warnings and flags raised and set by the inner, invalid pattern()->match() call will be represented as a MalformedPatternException and won't interfere with the outer pattern()->replace().
