Announcement - Prepared patterns revamp


Hello, dear regexp writers! For about 5 months now, we've been working really hard on rewriting prepared patterns, in order to introduce certain necessary features to them.

The biggest issue with prepared patterns in their current form is that the only way to make a pattern ignore a placeholder was to escape it.

Pattern::inject('foo:@', ['bar']); // includes value
Pattern::inject('foo:\@', []); // doesn't include value

Of course, you could also escape the backslash, so foo:\\@ would include the value, foo:\\\@ wouldn't, and so on.

That's fine, but it's not everything. There are other cases where placeholders need special treatment, most notably [@], \Q@\E, (?#@) and #@\n (with the x flag). We knew about those cases, and we made sure that, while the placeholder would still be injected in those cases, it wouldn't break the pattern and wouldn't introduce any unexpected behaviour.

So in other words, as long as users used the library according to the documentation, everything would be fine and every feature would be usable as usual.

The problem appears when a user doesn't use the library in accordance with the documentation. Ideally, we would throw an exception where the user's actions were invalid, and perform them where they were valid. Sadly, that turned out to be impossible with the current implementation. There's also another case: a user can use in-pattern structures to enable or disable the x flag, turning part of a pattern into a comment, or turning a comment off. In that case, handling the placeholder properly turned out to be virtually impossible, not only for the corner cases but for the standard cases as well. So we decided to spend months rewriting the internals of prepared patterns, which allows us to handle the pattern building process much better.

The changes haven't been released yet, but they will be soon. Here are the changes:

  • Currently, \@ would be left untouched. This behaviour is unchanged.
  • Currently, [@], \Q@\E, \c@ would be injected. From now on, they won't be.
  • Currently, a placeholder @ in a comment would be injected. From now on, it won't, regardless of the flags used in the main pattern or in any of the subpatterns.

So in short, in the current version, the @ placeholder is replaced every time, unless escaped.

In the next release, @ will be replaced only if it's a literal in the pattern. So, if @ is part of a character class ([@]), is quoted (\Q@\E), is escaped (\@), is in a comment ((?#@)), or is in an extended comment (#@\n, when the x flag is used), then it won't be injected. The same goes for any similar case that comes up in the future.
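To make the rule above more concrete, here is a minimal sketch of a scanner that finds the @ characters that count as plain literals. It is only an illustration under stated assumptions: literalPlaceholderOffsets() and skipCharacterClass() are hypothetical names, not T-Regx's API, the x flag is passed in as a boolean, and in-pattern flag switches such as (?x) as well as POSIX classes are deliberately not handled. The real implementation relies on a dedicated parser.

```php
<?php
/**
 * Hypothetical helper, not part of T-Regx: returns the byte offsets of every
 * "@" that is a plain literal in $pattern (and so would be injected).
 */
function literalPlaceholderOffsets(string $pattern, bool $extended = false): array
{
    $offsets = [];
    $length = strlen($pattern);
    $i = 0;
    while ($i < $length) {
        $char = $pattern[$i];
        if ($char === '\\') {
            $next = $pattern[$i + 1] ?? '';
            if ($next === 'Q') {
                // Quoted literal: skip to the closing \E, or to the end of the pattern.
                $end = strpos($pattern, '\\E', $i + 2);
                $i = $end === false ? $length : $end + 2;
            } elseif ($next === 'c') {
                // A control character such as \c@ consumes one more character.
                $i += 3;
            } else {
                // Any other escape, including \@ and \\.
                $i += 2;
            }
        } elseif ($char === '[') {
            $i = skipCharacterClass($pattern, $i);
        } elseif (substr($pattern, $i, 3) === '(?#') {
            // Comment group: skip to the closing parenthesis.
            $end = strpos($pattern, ')', $i + 3);
            $i = $end === false ? $length : $end + 1;
        } elseif ($extended && $char === '#') {
            // Extended comment: skip to the end of the line.
            $end = strpos($pattern, "\n", $i);
            $i = $end === false ? $length : $end + 1;
        } else {
            if ($char === '@') {
                $offsets[] = $i;
            }
            $i++;
        }
    }
    return $offsets;
}

/** Returns the offset just past the "]" that closes the class opened at $start. */
function skipCharacterClass(string $pattern, int $start): int
{
    $length = strlen($pattern);
    $i = $start + 1;
    if (($pattern[$i] ?? '') === '^') {
        $i++;
    }
    if (($pattern[$i] ?? '') === ']') {
        $i++; // a leading "]" is a literal member of the class
    }
    while ($i < $length && $pattern[$i] !== ']') {
        $i += $pattern[$i] === '\\' ? 2 : 1;
    }
    return $i + 1;
}
```

With this sketch, 'foo:@' yields one offset, while 'foo:\@', '[@]', '\Q@\E' and '(?#@)' yield none.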

Announcement - Prepared patterns simplification


Hello, dear regexp writers! Again! After the revamp of prepared patterns, there will be a change in the interface of the prepared patterns methods as well. Simply speaking, we'll simplify them.

Reconcile Pattern::inject() vs Pattern::bind()

When prepared patterns first came to be, the initial idea behind Pattern::bind() was that we could name our placeholders, so that the regular expression would become more readable. With named placeholders we could also reuse them.

However, after a year of production use, it turns out that naming placeholders compromises the robustness of the patterns more than it adds utility. And reusing placeholders proved to be even less frequent.

For example, instead of

Pattern::bind('', ['animal' => $animal]);

one could simply use

Pattern::inject('', [$animal, $animal]);

There have been debates as to which of those approaches is "cleaner", and the majority decided that Pattern::inject() is cleaner, despite the duplication of placeholders, on the rationale that if the placeholder is used twice, the injected value should be passed twice as well.

All in all, we decided that Pattern::bind() doesn't bring any more utility than Pattern::inject(), and there's nothing you could do with Pattern::bind() that you couldn't do with Pattern::inject(), so we decided to remove it from the library.

Bad design of Pattern::template()

Some time back, we introduced Pattern::template() as a way of building patterns using a fluent builder. You could specify a template with @ and & placeholders inside. @ placeholders would be injected with the values, while & would be injected with patterns, like masks.

After the review of the interface, we admit that was a bad interface from the start. We didn't think it through.

We decided that two placeholders, @ and & were superfluous, and we could easily achieve the same effect with just one. Additionally, we decided that we shouldn't have tied the template to the Pattern::inject() in such a crude way.

Pattern::template('&, @, &, @')
->literal() // replace the first "&" with "&"
->mask($mask, $keywords) // replace the second "&" with the mask
->inject([$first, $second]); // replace the first and the second "@" with values

We admit that this design was as bad as it could be, and we hated using it in production. It must be eliminated.

Instead, the new API will look similar to this one:

Pattern::template('@, @, @, @')
->literal('&') // replace the first "@" with "&"
->literal($first) // replace the second "@" with value
->mask($mask, $keywords) // replace the third "@" with the mask
->literal($second) // replace the fourth "@" with value

We believe this looks cleaner, is more descriptive, conveys intention and is less prone to bugs.

Templates and builders


We've released T-Regx 0.11.0.

This is more of a maintenance release. Most of our development time revolves around the inject issue (#91), which is quite a heavy feature: it requires us to rewrite our Prepared Patterns completely and use our own dedicated regular expression parser, since none of the parsers available on the internet matched our needs. It will probably be released as T-Regx 1.0, because it introduces too many breaking changes. (Actually, it was released as 0.12.0.)

Another time-consuming thing is the website, which is being rewritten from scratch; you can expect it in a few months.

In this release, we simplified PatternBuilder to Pattern, simplified the template() and mask() methods, unified Pattern/PatternImpl/PatternInterface into one entity, and added a Pcre version helper.

As always, everything about the release is described on GitHub.

Implicit all() in replace()


We've released T-Regx 0.10.2.

Previously, when doing replacements, you always had to specify explicitly how many of them to perform, so:

  • replace()->all()->with()
  • replace()->first()->by()
  • replace()->only(2)->focus()

Since 0.10.2, you can skip the quantifier, and just use with()/callback()/by()/focus() or any other replace methods, like so:

  • replace()->with()
  • replace()->by()
  • replace()->focus()

And they will replace every occurrence, just like all().

Don't worry, we don't use any kind of metaprogramming with magic methods or anything. We used simple polymorphism and design patterns (delegation and adapter in this case), so if you click Ctrl+B/Go to declaration in your IDE, you will see exactly what code is being run.

Additionally, we customized some exception messages. Now, depending on the nature of the failure, you will see one of these additional exception messages:
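As a rough illustration of the delegation idea, here is a minimal sketch. The class names Replace and LimitedReplace are hypothetical and only for illustration, they are not T-Regx's real internals, and the sketch passes the replacement straight to preg_replace().

```php
<?php
// Hypothetical sketch: class names are illustrative, not T-Regx's real internals.
final class LimitedReplace
{
    private string $pattern;
    private string $subject;
    private int $limit;

    public function __construct(string $pattern, string $subject, int $limit)
    {
        $this->pattern = $pattern;
        $this->subject = $subject;
        $this->limit = $limit;
    }

    public function with(string $replacement): string
    {
        // preg_replace() treats a limit of -1 as "no limit".
        return preg_replace($this->pattern, $replacement, $this->subject, $this->limit);
    }
}

final class Replace
{
    private string $pattern;
    private string $subject;

    public function __construct(string $pattern, string $subject)
    {
        $this->pattern = $pattern;
        $this->subject = $subject;
    }

    public function all(): LimitedReplace
    {
        return new LimitedReplace($this->pattern, $this->subject, -1);
    }

    public function first(): LimitedReplace
    {
        return new LimitedReplace($this->pattern, $this->subject, 1);
    }

    public function only(int $limit): LimitedReplace
    {
        return new LimitedReplace($this->pattern, $this->subject, $limit);
    }

    // Implicit all(): a plain method that delegates, with no magic methods involved.
    public function with(string $replacement): string
    {
        return $this->all()->with($replacement);
    }
}
```

Because with() on the outer object is an ordinary method, an IDE can jump from it to all() and then to the code that actually performs the replacement.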

  • Expected to get the 3-nth element from fluent pattern, but the subject backing the feed was not matched
  • Expected to get the first match as integer, but subject was not matched
  • Expected to get the first element from fluent pattern, but the elements feed has 0 element(s)
  • and more. You can see them all on github in /CleanRegex/Internal/Exception/Messages

As always, everything is described on GitHub.

Valentine's release


We've released T-Regx 0.10.1.

This time, we've updated match filtering. Previously, filter() used on a regular match pattern would only filter the Detail objects and expose exactly the same interface as that match pattern (like a filtering decorator), whereas fluent()->filter() simply removed entries from the fluent stream. We didn't like that difference.

So we renamed match()->filter() to match()->remaining(), since that name better reflects the decorator it is, and we added a new match()->filter() method, which works like all() but returns only the items matching the predicate (like array_filter).
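In plain PHP terms, the new filter() behaves roughly like the sketch below. The function matchesSatisfying() is a hypothetical name, and it works on matched strings rather than on Detail objects, so treat it as an approximation of the idea only.

```php
<?php
/**
 * Hypothetical helper, not part of T-Regx: returns every matched text
 * that satisfies $predicate, as a reindexed list.
 */
function matchesSatisfying(string $pattern, string $subject, callable $predicate): array
{
    preg_match_all($pattern, $subject, $matches);
    // array_filter() preserves keys, so reindex to get a plain list.
    return array_values(array_filter($matches[0], $predicate));
}
```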

Apart from that, we fixed a bug that was lurking in fluent()->flatMap() (don't worry, it's gone now :), and we improved fluent()->first(). Now, when filtering a fluent stream, first() begins by calling preg_match(), and if that match satisfies the predicate, its Detail is simply returned. If the first Detail doesn't satisfy the predicate, then preg_match_all() is called and the first Detail from it that satisfies the predicate is returned.

As always, everything is described on GitHub.
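The two-step lookup described above can be sketched with the preg functions directly. The function firstMatching() is a hypothetical name, and it returns the matched text (or null) instead of a Detail, so it only approximates the behaviour.

```php
<?php
/**
 * Hypothetical helper, not part of T-Regx: returns the first matched text
 * that satisfies $predicate, or null when there is none.
 */
function firstMatching(string $pattern, string $subject, callable $predicate): ?string
{
    // Cheap path: a single preg_match() is enough when the first match qualifies.
    if (preg_match($pattern, $subject, $match) === 1 && $predicate($match[0])) {
        return $match[0];
    }
    // Fallback: collect all matches and return the first one that qualifies.
    preg_match_all($pattern, $subject, $matches);
    foreach ($matches[0] as $text) {
        if ($predicate($text)) {
            return $text;
        }
    }
    return null;
}
```

The point of the design is that the common case, where the first match already satisfies the predicate, never pays for preg_match_all().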

T-Regx on PHP8


We've released T-Regx 0.10.0.

This release only adds support for PHP 8, which means handling the PHP 8 changes in method contracts and removing the previously deprecated Match, since match is now a keyword in PHP.

As always, everything is described on GitHub.

Formats and expectations!


We've released T-Regx 0.9.14.

We changed T-Regx quite a bit with all the breaking changes; we'll try to minimise that in the future. All the changes are listed in

In this release we added two major new features: formats and replace expectations.

Formats, using Pattern::format(), allow you to build real regular expressions using an arbitrary "mask" or "format string".
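To give a feel for the idea, here is a minimal sketch of turning a mask into a regular expression: keywords are swapped for their sub-patterns and everything else is quoted as a literal. The function maskToPattern() is a hypothetical name and this is not how Pattern::format() is implemented. It assumes a non-empty keyword map whose keywords don't overlap.

```php
<?php
/**
 * Hypothetical helper, not part of T-Regx: builds a delimited pattern from
 * $mask, replacing each keyword with its sub-pattern and quoting the rest.
 */
function maskToPattern(string $mask, array $keywords, string $delimiter = '/'): string
{
    // Alternation of the quoted keywords, used only to split the mask.
    $quoted = array_map(function (string $keyword): string {
        return preg_quote($keyword, '~');
    }, array_keys($keywords));
    $pieces = preg_split(
        '~(' . implode('|', $quoted) . ')~',
        $mask,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
    $result = '';
    foreach ($pieces as $piece) {
        // Keywords become their sub-pattern; any other text is matched literally.
        $result .= $keywords[$piece] ?? preg_quote($piece, $delimiter);
    }
    return $delimiter . $result . $delimiter;
}
```

For example, the mask '%d.%w' with the keywords '%d' => '\d+' and '%w' => '[a-z]+' gives a pattern in which the dot only matches a literal dot.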

Replacement expectations add methods to control how many replacements were performed. You can now count the replacements performed with counting(), while exactly(), atLeast() and atMost() are simplifications of counting() that simply validate whether the number of replacements performed matches the expectations.
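The idea behind exactly() can be sketched with the count argument of preg_replace(). The function replaceExactly() and the exception type used here are assumptions made for illustration, they are not T-Regx's API.

```php
<?php
/**
 * Hypothetical helper, not part of T-Regx: replaces every occurrence and
 * fails unless exactly $expected replacements were performed.
 */
function replaceExactly(string $pattern, string $replacement, string $subject, int $expected): string
{
    // The fifth argument of preg_replace() receives the number of replacements done.
    $result = preg_replace($pattern, $replacement, $subject, -1, $count);
    if ($count !== $expected) {
        throw new UnexpectedValueException(
            "Expected exactly $expected replacement(s), but $count were performed"
        );
    }
    return $result;
}
```

An atLeast() or atMost() variant would differ only in the comparison applied to the count.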