Stand with Ukraine

🇺🇸 To the people of Russia

There is a war in Ukraine right now. The forces of the Russian Federation are attacking civilian infrastructure in Kharkiv, Kyiv, Chernihiv, Sumy, Irpin and dozens of other cities. People are dying – both civilians and military servicemen, including Russian conscripts who were thrown into the fighting. In order to deprive its own people of access to information, the government of the Russian Federation has forbidden calling a war a war, shut down independent media and is passing a number of dictatorial laws. These laws are meant to silence all those who are against war. You can be jailed for multiple years for simply calling for peace.

Do not be silent! Silence is a sign that you accept the Russian government's policy.
You can choose NOT TO BE SILENT.

Announcement - Prepared patterns revamp

Rawwrrrr!

Hello, dear regexp writers! For about 5 months now, we've been working really hard on rewriting prepared patterns, in order to introduce certain necessary features to them.

The biggest issue with prepared patterns in their current form is that the only way to make a pattern ignore a placeholder is to escape it.

Pattern::inject('foo:@', ['bar']); // includes value
Pattern::inject('foo:\@', []); // doesn't include value

Of course, you could also escape the backslash, so foo:\\@ would include the value, foo:\\\@ wouldn't, and so on.

That's fine, but it's not everything. There are other cases where placeholders need special treatment, most notably [@], \Q@\E, (?#@) and #@\n (with the x flag). We knew about those cases, and we made sure that, while the placeholder would still be injected in them, it wouldn't break the pattern or introduce any unexpected behaviour.

In other words, as long as users used the library according to the documentation, everything would be fine and every feature would be usable as usual.

The problem arises when a user uses the library in a way the documentation doesn't describe. Ideally, we would throw an exception when the user's actions are invalid, and perform them when they are valid. Sadly, with the current implementation that turned out to be impossible. There's also another case: a user can use in-pattern structures to enable or disable the x flag, turning part of a pattern into a comment, or turning a comment back into a pattern. In that case, handling the placeholder properly turned out to be virtually impossible, not just for the corner cases but for the standard cases as well. So we decided to spend months rewriting the internals of prepared patterns, which lets us handle the pattern building process much better.

The changes haven't been released yet, but they will be soon. Here are the changes:

  • Currently, \@ would be left untouched. This behaviour is unchanged.
  • Currently, the placeholder in [@], \Q@\E and \c@ is injected. From now on, it won't be.
  • Currently, a placeholder @ in a comment is injected. From now on, it won't be, regardless of the flags used in the main pattern or in any of the subpatterns.

So in short, in the current version the @ placeholder is replaced every time, unless it is escaped.

In the next release, @ will be replaced only if it's a literal in the pattern. So if @ is part of a character class ([@]), is quoted (\Q@\E), is escaped (\@), is in a comment ((?#@)), or is in an extended comment (#@\n, when the x flag is used), it won't be injected. The same goes for any similar case we come to support.
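
To make the rule above more concrete, here is a minimal sketch of context-aware injection written with plain PHP string functions. It is not the T-Regx parser: the function name injectLiteralPlaceholders() is hypothetical, it assumes "/" as the delimiter, it treats the first "]" as the end of a character class, and it doesn't handle extended comments under the x flag at all.

```php
<?php
// Hypothetical sketch, NOT the T-Regx implementation.
// Injects values only where "@" is a literal in the template.
function injectLiteralPlaceholders(string $template, array $values, string $delimiter = '/'): string
{
    $result = '';
    $length = strlen($template);
    $inClass = false; // inside [...]
    $inQuote = false; // inside \Q...\E
    for ($i = 0; $i < $length; $i++) {
        $char = $template[$i];
        if ($inQuote) {
            // Inside \Q...\E everything is copied verbatim, up to \E.
            if (substr($template, $i, 2) === '\E') {
                $inQuote = false;
                $result .= '\E';
                $i++;
            } else {
                $result .= $char;
            }
            continue;
        }
        if ($char === '\\') {
            // Escape sequence: copy the backslash with the next character,
            // and for \c also the control character that follows it.
            $next = $template[$i + 1] ?? '';
            $sequence = $char . $next;
            if ($next === 'c') {
                $sequence .= $template[$i + 2] ?? '';
            }
            if ($next === 'Q') {
                $inQuote = true;
            }
            $result .= $sequence;
            $i += strlen($sequence) - 1;
            continue;
        }
        if ($inClass) {
            // Simplification: the first unescaped "]" closes the class.
            if ($char === ']') {
                $inClass = false;
            }
            $result .= $char;
            continue;
        }
        if ($char === '[') {
            $inClass = true;
            $result .= $char;
            continue;
        }
        if (substr($template, $i, 3) === '(?#') {
            // Group comment: copy up to and including the closing ")".
            $end = strpos($template, ')', $i);
            $end = $end === false ? $length - 1 : $end;
            $result .= substr($template, $i, $end - $i + 1);
            $i = $end;
            continue;
        }
        if ($char === '@') {
            if ($values === []) {
                throw new InvalidArgumentException('Not enough values for placeholders');
            }
            // A literal placeholder: inject the next value, quoted.
            $result .= preg_quote((string) array_shift($values), $delimiter);
            continue;
        }
        $result .= $char;
    }
    if ($values !== []) {
        throw new InvalidArgumentException('More values than placeholders');
    }
    return $result;
}
```

With this sketch, the template [@]\Q@\E(?#@)@ consumes exactly one value, for the last @ only.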

Announcement - Prepared patterns simplification

Rawwrrrr!

Hello, dear regexp writers! Again! After the revamp of prepared patterns, the interface of the prepared patterns methods will change as well. Simply speaking, we'll simplify them.

Reconcile Pattern::inject() vs Pattern::bind()

When prepared patterns first came to be, the initial idea behind Pattern::bind() was that we could name our placeholders, so that the regular expression would become more readable. With named placeholders we could also reuse them.

However, after a year of production use, it turns out that naming placeholders compromises the robustness of the patterns more than it adds utility. And reusing placeholders proved to be even less frequent.

For example, instead of

Pattern::bind('http://@animal.site.com/@animal', ['animal' => $animal]);

one could simply use

Pattern::inject('http://@.site.com/@', [$animal, $animal]);

There have been debates as to which of those approaches is "cleaner", and the majority decided that Pattern::inject() is cleaner despite the duplication of placeholders, on the rationale that if the placeholder is used twice, the injected value should be passed twice too.

All in all, we decided that Pattern::bind() doesn't bring any more utility than Pattern::inject(), and there's nothing you can do with Pattern::bind() that you can't do with Pattern::inject(), so we decided to remove it from the library.

Bad design of Pattern::template()

Some time back, we introduced Pattern::template() as a way of building patterns using a fluent builder. You could specify a template with @ and & placeholders inside. @ placeholders would be injected with values, while & placeholders would be injected with patterns, like masks.

After the review of the interface, we admit that was a bad interface from the start. We didn't think it through.

We decided that two placeholders, @ and &, were superfluous, and that we could easily achieve the same effect with just one. Additionally, we decided that we shouldn't have tied the template to Pattern::inject() in such a crude way.

Pattern::template('&, @, &, @')
->literal() // replace the first "&" with "&"
->mask($mask, $keywords) // replace the second "&" with the mask
->inject([$first, $second]); // replace the first and the second "@" with values

We admit that this design was as bad as it could be, and we hated using it in production. It must be eliminated.

Instead, the new API will look similar to this one:

Pattern::template('@, @, @, @')
->literal('&') // replace the first "@" with "&"
->literal($first) // replace the second "@" with value
->mask($mask, $keywords) // replace the third "@" with the mask
->literal($second) // replace the fourth "@" with value
->build();

In our opinion, this looks cleaner, is more descriptive, conveys intention and is less prone to bugs.
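
For readers wondering how a single-placeholder builder could work, here is a rough, self-contained sketch. The class TemplateSketch and its pattern() method are hypothetical (pattern() merely stands in for mask() by inserting a raw sub-pattern), and the sketch naively treats every @ as a placeholder, so it is not how T-Regx builds templates.

```php
<?php
// Hypothetical sketch of a one-placeholder template builder.
// Each call fills the next "@" in order; build() assembles the pattern.
final class TemplateSketch
{
    private $template;
    private $replacements = [];

    public function __construct(string $template)
    {
        $this->template = $template;
    }

    // Fill the next "@" with a value that must match literally.
    public function literal(string $text): self
    {
        $this->replacements[] = preg_quote($text, '/');
        return $this;
    }

    // Fill the next "@" with a raw sub-pattern (assumed to be valid).
    public function pattern(string $regex): self
    {
        $this->replacements[] = '(?:' . $regex . ')';
        return $this;
    }

    public function build(): string
    {
        $pieces = explode('@', $this->template);
        if (count($pieces) - 1 !== count($this->replacements)) {
            throw new LogicException('Placeholders and replacements differ in number');
        }
        // Interleave the static pieces with the replacements, in order.
        $result = array_shift($pieces);
        foreach ($pieces as $index => $piece) {
            $result .= $this->replacements[$index] . $piece;
        }
        return '/' . $result . '/';
    }
}

$pattern = (new TemplateSketch('@, @'))
    ->literal('&')
    ->pattern('\d+')
    ->build();
```

Because every placeholder is filled by exactly one call, the order of the calls mirrors the order of the placeholders in the template.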

Templates and builders

Rawwrrrr!

We've released T-Regx 0.11.0.

This is more of a maintenance release. Most of our development time revolves around the inject #91 issue, which is quite a heavy feature: it requires us to rewrite our Prepared Patterns completely and to use our own dedicated regular expression parser, since none of the parsers available on the internet matched our needs. It will probably be released as T-Regx 1.0, because it introduces too many breaking changes. (Actually, it was released as 0.12.0.)

Another time-consuming thing is the t-regx.com website, which is being rewritten from scratch. You can expect it in a few months.

In this release, we simplified PatternBuilder to Pattern, simplified template() and mask() methods, unified Pattern/PatternImpl/PatternInterface into one being, and we added Pcre version helper.

As for the release, as always, everything is described in ChangeLog.md on github.

Implicit all() in replace()

Rawwrrrr!

We've released T-Regx 0.10.2.

Normally, when doing replacements, you always had to specify explicitly the number of them, so:

  • replace()->all()->with()
  • replace()->first()->by()
  • replace()->only(2)->focus()

Since 0.10.2, you can skip the quantifier, and just use with()/callback()/by()/focus() or any other replace methods, like so:

  • replace()->with()
  • replace()->by()
  • replace()->focus()

And they will replace every occurrence, just like all().

Don't worry, we don't use any kind of metaprogramming with magic methods or anything. We used simple polymorphism and design patterns (delegation and adapter in this case), so if you click Ctrl+B/Go to declaration in your IDE, you will see exactly what code is being run.

Additionally, we customized some exception messages. Now, depending on the nature of your exception, you will see one of these additional exception messages:
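
As an illustration of the delegation idea, here is a minimal sketch. The classes ReplaceSketch and LimitedReplaceSketch are hypothetical and built directly on preg_replace(); they are not the T-Regx classes, and unlike the library they let preg_replace() interpret references such as $1 in the replacement.

```php
<?php
// Hypothetical sketch: an implicit all() done with plain delegation,
// without __call() or any other magic method.
final class LimitedReplaceSketch
{
    private $pattern;
    private $subject;
    private $limit;

    public function __construct(string $pattern, string $subject, int $limit)
    {
        $this->pattern = $pattern;
        $this->subject = $subject;
        $this->limit = $limit;
    }

    public function with(string $replacement): string
    {
        // A limit of -1 means "no limit" for preg_replace().
        return preg_replace($this->pattern, $replacement, $this->subject, $this->limit);
    }
}

final class ReplaceSketch
{
    private $pattern;
    private $subject;

    public function __construct(string $pattern, string $subject)
    {
        $this->pattern = $pattern;
        $this->subject = $subject;
    }

    public function all(): LimitedReplaceSketch
    {
        return $this->only(-1);
    }

    public function first(): LimitedReplaceSketch
    {
        return $this->only(1);
    }

    public function only(int $limit): LimitedReplaceSketch
    {
        return new LimitedReplaceSketch($this->pattern, $this->subject, $limit);
    }

    // The implicit all(): an ordinary method that delegates explicitly.
    public function with(string $replacement): string
    {
        return $this->all()->with($replacement);
    }
}
```

Since with() on ReplaceSketch is a regular method, "Go to declaration" lands on real code that visibly forwards to all().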

  • Expected to get the 3-nth element from fluent pattern, but the subject backing the feed was not matched
  • Expected to get the first match as integer, but subject was not matched
  • Expected to get the first element from fluent pattern, but the elements feed has 0 element(s)
  • and more. You can see them all on github in /CleanRegex/Internal/Exception/Messages

As always, everything is described in ChangeLog.md on github.

Valentine's release

Rawwrrrr!

We've released T-Regx 0.10.1.

This time, we've updated match filtering. Previously, filter() used on a regular match pattern filtered only the Detail objects and exposed exactly the same interface as the match pattern itself (like a filtering decorator), whereas fluent()->filter() simply removed entries from the fluent stream. We didn't like that difference.

So we renamed match()->filter() to match()->remaining(), since that name better reflects the decorator it is, and we added a new match()->filter() method, which works like all() but returns only the items matching the predicate (like array_filter).

Apart from that, we fixed a bug that was lurking in fluent()->flatMap() (don't worry, it's gone now :) and improved fluent()->first(). Now, when filtering a fluent stream, first() initially calls preg_match(), and if the resulting Detail matches the predicate, it is simply returned. If it doesn't, first() calls preg_match_all() and returns the first Detail that matches the predicate.
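
The two-step lookup described above can be sketched with plain preg_* functions. The helper firstMatching() is hypothetical and works on matched strings rather than on T-Regx Detail objects.

```php
<?php
// Hypothetical helper illustrating the "cheap call first" strategy.
function firstMatching(string $pattern, string $subject, callable $predicate): ?string
{
    // Cheap path: a single preg_match() finds only the first occurrence.
    if (preg_match($pattern, $subject, $match) === 1 && $predicate($match[0])) {
        return $match[0];
    }
    // Fallback: collect every occurrence and return the first accepted one.
    preg_match_all($pattern, $subject, $matches);
    foreach ($matches[0] as $text) {
        if ($predicate($text)) {
            return $text;
        }
    }
    return null; // nothing matched the predicate
}

$isThreeDigits = function (string $text): bool {
    return strlen($text) === 3;
};
$found = firstMatching('/\d+/', '12 7 140', $isThreeDigits);
```

The benefit is that the more expensive preg_match_all() call is skipped whenever the very first occurrence already satisfies the predicate.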

As always, everything is described in ChangeLog.md on github.

T-Regx on PHP8

Rawwrrrr!

We've released T-Regx 0.10.0.

This release only adds support for PHP8: it handles the PHP8 changes in method contracts and removes the previously deprecated Match, since match is now a keyword in PHP.

As always, everything is described in ChangeLog.md on github.

Formats and expectations!

Heey!

We've released T-Regx 0.9.14.

We changed T-Regx quite a bit with all the breaking changes; we'll try to minimise them in the future. All the changes are listed in ChangeLog.md.

In this release we added two major new features: formats and replace expectations.

Formats, available through Pattern::format(), allow you to build real regular expressions from an arbitrary "mask" or "format string".
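
To give a feel for the idea, here is a rough sketch of turning a format string into a regular expression. The function formatToPattern() and its token map are hypothetical, the token map is assumed to be non-empty with non-overlapping keywords, and the actual signature of Pattern::format() may differ.

```php
<?php
// Hypothetical sketch: keywords in the format become sub-patterns,
// everything else is matched literally.
function formatToPattern(string $format, array $tokens): string
{
    // Quote each keyword so it can be used to split the format string.
    $keywords = array_map(function (string $keyword): string {
        return preg_quote($keyword, '/');
    }, array_keys($tokens));

    // Split into keywords and the literal text between them.
    $pieces = preg_split(
        '/(' . implode('|', $keywords) . ')/',
        $format,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $regex = '';
    foreach ($pieces as $piece) {
        $regex .= array_key_exists($piece, $tokens)
            ? '(?:' . $tokens[$piece] . ')' // keyword: use its sub-pattern
            : preg_quote($piece, '/');      // literal text: quote it
    }
    return '/^' . $regex . '$/';
}

$pattern = formatToPattern('%w:%d', ['%w' => '\w+', '%d' => '\d+']);
```

Here the subject port:8080 matches $pattern, while port:abc does not, because %d only allows digits.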

Replacement expectations add methods that control how many replacements were performed. You can now count the replacements with counting(), while exactly(), atLeast() and atMost() are simplifications of counting() that simply validate whether the number of replacements performed matches the expectation.
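
The underlying idea can be sketched with the count argument of preg_replace(). The function replaceExpecting() and the exception it throws are hypothetical; they only mimic the spirit of exactly(), atLeast() and atMost().

```php
<?php
// Hypothetical sketch: validate how many replacements were performed.
function replaceExpecting(string $pattern, string $replacement, string $subject, int $min, int $max): string
{
    // The fifth argument of preg_replace() receives the replacement count.
    $result = preg_replace($pattern, $replacement, $subject, -1, $count);
    if ($count < $min || $count > $max) {
        throw new UnexpectedValueException(
            "Expected between $min and $max replacement(s), but $count were performed"
        );
    }
    return $result;
}

// "exactly 2" is expressed as min = max = 2.
$replaced = replaceExpecting('/\d/', '#', 'a1b2', 2, 2);
```

An "at least" check would pass PHP_INT_MAX as the upper bound, and an "at most" check would pass 0 as the lower bound.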

Dark mode

Heey!

We added dark mode to the documentation page, which we know is sexy right now.

We also updated a bunch of documentation pages, and there's more on the way.

We'll try to make the documentation as rich as possible, before we split the releases of T-Regx into PHP7 and PHP8 versions.