Prepared Patterns
If you use Pattern::inject() or Pattern::template(), you can explicitly specify which parts of your pattern should be treated as string literals, and not as regular expression special characters.
Prepared Patterns also make sure that the values treated as string literals are quoted with the same delimiter that was chosen by Automatic Delimiters.
Use-case
When you need to use unsafe data in your patterns, it might be tempting to do something like this:
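A minimal sketch of that tempting approach, assuming the pattern is built by plain string concatenation. The helper name `naivePattern()` is hypothetical, and a function argument stands in for `$_GET['domain']` so the snippet runs on its own:

```php
<?php
// Hypothetical helper: builds the pattern by naive concatenation (unsafe).
function naivePattern(string $domain): string
{
    // "#" is used as the delimiter so the slashes need no escaping.
    return '#^https://' . $domain . '\.(com|net)#';
}

// A benign value behaves as expected.
var_dump(preg_match(naivePattern('example'), 'https://example.com'));  // int(1)

// Regex metacharacters in the input silently change what the pattern means.
var_dump(preg_match(naivePattern('.*'), 'https://anything-at-all.net'));  // int(1)
```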
But you, dear reader, know that it's a terrible, terrible idea. $_GET['domain'] may contain unexpected or malicious regular expression characters.
We need to treat each part of that pattern separately:
- ^https:// must be treated as a regular expression
- $_GET['domain'] must be treated as a string
- \.(com|net) must be treated as a regular expression
Pattern::inject()
Pattern::inject() allows you to specify an @ placeholder in your pattern, into which your value will be safely injected.
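To make the idea concrete without requiring the library, here is a plain-PHP sketch of placeholder injection. It is not T-Regx's implementation: `injectSketch()` is a hypothetical function, the delimiter is fixed to "#" (so the pattern itself must not contain "#"), the number of values is assumed to match the number of placeholders, and none of the special cases for "@" are handled. With T-Regx itself the call is assumed to look like `Pattern::inject('https?://@/index\.php', [$_GET['domain']])`.

```php
<?php
// Hypothetical stand-in for the idea behind Pattern::inject(); not the library's code.
function injectSketch(string $pattern, array $values, string $delimiter = '#'): string
{
    $index = 0;
    // Replace each "@" with the next value, quoted so it is matched literally.
    $body = preg_replace_callback(
        '/@/',
        function (array $match) use ($values, &$index, $delimiter): string {
            return preg_quote($values[$index++], $delimiter);
        },
        $pattern
    );
    return $delimiter . $body . $delimiter;
}

// 'example.com' stands in for $_GET['domain'].
$pattern = injectSketch('https?://@/index\.php', ['example.com']);

var_dump($pattern);                                              // string(34) "#https?://example\.com/index\.php#"
var_dump(preg_match($pattern, 'http://example.com/index.php'));  // int(1)
var_dump(preg_match($pattern, 'http://exampleXcom/index.php'));  // int(0): the dot is literal
```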
The code above means:
- Treat https?:// and /index\.php as regexp
- Replace @ with $_GET['domain'], with all regexp special characters quoted.
PCRE modifiers
Should there be a need for additional PCRE modifiers (flags), simply pass them as the last argument to the prepared pattern method.
PCRE-styled patterns
Should there be a need for your own delimiters, or you just want to use PCRE style, simply use the PatternBuilder::builder()->pcre()-> method, instead of the Pattern:: facade:
Old-school pattern quoting
Had you chosen to work with regular PCRE functions, your code might look similar to this:
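A minimal sketch of that old-school approach, using only `preg_quote()` and `preg_match()`. The `$domain` variable is a hypothetical stand-in for `$_GET['domain']`:

```php
<?php
// Hypothetical input standing in for $_GET['domain'].
$domain = 'exa.mple';

// The delimiter ("/") has to be repeated in preg_quote();
// otherwise a "/" in the input would end the pattern early.
$pattern = '/^https:\/\/' . preg_quote($domain, '/') . '\.(com|net)/';

var_dump(preg_match($pattern, 'https://exa.mple.com'));  // int(1)
var_dump(preg_match($pattern, 'https://exaXmple.com'));  // int(0): the dot is literal
```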
Prepared Patterns address some of the flaws of this approach. They:
- automatically delimit your input, so there's no need to specify the delimiter again in preg_quote().
- are declarative, meaning you only need to declare that you want those values to be treated as string literals.
- fix the inconsistency of preg_quote() quoting different values since PHP 7.3 (as of that version it also quotes the # character).
They also add functionality that is currently missing in PHP:
- Extended mode (enabled e.g. with the x flag) ignores whitespace, so large expressions can be split across multiple lines. preg_quote() doesn't quote spaces, so user-input spaces are also going to be ignored. Prepared Patterns, however, will preserve them.
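The extended-mode problem can be reproduced with plain PCRE functions. A short sketch, where `$input` is a hypothetical user-supplied value:

```php
<?php
// Hypothetical user input containing a space.
$input = 'foo bar';

// preg_quote() leaves the space untouched, and the x flag then ignores it,
// so the pattern effectively becomes /^foobar$/.
$pattern = '/^' . preg_quote($input, '/') . '$/x';

var_dump(preg_match($pattern, 'foo bar'));  // int(0): the space the user typed is lost
var_dump(preg_match($pattern, 'foobar'));   // int(1)
```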
What about special cases
T-Regx prepared patterns understand that sometimes the "@" placeholder shouldn't be treated as a placeholder, even when using Pattern::inject()/Pattern::template(). These cases are:
- Character class: \w+:[0-9@]
- Perl quote: \w+:\Q@\E
- Control character: \w+:\c@
- Comment (when x flag is used): \w+:#@\n
When @ appears in one of those fragments, it won't be treated as a placeholder, and values won't be injected into it.
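For context, "@" inside these constructs already has a fixed meaning in plain PCRE, which is presumably why it is left alone. A small sketch using only `preg_match()`, covering the character class and Perl quote cases (the subject strings are made up for illustration):

```php
<?php
// Inside a character class, "@" is simply one of the allowed characters.
$inClass = preg_match('/^\w+:[0-9@]$/', 'user:@');
var_dump($inClass);  // int(1)

// Between \Q and \E, everything (including "@") is matched literally.
$inQuote = preg_match('/^\w+:\Q@\E$/', 'user:@');
var_dump($inQuote);  // int(1)
```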