
Handling user input

Prepared Patterns let you safely use user input or other unsafe data that might contain regular expression special characters. They are also integrated with Automatic Delimiters, so values are quoted with regard to the delimiter that was chosen automatically for you.

You can either use Pattern::template(), a fully-fledged prepared pattern builder, or settle for the quick and easy Pattern::inject().

You can read about each of them in the next section, but for now, let's cover the basics.

Why handling user input is important

Let's say you would like to search a subject for "My dog's name is Barky", where the dog's name is user input. Maybe you created a web application that allows anyone to search for their dogs.

$input = $_GET['name'];
Pattern::of("(My|Our) dog's name is " . $input . '!');

Immediately, though, you can see that $input may contain special characters (like . or ?) that might interfere with your pattern. If unsafe values aren't handled properly, the pattern is also exposed to ReDoS (regular expression denial of service) attacks. Moreover, someone might deliberately craft input to break your pattern.

For example, given the query param ?name=(Barky!, this is what the pattern would end up looking like:

Pattern::of("(My|Our) dog's name is (Barky!!");
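The effect of naive concatenation can be reproduced with PHP's built-in preg_match() alone. The following is a minimal sketch, not T-Regx code: the input values and the hand-picked "/" delimiter are assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for $_GET['name']: an unmatched "(" from the user.
$input = '(Barky';

// Naive concatenation, using "/" as a hand-picked delimiter.
$pattern = "/(My|Our) dog's name is " . $input . "!/";

// The pattern fails to compile; @ hides the warning and
// preg_match() signals the failure by returning false.
$result = @preg_match($pattern, "My dog's name is (Barky!");
var_dump($result); // bool(false)

// A harmless-looking "." silently changes the meaning instead:
// it matches any character, so a different name is accepted.
$dotPattern = "/(My|Our) dog's name is " . 'B.rky' . "!/";
$dotResult = preg_match($dotPattern, "My dog's name is Borky!");
var_dump($dotResult); // int(1)
```

The first case fails loudly, the second one fails silently, which is arguably worse.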

If $input contained an unbalanced parenthesis by accident, as in (Barky! above, you would receive an exception such as missing ) at offset 31, but that's not all. If you simply try to use a malformed pattern, T-Regx throws an exception and you're done. However, someone able to inject malicious expressions can add other, more harmful structures, for example:

  • Complex look-aheads and look-behinds ((?!, (?<!)
  • Recursive patterns ((?R))
  • Structures prone to catastrophic backtracking

Such harmful structures can realistically expose your application and your server to ReDoS attacks.

Read on to learn about proper handling of user input.
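To illustrate the last point in the list, here is a hedged sketch using plain preg_match(). The injected value (a+)+ and the subject are hypothetical, and the exact outcome depends on your PCRE version and settings such as pcre.jit and pcre.backtrack_limit, so no particular output is promised.

```php
<?php
// Hypothetical malicious value for $_GET['name']: a nested quantifier.
$input = '(a+)+';

// Naive concatenation; "/" is a hand-picked delimiter.
$pattern = "/(My|Our) dog's name is " . $input . "!/";

// A subject that almost matches: a long run of "a",
// followed by "b" where the pattern expects "!".
$subject = "My dog's name is " . str_repeat('a', 40) . 'b!';

$result = preg_match($pattern, $subject);

if ($result === false) {
    // The engine gave up, for instance after hitting the backtrack limit.
    echo 'Matching aborted, error code: ', preg_last_error(), PHP_EOL;
} else {
    echo 'Matching finished, result: ', $result, PHP_EOL;
}
```

Either way the subject cannot match. The point is how much work the engine may be forced to do before it finds that out.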

Why not just preg_quote()

Good question.

For the same reason good programmers use prepared SQL statements instead of mysql_real_escape_string(). Prepared patterns let you separate the regular expression from unsafe data, which helps make the pattern safer:

  • it's declarative, which means you only need to declare how you would like the data to be used.
  • delimiters become an implementation detail, one less thing to worry about.
  • extended mode (enabled with the x flag or an in-pattern construct) requires spaces and other whitespace to be quoted as well, which preg_quote() doesn't do at all!
  • preg_quote() doesn't quote # (which starts a comment in extended mode) before PHP 7.3; in T-Regx this is handled on all PHP versions.
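The whitespace and delimiter points above can be checked with plain PHP. This is a small sketch with made-up sample values; it uses only preg_quote() and preg_match().

```php
<?php
// preg_quote() leaves whitespace untouched.
$quoted = preg_quote('Mr Barky');
var_dump($quoted === 'Mr Barky'); // bool(true)

// Under the x flag, unescaped whitespace in the pattern is ignored,
// so the "quoted" value no longer matches itself.
$pattern = '/^' . $quoted . '$/x';
$matchesItself = preg_match($pattern, 'Mr Barky');
$matchesSquashed = preg_match($pattern, 'MrBarky');
var_dump($matchesItself);   // int(0)
var_dump($matchesSquashed); // int(1)

// The delimiter is quoted only when it is passed explicitly.
$withoutDelimiter = preg_quote('a/b');
$withDelimiter = preg_quote('a/b', '/');
var_dump($withoutDelimiter); // string(3) "a/b"
var_dump($withDelimiter);    // string(4) "a\/b"
```

With preg_quote(), both the flags in use and the chosen delimiter have to be kept in mind at every call site.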