Regular expressions are a powerful tool for using patterns to search and modify text, and are vital in many programs, programming languages, databases, and
Since 1999, UTS #18: Unicode Regular Expressions has supplied guidelines and conformance levels for supporting Unicode in regular expressions (regex). The new version 21 broadens the scope of regex properties to allow for properties of strings, such as emoji sequences. For example, the following matches all emoji flags except the French flag:
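In UTS #18 notation, a pattern of this kind would plausibly take a form such as [\p{RGI_Emoji_Flag_Sequence}--\q{🇫🇷}], combining a property of strings, set difference (--), and a string literal (\q{...}). Many regex engines do not support properties of strings, so the following is only a minimal Python sketch of what such a pattern selects. It assumes that any pair of regional indicator symbols counts as a flag, which is broader than the actual RGI_Emoji_Flag_Sequence property, and the names FLAG_PAIR, FRANCE, and flags_except_france are hypothetical.

```python
import re

# Assumption: any two consecutive regional indicator symbols
# (U+1F1E6..U+1F1FF) are treated as a flag. The real
# RGI_Emoji_Flag_Sequence property is narrower than this.
FLAG_PAIR = re.compile("[\U0001F1E6-\U0001F1FF]{2}")

# The French flag: REGIONAL INDICATOR SYMBOL LETTER F + LETTER R.
FRANCE = "\U0001F1EB\U0001F1F7"


def flags_except_france(text: str) -> list[str]:
    """Return every flag-like pair in text, skipping the French flag."""
    # findall consumes pairs left to right, so adjacent flags stay aligned
    # as long as the regional indicators in the text come in pairs.
    return [flag for flag in FLAG_PAIR.findall(text) if flag != FRANCE]
```

Filtering after matching, rather than using a negative lookahead, avoids accidentally pairing the second half of a skipped French flag with the first half of the flag that follows it.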
Among the improvements are:
A new Annex D: Resolving Character Classes with Strings, for handling negations of sets of strings.
An updated full property list that includes the latest UCD properties, plus Emoji properties and UTS #39 properties.
The removal of obsolete text passages, and editorial changes for clarity.