Warning: preg_replace(): Unknown modifier ']'
I have the following error:
Warning: preg_replace(): Unknown modifier ']' in xxx.php on line 38
This is the code on line 38:
<?php echo str_replace("</ul></div>", "", preg_replace("<div[^>]*><ul[^>]*>", "", wp_nav_menu(array('theme_location' => 'nav', 'echo' => false)) )); ?>
Can someone please help me to fix this problem?
php regex wordpress preg-replace
6
Add delimiters around the pattern: "/<div[^>]*><ul[^>]*>/"
– raina77ow
Dec 20 '13 at 14:07
1
@mario I don't really see why you put a bounty here? Are you really looking for new answers here? If yes what's wrong with the current one?
– Rizier123
Jul 2 '15 at 9:36
1
@Rizier123 The bounty description says it all: "One or more of the answers is exemplary and worthy of an additional bounty."
– birgire
Jul 2 '15 at 10:16
3
Yes, this isn't meant to attract more answers. The existing one is a pretty excellent example already. It's a great visual explanation, and likely applicable to many similar cases. And such mini bounties are mainly intended as a temporary public bookmark, to make it better known. And perhaps establish this as another universal reference. (Though it could make sense to craft an artificial CW answer with extra examples + links afterwards…)
– mario
Jul 2 '15 at 19:33
1
@mario If this should get an artificial answer, shouldn't we change the example a bit? I mean the OP is parsing HTML with regexes. I'm with you that the answer shows a lot of effort (and I like him and his posts) but I'm asking: Is this necessary? I mean a short "You need to enclose your regex with a delimiter" and a link to the (very good!) documentation would have been enough, wouldn't it? IMHO all that extra information is going in the wrong direction and may confuse (expected newbie) users more than it helps.
– hek2mgl
Jul 2 '15 at 21:49
asked Dec 20 '13 at 14:05 by user3122995, edited Aug 7 '15 at 16:31 by Rizier123
2 Answers
accepted
Why the error occurs
In PHP, a regular expression needs to be enclosed within a pair of delimiters. A delimiter can be any non-alphanumeric, non-backslash, non-whitespace character; /, # and ~ are the most commonly used ones. Note that it is also possible to use bracket-style delimiters, where the opening and closing brackets are the starting and ending delimiters, e.g. <pattern_goes_here>, [pattern_goes_here] etc. are all valid.
The "Unknown modifier X" error usually occurs in the following two cases:
When your regular expression is missing delimiters.
When you use the delimiter inside the pattern without escaping it.
In this case, the regular expression is <div[^>]*><ul[^>]*>. Because it starts with <, PHP treats < and the first matching > as a pair of bracket-style delimiters. Everything up to that > is taken as the regex pattern, and everything afterwards as modifiers.
Regex: <div[^>  ]*><ul[^>]*>
       │     │  │          │
       └──┬──┘  └────┬─────┘
       pattern    modifiers
] here is an unknown modifier, because it appears right after the closing > delimiter, which is why PHP throws that error.
Depending on the pattern, the unknown modifier complaint might as well have been about *, +, p, / or ), or almost any other letter/symbol. Only imsxeADSUXJu are valid PCRE modifiers (and the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0).
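To see the failure mode in isolation, here is a minimal sketch that captures the warning instead of printing it. The $html string is a made-up stand-in for whatever wp_nav_menu() returns, and the handler is only there to make the message inspectable.

```php
<?php
// Collect warnings in an array instead of letting PHP print them.
$warnings = [];
set_error_handler(function (int $errno, string $errstr) use (&$warnings): bool {
    $warnings[] = $errstr;
    return true; // skip PHP's default handling for this warning
}, E_WARNING);

// Hypothetical sample markup; not real WordPress output.
$html = '<div class="menu"><ul id="nav"><li>Home</li></ul></div>';

// No explicit delimiters: "<" and the first ">" get used as a delimiter pair,
// so "]" is the first character of the supposed modifier list.
$result = preg_replace("<div[^>]*><ul[^>]*>", "", $html);

restore_error_handler();

var_dump($result);  // NULL, because the pattern could not be compiled
print_r($warnings); // the captured message mentions: Unknown modifier ']'
```

Note that preg_replace() returns NULL here rather than the untouched input, so the surrounding str_replace() in the question ends up echoing an empty string.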
How to fix it
The fix is easy. Just wrap your regex pattern with any valid delimiters. In this case, you could choose ~ and get the following:
~<div[^>]*><ul[^>]*>~
│                   │
│                   └─ ending delimiter
└───────────────────── starting delimiter
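Applied to the line from the question, the fix could look roughly like this. It is only a sketch: $menuHtml is an invented sample standing in for the wp_nav_menu() output, so the exact markup in a real theme will differ.

```php
<?php
// Stand-in for wp_nav_menu(array('theme_location' => 'nav', 'echo' => false));
// the markup is a made-up example.
$menuHtml = '<div class="menu-container"><ul id="menu" class="menu"><li>Home</li></ul></div>';

// Same pattern as in the question, now wrapped in ~ delimiters.
$withoutOpening = preg_replace('~<div[^>]*><ul[^>]*>~', '', $menuHtml);

// The closing tags are a fixed string, so str_replace() is enough for them.
$itemsOnly = str_replace('</ul></div>', '', $withoutOpening);

echo $itemsOnly; // <li>Home</li>
```

Single quotes around the pattern are a matter of taste here; they just avoid any accidental variable interpolation.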
If you're receiving this error despite having used a delimiter, it might be because the pattern itself contains unescaped occurrences of the said delimiter.
Or escape delimiters
/foo[^/]+bar/i
would certainly throw an error, because the slash inside the character class is taken as the ending delimiter. So you can escape it using a backslash if it appears anywhere within the regex:
/foo[^\/]+bar/i
│      │     │
└──────┼─────┴─ actual delimiters
       └─────── escaped slash (/) character
This is a tedious job if your regex pattern contains many occurrences of the delimiter character.
The cleaner way, of course, would be to use a different delimiter altogether. Ideally a character that does not appear anywhere inside the regex pattern, say #, which gives #foo[^/]+bar#i.
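Both variants can be checked side by side. A small sketch, with an arbitrary sample subject string:

```php
<?php
$subject = 'FOO-123-bar'; // arbitrary sample input

// Option 1: keep "/" as the delimiter and escape the slash in the class.
$escaped = preg_match('/foo[^\/]+bar/i', $subject);

// Option 2: switch to "#" so the slash needs no escaping at all.
$switched = preg_match('#foo[^/]+bar#i', $subject);

// A subject with a slash between "foo" and "bar" should not match.
$rejected = preg_match('#foo[^/]+bar#i', 'foo/bar');

var_dump($escaped, $switched, $rejected); // int(1) int(1) int(0)
```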
More reading:
- PHP regex delimiters
- http://www.regular-expressions.info/php.html
- How can I convert ereg expressions to preg in PHP? (missing delimiters)
- Unknown modifier '/' in …? what is it? (on using preg_quote())
I noticed that the same occurs when one of the delimiters is inside a preg_quote(), thus something like preg_replace('/'.preg_quote('/').'/i','',$string); gives the same error of the topic. Shouldn't the slash get escaped by preg_quote()?
– TechNyquist
Apr 29 '16 at 9:47
I ran into this when updating some old ereg calls to preg_match. Had to introduce delimiters.
– JoshP
Feb 28 '17 at 16:04
Other examples
The reference answer already explains the reason for "Unknown modifier" warnings. This is just a comparison of other typical variants.
When forgetting to add regex /delimiters/, the first non-letter symbol will be assumed to be one. Therefore the warning is often about what follows a grouping (…) or […] meta symbol:
preg_match("[a-zA-Z]+:\s*.$"
            ↑      ↑⬆
Sometimes your regex already uses a custom delimiter (: here), but still contains the same character as an unescaped literal. It's then mistaken for a premature delimiter. Which is why the very next symbol receives the "Unknown modifier ❌" trophy:
preg_match(":\[[\d:/]+\]:"
            ↑     ⬆     ↑
When using the classic / delimiter, take care to not have it within the regex literally. This most frequently happens when trying to match unescaped filenames:
preg_match("/pathname/filename/i"
            ↑        ⬆        ↑
Or when matching angle/square bracket style tags:
preg_match("/<%tmpl:id>(.*)</%tmpl:id>/Ui"
            ↑               ⬆         ↑
Templating-style (Smarty or BBCode) regex patterns often require {…} or […] brackets. Both should usually be escaped. (An outermost {} pair being the exception though).
They also get misinterpreted as paired delimiters when no actual delimiter is used. If they're then also used as a literal character within, then that's, of course … an error.
preg_match("{bold[^}]+}"
            ↑      ⬆  ↑
Whenever the warning says "Delimiter must not be alphanumeric or backslash" then you also entirely forgot delimiters:
preg_match("ab?c*"
            ↑
"Unknown modifier 'g'" often indicates a regex that was copied verbatim from JavaScript or Perl.
preg_match("/abc+/g"
                  ⬆
PHP doesn't use the /g global flag. Instead the preg_replace function works on all occurrences, and preg_match_all is the "global" searching counterpart to the one-occurrence preg_match.
So, just remove the /g flag.
See also:
· Warning: preg_replace(): Unknown modifier 'g'
· preg_replace: bad regex == 'Unknown Modifier'?
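A short sketch of what replaces /g in practice. The subject string is just an example:

```php
<?php
$subject = 'abc abcc abccc'; // example input

// preg_match() stops after the first occurrence.
preg_match('/abc+/', $subject, $first);

// preg_match_all() collects every occurrence (the "global" search).
$count = preg_match_all('/abc+/', $subject, $all);

// preg_replace() replaces all occurrences without any extra flag.
$replaced = preg_replace('/abc+/', 'X', $subject);

var_dump($first[0]); // string(3) "abc"
var_dump($count);    // int(3)
var_dump($replaced); // string(5) "X X X"
```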
A more peculiar case pertains to the PCRE_EXTENDED /x flag. This is often (or should be) used for making regexps more lofty and readable.
This allows using inline # comments. PHP implements the regex delimiters atop PCRE, but it doesn't treat # in any special way. Which is how a literal delimiter in a # comment can become an error:
preg_match("/
   ab?c+  # Comment with / slash in between
/x"
(Also noteworthy that using # as #abc+#x delimiter can be doubly inadvisable.)
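One way around this, sketched below, is to pick a delimiter that appears neither in the pattern nor in its comments. The choice of ~ and the sample subject are assumptions made for the example.

```php
<?php
// With "~" as the delimiter, the "/" inside the comment is harmless.
$pattern = "~
    ab?c+   # Comment with / slash in between
~x";

$matched = preg_match($pattern, 'xxacc', $m);

var_dump($matched); // int(1)
var_dump($m[0]);    // string(3) "acc"
```

The same comment would break again if it ever contained a ~, so the delimiter still has to be kept in mind when editing comments.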
Interpolating variables into a regex requires them to be pre-escaped, or be valid regexps themselves. You can't tell beforehand if this is gonna work:
preg_match("/id=$var;/"
            ↑    ↺   ↑
It's best to apply $var = preg_quote($var, "/") in such cases.
See also:
· Unknown modifier '/' in ...? what is it?
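The second argument to preg_quote() is what matters here, since the delimiter is not escaped by default. A minimal sketch, with $var holding a hypothetical user-supplied value:

```php
<?php
$var = 'a/b.c'; // hypothetical user-supplied value

$plain  = preg_quote($var);      // a/b\.c   (slash left alone)
$quoted = preg_quote($var, '/'); // a\/b\.c  (delimiter escaped too)

// With the delimiter escaped, the interpolated value is matched literally.
$hit  = preg_match('/id=' . $quoted . ';/', 'id=a/b.c;');
$miss = preg_match('/id=' . $quoted . ';/', 'id=a/bxc;');

var_dump($plain, $quoted); // string(6) "a/b\.c"  string(7) "a\/b\.c"
var_dump($hit, $miss);     // int(1) int(0)
```

Building the pattern with $plain instead would bring back the "Unknown modifier" warning, because the unescaped slash would end the pattern early.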
Another alternative is using \Q…\E escapes for unquoted literal strings:
preg_match("/id=\Q{$var}\E;/mix");
Note that this is merely a convenient shortcut, not dependable/safe. It would fall apart in case $var contained a literal '\E' itself (however unlikely).
Deprecated modifier /e is an entirely different problem. This has nothing to do with delimiters, but the implicit expression interpretation mode being phased out. See also: Replace deprecated preg_replace /e with preg_replace_callback
Alternative regex delimiters
As mentioned already, the quickest solution to this error is just picking a distinct delimiter. Any non-alphanumeric, non-backslash, non-whitespace symbol can be used. Visually distinctive ones are often preferred:
~abc+~
!abc+!
@abc+@
#abc+#
=abc+=
%abc+%
Technically you could use $abc$ or |abc| for delimiters. However, it's best to avoid symbols that serve as regex meta characters themselves.
The hash # as delimiter is rather popular too. But care should be taken in combination with the x/PCRE_EXTENDED readability modifier. You can't use # inline or (?#…) comments then, because those would be confused as delimiters.
Quote-only delimiters
Occasionally you see " and ' used as regex delimiters paired with their counterpart as PHP string enclosure:
preg_match("'abc+'"
preg_match('"abc+"'
Which is perfectly valid as far as PHP is concerned. It's sometimes convenient and unobtrusive, but not always legible in IDEs and editors.
Paired delimiters
An interesting variation are paired delimiters. Instead of using the same symbol on both ends of a regex, you can use any <...> (...) [...] {...} bracket/braces combination.
preg_match("(abc+)"   # just delimiters here, not a capture group
While most of them also serve as regex meta characters, you can often use them without further effort. As long as those specific braces/parens within the regex are paired or escaped correctly, these variants are quite readable.
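A quick sketch of how paired delimiters behave, including nested pairs of the same kind. The subject strings are arbitrary examples.

```php
<?php
// Outer parentheses act only as delimiters: no capture group is created.
preg_match('(abc+)', 'xabccy', $plain);

// A second pair inside the delimiters is a real capture group.
preg_match('((abc+))', 'xabccy', $grouped);

// Braces as delimiters around a pattern that itself uses a {2} quantifier.
preg_match('{a{2}}', 'baab', $braces);

print_r($plain);   // one entry:   "abcc"
print_r($grouped); // two entries: "abcc", "abcc"
print_r($braces);  // one entry:   "aa"
```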
Fancy regex delimiters
A somewhat lazy trick (which is not endorsed hereby) is using non-printable ASCII characters as delimiters. This works easily in PHP by using double quotes for the regex string, and octal escapes for delimiters:
preg_match("\01 abc+ \01mix"
The \01 is just a control character ␁ that's not usually needed. Therefore it's highly unlikely to appear within most regex patterns. Which makes it suitable here, even though not very legible.
Sadly you can't use Unicode glyphs ❚ as delimiters. PHP only allows single-byte characters. And why is that? Well, glad you asked:
PHP's delimiters atop PCRE
The preg_* functions utilize the PCRE regex engine, which itself doesn't care or provide for delimiters. For resemblance with Perl the preg_* functions implement them. Which is also why you can use modifier letters /ism instead of just constants as parameter.
See ext/pcre/php_pcre.c on how the regex string is preprocessed:
First all leading whitespace is ignored.
Any non-alphanumeric symbol is taken as presumed delimiter. Note that PHP only honors single-byte characters:
delimiter = *p++;
if (isalnum((int)*(unsigned char *)&delimiter) || delimiter == '\\') {
    php_error_docref(NULL,E_WARNING, "Delimiter must not…");
    return NULL;
}
The rest of the regex string is traversed left-to-right. Only backslash (\) escaped symbols are ignored. Should the delimiter be found again, the remainder is verified to only contain modifier letters.
If the delimiter is one of the ([{< or )]}> pairable braces/brackets, then the processing logic is more elaborate.
int brackets = 1; /* brackets nesting level */
while (*pp != 0) {
    if (*pp == '\\' && pp[1] != 0) pp++;
    else if (*pp == end_delimiter && --brackets <= 0)
        break;
    else if (*pp == start_delimiter)
        brackets++;
    pp++;
}
It looks for correctly paired left and right delimiter, but ignores other braces/bracket types when counting.
The raw regex string is passed to the PCRE backend only after delimiter and modifier flags have been cut out.
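The steps above can be approximated in plain PHP. This is only a simplified sketch for illustration, not the actual C implementation: splitDelimitedRegex() is a made-up helper name, it throws exceptions where PHP would emit warnings, and it does not validate the modifier letters.

```php
<?php
// Hypothetical helper: splits a delimited regex string into
// [pattern, modifiers], roughly following the steps described above.
function splitDelimitedRegex(string $regex): array
{
    $regex = ltrim($regex); // leading whitespace is ignored
    if ($regex === '') {
        throw new InvalidArgumentException('Empty regular expression');
    }

    $start = $regex[0];
    if (ctype_alnum($start) || $start === '\\') {
        throw new InvalidArgumentException('Delimiter must not be alphanumeric or backslash');
    }

    // Bracket-style delimiters close with their counterpart.
    $pairs = ['(' => ')', '[' => ']', '{' => '}', '<' => '>'];
    $end   = $pairs[$start] ?? $start;
    $depth = 1; // nesting level, only relevant for paired delimiters
    $len   = strlen($regex);

    for ($i = 1; $i < $len; $i++) {
        $ch = $regex[$i];
        if ($ch === '\\' && $i + 1 < $len) {
            $i++; // skip the escaped character
        } elseif ($ch === $end && --$depth <= 0) {
            return [substr($regex, 1, $i - 1), substr($regex, $i + 1)];
        } elseif ($ch === $start && $start !== $end) {
            $depth++;
        }
    }
    throw new InvalidArgumentException("No ending delimiter '$end' found");
}

// The pattern from the question: "]" ends up first in the modifier part.
print_r(splitDelimitedRegex('<div[^>]*><ul[^>]*>'));
// pattern "div[^", modifiers "]*><ul[^>]*>"

// With ~ delimiters the whole expression is the pattern.
print_r(splitDelimitedRegex('~<div[^>]*><ul[^>]*>~'));
// pattern "<div[^>]*><ul[^>]*>", modifiers ""
```

Feeding it the other failing examples from this answer shows in each case which character lands at the front of the modifier part, which is the one the warning names.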
Now this is all somewhat irrelevant, but it explains where the delimiter warnings come from. And this whole procedure is all to have a minimum of Perl compatibility. There are a few minor deviations of course, like the […] character class context not receiving special treatment in PHP.
More references
- preg_match(); - Unknown modifier '+'
- Unknown modifier '/' error in PHP
- PHP RegExpr error Unkown modifier '('
- Unknown modifier '(' when using preg_match() with a REGEX expression
- PHP: Regex - Unknown modifier error
- Warning: preg_match() [function.preg-match]: Unknown modifier '('
- When does preg_match(): Unknown modifier error occur? (Just a well-written question demonstrating prior research)
Very nice explanation
– Svetoslav Marinov
Sep 13 '17 at 9:13
Accepted answer: answered Dec 20 '13 at 14:06 by Amal Murali, edited May 23 '17 at 12:34 by Community♦
I noticed that the same occurs when one of the delimiters is inside apreg_quote()
, thus something likepreg_replace('/'.preg_quote('/').'/i','',$string);
gives the same error of the topic. Shouldn't the slash get escaped bypreg_quote()
?
– TechNyquist
Apr 29 '16 at 9:47
I ran into this when updating some oldereg
calls topreg_match
. Had to introduce delimiters.
– JoshP
Feb 28 '17 at 16:04
Other examples
The reference answer already explains the reason for "Unknown modifier" warnings. This is just a comparison of other typical variants.
When forgetting to add regex / delimiters /, the first non-letter symbol will be assumed to be one. Therefore the warning is often about what follows a grouping (…), […] meta symbol:
preg_match("[a-zA-Z]+:\s*.$"
            ↑      ↑⬆
Sometimes your regex already uses a custom delimiter (: here), but still contains the same character as unescaped literal. It's then mistaken as premature delimiter. Which is why the very next symbol receives the "Unknown modifier ❌" trophy:
preg_match(":\[[\d:/]+\]:"
            ↑      ⬆    ↑
When using the classic / delimiter, take care to not have it within the regex literally. This most frequently happens when trying to match unescaped filenames:
preg_match("/pathname/filename/i"
            ↑         ⬆       ↑
Or when matching angle/square bracket style tags:
preg_match("/<%tmpl:id>(.*)</%tmpl:id>/Ui"
            ↑                ⬆        ↑
Templating-style (Smarty or BBCode) regex patterns often require {…} or […] brackets. Both should usually be escaped. (An outermost {} pair being the exception though.)
They also get misinterpreted as paired delimiters when no actual delimiter is used. If they're then also used as literal character within, then that's, of course … an error.
preg_match("{bold[^}]+}"
            ↑      ⬆ ↑
Whenever the warning says "Delimiter must not be alphanumeric or backslash" then you also entirely forgot delimiters:
preg_match("ab?c*"
            ↑
"Unknown modifier 'g'" often indicates a regex that was copied verbatim from JavaScript or Perl.
preg_match("/abc+/g"
                  ⬆
PHP doesn't use the /g global flag. Instead the preg_replace function works on all occurrences, and preg_match_all is the "global" searching counterpart to the one-occurrence preg_match.
So, just remove the /g flag.
See also:
· Warning: preg_replace(): Unknown modifier 'g'
· preg_replace: bad regex == 'Unknown Modifier'?
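To make the /g point concrete, here is a small sketch of how the three functions behave without any global flag. The sample text is invented:

```php
<?php
$text = 'abc abcc abccc';

// preg_match() stops at the first occurrence.
preg_match('/abc+/', $text, $first);

// preg_match_all() collects every occurrence, which is what /g does elsewhere.
$count = preg_match_all('/abc+/', $text, $all);

// preg_replace() replaces all occurrences unless a limit is passed.
$replaced = preg_replace('/abc+/', 'X', $text);

echo $first[0], "\n";              // abc
echo implode(',', $all[0]), "\n";  // abc,abcc,abccc
echo $replaced, "\n";              // X X X
```

If only the first replacement is wanted, the optional fourth argument of preg_replace() limits the number of replacements.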
A more peculiar case pertains to the PCRE_EXTENDED /x flag. This is often (or should be) used for making regexps more readable.
It allows inline # comments. PHP implements the regex delimiters atop PCRE. But it doesn't treat # in any special way. Which is how a literal delimiter in a # comment can become an error:
preg_match("/
   ab?c+  # Comment with / slash in between
/x"
(Also noteworthy that using # as #abc+#x delimiter can be doubly inadvisable.)
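One way around the comment problem is a delimiter that doesn't occur in the comment text. A sketch, with ~ picked arbitrarily and a made-up subject:

```php
<?php
// With "/" as delimiter the slash inside the comment would end the pattern
// early. Using "~" keeps the comment harmless.
$pattern = "~
    ab?c+   # Comment with / slash in between
~x";

var_dump(preg_match($pattern, 'xxacc'));  // int(1)
var_dump(preg_match($pattern, 'xyz'));    // int(0)
```

Escaping the slash inside the comment would work too, but that is easy to forget when the comment is edited later.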
Interpolating variables into a regex requires them to be pre-escaped, or be valid regexps themselves. You can't tell beforehand if this is gonna work:
preg_match("/id=$var;/"
            ↑   ↺    ↑
It's best to apply $var = preg_quote($var, "/") in such cases.
See also:
· Unknown modifier '/' in ...? what is it?
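A short sketch of the preg_quote() advice. The value of `$var` is invented. Note the second argument: without it, preg_quote() leaves "/" alone, because the slash is not a regex meta character by itself.

```php
<?php
// Hypothetical user-supplied value with the delimiter and a meta character in it.
$var = 'a/b.c';

// Pass the delimiter as second argument so that "/" gets escaped as well.
$quoted  = preg_quote($var, '/');
$pattern = "/id={$quoted};/";

echo $pattern, "\n";                          // /id=a\/b\.c;/
var_dump(preg_match($pattern, 'id=a/b.c;'));  // int(1)
var_dump(preg_match($pattern, 'id=a/bxc;'));  // int(0), the dot is literal now
```

Whatever delimiter the surrounding pattern uses is the one to hand to preg_quote().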
Another alternative is using \Q…\E escapes for unquoted literal strings:
preg_match("/id=\Q{$var}\E;/mix");
Note that this is merely a convenient shortcut, not dependable/safe. It would fall apart in case that $var contained a literal '\E' itself (however unlikely).
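A rough demonstration of both the shortcut and its weak spot. All values below are invented for the example:

```php
<?php
// Hypothetical value: the dot should be matched literally.
$var     = 'a.c';
$pattern = "/id=\Q{$var}\E;/";

var_dump(preg_match($pattern, 'id=a.c;'));  // int(1)
var_dump(preg_match($pattern, 'id=abc;'));  // int(0)

// A value that contains \E itself ends the quoting early, so the rest of
// it is interpreted as regex syntax again (here ".*" matches anything).
$tricky = 'a\E.*';
$leaky  = "/id=\Q{$tricky}\E;/";

var_dump(preg_match($leaky, 'id=a-anything-;'));  // int(1)
```

\Q…\E also doesn't hide the delimiter from PHP, so a "/" inside the value would still end the pattern early. preg_quote() with the delimiter argument covers both cases.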
Deprecated modifier /e is an entirely different problem. This has nothing to do with delimiters, but the implicit expression interpretation mode being phased out. See also: Replace deprecated preg_replace /e with preg_replace_callback
Alternative regex delimiters
As mentioned already, the quickest solution to this error is just picking a distinct delimiter. Any symbol that is neither alphanumeric nor a backslash can be used. Visually distinctive ones are often preferred:
~abc+~
!abc+!
@abc+@
#abc+#
=abc+=
%abc+%
Technically you could use $abc$ or |abc| for delimiters. However, it's best to avoid symbols that serve as regex meta characters themselves.
The hash # as delimiter is rather popular too. But care should be taken in combination with the x/PCRE_EXTENDED readability modifier. You can't use # inline or (?#…) comments then, because those would be confused as delimiters.
Quote-only delimiters
Occasionally you see " and ' used as regex delimiters paired with their counterpart as PHP string enclosure:
preg_match("'abc+'"
preg_match('"abc+"'
Which is perfectly valid as far as PHP is concerned. It's sometimes convenient and unobtrusive, but not always legible in IDEs and editors.
Paired delimiters
An interesting variation are paired delimiters. Instead of using the same symbol on both ends of a regex, you can use any <...> (...) [...] {...} bracket/braces combination.
preg_match("(abc+)"   # just delimiters here, not a capture group
While most of them also serve as regex meta characters, you can often use them without further effort. As long as those specific braces/parens within the regex are paired or escaped correctly, these variants are quite readable.
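A few paired-delimiter calls to try out. The patterns and subjects are just illustrations:

```php
<?php
// Parentheses as delimiters: they do not create a capture group.
preg_match('(abc+)', 'xabccx', $m);
print_r($m);  // only index 0, holding "abcc"

// Nested braces inside brace delimiters are fine as long as they balance.
var_dump(preg_match('{^a{2}b$}', 'aab'));  // int(1)

// Modifiers follow the closing delimiter as usual.
var_dump(preg_match('<^ABC$>i', 'abc'));   // int(1)
```

An unbalanced bracket of the same type inside the pattern would need a backslash, otherwise the nesting count goes wrong.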
Fancy regex delimiters
A somewhat lazy trick (which is not endorsed hereby) is using non-printable ASCII characters as delimiters. This works easily in PHP by using double quotes for the regex string, and octal escapes for delimiters:
preg_match("\001 abc+ \001mix"
The \001 is just a control character ␁ that's not usually needed. Therefore it's highly unlikely to appear within most regex patterns. Which makes it suitable here, even though not very legible.
Sadly you can't use Unicode glyphs ❚ as delimiters. PHP only allows single-byte characters. And why is that? Well, glad you asked:
PHP's delimiters atop PCRE
The preg_* functions utilize the PCRE regex engine, which itself doesn't care or provide for delimiters. For resemblance with Perl the preg_* functions implement them. Which is also why you can use modifier letters /ism instead of just constants as parameter.
See ext/pcre/php_pcre.c on how the regex string is preprocessed:
First all leading whitespace is ignored.
Any non-alphanumeric symbol is taken as presumed delimiter. Note that PHP only honors single-byte characters:
delimiter = *p++;
if (isalnum((int)*(unsigned char *)&delimiter) || delimiter == '\\') {
    php_error_docref(NULL,E_WARNING, "Delimiter must not…");
    return NULL;
}
The rest of the regex string is traversed left-to-right. Only backslash \-escaped symbols are ignored. Should the delimiter be found again, the remainder is verified to only contain modifier letters.
If the delimiter is one of the ([{< )]}> )]}> pairable braces/brackets, then the processing logic is more elaborate.
int brackets = 1;   /* brackets nesting level */
while (*pp != 0) {
    if (*pp == '\\' && pp[1] != 0) pp++;
    else if (*pp == end_delimiter && --brackets <= 0)
        break;
    else if (*pp == start_delimiter)
        brackets++;
    pp++;
}
It looks for correctly paired left and right delimiter, but ignores other braces/bracket types when counting.
The raw regex string is passed to the PCRE backend only after delimiter and modifier flags have been cut out.
Now this is all somewhat irrelevant. But it explains where the delimiter warnings come from. And this whole procedure is all to have a minimum of Perl compatibility. There are a few minor deviations of course, like the […] character class context not receiving special treatment in PHP.
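The preprocessing steps can be mimicked in userland to see where a given pattern gets cut. This is only a sketch of the described logic, not PHP's actual code. splitRegex() is a hypothetical helper name, and it skips the final check of the modifier letters:

```php
<?php
/**
 * Illustrative re-implementation of the delimiter handling described above.
 * Returns [pattern, modifiers]; throws where PHP would emit a warning.
 */
function splitRegex(string $regex): array
{
    $regex = ltrim($regex);  // leading whitespace is ignored
    if ($regex === '') {
        throw new InvalidArgumentException('Empty regular expression');
    }
    $start = $regex[0];
    if (ctype_alnum($start) || $start === '\\') {
        throw new InvalidArgumentException('Delimiter must not be alphanumeric or backslash');
    }

    // Opening brackets are closed by their counterpart, anything else by itself.
    $pairs = ['(' => ')', '[' => ']', '{' => '}', '<' => '>'];
    $end   = $pairs[$start] ?? $start;
    $depth = 1;  // nesting level, only relevant for paired delimiters
    $len   = strlen($regex);

    for ($i = 1; $i < $len; $i++) {
        $c = $regex[$i];
        if ($c === '\\' && $i + 1 < $len) {
            $i++;  // skip the escaped character
        } elseif ($c === $end && ($end === $start || --$depth <= 0)) {
            // Closing delimiter found: everything after it counts as modifiers.
            return [substr($regex, 1, $i - 1), substr($regex, $i + 1)];
        } elseif ($c === $start) {
            $depth++;  // nested opening bracket of the same type
        }
    }
    throw new InvalidArgumentException("No ending delimiter '$end' found");
}

print_r(splitRegex('/abc+/i'));              // "abc+" and "i"
print_r(splitRegex('<div[^>]*><ul[^>]*>'));  // "div[^" and "]*><ul[^>]*>"
```

The second call shows the situation from the question: the supposed modifier string starts with "]", which is exactly the character the warning names.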
More references
- preg_match(); - Unknown modifier '+'
- Unknown modifier '/' error in PHP
- PHP RegExpr error Unkown modifier '('
- Unknown modifier '(' when using preg_match() with a REGEX expression
- PHP: Regex - Unknown modifier error
- Warning: preg_match() [function.preg-match]: Unknown modifier '('
- When does preg_match(): Unknown modifier error occur? (Just a well-written question demonstrating prior research)
Very nice explanation
– Svetoslav Marinov
Sep 13 '17 at 9:13
edited May 23 '17 at 12:18 – community wiki, 6 revs, mario
6
Add delimiters around the pattern:
"/<div[^>]*><ul[^>]*>/"
– raina77ow
Dec 20 '13 at 14:07
1
@mario I don't really see why you put a bounty here. Are you really looking for new answers? If so, what's wrong with the current one?
– Rizier123
Jul 2 '15 at 9:36
1
@Rizier123 The bounty description says it all: "One or more of the answers is exemplary and worthy of an additional bounty."
– birgire
Jul 2 '15 at 10:16
3
Yes, this isn't meant to attract more answers. The existing one is already an excellent example. It's a great visual explanation, and likely applicable to many similar cases. Such mini bounties are mainly intended as a temporary public bookmark, to make it better known, and perhaps to establish this as another universal reference. (Though it could make sense to craft an artificial CW answer with extra examples + links afterwards…)
– mario
Jul 2 '15 at 19:33
1
@mario If this should get an artificial answer, shouldn't we change the example a bit? I mean, the OP is parsing HTML with regexes. I agree that the answer shows a lot of effort (and I like him and his posts), but I'm asking: is this necessary? A short "You need to enclose your regex with a delimiter" and a link to the (very good!) documentation would have been enough, wouldn't it? IMHO all that extra information goes in the wrong direction and may confuse (expected newbie) users more than it helps.
– hek2mgl
Jul 2 '15 at 21:49