Fix a mismatch with the recursive subpatterns

Petr Písař 2020-09-23 17:27:23 +02:00
parent 4348c5f039
commit bad6fe0227
2 changed files with 63 additions and 0 deletions

@@ -0,0 +1,56 @@
From f4cd5e29bc15621f2ab8fc5d7de0e68e62d43999 Mon Sep 17 00:00:00 2001
From: Hugo van der Sanden <hv@crypt.org>
Date: Tue, 15 Sep 2020 14:02:54 +0100
Subject: [PATCH] [gh18096] assume worst-case for GOSUBs we don't analyse
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

During study_chunk, under various conditions we avoid recursing into
a GOSUB. But we must avoid giving the enclosing scope the idea that
this GOSUB would match only an empty string, since that could trigger
wrong optimizations (eg CURLYX => CURLYM in the ticket).

So we mark the construct as infinite, as in the code branch where we
_do_ recurse into it.

Signed-off-by: Petr Písař <ppisar@redhat.com>
---
regcomp.c | 7 ++++++-
t/re/re_tests | 2 ++
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/regcomp.c b/regcomp.c
index 124ea5b90b..fae3f8079d 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -5212,7 +5212,12 @@ S_study_chunk(pTHX_ RExC_state_t *pRExC_state, regnode **scanp,
* might result in a minlen of 1 and not of 4,
* but this doesn't make us mismatch, just try a bit
* harder than we should.
- * */
+ *
+ * However we must assume this GOSUB is infinite, to
+ * avoid wrongly applying other optimizations in the
+ * enclosing scope - see GH 18096, for example.
+ */
+ is_inf = is_inf_internal = 1;
scan= regnext(scan);
continue;
}
diff --git a/t/re/re_tests b/t/re/re_tests
index 554a7004a2..ab5a0d8012 100644
--- a/t/re/re_tests
+++ b/t/re/re_tests
@@ -2023,6 +2023,8 @@ AB\s+\x{100} AB \x{100}X y - -
/(?iaax:A? \K +)/ African_Feh c - \\K + is forbidden - matches null string many times in regex
/(?iaa:A?\K+)/ African_Feh c - \\K+ is forbidden - matches null string many times in regex
/(?iaa:A?\K*)/ African_Feh c - \\K* is forbidden - matches null string many times in regex
+^((\w|<(\s)*(?1)(?3)*>)(?:(?3)*\+(?3)*(?2))*)(?3)*\+ a + b + <c + d> y $1 a + b # [GH #18096]
+^((\w|<(\s)*(?1)(?3)*>)(?:(?3)*\+(?3)*(?2))*)(?3)*\+ a + <b> + c y $1 a + <b> # [GH #18096]
# Keep these lines at the end of the file
# pat string y/n/etc expr expected-expr skip-reason comment
# vim: softtabstop=0 noexpandtab
--
2.25.4

@@ -241,6 +241,10 @@ Patch35: perl-5.33.1-sort-return-foo.patch
# character class with a white space, in upstream after 5.33.1
Patch36: perl-5.33.1-Heap-buffer-overflow-in-regex-bracket-group-whitespa.patch
# Fix a mismatch with the recursive subpatterns, GH#18096,
# in upstream after 5.33.2
Patch37: perl-5.33.2-gh18096-assume-worst-case-for-GOSUBs-we-don-t-analys.patch
# Link XS modules to libperl.so with EU::CBuilder on Linux, bug #960048
Patch200: perl-5.16.3-Link-XS-modules-to-libperl.so-with-EU-CBuilder-on-Li.patch
@@ -4270,6 +4274,7 @@ you're not running VMS, this module does nothing.
%patch34 -p1
%patch35 -p1
%patch36 -p1
%patch37 -p1
%patch200 -p1
%patch201 -p1
@@ -4313,6 +4318,7 @@ perl -x patchlevel.h \
'Fedora Patch34: Fix handling exceptions in a global destruction (GH#18063)' \
'Fedora Patch35: Fix sorting with a block that calls return (GH#18081)' \
'Fedora Patch36: Fix a buffer overflow when compiling a regular expression with a bracketed character class with a white space' \
'Fedora Patch37: Fix a mismatch with the recursive subpatterns (GH#18096)' \
'Fedora Patch200: Link XS modules to libperl.so with EU::CBuilder on Linux' \
'Fedora Patch201: Link XS modules to libperl.so with EU::MM on Linux' \
%{nil}
@@ -7027,6 +7033,7 @@ popd
- Fix ownership of /usr/share/perl5/{ExtUtils,File,Module,Text,Time} directories
- Fix a buffer overflow when compiling a regular expression with a bracketed
character class with a white space
- Fix a mismatch with the recursive subpatterns (GH#18096)
* Thu Aug 27 2020 Petr Pisar <ppisar@redhat.com> - 4:5.32.0-462
- Fix inheritance resolution of lexical objects in a debugger (GH#17661)