From 69d1b3fc29677af8ade8dc15dba83f0589cb63d6 Mon Sep 17 00:00:00 2001
From: Lasse Collin <lasse.collin@tukaani.org>
Date: Tue, 29 Mar 2022 19:19:12 +0300
Subject: [PATCH] xzgrep: Fix escaping of malicious filenames (ZDI-CAN-16587).

Malicious filenames can make xzgrep write to arbitrary files
or (with a GNU sed extension) lead to arbitrary code execution.

xzgrep from XZ Utils versions up to and including 5.2.5 is
affected. 5.3.1alpha and 5.3.2alpha are affected as well.
This patch works for all of them.

This bug was inherited from gzip's zgrep. gzip 1.12 includes
a fix for zgrep.

The issue with the old sed script is that with multiple newlines,
the N-command will read the second line of input, then the
s-commands will be skipped because it's not the end of the
file yet, then a new sed cycle starts and the pattern space
is printed and emptied. So only the last line or two get escaped.
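
The failure can be reproduced with a small sketch. It is an
illustration only, not part of the patch: old_escape is a hypothetical
helper wrapping the old sed script, and the three-line name is made up.

```shell
# Hypothetical helper: the pre-patch escaping step from xzgrep.
old_escape() {
  printf '%s\n' "$1" |
    sed '
      $!N
      $s/[&\|]/\\&/g
      $s/\n/\\n/g
    '
}

# Made-up filename consisting of three lines, each containing '&'.
name=$(printf 'a&\nb&\nc&')

# Cycle 1 joins lines 1 and 2, skips both s-commands (not at EOF yet)
# and prints them unescaped. Only line 3 is escaped, in cycle 2.
old_escape "$name"
# a&
# b&
# c\&
```

The unescaped '&' on the first two lines would later be interpreted
by sed when the name is pasted into the replacement text.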

One way to fix this would be to read all lines into the pattern
space first. However, the included fix is even simpler: All lines
except the last line get a backslash appended at the end. To ensure
that shell command substitution doesn't eat a possible trailing
newline, a colon is appended to the filename before escaping.
The colon is later used to separate the filename from the grep
output so it is fine to add it here instead of a few lines later.
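
A minimal sketch of the new escaping, under stated assumptions:
new_escape is a hypothetical helper that combines the colon append and
the sed command from this patch, and the filename is made up. It is
meant as an illustration, not as a replacement for the script itself.

```shell
# Hypothetical helper: append the colon, then escape as the patch does.
new_escape() {
  printf '%s\n' "$1:" | LC_ALL=C sed 's/[&\|]/\\&/g; $!s/$/\\/'
}

# Made-up filename: two lines with special characters and a trailing
# newline, which command substitution would drop without the colon.
name='a&b
c|d
'

prefix=$(new_escape "$name")
printf '%s\n' "$prefix"
# a\&b\
# c\|d\
# :

# Used as replacement text, the escaped prefix yields the raw name.
printf 'match\n' | LC_ALL=C sed "s|^|$prefix|"
# a&b
# c|d
# :match
```

Every line except the last ends in a backslash, so the newlines are
passed to the second sed in the \<newline> form.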

The old code also wasn't POSIX compliant as it used \n in the
replacement section of the s-command. Using \<newline> is the
POSIX compatible method.

LC_ALL=C was added to the two critical sed commands. POSIX sed
manual recommends it when using sed to manipulate pathnames
because in other locales invalid multibyte sequences might
cause issues with some sed implementations. In case of GNU sed,
these particular sed scripts wouldn't have such problems but some
other scripts could have, see:

    info '(sed)Locale Considerations'

This vulnerability was discovered by:
cleemy desu wayo working with Trend Micro Zero Day Initiative

Thanks to Jim Meyering and Paul Eggert for discussing the different
ways to fix this and for coordinating the patch release schedule
with gzip.
---
 src/scripts/xzgrep.in | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/scripts/xzgrep.in b/src/scripts/xzgrep.in
index b180936..e5186ba 100644
--- a/src/scripts/xzgrep.in
+++ b/src/scripts/xzgrep.in
@@ -180,22 +180,26 @@ for i; do
          { test $# -eq 1 || test $no_filename -eq 1; }; then
       eval "$grep"
     else
+      # Append a colon so that the last character will never be a newline
+      # which would otherwise get lost in shell command substitution.
+      i="$i:"
+
+      # Escape & \ | and newlines only if such characters are present
+      # (speed optimization).
       case $i in
       (*'
 '* | *'&'* | *'\'* | *'|'*)
-        i=$(printf '%s\n' "$i" |
-            sed '
-              $!N
-              $s/[&\|]/\\&/g
-              $s/\n/\\n/g
-            ');;
+        i=$(printf '%s\n' "$i" | LC_ALL=C sed 's/[&\|]/\\&/g; $!s/$/\\/');;
       esac
-      sed_script="s|^|$i:|"
+
+      # $i already ends with a colon so don't add it here.
+      sed_script="s|^|$i|"
 
       # Fail if grep or sed fails.
       r=$(
         exec 4>&1
-        (eval "$grep" 4>&-; echo $? >&4) 3>&- | sed "$sed_script" >&3 4>&-
+        (eval "$grep" 4>&-; echo $? >&4) 3>&- |
+            LC_ALL=C sed "$sed_script" >&3 4>&-
       ) || r=2
       exit $r
     fi >&3 5>&-
-- 
2.35.1