From 69d1b3fc29677af8ade8dc15dba83f0589cb63d6 Mon Sep 17 00:00:00 2001
From: Lasse Collin <lasse.collin@tukaani.org>
Date: Tue, 29 Mar 2022 19:19:12 +0300
Subject: [PATCH] xzgrep: Fix escaping of malicious filenames (ZDI-CAN-16587).

Malicious filenames can make xzgrep write to arbitrary files
or (with a GNU sed extension) lead to arbitrary code execution.

xzgrep from XZ Utils versions up to and including 5.2.5 is
affected. 5.3.1alpha and 5.3.2alpha are affected as well.
This patch works for all of them.

This bug was inherited from gzip's zgrep. gzip 1.12 includes
a fix for zgrep.

The issue with the old sed script is that with multiple newlines,
the N-command will read the second line of input, then the
s-commands will be skipped because it's not the end of the
file yet, then a new sed cycle starts and the pattern space
is printed and emptied. So only the last line or two get escaped.

One way to fix this would be to read all lines into the pattern
space first. However, the included fix is even simpler: All lines
except the last line get a backslash appended at the end. To ensure
that shell command substitution doesn't eat a possible trailing
newline, a colon is appended to the filename before escaping.
The colon is later used to separate the filename from the grep
output so it is fine to add it here instead of a few lines later.

The old code also wasn't POSIX compliant as it used \n in the
replacement section of the s-command. Using \<newline> is the
POSIX compatible method.

LC_ALL=C was added to the two critical sed commands. The POSIX sed
manual recommends it when using sed to manipulate pathnames
because in other locales invalid multibyte sequences might
cause issues with some sed implementations. In the case of GNU sed,
these particular sed scripts wouldn't have such problems but some
other scripts could have, see:

    info '(sed)Locale Considerations'

This vulnerability was discovered by:
cleemy desu wayo working with Trend Micro Zero Day Initiative

Thanks to Jim Meyering and Paul Eggert for discussing the different
ways to fix this and for coordinating the patch release schedule
with gzip.
---
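The difference between the old and the new escaping described in the
commit message can be sketched as follows. This is a minimal
illustration, assuming a POSIX sh and sed; the function names
escape_old and escape_new and the sample filename are hypothetical
and are not part of xzgrep.

```shell
#!/bin/sh
# Sketch only: escape_old and escape_new are hypothetical wrapper names
# around the sed scripts shown in the diff; xzgrep does not define them.

# Old escaping (the removed lines). With three or more lines of input
# the s-commands run only in the final cycle, so earlier lines pass
# through unescaped.
escape_old() {
  printf '%s\n' "$1" |
    sed '
      $!N
      $s/[&\|]/\\&/g
      $s/\n/\\n/g
    '
}

# New escaping (the added lines): append a colon, escape & \ | on every
# line, and end every line except the last with a backslash.
escape_new() {
  printf '%s\n' "$1:" | LC_ALL=C sed 's/[&\|]/\\&/g; $!s/$/\\/'
}

# A harmless sample name containing "&" and two newlines.
name='a&b
c
d'

echo "old:"; escape_old "$name"
echo "new:"; escape_new "$name"

# Use the escaped name the way xzgrep does: as the replacement text of
# an s-command that prefixes each line of grep output.
printf 'hit\n' | LC_ALL=C sed "s|^|$(escape_new "$name")|"
```

With this three-line name the old function should print the name
unchanged, while the last command should print the original name
followed by ":hit", because every "&" and every embedded newline
reaches sed in escaped form.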
 src/scripts/xzgrep.in | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/scripts/xzgrep.in b/src/scripts/xzgrep.in
index b180936..e5186ba 100644
--- a/src/scripts/xzgrep.in
+++ b/src/scripts/xzgrep.in
@@ -180,22 +180,26 @@ for i; do
          { test $# -eq 1 || test $no_filename -eq 1; }; then
       eval "$grep"
     else
+      # Append a colon so that the last character will never be a newline
+      # which would otherwise get lost in shell command substitution.
+      i="$i:"
+
+      # Escape & \ | and newlines only if such characters are present
+      # (speed optimization).
       case $i in
       (*'
 '* | *'&'* | *'\'* | *'|'*)
-        i=$(printf '%s\n' "$i" |
-            sed '
-              $!N
-              $s/[&\|]/\\&/g
-              $s/\n/\\n/g
-            ');;
+        i=$(printf '%s\n' "$i" | LC_ALL=C sed 's/[&\|]/\\&/g; $!s/$/\\/');;
       esac
-      sed_script="s|^|$i:|"
+
+      # $i already ends with a colon so don't add it here.
+      sed_script="s|^|$i|"
 
       # Fail if grep or sed fails.
       r=$(
         exec 4>&1
-        (eval "$grep" 4>&-; echo $? >&4) 3>&- | sed "$sed_script" >&3 4>&-
+        (eval "$grep" 4>&-; echo $? >&4) 3>&- |
+            LC_ALL=C sed "$sed_script" >&3 4>&-
       ) || r=2
       exit $r
     fi >&3 5>&-
-- 
2.35.1