Blame SOURCES/1452368-mpo-7.4.2-clone.2.patch

0febb9
From fb510f4e3dc6c13696bce6d3a79b8cea9b03b044 Mon Sep 17 00:00:00 2001
0febb9
From: =?UTF-8?q?Nikola=20Forr=C3=B3?= <nforro@redhat.com>
0febb9
Date: Mon, 22 May 2017 14:51:53 +0200
0febb9
Subject: [PATCH 1/2] clone.2: document features related to namespaces
0febb9
0febb9
---
0febb9
 man-pages/man2/____clone.2 | 524 ++++++++++++++++++++++++++++-----------------
0febb9
 man-pages/man2/clone.2     | 524 ++++++++++++++++++++++++++++-----------------
0febb9
 2 files changed, 658 insertions(+), 390 deletions(-)
0febb9
0febb9
diff --git a/man-pages/man2/____clone.2 b/man-pages/man2/____clone.2
0febb9
index 56d03cf..edf0994 100644
0febb9
--- a/man-pages/man2/____clone.2
0febb9
+++ b/man-pages/man2/____clone.2
0febb9
@@ -39,50 +39,23 @@
0febb9
 .\" 2008-11-19, mtk, document CLONE_NEWIPC
0febb9
 .\" 2008-11-19, Jens Axboe, mtk, document CLONE_IO
0febb9
 .\"
0febb9
-.\" FIXME Document CLONE_NEWUSER, which is new in 2.6.23
0febb9
-.\"       (also supported for unshare()?)
0febb9
-.\"
0febb9
-.TH CLONE 2 2013-04-16 "Linux" "Linux Programmer's Manual"
0febb9
+.TH CLONE 2 2016-12-12 "Linux" "Linux Programmer's Manual"
0febb9
 .SH NAME
0febb9
 clone, __clone2 \- create a child process
0febb9
 .SH SYNOPSIS
0febb9
 .nf
0febb9
 /* Prototype for the glibc wrapper function */
0febb9
 
0febb9
+.B #define _GNU_SOURCE
0febb9
 .B #include <sched.h>
0febb9
 
0febb9
 .BI "int clone(int (*" "fn" ")(void *), void *" child_stack ,
0febb9
 .BI "          int " flags ", void *" "arg" ", ... "
0febb9
-.BI "          /* pid_t *" ptid ", struct user_desc *" tls \
0febb9
+.BI "          /* pid_t *" ptid ", void *" newtls \
0febb9
 ", pid_t *" ctid " */ );"
0febb9
 
0febb9
-/* Prototype for the raw system call */
0febb9
-
0febb9
-.BI "long clone(unsigned long " flags ", void *" child_stack ,
0febb9
-.BI "          void *" ptid ", void *" ctid ,
0febb9
-.BI "          struct pt_regs *" regs );
0febb9
+/* For the prototype of the raw system call, see NOTES */
0febb9
 .fi
0febb9
-.sp
0febb9
-.in -4n
0febb9
-Feature Test Macro Requirements for glibc wrapper function (see
0febb9
-.BR feature_test_macros (7)):
0febb9
-.in
0febb9
-.sp
0febb9
-.BR clone ():
0febb9
-.ad l
0febb9
-.RS 4
0febb9
-.PD 0
0febb9
-.TP 4
0febb9
-Since glibc 2.14:
0febb9
-_GNU_SOURCE
0febb9
-.TP 4
0febb9
-.\" FIXME See http://sources.redhat.com/bugzilla/show_bug.cgi?id=4749
0febb9
-Before glibc 2.14:
0febb9
-_BSD_SOURCE || _SVID_SOURCE
0febb9
-    /* _GNU_SOURCE also suffices */
0febb9
-.PD
0febb9
-.RE
0febb9
-.ad b
0febb9
 .SH DESCRIPTION
0febb9
 .BR clone ()
0febb9
 creates a new process, in a manner similar to
0febb9
@@ -107,7 +80,7 @@ But see the description of
0febb9
 .B CLONE_PARENT
0febb9
 below.)
0febb9
 
0febb9
-The main use of
0febb9
+One use of
0febb9
 .BR clone ()
0febb9
 is to implement threads: multiple threads of control in a program that
0febb9
 run concurrently in a shared memory space.
0febb9
@@ -180,7 +153,7 @@ in order to specify what is shared between the calling process
0febb9
 and the child process:
0febb9
 .TP
0febb9
 .BR CLONE_CHILD_CLEARTID " (since Linux 2.5.49)"
0febb9
-Erase child thread ID at location
0febb9
+Clear (zero) the child thread ID at the location
0febb9
 .I ctid
0febb9
 in child memory when the child exits, and do a wakeup on the futex
0febb9
 at that address.
0febb9
@@ -190,9 +163,12 @@ system call.
0febb9
 This is used by threading libraries.
0febb9
 .TP
0febb9
 .BR CLONE_CHILD_SETTID " (since Linux 2.5.49)"
0febb9
-Store child thread ID at location
0febb9
+Store the child thread ID at the location
0febb9
 .I ctid
0febb9
-in child memory.
0febb9
+in the child's memory.
0febb9
+The store operation completes before
0febb9
+.BR clone ()
0febb9
+returns control to user space.
0febb9
 .TP
0febb9
 .BR CLONE_FILES " (since Linux 2.0)"
0febb9
 If
0febb9
@@ -206,27 +182,31 @@ or changes its associated flags (using the
0febb9
 .BR fcntl (2)
0febb9
 .B F_SETFD
0febb9
 operation), the other process is also affected.
0febb9
+If a process sharing a file descriptor table calls
0febb9
+.BR execve (2),
0febb9
+its file descriptor table is duplicated (unshared).
0febb9
 
0febb9
 If
0febb9
 .B CLONE_FILES
0febb9
 is not set, the child process inherits a copy of all file descriptors
0febb9
 opened in the calling process at the time of
0febb9
 .BR clone ().
0febb9
-(The duplicated file descriptors in the child refer to the
0febb9
-same open file descriptions (see
0febb9
-.BR open (2))
0febb9
-as the corresponding file descriptors in the calling process.)
0febb9
 Subsequent operations that open or close file descriptors,
0febb9
 or change file descriptor flags,
0febb9
 performed by either the calling
0febb9
 process or the child process do not affect the other process.
0febb9
+Note, however,
0febb9
+that the duplicated file descriptors in the child refer to the same open file
0febb9
+descriptions as the corresponding file descriptors in the calling process,
0febb9
+and thus share file offsets and file status flags (see
0febb9
+.BR open (2)).
0febb9
 .TP
0febb9
 .BR CLONE_FS " (since Linux 2.0)"
0febb9
 If
0febb9
 .B CLONE_FS
0febb9
-is set, the caller and the child process share the same file system
0febb9
+is set, the caller and the child process share the same filesystem
0febb9
 information.
0febb9
-This includes the root of the file system, the current
0febb9
+This includes the root of the filesystem, the current
0febb9
 working directory, and the umask.
0febb9
 Any call to
0febb9
 .BR chroot (2),
0febb9
@@ -238,7 +218,7 @@ other process.
0febb9
 
0febb9
 If
0febb9
 .B CLONE_FS
0febb9
-is not set, the child process works on a copy of the file system
0febb9
+is not set, the child process works on a copy of the filesystem
0febb9
 information of the calling process at the time of the
0febb9
 .BR clone ()
0febb9
 call.
0febb9
@@ -258,7 +238,7 @@ If this flag is not set, then (as with
0febb9
 the new process has its own I/O context.
0febb9
 
0febb9
 .\" The following based on text from Jens Axboe
0febb9
-The I/O context is the I/O scope of the disk scheduler (i.e,
0febb9
+The I/O context is the I/O scope of the disk scheduler (i.e.,
0febb9
 what the I/O scheduler uses to model scheduling of a process's I/O).
0febb9
 If processes share the same I/O context,
0febb9
 they are treated as one by the I/O scheduler.
0febb9
@@ -288,7 +268,7 @@ the process is created in the same IPC namespace as
0febb9
 the calling process.
0febb9
 This flag is intended for the implementation of containers.
0febb9
 
0febb9
-An IPC namespace provides an isolated view of System V IPC objects (see
0febb9
+An IPC namespace provides an isolated view of System\ V IPC objects (see
0febb9
 .BR svipc (7))
0febb9
 and (since Linux 2.6.30)
0febb9
 .\" commit 7eafd7c74c3f2e67c27621b987b28397110d643f
0febb9
@@ -308,17 +288,17 @@ When an IPC namespace is destroyed
0febb9
 (i.e., when the last process that is a member of the namespace terminates),
0febb9
 all IPC objects in the namespace are automatically destroyed.
0febb9
 
0febb9
-Use of this flag requires: a kernel configured with the
0febb9
-.B CONFIG_SYSVIPC
0febb9
-and
0febb9
-.B CONFIG_IPC_NS
0febb9
-options and that the process be privileged
0febb9
-.RB ( CAP_SYS_ADMIN ).
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWIPC .
0febb9
 This flag can't be specified in conjunction with
0febb9
 .BR CLONE_SYSVSEM .
0febb9
+
0febb9
+For further information on IPC namespaces, see
0febb9
+.BR namespaces (7).
0febb9
 .TP
0febb9
 .BR CLONE_NEWNET " (since Linux 2.6.24)"
0febb9
-.\" FIXME Check when the implementation was completed
0febb9
 (The implementation of this flag was completed only
0febb9
 by about kernel version 2.6.29.)
0febb9
 
0febb9
@@ -326,7 +306,7 @@ If
0febb9
 .B CLONE_NEWNET
0febb9
 is set, then create the process in a new network namespace.
0febb9
 If this flag is not set, then (as with
0febb9
-.BR fork (2)),
0febb9
+.BR fork (2))
0febb9
 the process is created in the same network namespace as
0febb9
 the calling process.
0febb9
 This flag is intended for the implementation of containers.
0febb9
@@ -341,7 +321,7 @@ directory trees, sockets, etc.).
0febb9
 A physical network device can live in exactly one
0febb9
 network namespace.
0febb9
 A virtual network device ("veth") pair provides a pipe-like abstraction
0febb9
-.\" FIXME Add pointer to veth(4) page when it is eventually completed
0febb9
+.\" FIXME . Add pointer to veth(4) page when it is eventually completed
0febb9
 that can be used to create tunnels between network namespaces,
0febb9
 and can be used to create a bridge to a physical network device
0febb9
 in another namespace.
0febb9
@@ -350,54 +330,41 @@ When a network namespace is freed
0febb9
 (i.e., when the last process in the namespace terminates),
0febb9
 its physical network devices are moved back to the
0febb9
 initial network namespace (not to the parent of the process).
0febb9
+For further information on network namespaces, see
0febb9
+.BR namespaces (7).
0febb9
 
0febb9
-Use of this flag requires: a kernel configured with the
0febb9
-.B CONFIG_NET_NS
0febb9
-option and that the process be privileged
0febb9
-.RB ( CAP_SYS_ADMIN ).
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWNET .
0febb9
 .TP
0febb9
 .BR CLONE_NEWNS " (since Linux 2.4.19)"
0febb9
-Start the child in a new mount namespace.
0febb9
-
0febb9
-Every process lives in a mount namespace.
0febb9
-The
0febb9
-.I namespace
0febb9
-of a process is the data (the set of mounts) describing the file hierarchy
0febb9
-as seen by that process.
0febb9
-After a
0febb9
-.BR fork (2)
0febb9
-or
0febb9
-.BR clone ()
0febb9
-where the
0febb9
-.B CLONE_NEWNS
0febb9
-flag is not set, the child lives in the same mount
0febb9
-namespace as the parent.
0febb9
-The system calls
0febb9
-.BR mount (2)
0febb9
-and
0febb9
-.BR umount (2)
0febb9
-change the mount namespace of the calling process, and hence affect
0febb9
-all processes that live in the same namespace, but do not affect
0febb9
-processes in a different mount namespace.
0febb9
-
0febb9
-After a
0febb9
-.BR clone ()
0febb9
-where the
0febb9
+If
0febb9
 .B CLONE_NEWNS
0febb9
-flag is set, the cloned child is started in a new mount namespace,
0febb9
+is set, the cloned child is started in a new mount namespace,
0febb9
 initialized with a copy of the namespace of the parent.
0febb9
-
0febb9
-Only a privileged process (one having the \fBCAP_SYS_ADMIN\fP capability)
0febb9
-may specify the
0febb9
+If
0febb9
 .B CLONE_NEWNS
0febb9
-flag.
0febb9
+is not set, the child lives in the same mount
0febb9
+namespace as the parent.
0febb9
+
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWNS .
0febb9
 It is not permitted to specify both
0febb9
 .B CLONE_NEWNS
0febb9
 and
0febb9
 .B CLONE_FS
0febb9
+.\" See https://lwn.net/Articles/543273/
0febb9
 in the same
0febb9
 .BR clone ()
0febb9
 call.
0febb9
+
0febb9
+For further information on mount namespaces, see
0febb9
+.BR namespaces (7)
0febb9
+and
0febb9
+.BR mount_namespaces (7).
0febb9
 .TP
0febb9
 .BR CLONE_NEWPID " (since Linux 2.6.24)"
0febb9
 .\" This explanation draws a lot of details from
0febb9
@@ -411,73 +378,74 @@ If
0febb9
 .B CLONE_NEWPID
0febb9
 is set, then create the process in a new PID namespace.
0febb9
 If this flag is not set, then (as with
0febb9
-.BR fork (2)),
0febb9
+.BR fork (2))
0febb9
 the process is created in the same PID namespace as
0febb9
 the calling process.
0febb9
 This flag is intended for the implementation of containers.
0febb9
 
0febb9
-A PID namespace provides an isolated environment for PIDs:
0febb9
-PIDs in a new namespace start at 1,
0febb9
-somewhat like a standalone system, and calls to
0febb9
-.BR fork (2),
0febb9
-.BR vfork (2),
0febb9
+For further information on PID namespaces, see
0febb9
+.BR namespaces (7)
0febb9
+and
0febb9
+.BR pid_namespaces (7).
0febb9
+
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWPID .
0febb9
+This flag can't be specified in conjunction with
0febb9
+.BR CLONE_THREAD
0febb9
 or
0febb9
+.BR CLONE_PARENT .
0febb9
+.TP
0febb9
+.BR CLONE_NEWUSER
0febb9
+(This flag first became meaningful for
0febb9
 .BR clone ()
0febb9
-will produce processes with PIDs that are unique within the namespace.
0febb9
+in Linux 2.6.23,
0febb9
+the current
0febb9
+.BR clone ()
0febb9
+semantics were merged in Linux 3.5,
0febb9
+and the final pieces to make the user namespaces completely usable were
0febb9
+merged in Linux 3.8.)
0febb9
 
0febb9
-The first process created in a new namespace
0febb9
-(i.e., the process created using the
0febb9
-.BR CLONE_NEWPID
0febb9
-flag) has the PID 1, and is the "init" process for the namespace.
0febb9
-Children that are orphaned within the namespace will be reparented
0febb9
-to this process rather than
0febb9
-.BR init (8).
0febb9
-Unlike the traditional
0febb9
-.B init
0febb9
-process, the "init" process of a PID namespace can terminate,
0febb9
-and if it does, all of the processes in the namespace are terminated.
0febb9
-
0febb9
-PID namespaces form a hierarchy.
0febb9
-When a new PID namespace is created,
0febb9
-the processes in that namespace are visible
0febb9
-in the PID namespace of the process that created the new namespace;
0febb9
-analogously, if the parent PID namespace is itself
0febb9
-the child of another PID namespace,
0febb9
-then processes in the child and parent PID namespaces will both be
0febb9
-visible in the grandparent PID namespace.
0febb9
-Conversely, the processes in the "child" PID namespace do not see
0febb9
-the processes in the parent namespace.
0febb9
-The existence of a namespace hierarchy means that each process
0febb9
-may now have multiple PIDs:
0febb9
-one for each namespace in which it is visible;
0febb9
-each of these PIDs is unique within the corresponding namespace.
0febb9
-(A call to
0febb9
-.BR getpid (2)
0febb9
-always returns the PID associated with the namespace in which
0febb9
-the process lives.)
0febb9
-
0febb9
-After creating the new namespace,
0febb9
-it is useful for the child to change its root directory
0febb9
-and mount a new procfs instance at
0febb9
-.I /proc
0febb9
-so that tools such as
0febb9
-.BR ps (1)
0febb9
-work correctly.
0febb9
-.\" mount -t proc proc /proc
0febb9
-(If
0febb9
-.BR CLONE_NEWNS
0febb9
-is also included in
0febb9
-.IR flags ,
0febb9
-then it isn't necessary to change the root directory:
0febb9
-a new procfs instance can be mounted directly over
0febb9
-.IR /proc .)
0febb9
+If
0febb9
+.B CLONE_NEWUSER
0febb9
+is set, then create the process in a new user namespace.
0febb9
+If this flag is not set, then (as with
0febb9
+.BR fork (2))
0febb9
+the process is created in the same user namespace as the calling process.
0febb9
+
0febb9
+For further information on user namespaces, see
0febb9
+.BR namespaces (7)
0febb9
+and
0febb9
+.BR user_namespaces (7).
0febb9
+
0febb9
+Before Linux 3.8, use of
0febb9
+.BR CLONE_NEWUSER
0febb9
+required that the caller have three capabilities:
0febb9
+.BR CAP_SYS_ADMIN ,
0febb9
+.BR CAP_SETUID ,
0febb9
+and
0febb9
+.BR CAP_SETGID .
0febb9
+.\" Before Linux 2.6.29, it appears that only CAP_SYS_ADMIN was needed
0febb9
+Starting with Linux 3.8,
0febb9
+no privileges are needed to create a user namespace.
0febb9
 
0febb9
-Use of this flag requires: a kernel configured with the
0febb9
-.B CONFIG_PID_NS
0febb9
-option and that the process be privileged
0febb9
-.RB ( CAP_SYS_ADMIN ).
0febb9
 This flag can't be specified in conjunction with
0febb9
-.BR CLONE_THREAD .
0febb9
+.BR CLONE_THREAD
0febb9
+or
0febb9
+.BR CLONE_PARENT .
0febb9
+For security reasons,
0febb9
+.\" commit e66eded8309ebf679d3d3c1f5820d1f2ca332c71
0febb9
+.\" https://lwn.net/Articles/543273/
0febb9
+.\" The fix actually went into 3.9 and into 3.8.3. However, user namespaces
0febb9
+.\" were, for practical purposes, unusable in earlier 3.8.x because of the
0febb9
+.\" various filesystems that didn't support userns.
0febb9
+.BR CLONE_NEWUSER
0febb9
+cannot be specified in conjunction with
0febb9
+.BR CLONE_FS .
0febb9
+
0febb9
+For further information on user namespaces, see
0febb9
+.BR user_namespaces (7).
0febb9
 .TP
0febb9
 .BR CLONE_NEWUTS " (since Linux 2.6.19)"
0febb9
 If
0febb9
@@ -486,27 +454,29 @@ is set, then create the process in a new UTS namespace,
0febb9
 whose identifiers are initialized by duplicating the identifiers
0febb9
 from the UTS namespace of the calling process.
0febb9
 If this flag is not set, then (as with
0febb9
-.BR fork (2)),
0febb9
+.BR fork (2))
0febb9
 the process is created in the same UTS namespace as
0febb9
 the calling process.
0febb9
 This flag is intended for the implementation of containers.
0febb9
 
0febb9
 A UTS namespace is the set of identifiers returned by
0febb9
 .BR uname (2);
0febb9
-among these, the domain name and the host name can be modified by
0febb9
+among these, the domain name and the hostname can be modified by
0febb9
 .BR setdomainname (2)
0febb9
 and
0febb9
-.BR
0febb9
 .BR sethostname (2),
0febb9
 respectively.
0febb9
 Changes made to the identifiers in a UTS namespace
0febb9
 are visible to all other processes in the same namespace,
0febb9
 but are not visible to processes in other UTS namespaces.
0febb9
 
0febb9
-Use of this flag requires: a kernel configured with the
0febb9
-.B CONFIG_UTS_NS
0febb9
-option and that the process be privileged
0febb9
-.RB ( CAP_SYS_ADMIN ).
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWUTS .
0febb9
+
0febb9
+For further information on UTS namespaces, see
0febb9
+.BR namespaces (7).
0febb9
 .TP
0febb9
 .BR CLONE_PARENT " (since Linux 2.3.12)"
0febb9
 If
0febb9
@@ -530,12 +500,15 @@ is set, then the parent of the calling process, rather than the
0febb9
 calling process itself, will be signaled.
0febb9
 .TP
0febb9
 .BR CLONE_PARENT_SETTID " (since Linux 2.5.49)"
0febb9
-Store child thread ID at location
0febb9
+Store the child thread ID at the location
0febb9
 .I ptid
0febb9
-in parent and child memory.
0febb9
+in the parent's memory.
0febb9
 (In Linux 2.5.32-2.5.48 there was a flag
0febb9
 .B CLONE_SETTID
0febb9
 that did this.)
0febb9
+The store operation completes before
0febb9
+.BR clone ()
0febb9
+returns control to user space.
0febb9
 .TP
0febb9
 .BR CLONE_PID " (obsolete)"
0febb9
 If
0febb9
@@ -547,6 +520,7 @@ of not much use.
0febb9
 Since 2.3.21 this flag can be
0febb9
 specified only by the system boot process (PID 0).
0febb9
 It disappeared in Linux 2.5.16.
0febb9
+Since then, the kernel silently ignores it without error.
0febb9
 .TP
0febb9
 .BR CLONE_PTRACE " (since Linux 2.2)"
0febb9
 If
0febb9
@@ -556,11 +530,25 @@ then trace the child also (see
0febb9
 .BR ptrace (2)).
0febb9
 .TP
0febb9
 .BR CLONE_SETTLS " (since Linux 2.5.32)"
0febb9
-The
0febb9
+The TLS (Thread Local Storage) descriptor is set to
0febb9
+.IR newtls .
0febb9
+
0febb9
+The interpretation of
0febb9
 .I newtls
0febb9
-argument is the new TLS (Thread Local Storage) descriptor.
0febb9
+and the resulting effect are architecture dependent.
0febb9
+On x86,
0febb9
+.I newtls
0febb9
+is interpreted as a
0febb9
+.IR "struct user_desc *"
0febb9
 (See
0febb9
-.BR set_thread_area (2).)
0febb9
+.BR set_thread_area (2)).
0febb9
+On x86_64 it is the new value to be set for the %fs base register
0febb9
+(see the
0febb9
+.I ARCH_SET_FS
0febb9
+argument to
0febb9
+.BR arch_prctl (2)).
0febb9
+On architectures with a dedicated TLS register, it is the new value
0febb9
+of that register.
0febb9
 .TP
0febb9
 .BR CLONE_SIGHAND " (since Linux 2.0)"
0febb9
 If
0febb9
@@ -612,16 +600,26 @@ from Linux 2.6.25 onward,
0febb9
 and was
0febb9
 .I removed
0febb9
 altogether in Linux 2.6.38.
0febb9
+Since then, the kernel silently ignores it without error.
0febb9
 .\" glibc 2.8 removed this defn from bits/sched.h
0febb9
 .TP
0febb9
 .BR CLONE_SYSVSEM " (since Linux 2.5.10)"
0febb9
 If
0febb9
 .B CLONE_SYSVSEM
0febb9
 is set, then the child and the calling process share
0febb9
-a single list of System V semaphore undo values (see
0febb9
+a single list of System V semaphore adjustment
0febb9
+.RI ( semadj )
0febb9
+values (see
0febb9
 .BR semop (2)).
0febb9
-If this flag is not set, then the child has a separate undo list,
0febb9
-which is initially empty.
0febb9
+In this case, the shared list accumulates
0febb9
+.I semadj
0febb9
+values across all processes sharing the list,
0febb9
+and semaphore adjustments are performed only when the last process
0febb9
+that is sharing the list terminates (or ceases sharing the list using
0febb9
+.BR unshare (2)).
0febb9
+If this flag is not set, then the child has a separate
0febb9
+.I semadj
0febb9
+list that is initially empty.
0febb9
 .TP
0febb9
 .BR CLONE_THREAD " (since Linux 2.4.0-test8)"
0febb9
 If
0febb9
@@ -703,7 +701,12 @@ must also include
0febb9
 .B CLONE_SIGHAND
0febb9
 if
0febb9
 .B CLONE_THREAD
0febb9
-is specified.
0febb9
+is specified
0febb9
+(and note that, since Linux 2.6.0-test6,
0febb9
+.BR CLONE_SIGHAND
0febb9
+also requires
0febb9
+.BR CLONE_VM
0febb9
+to be included).
0febb9
 
0febb9
 Signals may be sent to a thread group as a whole (i.e., a TGID) using
0febb9
 .BR kill (2),
0febb9
@@ -761,7 +764,7 @@ or
0febb9
 
0febb9
 If
0febb9
 .B CLONE_VFORK
0febb9
-is not set then both the calling process and the child are schedulable
0febb9
+is not set, then both the calling process and the child are schedulable
0febb9
 after the call, and an application should not rely on execution occurring
0febb9
 in any particular order.
0febb9
 .TP
0febb9
@@ -786,7 +789,7 @@ space of the calling process at the time of
0febb9
 Memory writes or file mappings/unmappings performed by one of the
0febb9
 processes do not affect the other, as with
0febb9
 .BR fork (2).
0febb9
-.SS The raw system call interface
0febb9
+.SS C library/kernel differences
0febb9
 The raw
0febb9
 .BR clone ()
0febb9
 system call corresponds more closely to
0febb9
@@ -801,16 +804,58 @@ arguments of the
0febb9
 .BR clone ()
0febb9
 wrapper function are omitted.
0febb9
 Furthermore, the argument order changes.
0febb9
-The raw system call interface on x86 and many other architectures is roughly:
0febb9
+In addition, there are variations across architectures.
0febb9
+
0febb9
+The raw system call interface on x86-64 and some other architectures
0febb9
+(including sh, tile, and alpha) is roughly:
0febb9
+
0febb9
 .in +4
0febb9
 .nf
0febb9
+.BI "long clone(unsigned long " flags ", void *" child_stack ,
0febb9
+.BI "           int *" ptid ", int *" ctid ,
0febb9
+.BI "           unsigned long " newtls );
0febb9
+.fi
0febb9
+.in
0febb9
+
0febb9
+On x86-32, and several other common architectures
0febb9
+(including score, ARM, ARM 64, PA-RISC, arc, Power PC, xtensa,
0febb9
+and MIPS),
0febb9
+.\" CONFIG_CLONE_BACKWARDS
0febb9
+the order of the last two arguments is reversed:
0febb9
 
0febb9
+.in +4
0febb9
+.nf
0febb9
 .BI "long clone(unsigned long " flags ", void *" child_stack ,
0febb9
-.BI "           void *" ptid ", void *" ctid ,
0febb9
-.BI "           struct pt_regs *" regs );
0febb9
+.BI "           int *" ptid ", unsigned long " newtls ,
0febb9
+.BI "           int *" ctid );
0febb9
+.fi
0febb9
+.in
0febb9
+
0febb9
+On the cris and s390 architectures,
0febb9
+.\" CONFIG_CLONE_BACKWARDS2
0febb9
+the order of the first two arguments is reversed:
0febb9
 
0febb9
+.in +4
0febb9
+.nf
0febb9
+.BI "long clone(void *" child_stack ", unsigned long " flags ,
0febb9
+.BI "           int *" ptid ", int *" ctid ,
0febb9
+.BI "           unsigned long " newtls );
0febb9
+.fi
0febb9
+.in
0febb9
+
0febb9
+On the microblaze architecture,
0febb9
+.\" CONFIG_CLONE_BACKWARDS3
0febb9
+an additional argument is supplied:
0febb9
+
0febb9
+.in +4
0febb9
+.nf
0febb9
+.BI "long clone(unsigned long " flags ", void *" child_stack ,
0febb9
+.BI "           int " stack_size , "\fR         /* Size of stack */"
0febb9
+.BI "           int *" ptid ", int *" ctid ,
0febb9
+.BI "           unsigned long " newtls );
0febb9
 .fi
0febb9
 .in
0febb9
+
0febb9
 Another difference for the raw system call is that the
0febb9
 .I child_stack
0febb9
 argument may be zero, in which case copy-on-write semantics ensure that the
0febb9
@@ -819,17 +864,13 @@ the stack.
0febb9
 In this case, for correct operation, the
0febb9
 .B CLONE_VM
0febb9
 option should not be specified.
0febb9
-
0febb9
-For some architectures, the order of the arguments for the system call
0febb9
-differs from that shown above.
0febb9
-On the score, microblaze, ARM, ARM 64, PA-RISC, arc, Power PC, xtensa,
0febb9
-and MIPS architectures,
0febb9
-the order of the fourth and fifth arguments is reversed.
0febb9
-On the cris and s390 architectures,
0febb9
-the order of the first and second arguments is reversed.
0febb9
+.\"
0febb9
 .SS blackfin, m68k, and sparc
0febb9
+.\" Mike Frysinger noted in a 2013 mail:
0febb9
+.\"     these arches don't define __ARCH_WANT_SYS_CLONE:
0febb9
+.\"     blackfin ia64 m68k sparc
0febb9
 The argument-passing conventions on
0febb9
-blackfin, m68k, and sparc are different from descriptions above.
0febb9
+blackfin, m68k, and sparc are different from the descriptions above.
0febb9
 For details, see the kernel (and glibc) source.
0febb9
 .SS ia64
0febb9
 On ia64, a different interface is used:
0febb9
@@ -883,7 +924,8 @@ will be set appropriately.
0febb9
 .SH ERRORS
0febb9
 .TP
0febb9
 .B EAGAIN
0febb9
-Too many processes are already running.
0febb9
+Too many processes are already running; see
0febb9
+.BR fork (2).
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
 .B CLONE_SIGHAND
0febb9
@@ -908,6 +950,7 @@ was not.
0febb9
 .\" (Since Linux 2.6.0-test6.)
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
+.\" commit e66eded8309ebf679d3d3c1f5820d1f2ca332c71
0febb9
 Both
0febb9
 .B CLONE_FS
0febb9
 and
0febb9
@@ -915,6 +958,14 @@ and
0febb9
 were specified in
0febb9
 .IR flags .
0febb9
 .TP
0febb9
+.BR EINVAL " (since Linux 3.9)"
0febb9
+Both
0febb9
+.B CLONE_NEWUSER
0febb9
+and
0febb9
+.B CLONE_FS
0febb9
+were specified in
0febb9
+.IR flags .
0febb9
+.TP
0febb9
 .B EINVAL
0febb9
 Both
0febb9
 .B CLONE_NEWIPC
0febb9
@@ -924,18 +975,25 @@ were specified in
0febb9
 .IR flags .
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
-Both
0febb9
+One (or both) of
0febb9
 .BR CLONE_NEWPID
0febb9
-and
0febb9
+or
0febb9
+.BR CLONE_NEWUSER
0febb9
+and one (or both) of
0febb9
 .BR CLONE_THREAD
0febb9
+or
0febb9
+.BR CLONE_PARENT
0febb9
 were specified in
0febb9
 .IR flags .
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
-Returned by
0febb9
+Returned by the glibc
0febb9
 .BR clone ()
0febb9
-when a zero value is specified for
0febb9
-.IR child_stack .
0febb9
+wrapper function when
0febb9
+.IR fn
0febb9
+or
0febb9
+.IR child_stack
0febb9
+is specified as NULL.
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
 .BR CLONE_NEWIPC
0febb9
@@ -971,11 +1029,48 @@ but the kernel was not configured with the
0febb9
 .B CONFIG_UTS
0febb9
 option.
0febb9
 .TP
0febb9
+.B EINVAL
0febb9
+.I child_stack
0febb9
+is not aligned to a suitable boundary for this architecture.
0febb9
+For example, on aarch64,
0febb9
+.I child_stack
0febb9
+must be a multiple of 16.
0febb9
+.TP
0febb9
 .B ENOMEM
0febb9
 Cannot allocate sufficient memory to allocate a task structure for the
0febb9
 child, or to copy those parts of the caller's context that need to be
0febb9
 copied.
0febb9
 .TP
0febb9
+.BR ENOSPC " (since Linux 3.7)"
0febb9
+.\" commit f2302505775fd13ba93f034206f1e2a587017929
0febb9
+.B CLONE_NEWPID
0febb9
+was specified in \fIflags\fP,
0febb9
+but the limit on the nesting depth of PID namespaces
0febb9
+would have been exceeded; see
0febb9
+.BR pid_namespaces (7).
0febb9
+.TP
0febb9
+.BR ENOSPC " (since Linux 4.9; beforehand " EUSERS )
0febb9
+.B CLONE_NEWUSER
0febb9
+was specified in
0febb9
+.IR flags ,
0febb9
+and the call would cause the limit on the number of
0febb9
+nested user namespaces to be exceeded.
0febb9
+See
0febb9
+.BR user_namespaces (7).
0febb9
+
0febb9
+From Linux 3.11 to Linux 4.8, the error diagnosed in this case was
0febb9
+.BR EUSERS .
0febb9
+.TP
0febb9
+.BR ENOSPC " (since Linux 4.9)"
0febb9
+One of the values in
0febb9
+.I flags
0febb9
+specified the creation of a new user namespace,
0febb9
+but doing so would have caused the limit defined by the corresponding file in
0febb9
+.IR /proc/sys/user
0febb9
+to be exceeded.
0febb9
+For further details, see
0febb9
+.BR namespaces (7).
0febb9
+.TP
0febb9
 .B EPERM
0febb9
 .BR CLONE_NEWIPC ,
0febb9
 .BR CLONE_NEWNET ,
0febb9
@@ -989,22 +1084,62 @@ was specified by an unprivileged process (process without \fBCAP_SYS_ADMIN\fP).
0febb9
 .B CLONE_PID
0febb9
 was specified by a process other than process 0.
0febb9
 .TP
0febb9
+.B EPERM
0febb9
+.BR CLONE_NEWUSER
0febb9
+was specified in
0febb9
+.IR flags ,
0febb9
+but either the effective user ID or the effective group ID of the caller
0febb9
+does not have a mapping in the parent namespace (see
0febb9
+.BR user_namespaces (7)).
0febb9
+.TP
0febb9
+.BR EPERM " (since Linux 3.9)"
0febb9
+.\" commit 3151527ee007b73a0ebd296010f1c0454a919c7d
0febb9
+.B CLONE_NEWUSER
0febb9
+was specified in
0febb9
+.I flags
0febb9
+and the caller is in a chroot environment
0febb9
+.\" FIXME What is the rationale for this restriction?
0febb9
+(i.e., the caller's root directory does not match the root directory
0febb9
+of the mount namespace in which it resides).
0febb9
+.TP
0febb9
 .BR ERESTARTNOINTR " (since Linux 2.6.17)"
0febb9
+.\" commit 4a2c7a7837da1b91468e50426066d988050e4d56
0febb9
 System call was interrupted by a signal and will be restarted.
0febb9
 (This can be seen only during a trace.)
0febb9
-.SH VERSIONS
0febb9
-There is no entry for
0febb9
-.BR clone ()
0febb9
-in libc5.
0febb9
-glibc2 provides
0febb9
-.BR clone ()
0febb9
-as described in this manual page.
0febb9
+.TP
0febb9
+.BR EUSERS " (Linux 3.11 to Linux 4.8)"
0febb9
+.B CLONE_NEWUSER
0febb9
+was specified in
0febb9
+.IR flags ,
0febb9
+and the limit on the number of nested user namespaces would be exceeded.
0febb9
+See the discussion of the
0febb9
+.BR ENOSPC
0febb9
+error above.
0febb9
+.\" .SH VERSIONS
0febb9
+.\" There is no entry for
0febb9
+.\" .BR clone ()
0febb9
+.\" in libc5.
0febb9
+.\" glibc2 provides
0febb9
+.\" .BR clone ()
0febb9
+.\" as described in this manual page.
0febb9
 .SH CONFORMING TO
0febb9
 .BR clone ()
0febb9
 is Linux-specific and should not be used in programs
0febb9
 intended to be portable.
0febb9
 .SH NOTES
0febb9
-In the kernel 2.4.x series,
0febb9
+The
0febb9
+.BR kcmp (2)
0febb9
+system call can be used to test whether two processes share various
0febb9
+resources such as a file descriptor table,
0febb9
+System V semaphore undo operations, or a virtual address space.
0febb9
+
0febb9
+
0febb9
+Handlers registered using
0febb9
+.BR pthread_atfork (3)
0febb9
+are not executed during a call to
0febb9
+.BR clone ().
0febb9
+
0febb9
+In the Linux 2.4.x series,
0febb9
 .B CLONE_THREAD
0febb9
 generally does not make the parent of the new thread the same
0febb9
 as the parent of the calling process.
0febb9
@@ -1012,14 +1147,13 @@ However, for kernel versions 2.4.7 to 2.4.18 the
0febb9
 .B CLONE_THREAD
0febb9
 flag implied the
0febb9
 .B CLONE_PARENT
0febb9
-flag (as in kernel 2.6).
0febb9
+flag (as in Linux 2.6.0 and later).
0febb9
 
0febb9
 For a while there was
0febb9
 .B CLONE_DETACHED
0febb9
 (introduced in 2.5.32):
0febb9
 parent wants no child-exit signal.
0febb9
-In 2.6.2 the need to give this
0febb9
-together with
0febb9
+In Linux 2.6.2, the need to give this flag together with
0febb9
 .B CLONE_THREAD
0febb9
 disappeared.
0febb9
 This flag is still defined, but has no effect.
0febb9
@@ -1088,7 +1222,6 @@ To get the truth, it may be necessary to use code such as the following:
0febb9
 .\" https://bugzilla.redhat.com/show_bug.cgi?id=417521
0febb9
 .\" http://sourceware.org/bugzilla/show_bug.cgi?id=6910
0febb9
 .SH EXAMPLE
0febb9
-.SS Create a child that executes in a separate UTS namespace
0febb9
 The following program demonstrates the use of
0febb9
 .BR clone ()
0febb9
 to create a child process that executes in a separate UTS namespace.
0febb9
@@ -1098,7 +1231,7 @@ making it possible to see that the hostname
0febb9
 differs in the UTS namespaces of the parent and child.
0febb9
 For an example of the use of this program, see
0febb9
 .BR setns (2).
0febb9
-
0febb9
+.SS Program source
0febb9
 .nf
0febb9
 #define _GNU_SOURCE
0febb9
 #include <sys/wait.h>
0febb9
@@ -1198,6 +1331,7 @@ main(int argc, char *argv[])
0febb9
 .BR unshare (2),
0febb9
 .BR wait (2),
0febb9
 .BR capabilities (7),
0febb9
+.BR namespaces (7),
0febb9
 .BR pthreads (7)
0febb9
 .SH COLOPHON
0febb9
 This page is part of release 3.53 of the Linux
0febb9
diff --git a/man-pages/man2/clone.2 b/man-pages/man2/clone.2
0febb9
index d9ffe3e..d053b0e 100644
0febb9
--- a/man-pages/man2/clone.2
0febb9
+++ b/man-pages/man2/clone.2
0febb9
@@ -39,50 +39,23 @@
0febb9
 .\" 2008-11-19, mtk, document CLONE_NEWIPC
0febb9
 .\" 2008-11-19, Jens Axboe, mtk, document CLONE_IO
0febb9
 .\"
0febb9
-.\" FIXME Document CLONE_NEWUSER, which is new in 2.6.23
0febb9
-.\"       (also supported for unshare()?)
0febb9
-.\"
0febb9
-.TH CLONE 2 2013-04-16 "Linux" "Linux Programmer's Manual"
0febb9
+.TH CLONE 2 2016-12-12 "Linux" "Linux Programmer's Manual"
0febb9
 .SH NAME
0febb9
 clone, __clone2 \- create a child process
0febb9
 .SH SYNOPSIS
0febb9
 .nf
0febb9
 /* Prototype for the glibc wrapper function */
0febb9
 
0febb9
+.B #define _GNU_SOURCE
0febb9
 .B #include <sched.h>
0febb9
 
0febb9
 .BI "int clone(int (*" "fn" ")(void *), void *" child_stack ,
0febb9
 .BI "          int " flags ", void *" "arg" ", ... "
0febb9
-.BI "          /* pid_t *" ptid ", struct user_desc *" tls \
0febb9
+.BI "          /* pid_t *" ptid ", void *" newtls \
0febb9
 ", pid_t *" ctid " */ );"
0febb9
 
0febb9
-/* Prototype for the raw system call */
0febb9
-
0febb9
-.BI "long clone(unsigned long " flags ", void *" child_stack ,
0febb9
-.BI "          void *" ptid ", void *" ctid ,
0febb9
-.BI "          struct pt_regs *" regs );
0febb9
+/* For the prototype of the raw system call, see NOTES */
0febb9
 .fi
0febb9
-.sp
0febb9
-.in -4n
0febb9
-Feature Test Macro Requirements for glibc wrapper function (see
0febb9
-.BR feature_test_macros (7)):
0febb9
-.in
0febb9
-.sp
0febb9
-.BR clone ():
0febb9
-.ad l
0febb9
-.RS 4
0febb9
-.PD 0
0febb9
-.TP 4
0febb9
-Since glibc 2.14:
0febb9
-_GNU_SOURCE
0febb9
-.TP 4
0febb9
-.\" FIXME See http://sources.redhat.com/bugzilla/show_bug.cgi?id=4749
0febb9
-Before glibc 2.14:
0febb9
-_BSD_SOURCE || _SVID_SOURCE
0febb9
-    /* _GNU_SOURCE also suffices */
0febb9
-.PD
0febb9
-.RE
0febb9
-.ad b
0febb9
 .SH DESCRIPTION
0febb9
 .BR clone ()
0febb9
 creates a new process, in a manner similar to
0febb9
@@ -107,7 +80,7 @@ But see the description of
0febb9
 .B CLONE_PARENT
0febb9
 below.)
0febb9
 
0febb9
-The main use of
0febb9
+One use of
0febb9
 .BR clone ()
0febb9
 is to implement threads: multiple threads of control in a program that
0febb9
 run concurrently in a shared memory space.
0febb9
@@ -180,7 +153,7 @@ in order to specify what is shared between the calling process
0febb9
 and the child process:
0febb9
 .TP
0febb9
 .BR CLONE_CHILD_CLEARTID " (since Linux 2.5.49)"
0febb9
-Erase child thread ID at location
0febb9
+Clear (zero) the child thread ID at the location
0febb9
 .I ctid
0febb9
 in child memory when the child exits, and do a wakeup on the futex
0febb9
 at that address.
0febb9
@@ -190,9 +163,12 @@ system call.
0febb9
 This is used by threading libraries.
0febb9
 .TP
0febb9
 .BR CLONE_CHILD_SETTID " (since Linux 2.5.49)"
0febb9
-Store child thread ID at location
0febb9
+Store the child thread ID at the location
0febb9
 .I ctid
0febb9
-in child memory.
0febb9
+in the child's memory.
0febb9
+The store operation completes before
0febb9
+.BR clone ()
0febb9
+returns control to user space.
0febb9
 .TP
0febb9
 .BR CLONE_FILES " (since Linux 2.0)"
0febb9
 If
0febb9
@@ -206,27 +182,31 @@ or changes its associated flags (using the
0febb9
 .BR fcntl (2)
0febb9
 .B F_SETFD
0febb9
 operation), the other process is also affected.
0febb9
+If a process sharing a file descriptor table calls
0febb9
+.BR execve (2),
0febb9
+its file descriptor table is duplicated (unshared).
0febb9
 
0febb9
 If
0febb9
 .B CLONE_FILES
0febb9
 is not set, the child process inherits a copy of all file descriptors
0febb9
 opened in the calling process at the time of
0febb9
 .BR clone ().
0febb9
-(The duplicated file descriptors in the child refer to the
0febb9
-same open file descriptions (see
0febb9
-.BR open (2))
0febb9
-as the corresponding file descriptors in the calling process.)
0febb9
 Subsequent operations that open or close file descriptors,
0febb9
 or change file descriptor flags,
0febb9
 performed by either the calling
0febb9
 process or the child process do not affect the other process.
0febb9
+Note, however,
0febb9
+that the duplicated file descriptors in the child refer to the same open file
0febb9
+descriptions as the corresponding file descriptors in the calling process,
0febb9
+and thus share file offsets and file status flags (see
0febb9
+.BR open (2)).
0febb9
 .TP
0febb9
 .BR CLONE_FS " (since Linux 2.0)"
0febb9
 If
0febb9
 .B CLONE_FS
0febb9
-is set, the caller and the child process share the same file system
0febb9
+is set, the caller and the child process share the same filesystem
0febb9
 information.
0febb9
-This includes the root of the file system, the current
0febb9
+This includes the root of the filesystem, the current
0febb9
 working directory, and the umask.
0febb9
 Any call to
0febb9
 .BR chroot (2),
0febb9
@@ -238,7 +218,7 @@ other process.
0febb9
 
0febb9
 If
0febb9
 .B CLONE_FS
0febb9
-is not set, the child process works on a copy of the file system
0febb9
+is not set, the child process works on a copy of the filesystem
0febb9
 information of the calling process at the time of the
0febb9
 .BR clone ()
0febb9
 call.
0febb9
@@ -258,7 +238,7 @@ If this flag is not set, then (as with
0febb9
 the new process has its own I/O context.
0febb9
 
0febb9
 .\" The following based on text from Jens Axboe
0febb9
-The I/O context is the I/O scope of the disk scheduler (i.e,
0febb9
+The I/O context is the I/O scope of the disk scheduler (i.e.,
0febb9
 what the I/O scheduler uses to model scheduling of a process's I/O).
0febb9
 If processes share the same I/O context,
0febb9
 they are treated as one by the I/O scheduler.
0febb9
@@ -288,7 +268,7 @@ the process is created in the same IPC namespace as
0febb9
 the calling process.
0febb9
 This flag is intended for the implementation of containers.
0febb9
 
0febb9
-An IPC namespace provides an isolated view of System V IPC objects (see
0febb9
+An IPC namespace provides an isolated view of System\ V IPC objects (see
0febb9
 .BR svipc (7))
0febb9
 and (since Linux 2.6.30)
0febb9
 .\" commit 7eafd7c74c3f2e67c27621b987b28397110d643f
0febb9
@@ -308,17 +288,17 @@ When an IPC namespace is destroyed
0febb9
 (i.e., when the last process that is a member of the namespace terminates),
0febb9
 all IPC objects in the namespace are automatically destroyed.
0febb9
 
0febb9
-Use of this flag requires: a kernel configured with the
0febb9
-.B CONFIG_SYSVIPC
0febb9
-and
0febb9
-.B CONFIG_IPC_NS
0febb9
-options and that the process be privileged
0febb9
-.RB ( CAP_SYS_ADMIN ).
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWIPC .
0febb9
 This flag can't be specified in conjunction with
0febb9
 .BR CLONE_SYSVSEM .
0febb9
+
0febb9
+For further information on IPC namespaces, see
0febb9
+.BR namespaces (7).
0febb9
 .TP
0febb9
 .BR CLONE_NEWNET " (since Linux 2.6.24)"
0febb9
-.\" FIXME Check when the implementation was completed
0febb9
 (The implementation of this flag was completed only
0febb9
 by about kernel version 2.6.29.)
0febb9
 
0febb9
@@ -326,7 +306,7 @@ If
0febb9
 .B CLONE_NEWNET
0febb9
 is set, then create the process in a new network namespace.
0febb9
 If this flag is not set, then (as with
0febb9
-.BR fork (2)),
0febb9
+.BR fork (2))
0febb9
 the process is created in the same network namespace as
0febb9
 the calling process.
0febb9
 This flag is intended for the implementation of containers.
0febb9
@@ -341,7 +321,7 @@ directory trees, sockets, etc.).
0febb9
 A physical network device can live in exactly one
0febb9
 network namespace.
0febb9
 A virtual network device ("veth") pair provides a pipe-like abstraction
0febb9
-.\" FIXME Add pointer to veth(4) page when it is eventually completed
0febb9
+.\" FIXME . Add pointer to veth(4) page when it is eventually completed
0febb9
 that can be used to create tunnels between network namespaces,
0febb9
 and can be used to create a bridge to a physical network device
0febb9
 in another namespace.
0febb9
@@ -350,54 +330,41 @@ When a network namespace is freed
0febb9
 (i.e., when the last process in the namespace terminates),
0febb9
 its physical network devices are moved back to the
0febb9
 initial network namespace (not to the parent of the process).
0febb9
+For further information on network namespaces, see
0febb9
+.BR namespaces (7).
0febb9
 
0febb9
-Use of this flag requires: a kernel configured with the
0febb9
-.B CONFIG_NET_NS
0febb9
-option and that the process be privileged
0febb9
-.RB ( CAP_SYS_ADMIN ).
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWNET .
0febb9
 .TP
0febb9
 .BR CLONE_NEWNS " (since Linux 2.4.19)"
0febb9
-Start the child in a new mount namespace.
0febb9
-
0febb9
-Every process lives in a mount namespace.
0febb9
-The
0febb9
-.I namespace
0febb9
-of a process is the data (the set of mounts) describing the file hierarchy
0febb9
-as seen by that process.
0febb9
-After a
0febb9
-.BR fork (2)
0febb9
-or
0febb9
-.BR clone ()
0febb9
-where the
0febb9
-.B CLONE_NEWNS
0febb9
-flag is not set, the child lives in the same mount
0febb9
-namespace as the parent.
0febb9
-The system calls
0febb9
-.BR mount (2)
0febb9
-and
0febb9
-.BR umount (2)
0febb9
-change the mount namespace of the calling process, and hence affect
0febb9
-all processes that live in the same namespace, but do not affect
0febb9
-processes in a different mount namespace.
0febb9
-
0febb9
-After a
0febb9
-.BR clone ()
0febb9
-where the
0febb9
+If
0febb9
 .B CLONE_NEWNS
0febb9
-flag is set, the cloned child is started in a new mount namespace,
0febb9
+is set, the cloned child is started in a new mount namespace,
0febb9
 initialized with a copy of the namespace of the parent.
0febb9
-
0febb9
-Only a privileged process (one having the \fBCAP_SYS_ADMIN\fP capability)
0febb9
-may specify the
0febb9
+If
0febb9
 .B CLONE_NEWNS
0febb9
-flag.
0febb9
+is not set, the child lives in the same mount
0febb9
+namespace as the parent.
0febb9
+
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWNS .
0febb9
 It is not permitted to specify both
0febb9
 .B CLONE_NEWNS
0febb9
 and
0febb9
 .B CLONE_FS
0febb9
+.\" See https://lwn.net/Articles/543273/
0febb9
 in the same
0febb9
 .BR clone ()
0febb9
 call.
0febb9
+
0febb9
+For further information on mount namespaces, see
0febb9
+.BR namespaces (7)
0febb9
+and
0febb9
+.BR mount_namespaces (7).
0febb9
 .TP
0febb9
 .BR CLONE_NEWPID " (since Linux 2.6.24)"
0febb9
 .\" This explanation draws a lot of details from
0febb9
@@ -411,73 +378,74 @@ If
0febb9
 .B CLONE_NEWPID
0febb9
 is set, then create the process in a new PID namespace.
0febb9
 If this flag is not set, then (as with
0febb9
-.BR fork (2)),
0febb9
+.BR fork (2))
0febb9
 the process is created in the same PID namespace as
0febb9
 the calling process.
0febb9
 This flag is intended for the implementation of containers.
0febb9
 
0febb9
-A PID namespace provides an isolated environment for PIDs:
0febb9
-PIDs in a new namespace start at 1,
0febb9
-somewhat like a standalone system, and calls to
0febb9
-.BR fork (2),
0febb9
-.BR vfork (2),
0febb9
+For further information on PID namespaces, see
0febb9
+.BR namespaces (7)
0febb9
+and
0febb9
+.BR pid_namespaces (7).
0febb9
+
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWPID .
0febb9
+This flag can't be specified in conjunction with
0febb9
+.BR CLONE_THREAD
0febb9
 or
0febb9
+.BR CLONE_PARENT .
0febb9
+.TP
0febb9
+.BR CLONE_NEWUSER
0febb9
+(This flag first became meaningful for
0febb9
 .BR clone ()
0febb9
-will produce processes with PIDs that are unique within the namespace.
0febb9
+in Linux 2.6.23,
0febb9
+the current
0febb9
+.BR clone ()
0febb9
+semantics were merged in Linux 3.5,
0febb9
+and the final pieces to make the user namespaces completely usable were
0febb9
+merged in Linux 3.8.)
0febb9
 
0febb9
-The first process created in a new namespace
0febb9
-(i.e., the process created using the
0febb9
-.BR CLONE_NEWPID
0febb9
-flag) has the PID 1, and is the "init" process for the namespace.
0febb9
-Children that are orphaned within the namespace will be reparented
0febb9
-to this process rather than
0febb9
-.BR init (8).
0febb9
-Unlike the traditional
0febb9
-.B init
0febb9
-process, the "init" process of a PID namespace can terminate,
0febb9
-and if it does, all of the processes in the namespace are terminated.
0febb9
-
0febb9
-PID namespaces form a hierarchy.
0febb9
-When a new PID namespace is created,
0febb9
-the processes in that namespace are visible
0febb9
-in the PID namespace of the process that created the new namespace;
0febb9
-analogously, if the parent PID namespace is itself
0febb9
-the child of another PID namespace,
0febb9
-then processes in the child and parent PID namespaces will both be
0febb9
-visible in the grandparent PID namespace.
0febb9
-Conversely, the processes in the "child" PID namespace do not see
0febb9
-the processes in the parent namespace.
0febb9
-The existence of a namespace hierarchy means that each process
0febb9
-may now have multiple PIDs:
0febb9
-one for each namespace in which it is visible;
0febb9
-each of these PIDs is unique within the corresponding namespace.
0febb9
-(A call to
0febb9
-.BR getpid (2)
0febb9
-always returns the PID associated with the namespace in which
0febb9
-the process lives.)
0febb9
-
0febb9
-After creating the new namespace,
0febb9
-it is useful for the child to change its root directory
0febb9
-and mount a new procfs instance at
0febb9
-.I /proc
0febb9
-so that tools such as
0febb9
-.BR ps (1)
0febb9
-work correctly.
0febb9
-.\" mount -t proc proc /proc
0febb9
-(If
0febb9
-.BR CLONE_NEWNS
0febb9
-is also included in
0febb9
-.IR flags ,
0febb9
-then it isn't necessary to change the root directory:
0febb9
-a new procfs instance can be mounted directly over
0febb9
-.IR /proc .)
0febb9
+If
0febb9
+.B CLONE_NEWUSER
0febb9
+is set, then create the process in a new user namespace.
0febb9
+If this flag is not set, then (as with
0febb9
+.BR fork (2))
0febb9
+the process is created in the same user namespace as the calling process.
0febb9
+
0febb9
+For further information on user namespaces, see
0febb9
+.BR namespaces (7)
0febb9
+and
0febb9
+.BR user_namespaces (7).
0febb9
+
0febb9
+Before Linux 3.8, use of
0febb9
+.BR CLONE_NEWUSER
0febb9
+required that the caller have three capabilities:
0febb9
+.BR CAP_SYS_ADMIN ,
0febb9
+.BR CAP_SETUID ,
0febb9
+and
0febb9
+.BR CAP_SETGID .
0febb9
+.\" Before Linux 2.6.29, it appears that only CAP_SYS_ADMIN was needed
0febb9
+Starting with Linux 3.8,
0febb9
+no privileges are needed to create a user namespace.
0febb9
 
0febb9
-Use of this flag requires: a kernel configured with the
0febb9
-.B CONFIG_PID_NS
0febb9
-option and that the process be privileged
0febb9
-.RB ( CAP_SYS_ADMIN ).
0febb9
 This flag can't be specified in conjunction with
0febb9
-.BR CLONE_THREAD .
0febb9
+.BR CLONE_THREAD
0febb9
+or
0febb9
+.BR CLONE_PARENT .
0febb9
+For security reasons,
0febb9
+.\" commit e66eded8309ebf679d3d3c1f5820d1f2ca332c71
0febb9
+.\" https://lwn.net/Articles/543273/
0febb9
+.\" The fix actually went into 3.9 and into 3.8.3. However, user namespaces
0febb9
+.\" were, for practical purposes, unusable in earlier 3.8.x because of the
0febb9
+.\" various filesystems that didn't support userns.
0febb9
+.BR CLONE_NEWUSER
0febb9
+cannot be specified in conjunction with
0febb9
+.BR CLONE_FS .
0febb9
+
0febb9
+For further information on user namespaces, see
0febb9
+.BR user_namespaces (7).
0febb9
 .TP
0febb9
 .BR CLONE_NEWUTS " (since Linux 2.6.19)"
0febb9
 If
0febb9
@@ -486,27 +454,29 @@ is set, then create the process in a new UTS namespace,
0febb9
 whose identifiers are initialized by duplicating the identifiers
0febb9
 from the UTS namespace of the calling process.
0febb9
 If this flag is not set, then (as with
0febb9
-.BR fork (2)),
0febb9
+.BR fork (2))
0febb9
 the process is created in the same UTS namespace as
0febb9
 the calling process.
0febb9
 This flag is intended for the implementation of containers.
0febb9
 
0febb9
 A UTS namespace is the set of identifiers returned by
0febb9
 .BR uname (2);
0febb9
-among these, the domain name and the host name can be modified by
0febb9
+among these, the domain name and the hostname can be modified by
0febb9
 .BR setdomainname (2)
0febb9
 and
0febb9
-.BR
0febb9
 .BR sethostname (2),
0febb9
 respectively.
0febb9
 Changes made to the identifiers in a UTS namespace
0febb9
 are visible to all other processes in the same namespace,
0febb9
 but are not visible to processes in other UTS namespaces.
0febb9
 
0febb9
-Use of this flag requires: a kernel configured with the
0febb9
-.B CONFIG_UTS_NS
0febb9
-option and that the process be privileged
0febb9
-.RB ( CAP_SYS_ADMIN ).
0febb9
+Only a privileged process
0febb9
+.RB ( CAP_SYS_ADMIN )
0febb9
+can employ
0febb9
+.BR CLONE_NEWUTS .
0febb9
+
0febb9
+For further information on UTS namespaces, see
0febb9
+.BR namespaces (7).
0febb9
 .TP
0febb9
 .BR CLONE_PARENT " (since Linux 2.3.12)"
0febb9
 If
0febb9
@@ -530,12 +500,15 @@ is set, then the parent of the calling process, rather than the
0febb9
 calling process itself, will be signaled.
0febb9
 .TP
0febb9
 .BR CLONE_PARENT_SETTID " (since Linux 2.5.49)"
0febb9
-Store child thread ID at location
0febb9
+Store the child thread ID at the location
0febb9
 .I ptid
0febb9
-in parent and child memory.
0febb9
+in the parent's memory.
0febb9
 (In Linux 2.5.32-2.5.48 there was a flag
0febb9
 .B CLONE_SETTID
0febb9
 that did this.)
0febb9
+The store operation completes before
0febb9
+.BR clone ()
0febb9
+returns control to user space.
0febb9
 .TP
0febb9
 .BR CLONE_PID " (obsolete)"
0febb9
 If
0febb9
@@ -547,6 +520,7 @@ of not much use.
0febb9
 Since 2.3.21 this flag can be
0febb9
 specified only by the system boot process (PID 0).
0febb9
 It disappeared in Linux 2.5.16.
0febb9
+Since then, the kernel silently ignores it without error.
0febb9
 .TP
0febb9
 .BR CLONE_PTRACE " (since Linux 2.2)"
0febb9
 If
0febb9
@@ -556,11 +530,25 @@ then trace the child also (see
0febb9
 .BR ptrace (2)).
0febb9
 .TP
0febb9
 .BR CLONE_SETTLS " (since Linux 2.5.32)"
0febb9
-The
0febb9
+The TLS (Thread Local Storage) descriptor is set to
0febb9
+.IR newtls .
0febb9
+
0febb9
+The interpretation of
0febb9
 .I newtls
0febb9
-argument is the new TLS (Thread Local Storage) descriptor.
0febb9
+and the resulting effect is architecture dependent.
0febb9
+On x86,
0febb9
+.I newtls
0febb9
+is interpreted as a
0febb9
+.IR "struct user_desc *"
0febb9
 (See
0febb9
-.BR set_thread_area (2).)
0febb9
+.BR set_thread_area (2)).
0febb9
+On x86-64, it is the new value to be set for the %fs base register
0febb9
+(see the
0febb9
+.I ARCH_SET_FS
0febb9
+argument to
0febb9
+.BR arch_prctl (2)).
0febb9
+On architectures with a dedicated TLS register, it is the new value
0febb9
+of that register.
0febb9
 .TP
0febb9
 .BR CLONE_SIGHAND " (since Linux 2.0)"
0febb9
 If
0febb9
@@ -612,16 +600,26 @@ from Linux 2.6.25 onward,
0febb9
 and was
0febb9
 .I removed
0febb9
 altogether in Linux 2.6.38.
0febb9
+Since then, the kernel silently ignores it without error.
0febb9
 .\" glibc 2.8 removed this defn from bits/sched.h
0febb9
 .TP
0febb9
 .BR CLONE_SYSVSEM " (since Linux 2.5.10)"
0febb9
 If
0febb9
 .B CLONE_SYSVSEM
0febb9
 is set, then the child and the calling process share
0febb9
-a single list of System V semaphore undo values (see
0febb9
+a single list of System V semaphore adjustment
0febb9
+.RI ( semadj )
0febb9
+values (see
0febb9
 .BR semop (2)).
0febb9
-If this flag is not set, then the child has a separate undo list,
0febb9
-which is initially empty.
0febb9
+In this case, the shared list accumulates
0febb9
+.I semadj
0febb9
+values across all processes sharing the list,
0febb9
+and semaphore adjustments are performed only when the last process
0febb9
+that is sharing the list terminates (or ceases sharing the list using
0febb9
+.BR unshare (2)).
0febb9
+If this flag is not set, then the child has a separate
0febb9
+.I semadj
0febb9
+list that is initially empty.
0febb9
 .TP
0febb9
 .BR CLONE_THREAD " (since Linux 2.4.0-test8)"
0febb9
 If
0febb9
@@ -703,7 +701,12 @@ must also include
0febb9
 .B CLONE_SIGHAND
0febb9
 if
0febb9
 .B CLONE_THREAD
0febb9
-is specified.
0febb9
+is specified
0febb9
+(and note that, since Linux 2.6.0-test6,
0febb9
+.BR CLONE_SIGHAND
0febb9
+also requires
0febb9
+.BR CLONE_VM
0febb9
+to be included).
0febb9
 
0febb9
 Signals may be sent to a thread group as a whole (i.e., a TGID) using
0febb9
 .BR kill (2),
0febb9
@@ -761,7 +764,7 @@ or
0febb9
 
0febb9
 If
0febb9
 .B CLONE_VFORK
0febb9
-is not set then both the calling process and the child are schedulable
0febb9
+is not set, then both the calling process and the child are schedulable
0febb9
 after the call, and an application should not rely on execution occurring
0febb9
 in any particular order.
0febb9
 .TP
0febb9
@@ -786,7 +789,7 @@ space of the calling process at the time of
0febb9
 Memory writes or file mappings/unmappings performed by one of the
0febb9
 processes do not affect the other, as with
0febb9
 .BR fork (2).
0febb9
-.SS The raw system call interface
0febb9
+.SS C library/kernel differences
0febb9
 The raw
0febb9
 .BR clone ()
0febb9
 system call corresponds more closely to
0febb9
@@ -801,16 +804,58 @@ arguments of the
0febb9
 .BR clone ()
0febb9
 wrapper function are omitted.
0febb9
 Furthermore, the argument order changes.
0febb9
-The raw system call interface on x86 and many other architectures is roughly:
0febb9
+In addition, there are variations across architectures.
0febb9
+
0febb9
+The raw system call interface on x86-64 and some other architectures
0febb9
+(including sh, tile, and alpha) is roughly:
0febb9
+
0febb9
 .in +4
0febb9
 .nf
0febb9
+.BI "long clone(unsigned long " flags ", void *" child_stack ,
0febb9
+.BI "           int *" ptid ", int *" ctid ,
0febb9
+.BI "           unsigned long " newtls );
0febb9
+.fi
0febb9
+.in
0febb9
+
0febb9
+On x86-32, and several other common architectures
0febb9
+(including score, ARM, ARM 64, PA-RISC, arc, Power PC, xtensa,
0febb9
+and MIPS),
0febb9
+.\" CONFIG_CLONE_BACKWARDS
0febb9
+the order of the last two arguments is reversed:
0febb9
 
0febb9
+.in +4
0febb9
+.nf
0febb9
 .BI "long clone(unsigned long " flags ", void *" child_stack ,
0febb9
-.BI "           void *" ptid ", void *" ctid ,
0febb9
-.BI "           struct pt_regs *" regs );
0febb9
+.BI "          int *" ptid ", unsigned long " newtls ,
0febb9
+.BI "          int *" ctid );
0febb9
+.fi
0febb9
+.in
0febb9
+
0febb9
+On the cris and s390 architectures,
0febb9
+.\" CONFIG_CLONE_BACKWARDS2
0febb9
+the order of the first two arguments is reversed:
0febb9
 
0febb9
+.in +4
0febb9
+.nf
0febb9
+.BI "long clone(void *" child_stack ", unsigned long " flags ,
0febb9
+.BI "           int *" ptid ", int *" ctid ,
0febb9
+.BI "           unsigned long " newtls );
0febb9
+.fi
0febb9
+.in
0febb9
+
0febb9
+On the microblaze architecture,
0febb9
+.\" CONFIG_CLONE_BACKWARDS3
0febb9
+an additional argument is supplied:
0febb9
+
0febb9
+.in +4
0febb9
+.nf
0febb9
+.BI "long clone(unsigned long " flags ", void *" child_stack ,
0febb9
+.BI "           int " stack_size , "\fR         /* Size of stack */"
0febb9
+.BI "           int *" ptid ", int *" ctid ,
0febb9
+.BI "           unsigned long " newtls );
0febb9
 .fi
0febb9
 .in
0febb9
+
0febb9
 Another difference for the raw system call is that the
0febb9
 .I child_stack
0febb9
 argument may be zero, in which case copy-on-write semantics ensure that the
0febb9
@@ -819,17 +864,13 @@ the stack.
0febb9
 In this case, for correct operation, the
0febb9
 .B CLONE_VM
0febb9
 option should not be specified.
0febb9
-
0febb9
-For some architectures, the order of the arguments for the system call
0febb9
-differs from that shown above.
0febb9
-On the score, microblaze, ARM, ARM 64, PA-RISC, arc, Power PC, xtensa,
0febb9
-and MIPS architectures,
0febb9
-the order of the fourth and fifth arguments is reversed.
0febb9
-On the cris and s390 architectures,
0febb9
-the order of the first and second arguments is reversed.
0febb9
+.\"
0febb9
 .SS blackfin, m68k, and sparc
0febb9
+.\" Mike Frysinger noted in a 2013 mail:
0febb9
+.\"     these arches don't define __ARCH_WANT_SYS_CLONE:
0febb9
+.\"     blackfin ia64 m68k sparc
0febb9
 The argument-passing conventions on
0febb9
-blackfin, m68k, and sparc are different from descriptions above.
0febb9
+blackfin, m68k, and sparc are different from the descriptions above.
0febb9
 For details, see the kernel (and glibc) source.
0febb9
 .SS ia64
0febb9
 On ia64, a different interface is used:
0febb9
@@ -883,7 +924,8 @@ will be set appropriately.
0febb9
 .SH ERRORS
0febb9
 .TP
0febb9
 .B EAGAIN
0febb9
-Too many processes are already running.
0febb9
+Too many processes are already running; see
0febb9
+.BR fork (2).
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
 .B CLONE_SIGHAND
0febb9
@@ -908,6 +950,7 @@ was not.
0febb9
 .\" (Since Linux 2.6.0-test6.)
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
+.\" commit e66eded8309ebf679d3d3c1f5820d1f2ca332c71
0febb9
 Both
0febb9
 .B CLONE_FS
0febb9
 and
0febb9
@@ -915,6 +958,14 @@ and
0febb9
 were specified in
0febb9
 .IR flags .
0febb9
 .TP
0febb9
+.BR EINVAL " (since Linux 3.9)"
0febb9
+Both
0febb9
+.B CLONE_NEWUSER
0febb9
+and
0febb9
+.B CLONE_FS
0febb9
+were specified in
0febb9
+.IR flags .
0febb9
+.TP
0febb9
 .B EINVAL
0febb9
 Both
0febb9
 .B CLONE_NEWIPC
0febb9
@@ -924,18 +975,25 @@ were specified in
0febb9
 .IR flags .
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
-Both
0febb9
+One (or both) of
0febb9
 .BR CLONE_NEWPID
0febb9
-and
0febb9
+or
0febb9
+.BR CLONE_NEWUSER
0febb9
+and one (or both) of
0febb9
 .BR CLONE_THREAD
0febb9
+or
0febb9
+.BR CLONE_PARENT
0febb9
 were specified in
0febb9
 .IR flags .
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
-Returned by
0febb9
+Returned by the glibc
0febb9
 .BR clone ()
0febb9
-when a zero value is specified for
0febb9
-.IR child_stack .
0febb9
+wrapper function when
0febb9
+.IR fn
0febb9
+or
0febb9
+.IR child_stack
0febb9
+is specified as NULL.
0febb9
 .TP
0febb9
 .B EINVAL
0febb9
 .BR CLONE_NEWIPC
0febb9
@@ -971,11 +1029,48 @@ but the kernel was not configured with the
0febb9
 .B CONFIG_UTS
0febb9
 option.
0febb9
 .TP
0febb9
+.B EINVAL
0febb9
+.I child_stack
0febb9
+is not aligned to a suitable boundary for this architecture.
0febb9
+For example, on aarch64,
0febb9
+.I child_stack
0febb9
+must be a multiple of 16.
0febb9
+.TP
0febb9
 .B ENOMEM
0febb9
 Cannot allocate sufficient memory to allocate a task structure for the
0febb9
 child, or to copy those parts of the caller's context that need to be
0febb9
 copied.
0febb9
 .TP
0febb9
+.BR ENOSPC " (since Linux 3.7)"
0febb9
+.\" commit f2302505775fd13ba93f034206f1e2a587017929
0febb9
+.B CLONE_NEWPID
0febb9
+was specified in \fIflags\fP,
0febb9
+but the limit on the nesting depth of PID namespaces
0febb9
+would have been exceeded; see
0febb9
+.BR pid_namespaces (7).
0febb9
+.TP
0febb9
+.BR ENOSPC " (since Linux 4.9; beforehand " EUSERS )
0febb9
+.B CLONE_NEWUSER
0febb9
+was specified in
0febb9
+.IR flags ,
0febb9
+and the call would cause the limit on the number of
0febb9
+nested user namespaces to be exceeded.
0febb9
+See
0febb9
+.BR user_namespaces (7).
0febb9
+
0febb9
+From Linux 3.11 to Linux 4.8, the error diagnosed in this case was
0febb9
+.BR EUSERS .
0febb9
+.TP
0febb9
+.BR ENOSPC " (since Linux 4.9)"
0febb9
+One of the values in
0febb9
+.I flags
0febb9
+specified the creation of a new user namespace,
0febb9
+but doing so would have caused the limit defined by the corresponding file in
0febb9
+.IR /proc/sys/user
0febb9
+to be exceeded.
0febb9
+For further details, see
0febb9
+.BR namespaces (7).
0febb9
+.TP
0febb9
 .B EPERM
0febb9
 .BR CLONE_NEWIPC ,
0febb9
 .BR CLONE_NEWNET ,
0febb9
@@ -989,22 +1084,62 @@ was specified by an unprivileged process (process without \fBCAP_SYS_ADMIN\fP).
0febb9
 .B CLONE_PID
0febb9
 was specified by a process other than process 0.
0febb9
 .TP
0febb9
+.B EPERM
0febb9
+.BR CLONE_NEWUSER
0febb9
+was specified in
0febb9
+.IR flags ,
0febb9
+but either the effective user ID or the effective group ID of the caller
0febb9
+does not have a mapping in the parent namespace (see
0febb9
+.BR user_namespaces (7)).
0febb9
+.TP
0febb9
+.BR EPERM " (since Linux 3.9)"
0febb9
+.\" commit 3151527ee007b73a0ebd296010f1c0454a919c7d
0febb9
+.B CLONE_NEWUSER
0febb9
+was specified in
0febb9
+.I flags
0febb9
+and the caller is in a chroot environment
0febb9
+.\" FIXME What is the rationale for this restriction?
0febb9
+(i.e., the caller's root directory does not match the root directory
0febb9
+of the mount namespace in which it resides).
0febb9
+.TP
0febb9
 .BR ERESTARTNOINTR " (since Linux 2.6.17)"
0febb9
+.\" commit 4a2c7a7837da1b91468e50426066d988050e4d56
0febb9
 System call was interrupted by a signal and will be restarted.
0febb9
 (This can be seen only during a trace.)
0febb9
-.SH VERSIONS
0febb9
-There is no entry for
0febb9
-.BR clone ()
0febb9
-in libc5.
0febb9
-glibc2 provides
0febb9
-.BR clone ()
0febb9
-as described in this manual page.
0febb9
+.TP
0febb9
+.BR EUSERS " (Linux 3.11 to Linux 4.8)"
0febb9
+.B CLONE_NEWUSER
0febb9
+was specified in
0febb9
+.IR flags ,
0febb9
+and the limit on the number of nested user namespaces would be exceeded.
0febb9
+See the discussion of the
0febb9
+.BR ENOSPC
0febb9
+error above.
0febb9
+.\" .SH VERSIONS
0febb9
+.\" There is no entry for
0febb9
+.\" .BR clone ()
0febb9
+.\" in libc5.
0febb9
+.\" glibc2 provides
0febb9
+.\" .BR clone ()
0febb9
+.\" as described in this manual page.
0febb9
 .SH CONFORMING TO
0febb9
 .BR clone ()
0febb9
 is Linux-specific and should not be used in programs
0febb9
 intended to be portable.
0febb9
 .SH NOTES
0febb9
-In the kernel 2.4.x series,
0febb9
+The
0febb9
+.BR kcmp (2)
0febb9
+system call can be used to test whether two processes share various
0febb9
+resources such as a file descriptor table,
0febb9
+System V semaphore undo operations, or a virtual address space.
0febb9
+
0febb9
+
0febb9
+Handlers registered using
0febb9
+.BR pthread_atfork (3)
0febb9
+are not executed during a call to
0febb9
+.BR clone ().
0febb9
+
0febb9
+In the Linux 2.4.x series,
0febb9
 .B CLONE_THREAD
0febb9
 generally does not make the parent of the new thread the same
0febb9
 as the parent of the calling process.
0febb9
@@ -1012,14 +1147,13 @@ However, for kernel versions 2.4.7 to 2.4.18 the
0febb9
 .B CLONE_THREAD
0febb9
 flag implied the
0febb9
 .B CLONE_PARENT
0febb9
-flag (as in kernel 2.6).
0febb9
+flag (as in Linux 2.6.0 and later).
0febb9
 
0febb9
 For a while there was
0febb9
 .B CLONE_DETACHED
0febb9
 (introduced in 2.5.32):
0febb9
 parent wants no child-exit signal.
0febb9
-In 2.6.2 the need to give this
0febb9
-together with
0febb9
+In Linux 2.6.2, the need to give this flag together with
0febb9
 .B CLONE_THREAD
0febb9
 disappeared.
0febb9
 This flag is still defined, but has no effect.
0febb9
@@ -1071,7 +1205,6 @@ To get the truth, it may be necessary to use code such as the following:
0febb9
 .\" https://bugzilla.redhat.com/show_bug.cgi?id=417521
0febb9
 .\" http://sourceware.org/bugzilla/show_bug.cgi?id=6910
0febb9
 .SH EXAMPLE
0febb9
-.SS Create a child that executes in a separate UTS namespace
0febb9
 The following program demonstrates the use of
0febb9
 .BR clone ()
0febb9
 to create a child process that executes in a separate UTS namespace.
0febb9
@@ -1081,7 +1214,7 @@ making it possible to see that the hostname
0febb9
 differs in the UTS namespaces of the parent and child.
0febb9
 For an example of the use of this program, see
0febb9
 .BR setns (2).
0febb9
-
0febb9
+.SS Program source
0febb9
 .nf
0febb9
 #define _GNU_SOURCE
0febb9
 #include <sys/wait.h>
0febb9
@@ -1181,6 +1314,7 @@ main(int argc, char *argv[])
0febb9
 .BR unshare (2),
0febb9
 .BR wait (2),
0febb9
 .BR capabilities (7),
0febb9
+.BR namespaces (7),
0febb9
 .BR pthreads (7)
0febb9
 .SH COLOPHON
0febb9
 This page is part of release 3.53 of the Linux
0febb9
-- 
0febb9
2.7.4
0febb9