Miroslav Lichvar 3cb7a1
commit 1a2dfe9b00b79a59acf905476bbc33c74d5770a3
Author: Jacob Keller <jacob.e.keller@intel.com>
Date:   Thu Jul 8 12:59:30 2021 -0700
    Increase the default tx_timestamp_timeout to 10
    
    The tx_timestamp_timeout configuration defines the number of
    milliseconds to wait for a Tx timestamp from the kernel stack. This
    delay is necessary as Tx timestamps are captured after a packet is sent
    and reported back via the socket error queue.
    
    The current default is to poll for up to 1 millisecond. In practice, it
    turns out that this is not always enough time for hardware and software
    to capture the timestamp and report it back. Some hardware designs
    require reading timestamps over registers or other slow mechanisms.
    
    This extra delay results in the timestamp not being sent back to
    userspace within the default 1 millisecond polling time. If that occurs
    the following can be seen from ptp4l:
    
      ptp4l[4756.840]: timed out while polling for tx timestamp
      ptp4l[4756.840]: increasing tx_timestamp_timeout may correct this issue,
                       but it is likely caused by a driver bug
      ptp4l[4756.840]: port 1 (p2p1): send sync failed
      ptp4l[4756.840]: port 1 (p2p1): MASTER to FAULTY on FAULT_DETECTED
                       (FT_UNSPECIFIED)
    
    This can confuse users because it implies this is a bug, when the
    correct solution in many cases is to just increase the timeout to
    a slightly higher value.
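
    As an illustration, a user who still sees this message could raise
    the value in the [global] section of their ptp4l configuration file.
    The value 50 below is an arbitrary example, not a recommendation;
    the right value depends on the NIC and driver in use:

    ```
    [global]
    # Poll up to 50 ms for the Tx timestamp (example value only)
    tx_timestamp_timeout	50
    ```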
    
    Since we know this is a problem for many drivers and hardware designs,
    let's increase the default timeout.
    
    Note that a longer timeout should not affect setups which return the
    timestamp quickly. On modern kernels, the poll() call will return once
    the timestamp is reported back to the socket error queue. (On old
    kernels around the 3.x era the poll will sleep for the full duration
    before reporting the timestamp, but this is now quite an old kernel
    release).
    
    Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
diff --git a/config.c b/config.c
index 760b395..03d981e 100644
--- a/config.c
+++ b/config.c
@@ -324,7 +324,7 @@ struct config_item config_tab[] = {
 	GLOB_ITEM_INT("ts2phc.pulsewidth", 500000000, 1000000, 999000000),
 	PORT_ITEM_ENU("tsproc_mode", TSPROC_FILTER, tsproc_enu),
 	GLOB_ITEM_INT("twoStepFlag", 1, 0, 1),
-	GLOB_ITEM_INT("tx_timestamp_timeout", 1, 1, INT_MAX),
+	GLOB_ITEM_INT("tx_timestamp_timeout", 10, 1, INT_MAX),
 	PORT_ITEM_INT("udp_ttl", 1, 1, 255),
 	PORT_ITEM_INT("udp6_scope", 0x0E, 0x00, 0x0F),
 	GLOB_ITEM_STR("uds_address", "/var/run/ptp4l"),
diff --git a/configs/default.cfg b/configs/default.cfg
index 64ef3bd..d615610 100644
--- a/configs/default.cfg
+++ b/configs/default.cfg
@@ -51,7 +51,7 @@ hybrid_e2e		0
 inhibit_multicast_service	0
 net_sync_monitor	0
 tc_spanning_tree	0
-tx_timestamp_timeout	1
+tx_timestamp_timeout	10
 unicast_listen		0
 unicast_master_table	0
 unicast_req_duration	3600
diff --git a/ptp4l.8 b/ptp4l.8
index fe9e150..7ca3474 100644
--- a/ptp4l.8
+++ b/ptp4l.8
@@ -496,7 +496,7 @@ switches all implement this option together with the BMCA.
 .B tx_timestamp_timeout
 The number of milliseconds to poll waiting for the tx time stamp from the kernel
 when a message has recently been sent.
-The default is 1.
+The default is 10.
 .TP
 .B check_fup_sync
 Because of packet reordering that can occur in the network, in the