This fixes a bug in tcp_split_unsent_seg() where a chained pbuf was
not correctly updating pcb->snd_queuelen during trimming and snd_queuelen
would desynchronize if pbuf_realloc() freed some of the chain
Also, use pbuf_clen() for adding the new remaining segment rather than ++.
The new remaining segment should always be one pbuf due to the semantics
of PBUF_RAM, but this follows the best practice of using pbuf_clen()
This fixes a bug in tcp_split_unsent_seg where oversized segments were not
handled during the split, leading to pcb->unsent_oversized and
useg->oversize_left getting out of sync with the split segment
This would result in over-writing the pbuf if another call to tcp_write()
happened after the split, but before the remainder of the split was sent in
tcp_output
Now pcb->unsent_oversized is explicitly cleared (because the remainder at
the tail is never oversized) and useg->oversized_left is cleared after it is
trimmed
This also updates the test_tcp_persist_split unit test to explicitly check for
this case
There were a couple cases in-between that could cause an exit from
tcp_output which don't use useg. With large send buffers, pcb->unacked
may be large and calculating useg is wasted in these exit cases
Some compilers may be re-ordering this already, but it doesn't hurt to
correctly arrange the code
This re-works the persist timer to have the following behavior:
1) Only start persist timer when a buffered segment doesn't fit within
the current window and there is no in-fligh data. Previously, the
persist timer was always started when the window went to zero even
if there was no buffered data (since timer was managed in receive
pathway rather than transmit pathway)
2) Upon first fire of persist timer, fill the remaining window if
non-zero by splitting the unsent segment. If split segment is sent,
persist timer is stopped, RTO timer is now ensuring reliable window
updates
3) If window is already zero when persist timer fires, send 1 byte probe
4) Persist timer and zero window probe should only be active when the
following are true:
* no in-flight data (pcb->unacked == NULL)
* when there is buffered data (pcb->unsent != NULL)
* when pcb->unsent->len > pcb->snd_wnd
- add a better-documented static function tcp_output_segment_busy
- try to reduce the number of checks
- tcp_rexmit_rto: iterate pcb->unacked only once
- no need to check for ref==1 in tcp_rexmit_fast when tcp_rexmit does
- call tcp_rexmit_fast if dupacks >= 3 (not == 3) and use TF_INFR flag to guard the fast-rexmit case (that way, it's triggered again on the next dupack)
There is already a guard in tcp_output_segment() for a pbuf still being
referenced by the netif driver due to deferred transmission, however the callers
are modifying state even when this gives up.
It seems cleaner to have the callers guard this case and avoid modifying their
state.
tcp_rexmit_rto() might better avoid re-transmission of any segments if any of
the unacked segments are deferred, to avoid loading the link further if it is
struggling to flush its buffered writes. Link level queues can be limited on
some devices and need spares for link management.
In tcp_output() there were a number of blocks of code performing
duplicate checks of 'if (seg == NULL)'. This combines them together
to reduce duplicate checks
TCP_OUTPUT_DEBUG and TCP_CWND_DEBUG also don't need to be guarded
by #if/#endif since the LWIP_DEBUGF infrastructure already compiles them
out when LWIP_DEBUG is not defined
Adds partial support for selective acknowledgements (RFC 2018).
This change makes lwIP negotiate SACK support, and include SACK
data in outgoing empty ACK packets. It does not include it
in outgoing packets with data payload.
It also does not add support for handling incoming SACKs.
Signed-off-by: goldsimon <goldsimon@gmx.de>
TCP timestamps were only sent if the remote side
requested it first. This enables the use of timestamps
for outgoing connections as well.
Signed-off-by: goldsimon <goldsimon@gmx.de>
This commit adds a timeout to the zero-window probing (persist timer)
mechanism. LwIP has not historically had a timeout for the persist
timer, leading to unbounded blocking if connection drops during the
zero-window condition
This commit also adds two units test, one to check the RTO timeout
and a second to check the zero-window probe timeout
This fixes build error if LWIP_NETIF_TX_SINGLE_PBUF==1.
Fixes: dd811bca06 ("Fix bug #50694 (TX exist more pbufs after enable LWIP_NETIF_TX_SINGLE_PBUF) by not executing phase 2 for LWIP_NETIF_TX_SINGLE_PBUF==1")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
This commit adds TCP Appropriate Byte Counting (ABC) support based on
RFC 3465
ABC replaces the previous congestion window growth mechanism and has been
configured with limit of 2 SMSS. See task #14128 for discussion on
defaults, but the goal is to mitigate the performance impact of delayed
ACKs on congestion window growth
This commit also introduces a mechanism to track when the stack is
undergoing a period following an RTO where data is being retransmitted.
Lastly, this adds a unit test to verify RTO period tracking and some
basic ABC cwnd checking
Now that tcp_connect() always determines the outgoing netif with a
route lookup, we can compute the effective MSS without doing the same
route lookup again. The outgoing netif is already known from one
other location that computes the MSS, so we can eliminate a redundant
route lookup there too. Reduce some macro clutter as a side effect.
While TCP_OVERSIZE works only when tcp_write() is used with
TCP_WRITE_FLAG_COPY, this new code achieves
similar benefits for the use case that the caller manages their own
send buffers and passes successive chunks of those to tcp_write()
without TCP_WRITE_FLAG_COPY.
In particular, if a buffer is passed to
tcp_write() that is adjacent in memory to the previously passed
buffer, it will be combined into the previous ROM pbuf reference
whenever possible, thus extending that ROM pbuf rather than allocating
a new ROM pbuf.
For the aforementioned use case, the advantages of this code are
twofold:
1) fewer ROM pbufs need to be allocated to send the same data, and,
2) the MAC layer gets outgoing TCP packets with shorter pbuf chains.
Original patch by Ambroz Bizjak <ambrop7@gmail.com>
Edited by David van Moolenbroek <david@minix3.org>
Signed-off-by: goldsimon <goldsimon@gmx.de>
This commit returns LwIP to previous behavior where if the next unsent
segment can't be sent due to the current send window, we start the
persist timer. This is done to engage window probing in the case that
the subsequent window update from the receiver is dropped, thus
preventing connection deadlock
This commit refines the previous logic to only target the following case:
1) Next unsent segment doesn't fit within the send window (not
congestion) and there is some room in the window
2) Unacked queue is empty (otherwise data is inflight and the RTO timer
will take care of any dropped window updates)
See commit d8f090a759 (which removed this
behavior) to reference the old logic. The old logic falsely started the
persit timer when the RTO timer was already running.
This comment is incorrect since commit 7d0dab9d7d
"partly fixed bug #25882: TCP hangs on MSS > pcb->snd_wnd
(by not creating segments bigger than half the window)".
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Let lwip use functions/macros prefixed by lwip_ internally to avoid naming clashes with external #includes.
Remove over-complicated #define handling in def.h
Make functions easier to override in cc.h. The following is sufficient now (no more LWIP_PLATFORM_BYTESWAP):
#define lwip_htons(x) <your_htons>
#define lwip_htonl(x) <your_htonl>
It is possible that the byte sent as a zero window probe is accepted
and acknowledged by the receiver side without the window being opened.
In that case, the stream has effectively advanced by one byte, and
since lwIP did not take this into account on the sender side, the
result was a desynchronization between the sender and the receiver.
That situation could occur even on a lwIP loopback device, after
filling up the receiver side's receive buffer, and resulted in an ACK
storm. This patch corrects the problem by advancing the sender's next
sequence number by one as needed when sending a zero window probe.