Any malicous segment could contain a SYN up to now (no check).
A SYN in the wrong segment could break OOSEQ queueing.
Fix this by allowing SYN only in states where it is required.
See bug #56397: Assert "tcp_receive: ooseq tcplen > rcv_wnd"
Signed-off-by: Simon Goldschmidt <goldsimon@gmx.de>
According to rfc5681:
https://tools.ietf.org/html/rfc5681
Paragraph 3.2. Fast Retransmit/Fast Recovery
The TCP sender SHOULD use the "fast retransmit" algorithm to detect
and repair loss, based on incoming duplicate ACKs. The fast
retransmit algorithm uses the arrival of 3 duplicate ACKs (as defined
in section 2, without any intervening ACKs which move SND.UNA) as an
indication that a segment has been lost. After receiving 3 duplicate
ACKs, TCP performs a retransmission of what appears to be the missing
segment, without waiting for the retransmission timer to expire.
Now consider the following scenario:
Server sends packets to client P0, P1, P2 .. PK.
Client sends packets to server P`0 P`1 ... P`k.
I.e. it is a pipelined conversation. Now lets assume that P1 is lost, Client will
send an empty "duplicate" ack upon receive of P2, P3... In addition client will
also send a new packet with "Client Data", P`0 P`1 .. e.t.c. according to sever receive
window and client congestion window.
Current implementation resets "duplicate" ack count upon receive of packets from client
that holds new data. This in turn prevents server from fast recovery upon 3-duplicate acks
receive.
This is not required as in this case "sender unacknowledged window" is not moving.
Signed-off-by: Solganik Alexander <sashas@lightbitslabs.com>
This hook is called from tcp_input() for all kinds of input pcbs when
selected to receive a pbuf (LISTEN, TIME_WAIT, rest). I can parse or
drop an rx pbuf.
Signed-off-by: goldsimon <goldsimon@gmx.de>
This introduces the concept of ext (external/extended) arguments per
tcp_pcb (also for listening pcbs) to store more data than just one
"void *arg" per pcb. The "arg" is for use to applications, whereas
the ext_args may be used by frameworks and leave "arg" untouched.
In addition to a void pointer, callbacks are added to help frameworks
migrate arguments from listen pcb to connection pcb and to free args
when the pcb is freed.
Signed-off-by: goldsimon <goldsimon@gmx.de>
This should make it easier to add debugging messages or other hooks
to the point where tcp pcbs are deallocated.
Signed-off-by: goldsimon <goldsimon@gmx.de>
Some systems need to take into account an RX buffer pool size when
advising an appropriate number of RX pbufs to queue on the ooseq
list. For some systems there is a practical hard limit beyond which
the rx pool becomes exhausted blocking reception of further buffers
until some are freed.
It also helps to be able to consider the available dynamic memory when
advising an appropriate maximum number of bytes to buffer on the ooseq
list.
These decisions can also benefit from knowing the number already
allocated on a particular pcb, so the ooseq tcp segement is passed to
these functions. For example, if the system only wants to allow the
total number of rx pbufs queued on all the ooseq lists to grow by one
and a pcb already has two then it can return three for this call, but
might return one for another call - supporting a greedy allocation
strategy.
Signed-off-by: goldsimon <goldsimon@gmx.de>
TCP SACKs were removed after some changes in the ooseq queue,
but before all unneeded packets were removed from it.
Because of that, we would sometimes include SACKs
for data already delivered in-order.
Signed-off-by: goldsimon <goldsimon@gmx.de>
This problem would appear to have only affected systems with multiple
interfaces. It was noted causing tcp resets when the pcb was lost, and there
might have been other associated problems.
Signed-off-by: Dirk Ziegelmeier <dirk@ziegelmeier.net>
This re-works the persist timer to have the following behavior:
1) Only start persist timer when a buffered segment doesn't fit within
the current window and there is no in-fligh data. Previously, the
persist timer was always started when the window went to zero even
if there was no buffered data (since timer was managed in receive
pathway rather than transmit pathway)
2) Upon first fire of persist timer, fill the remaining window if
non-zero by splitting the unsent segment. If split segment is sent,
persist timer is stopped, RTO timer is now ensuring reliable window
updates
3) If window is already zero when persist timer fires, send 1 byte probe
4) Persist timer and zero window probe should only be active when the
following are true:
* no in-flight data (pcb->unacked == NULL)
* when there is buffered data (pcb->unsent != NULL)
* when pcb->unsent->len > pcb->snd_wnd
- add a better-documented static function tcp_output_segment_busy
- try to reduce the number of checks
- tcp_rexmit_rto: iterate pcb->unacked only once
- no need to check for ref==1 in tcp_rexmit_fast when tcp_rexmit does
- call tcp_rexmit_fast if dupacks >= 3 (not == 3) and use TF_INFR flag to guard the fast-rexmit case (that way, it's triggered again on the next dupack)
Adds partial support for selective acknowledgements (RFC 2018).
This change makes lwIP negotiate SACK support, and include SACK
data in outgoing empty ACK packets. It does not include it
in outgoing packets with data payload.
It also does not add support for handling incoming SACKs.
Signed-off-by: goldsimon <goldsimon@gmx.de>
Changes for TCP Appropriate Byte Counting introduce a potential cwnd
rollover by not taking into account integer promotion on tcpwnd_size_t
during inequality comparisions
This fixes the issue by introducing a macro TCP_WND_INC which detects
the rollover correctly and now holds the tcpwnd_size_t at the maximum
value rather than not incrementing the window. This provides a slight
performance improvement by allowing full use of the tcpwnd_size_t number
space for the congestion window
This commit adds a timeout to the zero-window probing (persist timer)
mechanism. LwIP has not historically had a timeout for the persist
timer, leading to unbounded blocking if connection drops during the
zero-window condition
This commit also adds two units test, one to check the RTO timeout
and a second to check the zero-window probe timeout
This commit adds TCP Appropriate Byte Counting (ABC) support based on
RFC 3465
ABC replaces the previous congestion window growth mechanism and has been
configured with limit of 2 SMSS. See task #14128 for discussion on
defaults, but the goal is to mitigate the performance impact of delayed
ACKs on congestion window growth
This commit also introduces a mechanism to track when the stack is
undergoing a period following an RTO where data is being retransmitted.
Lastly, this adds a unit test to verify RTO period tracking and some
basic ABC cwnd checking
This commit corrects what looks like an ancient incorrect organization
of the logic for processing an ACK which acks new data. Once moved,
we can also change to using TCP_SEQ_LEQ on ackno instead of TCP_BETWEEN
because ackno has already been checked against snd_nxt
The work of checking the unsent queue and updating pcb->snd_buf (both
steps required for new data ACK) should be located under the conditional
that checks TCP_SEQ_BETWEEN(ackno, pcb->lastack+1, pcb->snd_nxt)
The comment following the unsent queue check/pcb->snd_buf update even
indicates "End of ACK for new data processing" when the logic is clearly
outside of this check
From what I can tell, this mis-organization isn't causing any incorrect
behavior since the unsent queue checked that ackno was between start of
segment and snd_nxt and recv_acked would be 0 during pcb->snd_buf update.
Instead this is waisted work for duplicate ACKS (can be common) and other
old ACKs