TCP SACKs were removed after some changes in the ooseq queue,
but before all unneeded packets were removed from it.
Because of that, we would sometimes include SACKs
for data already delivered in-order.
Signed-off-by: goldsimon <goldsimon@gmx.de>
This problem would appear to have only affected systems with multiple
interfaces. It was noted causing tcp resets when the pcb was lost, and there
might have been other associated problems.
Signed-off-by: Dirk Ziegelmeier <dirk@ziegelmeier.net>
There were a couple cases in-between that could cause an exit from
tcp_output which don't use useg. With large send buffers, pcb->unacked
may be large and calculating useg is wasted in these exit cases
Some compilers may be re-ordering this already, but it doesn't hurt to
correctly arrange the code
This re-works the persist timer to have the following behavior:
1) Only start persist timer when a buffered segment doesn't fit within
the current window and there is no in-fligh data. Previously, the
persist timer was always started when the window went to zero even
if there was no buffered data (since timer was managed in receive
pathway rather than transmit pathway)
2) Upon first fire of persist timer, fill the remaining window if
non-zero by splitting the unsent segment. If split segment is sent,
persist timer is stopped, RTO timer is now ensuring reliable window
updates
3) If window is already zero when persist timer fires, send 1 byte probe
4) Persist timer and zero window probe should only be active when the
following are true:
* no in-flight data (pcb->unacked == NULL)
* when there is buffered data (pcb->unsent != NULL)
* when pcb->unsent->len > pcb->snd_wnd
lwip_itoa would output the number 0 as \0. This fixes the issue by
adding a special check before the normal conversion loop
This was found via shell cmd idxtoname and win32 port. "lo0" should
be returned for index 1
- add a better-documented static function tcp_output_segment_busy
- try to reduce the number of checks
- tcp_rexmit_rto: iterate pcb->unacked only once
- no need to check for ref==1 in tcp_rexmit_fast when tcp_rexmit does
- call tcp_rexmit_fast if dupacks >= 3 (not == 3) and use TF_INFR flag to guard the fast-rexmit case (that way, it's triggered again on the next dupack)
There is already a guard in tcp_output_segment() for a pbuf still being
referenced by the netif driver due to deferred transmission, however the callers
are modifying state even when this gives up.
It seems cleaner to have the callers guard this case and avoid modifying their
state.
tcp_rexmit_rto() might better avoid re-transmission of any segments if any of
the unacked segments are deferred, to avoid loading the link further if it is
struggling to flush its buffered writes. Link level queues can be limited on
some devices and need spares for link management.
- expose `altcp_tcp_setup()` so we can wrap altcp over existing tcp pcb.
- avoid calling tcp_close() with NULL pcb.
- free altcp_pcb struct when error callback called.
According to `mqtt_tcp_err_cb()` in src/apps/mqtt/mqtt.c, altcp socket should
work the same way than raw tcp socket. So freeing altcp_pcb ensure this.
The new functions both take size_t as increment/decrement argument instead of s16_t (which needed to be range-checked before conversion everywhere) - in most places, the direction (increment or decrement) is known anyway, so no need to encode it in a sign bit
In tcp_output() there were a number of blocks of code performing
duplicate checks of 'if (seg == NULL)'. This combines them together
to reduce duplicate checks
TCP_OUTPUT_DEBUG and TCP_CWND_DEBUG also don't need to be guarded
by #if/#endif since the LWIP_DEBUGF infrastructure already compiles them
out when LWIP_DEBUG is not defined
The ip6_frag.drop counter is updated before all the code paths calling
goto nullreturn, so let's move updating ip6_frag.drop stats to nullreturn.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
This fixes a couple of occurrences where the src and dst parameters to
ip4_route_src() were swapped. This was most likely due to confusion between
ip_route(src, dst) and ip4_route_src(dst, src)
This was found in a system where LWIP_IPV4_SRC_ROUTING is 0
The UDP case was an application socket bound to INADDR_ANY with
IP_MULTICAST_IF set. Transmits would result in calling ip4_route(dst) where
dst was pcb->local_addr (which was INADDR_ANY) instead of pcb->mcast_ip4.
This resulted in a routing failure
The ICMP issue was found through code analysis only
There were uses of dhcp_release() followed immediately by dhcp_discover() but
dhcp_release() now stops dhcp so discovery would fail, so call dhcp_start()
after release which restarts discovery.
Signed-off-by: Dirk Ziegelmeier <dirk@ziegelmeier.net>
According to commit 1f780e86d5 ("PPP timeouts required depend on the number of allowed PPP sessions"):
Per PPP needs 6 timeouts (AUTH + PAP|CHAP|EAP + LCP + IPCP + IP6CP + PPPoE).
So update the minimal MEMP_NUM_SYS_TIMEOUT setting check accordingly.
Since we have LWIP_NUM_SYS_TIMEOUT_INTERNAL so just switch to use it.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: goldsimon <goldsimon@gmx.de>
The ip6_addr_t structure may have an addition slot so is not necessarily
the size of an ipv6 address, so some uses of sizeof(ip6_addr_t) were not
correct.
Signed-off-by: goldsimon <goldsimon@gmx.de>
No need to have additional if statement for PBUF_REF/PBUF_ROM.
It can be merged to the existing swtich(type) cases.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Adds partial support for selective acknowledgements (RFC 2018).
This change makes lwIP negotiate SACK support, and include SACK
data in outgoing empty ACK packets. It does not include it
in outgoing packets with data payload.
It also does not add support for handling incoming SACKs.
Signed-off-by: goldsimon <goldsimon@gmx.de>
TCP timestamps were only sent if the remote side
requested it first. This enables the use of timestamps
for outgoing connections as well.
Signed-off-by: goldsimon <goldsimon@gmx.de>
Changes for TCP Appropriate Byte Counting introduce a potential cwnd
rollover by not taking into account integer promotion on tcpwnd_size_t
during inequality comparisions
This fixes the issue by introducing a macro TCP_WND_INC which detects
the rollover correctly and now holds the tcpwnd_size_t at the maximum
value rather than not incrementing the window. This provides a slight
performance improvement by allowing full use of the tcpwnd_size_t number
space for the congestion window
etharp_query() queues packets, instead of sending, if a relevant arp-request is
pending.
Code walks the packet (a pbuf chain) to determine whether any pbufs are marked
'volatile': If so, we cannot simply enqueue the packet, and instead allocate a
new pbuf from RAM, copying the original packet, and enqueueing this new pbuf.
The bug here is that the allocation refers to the tot_len field of a temp pbuf*,
'p', instead of the head, 'q'.
In the case where the first pbuf of the chain is non-volatile but the second pbuf
*is* volatile, then we'll request an allocation that uses the tot_len field of
the second pbuf. If the first pbuf is non-zero length, the allocated pbuf (chain)
will be too small to allow the copy.
Signed-off-by: goldsimon <goldsimon@gmx.de>
All callers pass pbuf_type to pbuf_init_alloced_pbuf(), so make it take
pbuf_type instead of u8_t.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: goldsimon <goldsimon@gmx.de>
This commit adds a timeout to the zero-window probing (persist timer)
mechanism. LwIP has not historically had a timeout for the persist
timer, leading to unbounded blocking if connection drops during the
zero-window condition
This commit also adds two units test, one to check the RTO timeout
and a second to check the zero-window probe timeout
Fix below build error if LWIP_IPV4 == 0.
cc -g -Wall -DLWIP_DEBUG -pedantic -Werror -Wparentheses -Wsequence-point -Wswitch-default -Wextra -Wundef -Wshadow -Wpointer-arith -Wcast-qual -Wc++-compat -Wwrite-strings -Wold-style-definition -Wcast-align -Wmissing-prototypes -Wredundant-decls -Wnested-externs -Wno-address -Wunreachable-code -Wuninitialized -Wlogical-op -I. -I../../.. -I../../../../lwip/src/include -I../../../ports/unix/port/include -I../../../../mbedtls/include -Wno-redundant-decls -DLWIP_HAVE_MBEDTLS=1 -c ../../../../lwip/src/core/netif.c
../../../../lwip/src/core/netif.c: In function ‘netif_add’:
../../../../lwip/src/core/netif.c:284:7: error: ‘ipaddr’ undeclared (first use in this function)
if (ipaddr == NULL) {
^~~~~~
../../../../lwip/src/core/netif.c:284:7: note: each undeclared identifier is reported only once for each function it appears in
../../../../lwip/src/core/netif.c:285:14: error: implicit declaration of function ‘ip_2_ip4’ [-Werror=implicit-function-declaration]
ipaddr = ip_2_ip4(IP4_ADDR_ANY);
^~~~~~~~
../../../../lwip/src/core/netif.c:285:5: error: nested extern declaration of ‘ip_2_ip4’ [-Werror=nested-externs]
ipaddr = ip_2_ip4(IP4_ADDR_ANY);
^~~~~~
../../../../lwip/src/core/netif.c:285:23: error: ‘IP4_ADDR_ANY’ undeclared (first use in this function)
ipaddr = ip_2_ip4(IP4_ADDR_ANY);
^~~~~~~~~~~~
../../../../lwip/src/core/netif.c:287:7: error: ‘netmask’ undeclared (first use in this function)
if (netmask == NULL) {
^~~~~~~
../../../../lwip/src/core/netif.c:290:7: error: ‘gw’ undeclared (first use in this function)
if (gw == NULL) {
^~
cc1: all warnings being treated as errors
../../Common.allports.mk:94: recipe for target 'netif.o' failed
make: *** [netif.o] Error 1
Fixes: 5967380c20 ("netif_add: avoid passing NULL pointers to subsequent functions")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Dirk Ziegelmeier <dirk@ziegelmeier.net>
Current code fails to allocate zero length pbuf (e.g. for PBUF_RAW PBUF_POOL),
fix it.
Fixes: eb269e61b5 ("First step to clean up pbuf implementation: add pbuf_alloc_reference() to allocate pbufs referencing external payload; move member initialization to common function; simplify PBUF_POOL chain allocator")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: goldsimon <goldsimon@gmx.de>
This fixes build error if LWIP_NETIF_TX_SINGLE_PBUF==1.
Fixes: dd811bca06 ("Fix bug #50694 (TX exist more pbufs after enable LWIP_NETIF_TX_SINGLE_PBUF) by not executing phase 2 for LWIP_NETIF_TX_SINGLE_PBUF==1")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
This commit adds TCP Appropriate Byte Counting (ABC) support based on
RFC 3465
ABC replaces the previous congestion window growth mechanism and has been
configured with limit of 2 SMSS. See task #14128 for discussion on
defaults, but the goal is to mitigate the performance impact of delayed
ACKs on congestion window growth
This commit also introduces a mechanism to track when the stack is
undergoing a period following an RTO where data is being retransmitted.
Lastly, this adds a unit test to verify RTO period tracking and some
basic ABC cwnd checking
while ((q != NULL) && (options[offset] != DHCP_OPTION_END) && (offset < offset_max)) {
should be
while ((q != NULL) && (offset < offset_max) && (options[offset] != DHCP_OPTION_END)) {
See https://jira.reactos.org/browse/CORE-8978 for more info.
This sets the pbuf's if_idx during the loopif poll function (the
equivalent netif input function). This was found during IP_PKTINFO
development where p->if_idx is read and was uninitialized
This commit corrects what looks like an ancient incorrect organization
of the logic for processing an ACK which acks new data. Once moved,
we can also change to using TCP_SEQ_LEQ on ackno instead of TCP_BETWEEN
because ackno has already been checked against snd_nxt
The work of checking the unsent queue and updating pcb->snd_buf (both
steps required for new data ACK) should be located under the conditional
that checks TCP_SEQ_BETWEEN(ackno, pcb->lastack+1, pcb->snd_nxt)
The comment following the unsent queue check/pcb->snd_buf update even
indicates "End of ACK for new data processing" when the logic is clearly
outside of this check
From what I can tell, this mis-organization isn't causing any incorrect
behavior since the unsent queue checked that ackno was between start of
segment and snd_nxt and recv_acked would be 0 during pcb->snd_buf update.
Instead this is waisted work for duplicate ACKS (can be common) and other
old ACKs
altcp is an abstraction layer that prevents applications linking against the
tcp.h functions but provides the same functionality. It is used to e.g. add
SSL/TLS or proxy-connect support to an application written for the tcp callback
API without that application knowing the protocol details.
Applications written against the altcp API are directly linked against the
tcp callback API for LWIP_ALTCP==0, but then cannot use layered protocols.
If a locally generated TCP SYN packet is replied to with an ACK
packet, lwIP immediately sends a RST packet followed by resending the
SYN packet. This is expected, but on loopback interfaces the resent
SYN packet may immediately get another ACK reply, typically when the
other endpoint is in TIME_WAIT state (which ignores the RSTs). The
result is an endless loop of SYN, ACK, RST packets.
This patch applies the normal SYN retransmission limit in this
scenario, such that the endless loop is limited to a brief storm.
This commit changes ssthresh to be the largest effective congestion
window (amount of in-flight data). This follows the guidance of RFC
5681 which recommends setting ssthresh arbitrarily high.
LwIP was previously using the receive window value at the end of the
3-way handshake and in the case of an active open where the receiver
used window scaling and/or window auto-tuning, this resulted in a very
small ssthresh value even though the window ramped up once the connection
was established
Create new function dhcp_release_and_stop() that stops DHCP statemachine and sends release message if needed. Also stops AUTOIP if in coop mode.
Old dhcp_release() and dhcp_stop() function internally call dhcp_release_and_stop() now.
lwIP aims to support zero-copy TX, and thus, must internally handle
all cases that pbufs are referenced rather than copied upon low-level
output. However, in the current situation, the arp/ndp packet queuing
routines conservatively copy entire packets, even when unnecessary in
cases where lwIP is used in a zero-copy compliant manner. This patch
moves the decision whether to copy into a centralized macro, allowing
zero-copy compliant applications to override the macro to avoid the
unnecessary copies. The macro defaults to the safe behavior, though.
The patch simply copies the relevant bits from the UDP implementation.
Perhaps most notably, the patch does *not* copy the IPv4-only UDP
support for IP_MULTICAST_IF, because that option can also be
implemented using the interface index based approach. Largely thanks
to this omission, at least on 32-bit platforms, this patch does not
increase the RAW PCB size at all.
So far, the UDP core module implemented only IPv4 multicast support.
This patch extends the module with the features necessary for socket
layers on top to implement IPv6 multicast support as well:
o If a UDP PCB is bound to an IPv6 multicast address, a unicast source
address is selected and used to send the packet instead, as is
required (and was the case for IPv4 multicast already).
o Unlike IPv4's IP_MULTICAST_IF socket option, which takes a source
IPv4 address, the IPV6_MULTICAST_IF socket option (from RFC 3493)
takes an interface identifier to denote the interface to use for
outgoing multicast-destined packets. A new pair of UDP PCB API
calls, udp_[gs]et_multicast_netif_index(), are added to support
this. The new definition "NETIF_NO_INDEX" may be used to indicate
that lwIP should pick an interface instead.
IPv4 socket implementations may now also choose to map the given
source address to an interface index immediately and use the new
facility instead of the old udp_[gs]et_multicast_netif_addr() one.
A side effect of limiting the old facility to IPv4 is that for dual-
stack configurations with multicast support, the UDP PCB size is
reduced by (up to) 16 bytes.
o For configurations that enable loopback interface support, the IPv6
code now also supports multicast loopback (IPV6_MULTICAST_LOOP).
o The LWIP_MULTICAST_TX_OPTIONS opt.h setting now covers both IPv4
and IPv6, and as such is no longer strictly linked to IGMP. It is
therefore placed in its own lwIP options subgroup in opt.h.
The IPV6_MULTICAST_HOPS socket option can already be implemented using
the existing IP_MULTICAST_TTL support, and thus requires no additional
changes. Overall, this patch should not break any existing code.
If LWIP_CALLBACK_API is not defined, but TCP_LISTEN_BACKLOG is, then
the LWIP_EVENT_ACCEPT TCP event may be triggered for closed listening
sockets. This case is just as disastrous for the event API as it is
for the callback API, as there is no way for the event hook to tell
whether the listening PCB is still around. Add the same protection
against this case for TCP_LISTEN_BACKLOG as was already in place for
LWIP_CALLBACK_API.
Also remove one NULL check for LWIP_CALLBACK_API that had already
become redundant for all callers, making the TCP_EVENT_ACCEPT code
for that callback wrapper more in line with the rest of the wrappers.
This commit introduces a sockets_priv.h header for socket API internal
implementations intended to be used by sockets API C files, but not
applications
This commit moves struct lwip_setgetsockopt_data to the private header
because this is not part of the public sockets API, but needs to be
shared between sockets.c and memp.c
This header lays ground work for sharing other internal sockets types
/macros between API files (sockets.c and if_api.c)
Previously, on netifs with unrestricted MTUs (typically loopback
interfaces), it was possible to give a packet to the UDP/RAW API
calls that is so large that when prepending headers, the pbuf's
tot_len field would overflow. This could easily result in
undesirable behavior at lower layers, e.g. a crash when copying
the packet for later delivery.
This patch models such overflows as memory allocation errors, thus
resulting in clean failures. Checks have to be added in multiple
places to cover (hopefully) all cases.
Having the variable namining ret for a pointer makes the code looks odd,
ret looks like a value variable. Rename ret to pcb.
Also simplify the code in the do {} while() loop.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Couple of more cleanups for task #14314 involving includes:
1) if.h name should match if_api.c due to LwIP convention and history.
Standard if.h include can be used with compatibility header in
posix/net/if.h
2) API header (if.h) should not be included in core code. This include
has been eliminated by moving the definition of IF_NAMESIZE to
netif.h as NETIF_NAMESIZE. This is now the canonical definition
and IF_NAMESIZE just maps to it to provide the standard type
Now that tcp_connect() always determines the outgoing netif with a
route lookup, we can compute the effective MSS without doing the same
route lookup again. The outgoing netif is already known from one
other location that computes the MSS, so we can eliminate a redundant
route lookup there too. Reduce some macro clutter as a side effect.
This patch adds full support for IPv6 address scopes, thereby aiming
to be compliant with IPv6 standards in general and RFC 4007 in
particular. The high-level summary is that link-local addresses are
now meaningful only in the context of their own link, guaranteeing
full isolation between links (and their addresses) in this respect.
This isolation even allows multiple interfaces to have the same
link-local addresses locally assigned.
The implementation achieves this by extending the lwIP IPv6 address
structure with a zone field that, for addresses that have a scope,
carries the scope's zone in which that address has meaning. The zone
maps to one or more interfaces. By default, lwIP uses a policy that
provides a 1:1 mapping between links and interfaces, and considers
all other addresses unscoped, corresponding to the default policy
sketched in RFC 4007 Sec. 6. The implementation allows for replacing
the default policy with a custom policy if desired, though.
The lwIP core implementation has been changed to provide somewhat of
a balance between correctness and efficiency on on side, and backward
compatibility on the other. In particular, while the application would
ideally always provide a zone for a scoped address, putting this in as
a requirement would likely break many applications. Instead, the API
accepts both "properly zoned" IPv6 addresses and addresses that, while
scoped, "lack" a zone. lwIP will try to add a zone as soon as possible
for efficiency reasons, in particular from TCP/UDP/RAW PCB bind and
connect calls, but this may fail, and sendto calls may bypass that
anyway. Ultimately, a zone is always added when an IP packet is sent
when needed, because the link-layer lwIP code (and ND6 in particualar)
requires that all addresses be properly zoned for correctness: for
example, to provide isolation between links in the ND6 destination
cache. All this applies to packet output only, because on packet
input, all scoped addresses will be given a zone automatically.
It is also worth remarking that on output, no attempt is made to stop
outgoing packets with addresses for a zone not matching the outgoing
interface. However, unless the application explicitly provides
addresses that will result in such zone violations, the core API
implementation (and the IPv6 routing algorithm in particular) itself
will never take decisions that result in zone violations itself.
This patch adds a new header file, ip6_zone.h, which contains comments
that explain several implementation aspects in a bit more detail.
For now, it is possible to disable scope support by changing the new
LWIP_IPV6_SCOPES configuration option. For users of the core API, it
is important to note that scoped addresses that are locally assigned
to a netif must always have a zone set; the standard netif address
assignment functions always do this on behalf of the caller, though.
Also, core API users will want to enable LWIP_IPV6_SCOPES_DEBUG at
least initially when upgrading, to ensure that all addresses are
properly initialized.
The tests were in to catch user errors, but they seem to get in the way of application programming :-)
The checks in *_send() remain active to catch when PCB source and destination address types do not match