This commit adds TCP Appropriate Byte Counting (ABC) support based on
RFC 3465
ABC replaces the previous congestion window growth mechanism and has been
configured with limit of 2 SMSS. See task #14128 for discussion on
defaults, but the goal is to mitigate the performance impact of delayed
ACKs on congestion window growth
This commit also introduces a mechanism to track when the stack is
undergoing a period following an RTO where data is being retransmitted.
Lastly, this adds a unit test to verify RTO period tracking and some
basic ABC cwnd checking
while ((q != NULL) && (options[offset] != DHCP_OPTION_END) && (offset < offset_max)) {
should be
while ((q != NULL) && (offset < offset_max) && (options[offset] != DHCP_OPTION_END)) {
See https://jira.reactos.org/browse/CORE-8978 for more info.
This sets the pbuf's if_idx during the loopif poll function (the
equivalent netif input function). This was found during IP_PKTINFO
development where p->if_idx is read and was uninitialized
This commit corrects what looks like an ancient incorrect organization
of the logic for processing an ACK which acks new data. Once moved,
we can also change to using TCP_SEQ_LEQ on ackno instead of TCP_BETWEEN
because ackno has already been checked against snd_nxt
The work of checking the unsent queue and updating pcb->snd_buf (both
steps required for new data ACK) should be located under the conditional
that checks TCP_SEQ_BETWEEN(ackno, pcb->lastack+1, pcb->snd_nxt)
The comment following the unsent queue check/pcb->snd_buf update even
indicates "End of ACK for new data processing" when the logic is clearly
outside of this check
From what I can tell, this mis-organization isn't causing any incorrect
behavior since the unsent queue checked that ackno was between start of
segment and snd_nxt and recv_acked would be 0 during pcb->snd_buf update.
Instead this is waisted work for duplicate ACKS (can be common) and other
old ACKs
altcp is an abstraction layer that prevents applications linking against the
tcp.h functions but provides the same functionality. It is used to e.g. add
SSL/TLS or proxy-connect support to an application written for the tcp callback
API without that application knowing the protocol details.
Applications written against the altcp API are directly linked against the
tcp callback API for LWIP_ALTCP==0, but then cannot use layered protocols.
If a locally generated TCP SYN packet is replied to with an ACK
packet, lwIP immediately sends a RST packet followed by resending the
SYN packet. This is expected, but on loopback interfaces the resent
SYN packet may immediately get another ACK reply, typically when the
other endpoint is in TIME_WAIT state (which ignores the RSTs). The
result is an endless loop of SYN, ACK, RST packets.
This patch applies the normal SYN retransmission limit in this
scenario, such that the endless loop is limited to a brief storm.
This commit changes ssthresh to be the largest effective congestion
window (amount of in-flight data). This follows the guidance of RFC
5681 which recommends setting ssthresh arbitrarily high.
LwIP was previously using the receive window value at the end of the
3-way handshake and in the case of an active open where the receiver
used window scaling and/or window auto-tuning, this resulted in a very
small ssthresh value even though the window ramped up once the connection
was established
Create new function dhcp_release_and_stop() that stops DHCP statemachine and sends release message if needed. Also stops AUTOIP if in coop mode.
Old dhcp_release() and dhcp_stop() function internally call dhcp_release_and_stop() now.
lwIP aims to support zero-copy TX, and thus, must internally handle
all cases that pbufs are referenced rather than copied upon low-level
output. However, in the current situation, the arp/ndp packet queuing
routines conservatively copy entire packets, even when unnecessary in
cases where lwIP is used in a zero-copy compliant manner. This patch
moves the decision whether to copy into a centralized macro, allowing
zero-copy compliant applications to override the macro to avoid the
unnecessary copies. The macro defaults to the safe behavior, though.
The patch simply copies the relevant bits from the UDP implementation.
Perhaps most notably, the patch does *not* copy the IPv4-only UDP
support for IP_MULTICAST_IF, because that option can also be
implemented using the interface index based approach. Largely thanks
to this omission, at least on 32-bit platforms, this patch does not
increase the RAW PCB size at all.
So far, the UDP core module implemented only IPv4 multicast support.
This patch extends the module with the features necessary for socket
layers on top to implement IPv6 multicast support as well:
o If a UDP PCB is bound to an IPv6 multicast address, a unicast source
address is selected and used to send the packet instead, as is
required (and was the case for IPv4 multicast already).
o Unlike IPv4's IP_MULTICAST_IF socket option, which takes a source
IPv4 address, the IPV6_MULTICAST_IF socket option (from RFC 3493)
takes an interface identifier to denote the interface to use for
outgoing multicast-destined packets. A new pair of UDP PCB API
calls, udp_[gs]et_multicast_netif_index(), are added to support
this. The new definition "NETIF_NO_INDEX" may be used to indicate
that lwIP should pick an interface instead.
IPv4 socket implementations may now also choose to map the given
source address to an interface index immediately and use the new
facility instead of the old udp_[gs]et_multicast_netif_addr() one.
A side effect of limiting the old facility to IPv4 is that for dual-
stack configurations with multicast support, the UDP PCB size is
reduced by (up to) 16 bytes.
o For configurations that enable loopback interface support, the IPv6
code now also supports multicast loopback (IPV6_MULTICAST_LOOP).
o The LWIP_MULTICAST_TX_OPTIONS opt.h setting now covers both IPv4
and IPv6, and as such is no longer strictly linked to IGMP. It is
therefore placed in its own lwIP options subgroup in opt.h.
The IPV6_MULTICAST_HOPS socket option can already be implemented using
the existing IP_MULTICAST_TTL support, and thus requires no additional
changes. Overall, this patch should not break any existing code.
If LWIP_CALLBACK_API is not defined, but TCP_LISTEN_BACKLOG is, then
the LWIP_EVENT_ACCEPT TCP event may be triggered for closed listening
sockets. This case is just as disastrous for the event API as it is
for the callback API, as there is no way for the event hook to tell
whether the listening PCB is still around. Add the same protection
against this case for TCP_LISTEN_BACKLOG as was already in place for
LWIP_CALLBACK_API.
Also remove one NULL check for LWIP_CALLBACK_API that had already
become redundant for all callers, making the TCP_EVENT_ACCEPT code
for that callback wrapper more in line with the rest of the wrappers.
This commit introduces a sockets_priv.h header for socket API internal
implementations intended to be used by sockets API C files, but not
applications
This commit moves struct lwip_setgetsockopt_data to the private header
because this is not part of the public sockets API, but needs to be
shared between sockets.c and memp.c
This header lays ground work for sharing other internal sockets types
/macros between API files (sockets.c and if_api.c)
Previously, on netifs with unrestricted MTUs (typically loopback
interfaces), it was possible to give a packet to the UDP/RAW API
calls that is so large that when prepending headers, the pbuf's
tot_len field would overflow. This could easily result in
undesirable behavior at lower layers, e.g. a crash when copying
the packet for later delivery.
This patch models such overflows as memory allocation errors, thus
resulting in clean failures. Checks have to be added in multiple
places to cover (hopefully) all cases.
Having the variable namining ret for a pointer makes the code looks odd,
ret looks like a value variable. Rename ret to pcb.
Also simplify the code in the do {} while() loop.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Couple of more cleanups for task #14314 involving includes:
1) if.h name should match if_api.c due to LwIP convention and history.
Standard if.h include can be used with compatibility header in
posix/net/if.h
2) API header (if.h) should not be included in core code. This include
has been eliminated by moving the definition of IF_NAMESIZE to
netif.h as NETIF_NAMESIZE. This is now the canonical definition
and IF_NAMESIZE just maps to it to provide the standard type
Now that tcp_connect() always determines the outgoing netif with a
route lookup, we can compute the effective MSS without doing the same
route lookup again. The outgoing netif is already known from one
other location that computes the MSS, so we can eliminate a redundant
route lookup there too. Reduce some macro clutter as a side effect.
This patch adds full support for IPv6 address scopes, thereby aiming
to be compliant with IPv6 standards in general and RFC 4007 in
particular. The high-level summary is that link-local addresses are
now meaningful only in the context of their own link, guaranteeing
full isolation between links (and their addresses) in this respect.
This isolation even allows multiple interfaces to have the same
link-local addresses locally assigned.
The implementation achieves this by extending the lwIP IPv6 address
structure with a zone field that, for addresses that have a scope,
carries the scope's zone in which that address has meaning. The zone
maps to one or more interfaces. By default, lwIP uses a policy that
provides a 1:1 mapping between links and interfaces, and considers
all other addresses unscoped, corresponding to the default policy
sketched in RFC 4007 Sec. 6. The implementation allows for replacing
the default policy with a custom policy if desired, though.
The lwIP core implementation has been changed to provide somewhat of
a balance between correctness and efficiency on on side, and backward
compatibility on the other. In particular, while the application would
ideally always provide a zone for a scoped address, putting this in as
a requirement would likely break many applications. Instead, the API
accepts both "properly zoned" IPv6 addresses and addresses that, while
scoped, "lack" a zone. lwIP will try to add a zone as soon as possible
for efficiency reasons, in particular from TCP/UDP/RAW PCB bind and
connect calls, but this may fail, and sendto calls may bypass that
anyway. Ultimately, a zone is always added when an IP packet is sent
when needed, because the link-layer lwIP code (and ND6 in particualar)
requires that all addresses be properly zoned for correctness: for
example, to provide isolation between links in the ND6 destination
cache. All this applies to packet output only, because on packet
input, all scoped addresses will be given a zone automatically.
It is also worth remarking that on output, no attempt is made to stop
outgoing packets with addresses for a zone not matching the outgoing
interface. However, unless the application explicitly provides
addresses that will result in such zone violations, the core API
implementation (and the IPv6 routing algorithm in particular) itself
will never take decisions that result in zone violations itself.
This patch adds a new header file, ip6_zone.h, which contains comments
that explain several implementation aspects in a bit more detail.
For now, it is possible to disable scope support by changing the new
LWIP_IPV6_SCOPES configuration option. For users of the core API, it
is important to note that scoped addresses that are locally assigned
to a netif must always have a zone set; the standard netif address
assignment functions always do this on behalf of the caller, though.
Also, core API users will want to enable LWIP_IPV6_SCOPES_DEBUG at
least initially when upgrading, to ensure that all addresses are
properly initialized.
The tests were in to catch user errors, but they seem to get in the way of application programming :-)
The checks in *_send() remain active to catch when PCB source and destination address types do not match
This commit cleans up the remaining instance of global variable
"index" shadowing caused by using local variables and function
parameters named "index"
These were introduced in the recent interface index API commits
Adjusts assert logic from 9c80a66253
to allow for a netif driver's init callback to manually override
the number. When the init function is taking care of the unique
assignment, the assert simply checks that a valid number was provided
This commit adds an LWIP_ASSERT to detect when netif_num overflows and
we no longer have unique numbers per netif. Unique netif numbers are
needed to support interface indexes (task #14314)
The only cases where this could occur are with a deployment that attempts
to use the maximum 256 netifs at the same time or where netifs are being
constantly adding and removed. Neither of these use cases fit the
lightweight goals of LwIP
See discussion in task #14314 for more details
- Code duplication with etharp_raw()
- No great effect on perfomance
- May make reworking PBUF handling code more complicated (see bug #49914)
- The check for p->type == PBUF_REF is a strange special case, too
- Simon also voted to remove it
../../../../lwip/src/core/ipv6/ip6_frag.c: In function ‘ip6_reass’:
../../../../lwip/src/core/ipv6/ip6_frag.c:567:7: error: ISO C90 forbids mixed declarations and code [-Werror=pedantic]
Eliminate ETHADDR32_COPY macro - it cannot be used in ETH_PAD_SIZE case. I could have kept it by defining it to ETHADDR16_COPY in case of ETH_PAD_SIZE, but I did not consider it worth another #ifdef mess.
Fix below compile error:
../../../../lwip/src/core/ipv6/ip6_frag.c: In function ‘ip6_reass’:
../../../../lwip/src/core/ipv6/ip6_frag.c:533:20: error: declaration of ‘next_pbuf’ shadows a previous local [-Werror=shadow]
struct pbuf* next_pbuf = iprh->next_pbuf;
^~~~~~~~~
../../../../lwip/src/core/ipv6/ip6_frag.c:272:20: note: shadowed declaration is here
struct pbuf *q, *next_pbuf;
^~~~~~~~~
cc1: all warnings being treated as errors
../Common.mk:93: recipe for target 'ip6_frag.o' failed
make: *** [ip6_frag.o] Error 1
Fixes: 7cedf7ae71 ("IPv6: fragment reassembly fixes")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
This patch aims to fix three closely related issues.
o The implementation of IPV6_FRAG_COPYHEADER was fundamentally
incompatible with the presence of extension headers between the
IPv6 header and the Fragment Header. This patch changes the
implementation to support such extension headers as well, with
pretty much the same memory requirements. As a result, we can
remove the check that prevented such packets from being reassembled
in all cases, even with IPV6_FRAG_COPYHEADER off.
o Given that temporary data is stored in the Fragment Header of
packets saved for the purpose of reassembly, but ICMPv6 "Fragment
Reassembly Time Exceeded" packets contain part of the original
packet, such ICMPv6 packets could actually end up containing part
of the temporary data, which may even include a pointer value. The
ICMPv6 packet should contain the original, unchanged packet, so
save the original header data before overwriting it even if
IPV6_FRAG_COPYHEADER is disabled. This does add some extra memory
consumption.
o Previously, the reassembly would leave the fragment header in the
reassembled packet, which is not permitted by RFC 2460 and prevents
reassembly of particularly large packets (close to 65535 bytes
after reassembly). This patch gets rid of the fragment header. It
does require an implementation of memmove() for that purpose.
Note that this patch aims to improve correctness. Future changes
might restore some of the previous functionality in order to regain
optimal performance for certain cases (at the cost of more code).
As per RFC requirements, upon removing a router from the default
router list, remove any entries pointing to it from the destination
cache. While here, synchronize timing out entries in the default
router list with the rest of the timer code.
When removing a netif, clear the destination cache altogether
in order to prevent more general inconsistency. When this happens,
the entries for other netifs will have to be rebuilt, but removing
netifs should be sufficiently rare that this is not worth optimizing.
The current ND implementation does not yet implement the most basic
required ('MUST') checks for message validation and generation.
- implement some of the required checks for message validation;
- document the remaining missing message validation checks;
- hardcode the hop limit of Neighbor Discovery messages rather than
having it depend on lwIP configuration which, if changed, would
cause all of ND to cease working.
The introduction of address lifetimes also means that lwIP correctly
supports transitions between PREFERRED and DEPRECATED address states,
and that means that the source address selection must be changed to
take this into account. Adding this feature to the previous algorithm
would have resulted in a mess, so this patch rewrites the algorithm to
stay close to the rules described in RFC 6724 (formerly 3484) Sec. 5.
This yields the following changes:
- Rule 2 ("prefer appropriate scope") is now fully implemented, most
importantly allowing larger-scope addresses to be picked if no
smaller-scope addresses are available (e.g., a global address may
now be used to connect to a unique-local address);
- Rule 3 ("avoid deprecated addresses") is now also fully implemented;
- unknown-scope addresses are also supported, with lowest priority;
- the link between the prescribed rules and the actual algorithm is
made much more explicit, hopefully allowing future improvements to
be made more easily.
For reasons explained in comments, one previous deviation from the RFC
on Rule 2 is retained for now.
As laid out in RFC 5942, the assumption that a dynamically assigned
(SLAAC/DHCPv6) address implies an on-link subnet, is wrong. lwIP does
currently make that assumption, routing packets according to local
address subnets rather than the on-link prefix list. The result is
that packets may not make it to their destination due to incorrect
routing decisions.
This patch changes the routing algorithms to be (more) compliant with
RFC 5942, by implementing the following new routing policies:
- all routing decisions check the on-link prefix list first, and
select a default router for off-link routing only if there is no
matching entry in the on-link prefix list;
- dynamically assigned addresses (from address autoconfiguration) are
considered /128 assignments, and thus, no routing decisions are taken
based on matches against their (/64) subnet anymore;
- more generally, all addresses that have a lifetime are considered
dynamically assigned and thus of size /128, which is the required
behavior for externally implemented SLAAC clients and DHCPv6;
- statically assigned (i.e., manually configured) addresses are still
considered /64 assignments, and thus, their associated subnet is
considered for routing decisions, in order to behave as generally
expected by end users and to retain backward compatibility;
- the link-local address in IPv6 address slot #0 is considered static
and thus has no lifetime and an implied /64 subnet, although link-
local routing is currently always handled separately anyway.
IPv6 source address selection is kept as is, as the subnet tests in
the algorithm serve as poor man's longest-common-prefix equivalent
there (RFC 6724 Sec. 5, Rule 8).
Previously, IPv6 routing could select a next-hop router on a netif
that was down or disconnected, potentially resulting in packets being
dropped unnecessarily. This patch changes router selection to take
into account the state of the router's associated netif, eliminating
such unnecessary packet loss.
Also, this patch fixes the test for router validity, which was
erroneously based on the router's invalidation timer rather than its
neighbor cache entry state. Given that an expired router has no
associated neighbor cache entry, no invalid routers would previously
ever be returned.
Finally, this patch also adds round-robin selection of routers that
are not known to be reachable or probably reachable, as per RFC 4861
Sec. 6.3.6 point (2). Support for this feature was partially present
but not actually functional.
For applications that use NETIF_STATUS_CALLBACK to help keep track of
extra per-address shadow state of IPv6 addresses, even in the light of
autogenerated addresses (which may "spontaneously" appear/disappear),
state transitions between tentative, duplicated, and invalid are
important as well. Therefore, invoke the status callback for all such
state transitions. Continue to filter out state changes between
various levels of progress of the tentative state, though.
Previously, Duplicate Address Detection (DAD) would work only for the
link-local address. For DAD-spawned Neighbor Solicitation requests for
any other address, the request would use the link-local address as the
source, meaning the other side would send a targeted reply (RFC 4861
Sec. 7.2.4). However, the nd6 implementation currently does not
consider targeted replies for DAD--even though technically an RFC 4862
Sec. 5.4.4 violation--supposedly because no real-world scenario could
trigger that case. The combination of these factors resulted in DAD
being entirely ineffective for non-link-local addresses.
This patch forces all DAD-spawned Neighbor Solicitation packets to use
the unspecified ('any') address as source, as per RFC 4862 Sec. 5.4.2.
As a result, other nodes would reply with multicast replies, for which
there is appropriate DAD checking code.
The patch also makes a slight rearrangement of statements such that
MLD join messages are sent before the NS packets, rather than after.
In the cases that nd6 checks whether the interface is up before
sending a packet, also check whether the link is up. Without this
additional check, temporary link downtime could easily result in
unnecessary false negatives for Duplicate Address Detection.
In addition, use the netif abstraction macros to perform the checks.
In summary, this patch aims to resolve bugs #47923 and #48162, by
decoupling address autoconfiguration from the on-link prefix list,
since those are not related. Important necessary changes are needed
to meet this goal, ultimately bringing the lwIP ND6 implementation
closer to compliance with RFC 4862. The main changes are:
1. support for address lifetimes, and,
2. addition of a new DUPLICATED address state.
The decoupling implies that the prefix list can no longer be used to
maintain state for address autoconfiguration. Most importantly, the
lifetime of each address, which was previously derived from the
prefix slot's lifetime, must now be associated with the address
itself. This patch implements address lifetime tracking, maintaining
both a valid and a preferred lifetime for each address, along with
the corresponding address state changes (e.g., between PREFERRED and
DEPRECATED), all as required by RFC 4862.
The support for address lifetimes can be enabled with a new
LWIP_IPV6_ADDRESS_LIFETIMES setting in lwipopts.h. It is required for
autoconfiguration and enabled by default if autoconfiguration is
enabled as well, but it may also be enabled separately, so as to allow
application-controlled lifetime management (e.g., if autoconfiguration
is implemented in a separate application). A special valid-lifetime of
zero is used to denote a static address--that is, an address that was
configured manually, that does not have lifetimes, and that should be
left alone by the autoconfiguration functionality. Addresses assigned
without setting a lifetime are deemed static, thus preserving
compatibility with existing lwIP-based applications in this respect.
Similarly, the decoupling implies that the prefix list can no longer
be used to remember cases of address duplication. Previously, the
detection of a duplicated address would simply result in removal of
the address altogether. Instead, this patch introduces a new state
"DUPLICATED", indicating that the address, while technically still
present, has been found to conflict with other node addresses, and no
attempt should be made to produce an autoconfiguration address for
that prefix.
Manually added addresses, including the link-local address, once set
to DUPLICATED, will remain in that state until manual intervention.
Autoconfigured DUPLICATED addresses will expire according to their
valid-lifetime, essentially preserving the current behavior but
without the use of the prefix list. As a first attempt to approach
compliance with RFC 4862 Sec. 5.4.5, if the link-local address is
detected to be duplicated, all derived addresses are marked duplicated
as well, and no new addresses will be autoconfigured. More work is to
be done for full compliance with that section, however.
Together, those two main changes indeed do fully decouple address
autoconfiguration from the on-link prefix list. Changes to the latter
thus no longer affect the former, resolving bug #47923. Moreover, as a
result, autoconfiguration can, and does, now also take place on
advertised prefixes that do not have the on-link flag set, resolving
bug #48162. The routing changes mentioned in the discussion of that
bug are left to a separate patch, though.
This patch adds a new RAW_FLAGS_HDRINCL flag to the raw core
implementation. When this flag is set on a RAW PCB, the raw send
routines expect the caller to supply an IP header for the given
packet, and will use that IP header instead of prepending one to
the packet themselves.
This feature allows the IP_HDRINCL socket option to be implemented
in higher layers with no further effort. Even thoguh that option is
traditionally supported for IPv4 sockets only (e.g., see RFC 3542
Sec. 3), the RAW_FLAGS_HDRINCL flag supports both IPv4 and IPv6, as
much of the lower-level infrastructure was already in place anyway.
Similar to the core UDP API, the new function may be used to implement
IPV6_PKTINFO (RFC 3542 Sec. 4), for example. This patch makes no
further functional changes; it merely moves code around a bit.
The support for connecting raw sockets is extended to match the
support for UDP sockets, while keeping the current API unchanged:
- for connected sockets, filter incoming packets on source address;
- use a flag to indicate whether a socket is connected, at no extra
memory cost; the application may check this flag if needed;
- added raw_disconnect(), which so far existed in documentation only.
The caller of tcp_listen_with_backlog_and_err() usually check if the return
pcb is NULL before checking the err reason. I think the commit adding
tcp_listen_with_backlog_and_err() accidently change the behavior, Fix it.
Fixes: 98fc82fa71 ("added function tcp_listen_with_backlog_and_err() to get the error reason when listening fails")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
If MLD support is enabled, each locally assigned IPv6 address in the
appropriate state must be a member of the solicited-node multicast
group corresponding to that address. Ensure that this is always the
case by (re-)deciding on the membership upon every address state
change. By doing so, this patch enforces that user-initiated state
changes to addresses (e.g., deletion) never cause a desynchronization
with the corresponding solicited-node multicast group membership,
thereby making such user-initiated state changes simpler and safer.
The code in the for loop checks tmp_group->next == group, so current code
actually checks from the 3rd entry in the linked groups list. Fix it.
Fixes: 5c1dd6a4c6 ("Optimization in igmp_remove_group() pointed out by Axel Lin")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
This commit adds support for responding to a zero-window probe when
the refused_data pointer is set
A zero-window probe is a data segment received when rcv_ann_wnd
is 0. This corrects a standards violation where LwIP would not
respond to a zero-window probe with its current ACK value (RCV.NXT)
when it has refused data, thus leading to the probing TCP closing
out the connection
lwIP produces a TCP Initial Sequence Number (ISN) for each new TCP
connection. The current algorithm is simple and predictable however.
The result is that lwIP TCP connections may be the target of TCP
spoofing attacks. The problem of such attacks is well known, and a
recommended ISN generation algorithm is standardized in RFC 6528.
This algorithm requires a high-resolution timer and cryptographic
hashing function, though. The implementation (or best-effort
approximation) of both of these aspects is well beyond the scope of
lwIP itself.
For that reason, this patch adds LWIP_HOOK_TCP_ISN, a hook that
allows each platform to implement its own ISN generation using
locally available means. The hook provides full flexibility, in
that the hook may generate anything from a simple random number
(by being set to LWIP_RAND()) to a full RFC 6528 implementation.
Implementation note:
Users of the hook would typically declare the function prototype of
the hook function in arch/cc.h, as this is the last place where such
prototypes can be supplied. However, at that point, the ip_addr_t
type has not yet been defined. For that reason, this patch removes
the leading underscore from "struct _ip_addr", so that a prototype
of the hook function can use "struct ip_addr" instead of "ip_addr_t".
Signed-off-by: sg <goldsimon@gmx.de>
Fix below build error when LWIP_ND6_RDNSS_MAX_DNS_SERVERS == 0
../../../../lwip/src/core/ipv6/nd6.c: In function ‘nd6_input’:
../../../../lwip/src/core/ipv6/nd6.c:400:10: error: unused variable ‘rdnss_server_idx’ [-Werror=unused-variable]
u8_t rdnss_server_idx = 0;
^~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
../Common.mk:93: recipe for target 'nd6.o' failed
make: *** [nd6.o] Error 1
Fixes: 6b1950ec24 ("nd6: add support for RDNSS option (as per RFC 6106)")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Previously, ethip6 and lowpan6 each had their own copy of code that
used internal nd6 data structures to decide whether to send a packet
on the local link right away, or queue it while nd6 performed local
address resolution. This patch moves that code into nd6, thereby
eliminating all remaining cases of external access to internal nd6
data structures, as well as the need to expose two specific nd6
functions.
As a side effect, the patch effectively fixes two bugs in the lowpan6
code that were already fixed in the ethip6 code.
This patch rearranges the code division between nd6.c and ip6.c such
that the latter does not need to access ND6-internal data structures
(specifically, "default_router_list") directly anymore.
The new function, while currently not used internally, allows external
code to clear the ND destination cache in the case that it may have
become inconsistent with the current situation, for example as the
result of a change of locally assigned addresses, or a change in
routing tables implemented through the LWIP_HOOK_ND6_GET_GW hook.
On failure, nd6_get_next_hop_entry() returns an ERR_ type negative
error code. ethip6_output() erroneously assumed that that error would
always be ERR_MEM, even though it may also be ERR_RTE in practice.
With this patch, ethip6_output() simply forwards the returned error.
This commit increments the ip.drop statistic when an IP packet is
dropped due to no matching netif found and forwarding is disabled
This adds parity to the other places where mib2.ipinaddrerrors and
mib2.ipindiscards are incremented which also increment ip.drop
All the reset part of the code accessing netif->loop_first has lock protection,
the only missing part is "while (netif->loop_first != NULL)".
Fix it by adding lock protect around the while loop.
Also convert the code to use while{} loop instead of do .. while{} loop,
then we can avoid NULL test for in pointer in each loop and reduce a level of indent.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
While TCP_OVERSIZE works only when tcp_write() is used with
TCP_WRITE_FLAG_COPY, this new code achieves
similar benefits for the use case that the caller manages their own
send buffers and passes successive chunks of those to tcp_write()
without TCP_WRITE_FLAG_COPY.
In particular, if a buffer is passed to
tcp_write() that is adjacent in memory to the previously passed
buffer, it will be combined into the previous ROM pbuf reference
whenever possible, thus extending that ROM pbuf rather than allocating
a new ROM pbuf.
For the aforementioned use case, the advantages of this code are
twofold:
1) fewer ROM pbufs need to be allocated to send the same data, and,
2) the MAC layer gets outgoing TCP packets with shorter pbuf chains.
Original patch by Ambroz Bizjak <ambrop7@gmail.com>
Edited by David van Moolenbroek <david@minix3.org>
Signed-off-by: goldsimon <goldsimon@gmx.de>
Change lwIP UDP API to match socket behavior. Multicast traffic is now only received on a UDP PCB (and therefore on a UDP socket/netconn) when the PCB is bound to IP_ADDR_ANY.
Generally speaking, packets with a loopback destination address -
127.0.0.1 for IPv4 and ::1 for IPv6 - should not be accepted on
non-loopback interfaces. For IPv4, this is implied by RFC 1122
Sec. 3.2.1.3. For IPv6, it is mandated by RFC 4291 Sec. 2.5.3.
Failure to perform this filtering may have security implications, as
applications that bind sockets to loopback addresses may not expect
that nodes on the local external network be able to produce traffic
that will arrive at such sockets.
With this patch, lwIP drops packets that are sent to a loopback
address but do not originate from the interface that has the loopback
address assigned to it. This approach works regardless of whether it
is lwIP or the system using it that implements a loopback netif. The
only exception that must be made is for configurations that enable
netif packet loopback but disable the lwIP loopback netif: in that
case, loopback packets are routed across non-loopback netifs and would
thus be lost by the new filter as well.
For IPv6, loopback-destined packets are also no longer forwarded; the
IPv4 forwarding code already had a check for that.
As a small performance improvement, the IPv6 link-local/loopback
address check is now performed only once per packet rather than
repeatedly for every candidate netif.
In general, netif_default may be NULL, and various places in the code
already check for this case before attempting to dereference the
netif_default pointer. Some places do not perform this check though,
and may cause null pointer dereferences if netif_default is not set.
This patch adds NULL checks to those places as well.
bind() may change IP type when previous type is IPADDR_TYPE_ANY
connect() IP type must exactly match bind IP type
Use correct IPADDRx_ANY type when calling ip_route()
This commit returns LwIP to previous behavior where if the next unsent
segment can't be sent due to the current send window, we start the
persist timer. This is done to engage window probing in the case that
the subsequent window update from the receiver is dropped, thus
preventing connection deadlock
This commit refines the previous logic to only target the following case:
1) Next unsent segment doesn't fit within the send window (not
congestion) and there is some room in the window
2) Unacked queue is empty (otherwise data is inflight and the RTO timer
will take care of any dropped window updates)
See commit d8f090a759 (which removed this
behavior) to reference the old logic. The old logic falsely started the
persit timer when the RTO timer was already running.
This commit cleans up a duplicate #if check for LWIP_WND_SCALE in init.c
which was already under #if LWIP_WND_SCALE
This commit also improves documentation for TCP_WND in the window scaling
case to communicate TCP_WND is always the calculated (scaled) window value,
not the value reported in the TCP header
Our developers were confused by having to set both the window and scaling
factor and only after studying the usage of TCP_WND throughout the code, was
it determined to be the calculated (scaled) window
p needs to point to LWIP_MEM_ALIGN(memp_pools[i]->base) otherwise it will cause
assertion in overflow checking.
Fixes: c838e1ed5b ("Implement possibility to declare private memory pools")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
When memp_free_pool was split out from memp_free (c838e1ed5b),
the check for freeing the null pointer was lost.
This resulted in the null value being put back in the list of free
objects, causing all subsequent allocations of that type to fail.
The mld_group structure no longer has a 'netif' field, as such
structures are now linked from the corresponding netif structure.
For conditional checksumming, use the calling function's netif
reference instead.
This comment is incorrect since commit 7d0dab9d7d
"partly fixed bug #25882: TCP hangs on MSS > pcb->snd_wnd
(by not creating segments bigger than half the window)".
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Let lwip use functions/macros prefixed by lwip_ internally to avoid naming clashes with external #includes.
Remove over-complicated #define handling in def.h
Make functions easier to override in cc.h. The following is sufficient now (no more LWIP_PLATFORM_BYTESWAP):
#define lwip_htons(x) <your_htons>
#define lwip_htonl(x) <your_htonl>
TCP's snd_nxt represents the next sequence number after sent data, and
as such does not cover any unsent data queued on the connection. The
current implementation does not take the latter point into account
when processing FIN acknowledgments, mistakenly assuming that an
outgoing FIN is ACK'ed when the acknowledgment covers up to snd_nxt
while there is still unsent data. This patch adds a check for unsent
data to correct this, effectively preventing that TCP connections are
closed prematurely.
It is possible that the byte sent as a zero window probe is accepted
and acknowledged by the receiver side without the window being opened.
In that case, the stream has effectively advanced by one byte, and
since lwIP did not take this into account on the sender side, the
result was a desynchronization between the sender and the receiver.
That situation could occur even on a lwIP loopback device, after
filling up the receiver side's receive buffer, and resulted in an ACK
storm. This patch corrects the problem by advancing the sender's next
sequence number by one as needed when sending a zero window probe.
delay_time and stale_time are ticks now.
reachable_time and invalidation_timer are untouched since they may originate from telegram values -> not converting them to ticks avoids an integer division
commit 8c52afb6ca ("igmp: Optimize code by always skipping the first entry in the linked groups list - it is always the "allsystems" entry")
accidently changes the code logic. it should check groupref rather than group.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reasoning:
- Makes code in single-netif case perform better and smaller
- IGMP / MLD6 code is a little bit easier to read and understand
- Easier to get multicast groups per netif when implementing drivers
Downside: In multi-netif mode, there are two more pointers on each netif, even if IGMP/MLD6 is not used on it. But these systems should not be so memory-constrained that this will matter.
When leaving a multicast group, remove the group from the list
before invoking the MAC filter callback. This avoids the need
for the callee to skip over the group that is about to be deleted.
This commit adds support to the sanity checks in init.c to ensure that
PBUF_POOL is in use
In ports with drivers/netifs that use PBUF_REF for the RX pathway, there
is no need for the PBUF_POOL memory pool. This allows the port to define
PBUF_POOL_SIZE to 0
commit 44e1a2d8e2 accidently includes below changes in tcp_listen_with_backlog
- tcp_backlog_set(lpcb, backlog);
+ lpcb->backlog = backlog;
Thus pass 0 to the backlog parameter of netconn_listen_with_backlog() fails.
Fixes: 44e1a2d8e2 ("define tcp_backlog_set() as dummy-define when backlog feature is disable")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: goldsimon <goldsimon@gmx.de>
MEMP_SANITY_REGION_BEFORE and MEMP_SANITY_REGION_AFTER can be overridden in
lwipopts.h, if one of it is set to 0 we got build error due to unused variable.
Fix unused variable build error when MEMP_OVERFLOW_CHECK >= 1 &&
(MEMP_SANITY_REGION_BEFORE == 0 || MEMP_SANITY_REGION_AFTER == 0).
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: goldsimon <goldsimon@gmx.de>
There is only one caller using memp_overflow_init(), and at that context
calling memp_overflow_init_element() actually simplifes the code.
Thus remove memp_overflow_init() function.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: goldsimon <goldsimon@gmx.de>
Use LWIP_MEM_ALIGN() in memp_overflow_init() to get alignment address for memp.
This fixes assertion in memp_overflow_check_element_overflow when
MEMP_OVERFLOW_CHECK is set.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Fix below build error.
../../../../../lwip/src/core/memp.c: In function ‘memp_free’:
../../../../../lwip/src/core/memp.c:490:31: error: request for member ‘tab’ in something not a structure or union
old_first = memp_pools[type].tab;
^
../../Common.mk:94: recipe for target 'memp.o' failed
make: *** [memp.o] Error 1
Fixes: de9054cb7a ("memp: cleaned up MEMP_MEM_MALLOC")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
In the BSD socket API world, IP_HDRINCL is a socket option for "raw"
sockets that indicates whether sent packets already include an IP
header. Within lwIP, "IP_HDRINCL" is redefined as a special value
that indicates to lwIP-internal functions that an IP header is already
included. While somewhat related, the two meanings are different and,
on platforms that define the IP_HDRINCL socket option, this results in
a conflict. This patch renames the lwIP one to "LWIP_IP_HDRINCL",
thus resolving the conflict.
- support memp stats when MEMP_MEM_MALLOC==1 (bug #48442);
- hide MEMP_MEM_MALLOC in memp.c instead of messing up the header file;
- make MEMP_OVERFLOW_CHECK work when MEMP_MEM_MALLOC==1
Fixes bug #48300 (Private mempools allocate foreign memory), bug #48354 (Portable alignment defines/include required for static allocation) and bug #47092 (Tag memory buffers like memp_memory_xxx and ram_heap with a macro so that attributes can be attached to their definitions)
Signed-off-by: Simon Goldschmidt <goldsimon@gmx.de>
If lwIP encounters a half-open connection (e.g. due to a restarted
application reusing the same port numbers) it will correctly send a
RST but will not resend the SYN until one retransmission timeout later
(approximately three seconds). This can increase the time taken by
lpxelinux.0 to fetch its configuration file from a few milliseconds to
around 30 seconds.
Fix by immediately retransmitting the SYN whenever a half-open
connection is detected.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: goldsimon <goldsimon@gmx.de>
lwip/src/core/init.c:256:32: error: "LWIP_COMPAT_MUTEX" is not defined [-Werror=undef]
#if LWIP_TCPIP_CORE_LOCKING && LWIP_COMPAT_MUTEX && !defined(LWIP_COMPAT_MUTEX_ALLOWED)
^
Setting LWIP_TCPIP_CORE_LOCKING is meaningless for NO_SYS targets,
therefore checking if LWIP_COMPAT_MUTEX is set does not make sense.
Introduced by 42dfa71f97: Make LWIP_TCPIP_CORE_LOCKING==1 the default
(and warn if LWIP_COMPAT_MUTEX==1 in that case as mutexes are required
to prevent priority inversion on tcpip_thread operations)
issue 1:
sys_arch_sem_wait() is supposed to return an elapsed time in ms, what could
happen given a > 1 kHz calling rate for high throughput systems is that it
might always returns 0 ms. This is a problem for systems which compute the
elapsed time from a high precision clock source.
This is what is currently happening in the unix port in sys_arch_sem_wait():
start time -> 1000000000; // ns
-- less than a ms before an event arrive --
end time -> 1000xxxxxx; // ns
return value -> (end time - start time)/1000000 -> 0
The return value is used to reduce the next timer interval, if
sys_arch_sem_wait() always return 0 no more timers are fired anymore
issue 2:
The current timer implementation for !NO_SYS targets only count elapsed
time while -waiting- for semaphore and doesn't count at all the time
spent by the stack to process packets. For CPU bound traffic patterns no
more timers are fired anymore.
Both are serious design issues which cannot be easily fixed without reworking
everything. This patch uses the properly implemented timers for NO_SYS targets
for !NO_SYS targets and merge them both into one single timers implementation.
Since allowing input validation to trip the ASSERT handler is bad,
let's just drop the packets instead if validation fails.
Signed-off-by: sg <goldsimon@gmx.de>
dhcp_discover_request_options is u8_t array, so the result is the same.
But use LWIP_ARRAYSIZE to get the number of array entries is better
because it works for all types.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
lwip/src/core/timers.c: In function ‘sys_check_timeouts’:
lwip/src/core/timers.c:328:5: error: "PBUF_POOL_FREE_OOSEQ" is not defined [-Werror=undef]
#if PBUF_POOL_FREE_OOSEQ
Fix it by declaring an empty PBUF_CHECK_FREE_OOSEQ() function if feature is
not enabled.