[tcp] Avoid printf format warnings on some compilers
In several places, we currently use size_t to represent a difference
between TCP sequence numbers. This can cause compiler warnings
relating to printf format specifiers, since the result of
(uint32_t+size_t) may be an unsigned long on some compilers.
Fix by using uint32_t for all variables that represent a difference
between TCP sequence numbers.
Tested-by: Joshua Oreman <oremanj@xenon.get-linux.org>
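
As an illustration of the uint32_t change above, here is a minimal sketch (not the actual gPXE code). The helper names tcp_seq_diff() and tcp_fmt_range() are hypothetical, and the use of the <inttypes.h> PRIx32 macro is an assumption made for portability of the example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: difference between two TCP sequence numbers.
 * Unsigned 32-bit arithmetic wraps modulo 2^32, as sequence space does.
 */
static uint32_t tcp_seq_diff ( uint32_t later, uint32_t earlier ) {
	return ( later - earlier );
}

/* Hypothetical helper: format a "start..end" sequence range.
 * Because data_len is a uint32_t (not a size_t), the sum is cast back
 * to uint32_t and the format specifier matches on every compiler.
 */
static int tcp_fmt_range ( char *buf, size_t len, uint32_t seq,
			   uint32_t data_len ) {
	return snprintf ( buf, len, "%08" PRIx32 "..%08" PRIx32,
			  seq, ( uint32_t ) ( seq + data_len ) );
}
```

Had data_len been a size_t, the type of ( seq + data_len ) would depend on the platform's definition of size_t, which is what triggers the warning.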
[tcp] Avoid rewinding sequence numbers on receiving old duplicate ACKs
Commit 558c1a4 ("[tcp] Improve robustness in the presence of duplicated
received packets") introduced a regression in that an old duplicate
ACK received while in the ESTABLISHED state would pass through normal
ACK processing, including updating tcp->snd_seq.
Fix by ensuring that ACK processing ignores all duplicate ACKs.
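
A minimal sketch of the intended behaviour, assuming a simplified connection structure. Only the field name snd_seq comes from the message above; struct tcp_ack_sketch, snd_sent and tcp_sketch_rx_ack() are hypothetical.

```c
#include <stdint.h>

/* Hypothetical, simplified connection state */
struct tcp_ack_sketch {
	uint32_t snd_seq;	/* first unacknowledged sequence number */
	uint32_t snd_sent;	/* sequence space sent but not yet ACKed */
};

/* Process a received ACK.  Returns 1 if snd_seq advanced, 0 if the
 * ACK was ignored.
 */
static int tcp_sketch_rx_ack ( struct tcp_ack_sketch *tcp, uint32_t ack ) {
	uint32_t ack_len = ( ack - tcp->snd_seq );

	/* A duplicate ACK has ack_len == 0.  An old duplicate ACK
	 * wraps around to a value larger than snd_sent.  In neither
	 * case may snd_seq be touched.
	 */
	if ( ( ack_len == 0 ) || ( ack_len > tcp->snd_sent ) )
		return 0;

	tcp->snd_seq += ack_len;
	tcp->snd_sent -= ack_len;
	return 1;
}
```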
[tcp] Attempt to catch all possible error cases with debug messages
All TCP errors or unusual events should now generate a debugging
message at DBGLVL_LOG, with enough information (SEQ and ACK numbers)
to be able to identify the corresponding packet (or missing packet) in
a network trace from the remote end.
[tcp] Move high-frequency debug messages to DBGLVL_EXTRA
This makes it possible to leave TCP debugging enabled in order to see
interesting TCP events, without flooding the console with at least one
message per packet.
[tcp] Improve robustness in the presence of duplicated received packets
gPXE responds to duplicated ACKs with an immediate retransmission,
which can lead to a sorcerer's apprentice syndrome.  It also responds
to out-of-range (or old duplicate) ACKs with a RST, which can cause
valid connections to be dropped.
Fix the sorcerer's apprentice syndrome by leaving the retransmission
timer running (and so inhibiting the immediate retransmission) when we
receive a potential duplicate ACK. This seems to match the behaviour
of Linux observed via wireshark traces.
Fix the RST issue by sending RST only on out-of-range ACKs that occur
before the connection is fully established, as per RFC 793.
These problems were exposed during development of the 802.11 wireless
link layer; the 802.11 protocol has a failure mode that can easily
cause duplicated packets. The fixes were tested in a controlled way
by faking large numbers of duplicated packets in the rtl8139 driver.
Originally-fixed-by: Joshua Oreman <oremanj@rwcr.net>
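
The two fixes described above amount to a three-way decision on each received ACK. The sketch below shows that decision in isolation; the enum, the function name and the "established" parameter are all hypothetical and are not taken from the gPXE source.

```c
#include <stdint.h>

/* Hypothetical outcome of examining a received ACK */
enum tcp_sketch_action {
	TCP_SKETCH_ACCEPT,	/* new data ACKed: stop/restart timer */
	TCP_SKETCH_IGNORE,	/* leave retransmission timer running */
	TCP_SKETCH_SEND_RST,	/* reset the connection */
};

static enum tcp_sketch_action
tcp_sketch_classify_ack ( uint32_t snd_seq, uint32_t snd_sent,
			  uint32_t ack, int established ) {
	uint32_t ack_len = ( ack - snd_seq );

	if ( ack_len > snd_sent ) {
		/* Out-of-range (or old duplicate) ACK: send RST only
		 * while the connection is not yet fully established.
		 */
		return ( established ?
			 TCP_SKETCH_IGNORE : TCP_SKETCH_SEND_RST );
	}
	if ( ack_len == 0 ) {
		/* Potential duplicate ACK: do not retransmit
		 * immediately; let the running timer decide.
		 */
		return TCP_SKETCH_IGNORE;
	}
	return TCP_SKETCH_ACCEPT;
}
```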
Some firewall devices seem to regard SYN,PSH as an invalid flag
combination and reject the packet. Fix by setting PSH only if SYN is
not set.
Reported-by: DSE Incorporated <dseinc@gmail.com>
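
A minimal sketch of the flag selection described above. The flag values are the standard TCP header bits; the SKETCH_ prefix and the function name tcp_sketch_tx_flags() are hypothetical.

```c
#include <stdint.h>

/* Standard TCP header flag bits */
#define SKETCH_TCP_FIN 0x01
#define SKETCH_TCP_SYN 0x02
#define SKETCH_TCP_PSH 0x08
#define SKETCH_TCP_ACK 0x10

/* Add PSH to the outgoing flags unless SYN is set, so that the
 * SYN,PSH combination is never transmitted.
 */
static uint8_t tcp_sketch_tx_flags ( uint8_t flags ) {
	if ( ! ( flags & SKETCH_TCP_SYN ) )
		flags |= SKETCH_TCP_PSH;
	return flags;
}
```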
[xfer] Make consistent assumptions that xfer metadata can never be NULL
The documentation in xfer.h and xfer.c does not say that the metadata
parameter is optional in calls such as xfer_deliver_iob_meta() and the
deliver_iob() method. However, some code in net/ is prepared to
accept a NULL pointer, and xfer_deliver_as_iob() passes a NULL pointer
directly to the deliver_iob() method.
Fix this mess of conflicting assumptions by making everything assume
that the metadata parameter is mandatory, and fixing
xfer_deliver_as_iob() to pass in a dummy metadata structure (as is
already done in xfer_deliver_iob()).
Apparently this can cause a major speedup on some iSCSI targets, which
will otherwise wait for a timer to expire before responding. It
doesn't seem to hurt other simple TCP test cases (e.g. HTTP
downloads).
Problem and solution identified by Shiva Shankar <802.11e@gmail.com>
[tcpip] Allow for transmission to multicast IPv4 addresses
When sending to a multicast address, it may be necessary to specify
the source address explicitly, since the multicast destination address
does not provide enough information to deduce the source address via
the miniroute table.
Allow the source address specified via the data-xfer metadata to be
passed down through the TCP/IP stack to the IPv4 layer, which can use
it as a default source address.
[i386] Change [u]int32_t to [unsigned] int, rather than [unsigned] long
This brings us into line with Linux definitions, and also simplifies
adding x86_64 support since both platforms have 2-byte shorts, 4-byte
ints and 8-byte long longs.
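
For illustration, the type-size assumptions above can be checked at compile time. This is a sketch only: the sketch_ typedef names are hypothetical, and _Static_assert assumes a C11 compiler.

```c
/* Hypothetical names mirroring the change: 32-bit types are defined
 * in terms of int rather than long.
 */
typedef unsigned int sketch_uint32_t;
typedef signed int sketch_int32_t;

/* These hold on both i386 and x86_64 */
_Static_assert ( sizeof ( short ) == 2, "short must be 2 bytes" );
_Static_assert ( sizeof ( sketch_uint32_t ) == 4, "int must be 4 bytes" );
_Static_assert ( sizeof ( sketch_int32_t ) == 4, "int must be 4 bytes" );
_Static_assert ( sizeof ( long long ) == 8, "long long must be 8 bytes" );
```

The size of long is deliberately not asserted, since it is not covered by the statement above.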
Maintain state for the advertised window length, and only ever increase
it (instead of calculating it afresh on each transmit). This avoids
triggering "treason uncloaked" messages on Linux peers.
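
A minimal sketch of the "only ever increase" rule, assuming the advertised window is held in a per-connection field. The structure and function names are hypothetical; rcv_win is an assumed field name.

```c
#include <stdint.h>

/* Hypothetical per-connection state */
struct tcp_win_sketch {
	uint32_t rcv_win;	/* largest window advertised so far */
};

/* Choose the window to advertise on this transmit.  "available" is
 * the freshly calculated estimate; it is used only if it would grow
 * the window, so the advertised window never shrinks.
 */
static uint32_t tcp_sketch_advertise ( struct tcp_win_sketch *tcp,
				       uint32_t available ) {
	if ( available > tcp->rcv_win )
		tcp->rcv_win = available;
	return tcp->rcv_win;
}
```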
Respond to zero-length TCP keepalives (i.e. empty data packets
transmitted outside the window). Even if the peer wouldn't otherwise
expect an ACK (because its packet consumed no sequence space), force an
ACK if it was outside the window.
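
The keepalive rule above can be sketched as a single predicate. All names here are hypothetical, and treating a packet at exactly rcv_ack as in-window (even when the window is zero) is an assumption of this sketch, made so that ordinary empty ACKs are not themselves ACKed.

```c
#include <stdint.h>

/* Decide whether a received packet must be ACKed.
 *
 * rcv_ack: next sequence number we expect from the peer
 * rcv_win: our receive window
 * seq:     sequence number of the received packet
 * seq_len: sequence space consumed by the received packet
 */
static int tcp_sketch_must_ack ( uint32_t rcv_ack, uint32_t rcv_win,
				 uint32_t seq, uint32_t seq_len ) {
	uint32_t offset = ( seq - rcv_ack );
	int in_window = ( ( offset == 0 ) || ( offset < rcv_win ) );

	/* Packets consuming sequence space always expect an ACK */
	if ( seq_len != 0 )
		return 1;

	/* Empty packet outside the window (e.g. a keepalive sent at
	 * rcv_ack - 1): force an ACK anyway.
	 */
	return ( ! in_window );
}
```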
We don't yet generate TCP keepalives. It could be done, but it's unclear
what benefit this would have. (Linux, for example, doesn't start sending
keepalives until the connection has been idle for two hours.)
Modify data-xfer semantics: it is no longer necessary to call one of
request(), seek() or deliver_xxx() in order to start the data flow.
Autonomous generators must be genuinely autonomous (having their own
process), or otherwise arrange to be called. TCP does this by
starting the retry timer immediately.
Add some debugging statements.
Make TCP give up immediately when it receives -ENETUNREACH from
tcpip_tx(). This avoids the irritating wait when you accidentally type
"kernel pxelinux.0" before bringing up the network interface.
Add ENETUNREACH to strerror()'s list.
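
A minimal sketch of the give-up-immediately behaviour, assuming the transmit path reports errors as negative errno values. The structure and function names are hypothetical.

```c
#include <errno.h>

/* Hypothetical connection state */
struct tcp_conn_sketch {
	int closed;	/* non-zero once the connection is abandoned */
	int rc;		/* reason for closing */
};

/* Handle the return status of a transmit attempt */
static void tcp_sketch_tx_done ( struct tcp_conn_sketch *conn, int rc ) {
	if ( rc == -ENETUNREACH ) {
		/* No route to the peer: retrying cannot help, so
		 * close now rather than waiting for the retry timer
		 * to expire.
		 */
		conn->closed = 1;
		conn->rc = rc;
	}
	/* Any other status: leave the retry timer to handle it */
}
```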
"when SYN is ACKed and we have already received SYN", or
"when SYN is received and we have already had SYN ACKed"
rather than just
"when SYN is ACKed"
This avoids spuriously calling the connected() method when we receive
a RST,ACK in response to a SYN.
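
The condition above can be sketched as a state bitmask that records both halves of the handshake. The bit names and tcp_sketch_event() are hypothetical; the return value stands in for "call connected() now".

```c
/* Hypothetical state bits */
#define SKETCH_SYN_ACKED 0x01	/* our SYN has been ACKed */
#define SKETCH_SYN_RCVD  0x02	/* we have received the peer's SYN */
#define SKETCH_CONNECTED ( SKETCH_SYN_ACKED | SKETCH_SYN_RCVD )

/* Record an event.  Returns 1 exactly once: on the event that
 * completes the pair, whichever order the two arrive in.
 */
static int tcp_sketch_event ( unsigned int *state, unsigned int event ) {
	unsigned int old = *state;

	*state |= event;
	return ( ( ( *state & SKETCH_CONNECTED ) == SKETCH_CONNECTED ) &&
		 ( ( old & SKETCH_CONNECTED ) != SKETCH_CONNECTED ) );
}
```

A RST,ACK in response to our SYN sets only the first bit, so it never reports the connection as established.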
Truncate TX length to TCP window at time of transmission rather than at
time of adding to TX packet; this is conceptually cleaner and also allows
the application to call tcp_send() multiple times to build up a single
packet.
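
A minimal sketch of truncating at transmission time, assuming queued data is tracked by length only. tcp_send() is named above; the sketch_ functions and the structure are hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical TX state: bytes queued by the application */
struct tcp_tx_sketch {
	size_t queued;
};

/* Stand-in for tcp_send(): queue data without looking at the window,
 * so several calls can accumulate into one packet.
 */
static void tcp_sketch_send ( struct tcp_tx_sketch *tx, size_t len ) {
	tx->queued += len;
}

/* Length to transmit now: the window is applied here, once, at the
 * time of transmission.
 */
static size_t tcp_sketch_xmit_len ( const struct tcp_tx_sketch *tx,
				    size_t snd_win ) {
	return ( ( tx->queued < snd_win ) ? tx->queued : snd_win );
}
```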
Redefine TCP state to include "flags that have been sent" rather than
"flags that are currently being sent". This allows at least one special
case (checking that we haven't already sent a FIN in tcp_rx_fin()) to be
collapsed.