[tcp] Truncate TCP window to prevent future packet discards
Whenever memory pressure causes a queued packet to be discarded (and
so retransmitted), reduce the maximum TCP window to a size that would
have prevented the discard.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Discarding the active ARP cache entry in the middle of a download will
substantially disrupt the TCP stream. Try to minimise any such
disruption by treating ARP cache entries as expensive, and discarding
them only when nothing else is available to discard.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[netdevice] Process all received packets in net_poll()
The current logic is to process at most one received packet per call
to net_poll(), on the basis that refilling the hardware descriptor
ring should be delayed as little as possible. However, this limits
the rate at which packets can be processed and ultimately ends up
adding latency which, in turn, limits the achievable throughput.
With temporary modifications in place to essentially remove all
resource constraints (heap size increased to 16MB, RX descriptor ring
increased to 64 descriptors) and a TCP window size of 1MB, the
throughput on a gigabit (i.e. 119MBps) network can be observed to fall
off exponentially from around 115MBps to around 75MBps. Changing
net_poll() to process all received packets results in a steady
119MBps throughput.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[arp] Prevent ARP cache entries from being deleted mid-transmission
Each ARP cache entry maintains a transmission queue, which is sent out
as soon as the link-layer address is known. If multiple packets are
queued, then it is possible for memory pressure to cause the ARP cache
discarder to be invoked during transmission of the first packet, which
may cause the ARP cache entry to be deleted before the second packet
can be sent. This results in an invalid pointer dereference.
Avoid this problem by reference-counting ARP cache entries and
ensuring that an extra reference is held while processing the
transmission queue, and by using list_first_entry() rather than
list_for_each_entry_safe() to traverse the queue.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit ea61075 ("[tcp] Add support for TCP window scaling") introduced
a potential NULL pointer dereference by referring to the connection's
send window scale before checking whether or not the connection is
known.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tls] Request a maximum fragment length of 2048 bytes
The default maximum plaintext fragment length for TLS is 16kB, which
is a substantial amount of memory for iPXE to have to allocate for a
temporary decryption buffer.
Reduce the memory footprint of TLS connections by requesting a maximum
fragment length of 2kB.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The maximum unscaled TCP window (64kB) implies a maximum bandwidth of
around 300kB/s on a WAN link with an RTT of 200ms. Add support for
the TCP window scaling option to remove this upper limit.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tcpip] Allow for architecture-specific TCP/IP checksum routines
Calculating the TCP/IP checksum on received packets accounts for a
substantial fraction of the response latency.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[dhcp] Request broadcast responses when we already have an IPv4 address
FCoE requires the use of multiple local unicast link-layer addresses.
To avoid the complexity of managing multiple addresses, iPXE operates
in promiscuous mode. As a consequence, any unicast packets with
non-matching IPv4 addresses are rejected at the IPv4 layer (rather
than at the link layer).
This can cause problems when issuing a second DHCP request: if the
address chosen by the DHCP server does not match the existing address,
then the DHCP response will itself be rejected.
Fix by requesting a broadcast response from the DHCP server if the
network interface already has any IPv4 addresses.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[http] Provide credentials only when requested by server
Provide HTTP Basic authentication credentials only in response to a
401 Unauthorized response from the server.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[http] Defer processing response code until after receiving all headers
Some headers can modify the meaning of the response code. For
example, a WWW-Authenticate header can change the interpretation of a
401 Unauthorized response from "Access denied" to "Please
authenticate".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iSCSI generally includes a full SCSI response only when an error
occurs. iscsi_scsi_done() currently passes the NULL response through
to scsi_response(), which ends up causing scsicmd_response() to
dereference a NULL pointer.
Fix by calling scsi_response() only if we have a non-NULL response.
Reported-by: Brendon Walsh <brendonwalsh@niamu.com>
Tested-by: Brendon Walsh <brendonwalsh@niamu.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
X.509 certificate processing currently produces an overwhelming amount
of debugging information. Move some of this from DBGLVL_LOG to
DBGLVL_EXTRA, to make the output more manageable.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Automatically attempt to download any required cross-signing
certificates from http://ca.ipxe.org/auto, in order to enable the use
of standard SSL certificates issued by public CAs.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
To allow for automatic download of cross-signing certificates and for
OCSP, the validation of certificates must be an asynchronous process.
Create a stub validator which uses a job-control interface to report
the result of certificate validation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[crypto] Allow certificate chains to be long-lived data structures
At present, certificate chain validation is treated as an
instantaneous process that can be carried out using only data that is
already in memory. This model does not allow for validation to
include non-instantaneous steps, such as downloading a cross-signing
certificate, or determining certificate revocation status via OCSP.
Redesign the internal representation of certificate chains to allow
chains to outlive the scope of the original source of certificates
(such as a TLS Certificate record).
Allow for certificates to be cached, so that each certificate needs to
be validated only once.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[http] Avoid using stack-allocated memory in http_step()
http_step() allocates a potentially large block of storage (since the
URI can be arbitrarily long), and can be invoked as part of an already
deep call stack via xfer_window_changed().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow TFTP to be configured out by moving the next-server setting
definition (which is used by autoboot.c) from tftp.c to settings.c.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tls] Fix wrong memset in function tls_clear_cipher
sizeof(cipherspec) is obviously wrong in this context, because it will
only zero the first 4 or 8 bytes (cipherspec is a pointer).
This problem was reported by cppcheck.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[syslog] Pass internal syslog() priority through to syslog console
Use a private ANSI escape sequence to convey the priority of an
internal syslog() message through to the syslog server.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[console] Do not share ANSI escape context between lineconsole users
An ANSI escape sequence context cannot be shared between multiple
users. Make the ANSI escape sequence context part of the line console
definition and provide individual contexts for each user.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[console] Exclude text-based UI output from logfile-based consoles
The output from text-based user interfaces such as the "config"
command is not generally meaningful for logfile-based consoles such as
syslog and vmconsole.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[console] Allow usage to be defined independently for each console
Add the concept of a "console usage", such as "standard output" or
"debug messages". Allow usages to be associated with each console
independently. For example, to send debugging output via the serial
port, while preventing it from appearing on the local console:
#define CONSOLE_SERIAL CONSOLE_USAGE_ALL
#define CONSOLE_PCBIOS ( CONSOLE_USAGE_ALL & ~CONSOLE_USAGE_DEBUG )
If no usages are explicitly specified, then a default set of usages
will be applied. For example:
#define CONSOLE_SERIAL
will have the same affect as
#define CONSOLE_SERIAL CONSOLE_USAGE_ALL
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tls] Treat handshake digest algorithm as a session parameter
Simplify code by recording the active handshake digest algorithm as a
session parameter. (Note that we must still accumulate digests for
all supported algorithms, since we don't know which digest will
eventually be used until we receive the Server Hello.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
TLSv1.1 and earlier use a hybrid of MD5 and SHA-1 to generate digests
over the handshake messages. Formalise this as a separate digest
algorithm "md5+sha1".
Signed-off-by: Michael Brown <mcb30@ipxe.org>