[infiniband] Add the concept of an Infiniband upper-layer driver
Replace the explicit calls from the Infiniband core to the IPoIB layer
with the general concept of an Infiniband upper-layer driver
(analogous to a PCI driver) which can create arbitrary devices on top
of Infiniband devices.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[netdevice] Add the concept of a network upper-layer driver
Add the concept of a network upper-layer driver, which can create
arbitrary devices on top of network devices.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[retry] Hold reference while timer is running and during expiry callback
Guarantee that a retry timer cannot go out of scope while the timer is
running, and provide a guarantee to the expiry callback that the timer
will remain in scope during the entire callback (similar to the
guarantee provided to interface methods).
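
The guarantee above can be illustrated with a minimal sketch. All names
here (refcnt_sketch, retry_timer_sketch, timer_start, timer_expire,
on_expiry) are hypothetical stand-ins rather than the real retry timer
API, and a real ref_put() would free the object when the count reaches
zero.

```c
/* Hypothetical reference counter; a real one would free at zero */
struct refcnt_sketch {
	int count;
};

/* Hypothetical retry timer embedded in a reference-counted object */
struct retry_timer_sketch {
	int running;
	struct refcnt_sketch *refcnt;
	void ( * expired ) ( struct retry_timer_sketch *timer );
};

static void ref_get ( struct refcnt_sketch *ref ) { ref->count++; }
static void ref_put ( struct refcnt_sketch *ref ) { ref->count--; }

/* A running timer holds one reference to its containing object */
static void timer_start ( struct retry_timer_sketch *timer ) {
	if ( ! timer->running ) {
		ref_get ( timer->refcnt );
		timer->running = 1;
	}
}

static void timer_stop ( struct retry_timer_sketch *timer ) {
	if ( timer->running ) {
		timer->running = 0;
		ref_put ( timer->refcnt );
	}
}

/* Expiry: hold an extra reference for the duration of the callback */
static void timer_expire ( struct retry_timer_sketch *timer ) {
	struct refcnt_sketch *ref = timer->refcnt;

	ref_get ( ref );		/* callback guarantee */
	timer_stop ( timer );		/* drops the "running" reference */
	timer->expired ( timer );
	ref_put ( ref );		/* object may go out of scope now */
}

/* Example callback: records the count it observes while running */
static int observed_count;
static void on_expiry ( struct retry_timer_sketch *timer ) {
	observed_count = timer->refcnt->count;
}
```

Even if the owner drops its own reference while the timer is running,
the callback in this sketch still observes a non-zero count.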
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[xfer] Generalise metadata "whence" field to "flags" field
iPXE has never supported SEEK_END; the usage of "whence" offers only
the options of SEEK_SET and SEEK_CUR and so is effectively a boolean
flag. Further flags will be required to support additional metadata
required by the Fibre Channel network model, so repurpose the "whence"
field as a generic "flags" field.
xfer_seek() has always been used with SEEK_SET, so remove the "whence"
field altogether from its argument list.
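
As a rough sketch of the idea, a single flag bit can carry what
"whence" used to express. The flag name, the structure layout and the
helper apply_offset() below are assumptions made for illustration, not
a copy of the real definitions.

```c
#include <stddef.h>

/* Hypothetical flag: offset is absolute (the old SEEK_SET) */
#define XFER_FL_ABS_OFFSET 0x0001

/* Simplified stand-in for the transfer metadata structure */
struct xfer_metadata {
	unsigned int flags;	/* bitmask of XFER_FL_... values */
	long offset;		/* absolute or relative offset */
};

/* Compute the new position from the current one and the metadata */
static size_t apply_offset ( size_t pos, const struct xfer_metadata *meta ) {
	if ( meta->flags & XFER_FL_ABS_OFFSET )
		return ( ( size_t ) meta->offset );	/* was SEEK_SET */
	return ( pos + meta->offset );			/* was SEEK_CUR */
}
```

Further flag bits can then be added without changing the structure.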
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[build] Fix misaligned table entries when using gcc 4.5
Declarations without the accompanying __table_entry cause misalignment
of the table entries when using gcc 4.5. Fix by adding the
appropriate __table_entry macro or (where possible) by removing
unnecessary forward declarations.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[compiler] Prevent empty weak function stubs from being removed
Even with the noinline specifier added by commit 1a260f8, gcc may skip
calls to non-inlinable functions that it knows have no side
effects. This caused the get_cached_dhcpack() call in start_dhcp(),
the weak stub of which has no code in its body, to be removed,
preventing cached DHCP from working.
Fix by adding a __keepme macro to compiler.h expanding to asm(""), as
recommended by gcc's info page, and using it in the weak stub for
get_cached_dhcpack().
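
A minimal sketch of the technique follows. The macro definitions are
assumptions modelled on the description above, and start_dhcp_sketch()
is a hypothetical caller, not the real start_dhcp().

```c
/* Assumed definitions, modelled on the commit description */
#define __weak __attribute__ (( weak, noinline ))
#define __keepme __asm__ ( "" );

/* Weak stub: does nothing, but the empty asm statement stops gcc
 * from treating the function as free of side effects and eliding
 * calls to it.
 */
__weak void get_cached_dhcpack ( void ) { __keepme }

/* Hypothetical caller: the call must survive optimisation so that a
 * strong definition linked in from another object is actually invoked.
 */
int start_dhcp_sketch ( void ) {
	get_cached_dhcpack();
	return 0;
}
```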
Reported-by: Aaron Brooks <aaron@brooks1.net>
Tested-by: Aaron Brooks <aaron@brooks1.net>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When we received an encrypted packet, after replacing it with its
decrypted version and freeing the encrypted original, we would
continue to look at the header of the now-freed original packet. Fix
by moving the header pointer to point at the decrypted packet instead.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The workhorse function for detecting 802.11 security was still named
_sec80211_detect(), a holdover from the old style of weak function
handling, with the result that all networks would be identified as
"unknown".
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tcp] Allow out-of-order receive queue to be discarded
Allow packets in the receive queue to be discarded in order to free up
memory. This avoids a potential deadlock condition in which the
missing packet can never be received because the receive queue is
occupying all of the memory available for further RX buffers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Maintain a queue of received packets, so that lost packets need not
result in retransmission of the entire TCP window.
Increase the TCP window to 8kB, in order that we can potentially
transmit enough duplicate ACKs to trigger Fast Retransmission at the
sender.
Using a 10MB HTTP download in qemu-kvm with an artificial drop rate of
1 in 64 packets, this reduces the download time from around 26s to
around 4s.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[netdevice] Provide a test mechanism for discarding packets at random
Setting NETDEV_DISCARD_RATE to a non-zero value will cause one in
every NETDEV_DISCARD_RATE packets to be discarded at random on both
the transmit and receive datapaths, allowing the robustness of
upper-layer network protocols to be tested even in simulation
environments that provide wholly reliable packet transmission.
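
One plausible way to implement such a check is sketched below. The
helper names discard_packet() and netdev_discard() are hypothetical,
the default value of 64 is illustrative only, and rand() stands in for
whatever random source the real code uses.

```c
#include <stdlib.h>

/* Build-time knob; 0 disables discarding (value here is illustrative) */
#ifndef NETDEV_DISCARD_RATE
#define NETDEV_DISCARD_RATE 64
#endif

/* Return nonzero if a packet should be dropped: on average one in "rate" */
static int discard_packet ( unsigned int rate ) {
	if ( rate == 0 )
		return 0;
	return ( ( rand() % rate ) == 0 );
}

/* Would be called on both the transmit and receive datapaths */
static int netdev_discard ( void ) {
	return discard_packet ( NETDEV_DISCARD_RATE );
}
```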
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tcp] Treat ACKs as sent only when successfully transmitted
iPXE currently forces sending (i.e. sends a pure ACK even in the
absence of fresh data to send) only in response to packets that
consume sequence space or that lie outside of the receive window.
This ignores the possibility that a previous ACK was not actually sent
(due to, for example, the retransmission timer running).
This does not cause incorrect behaviour, but does cause unnecessary
retransmissions from our peer. For example:
1. Peer sends final data packet (ack 106 seq 521..523)
2. We send FIN (seq 106..107 ack 523)
3. Peer sends FIN (ack 106 seq 523..524)
4. We send nothing since retransmission timer is running for our FIN
5. Peer ACKs our FIN (ack 107 seq 524..524)
6. We send nothing since this packet consumes no sequence space
7. Peer retransmits FIN (ack 107 seq 523..524)
8. We ACK peer's FIN (seq 107..107 ack 524)
What should happen at step (6) is that we should ACK the peer's FIN,
since we can deduce that we have never sent this ACK.
Fix by maintaining an "ACK pending" flag that is set whenever we are
made aware that our peer needs an ACK (whether by consuming sequence
space or by sending a packet that appears out of order), and is
cleared only when the ACK packet has been transmitted.
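
The flag logic can be sketched as follows. This is a simplified model
under stated assumptions: the structure, the flag name TCP_ACK_PENDING
and the three helper functions are hypothetical, and window checks are
reduced to a simple comparison against the next expected sequence
number.

```c
#include <stdint.h>

#define TCP_ACK_PENDING 0x0001	/* hypothetical flag name */

struct tcp_sketch {
	unsigned int flags;
	uint32_t rcv_ack;	/* next sequence number expected from peer */
};

/* Receive path: note that the peer needs an ACK */
static void tcp_rx_sketch ( struct tcp_sketch *tcp, uint32_t seq,
			    uint32_t seq_len ) {
	if ( seq != tcp->rcv_ack ) {
		/* Out of order: peer needs to be told what we expect */
		tcp->flags |= TCP_ACK_PENDING;
		return;
	}
	if ( seq_len ) {
		/* Packet consumes sequence space */
		tcp->rcv_ack += seq_len;
		tcp->flags |= TCP_ACK_PENDING;
	}
}

/* Transmit path: "sent" is nonzero only if the packet really went out */
static void tcp_xmit_sketch ( struct tcp_sketch *tcp, int sent ) {
	if ( sent )
		tcp->flags &= ~TCP_ACK_PENDING;
}

/* Should we force a pure ACK even with no fresh data to send? */
static int tcp_needs_ack ( const struct tcp_sketch *tcp ) {
	return ( ( tcp->flags & TCP_ACK_PENDING ) != 0 );
}
```

Replaying steps (3) to (6) against this model, the flag set by the
peer's FIN survives the blocked transmission at step (4), so the
zero-length packet at step (5) still results in an ACK being sent.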
Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tcp] Use a dedicated timer for the TIME_WAIT state
iPXE currently repurposes the retransmission timer to hold the TCP
connection in the TIME_WAIT state (i.e. waiting for up to 2*MSL in
case we are required to re-ACK our peer's FIN due to a lost ACK).
However, the fact that this timer is running will prevent such an ACK
from ever being sent, since the logic in tcp_xmit() assumes that a
running timer indicates that we ourselves are waiting for an ACK and
so blocks the transmission. (We always wait for an ACK before sending
our next packet, to keep our transmit data path as simple as
possible.)
Fix by using an entirely separate timer for the TIME_WAIT state, so
that packets can still be sent.
Reported-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Every other scalar integer value in struct tcp_connection is in host
byte order; change the definition of local_port to match.
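
A short sketch of the resulting convention: the port is stored in host
byte order and converted only when it is written to the wire. The
structure and helper names are hypothetical, and <arpa/inet.h> is used
here only to obtain htons() on a POSIX host.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* Simplified stand-in for the connection structure */
struct tcp_conn_sketch {
	uint16_t local_port;	/* host byte order, like the other fields */
};

/* Convert only at the wire boundary when filling in the header */
static uint16_t tcp_wire_src_port ( const struct tcp_conn_sketch *tcp ) {
	return htons ( tcp->local_port );
}
```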
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The handshake record in TLS can contain multiple messages.
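
A minimal sketch of what handling this involves: each handshake message
starts with a one-byte type and a 24-bit big-endian length, so a record
payload has to be walked in a loop. The function name is hypothetical
and the sketch only counts messages rather than processing them.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the handshake messages in one record payload.
 * Returns the count, or -1 if a message is truncated.
 */
static int tls_count_handshakes ( const uint8_t *data, size_t len ) {
	int count = 0;

	while ( len ) {
		size_t msg_len;

		if ( len < 4 )
			return -1;
		/* data[0] is the message type; length is 24-bit big-endian */
		msg_len = ( ( ( size_t ) data[1] << 16 ) |
			    ( ( size_t ) data[2] << 8 ) | data[3] );
		if ( msg_len > ( len - 4 ) )
			return -1;
		/* A real implementation would dispatch on data[0] here */
		data += ( 4 + msg_len );
		len -= ( 4 + msg_len );
		count++;
	}
	return count;
}
```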
Originally-fixed-by: Timothy Stack <tstack@vmware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[interface] Convert all data-xfer interfaces to generic interfaces
Remove data-xfer as an interface type, and replace data-xfer
interfaces with generic interfaces supporting the data-xfer methods.
Filter interfaces (as used by the TLS layer) are handled using the
generic pass-through interface capability. A side-effect of this is
that deliver_raw() no longer exists as a data-xfer method. (In
practice this doesn't lose any efficiency, since there are no
instances within the current codebase where xfer_deliver_raw() is used
to pass data to an interface supporting the deliver_raw() method.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[interface] Convert all name-resolution interfaces to generic interfaces
Remove name-resolution as an interface type, and replace
name-resolution interfaces with generic interfaces supporting the
resolv_done() method.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[interface] Convert all job-control interfaces to generic interfaces
Remove job-control as an interface type, and replace job-control
interfaces with generic interfaces supporting the close() method.
(Both done() and kill() are absorbed into the function of close();
kill() is merely close(-ECANCELED).)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Standardise on using timer_init() to initialise an embedded retry
timer, to match the coding style used by other embedded objects.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Standardise on using ref_init() to initialise an embedded reference
count, to match the coding style used by other embedded objects.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Apart from format specifier fixes, there are two changes to actual code:
- Change type of regs in skge_hw to unsigned long
- Cast result of sizeof in myri10ge to uint32_t
Neither changes anything on i386, and both should be fine on x86_64.
Signed-off-by: Piotr Jaroszyński <p.jaroszynski@gmail.com>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[dhcp] Don't consider invalid offers to be duplicates
This fixes a regression in BOOTP support; since BOOTP requests often
have the `siaddr' field set to 0.0.0.0, they would be considered
duplicates of the first zeroed-out offer slot.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[build] Use weak definitions instead of weak declarations
This removes the need for inline safety wrappers, marginally reducing
the size penalty of weak functions, and works around an apparent
binutils bug that causes undefined weak symbols to not actually be
NULL when compiling with -fPIE (as EFI builds do).
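
The difference can be sketched as follows. The function names are
hypothetical; the point is only that a weak definition always provides
a symbol, so the caller needs no NULL check or inline wrapper.

```c
/* Weak definition: used only if no strong definition is linked in */
__attribute__ (( weak )) int have_feature ( void ) {
	return 0;
}

/* Caller can invoke the function unconditionally */
int feature_status ( void ) {
	return have_feature();
}
```

If another object file supplies a non-weak have_feature(), the linker
picks that one instead of the stub.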
A bug in versions of binutils prior to 2.16 (released in 2005)
prevents same-file weak definitions from working with those
toolchains. Update the README to reflect our new dependency on
binutils >= 2.16.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[dhcp] Honor PXEBS_SKIP option in discovery control
It is permissible for a DHCP packet containing PXE options to specify
only "discovery control", instead of the more typical boot menu +
prompt options. This is the strategy used by older versions of
dnsmasq; by specifying the discovery control as PXEBS_SKIP, they cause
vendor PXE ROMs to ignore boot server discovery and just use the
filename and next-server options in the initial (Proxy)DHCP packet.
Modify iPXE to accept this behavior, to be more compatible with the
Intel firmware.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Tested-by: Kyle Kienapfel <kyle@shadowmage.org>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PMKID checking is an additional pre-check that helps detect invalid
passphrases before going through the full handshaking procedure. It
costs code size, and is not necessary from a security perspective.
It is also implemented incorrectly by some routers, which caused
iPXE to report spurious authentication errors. Remove it for these
reasons.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tcp] Update received sequence number before delivering received data
iPXE currently updates the TCP sequence number after delivering the
data to the application via xfer_deliver_iob(). If the application
responds to the received data by transmitting more data, this would
result in a stale ACK number appearing in the transmitted packet,
which potentially causes retransmissions and also gives the
undesirable appearance of violating causality (by sending a response
to a message that we claim not to have yet received).
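
The ordering issue can be shown with a small model. The structure and
function names are hypothetical, and deliver() stands in for an
application that replies immediately from within the delivery call.

```c
#include <stddef.h>
#include <stdint.h>

struct tcp_seq_sketch {
	uint32_t rcv_ack;	/* next expected sequence number */
	uint32_t last_ack_sent;	/* ACK number used by the most recent reply */
};

/* Stand-in for the application: replies at once to received data,
 * so the reply carries whatever rcv_ack holds at this moment.
 */
static void deliver ( struct tcp_seq_sketch *tcp ) {
	tcp->last_ack_sent = tcp->rcv_ack;
}

/* Receive path with the corrected ordering */
static void rx_data ( struct tcp_seq_sketch *tcp, size_t len ) {
	tcp->rcv_ack += len;	/* update first ... */
	deliver ( tcp );	/* ... then deliver */
}
```

With the two statements in rx_data() swapped, the reply would carry the
old ACK number.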
Reported-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some switch configurations will refuse to enable our port unless we
can speak LACP to inform the switch that we are alive. Add a very
simple passive LACP implementation that is sufficient to convince at
least Linux's bonding driver (when tested using qemu attached to a tap
device enslaved to a bond device configured as "mode=802.3ad").
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Access to the gpxe.org and etherboot.org domains and associated
resources has been revoked by the registrant of the domain. Work
around this problem by renaming the project from gPXE to iPXE, and
updating URLs to match.
Also update README, LOG and COPYRIGHTS to remove obsolete information.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 3d9dd93 introduced a regression in HTTP: if a URI without a
path is specified (e.g. http://netboot.me), we send the empty string
as our GET request. Reintroduce an extra slash when uri->path is NULL,
to turn this into the expected GET /.
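
The fix amounts to a fallback when the path is absent, roughly as in
this sketch. The structure and function names are hypothetical, and
only the start of the request line is built here.

```c
#include <stdio.h>

/* Simplified stand-in for a parsed URI */
struct uri_sketch {
	const char *path;	/* NULL when the URI has no path component */
};

/* Format the start of the request line, defaulting the path to "/" */
static int http_format_get ( char *buf, size_t len,
			     const struct uri_sketch *uri ) {
	return snprintf ( buf, len, "GET %s",
			  ( uri->path ? uri->path : "/" ) );
}
```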
Reported-by: Kyle Kienapfel <doctor.whom@gmail.com>
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
[dhcp] Keep multiple DHCP offers received, and use them intelligently
Instead of keeping only the best IP and PXE offers, store all of them,
and pick the best to use just before a request is sent. This allows
priority differentiation to work even when lower-priority offers
provide PXE options, and improves robustness at sites with broken PXE
servers intermingled with working ones: when a ProxyDHCP request times
out, instead of giving up, we try the next PXE offer we've received.
It also allows us to avoid breaking up combined IP+PXE offers, which
can be important with some firewall configurations. This behavior
matches that of most vendor PXE ROMs.
Store a reference to the DHCPOFFER packet in the offer structure, so
that when registering settings after a successful ACK we can register
the proxy PXE settings we originally received; this removes the need
for a nonstandard duplicate REQUEST/ACK to port 67 of proxy servers
like dnsmasq that provide PXE options in the OFFER.
Total cost: 450 bytes uncompressed.
Signed-off-by: Marty Connor <mdc@etherboot.org>