SRP is the SCSI RDMA Protocol. It allows for a method of SAN booting
whereby the target is responsible for reading and writing data using
Remote DMA directly to the initiator's memory. The software initiator
merely sends and receives SCSI commands; it never has to touch the
actual data.
[infiniband] Add last_opened_ibdev(), analogous to last_opened_netdev()
The minimal-surprise behaviour, when no explicit SRP initiator device
is specified, will probably be to use the most recently opened
Infiniband device. This matches our behaviour with using the most
recently opened net device for PXE, iSCSI, AoE, NBI, etc.
[infiniband] Add a "communication-managed reliable connection" protocol
SRP over Infiniband uses a protocol whereby data is sent via a
combination of the CM private data fields and the RC queue pair
itself. This seems sufficiently generic that it's worth having
available as a separate protocol.
The ACK timeout determines how long we take to notice a failed
Reliable Connection. Reducing it from the arbitrary value of 19 down
to 14 reduces the individual ACK timeout from around 2.1s to 67ms;
this in turn reduces the time to tear down and re-establish a broken
SRP session from around 30s to around 1s.
[hermon] Randomise the high-order bits of queue pair numbers
The Infiniband Communication Manager will refuse to establish a
connection if it believes the connection is already established.
There is no immediately obvious way to ask it to tear down the
existing connection and replace it; to issue a DREP we would need to
know the local and remote communication IDs used for the previous
connection setup.
We can work around this by randomising the high-order bits of the
queue pair number; these have no significance to the hardware, but are
sufficient to convince the IB CM that this is a different connection.
[infiniband] Handle duplicate Communication Management REPs
We will terminate our transaction as soon as we receive the first CM
REP, since that provides all the state that we need. However, the
peer may resend the REP if it didn't see our RTU, and if we don't
respond with another RTU we risk being disconnected. (This protocol
appears not to handle retries gracefully.)
Fix by adding a management agent that will listen for these duplicate
REPs and send back an RTU.
When a probe found no results, the list head of beacons would not be
freed, leaking 16 bytes of memory per probe.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Previously the maximum packet length was computed using an erroneous
understanding of the role of the MIC field in TKIP-encrypted packets.
The field is actually considered to be part of the MSDU (encrypted and
fragmented data), not the MPDU (container for each encrypted
fragment). As such its size does not contribute to cryptographic
overhead outside the data field's size limitations. The net result is
that the previous maximum packet length value was 4 bytes too long;
fix it to the correct value of 2352.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
[802.11] Set channels early on to avoid tuning to an undefined channel
Some cards (such as ath5k) always need to tune to a particular channel
when they are reset; the reset may happen upon open(), which is before
the channels array would be set up (in prepare_probe()). Avoid tuning
the card to an inconsistent state by copying the hardware
supported-channels array to the 802.11 device's allowable-channels
array even before channels are "properly" set up.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
[802.11] Enhance support for driver PHY differences
The prior net80211 model of physical-layer behavior for drivers was
overly simplistic and limited the drivers that could be written. To
be more flexible, split the driver-provided list of supported rates by
band, and add a means for specifying a list of supported channels.
Allow drivers to specify a hardware channel value that will be tied to
uses of the channel.
Expose net80211_duration() to drivers, and make the rate it uses in
its computations configurable, so that it can be used in calculating
durations that must be set in hardware for ACK and CTS packets. Add
net80211_cts_duration() for the common case of calculating the
duration for a CTS packet.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
[geniso] Emit proper error message for incorrect location of ISOLINUX_BIN
If isolinux.bin is not installed in the expected location the error
message shown is slightly misleading.
Signed-off-by: Vibi Sreenivasan <vibi_sreenivasan@cms.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
[infiniband] Add the concept of a management interface
A management interface is the component through which both local and
remote management agents are accessed.
This new implementation of a management interface allows for the user
to react to timed-out transactions, and also allows for cancellation
of in-progress transactions.
[romprefix] Cope with PnP BIOSes that fail to set %es:%di on entry
Some BIOSes support the BIOS Boot Specification (BBS) but fail to set
%es:%di correctly when calling the option ROM initialisation entry
point. This causes gPXE to identify the BIOS as non-PnP (and so
non-BBS), leaving the user unable to control the boot order.
Fix by scanning for the $PnP signature ourselves, rather than relying
on the BIOS having passed in %es:%di correctly.
Tested-by: Helmut Adrigan <helmut.adrigan@chello.at>
[infiniband] Change IB_{QPN,QKEY,QPT} names from {SMA,GMA} to {SMI,GSI}
The IBA specification refers to management "interfaces" and "agents".
The interface is the component that connects to the queue pair and
sends and receives MADs; the agent is the component that constructs
the reply to the MAD.
Rename the IB_{QPN,QKEY,QPT} constants as a first step towards making
this separation in gPXE.
[build] Mark __intel_new_proc_init with __libgcc rather than cdecl
The function __intel_new_proc_init() (called implicitly when building
using icc) is marked with __attribute__((cdecl)). This breaks
building on x86_64, where cdecl is meaningless.
Fix by replacing with the existing __libgcc macro, which is already
defined to be "__attribute__((cdecl))" for i386 builds and empty for
x86_64 builds.
[legal] Add the MIT and ISC licenses to licence.pl
The MIT and ISC licenses are legally equivalent to the bsd2 license,
but with slightly different verbiage.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
[pxe] Dual-license pxe_api.h under the MIT license
pxe_api.h is just a description of API functions, it's actively
undesirable to have more implementations than necessary. Allowing it
under the MIT license lets the Syslinux libraries use it.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
[tcp] Avoid printf format warnings on some compilers
In several places, we currently use size_t to represent a difference
between TCP sequence numbers. This can cause compiler warnings
relating to printf format specifiers, since the result of
(uint32_t+size_t) may be an unsigned long on some compilers.
Fix by using uint32_t for all variables that represent a difference
between TCP sequence numbers.
Tested-by: Joshua Oreman <oremanj@xenon.get-linux.org>
[build] Allow safe concurrent builds of .iso, .liso and .sdsk targets
The geniso, genliso and gensdsk scripts contain hard-coded temporary
directory names, and so could potentially collide with each other when
run as part of a concurrent build (e.g. "make -j 4").
Fix by using mktemp to generate suitable temporary directory names.
We add a syslinux floppy disk type using parts of the genliso script.
This floppy image cat be dd'ed to a physical floppy or used in
instances where a virtual floppy with an mountable DOS filesystem is
useful.
We also modify the genliso script to only generate .liso images
rather than creating images depending on how it is called.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
[802.11] Add support for 802.11 devices with software MAC layer
This is required for all modern 802.11 devices, and allows drivers
to be written for them with minimally more effort than is required
for a wired NIC.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Modified-by: Michael Brown <mcb30@etherboot.org>
[hermon] Allow software GMA to receive packets destined for QP1
The Linux IB Communication Manager will always send MADs to QP1,
rather than back to the originating QP. On Hermon, QP1 is by default
handled by the embedded firmware. We can change this, but the cost is
that we have to handle both QP0 and QP1 (i.e. we have to provide SMA
as well as GMA service in software), and we have to use MLX queues
rather than standard UD queues (i.e. we have to construct the UD
datagrams by hand).
There doesn't seem to be any viable way around this situation, ugly
though it is.
[infiniband] Add infrastructure for RC queue pairs
Queue pairs are now assumed to be created in the INIT state, with a
call to ib_modify_qp() required to bring the queue pair to the RTS
state.
ib_modify_qp() no longer takes a modification list; callers should
modify the relevant queue pair parameters (e.g. qkey) directly and
then call ib_modify_qp() to synchronise the changes to the hardware.
The packet sequence number is now a property of the queue pair, rather
than of the device.
Each queue pair may have an associated address vector. For RC queue
pairs, this is the address vector that will be programmed in to the
hardware as the remote address. For UD queue pairs, it will be used
as the default address vector if none is supplied to ib_post_send().
[infiniband] Allow MAD handlers to indicate response via return value
Now that MAD handlers no longer return a status code, we can allow
them to return a pointer to a MAD structure if and only if they want
to send a response. This provides a more natural and flexible
approach than using a "response method" field within the handler's
descriptor.
[infiniband] Remove the return status code from MAD handlers
MAD handlers have to set the status fields within the MAD itself
anyway, in order to provide a meaningful response MAD; the additional
gPXE return status code is just noise.
Note that we probably don't need to ever explicitly set the status to
IB_MGMT_STATUS_OK, since it should already have this value from the
request. (By not explicitly setting the status in this way, we can
safely have ib_sma_set_xxx() call ib_sma_get_xxx() in order to
generate the GetResponse MAD without worrying that ib_sma_get_xxx()
will clear any error status set by ib_sma_set_xxx().)
[infiniband] Allow external QPN to differ from real QPN
Most IB hardware seems not to allow allocation of the genuine QPNs 0
and 1, so allow for the externally-visible QPN (as constructed and
parsed by ib_packet, where used) to differ from the real
hardware-allocated QPN.