The physical function defaults to operating in "PXE mode" after a
power-on reset. In this mode, receive descriptors are fetched and
written back as single descriptors. In normal (non-PXE) operation,
receive descriptors are fetched and written back only as complete
cachelines unless an interrupt is raised.
There is no way to return to PXE mode from non-PXE mode, and there is
no way for the virtual function driver to operate in PXE mode.
Choose to operate in non-PXE mode. This requires us to trick the
hardware into believing that it is raising an interrupt, so that it
will not defer writing back receive descriptors until a complete
cacheline (i.e. four packets) has been consumed. We do so by
configuring the hardware to use MSI-X with a dummy target location in
place of the usual APIC register.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[intelxl] Split out ring creation from context programming
The virtual function driver will use the same transmit and receive
descriptor ring structures, but will not itself construct and program
the ring context. Split out ring creation and destruction from the
programming of the ring context, to allow code to be shared between
physical and virtual function drivers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[intelxl] Allow for arbitrary placement of ring tail registers
The virtual function transmit and receive ring tail register offsets
do not match those of the physical function. Allow the tail register
offsets to be specified separately.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The physical function driver does not allow the virtual function to
request the use of 16-byte receive descriptors. Switch to using
32-byte receive descriptors.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[intelxl] Provide a mechanism for handling "send to VF" events
Provide a weak stub function for handling the "send to VF" event used
for communications between the physical and virtual function drivers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[intelxl] Allow admin cookie to hold extended opcode and return code
The "send to PF" and "send to VF" admin queue descriptors (ab)use the
cookie field to hold the extended opcode and return code values.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
A virtual function reset is triggered via an admin queue command and
will reset the admin queue configuration registers. Allow the admin
queues to be reinitialised after such a reset, without requiring the
overhead (and potential failure paths) of freeing and reallocating the
queues.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[intelxl] Use one admin queue buffer per admin queue descriptor
We currently use a single data buffer shared between all admin queue
descriptors. This works for the physical function driver since we
have at most one command in progress and only a single event (which
does not use a data buffer).
The communication path between the physical and virtual function
drivers uses the event data buffer, and there is no way to prevent a
solicited event (i.e. a response to a request) from being overwritten
by an unsolicited event (e.g. a link status change).
Provide individual data buffers for each admin event queue descriptor
(and for each admin command queue descriptor, for the sake of
consistency).
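As a minimal sketch of the idea (not the driver code itself), the
following simulation gives each descriptor index its own buffer, so
that a later event cannot overwrite the data of an earlier one. The
names admin_queue, admin_post_event() and admin_event_data(), and the
ring and buffer sizes, are hypothetical and exist only for
illustration.

```c
#include <stddef.h>
#include <string.h>

#define ADMIN_NUM_DESC 4	/* hypothetical ring size */
#define ADMIN_BUF_LEN 64	/* hypothetical buffer length */

/* Simulated admin event queue: one data buffer per descriptor */
struct admin_queue {
	char buf[ADMIN_NUM_DESC][ADMIN_BUF_LEN];
	unsigned int prod;	/* next slot filled by simulated hardware */
};

/* Simulate the hardware posting an event into the next descriptor */
static unsigned int admin_post_event ( struct admin_queue *queue,
				       const char *data ) {
	unsigned int index = ( queue->prod++ % ADMIN_NUM_DESC );

	strncpy ( queue->buf[index], data, ( ADMIN_BUF_LEN - 1 ) );
	queue->buf[index][ ADMIN_BUF_LEN - 1 ] = '\0';
	return index;
}

/* Retrieve the data buffer belonging to a given descriptor */
static const char * admin_event_data ( const struct admin_queue *queue,
				       unsigned int index ) {
	return queue->buf[ index % ADMIN_NUM_DESC ];
}
```

With a single shared buffer, the second call to admin_post_event()
would have replaced the contents seen by the first descriptor.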
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[intelxl] Allow for virtual function admin queue register maps
The register map for the virtual functions appears to have been
constructed using a random number generator.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[intelxl] Use VLAN tag in receive descriptor if present
The physical function driver does not allow the virtual function to
request that VLAN tags are left unstripped. Extract and use the VLAN
tag from the receive descriptor if present.
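The extraction step can be sketched as below. The descriptor layout,
the RX_FL_VLAN flag and the rx_vlan_tag() helper are hypothetical and
do not reflect the real hardware descriptor format. The sketch also
assumes that a tag value of zero is used to mean "no VLAN tag".

```c
#include <stdint.h>

/* Hypothetical receive descriptor layout, for illustration only */
struct rx_desc {
	uint32_t flags;		/* status flags */
	uint16_t vlan_tci;	/* stripped VLAN tag control information */
};

#define RX_FL_VLAN 0x00000001U	/* hypothetical "VLAN tag present" flag */
#define VLAN_TAG_MASK 0x0fff	/* VLAN ID is the low 12 bits of the TCI */

/* Return the VLAN tag from the descriptor, or zero if none is present */
static unsigned int rx_vlan_tag ( const struct rx_desc *desc ) {
	if ( ! ( desc->flags & RX_FL_VLAN ) )
		return 0;
	return ( desc->vlan_tci & VLAN_TAG_MASK );
}
```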
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[vlan] Provide vlan_netdev_rx() and vlan_netdev_rx_err()
The Hermon driver uses vlan_find() to identify the appropriate VLAN
device for packets that are received with the VLAN tag already
stripped out by the hardware. Generalise this capability and expose
it for use by other network card drivers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Intel 40 Gigabit Ethernet virtual functions support only MSI-X
interrupts, and will write back completed receive descriptors only
when the device attempts to raise an interrupt (or when a complete
cacheline of receive descriptors has been completed).
We cannot actually use MSI-X interrupts within iPXE, since we never
have ownership of the APIC. However, an MSI-X interrupt is
fundamentally just a DMA write of a single dword to an arbitrary
address. We can therefore configure the device to "raise" an
interrupt by writing a meaningless value to an otherwise unused memory
location: this is sufficient to trigger the receive descriptor
writeback logic.
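The mechanism can be illustrated with a small userspace simulation.
The four-dword table entry layout and the mask bit follow the PCI
specification; msix_map_dummy() and msix_raise() are hypothetical
names, and an ordinary pointer stands in for what would be a DMA
address on real hardware.

```c
#include <stdint.h>

/* MSI-X table entry layout, as defined by the PCI specification */
struct msix_entry {
	uint32_t addr_lo;	/* message address (low 32 bits) */
	uint32_t addr_hi;	/* message address (high 32 bits) */
	uint32_t data;		/* message data */
	uint32_t ctrl;		/* vector control */
};

#define MSIX_CTRL_MASK 0x00000001U	/* vector is masked */

/* Point a vector at a dummy dword and unmask it.
 * Assumption: a virtual address stands in for the DMA address.
 */
static void msix_map_dummy ( struct msix_entry *entry,
			     volatile uint32_t *dummy, uint32_t value ) {
	uint64_t address = ( ( uint64_t ) ( uintptr_t ) dummy );

	entry->addr_lo = ( ( uint32_t ) ( address & 0xffffffffU ) );
	entry->addr_hi = ( ( uint32_t ) ( address >> 32 ) );
	entry->data = value;
	entry->ctrl &= ~MSIX_CTRL_MASK;
}

/* Simulate the device raising the interrupt: a single dword write */
static int msix_raise ( const struct msix_entry *entry ) {
	uint64_t address;

	if ( entry->ctrl & MSIX_CTRL_MASK )
		return 0;
	address = ( ( ( ( uint64_t ) entry->addr_hi ) << 32 ) |
		    entry->addr_lo );
	*( ( volatile uint32_t * ) ( uintptr_t ) address ) = entry->data;
	return 1;
}
```

Nothing ever reads the dummy dword; the write exists only so that the
device believes an interrupt has been delivered.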
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The first adapters in this family are X2522-10, X2522-25, X2541 and
X2542.
These adapters no longer use PCI BAR 0 for I/O and instead use it for
memory. In other words, what was BAR 2 on SFN8xxx adapters is now
BAR 0.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[ethernet] Use standard 1500 byte MTU unless explicitly overridden
Devices that support jumbo frames will currently default to the
largest possible MTU. This assumption is valid for virtual adapters
such as virtio-net, where the MTU must have been configured by a
system administrator, but is unsafe in the general case of a physical
adapter.
Default to the standard Ethernet MTU, unless explicitly overridden
either by the driver or via the ${netX/mtu} setting.
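A minimal sketch of the selection logic follows. The choose_mtu()
helper and its parameters are hypothetical, zero is assumed to mean
"not specified", and the precedence of the setting over the driver
value is an assumption made for the sake of the example.

```c
#include <stddef.h>

#define ETH_STD_MTU 1500	/* standard Ethernet MTU */

/* Choose an MTU: standard by default, overridable, never above max */
static size_t choose_mtu ( size_t max_mtu, size_t driver_mtu,
			   size_t setting_mtu ) {
	size_t mtu = ETH_STD_MTU;

	/* Driver override, if any */
	if ( driver_mtu )
		mtu = driver_mtu;

	/* Explicit setting, if any (assumed to take precedence) */
	if ( setting_mtu )
		mtu = setting_mtu;

	/* Never exceed what the device can support */
	if ( mtu > max_mtu )
		mtu = max_mtu;

	return mtu;
}
```

For example, a jumbo-capable device with no overrides yields 1500
rather than its maximum of 9000.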
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add the function mii_find() in order to locate the PHY address.
Signed-off-by: Sylvie Barlow <sylvie.c.barlow@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[mii] Separate concepts of MII interface and MII device
We currently have no generic concept of a PHY address, since all
existing implementations simply hardcode the PHY address within the
MII access methods.
A bit-bashing MII interface will need to be provided with an explicit
PHY address in order to generate the correct waveform. Allow for this
by separating out the concept of a MII device (i.e. a specific PHY
address attached to a particular MII interface).
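The separation can be sketched as follows. The structure and function
names (mii_bus, mii_phy, mii_phy_read(), fake_read()) are hypothetical
and are not the names used in the tree; fake_read() is a stand-in bus
that simply echoes the PHY address and register number.

```c
#include <stdint.h>

/* An MII interface: knows how to access any PHY address on the bus */
struct mii_bus {
	uint16_t ( * read ) ( struct mii_bus *bus, unsigned int phy,
			      unsigned int reg );
};

/* An MII device: a specific PHY address on a particular interface */
struct mii_phy {
	struct mii_bus *bus;
	unsigned int address;
};

/* Read a register from an MII device */
static uint16_t mii_phy_read ( struct mii_phy *phy, unsigned int reg ) {
	return phy->bus->read ( phy->bus, phy->address, reg );
}

/* Stand-in bus for illustration: echoes PHY address and register */
static uint16_t fake_read ( struct mii_bus *bus, unsigned int phy,
			    unsigned int reg ) {
	( void ) bus;
	return ( ( uint16_t ) ( ( phy << 8 ) | reg ) );
}
```

The access method receives the PHY address as a parameter instead of
hardcoding it, which is what a bit-bashing interface would need.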
Signed-off-by: Michael Brown <mcb30@ipxe.org>
This is required to work around a bug in some firmware versions.
Signed-off-by: Ameer Mahagneh <ameerm@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[efi] Provide Map_Mem() and associated UNDI callbacks
Some drivers are known to call the optional Map_Mem() callback without
first checking that the callback exists. Provide a usable basic
implementation of Map_Mem() along with the other callbacks that become
mandatory if Map_Mem() is provided.
Note that in theory the PCI I/O protocol is allowed to require
multiple calls to Map(), with each call handling only a subset of the
overall mapped range. However, the reference implementation in EDK2
assumes that a single Map() will always suffice, so we can probably
make the same simplifying assumption here.
Tested with the Intel E3522X2.EFI driver (which, incidentally, fails
to cleanly remove one of its mappings).
Originally-implemented-by: Maor Dickman <maord@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Remove the global variable shomron_nodnic_supported, since it may have
different values for different PCI devices.
Originally-fixed-by: Mohammed Taha <mohammedt@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[xhci] Consume event TRB before reporting completion to USB core
Reporting a completion via usb_complete() will pass control outside
the scope of xhci.c, and could potentially result in a further call to
xhci_event_poll() before returning from usb_complete(). Since we
currently update the event consumer counter only after calling
usb_complete(), this can result in duplicate completions and
consequent corruption of the submission TRB ring structures.
Fix by updating the event ring consumer counter before passing control
to usb_complete().
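The ordering can be illustrated with a simplified simulation, which
is not the xhci.c code. Here event_ring, event_poll() and
report_completion() are hypothetical names, a "pending" flag stands in
for the real TRB ownership check, and report_completion() deliberately
re-enters the poll function to mimic the problematic call path.

```c
#include <stddef.h>

#define RING_LEN 4	/* hypothetical ring size */

/* Simplified event ring */
struct event_ring {
	int pending[RING_LEN];		/* nonzero if an event is waiting */
	unsigned int cons;		/* consumer counter */
	unsigned int completions;	/* completions reported so far */
};

static void event_poll ( struct event_ring *ring );

/* Stand-in for usb_complete(): may re-enter the poll function */
static void report_completion ( struct event_ring *ring ) {
	ring->completions++;
	event_poll ( ring );
}

/* Poll the event ring */
static void event_poll ( struct event_ring *ring ) {
	unsigned int index;

	for ( ; ; ) {
		index = ( ring->cons % RING_LEN );
		if ( ! ring->pending[index] )
			break;

		/* Consume the event before passing control elsewhere */
		ring->pending[index] = 0;
		ring->cons++;

		/* Only now report the completion */
		report_completion ( ring );
	}
}
```

Because the event is consumed first, the nested poll sees only events
that have not yet been handled, and each completion is reported once.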
Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[intel] Work around broken reset mechanism in i219 devices
The i219 appears to have a seriously broken reset mechanism. After
any transmit or receive activity, resetting the card will break both
the transmit and receive datapaths until the next PCI bus reset.
The Linux and BSD drivers include a convoluted workaround authored by
Intel which involves setting a bit in the undocumented FEXTNVM11
register, then transmitting a dummy 512-byte packet containing garbage
data, then reconfiguring the receive descriptor prefetch thresholds
and temporarily reenabling the receive datapath. The comments in the
Intel fix do not even remotely match what the code actually does, and
the code accidentally leaves the transmitter enabled after use.
Experimentation suggests that an equivalent fix is to simply set the
undocumented bit in FEXTNVM11 before enabling the transmit or receive
descriptor rings.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[xhci] Assume an invalid PSI table if any invalid PSI value is observed
Invalid protocol speed ID tables appear to be increasingly common in
the wild, to the point that it is infeasible to apply an explicit
XHCI_BAD_PSIV flag for each offending PCI device ID.
Fix by assuming an invalid PSI table as soon as any invalid value is
reported by the hardware.
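A sketch of this behaviour, under simplifying assumptions, is shown
below. The psi_table structure and psi_lookup() are hypothetical; a
return value of -1 is assumed to tell the caller to fall back to the
default speed mapping.

```c
#include <stddef.h>

/* Simplified protocol speed ID table */
struct psi_table {
	const unsigned int *psiv;	/* IDs advertised by the hardware */
	size_t count;			/* number of table entries */
	int bad;			/* table is known to be invalid */
};

/* Look up a protocol speed ID value.
 *
 * Returns the table index, or -1 if the caller should use the
 * default speed mapping instead.
 */
static int psi_lookup ( struct psi_table *table, unsigned int psiv ) {
	size_t i;

	if ( ! table->bad ) {
		for ( i = 0 ; i < table->count ; i++ ) {
			if ( table->psiv[i] == psiv )
				return ( ( int ) i );
		}
		/* Unknown value: distrust the whole table from now on */
		table->bad = 1;
	}
	return -1;
}
```

Once a single invalid value has been seen, even values that do appear
in the table are no longer looked up there.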
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[ena] Fix spurious uninitialised variable warning on older versions of gcc
Some older versions of gcc (observed with gcc 4.7.2) report a spurious
uninitialised variable warning in ena_get_device_attributes(). Work
around this warning by manually inlining the relevant code (which has
only a single call site).
Reported-by: xbgmsharp <xbgmsharp@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Most drivers do not utilise an MII interface, since the link state is
typically available directly from a memory-mapped register.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The SnpDxe driver raises the task priority level to TPL_CALLBACK when
calling the UNDI entry point. This does not appear to be a documented
requirement, but we should probably match the behaviour of SnpDxe to
minimise surprises to third party code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The tap driver can retrieve a potentially unlimited number of packets
in a single poll. This can lead to heap exhaustion under heavy load.
Fix by imposing an artificial receive quota (as already used in other
drivers without natural receive limits).
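The quota can be sketched as a simple bounded loop. The fake_dev
structure, poll_rx() and the quota value of 4 are hypothetical and
chosen only to keep the example small.

```c
#define RX_QUOTA 4	/* hypothetical per-poll receive quota */

/* Simulated device with a backlog of waiting packets */
struct fake_dev {
	unsigned int available;
};

/* Receive at most RX_QUOTA packets in a single poll */
static unsigned int poll_rx ( struct fake_dev *dev ) {
	unsigned int quota;
	unsigned int received = 0;

	for ( quota = RX_QUOTA ; quota && dev->available ; quota-- ) {
		/* A real driver would allocate a buffer and read here */
		dev->available--;
		received++;
	}
	return received;
}
```

Any remaining backlog is left for subsequent polls, which bounds the
memory allocated per poll regardless of load.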
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The UEFI specification has an implicit and demonstrably incorrect
requirement (in the Mem_IO() calling convention) that any UNDI network
device has at most one memory BAR and one I/O BAR.
Some UEFI platforms have been observed to report the existence of
non-existent additional I/O BARs, causing iPXE to select the wrong
BAR. This problem does not affect the SnpDxe driver, since that
driver will always choose the lowest numbered existent BAR of each
type.
Adjust iPXE's behaviour to match that of SnpDxe, i.e. to always select
the lowest numbered BAR(s).
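The selection rule can be sketched as a first-match scan. The
bar_type enumeration and lowest_bar() are hypothetical names used
only for illustration.

```c
#define NUM_BARS 6	/* a PCI function has at most six BARs */

/* Simplified BAR classification */
enum bar_type {
	BAR_NONE = 0,	/* BAR not present */
	BAR_MEM,	/* memory BAR */
	BAR_IO,		/* I/O BAR */
};

/* Return the lowest numbered BAR of the given type, or -1 if none */
static int lowest_bar ( const enum bar_type *bars, enum bar_type type ) {
	int i;

	for ( i = 0 ; i < NUM_BARS ; i++ ) {
		if ( bars[i] == type )
			return i;
	}
	return -1;
}
```

If a platform reports a spurious extra I/O BAR at a higher index, the
scan still returns the first (lowest numbered) one.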
Debugged-by: Andreas Hammarskjöld <junior@2PintSoftware.com>
Debugged-by: Adklei <adklei@realtek.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[smsc75xx] Expose functionality shared with LAN78xx devices
The LAN78xx datapath is essentially identical to that of the SMSC75xx.
Expose the transmit, poll, and bulk IN endpoint operations to allow
for reuse by the LAN78xx driver.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[smscusb] Allow for alternative PHY register layouts
The LAN78xx PHY interrupt source and mask registers do not match those
used by the SMSC75xx and SMSC95xx.
Signed-off-by: Michael Brown <mcb30@ipxe.org>