[hyperv] Cope with Windows Server 2016 enlightenments
An "enlightened" external bootloader (such as Windows Server 2016's
winload.exe) may take ownership of the Hyper-V connection before all
INT 13 operations have been completed. When this happens, all VMBus
devices are implicitly closed and we are left with a non-functional
network connection.
Detect when our Hyper-V connection has been lost (by checking the
SynIC message page MSR). Reclaim ownership of the Hyper-V connection
and reestablish any VMBus devices, without disrupting any existing
iPXE state (such as IPv4 settings attached to the network device).
Windows Server 2016 will not cleanly take ownership of an active
Hyper-V connection. Experimentation shows that we can quiesce by
resetting only the SynIC message page MSR; this results in a
successful SAN boot (on a Windows 2012 R2 physical host). Choose to
quiesce by resetting (almost) all MSRs, in the hope that this will be
more robust against corner cases such as a stray synthetic interrupt
occurring during the handover.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
In some high-end Azure instances (e.g. NC6) we may receive an
unsolicited VMBUS_OFFER_CHANNEL message for a PCIe pass-through device
some time after completing the bus enumeration. This currently causes
apparently random failures due to unexpected VMBus message types.
Fix by ignoring any unsolicited VMBus messages.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Windows drivers for VMBus devices are enumerated using the
instance UUID rather than the channel number. Include the instance
UUID within the iPXE device name to allow an iPXE network device to be
more easily associated with the corresponding Windows network device
when debugging.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[hyperv] Require support for VMBus version 3.0 or newer
We require the ability to disconnect from and reconnect to VMBus; if
we don't have this then there is no (viable) way for a loaded
operating system to continue to use any VMBus devices. (There is also
a small but non-zero risk that the host will continue to write to our
interrupt and monitor pages, since the VMBUS_UNLOAD message in earlier
versions is essentially a no-op.)
This requires us to ensure that the host supports protocol version 3.0
(VMBUS_VERSION_WIN8_1). However, we can't actually _use_ protocol
version 3.0, since doing so causes an iSCSI-booted Windows Server 2012
R2 VM to crash due to a NULL pointer dereference in vmbus.sys.
To work around this problem, we first ensure that we can connect using
protocol v3.0, then disconnect and reconnect using the oldest known
protocol.
This deliberately prevents the use of the iPXE native Hyper-V drivers
on older versions of Hyper-V, where we could use our drivers but in so
doing would break the loaded operating system.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[hyperv] Assume that VMBus xfer page ranges correspond to RNDIS messages
The (undocumented) VMBus protocol seems to allow for transfer
page-based packets where the data payload is split into an arbitrary
set of ranges within the transfer page set.
The RNDIS protocol includes a length field within the header of each
message, and it is known from observation that multiple RNDIS messages
can be concatenated into a single VMBus message.
iPXE currently assumes that the transfer page range boundaries are
entirely arbitrary, and uses the RNDIS header length to determine the
RNDIS message boundaries.
Windows Server 2012 R2 generates an RNDIS_INDICATE_STATUS_MSG for an
undocumented and unknown status code (0x40020006) with a malformed
RNDIS header length: the length does not cover the StatusBuffer
portion of the message. This causes iPXE to report a malformed RNDIS
message and to discard any further RNDIS messages within the same
VMBus message.
The Linux Hyper-V driver assumes that the transfer page range
boundaries correspond to RNDIS message boundaries, and so does not
notice the malformed length field in the RNDIS header.
Match the behaviour of the Linux Hyper-V driver: assume that the
transfer page range boundaries correspond to the RNDIS message
boundaries and ignore the RNDIS header length. This avoids triggering
the "malformed packet" error and also avoids unnecessary data copying:
since we now have one I/O buffer per RNDIS message, there is no longer
any need to use iob_split().
Signed-off-by: Michael Brown <mcb30@ipxe.org>