[lkrnprefix] Allow relocation when no initrd is present
Commit 2629b7e ("[pcbios] Inhibit all calls to INT 15,e820 and INT
15,e801 during POST") introduced a regression into .lkrn images when
used with no corresponding initrd.
Specifically, the semantics of the "maximum address for relocation"
value passed to install_prealloc() in %ebp changed so that zero became
a special value meaning "inhibit use of INT 15,e820 and INT 15,e801".
The %ebp value meaing "no upper limit on relocation" was changed from
zero to 0xffffffff, and all prefixes providing fixed values for %ebp
were updated to match the new semantics.
The .lkrn prefix provides the initrd base address as the maximum
address for relocation. When no initrd is present, this address will
be zero, and so will unintentionally trigger the "inhibit INT 15,e820
and INT 15,e801" behaviour.
Fix by explicitly setting %ebp to 0xffffffff if no initrd is present
before calling install_prealloc().
Reported-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Tested-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Display only one "Ctrl-B" prompt per PCI device during POST
If a multifunction PCI device exposes an iPXE ROM via each function,
then each function will display a "Press Ctrl-B to configure iPXE"
prompt, and delay for two seconds. Since a single instance of iPXE
can drive all functions on the multifunction device, this simply adds
unnecessary delay to the boot process.
Fix by inhibiting the "Press Ctrl-B" prompt for all except the first
function on a PCI device.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[pcbios] Inhibit all calls to INT 15,e820 and INT 15,e801 during POST
Many BIOSes do not construct the full system memory map until after
calling the option ROM initialisation entry points. For several
years, we have added sanity checks and workarounds to accommodate
charming quirks such as BIOSes which report the entire 32-bit address
space (including all memory-mapped PCI BARs) as being usable RAM.
The IBM x3650 takes quirky behaviour to a new extreme. Calling either
INT 15,e820 or INT 15,e801 during POST doesn't just get you invalid
data. We could cope with invalid data. Instead, these nominally
read-only API calls manage to trash some internal BIOS state, with the
result that the system memory map is _never_ constructed. This tends
to confuse subsequent bootloaders and operating systems.
[ GRUB 0.97 fails in a particularly amusing way. Someone thought it
would be a good idea for memcpy() to check that the destination memory
region is a valid part of the system memory map; if not, then memcpy()
will sulk, fail, and return NULL. This breaks pretty much every use
of memcpy() including, for example, those inserted implicitly by gcc
to copy non-const initialisers. Debugging is _fun_ when a simple call
to printf() manages to create an infinite recursion, exhaust the
available stack space, and shut down the CPU. ]
Fix by completely inhibiting calls to INT 15,e820 and INT 15,e801
during POST.
We do now allow relocation during POST up to the maximum address
returned by INT 15,88 (which seems so far to always be safe). This
allows us to continue to have a reasonable size of external heap, even
if the PMM allocation is close to the 1MB mark.
The downside of allowing relocation during POST is that we may
overwrite PMM-allocated memory in use by other option ROMs. However,
the downside of inhibiting relocation, when combined with also
inhibiting calls to INT 15,e820 and INT 15,e801, would be that we
might have no external heap available: this would make booting an OS
impossible and could prevent some devices from even completing
initialisation.
On balance, the lesser evil is probably to allow relocation during
POST (up to the limit provided by INT 15,88). Entering iPXE during
POST is a rare operation; on the even rarer systems where doing so
happens to overwrite a PMM-allocated region, then there exists a
fairly simple workaround: if the user enters iPXE during POST and
wishes to exit iPXE, then the user must reboot. This is an acceptable
cost, given the rarity of the situation and the simplicity of the
workaround.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[prefix] Use %cs as implicit parameter to uninstall()
romprefix.S currently calls uninstall() with an invalid value in %ax.
Consequently, base memory is not freed after a ROM boot attempt (or
after entering iPXE during POST).
The uninstall() function is physically present in .text16, and so can
use %cs to determine the .text16 segment address. The .data16 segment
address is not required, since uninstall() is called only by code
paths which set up .data16 to immediately follow .text16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tftp] Allow TFTP block size to be controlled via the PXE TFTP API
The PXE TFTP API allows the caller to request a particular TFTP block
size. Since mid-2008, iPXE has appended a "?blksize=xxx" parameter to
the TFTP URI constructed internally; nothing has ever parsed this
parameter. Nobody seems to have cared that this parameter has been
ignored for almost five years.
Fix by using xfer_window(), which provides a fairly natural way to
convey the block size information from the PXE TFTP API to the TFTP
protocol layer.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[undi] Work around specific devices with known broken interrupt behaviour
Some PXE stacks are known to claim that IRQs are supported, but then
never generate interrupts. No satisfactory solution has been found to
this problem; the workaround is to add the PCI vendor and device IDs
to a list of devices which will be treated as simply not supporting
interrupts.
This is something of a hack, since it will generate false positives
for identical devices with a working PXE stack (e.g. those that have
been reflashed with iPXE), but it's an improvement on the current
situation.
Reported-by: Richard Moore <rich@richud.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
At present, loading a bzImage via iPXE requires enough RAM to hold two
copies of each initrd file. Remove this constraint by rearranging the
initrds in place.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
No code from the original source remains within this file; relicense
under GPL2+ with a new copyright notice.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[build] Use -maccumulate-outgoing-args if required by gcc
Current versions of gcc require -maccumulate-outgoing-args if any
sysv_abi functions call ms_abi functions. This requirement is likely
to be lifted in future gcc versions, so test explicitly to see if the
current version of gcc requires -maccumulate-outgoing-args.
This problem is currently masked since the implied
-fasynchronous-unwind-tables (which is the default in current gcc
versions) implies -maccumulate-outgoing-args.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[int13] Do not zero %edx when jumping to a boot sector
Commit 73eb3f1 ("[int13] Zero all possible registers when jumping to a
boot sector") introduced a regression preventing the SAN-booting of
boot sectors which rely upon %dl containing the correct drive number
(such as most CD-ROM boot sectors).
Fix by not zeroing %edx.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[int13] Zero all possible registers when jumping to a boot sector
At least one boot sector (the DUET boot sector used for bootstrapping
EFI from a non-EFI system) fails to initialise the high words of
registers before using them in calculations, leading to undefined
behaviour.
Work around such broken boot sectors by explicitly zeroing the
contents of all registers apart from %cs:%ip and %ss:%sp.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[ipoib] Expose Ethernet-compatible eIPoIB link-layer addresses and headers
Almost all clients of the raw-packet interfaces (UNDI and SNP) can
handle only Ethernet link layers. Expose an Ethernet-compatible link
layer to local clients, while remaining compatible with IPoIB on the
wire. This requires manipulation of ARP (but not DHCP) packets within
the IPoIB driver.
This is ugly, but it's the only viable way to allow IPoIB devices to
be driven via the raw-packet interfaces.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[comboot] Accept only ".cbt" as an extension for COMBOOT images
COMBOOT images are detected by looking for a ".com" or ".cbt" filename
extension. There are widely-used files with a ".com" extension, such
as "wdsnbp.com", which are PXE images rather than COMBOOT images.
Avoid false detection of PXE images as COMBOOT images by accepting
only a ".cbt" extension as indicating a COMBOOT image.
Interestingly, this bug has been present for a long time but was
frequently concealed because the filename was truncated to fit the
fixed-length "name" field in struct image. (PXE binaries ending in
".com" tend to be related to Windows deployment products and so often
use pathnames including backslashes, which iPXE doesn't recognise as a
path separator and so treats as part of a very long filename.)
Commit 1c127a6 ("[image] Simplify image management commands and
internal API") made the image name a variable-length field, and so
exposed this flaw in the COMBOOT image detection algorithm.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Round up PMM allocation sizes to nearest 4kB
Some AMI BIOSes apparently break in exciting ways when asked for PMM
allocations for sizes that are not multiples of 4kB.
Fix by rounding up the image source area to the nearest 4kB. (The
temporary decompression area is already rounded up to the nearest
128kB, to facilitate sharing between multiple iPXE ROMs.)
Reported-by: Itay Gazit <itayg@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Similarly to FreeBSD, OpenBSD requires the object format to be
specified as elf_i386_obsd rather than elf_i386.
Reported-by: Jiri B <jirib@devio.us>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Report a pessimistic runtime size estimate
PCI3.0 allows us to report a "runtime size" which can be smaller than
the actual ROM size. On systems that support PMM our runtime size
will be small (~2.5kB), which helps to conserve the limited option ROM
space. However, there is no guarantee that the PMM allocation will
succeed, and so we need to report the worst-case runtime size in the
PCI header.
Move the "shrunk ROM size" field from the PCI header to a new "iPXE
ROM header", allowing it to be accessed by ROM-manipulation utilities
such as disrom.pl.
Reported-by: Anton D. Kachalov <mouse@yandex-team.ru>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PXENV_FILE_CMDLINE is an iPXE extension, and will not be supported by
most PXE stacks. Do not report any errors to the user, since in
almost all cases the error will mean simply "not loaded by iPXE".
Reported-by: Patrick Domack <patrickdk@patrickdk.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Attempt to restore the network device to the state it was in prior to
calling the NBP. This simplifies the task of taking follow-up action
in an iPXE script.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[pxeprefix] Place temporary stack after iPXE binary
Some BIOSes (observed on a Supermicro system with an AMI BIOS) seem to
use the area immediately below 0x7c00 to store data related to the
boot process. This data is currently liable to be overwritten by the
temporary stack used while decompressing and installing iPXE.
Try to avoid any such problems by placing the temporary stack
immediately after the loaded iPXE binary. Any memory used by the
stack could then potentially have been overwritten anyway by a larger
binary.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[bzimage] Update setup_move_size only for protocol versions 2.00 and 2.01
The setup_move_size field is not defined in protocol versions earlier
than 2.00 (and is obsolete in versions later than 2.01). In binaries
using versions earlier than 2.00, the relevant location is likely to
contain executable code.
Interestingly, this bug has been present since support for pre-2.00
protocol versions was added in 2009, and has been unexpectedly
modifying the memtest86+ code fragment:
mov $0x92, %dx
inb %dx, %al
Fortuitously, the modification exactly overwrote the value loaded into
%dx, and so the net effect was limited to causing Fast Gate A20
detection to always fail.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[undi] Align the received frame payload for faster processing
The undinet driver always has to make a copy of the received frame
into an I/O buffer. Align this copy sensibly so that subsequent
operations are as fast as possible.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tcpip] Add faster algorithm for calculating the TCP/IP checksum
The generic TCP/IP checksum implementation requires approximately 10
CPU clocks per byte (as measured using the TSC). Improve this to
approximately 0.5 CPU clocks per byte by using "lodsl ; adcl" in an
unrolled loop.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[tcpip] Allow for architecture-specific TCP/IP checksum routines
Calculating the TCP/IP checksum on received packets accounts for a
substantial fraction of the response latency.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The "rep" prefix can be used with an iteration count of zero, which
allows the variable-length memcpy() to be implemented without using
any conditional jumps.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Treat 0xffffffff as an error return from PMM
PMM defines the return code 0xffffffff as meaning "unsupported
function". It's hard to imagine a PMM BIOS that doesn't support
pmmAllocate(), but apparently such things do exist.
Signed-off-by: Michael Brown <mcb30@ipxe.org>