[zbin] Fix check for existence of most recent output byte
The code in lzma_literal() checks to see if we are at the start of the
compressed input data in order to determine whether or not a most
recent output byte exists. This check is incorrect, since
initialisation of the decompressor will always consume the first five
bytes of the compressed input data.
Fix by instead checking whether or not we are at the start of the
output data stream. This is, in any case, a more logical check.
This issue was masked during development and testing since virtual
machines tend to zero the initial contents of RAM; the spuriously-read
"most recent output byte" is therefore likely to already be a zero
when running in a virtual machine.
Reported-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[zbin] Allow decompressor to generate debug output via BIOS console
The 0xe9 debug port exists only on virtual machines. Provide an
option to print debug output on the BIOS console, to allow for
debugging on real hardware.
Note that this option can be used only if the decompressor is called
in flat real mode; the easiest way to achieve this is to build with
DEBUG=libprefix.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[prefix] Call decompressor in flat real mode when DEBUG=libprefix is enabled
Allow the decompressor the option of generating debugging output via
the BIOS console by calling it in flat real mode (rather than 16-bit
protected mode) when libprefix.S is built with debugging enabled.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[zbin] Perform extra normalisation after completing decompression
LZMA performs an extra normalisation after decompression is complete,
which does not affect the output but may consume an extra byte from
the input (and so may affect which byte is identified as being the
start of the next block).
Reported-by: Robin Smidsrød <robin@smidsrod.no>
Tested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
LZMA provides significantly better compression (by ~15%) than the
current NRV2B algorithm.
We use a raw LZMA stream (aka LZMA1) to avoid the need for code to
parse the LZMA2 block headers. We use parameters {lc=2,lp=0,pb=0} to
reduce the stack space required by the decompressor to acceptable
levels (around 8kB). Using lc=3 or pb=2 would give marginally better
compression, but at the cost of substantially increasing the required
stack space.
The build process now requires the liblzma headers to be present on
the build system, since we do not include a copy of an LZMA compressor
within the iPXE source tree. The decompressor is written from scratch
(based on XZ Embedded) and is entirely self-contained within the
iPXE source.
The branch-call-jump (BCJ) filter used to improve the compressibility
is specific to iPXE. We choose not to use liblzma's built-in BCJ
filter since the algorithm is complex and undocumented. Our BCJ
filter achieves approximately the same results (on typical iPXE
binaries) with a substantially simpler algorithm.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[prefix] Use .bss16 as temporary stack space for calls to install_block
Some decompression algorithms (e.g. LZMA) require large amounts of
temporary stack space, which may not be made available by all
prefixes. Use .bss16 as a temporary stack for the duration of the
calls to install_block (switching back to the external stack before we
start making calls into code which might access variables in .bss16),
and allow the decompressor to define a global symbol to force a
minimum value on the size of .bss16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[build] Use PRODUCT_SHORT_NAME for end-user visible strings
Use PRODUCT_SHORT_NAME instead of a hardcoded "iPXE" for strings which
are typically shown in the user interface.
Note that this only allows for customisation of the user interface.
Where the "iPXE" string serves a technical purpose (such as in the
HTTP User-Agent), the string cannot be customised.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[mromprefix] Allow for .mrom images larger than 128kB
The .mrom payload has a code type of 0xff and so the initialisation
length field (single byte at offset 0x02) does not need to be
present. Use only the PCI header's image length field, which allows
the .mrom payload to be up to 32MB in size.
Inspired-by: Swift Geek <swiftgeek@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[mromprefix] Use PCI length field to obtain length of individual images
mromprefix.S currently uses the initialisation length field (single
byte at offset 0x02) to determine the length of a ROM image within a
multi-image ROM BAR. For PCI ROM images with a code type other than
0, the initialisation length field may not be present.
Fix by using the PCI header's image length field instead.
Inspired-by: Swift Geek <swiftgeek@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The build process has for a long time assumed that every ROM is a PCI
ROM, and will always include the PCI header and PCI-related
functionality (such as checking the PCI BIOS version, including the
PCI bus:dev.fn address within the ROM product name string, etc.).
While real ISA cards are no longer in use, some virtualisation
environments (notably VirtualBox) have support only for ISA ROMs.
This can cause problems: in particular, VirtualBox will call our
initialisation entry point with random garbage in %ax, which we then
treat as the PCI bus:dev.fn address of the autoboot device: this
generally prevents the default boot sequence from using any network
devices.
Create .isarom and .pcirom prefixes which can be used to explicitly
specify the type of ROM to be created. (Note that the .mrom prefix
always implies a PCI ROM, since the .mrom mechanism relies on
reconfiguring PCI BARs.)
Make .rom a magic prefix which will automatically select the
appropriate PCI or ISA ROM prefix for ROMs defined via a PCI_ROM() or
ISA_ROM() macro. To maintain backwards compatibility, we default to
building a PCI ROM for anything which is not directly derived from a
PCI_ROM() or ISA_ROM() macro (e.g. bin/intel.rom).
Add a selection of targets to "make everything" to ensure that the
(relatively obscure) ISA ROM build process is included within the
per-commit QA checks.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Since some PnP BIOSes fail to set %es:di to point to the PnP signature
on entry, we identify a PnP BIOS by scanning through the top 64kB of
base memory looking for the PnP structure. We therefore don't
actually use the values of %es:di provided to the initialisation entry
point, and so there is no need to preserve them.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[lkrnprefix] Make real-mode setup code relocatable
The bzImage boot protocol allows the real-mode code to be loaded at
any segment within base memory. (The fact that both iPXE and recent
versions of Syslinux will load the real-mode code at 1000:0000 is a
coincidence; it is not guaranteed by the specification.)
Fix by making the code relocatable.
Reported-by: Andrew Stuart <andrew@shopcusa.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .lkrn prefix currently provides a zImage kernel with unused setup
sectors and the whole iPXE binary placed within the "protected mode
kernel" portion of the zImage.
The work carried out years ago to create the .mrom format provides a
mechanism allowing the iPXE binary to be split into a small real-mode
header and a larger payload. This neatly matches the way that a
bzImage is loaded: the "setup sectors" can contain the header and the
"protected mode kernel" can contain the payload.
This removes the size restrictions on an iPXE .lkrn image (and hence
on derived image formats such as .iso).
Also remove obsolete copyright information, since none of the original
code or functionality now remains.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[librm] Speed up real-to-protected mode transition under KVM
Ensure that all segment registers have zero in the low two bits before
transitioning to protected mode. This allows the CPU state to
immediately be deemed to be "valid", and eliminates the need for any
further emulated instructions.
Load the protected-mode interrupt descriptor table after switching to
protected mode, since this avoids triggering an EXCEPTION_NMI and
corresponding VM exit.
This reduces the time taken by real_to_prot under KVM by around 50%.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[librm] Use genuine real mode to accelerate operation in virtual machines
We currently use flat real mode wherever real mode is required. This
guarantees that we will not surprise some unsuspecting external caller
which has carefully set up flat real mode by suddenly reducing the
segment limits to 64kB.
However, operating in flat real mode imposes a severe performance
penalty in some virtualisation environments, since some CPUs cannot
fully virtualise flat real mode and so the hypervisor must fall back
to emulation. In particular, operating under KVM on a pre-Westmere
Intel CPU will be at least an order of magnitude slower, to the point
that there is a visible teletype effect when printing anything to the
BIOS console. (Older versions of KVM used to cheat and ignore the
"flat" part of flat real mode, which masked the problem.)
Switch (back) to using genuine real mode with 64kB segment limits
instead of flat real mode. Hopefully this won't break anything.
Add an explicit switch to flat real mode before returning to the BIOS
from the ROM prefix, since we know that a PMM BIOS will call the ROM
initialisation point (and potentially the BEV) in flat real mode.
As noted in previous commit messages, it is not possible to restore
the real-mode segment limits after a transition to protected mode,
since there is no way to know which protected-mode segment descriptor
was originally used to initialise the limit portion of the segment
register.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Do not clobber stack segment when returning to BIOS
Commit c429bf0 ("[romprefix] Store boot bus:dev.fn address as autoboot
device location") introduced a regression by using register %cx to
temporarily hold the PCI bus:dev.fn address, despite the fact that %cx
was already being used to hold the stored BIOS stack segment.
Consequently, when returning to the BIOS after a failed or cancelled
boot attempt, iPXE would end up calling INT 18 with the stack segment
set equal to the PCI bus:dev.fn address. Writing to essentially
random areas of memory tends to upset even the more robust BIOSes.
Fix by using register %ax to temporarily hold the PCI bus:dev.fn
address.
Reported-by: Anton D. Kachalov <mouse@yandex-team.ru>
Tested-by: Anton D. Kachalov <mouse@yandex-team.ru>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Store boot bus:dev.fn address as autoboot device location
Per the BIOS Boot Specification, the initialization phase of the ROM
is called with the PFA (PCI Function Address) in the %ax register.
The intention is that the ROM code will store that device address
somewhere and use it for booting from that device when the Boot Entry
Vector (BEV) is called. iPXE does store the PFA, but doesn't use it
to select the boot network device. This renders BIOS IPL lists fairly
ineffective.
Fix by using the BBS-specified bus:dev.fn address as the autoboot
device location.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Allow ROM banner timeout to be configured independently
iPXE currently prints a "Press Ctrl-B" banner twice: once when the ROM
is first called for initialisation and again if we attempt to boot
from the ROM. This slows boot, especially when the NIC is not the
primary boot device. Tools such as libguestfs make use of QEMU VMs
for performing maintenance on disk images and may make use of NICs in
the VM for network support. If iPXE introduces a static init-time
delay, that directly translates to increased runtime for the tools.
Fix by allowing the ROM banner timeout to be configured independently
of the main banner timeout.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[pxe] Always retrieve cached DHCPACK and apply to relevant network device
When chainloading, always retrieve the cached DHCPACK packet from the
underlying PXE stack, and apply it as the original contents of the
"net<X>.dhcp" settings block. This allows cached DHCP settings to be
used for any chainloaded iPXE binary (not just undionly.kkpxe).
This change eliminates the undocumented "use-cached" setting. Issuing
the "dhcp" command will now always result in a fresh DHCP request.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Fix incorrect pointer offset in undiloader.S
Commit 2422647 ("[prefix] Allow prefix to specify an arbitrary maximum
address for relocation") introduced a regression into the UNDI ROM
loader by preserving an extra register on the stack without modifying
the %sp-relative addresses used in the routine.
Fix by correcting the %sp-relative addresses to allow for the extra
preserved variable.
Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Report failure cause when unable to open payload
Report the cause of the failure when we are unable to open the .mrom
payload. There are two possible failure cases:
- Unable to find a suitable memory BAR to borrow (e.g. if the NIC
doesn't have a memory BAR that is at least as large as the
expansion ROM BAR, or if the memory BAR has been assigned a 64-bit
address which won't fit into the 32-bit expansion ROM BAR). This
will be reported as "BABABABA".
- Unable to find correct ROM image within the BAR. This will be
reported as the address (within the borrowed BAR) at which we first
fail to find a valid 55AA signature.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[lkrnprefix] Allow relocation when no initrd is present
Commit 2629b7e ("[pcbios] Inhibit all calls to INT 15,e820 and INT
15,e801 during POST") introduced a regression into .lkrn images when
used with no corresponding initrd.
Specifically, the semantics of the "maximum address for relocation"
value passed to install_prealloc() in %ebp changed so that zero became
a special value meaning "inhibit use of INT 15,e820 and INT 15,e801".
The %ebp value meaing "no upper limit on relocation" was changed from
zero to 0xffffffff, and all prefixes providing fixed values for %ebp
were updated to match the new semantics.
The .lkrn prefix provides the initrd base address as the maximum
address for relocation. When no initrd is present, this address will
be zero, and so will unintentionally trigger the "inhibit INT 15,e820
and INT 15,e801" behaviour.
Fix by explicitly setting %ebp to 0xffffffff if no initrd is present
before calling install_prealloc().
Reported-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Tested-by: Ján ONDREJ (SAL) <ondrejj@salstar.sk>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Display only one "Ctrl-B" prompt per PCI device during POST
If a multifunction PCI device exposes an iPXE ROM via each function,
then each function will display a "Press Ctrl-B to configure iPXE"
prompt, and delay for two seconds. Since a single instance of iPXE
can drive all functions on the multifunction device, this simply adds
unnecessary delay to the boot process.
Fix by inhibiting the "Press Ctrl-B" prompt for all except the first
function on a PCI device.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[pcbios] Inhibit all calls to INT 15,e820 and INT 15,e801 during POST
Many BIOSes do not construct the full system memory map until after
calling the option ROM initialisation entry points. For several
years, we have added sanity checks and workarounds to accommodate
charming quirks such as BIOSes which report the entire 32-bit address
space (including all memory-mapped PCI BARs) as being usable RAM.
The IBM x3650 takes quirky behaviour to a new extreme. Calling either
INT 15,e820 or INT 15,e801 during POST doesn't just get you invalid
data. We could cope with invalid data. Instead, these nominally
read-only API calls manage to trash some internal BIOS state, with the
result that the system memory map is _never_ constructed. This tends
to confuse subsequent bootloaders and operating systems.
[ GRUB 0.97 fails in a particularly amusing way. Someone thought it
would be a good idea for memcpy() to check that the destination memory
region is a valid part of the system memory map; if not, then memcpy()
will sulk, fail, and return NULL. This breaks pretty much every use
of memcpy() including, for example, those inserted implicitly by gcc
to copy non-const initialisers. Debugging is _fun_ when a simple call
to printf() manages to create an infinite recursion, exhaust the
available stack space, and shut down the CPU. ]
Fix by completely inhibiting calls to INT 15,e820 and INT 15,e801
during POST.
We do now allow relocation during POST up to the maximum address
returned by INT 15,88 (which seems so far to always be safe). This
allows us to continue to have a reasonable size of external heap, even
if the PMM allocation is close to the 1MB mark.
The downside of allowing relocation during POST is that we may
overwrite PMM-allocated memory in use by other option ROMs. However,
the downside of inhibiting relocation, when combined with also
inhibiting calls to INT 15,e820 and INT 15,e801, would be that we
might have no external heap available: this would make booting an OS
impossible and could prevent some devices from even completing
initialisation.
On balance, the lesser evil is probably to allow relocation during
POST (up to the limit provided by INT 15,88). Entering iPXE during
POST is a rare operation; on the even rarer systems where doing so
happens to overwrite a PMM-allocated region, then there exists a
fairly simple workaround: if the user enters iPXE during POST and
wishes to exit iPXE, then the user must reboot. This is an acceptable
cost, given the rarity of the situation and the simplicity of the
workaround.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[prefix] Use %cs as implicit parameter to uninstall()
romprefix.S currently calls uninstall() with an invalid value in %ax.
Consequently, base memory is not freed after a ROM boot attempt (or
after entering iPXE during POST).
The uninstall() function is physically present in .text16, and so can
use %cs to determine the .text16 segment address. The .data16 segment
address is not required, since uninstall() is called only by code
paths which set up .data16 to immediately follow .text16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Round up PMM allocation sizes to nearest 4kB
Some AMI BIOSes apparently break in exciting ways when asked for PMM
allocations for sizes that are not multiples of 4kB.
Fix by rounding up the image source area to the nearest 4kB. (The
temporary decompression area is already rounded up to the nearest
128kB, to facilitate sharing between multiple iPXE ROMs.)
Reported-by: Itay Gazit <itayg@mellanox.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Report a pessimistic runtime size estimate
PCI3.0 allows us to report a "runtime size" which can be smaller than
the actual ROM size. On systems that support PMM our runtime size
will be small (~2.5kB), which helps to conserve the limited option ROM
space. However, there is no guarantee that the PMM allocation will
succeed, and so we need to report the worst-case runtime size in the
PCI header.
Move the "shrunk ROM size" field from the PCI header to a new "iPXE
ROM header", allowing it to be accessed by ROM-manipulation utilities
such as disrom.pl.
Reported-by: Anton D. Kachalov <mouse@yandex-team.ru>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
PXENV_FILE_CMDLINE is an iPXE extension, and will not be supported by
most PXE stacks. Do not report any errors to the user, since in
almost all cases the error will mean simply "not loaded by iPXE".
Reported-by: Patrick Domack <patrickdk@patrickdk.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[pxeprefix] Place temporary stack after iPXE binary
Some BIOSes (observed on a Supermicro system with an AMI BIOS) seem to
use the area immediately below 0x7c00 to store data related to the
boot process. This data is currently liable to be overwritten by the
temporary stack used while decompressing and installing iPXE.
Try to avoid any such problems by placing the temporary stack
immediately after the loaded iPXE binary. Any memory used by the
stack could then potentially have been overwritten anyway by a larger
binary.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Treat 0xffffffff as an error return from PMM
PMM defines the return code 0xffffffff as meaning "unsupported
function". It's hard to imagine a PMM BIOS that doesn't support
pmmAllocate(), but apparently such things do exist.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Allow .mrom image to be placed anywhere within the BAR
A .mrom image currently assumes that it is the first image within the
expansion ROM BAR, which may not be correct when multiple images are
present.
Fix by scanning through the BAR until we locate an image matching our
build ID.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Add a dummy ROM header to cover the .mrom payload
The header of a .mrom image declares its length to be only a few
kilobytes; the remainder is accessed via a sideband mechanism. This
makes it difficult to append an additional ROM image, such as an EFI
ROM.
Add a second, dummy ROM header covering the payload portion of the
.mrom image, allowing consumers to locate any appended ROM images in
the usual way.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[lkrnprefix] Copy command line before installing iPXE
The command line may be situated in an area of base memory that will
be overwritten by iPXE's real-mode segments, causing the command line
to be corrupted before it can be used.
Fix by creating a copy of the command line on the prefix stack (below
0x7c00) before installing the real-mode segments.
Reported-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[pxe] Provide PXENV_FILE_EXIT_HOOK only for ipxelinux.0 builds
PXENV_FILE_EXIT_HOOK is designed to allow ipxelinux.0 to unload both
the iPXE and pxelinux components without affecting the underlying PXE
stack. Unfortunately, it causes unexpected behaviour in other
situations, such as when loading a non-embedded pxelinux.0 via
undionly.kpxe. For example:
PXE ROM -> undionly.kpxe -> pxelinux.0 -> chain.c32 to boot hd0
would cause control to return to iPXE instead of booting from the hard
disk. In some cases, this would result in a harmless but confusing
"No more network devices" message; in other cases stranger things
would happen, such as being returned to the iPXE shell prompt.
The fundamental problem is that when pxelinux detects
PXENV_FILE_EXIT_HOOK, it may attempt to specify an exit hook and then
exit back to iPXE, assuming that iPXE will in turn exit cleanly via
the specified exit hook. This is not a valid assumption in the
general case, since the action of exiting back to iPXE does not
directly cause iPXE to exit itself. (In the specific case of
ipxelinux.0, this will work since the embedded script exits as soon as
pxelinux.0 exits.)
Fix the unexpected behaviour in the non-ipxelinux.0 cases by including
support for PXENV_FILE_EXIT_HOOK only when using a new .kkkpxe format.
The ipxelinux.0 build process should therefore now use undionly.kkkpxe
instead of undionly.kkpxe.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow an initrd (such as an embedded script) to be passed to iPXE when
loaded as a .lkrn (or .iso) image. This allows an embedded script to
be varied without recompiling iPXE.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE specifies a value of 0 for cmdline_size, causing GRUB to not pass
in a command line. Fix by setting cmdline_size to the maximum value
of 2047.
Signed-off-by: Valentine Barshak <gvaxon@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>