[romprefix] Add .mrom format, allowing loading of large ROMs
Add an infrastructure allowing the prefix to provide an open_payload()
method for obtaining out-of-band access to the whole iPXE image. Add
a mechanism within this infrastructure that allows raw access to the
expansion ROM BAR by temporarily borrowing an address from a suitable
memory BAR on the same PCI card.
For cards that have a memory BAR that is at least as large as their
expansion ROM BAR, this allows large iPXE ROMs to be supported even on
systems where PMM fails, or where option ROM space pressure makes it
impossible to use PMM shrinking. The BIOS sees only a stub ROM of
approximately 3kB in size; the remainder (which can be well over 64kB)
is loaded only at the time iPXE is invoked.
As a nice side-effect, an iPXE .mrom image will continue to work even
if its PMM-allocated areas are overwritten between initialisation and
invocation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[build] Replace obsolete makerom.pl with quick script using Option::ROM
The only remaining useful function of makerom.pl is to correct the ROM
and PnP checksums; the PCI IDs are set at link time, and padding is
performed using padimg.pl.
Option::ROM already provides a facility for correcting the checksums,
so we may as well just use this instead.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Inhibit the use of relocation during POST
It is common for system memory maps to be grotesquely unreliable
during POST. Many sanity checks have been added to the memory map
reading code, but these do not catch all problems.
Skip relocation entirely if called during POST. This should avoid the
problems typically encountered, at the cost of slightly disrupting the
memory map of an operating system booted via iPXE when iPXE was
entered during POST. Since this is a very rare special case (used,
for example, when reflashing an experimental ROM that would otherwise
prevent the system from completing POST), this is an acceptable cost.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Split PMM allocations for image source and decompression area
Some BIOSes (at least some AMI BIOSes) tend to refuse to allocate a
single area large enough to hold both the iPXE image source and the
temporary decompression area, despite promising a largest available
PMM memory block of several megabytes. This causes ROM image
shrinking to fail on these BIOSes, with undesirable consequences:
other option ROMs may be disabled due to shortage of option ROM space,
and the iPXE ROM may itself be corrupted by a further BIOS bug (again,
observed on an AMI BIOS) which causes large ROMs to end up overlapping
reserved areas of memory. This can potentially render a system
unbootable via any means.
Increase the chances of a successful PMM allocation by dropping the
alignment requirement (which is redundant now that we can enable A20
from within the prefix); this allows us to reduce the allocation size
from 2MB down to only the required size.
Increase the chances still further by using two separate allocations:
one to hold the image source (i.e. the copy of the ROM before being
shrunk) and the other to act as the decompression area. This allows
ROM image shrinking to take place even on systems that fail to
allocate enough memory for the temporary decompression area.
Improve the behaviour of iPXE in systems with multiple iPXE ROMs by
sharing PMM allocations where possible. Image source areas can be
shared with any iPXE ROMs with a matching build identifier, and the
temporary decompression area can be shared with any iPXE ROMs with the
same uncompressed size (rounded up to the nearest 128kB).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[prefix] Use area at top of INT 15,88 memory map for temporary decompression
Use INT 15,88 to find a suitable temporary decompression area, rather
than a fixed address. This hopefully gives us a better chance of not
treading on any PMM-allocated areas, in BIOSes where PMM support
exists but tends not to give us the large blocks that we ask for.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[pcbios] Always show INT 15,88 result under DEBUG=memmap
Always call INT 15,88 even if we don't use the result. This allows
DEBUG=memmap to show the complete result set returned by all of the
INT 15 memory-map calls.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Randomly generate a 32-bit build identifier that can be used to
identify identical iPXE ROMs when multiple such ROMs are present in a
system (e.g. when a multi-function NIC exposes the same iPXE ROM image
via each function's expansion ROM BAR).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[romprefix] Provide indication of successful call to install_prealloc
The existing "iPXE starting execution" message indicates that the BEV
(or INT19) was invoked, but gives no indication on whether or not the
iPXE source was successfully retrieved (e.g. from PMM). Split the
"starting execution message" into "starting execution...ok"; the "ok"
indicates that the main iPXE body was successfully decompressed and
relocated.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[prefix] Default to 1MB mark as fallback high memory load point
Now that we can use odd megabytes, there is no particular need to use
an even megabyte as the fallback temporary load point.
Note that the old warnings about avoiding 2MB pre-date our ability to
cooperate with other PXE ROMs by using PMM.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE is now capable of operating in odd megabytes of memory, so remove
the obsolete code enforcing an even-megabyte constraint.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[librm] Use libflat to enable A20 line on each real-to-protected transition
Use the shared code in libflat to perform the A20 transitions
automatically on each transition from real to protected mode. This
allows us to remove all explicit calls to gateA20_set().
The old warnings about avoiding automatically enabling A20 are
essentially redundant; they date back to the time when we would always
start hammering the keyboard controller without first checking to see
if gate A20 was already enabled (which it almost always is).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
iPXE currently insists on residing in an even megabyte. This imposes
undesirably severe constraints upon our PMM allocation strategy, and
limits our options for mechanisms to access ROMs greater than 64kB in
size.
Add A20 handling code to libflat so that prefixes are able to access
memory even in odd megabytes.
The algorithms and tuning parameters in the new A20 handling code are
based upon a mixture of the existing iPXE A20 code and the A20 code
from the 2.6.32 Linux kernel.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The flatten_real_mode routine is not needed until after decompressing
.text16.early, and currently performs various contortions to
compensate for the fact that .prefix may not be writable. Move
flatten_real_mode to .text16.early to save on (compressed) binary size
and simplify the code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a section .text16.early which is always kept inline with the
prefix. This will allow for some code sharing between the .prefix and
.text16 sections.
Note that the simple solution of just prepending the .prefix section
to the .text16 section will not work, because a bug in Wyse Streaming
Manager server (WLDRM13.BIN) requires us to place a dummy PXENV+ entry
point at the start of .text16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[prefix] Use flat real mode for access to high memory
Use flat real mode rather than 16-bit protected mode for access to
high memory during installation. This simplifies the code by reducing
the number of CPU modes we need to think about, and also increases the
amount of code in common between the normal and (somewhat
hypothetical) KEEP_IT_REAL methods of operation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
When returning to real mode, set 4GB segment limits instead of 64kB
limits. This change improves our chances of successfully returning to
a PMM-capable BIOS aftering entering iPXE during POST; the BIOS will
have set up flat real mode before calling our initialisation point,
and may be disconcerted if we then return in genuine real mode.
This change is unlikely to break anything, since any code that might
potentially access beyond 64kB must use addr32 prefixes to do so; if
this is the case then it is almost certainly code written to expect
flat real mode anyway.
Note that it is not possible to restore the real-mode segment limits
to their original values, since it is not possible to know which
protected-mode segment descriptor was originally used to initialise
the limit portion of the segment register.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .hrom prefix provides an experimental mechanism for reducing
option ROM space usage on systems where PMM allocation fails, by
pretending that PMM allocation succeeded and gave us an address fixed
at compilation time. This is unreliable, and potentially dangerous.
In particular, when multiple gPXE ROMs are present in a system, each
gPXE ROM will assume ownership of the same fixed address, resulting in
undefined behaviour.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The .xrom prefix provides an experimental mechanism for loading ROM
images greater than 64kB in size by mapping the expansion ROM BAR in
at a hopefully-unused address. This is unreliable, and potentially
dangerous. In particular, there is no guarantee that any PCI bridges
between the CPU and the device will respond to accesses for the
"unused" memory region that is chosen, and it is possible that the
process of scanning for the "unused" memory region may end up issuing
reads to other PCI devices. If this ends up trampling on a register
with read side-effects belonging to an unrelated PCI device, this may
cause undefined behaviour.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Access to the gpxe.org and etherboot.org domains and associated
resources has been revoked by the registrant of the domain. Work
around this problem by renaming project from gPXE to iPXE, and
updating URLs to match.
Also update README, LOG and COPYRIGHTS to remove obsolete information.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
gPXE currently overwrites the filename stored in the cached DHCP
packets when a call to PXENV_TFTP_READ_FILE or PXENV_RESTART_TFTP is
made. This code has existed for many years as a workaround for RIS,
which seemed to require that this be done.
pxe_set_cached_filename() causes problems with the Bootix NBP, and a
recent test demonstrates that RIS will complete successfully even with
pxe_set_cached_filename() removed. There have been many changes to
the DHCP and PXE logic since this code was first added, and it is
quite plausible that it was masking a bug that no longer exists.
Reported-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com>
Debugged-by: Shao Miller <Shao.Miller@yrdsb.edu.on.ca>
Signed-off-by: Michael Brown <mcb30@etherboot.org>
[pxe] Avoid potential interrupt storms when using shared interrupts
Current gPXE code always returns "OURS" in response to
PXENV_UNDI_ISR:START. This is harmless for non-shared interrupt
lines, and avoids the complexity of trying to determine whether or not
we really did cause the interrupt. (This is a non-trivial
determination; some drivers don't have interrupt support and hook the
system timer interrupt instead, for example.)
A problem occurs when we have a shared interrupt line, the other
device asserts an interrupt, and the controlling ISR does not chain to
the other device's ISR when we return "OURS". Under these
circumstances, the other device's ISR never executes, and so the
interrupt remains asserted, causing an interrupt storm.
Work around this by returning "OURS" if and only if our net device's
interrupt is currently recorded as being enabled. Since we always
disable interrupts as a result of a call to PXENV_UNDI_ISR:START, this
guarantees that we will eventually (on the second call) return "NOT
OURS", allowing the other ISR to be called. Under normal operation,
including a non-shared interrupt situation, this change will make no
difference since PXENV_UNDI_ISR:START would be called only when
interrupts were enabled anyway.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
In the actual SYSLINUX suite's comboot implementation, the version
string is prefixed by CR LF, and the copyright string has a leading
space. Some tools (specifically HDT) assume these padding characters
exist, so we should probably return strings in a similar format.
Signed-off-by: Michael Brown <mcb30@etherboot.org>
Loading multiple UNDI instances would be useful in systems that have
several network cards with vendor PXE ROMs. However, we cannot rely on
UNDI ROMs working correctly with multiple instances loaded
simultaneously.
The gPXE UNDI driver supports the following multi-NIC configurations:
1. Chainloading undionly.kpxe on a specific NIC.
2. Loading the UNDI driver for the first probed device and ignoring all
other UNDI devices in the system.
This patch refuses to probe additional UNDI devices so there can never
be multiple instances of UNDI loaded.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The .elf, .elfd, .lmelf, and .lmelfd prefices were brought over from
legacy Etherboot and they do not build in gPXE. This patch removes the
ELF prefices.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The unfinished .exe prefix was brought over from legacy Etherboot.
There has been no demand for .exe images so this patch removes the
prefix.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The DOS .com prefix was brought over from legacy Etherboot but does not
build. There has been no demand for .com images so this patch removes
the prefix.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
The .lkrn prefix allows gPXE to be loaded as a Linux bzImage. The
bImage prefix was carried over from legacy Etherboot and does not build.
This patch removes the .bImage prefix, use .lkrn instead.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
It might be the case that we wish to chain to an NBP without
being "in the way". We now implement a hook in our exit path
for gPXE *.*pxe build targets. The hook is a pointer to a
SEG16:OFF16 which we try to jump to during exit. By default,
this pointer results in the usual exit path.
We also implement the "pxenv_file_exit_hook" PXE API routine
to allow the user to specify an alternate SEG16:OFF16 to jump
to during exit.
Unfortunately, this additional PXE extension has a cost
in code size. Fortunately, a look at the size difference
for a gPXE .rom build target shows zero size difference
after compression.
The routine is documented in doc/pxe_extensions as follows:
FILE EXIT HOOK
Op-Code: PXENV_FILE_EXIT_HOOK (00e7h)
Input: Far pointer to a t_PXENV_FILE_EXIT_HOOK parameter
structure that has been initialized by the caller.
Output: PXENV_EXIT_SUCCESS or PXENV_EXIT_FAILURE must be
returned in AX. The Status field in the parameter
structure must be set to one of the values represented
by the PXENV_STATUS_xxx constants.
Description:Modify the exit path to jump to the specified code.
Only valid for pxeprefix-based builds.
typedef struct s_PXENV_FILE_EXIT_HOOK {
PXENV_STATUS_t Status;
SEGOFF16_t Hook;
} t_PXENV_FILE_EXIT_HOOK;
Set before calling API service:
Hook: The SEG16:OFF16 of the code to jump to.
Returned from API service:
Status: See PXENV_STATUS_xxx constants.
Requested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Shao Miller <shao.miller@yrdsb.edu.on.ca>
Signed-off-by: Marty Connor <mdc@etherboot.org>
[prefix] Add .xrom prefix for a ROM that loads itself by PCI accesses
The standard option ROM format provides a header indicating the size
of the entire ROM, which the BIOS will reserve space for, load, and
call as necessary. However, this space is strictly limited to 128k for
all ROMs. gPXE ameliorates this somewhat by reserving space for itself
in high memory and relocating the majority of its code there, but on
systems prior to PCI3 enough space must still be present to load the
ROM in the first place. Even on PCI3 systems, the BIOS often limits the
size of ROM it will load to a bit over 64kB.
These space problems can be solved by providing an artificially small
size in the ROM header: just enough to let the prefix code (at the
beginning of the ROM image) be loaded by the BIOS. To the BIOS, the
gPXE ROM will appear to be only a few kilobytes; it can then load
the rest of itself by accessing the ROM directly using the PCI
interface reserved for that task.
There are a few problems with this approach. First, gPXE needs to find
an unmapped region in memory to map the ROM so it can read from it;
this is done using the crude but effective approach of scanning high
memory (over 0xF0000000) for a sufficiently large region of all-ones
(0xFF) reads. (In x86 architecture, all-ones is returned for accesses
to memory regions that no mapped device can satisfy.) This is not
provably valid in all situations, but has worked well in practice.
More importantly, this type of ROM access can only work if the PCI ROM
BAR exists at all. NICs on physical add-in PCI cards generally must
have the BAR in order for the BIOS to be able to load their ROM, but
ISA cards and LAN-on-Motherboard cards will both fail to load gPXE
using this scheme.
Due to these uncertainties, it is recommended that .xrom only be used
when a regular .rom image is infeasible due to crowded option ROM
space. However, when it works it could allow loading gPXE images
as large as a flash chip one could find - 128kB or even higher.
Signed-off-by: Marty Connor <mdc@etherboot.org>
For extremely tight space requirements and specific applications, it is
sometimes desirable to create gPXE images that cannot provide the PXE API
functionality to client programs. Add a configuration header option,
PXE_STACK, that can be removed to remove this stack. Also add PXE_MENU
to control the PXE boot menu, which most uses of gPXE do not need.
Signed-off-by: Marty Connor <mdc@etherboot.org>
[pxe] Support cached DHCP packets in .kkpxe images
If we don't unload the PXE stack before executing gPXE, automatically
take advantage of the cached DHCPACK that the underlying/parent PXE
stack can provide. If that cached DHCPACK contains option 175.178, or
the user sets the use-cached setting before invoking DHCP, the real
DHCP request will be skipped and the cached DHCPACK will be used for
network configuration. Otherwise, the cached settings block is thrown
away as soon as a fresh one is acquired.
Signed-off-by: Marty Connor <mdc@etherboot.org>
[pxe] Separate parent PXE API caller from UNDINET driver
Calling the parent PXE stack (the stack that loaded us, for
undionly.kkpxe) can be useful for more than UNDI calls; for instance,
it lets us get cached DHCP packets to avoid re-DHCP when working with
embedded images.
Signed-off-by: Marty Connor <mdc@etherboot.org>
[tftp] Make TFTP size requests abort transfer with an error
pxenv_tftp_get_fsize is an API call that PXE clients can call to
obtain the size of a remote file. It is implemented by starting a TFTP
transfer with pxe_tftp_open, waiting for the response and then
stopping the transfer with pxe_tftp_close(). This leaves the session
hanging on the TFTP server and it will try to resend the packet
repeatedly (verified with tftpd-hpa) until it times out.
This patch adds a method "tftpsize" that will abort the transfer after
the first packet is received from the server. This will terminate the
session on the server and is the same behaviour as Intel's PXE ROM
exhibits.
Together with a qemu patch to handle the ERROR packet (submitted to
qemu's mailing list), this resolves a specific issue where booting
pxegrub with qemu's TFTP server would be slow or hang.
I've tested this against qemu's tftp server and against my normal boot
infrastructure (tftpd-hpa). Booting pxegrub and loading extra files
now produces a trace similar to Intel's PXE client and there are no
spurious retransmits from tftpd any more.
Signed-off-by: Thomas Horsten <thomas@horsten.com>
Signed-off-by: Milan Plzik <milan.plzik@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
[sanboot] Prevent leaking a stack reference for "keep-san" AoE
When the "keep-san" option is used, the function is exited without
unregistering the stack allocated int13h drive. To prevent a dangling
pointer to the stack, these structs should be heap allocated.
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
[prefix] Add .hrom prefix for a ROM that loads high under PCI3 without PMM
gPXE currently takes advantage of the feature of PCI3.0 that allows
option ROMs to relocate the bulk of their code to high memory and so
take up only a small amount of space in the option ROM area. Currently,
the relocation can only take place if the BIOS's implementation of PMM
can be made to return blocks aligned to an even megabyte, because of
the A20 gate. AMI BIOSes, in particular, will not return allocations
that gPXE can use.
Ameliorate the situation somewhat by adding a prefix, .hrom, that works
identically to .rom except in the case that PMM allocation fails. Where
.rom would give up and place itself entirely in option ROM space, .hrom
moves to a block (assumed free) at HIGHMEM_LOADPOINT = 4MB. This allows
for the use of larger gPXE ROMs than would otherwise be possible.
Because there is no way to check that the area at HIGHMEM_LOADPOINT is
really free, other devices using that memory during the boot process
will cause failure for gPXE, the other device, or both. In practice
such conflicts will likely not occur, but this prefix should still be
considered EXPERIMENTAL.
Signed-off-by: Marty Connor <mdc@etherboot.org>
The disk partition prefix code in hdprefix.S reads the gPXE image in
tracks, not individual sectors. This means it will attempt to read
beyond the end of the image if the .hd image type is not padded to 32
KB.
This issue is affects virtualization software which may execute a .hd or
.usb image file directly - effectively running a machine with a tiny
disk containing just the gPXE image. Boot will fail when gPXE tries to
read beyond the end of disk.
[multiboot] Build memory map after shutting down and unhiding gPXE
The Multiboot memory map needs to be built after unhiding gPXE and
downloaded images from memory. Solaris faults during boot when trying
to access the ramdisk, which is hidden from the memory map while gPXE is
executing. This issue is fixed by using the memory map from after gPXE
unhides itself.
Reported-by: Moinak Ghosh <moinakg@belenix.org>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
[e820mangler] Add missing CLC ins. for success path
The get_underlying_e820 function should return with CF unset on success.
Reported-by: Timothy Stack <tstack@vmware.com>
Signed-off-by: Marty Connor <mdc@etherboot.org>
[linker] Expand and correct symbol requirement macros
REQUIRE_SYMBOL() formerly used a formulation of symbol requirement
that would allow a link to succeed despite lacking a required symbol,
because it did not introduce any relocations. Fix by renaming it to
REQUEST_SYMBOL() (since the soft-requirement behavior can be useful)
and add a REQUIRE_SYMBOL() that truly requires.
Add EXPORT_SYMBOL() and IMPORT_SYMBOL() for REQUEST_SYMBOL()-like
behavior that allows one to make use of the symbol, by combining a
weak external on the symbol itself with a REQUEST_SYMBOL() of a second
symbol.
Signed-off-by: Marty Connor <mdc@etherboot.org>
[int13] Guard against BIOSes that "fix" the drive count
Some BIOSes (observed with an AMI BIOS on a SunFire X2200) seem to
reset the BIOS drive counter at 40:75 after a failed boot attempt.
This causes problems when attempting a Windows direct-to-iSCSI
installation: bootmgr.exe calls INT 13,0800 and gets told that there
are no hard disks, so never bothers to read the MBR in order to obtain
the boot disk signature. The Windows iSCSI initiator will detect the
iBFT and connect to the target, and everything will appear to work
except for the error message "This computer's hardware may not support
booting to this disk. Ensure that the disk's controller is enabled in
the computer's BIOS menu."
Fix by checking the BIOS drive counter on every INT 13 call, and
updating it whenever necessary.
[autoboot] Ensure that an error message is always printed for a boot failure
The case of an unsupported SAN protocol will currently not result in
any error message. Fix by printing the error message at the top level
using strerror(), rather than using hard-coded error messages in the
error paths.
[netdevice] Allow the hardware and link-layer addresses to differ in size
IPoIB has a 20-byte link-layer address, of which only eight bytes
represent anything relating to a "hardware address".
The PXE and EFI SNP APIs expect the permanent address to be the same
size as the link-layer address, so fill in the "permanent address"
field with the initial link layer address (as generated by
register_netdev() based upon the real hardware address).
[netdevice] Separate out the concept of hardware and link-layer addresses
The hardware address is an intrinsic property of the hardware, while
the link-layer address can be changed at runtime. This separation is
exposed via APIs such as PXE and EFI, but is currently elided by gPXE.
Expose the hardware and link-layer addresses as separate properties
within a net device. Drivers should now fill in hw_addr, which will
be used to initialise ll_addr at the time of calling
register_netdev().