The various early-exit paths in parse_uri() accidentally bypass the
URI field decoding. The result is that opaque or relative URIs do not
undergo URI field decoding, resulting in double-encoding when the URIs
are subsequently used. For example:
#!ipxe
set mac ${macstring}
imgfetch /boot/by-mac/${mac:uristring}
would result in an HTTP GET such as
GET /boot/by-mac/00%253A0c%253A29%253Ac5%253A39%253Aa1 HTTP/1.1
rather than the expected
GET /boot/by-mac/00%3A0c%3A29%3Ac5%3A39%3Aa1 HTTP/1.1
Fix by ensuring that URI decoding is always applied regardless of the
URI format.
Reported-by: Andrew Widdersheim <awiddersheim@inetu.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
TFTP URIs are intrinsically problematic, since:
- TFTP servers may use either normal slashes or backslashes as a
directory separator,
- TFTP servers allow filenames to be specified using relative paths
(with no initial directory separator),
- TFTP filenames present in a DHCP filename field may use special
characters such as "?" or "#" that prevent parsing as a generic URI.
As of commit 7667536 ("[uri] Refactor URI parsing and formatting"), we
have directly constructed TFTP URIs from DHCP next-server and filename
pairs, avoiding the generic URI parser. This eliminated the problems
related to special characters, but indirectly made it impossible to
parse a "tftp://..." URI string into a TFTP URI with a non-absolute
path.
Re-introduce the convention of requiring an extra slash in a
"tftp://..." URI string in order to specify a TFTP URI with an initial
slash in the filename. For example:
tftp://192.168.0.1/boot/pxelinux.0 => RRQ "boot/pxelinux.0"
tftp://192.168.0.1//boot/pxelinux.0 => RRQ "/boot/pxelinux.0"
This is ugly, but there seems to be no other sensible way to provide
the ability to specify all possible TFTP filenames.
A side-effect of this change is that format_uri() will no longer add a
spurious initial "/" when formatting a relative URI string. This
improves the console output when fetching an image specified via a
relative URI.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Merge the functionality of parse_next_server_and_filename() and
tftp_uri() into a single pxe_uri(), which takes a server address
(IPv4/IPv6/none) and a filename, and produces a URI using the rule:
- if the filename is a hierarchical absolute URI (i.e. includes a
scheme such as "http://" or "tftp://") then use that URI and ignore
the server address,
- otherwise, if the server address is recognised (according to
sa_family) then construct a TFTP URI based on the server address,
port, and filename
- otherwise fail.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 09b057c ("[settings] Remove "uristring" setting type") removed
support for URI-encoded settings via the "uristring" setting type, on
the basis that such encoding was no longer necessary to avoid problems
with the command line parser.
Other valid use cases for the "uristring" setting type do exist: for
example, a password containing a '/' character expanded via
chain http://username:${password:uristring}@server.name/boot.php
Restore the existence of the "uristring" setting, avoiding the
potentially large stack allocations that were used in the old code
prior to commit 09b057c ("[settings] Remove "uristring" setting
type").
Requested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[legal] Relicense files under GPL2_OR_LATER_OR_UBDL
These files cannot be automatically relicensed by util/relicense.pl
since they either contain unusual but trivial contributions (such as
the addition of __nonnull function attributes), or contain lines
dating back to the initial git revision (and so require manual
knowledge of the code's origin).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for parsing of URIs containing literal IPv6 addresses
(e.g. "http://[fe80::69ff:fe50:5845%25net0]/boot.ipxe").
Duplicate URIs by directly copying the relevant fields, rather than by
formatting and reparsing a URI string. This relaxes the requirements
on the URI formatting code and allows it to focus on generating
human-readable URIs (e.g. by not escaping ':' characters within
literal IPv6 addresses). As a side-effect, this allows relative URIs
containing parameter lists (e.g. "../boot.php##params") to function
as expected.
Add validity check for FTP paths to ensure that only printable
characters are accepted (since FTP is a human-readable line-based
protocol with no support for character escaping).
Construct TFTP next-server+filename URIs directly, rather than parsing
a constructed "tftp://..." string,
Add self-tests for URI functions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
HTTP POST requires the ability to associate a parameter list with a
URI. There is no standardised syntax for this. Use a non-standard
URI syntax to incorporate the specification of a parameter list within
a URI:
URI = [ absoluteURI | relativeURI ]
[ "#" fragment ] [ "##params" [ "=" paramsName ] ]
e.g.
http://boot.ipxe.org/demo/boot.php##paramshttp://boot.ipxe.org/demo/boot.php##params=mylist
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Access to the gpxe.org and etherboot.org domains and associated
resources has been revoked by the registrant of the domain. Work
around this problem by renaming project from gPXE to iPXE, and
updating URLs to match.
Also update README, LOG and COPYRIGHTS to remove obsolete information.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
[uri] Handle an empty unparse_uri() result properly
Previously, if none of the URI parts requested existed in the passed
URI, unparse_uri() would not touch the destination buffer at all; this
could lead to use of uninitialized data. Fix by setting buf[0] = '\0'
before unparsing whenever we have room to do so.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
Currently, handling of URI escapes is ad-hoc; escaped strings are
stored as-is in the URI structure, and it is up to the individual
protocol to unescape as necessary. This is error-prone and expensive
in terms of code size. Modify this behavior by unescaping in
parse_uri() and escaping in unparse_uri() those fields that typically
handle URI escapes (hostname, user, password, path, query, fragment),
and allowing unparse_uri() to accept a subset of fields to print so
it can be easily used to generate e.g. the escaped HTTP path?query
request.
Signed-off-by: Joshua Oreman <oremanj@rwcr.net>
Signed-off-by: Marty Connor <mdc@etherboot.org>
[uri] Avoid interpreting DOS-style path names as opaque URIs
A DOS-style full path name such as "C:\Program Files\tftpboot\nbp.0"
satisfies the syntax requirements for a URI with a scheme of "C" and
an opaque portion of "\Program Files\tftpboot\nbp.0".
Add a check in parse_uri() to ignore schemes that are apparently only
a single character long; this avoids interpreting DOS-style paths in
this way, and shouldn't affect any practical URI scheme.