mirror/ipxe.git

	Commit message	Author	Age	Files	Lines
*	[malloc] Clean up debug messages	Michael Brown	2025-02-03	1	-29/+30
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Add definitions and tests for the NIST P-384 elliptic curve	Michael Brown	2025-01-30	9	-0/+379
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Add definitions and tests for the NIST P-256 elliptic curve	Michael Brown	2025-01-28	9	-0/+325
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Add support for Weierstrass elliptic curve point multiplication	Michael Brown	2025-01-28	3	-0/+1044
	The NIST elliptic curves are Weierstrass curves and have the form y^2 = x^3 + ax + b with each curve defined by its field prime, the constants "a" and "b", and a generator base point.
	Implement a constant-time algorithm for point addition, based upon Algorithm 1 from "Complete addition formulas for prime order elliptic curves" (Joost Renes, Craig Costello, and Lejla Batina), and use this as a Montgomery ladder commutative operation to perform constant-time point multiplication.
	The code for point addition is implemented using a custom bytecode interpreter with 16-bit instructions, since this results in substantially smaller code than compiling the somewhat lengthy sequence of arithmetic operations directly.
	Values are calculated modulo small multiples of the field prime in order to allow for the use of relaxed Montgomery reduction.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
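A minimal sketch of ladder-based point multiplication on a Weierstrass curve may help make the description concrete. It uses a toy curve (y^2 = x^3 + 2x + 3 over the prime 97, chosen here purely for illustration) and textbook affine addition with branches. It is neither the complete addition formula from the cited paper nor constant-time, and none of the names below come from iPXE.

```python
# Toy Weierstrass curve y^2 = x^3 + A*x + B over GF(P): illustrative values only.
P, A, B = 97, 2, 3
INFINITY = None  # point at infinity (the group identity)


def on_curve(pt):
    """Check that a point satisfies the curve equation."""
    if pt is INFINITY:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P == 0


def add(p1, p2):
    """Textbook affine point addition (branching, so not constant-time)."""
    if p1 is INFINITY:
        return p2
    if p2 is INFINITY:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INFINITY  # p2 is the negation of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def multiply(k, pt, bits):
    """Montgomery ladder: the invariant r1 == r0 + pt holds throughout."""
    r0, r1 = INFINITY, pt
    for i in reversed(range(bits)):
        if (k >> i) & 1:
            r0, r1 = add(r0, r1), add(r1, r1)
        else:
            r0, r1 = add(r0, r0), add(r0, r1)
    return r0


def multiply_naive(k, pt):
    """Reference result obtained by repeated addition."""
    acc = INFINITY
    for _ in range(k):
        acc = add(acc, pt)
    return acc


G = (3, 6)  # 6^2 = 36 = 3^3 + 2*3 + 3, so this point lies on the toy curve
```

The ladder performs one addition and one doubling per scalar bit regardless of the bit value, which is the property a constant-time implementation builds upon.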
*	[crypto] Add a generic implementation of a Montgomery ladder	Michael Brown	2025-01-28	2	-34/+194
	The Montgomery ladder may be used to perform any operation that is isomorphic to exponentiation, i.e. to compute the result r = g^e = g * g * g * g * .... * g for an arbitrary commutative operation "*", base or generator "g", and exponent "e".
	Implement a generic Montgomery ladder for use by both modular exponentiation and elliptic curve point multiplication (both of which are isomorphic to exponentiation).
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
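The idea of a ladder that is generic over the operation can be sketched as follows. The function name and parameters are hypothetical, and the Python `if` is not constant-time; a real implementation would replace it with a data-independent swap.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def montgomery_ladder(op: Callable[[T, T], T], identity: T, g: T,
                      e: int, bits: int) -> T:
    """Compute g * g * ... * g (e operands) for a commutative operation op.

    The exponent e is scanned from its most significant bit; "bits" is the
    fixed number of exponent bits to process (e must fit within it).
    """
    # Invariant at the top of every iteration: r1 == op(r0, g)
    r0, r1 = identity, g
    for i in reversed(range(bits)):
        if (e >> i) & 1:
            r0, r1 = op(r0, r1), op(r1, r1)
        else:
            r0, r1 = op(r0, r0), op(r0, r1)
    return r0


def mod_exp(base: int, exponent: int, modulus: int, bits: int) -> int:
    """Modular exponentiation expressed through the generic ladder."""
    return montgomery_ladder(lambda a, b: (a * b) % modulus, 1 % modulus,
                             base % modulus, exponent, bits)


def mod_scale(value: int, count: int, modulus: int, bits: int) -> int:
    """Repeated modular addition: the same ladder with "+" as the operation."""
    return montgomery_ladder(lambda a, b: (a + b) % modulus, 0,
                             value % modulus, count, bits)
```

Swapping the operation from multiplication to addition (or to elliptic curve point addition) changes nothing in the ladder itself, which is what makes a single shared implementation possible.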
*	[test] Add generic tests for elliptic curve point multiplication	Michael Brown	2025-01-22	2	-0/+153
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[tls] Allow for NIST elliptic curve point formats	Michael Brown	2025-01-21	5	-11/+40
	The elliptic curve point representation for the x25519 curve includes only the X value, since the curve is designed such that the Montgomery ladder does not need to ever know or calculate a Y value. There is no curve point format byte: the public key data is simply the X value. The pre-master secret is also simply the X value of the shared secret curve point.
	The point representation for the NIST curves includes both X and Y values, and a single curve point format byte that must indicate that the format is uncompressed. The pre-master secret for the NIST curves does not include both X and Y values: only the X value is used.
	Extend the definition of an elliptic curve to allow the point size to be specified separately from the key size, and extend the definition of a TLS named curve to include an optional curve point format byte and a pre-master secret length.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
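The NIST point layout described above can be illustrated with a short sketch. The function names are hypothetical, and the format byte value 0x04 for uncompressed points is taken from RFC 8422 rather than from this commit.

```python
UNCOMPRESSED = 0x04  # uncompressed point format byte (per RFC 8422)


def encode_public_point(x: int, y: int, coord_len: int) -> bytes:
    """Format byte followed by big-endian X and Y, as sent in the key exchange."""
    return (bytes([UNCOMPRESSED]) + x.to_bytes(coord_len, "big")
            + y.to_bytes(coord_len, "big"))


def decode_public_point(data: bytes, coord_len: int):
    """Reject anything other than an uncompressed point of the expected size."""
    if len(data) != 1 + 2 * coord_len or data[0] != UNCOMPRESSED:
        raise ValueError("not an uncompressed point of the expected length")
    x = int.from_bytes(data[1:1 + coord_len], "big")
    y = int.from_bytes(data[1 + coord_len:], "big")
    return x, y


def pre_master_secret(shared_x: int, shared_y: int, coord_len: int) -> bytes:
    """Only the X value of the shared point is used; Y is discarded."""
    return shared_x.to_bytes(coord_len, "big")
```

For P-256 the coordinate length is 32 bytes, so the encoded public point is 65 bytes while the pre-master secret is only 32 bytes, which is why the point size and pre-master secret length need to be described separately.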
*	[crypto] Generalise elliptic curve key exchange to ecdhe_key()	Michael Brown	2025-01-21	3	-9/+87
	Split out the portion of tls_send_client_key_exchange_ecdhe() that actually performs the elliptic curve key exchange into a separate function ecdhe_key().
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Add bigint_ntoa() for transcribing big integers	Michael Brown	2025-01-20	2	-0/+60
	In debug messages, big integers are currently printed as hex dumps. This is quite verbose and cumbersome to check against external sources.
	Add bigint_ntoa() to transcribe big integers into a static buffer (following the model of inet_ntoa(), eth_ntoa(), uuid_ntoa(), etc).
	Abbreviate big integers that will not fit within the static buffer, showing both the most significant and least significant digits in the transcription. This is generally the most useful form when visually comparing against external sources (such as test vectors, or results produced by high-level languages).
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Extract bigint_reduce_supremum() from bigint_mod_exp()	Michael Brown	2025-01-10	2	-7/+44
	Calculating the Montgomery constant (R^2 mod N) is done in our implementation by zeroing the double-width representation of N, subtracting N once to give (R^2 - N) in order to obtain a positive value, then reducing this value modulo N.
	Extract this logic from bigint_mod_exp() to a separate function bigint_reduce_supremum(), to allow for reuse by other code.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
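The trick relies on the fact that R^2 itself does not fit in a double-width buffer: it wraps to zero. A small sketch with Python integers, using a hypothetical function name and a mask to emulate the fixed-width buffer:

```python
def montgomery_constant(n: int, k: int) -> int:
    """Compute R^2 mod N for R = 2^k using only a 2k-bit working value."""
    width = 2 * k
    mask = (1 << width) - 1
    value = 0                   # R^2 = 2^(2k) wraps to zero in 2k bits
    value = (value - n) & mask  # fixed-width subtraction yields R^2 - N
    return value % n            # (R^2 - N) mod N == R^2 mod N
```

For example, with a 16-bit modulus the working value is 32 bits wide and the result matches a direct computation of 2^32 mod N.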
*	[crypto] Allow for relaxed Montgomery reduction	Michael Brown	2024-12-18	3	-33/+184
	Classic Montgomery reduction involves a single conditional subtraction to ensure that the result is strictly less than the modulus.
	When performing chains of Montgomery multiplications (potentially interspersed with additions and subtractions), it can be useful to work with values that are stored modulo some small multiple of the modulus, thereby allowing some reductions to be elided.
	Each addition and subtraction stage will increase this running multiple, and the following multiplication stages can be used to reduce the running multiple since the reduction carried out for multiplication products is generally strong enough to absorb some additional bits in the inputs. This approach is already used in the x25519 code, where multiplication takes two 258-bit inputs and produces a 257-bit output.
	Split out the conditional subtraction from bigint_montgomery() and provide a separate bigint_montgomery_relaxed() for callers who do not require immediate reduction to within the range of the modulus.
	Modular exponentiation could potentially make use of relaxed Montgomery multiplication, but this would require R>4N, i.e. that the two most significant bits of the modulus be zero. For both RSA and DHE, this would necessitate extending the modulus size by one element, which would negate any speed increase from omitting the conditional subtractions. We therefore retain the use of classic Montgomery reduction for modular exponentiation, apart from the final conversion out of Montgomery form.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
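The R>4N condition can be checked numerically on a toy modulus. The sketch below (hypothetical names, not the iPXE API) omits the conditional subtraction and verifies that when both inputs are below 2N and R>4N, the output is again below 2N and still congruent to the expected value, so relaxed multiplications can be chained.

```python
def redc_relaxed(t: int, n: int, k: int) -> int:
    """Montgomery reduction of t with R = 2^k, without the final subtraction."""
    r_mask = (1 << k) - 1
    neg_n_inv = (-pow(n, -1, 1 << k)) & r_mask  # -N^-1 mod R (Python 3.8+)
    m = ((t & r_mask) * neg_n_inv) & r_mask
    return (t + m * n) >> k  # congruent to t * R^-1 modulo N


def relaxed_outputs_stay_below_2n(n: int, k: int) -> bool:
    """Exhaustively check every pair of inputs in [0, 2N) for a toy modulus."""
    r_inv = pow(1 << k, -1, n)
    for a in range(2 * n):
        for b in range(2 * n):
            out = redc_relaxed(a * b, n, k)
            if out >= 2 * n or out % n != (a * b * r_inv) % n:
                return False
    return True
```

The bound follows from t < 4N^2 < RN and mN < RN, giving (t + mN)/R < 2N. With N = 13 and R = 64 (so R > 4N = 52), the exhaustive check passes.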
*	[efi] Add EFI_TCG2_PROTOCOL header and GUID definition	Michael Brown	2024-12-17	4	-0/+345
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[efi] Update to current EDK2 headers	Michael Brown	2024-12-17	19	-84/+535
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Calculate inverse of modulus on demand in bigint_montgomery()	Michael Brown	2024-12-16	3	-36/+23
	Reduce the number of parameters passed to bigint_montgomery() by calculating the inverse of the modulus modulo the element size on demand. Cache the result, since Montgomery reduction will be used repeatedly with the same modulus value.
	In all currently supported algorithms, the modulus is a public value (or a fixed value defined by specification) and so this non-constant timing does not leak any private information.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[gve] Run startup process only while device is open	Michael Brown	2024-12-03	1	-1/+2
	The startup process is scheduled to run when the device is opened and terminated (if still running) when the device is closed. It assumes that the resource allocation performed in gve_open() has taken place, and that the admin and transmit/receive data structure pointers are therefore valid.
	The process initialisation in gve_probe() erroneously calls process_init() rather than process_init_stopped() and will therefore schedule the startup process immediately, before the relevant resources have been allocated.
	This bug is masked in the typical use case of a Google Cloud instance with a single NIC built with the config/cloud/gce.ipxe embedded script, since the embedded script will immediately open the NIC (and therefore allocate the required resources) before the scheduled process is allowed to run for the first time. In a multi-NIC instance, undefined behaviour will arise as soon as the startup process for the second NIC is allowed to run.
	Fix by using process_init_stopped() to avoid implicitly scheduling the startup process during gve_probe().
	Originally-fixed-by: Kal Cutter Conley <kalcutterc@nvidia.com>
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Remove obsolete bigint_mod_multiply()	Michael Brown	2024-11-28	3	-277/+0
	There is no further need for a standalone modular multiplication primitive, since the only consumer is modular exponentiation (which now uses Montgomery multiplication instead). Remove the now obsolete bigint_mod_multiply().
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Use Montgomery reduction for modular exponentiation	Michael Brown	2024-11-28	5	-29/+164
	Speed up modular exponentiation by using Montgomery reduction rather than direct modular reduction.
	Montgomery reduction in base 2^n requires the modulus to be coprime to 2^n, which would limit us to requiring that the modulus is an odd number. Extend the implementation to include support for exponentiation with even moduli via Garner's algorithm as described in "Montgomery reduction with even modulus" (Koç, 1994).
	Since almost all use cases for modular exponentiation require a large prime (and hence odd) modulus, the support for even moduli could potentially be removed in future.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Add bigint_montgomery() to perform Montgomery reduction	Michael Brown	2024-11-27	3	-0/+174
	Montgomery reduction is substantially faster than direct reduction, and is better suited for modular exponentiation operations. Add bigint_montgomery() to perform the Montgomery reduction operation (often referred to as "REDC"), along with some test vectors.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
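For readers unfamiliar with REDC, the textbook form of the operation can be sketched with Python integers. This is the classic algorithm with R = 2^k and an odd modulus, written with hypothetical names; it says nothing about how bigint_montgomery() arranges its buffers.

```python
def redc(t: int, n: int, k: int) -> int:
    """Return t * R^-1 mod N for R = 2^k, odd N < R, and 0 <= t < R * N."""
    r_mask = (1 << k) - 1
    neg_n_inv = (-pow(n, -1, 1 << k)) & r_mask  # -N^-1 mod R (Python 3.8+)
    m = ((t & r_mask) * neg_n_inv) & r_mask     # makes t + m*N divisible by R
    u = (t + m * n) >> k                        # exact division by R
    return u - n if u >= n else u               # single conditional subtraction


def to_montgomery(a: int, n: int, k: int) -> int:
    """Convert a into Montgomery form a * R mod N (directly, for illustration)."""
    return (a << k) % n
```

Multiplying two values in Montgomery form and applying redc() yields the product in Montgomery form; applying redc() to a single Montgomery-form value converts it back out.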
*	[crypto] Use inverse size as effective size for bigint_mod_invert()	Michael Brown	2024-11-27	2	-6/+11
	Montgomery reduction requires only the least significant element of an inverse modulo 2^k, which in turn depends upon only the least significant element of the invertend.
	Use the inverse size (rather than the invertend size) as the effective size for bigint_mod_invert(). This eliminates around 97% of the loop iterations for a typical 2048-bit RSA modulus.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
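Both claims can be checked with a few lines of arithmetic. The sketch assumes 64-bit elements and a loop that runs once per bit of the effective size; both are assumptions made here for illustration, as are the function names.

```python
ELEMENT_BITS = 64  # assumed element width, for illustration only


def inverse_low_element(n: int) -> int:
    """Inverse modulo 2^64, computed from the lowest element of n alone."""
    mask = (1 << ELEMENT_BITS) - 1
    return pow(n & mask, -1, 1 << ELEMENT_BITS)  # Python 3.8+


def skipped_fraction(modulus_bits: int, element_bits: int) -> float:
    """Fraction of per-bit iterations avoided by inverting one element only."""
    return 1 - element_bits / modulus_bits


# An arbitrary odd 2048-bit value standing in for an RSA modulus
N_2048 = (1 << 2047) + 0xDEADBEEFCAFEF00D
```

The lowest 64 bits of the full 2048-bit inverse equal the inverse computed from the lowest element alone, and 1 - 64/2048 is 0.96875, consistent with the "around 97%" figure under these assumptions.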
*	[crypto] Eliminate temporary working space for bigint_mod_invert()	Michael Brown	2024-11-27	3	-46/+65
	With a slight modification to the algorithm to ignore bits of the residue that can never contribute to the result, it is possible to reuse the as-yet uncalculated portions of the inverse to hold the residue. This removes the requirement for additional temporary working space.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Eliminate temporary working space for bigint_reduce()	Michael Brown	2024-11-26	3	-117/+72
	Direct modular reduction is expected to be used in situations where there is no requirement to retain the original (unreduced) value.
	Modify the API for bigint_reduce() to reduce the value in place (removing the separate result buffer), impose a constraint that the modulus and value have the same size, and require the modulus to be passed in writable memory (to allow for scaling in place). This removes the requirement for additional temporary working space.
	Reverse the order of arguments so that the constant input is first, to match the usage pattern for bigint_add() et al.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Expose carry flag from big integer addition and subtraction	Michael Brown	2024-11-26	8	-85/+140
	Expose the effective carry (or borrow) out flag from big integer addition and subtraction, and use this to elide an explicit bit test when performing x25519 reduction.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Add bigint_msb_is_set() to clarify code	Michael Brown	2024-11-20	3	-5/+30
	Add a dedicated bigint_msb_is_set() to reduce the amount of open coding required in the common case of testing the sign of a two's complement big integer.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[efi] Ensure local drives are connected when attempting a SAN boot	Michael Brown	2024-11-20	1	-0/+3
	UEFI systems may choose not to connect drivers for local disk drives when the boot policy is set to attempt a network boot. This may cause the "sanboot" command to be unable to boot from a local drive, since the relevant block device and filesystem drivers may not have been connected.
	Fix by ensuring that all available drivers are connected before attempting to boot from an EFI block device.
	Reported-by: Andrew Cottrell <andrew.cottrell@xtxmarkets.com>
	Tested-by: Andrew Cottrell <andrew.cottrell@xtxmarkets.com>
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[build] Allow for per-architecture cross-compilation prefixes	Michael Brown	2024-10-29	2	-128/+139
	We currently require the variable CROSS (or CROSS_COMPILE) to be set to specify the global cross-compilation prefix. This becomes cumbersome when developing across multiple CPU architectures, requiring frequent editing of build command lines and preventing incompatible architectures from being built with a single command.
	Allow a default cross-compilation prefix for each architecture to be specified via the CROSS_COMPILE_<arch> variables. These may then be provided as environment variables, e.g. using
	  export CROSS_COMPILE_arm32=arm-linux-gnu-
	  export CROSS_COMPILE_arm64=aarch64-linux-gnu-
	  export CROSS_COMPILE_loong64=loongarch64-linux-gnu-
	  export CROSS_COMPILE_riscv32=riscv64-linux-gnu-
	  export CROSS_COMPILE_riscv64=riscv64-linux-gnu-
	This change requires some portions of the Makefile to be rearranged, to allow for the fact that $(CROSS_COMPILE) may not have been set until the build directory has been parsed to determine the CPU architecture.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[riscv] Check if seed CSR is accessible from S-mode	Michael Brown	2024-10-28	2	-0/+82
	The seed CSR defined by the Zkr extension is accessible only in M-mode by default. Older versions of OpenSBI (prior to version 1.4) do not set mseccfg.sseed, with the result that attempts to access the seed CSR from S-mode will raise an illegal instruction exception.
	Add a facility for testing the accessibility of arbitrary CSRs, and use it to check that the seed CSR is accessible before reporting the seed CSR entropy source as being functional.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[sbi] Add support for running as a RISC-V SBI payload	Michael Brown	2024-10-28	16	-0/+532
	Add basic support for running directly on top of SBI, with no UEFI firmware present. Build as e.g.:
	  make CROSS=riscv64-linux-gnu- bin-riscv64/ipxe.sbi
	The resulting binary can be tested in QEMU using e.g.:
	  qemu-system-riscv64 -M virt -cpu max -serial stdio \
	    -kernel bin-riscv64/ipxe.sbi
	No drivers or executable binary formats are supported yet, but the unit test suite may be run successfully.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[build] Allow default platform to vary by architecture	Michael Brown	2024-10-28	1	-5/+12
	Restructure the parsing of the build directory name from bin[[-<arch>]-<platform>] to bin[-<arch>[-<platform>]] and allow for a per-architecture default build platform.
	For the sake of backwards compatibility, handle "bin-efi" as a special case equivalent to "bin-i386-efi".
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
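The new directory naming rule can be modelled in a few lines. The real parsing lives in the Makefile; the Python below is only an illustration, and the default architecture and per-architecture default platforms in the table are hypothetical placeholders rather than values taken from this commit.

```python
# Hypothetical defaults, for illustration only
DEFAULT_ARCH = "i386"
DEFAULT_PLATFORM = {"i386": "pcbios", "riscv64": "sbi"}
FALLBACK_PLATFORM = "efi"


def parse_build_dir(name: str):
    """Split bin[-<arch>[-<platform>]] into an (arch, platform) pair."""
    if name == "bin-efi":
        name = "bin-i386-efi"  # backwards-compatibility special case
    parts = name.split("-")
    if parts[0] != "bin" or len(parts) > 3:
        raise ValueError("unrecognised build directory: " + name)
    arch = parts[1] if len(parts) > 1 else DEFAULT_ARCH
    if len(parts) > 2:
        platform = parts[2]
    else:
        platform = DEFAULT_PLATFORM.get(arch, FALLBACK_PLATFORM)
    return arch, platform
```

Because the architecture is now the first optional component, a name such as "bin-riscv64" can pick up a default platform that differs from the one used for "bin".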
*	[pci] Provide a null PCI API for platforms with no PCI bus	Michael Brown	2024-10-28	3	-0/+198
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[riscv] Add missing volatile qualifiers on timer and seed CSR accesses	Michael Brown	2024-10-28	2	-9/+11
	The timer and entropy seed CSRs will, by design, return different values each time they are read. Add the missing volatile qualifiers on the inline assembly to prevent gcc from assuming that repeated invocations may be elided.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[riscv] Add support for the seed CSR as an entropy source	Michael Brown	2024-10-28	3	-0/+114
	The Zkr entropy source extension defines a potentially unprivileged seed CSR that can be read to obtain 16 bits of entropy input, with a mandated requirement that 256 entropy input bits read from the seed CSR will contain at least 128 bits of min-entropy.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[riscv] Add support for RDTIME as a timer source	Michael Brown	2024-10-28	3	-0/+197
	The Zicntr extension defines an unprivileged wall-clock time CSR that roughly matches the behaviour of an invariant TSC on x86. The nominal frequency of this timer may be read from the "timebase-frequency" property of the CPU node in the device tree.
	Add a timer source using RDTIME to provide implementations of udelay() and currticks(), modelled on the existing RDTSC-based timer for x86.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[riscv] Add support for checking CPU extensions reported via device tree	Michael Brown	2024-10-28	3	-0/+117
	RISC-V seems to allow for direct discovery of CPU features only from M-mode (e.g. by setting up a trap handler and then attempting to access a CSR), with S-mode code expected to read the resulting constructed ISA description from the device tree.
	Add the ability to check for the presence of named extensions listed in the "riscv,isa" property of the device tree node corresponding to the boot hart.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
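As a rough illustration of what such a check involves, the sketch below tests a "riscv,isa" string for a named extension. It assumes the common layout of an "rv32" or "rv64" prefix, a run of single-letter extensions, and then underscore-separated multi-letter extensions; real ISA strings can be more varied, and the function name is hypothetical.

```python
def isa_has_extension(isa: str, ext: str) -> bool:
    """Check a "riscv,isa" string such as "rv64imafdc_zicntr_zkr"."""
    isa = isa.lower()
    ext = ext.lower()
    head, *multi_letter = isa.split("_")
    if not head.startswith(("rv32", "rv64")):
        return False
    single_letter = head[4:]  # letters following the "rv32"/"rv64" prefix
    if len(ext) == 1:
        return ext in single_letter
    return ext in multi_letter  # exact match against each named extension
```

A caller such as a timer or entropy driver could then gate itself on "zicntr" or "zkr" being present before touching the corresponding CSR.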
*	[fdt] Add ability to parse unsigned integer properties	Michael Brown	2024-10-28	2	-0/+39
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[pci] Drag in PCI settings mechanism only when PCI support is present	Michael Brown	2024-10-25	3	-3/+42
	Allow for the existence of platforms with no PCI bus by including the PCI settings mechanism only if PCI bus support is included.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[uaccess] Rename UACCESS_EFI to UACCESS_FLAT	Michael Brown	2024-10-25	4	-118/+89
	Running with flat physical addressing is a fairly common early boot environment. Rename UACCESS_EFI to UACCESS_FLAT so that this code may be reused in non-UEFI boot environments that also use flat physical addressing.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[smbios] Provide a null SMBIOS API for platforms with no concept of SMBIOS	Michael Brown	2024-10-25	4	-0/+67
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[riscv] Add support for reboot and power off via SBI	Michael Brown	2024-10-22	5	-0/+120
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[riscv] Add support for the SBI debug console	Michael Brown	2024-10-22	5	-0/+263
	Add the ability to issue Supervisor Binary Interface (SBI) calls via the ECALL instruction, and use the SBI DBCN extension to implement a debug console.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Add bigint_mod_invert() to calculate inverse modulo a power of two	Michael Brown	2024-10-21	3	-0/+143
	Montgomery multiplication requires calculating the inverse of the modulus modulo a larger power of two.
	Add bigint_mod_invert() to calculate the inverse of any (odd) big integer modulo an arbitrary power of two, using a lightly modified version of the algorithm presented in "A New Algorithm for Inversion mod p^k" (Koç, 2017).
	The power of two is taken to be 2^k, where k is the number of bits available in the big integer representation of the invertend. The inverse modulo any smaller power of two may be obtained simply by masking off the relevant bits in the inverse.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
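A bit-at-a-time inversion modulo 2^k can be sketched as below. It follows the general shape of the cited algorithm specialised to p = 2 (one bit of the inverse is produced per iteration from a running residue), but it is written from the textbook description with a hypothetical function name, and makes no claim to match the "lightly modified" version in iPXE.

```python
def mod_invert_pow2(a: int, k: int) -> int:
    """Return x with (a * x) mod 2^k == 1, for odd a."""
    if a % 2 == 0:
        raise ValueError("only odd values are invertible modulo 2^k")
    residue = 1
    inverse = 0
    for i in range(k):
        bit = residue & 1                   # next bit of the inverse
        inverse |= bit << i
        residue = (residue - a * bit) >> 1  # always an exact division by two
    return inverse
```

Bit i of the result depends only on the low i+1 bits of the input, which is why masking the inverse gives the inverse modulo any smaller power of two.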
*	[usb] Expose USB device descriptor and strings via settings	Michael Brown	2024-10-18	6	-6/+191
	Allow scripts to read basic information from USB device descriptors via the settings mechanism. For example:
	  echo USB vendor ID: ${usb/${busloc}.8.2}
	  echo USB device ID: ${usb/${busloc}.10.2}
	  echo USB manufacturer name: ${usb/${busloc}.14.0}
	The general syntax is
	  usb/<bus:dev>.<offset>.<length>
	where bus:dev is the USB bus:device address (as obtained via the "usbscan" command, or from e.g. ${net0/busloc} for a USB network device), and <offset> and <length> select the required portion of the USB device descriptor.
	Following the usage of SMBIOS settings tags, a <length> of zero may be used to indicate that the byte at <offset> contains a USB string descriptor index, and an <offset> of zero may be used to indicate that the <length> contains a literal USB string descriptor index.
	Since the byte at offset zero can never contain a string index, and a literal string index can never be zero, the combination of both <length> and <offset> being zero may be used to indicate that the entire device descriptor is to be read as a raw hex dump.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
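The four cases of the offset/length encoding can be summarised with a small classifier. This is an illustrative model of the rules stated above, not the settings parser itself; the function name, the kind labels, and the "1:2" address used in examples are all hypothetical, and the bus:dev part is treated as an opaque string.

```python
def parse_usb_setting(name: str):
    """Classify a setting name of the form usb/<bus:dev>.<offset>.<length>."""
    scope, _, tag = name.partition("/")
    if scope != "usb":
        raise ValueError("not a USB setting: " + name)
    busdev, offset_text, length_text = tag.rsplit(".", 2)
    offset, length = int(offset_text), int(length_text)
    if offset == 0 and length == 0:
        kind = "raw"               # whole device descriptor as a hex dump
    elif length == 0:
        kind = "string_at_offset"  # byte at <offset> holds a string index
    elif offset == 0:
        kind = "string_literal"    # <length> is itself the string index
    else:
        kind = "bytes"             # <length> bytes starting at <offset>
    return busdev, offset, length, kind
```

With this model, the ".8.2" example selects two bytes at offset 8 (the vendor ID field), while ".14.0" is resolved through the string descriptor index stored at offset 14.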
*	[usb] Add "usbscan" command for iterating over USB devices	Michael Brown	2024-10-17	6	-2/+225
	Implement a "usbscan" command as a direct analogy of the existing "pciscan" command, allowing scripts to iterate over all detected USB devices.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Separate out bigint_reduce() from bigint_mod_multiply()	Michael Brown	2024-10-15	3	-37/+296
	Faster modular multiplication algorithms such as Montgomery multiplication will still require the ability to perform a single direct modular reduction.
	Neaten up the implementation of direct reduction and split it out into a separate bigint_reduce() function, complete with its own unit tests.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Use architecture-independent bigint_is_set()	Michael Brown	2024-10-10	6	-95/+19
	Every architecture uses the same implementation for bigint_is_set(), and there is no reason to suspect that a future CPU architecture will provide a more efficient way to implement this operation.
	Simplify the code by providing a single architecture-independent implementation of bigint_is_set().
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Rename bigint_rol()/bigint_ror() to bigint_shl()/bigint_shr()	Michael Brown	2024-10-07	8	-60/+60
	The big integer shift operations are misleadingly described as rotations since the original x86 implementations are essentially trivial loops around the relevant rotate-through-carry instruction.
	The overall operation performed is a shift rather than a rotation. Update the function names and descriptions to reflect this.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[crypto] Eliminate temporary carry space for big integer multiplication	Michael Brown	2024-09-27	9	-197/+111
	An n-bit multiplication product may be added to up to two n-bit integers without exceeding the range of a (2n)-bit integer:
	  (2^n - 1)*(2^n - 1) + (2^n - 1) + (2^n - 1) = 2^(2n) - 1
	Exploit this to perform big integer multiplication in constant time without requiring the caller to provide temporary carry space.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
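The identity is easy to confirm for a concrete element width. The sketch below models a single multiply-accumulate step with 32-bit elements (an arbitrary choice for illustration) and a hypothetical function name; it shows that even the worst-case inputs fit exactly in a double-width result.

```python
BITS = 32
MASK = (1 << BITS) - 1


def multiply_accumulate(a: int, b: int, addend: int, carry: int):
    """Return (low, high) elements of a*b + addend + carry for n-bit inputs."""
    total = a * b + addend + carry
    # Never exceeds 2^(2n) - 1, so there is no third element to carry into
    assert total >> (2 * BITS) == 0
    return total & MASK, total >> BITS
```

With every input set to 2^n - 1, the sum is exactly 2^(2n) - 1, i.e. both output elements are all-ones and nothing overflows.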
*	[arm] Support building as a Linux userspace binary for AArch32	Michael Brown	2024-09-24	1	-0/+25
	Add support for building as a Linux userspace binary for AArch32. This allows the self-test suite to be more easily run for the 32-bit ARM code. For example:
	  make CROSS=arm-linux-gnu- bin-arm32-linux/tests.linux
	  qemu-arm -L /usr/arm-linux-gnu/sys-root/ \
	    ./bin-arm32-linux/tests.linux
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[arm] Check PMCCNTR availability before use for profiling	Michael Brown	2024-09-24	2	-3/+99
	Reading from PMCCNTR causes an undefined instruction exception when running in PL0 (e.g. as a Linux userspace binary), unless the PMUSERENR.EN bit is set.
	Restructure profile_timestamp() for 32-bit ARM to perform an availability check on the first invocation, with subsequent invocations returning zero if PMCCNTR could not be enabled.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
*	[profile] Standardise return type of profile_timestamp()	Michael Brown	2024-09-24	8	-45/+11
	All consumers of profile_timestamp() currently treat the value as an unsigned long. Only the elapsed number of ticks is ever relevant: the absolute value of the timestamp is not used. Profiling is used to measure short durations that are generally fewer than a million CPU cycles, for which an unsigned long is easily large enough.
	Standardise the return type of profile_timestamp() as unsigned long across all CPU architectures. This allows 32-bit architectures such as i386 and riscv32 to omit all logic associated with retrieving the upper 32 bits of the 64-bit hardware counter, which simplifies the code and allows riscv32 and riscv64 to share the same implementation.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
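The reason the upper 32 bits can be dropped is that unsigned subtraction wraps: the difference between two truncated timestamps is still correct as long as the true elapsed count fits in the truncated width. A small model, assuming a 32-bit unsigned long and using a hypothetical function name:

```python
ULONG_BITS = 32  # width of unsigned long on a 32-bit architecture (assumed)
ULONG_MASK = (1 << ULONG_BITS) - 1


def elapsed_ticks(start: int, stop: int) -> int:
    """Elapsed ticks between two truncated timestamps, modulo 2^32."""
    return (stop - start) & ULONG_MASK


def truncate(counter64: int) -> int:
    """Keep only the low 32 bits of a 64-bit hardware counter."""
    return counter64 & ULONG_MASK
```

Even when the low word of the counter wraps between the two readings, the masked difference recovers the true elapsed count.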
*	[crypto] Use constant-time big integer multiplication	Michael Brown	2024-09-23	14	-612/+355
	Big integer multiplication currently performs immediate carry propagation from each step of the long multiplication, relying on the fact that the overall result has a known maximum value to minimise the number of carries performed without ever needing to explicitly check against the result buffer size.
	This is not a constant-time algorithm, since the number of carries performed will be a function of the input values. We could make it constant-time by always continuing to propagate the carry until reaching the end of the result buffer, but this would introduce a large number of redundant zero carries.
	Require callers of bigint_multiply() to provide a temporary carry storage buffer, of the same size as the result buffer. This allows the carry-out from the accumulation of each double-element product to be accumulated in the temporary carry space, and then added in via a single call to bigint_add() after the multiplication is complete.
	Since the structure of big integer multiplication is identical across all current CPU architectures, provide a single shared implementation of bigint_multiply(). The architecture-specific operation then becomes the multiplication of two big integer elements and the accumulation of the double-element product.
	Note that any intermediate carry arising from accumulating the lower half of the double-element product may be added to the upper half of the double-element product without risk of overflow, since the result of multiplying two n-bit integers can never have all n bits set in its upper half. This simplifies the carry calculations for architectures such as RISC-V and LoongArch64 that do not have a carry flag.
	Signed-off-by: Michael Brown <mcb30@ipxe.org>
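The closing note about the upper half can be checked directly: the largest product of two n-bit values is (2^n - 1)^2 = 2^(2n) - 2^(n+1) + 1, whose upper half is 2^n - 2. The sketch below models accumulating into the lower half on an architecture without a carry flag, using 32-bit elements and a hypothetical function name for illustration.

```python
BITS = 32
MASK = (1 << BITS) - 1


def accumulate_product(a: int, b: int, acc_low: int):
    """Add a*b to an accumulator element, folding the carry into the high half."""
    product = a * b
    low, high = product & MASK, product >> BITS
    low += acc_low
    carry = low >> BITS  # 0 or 1, computed without a carry flag
    high += carry        # safe: high was at most 2^n - 2 before the addition
    assert high <= MASK
    return low & MASK, high


def max_high_half() -> int:
    """Upper half of the largest possible double-element product."""
    return (MASK * MASK) >> BITS
```

In the worst case the upper half becomes exactly 2^n - 1 after absorbing the carry, so it never overflows its element.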