aboutsummaryrefslogtreecommitdiffstats
path: root/src/crypto/bigint.c
Commit message (Collapse)AuthorAgeFilesLines
* [crypto] Support direct reduction only for Montgomery constant R^2 mod NMichael Brown9 days1-146/+102
| | | | | | | | | | | | | | | | | | | | | | The only remaining use case for direct reduction (outside of the unit tests) is in calculating the constant R^2 mod N used during Montgomery multiplication. The current implementation of direct reduction requires a writable copy of the modulus (to allow for shifting), and both the modulus and the result buffer must be padded to be large enough to hold (R^2 - N), which is twice the size of the actual values involved. For the special case of reducing R^2 mod N (or any power of two mod N), we can run the same algorithm without needing either a writable copy of the modulus or a padded result buffer. The working state required is only two bits larger than the result buffer, and these additional bits may be held in local variables instead. Rewrite bigint_reduce() to handle only this use case, and remove the no longer necessary uses of double-sized big integers. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Add a generic implementation of a Montgomery ladderMichael Brown2025-01-281-34/+154
| | | | | | | | | | | | | | | | The Montgomery ladder may be used to perform any operation that is isomorphic to exponentiation, i.e. to compute the result r = g^e = g * g * g * g * .... * g for an arbitrary commutative operation "*", base or generator "g", and exponent "e". Implement a generic Montgomery ladder for use by both modular exponentiation and elliptic curve point multiplication (both of which are isomorphic to exponentiation). Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Add bigint_ntoa() for transcribing big integersMichael Brown2025-01-201-0/+47
| | | | | | | | | | | | | | | | | In debug messages, big integers are currently printed as hex dumps. This is quite verbose and cumbersome to check against external sources. Add bigint_ntoa() to transcribe big integers into a static buffer (following the model of inet_ntoa(), eth_ntoa(), uuid_ntoa(), etc). Abbreviate big integers that will not fit within the static buffer, showing both the most significant and least significant digits in the transcription. This is generally the most useful form when visually comparing against external sources (such as test vectors, or results produced by high-level languages). Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Extract bigint_reduce_supremum() from bigint_mod_exp()Michael Brown2025-01-101-4/+26
| | | | | | | | | | | | Calculating the Montgomery constant (R^2 mod N) is done in our implementation by zeroing the double-width representation of N, subtracting N once to give (R^2 - N) in order to obtain a positive value, then reducing this value modulo N. Extract this logic from bigint_mod_exp() to a separate function bigint_reduce_supremum(), to allow for reuse by other code. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Allow for relaxed Montgomery reductionMichael Brown2024-12-181-20/+155
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Classic Montgomery reduction involves a single conditional subtraction to ensure that the result is strictly less than the modulus. When performing chains of Montgomery multiplications (potentially interspersed with additions and subtractions), it can be useful to work with values that are stored modulo some small multiple of the modulus, thereby allowing some reductions to be elided. Each addition and subtraction stage will increase this running multiple, and the following multiplication stages can be used to reduce the running multiple since the reduction carried out for multiplication products is generally strong enough to absorb some additional bits in the inputs. This approach is already used in the x25519 code, where multiplication takes two 258-bit inputs and produces a 257-bit output. Split out the conditional subtraction from bigint_montgomery() and provide a separate bigint_montgomery_relaxed() for callers who do not require immediate reduction to within the range of the modulus. Modular exponentiation could potentially make use of relaxed Montgomery multiplication, but this would require R>4N, i.e. that the two most significant bits of the modulus be zero. For both RSA and DHE, this would necessitate extending the modulus size by one element, which would negate any speed increase from omitting the conditional subtractions. We therefore retain the use of classic Montgomery reduction for modular exponentiation, apart from the final conversion out of Montgomery form. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Calculate inverse of modulus on demand in bigint_montgomery()Michael Brown2024-12-161-22/+18
| | | | | | | | | | | | | Reduce the number of parameters passed to bigint_montgomery() by calculating the inverse of the modulus modulo the element size on demand. Cache the result, since Montgomery reduction will be used repeatedly with the same modulus value. In all currently supported algorithms, the modulus is a public value (or a fixed value defined by specification) and so this non-constant timing does not leak any private information. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Remove obsolete bigint_mod_multiply()Michael Brown2024-11-281-53/+0
| | | | | | | | | | There is no further need for a standalone modular multiplication primitive, since the only consumer is modular exponentiation (which now uses Montgomery multiplication instead). Remove the now obsolete bigint_mod_multiply(). Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Use Montgomery reduction for modular exponentiationMichael Brown2024-11-281-15/+132
| | | | | | | | | | | | | | | | | Speed up modular exponentiation by using Montgomery reduction rather than direct modular reduction. Montgomery reduction in base 2^n requires the modulus to be coprime to 2^n, which would limit us to requiring that the modulus is an odd number. Extend the implementation to include support for exponentiation with even moduli via Garner's algorithm as described in "Montgomery reduction with even modulus" (Koç, 1994). Since almost all use cases for modular exponentation require a large prime (and hence odd) modulus, the support for even moduli could potentially be removed in future. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Add bigint_montgomery() to perform Montgomery reductionMichael Brown2024-11-271-0/+77
| | | | | | | | | | Montgomery reduction is substantially faster than direct reduction, and is better suited for modular exponentiation operations. Add bigint_montgomery() to perform the Montgomery reduction operation (often referred to as "REDC"), along with some test vectors. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Eliminate temporary working space for bigint_mod_invert()Michael Brown2024-11-271-22/+41
| | | | | | | | | | With a slight modification to the algorithm to ignore bits of the residue that can never contribute to the result, it is possible to reuse the as-yet uncalculated portions of the inverse to hold the residue. This removes the requirement for additional temporary working space. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Eliminate temporary working space for bigint_reduce()Michael Brown2024-11-261-46/+25
| | | | | | | | | | | | | | | | Direct modular reduction is expected to be used in situations where there is no requirement to retain the original (unreduced) value. Modify the API for bigint_reduce() to reduce the value in place, (removing the separate result buffer), impose a constraint that the modulus and value have the same size, and require the modulus to be passed in writable memory (to allow for scaling in place). This removes the requirement for additional temporary working space. Reverse the order of arguments so that the constant input is first, to match the usage pattern for bigint_add() et al. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Add bigint_msb_is_set() to clarify codeMichael Brown2024-11-201-3/+2
| | | | | | | | Add a dedicated bigint_msb_is_set() to reduce the amount of open coding required in the common case of testing the sign of a two's complement big integer. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Add bigint_mod_invert() to calculate inverse modulo a power of twoMichael Brown2024-10-211-0/+54
| | | | | | | | | | | | | | | | | Montgomery multiplication requires calculating the inverse of the modulus modulo a larger power of two. Add bigint_mod_invert() to calculate the inverse of any (odd) big integer modulo an arbitrary power of two, using a lightly modified version of the algorithm presented in "A New Algorithm for Inversion mod p^k (Koç, 2017)". The power of two is taken to be 2^k, where k is the number of bits available in the big integer representation of the invertend. The inverse modulo any smaller power of two may be obtained simply by masking off the relevant bits in the inverse. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Separate out bigint_reduce() from bigint_mod_multiply()Michael Brown2024-10-151-37/+176
| | | | | | | | | | | Faster modular multiplication algorithms such as Montgomery multiplication will still require the ability to perform a single direct modular reduction. Neaten up the implementation of direct reduction and split it out into a separate bigint_reduce() function, complete with its own unit tests. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Rename bigint_rol()/bigint_ror() to bigint_shl()/bigint_shr()Michael Brown2024-10-071-8/+8
| | | | | | | | | | | The big integer shift operations are misleadingly described as rotations since the original x86 implementations are essentially trivial loops around the relevant rotate-through-carry instruction. The overall operation performed is a shift rather than a rotation. Update the function names and descriptions to reflect this. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Eliminate temporary carry space for big integer multiplicationMichael Brown2024-09-271-76/+32
| | | | | | | | | | | | An n-bit multiplication product may be added to up to two n-bit integers without exceeding the range of a (2n)-bit integer: (2^n - 1)*(2^n - 1) + (2^n - 1) + (2^n - 1) = 2^(2n) - 1 Exploit this to perform big integer multiplication in constant time without requiring the caller to provide temporary carry space. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Use constant-time big integer multiplicationMichael Brown2024-09-231-2/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Big integer multiplication currently performs immediate carry propagation from each step of the long multiplication, relying on the fact that the overall result has a known maximum value to minimise the number of carries performed without ever needing to explicitly check against the result buffer size. This is not a constant-time algorithm, since the number of carries performed will be a function of the input values. We could make it constant-time by always continuing to propagate the carry until reaching the end of the result buffer, but this would introduce a large number of redundant zero carries. Require callers of bigint_multiply() to provide a temporary carry storage buffer, of the same size as the result buffer. This allows the carry-out from the accumulation of each double-element product to be accumulated in the temporary carry space, and then added in via a single call to bigint_add() after the multiplication is complete. Since the structure of big integer multiplication is identical across all current CPU architectures, provide a single shared implementation of bigint_multiply(). The architecture-specific operation then becomes the multiplication of two big integer elements and the accumulation of the double-element product. Note that any intermediate carry arising from accumulating the lower half of the double-element product may be added to the upper half of the double-element product without risk of overflow, since the result of multiplying two n-bit integers can never have all n bits set in its upper half. This simplifies the carry calculations for architectures such as RISC-V and LoongArch64 that do not have a carry flag. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Add bigint_swap() to conditionally swap big integersMichael Brown2024-01-191-0/+25
| | | | | | | Add a helper function bigint_swap() that can be used to conditionally swap a pair of big integers in constant time. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Profile the various stages of modular multiplicationMichael Brown2019-08-171-0/+29
| | | | Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [legal] Relicense files under GPL2_OR_LATER_OR_UBDLMichael Brown2015-03-021-1/+5
| | | | | | | Relicense files for which I am the sole author (as identified by util/relicense.pl). Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [legal] Update FSF mailing address in GPL licence textsMichael Brown2012-07-201-1/+2
| | | | | Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Force caller to provide temporary storage for modular calculationsMichael Brown2012-03-181-25/+37
| | | | | | | | | | | | | bigint_mod_multiply() and bigint_mod_exp() require a fixed amount of temporary storage for intermediate results. (The amount of temporary storage required depends upon the size of the integers involved.) When performing calculations for 4096-bit RSA the amount of temporary storage space required will exceed 2.5kB, which is too much to allocate on the stack. Avoid this problem by forcing the caller to allocate temporary storage. Signed-off-by: Michael Brown <mcb30@ipxe.org>
* [crypto] Add big-integer library for RSA calculationsMichael Brown2012-03-131-0/+122
RSA requires modular exponentiation using arbitrarily large integers. Given the sizes of the modulus and exponent, all required calculations can be done without any further dynamic storage allocation. The x86 architecture allows for efficient large integer support via inline assembly using the instructions that take advantage of the carry flag (e.g. "adcl", "rcrl"). This implemention is approximately 80% smaller than the (more generic) AXTLS implementation. Signed-off-by: Michael Brown <mcb30@ipxe.org>