BigInt #120

xwu · 2020-06-08T20:31:09Z

Here's another stab at implementing a BigInt type.

Implementation

I started with a sign-magnitude representation, then made a few gradual refinements (sadly, the Git history got nuked somewhere in there):

Taking inspiration from the Java implementation, I switched to storing the signum instead of the sign; this allows us to eliminate the two possible representations of zero.
I decided to include the count of trailing zero words (i.e., an exponent) in a "combination" field, where _combination = _signum * (_exponent + 1). This not only permits constant-time random access of the two's complement representation of negative values, but allows us not to store those trailing zeros. To determine whether a value is a power of two, for instance, it suffices to check whether there is only one nonzero word that has only one nonzero bit.

The resultant representation is somewhat reminiscent of IEEE floating point, with each value decomposed into a signum, exponent, and significand such that notionally value = _signum * _significand << (UInt.bitWidth * _exponent). Overall, it seems to meld the best of both worlds (sign-magnitude and two's complement) for the reasons given above. I don't know of another implementation of BigInt that takes this approach, so either it's something new and worthwhile or a very silly diversion.
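As a rough illustration of the decomposition described above, here is a minimal sketch. The type name `SketchBigInt` and its property names are hypothetical and are not the PR's actual declarations; it assumes the significand is stored least significant word first, normalized so that it has no leading or trailing zero words.

```swift
// Hypothetical illustration type; not the BigInt defined in this PR.
struct SketchBigInt {
  /// signum * (exponent + 1); zero if and only if the value is zero.
  var combination: Int
  /// Magnitude words, least significant first, trailing zero words stripped.
  var significand: [UInt]

  /// -1, 0, or 1.
  var signum: Int { combination.signum() }

  /// Number of zero words omitted below `significand`.
  var exponent: Int { combination == 0 ? 0 : abs(combination) - 1 }

  /// A positive value is a power of two when its single stored word
  /// has exactly one bit set (assumes a normalized significand).
  var isPowerOfTwo: Bool {
    signum > 0 && significand.count == 1 && significand[0].nonzeroBitCount == 1
  }
}

// +1 << UInt.bitWidth: one stored word, one stripped zero word.
let twoToOneWord = SketchBigInt(combination: 2, significand: [1])
// -3: negative sign, no stripped words.
let minusThree = SketchBigInt(combination: -1, significand: [3])
```

Because the sign and the exponent share one field, zero has exactly one encoding (`combination == 0`).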

Taking @lorentey's point that it is best to have some accommodation for storing smaller numbers without allocating an array, I take a slightly different approach here: instead of storing two words in a tuple and any longer values in an array, this implementation unconditionally stores the low word of the significand inline. In this way, serially incrementing or decrementing even most large values should not trigger a copy-on-write operation.

Both schoolbook and Karatsuba multiplication are implemented; currently, ~~the cutoff is hardcoded at 16 words, but perhaps this should be tunable (although would such an API need to be thread-safe?)~~ [edit: Karatsuba multiplication is temporarily disabled].
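For reference, schoolbook multiplication on word arrays can be sketched as follows. This is a simplified stand-in, not the code in this PR: the function name `schoolbookMultiply` is hypothetical, operands are plain `[UInt]` magnitudes stored least significant word first, and the result is not normalized.

```swift
/// Multiplies two magnitudes stored as little-endian arrays of words.
/// The result has `lhs.count + rhs.count` words and may end in zero words.
func schoolbookMultiply(_ lhs: [UInt], _ rhs: [UInt]) -> [UInt] {
  if lhs.isEmpty || rhs.isEmpty { return [] }
  var result = [UInt](repeating: 0, count: lhs.count + rhs.count)
  for i in lhs.indices {
    var carry: UInt = 0
    for j in rhs.indices {
      // Double-width product of one word pair.
      let (high, low) = lhs[i].multipliedFullWidth(by: rhs[j])
      // Add the word already accumulated at this position, then the carry.
      let (sum1, overflow1) = low.addingReportingOverflow(result[i + j])
      let (sum2, overflow2) = sum1.addingReportingOverflow(carry)
      result[i + j] = sum2
      // This cannot overflow: product + word + carry always fits in two words.
      carry = high + (overflow1 ? 1 : 0) + (overflow2 ? 1 : 0)
    }
    // Nothing has been written to this word yet in earlier rows.
    result[i + rhs.count] = carry
  }
  return result
}
```

A Karatsuba implementation would typically fall back to a routine like this once operands drop below the cutoff.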

Bitwise operators perform as they should (working with the logical two's complement representation—in fact, for simplicity, their implementations make use of efficient constant-time random access to the actual two's complement representation that is provided by Words).
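To make the constant-time claim concrete, here is a sketch of how a single word of the two's complement representation could be computed from a sign, an exponent, and magnitude words. The function `twosComplementWord` and its parameters are hypothetical, not the PR's `Words` implementation. It relies on the lowest stored magnitude word being nonzero, which is what stripping trailing zero words guarantees.

```swift
/// Returns word `index` of the infinitely sign-extended two's complement
/// representation. Assumes `magnitude` is little-endian and, for nonzero
/// values, that `magnitude[0] != 0` (trailing zero words are in `exponent`).
func twosComplementWord(
  at index: Int, isNegative: Bool, exponent: Int, magnitude: [UInt]
) -> UInt {
  let k = index - exponent
  // Words below the exponent are zero for either sign.
  if k < 0 { return 0 }
  // Words past the magnitude are pure sign extension.
  if k >= magnitude.count { return isNegative ? UInt.max : 0 }
  if !isNegative { return magnitude[k] }
  // Negation is ~m + 1. The +1 carries through the zero words and stops
  // at the first nonzero word, so only that word is negated; every
  // higher word is simply inverted.
  return k == 0 ? 0 &- magnitude[0] : ~magnitude[k]
}
```

No loop over lower words is needed to find where the carry stops, because the exponent already records it.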

Only a few new APIs that aren't already present on standard library integer types are introduced:

The initializer init(words:) to create a value from a collection of UInt-typed words.
The BinaryInteger static methods gcd(_:_:) (Stein's algorithm), lcm(_:_:), pow(_:_:) (iterative squaring), sqrt(_:).
The BinaryInteger instance method inverse(modulo:) (extended Euclidean algorithm) and static method pow(_:_:modulo:) (right-to-left binary method).
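For readers unfamiliar with the algorithms named above, here is a sketch of Stein's algorithm and the right-to-left binary method on single machine words. The function names `binaryGCD` and `modularPower` are hypothetical, the signatures differ from the PR's generic `BinaryInteger` APIs, and the use of `UInt` with full-width multiplication is an assumption made only to keep the example self-contained.

```swift
/// Stein's (binary) GCD: uses only shifts, comparisons, and subtraction.
func binaryGCD(_ a: UInt, _ b: UInt) -> UInt {
  if a == 0 { return b }
  if b == 0 { return a }
  // Factors of two common to both operands.
  let shift = (a | b).trailingZeroBitCount
  var x = a >> a.trailingZeroBitCount   // x stays odd from here on
  var y = b
  while y != 0 {
    y >>= y.trailingZeroBitCount        // make y odd
    if x > y { swap(&x, &y) }           // keep x <= y
    y -= x                              // odd - odd is even (or zero)
  }
  return x << shift
}

/// Right-to-left binary modular exponentiation.
func modularPower(_ base: UInt, _ exponent: UInt, modulo modulus: UInt) -> UInt {
  precondition(modulus > 0, "modulus must be positive")
  if modulus == 1 { return 0 }
  // (a * b) % modulus without overflow; requires a, b < modulus.
  func mulmod(_ a: UInt, _ b: UInt) -> UInt {
    modulus.dividingFullWidth(a.multipliedFullWidth(by: b)).remainder
  }
  var result: UInt = 1
  var square = base % modulus
  var remaining = exponent
  while remaining > 0 {
    // Consume exponent bits from least to most significant.
    if remaining & 1 == 1 { result = mulmod(result, square) }
    square = mulmod(square, square)
    remaining >>= 1
  }
  return result
}
```

Both loops run in a number of iterations proportional to the bit width of the operands, which is why they suit arbitrary-precision values.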

To-dos:

Implement division (Knuth Algorithm D; Burnikel-Ziegler can be done in a follow-up PR).
Update README.
Audit @inlinable attribute usage.
Implement the most salient BigInt-specific operations (can be done in a follow-up PR).

Testing

I have verified that operations currently implemented produce the expected results. Not all edge cases are covered yet, however.

So that I can test calculations using randomly generated operands, I use attaswift/BigInt as a reference implementation and check that the current implementation produces the same result as does attaswift/BigInt (with one exception, where on manual checking it appears that the latter implementation has a bug).

To-dos:

Test random number generation APIs, gcd, lcm, pow, sqrt.
Compare performance.
Consider stripping attaswift/BigInt test dependency.

xwu · 2020-06-09T12:56:50Z

@benrimmington Hmm, your review comments just vanished. I've addressed the copyright header situation.

stephentyrone · 2020-06-09T19:51:38Z

Awesome, @xwu. I'll give this a thorough review sometime this week.

xwu · 2020-06-10T16:48:41Z

It seems that use of a custom BigInt._Significand and Slice<BigInt._Significand> (during Karatsuba multiplication) are major bottlenecks. Once both were disabled, and with compiler optimization and all methods inlinable, we get something like this:

pidigits(10000)

BigInt without custom _Significand (without Karatsuba multiplication):
average: 5.226, relative standard deviation: 0.322%

This seems to be competitive with most implementations of BigInt not based on GMP! And we have room to optimize (for instance, by enabling faster multiplication and division).

attaswift/BigInt:
average: 10.745, relative standard deviation: 0.223%

Earlier observations:
Karatsuba multiplication is in need of a great deal more optimization; benchmarking pidigits with optimizations on and all methods inlinable, we get something like this on my machine:

pidigits(500)

BigInt with custom _Significand (with Karatsuba multiplication):
average: 2.142, relative standard deviation: 1.434%

BigInt with custom _Significand (without Karatsuba multiplication):
average: 0.230, relative standard deviation: 4.056%

attaswift/BigInt:
average: 0.030, relative standard deviation: 21.966%

It seems that use of the Slice type (which I rely on heavily for Karatsuba multiplication) is quite suboptimal.

There's also another order of magnitude of difference in performance that we need to look into. For a fairer comparison, I deleted the division step and disabled Karatsuba multiplication for both implementations. Then I tried swapping out the custom _Significand for just [UInt]:

pidigits_without_division(5000)

BigInt with custom _Significand (without Karatsuba multiplication):
average: 1.749, relative standard deviation: 0.589%

BigInt without custom _Significand (without Karatsuba multiplication):
average: 0.134, relative standard deviation: 7.622%

attaswift/BigInt (without Karatsuba multiplication):
average: 0.239, relative standard deviation: 2.283%

So it seems the idea of keeping the low word separate from the rest of the array is inhibiting performance. The signum-exponent-significand representation, though, seems to be just fine from a performance perspective.

Sajjon · 2020-06-23T10:11:00Z

@xwu Excellent performance! Great job! Any plan to include BigUInt as well? :)

stephentyrone · 2020-06-23T14:25:09Z

@Sajjon Since a BigInt, by definition, can never overflow, there's really no such thing as a BigUInt (the distinction between signed and unsigned only makes sense for fixed-width types).

Sajjon · 2020-06-23T14:31:28Z

@stephentyrone Hmm, sorry, I might be misunderstanding you, but as I understand it, @xwu's excellent work so far declares only BigInt, which is signed and thus allows negative values. In exactly the same use cases where, using Swift Foundation, I might model my data with UInt instead of Int to mark that a value cannot and should never be negative, I would like to do the same with big integers and disallow negative values at compile time.

attaswift/BigInt contains BigUInt which I use today :)

stephentyrone · 2020-06-23T14:35:14Z

@Sajjon Sure, you can define such a type, but Swift really discourages using unsigned types just for "a value that can never be negative" (see: sizes being Int instead of UInt). For fixed-width types, there are other differences that justify having the unsigned types (the arithmetic operations and shifts have different semantics, the set of representable values is different, even if you restrict to all-positive values), but none of those differences hold for non-fixed-width types.

Sajjon · 2020-06-23T14:59:36Z

@stephentyrone Yes I'm quite surprised that Int is the default type of Swift Array's size and also Indices... What is the index -1? It makes sense as an ephemeral value, like, subtract 1 from an index, but only during that subtraction, which more elegantly is expressed as a throwing subtraction function between UInt. So I'm super curious, why did Int become the default integer type, instead of UInt, at least in the context of Array/Collections? And is that (the context of Arrays/Collections) the only reason why "Swift really discourages using unsigned types just for...." ?

Instead of using UInt for disallowing negative values one can of course wrap it in a custom struct, but since Swift lacks Haskell's newtype functionality and since we do not have easy way to delegate conformance of protocols, such as FixedWidthInteger (and all other relevant numerical protocols) it is really cumbersome to wrap values if one still wants to use integer functionality on this new type.

stephentyrone · 2020-06-23T15:39:39Z

> So I'm super curious, why did Int become the default integer type, instead of UInt, at least in the context of Array/Collections? And is that (the context of Arrays/Collections) the only reason why "Swift really discourages using unsigned types just for...." ?

Once you decide to eliminate implicit conversions, this decision follows naturally, because otherwise any form of indexing arithmetic becomes super-cumbersome:

Suppose I don't know the relative order of i and j, but want to know how far apart they are. If everything is signed, even though they're "always positive", I do abs(i - j). To work with unsigned indices, I would have to use i < j ? j - i : i - j or UInt(abs(Int(i) - Int(j))).
More generally, being able to unconditionally do arithmetic that forms a possibly-out-of-range index, and then check if it's in-range before use, is often a much simpler idiom than needing to detect conditions that could lead to forming an out-of-range index before I do it. Consider a really simple convolution with zero extension:

for i in a.indices {
  a[i] = (0 ... 5).reduce(into: 0) {
    let j = i - 2 + $1
    if a.indices.contains(j) { $0 += weights[$1]*a[j] }
  }
}

If we had unsigned indices, the closure would either look like this:

    let j = i + $1
    if j >= 2 && j < a.count + 2 { // can't subtract 2 from `j` until we know it's >= 2.
      $0 += weights[$1]*a[j - 2]
    }

or maybe this if you're clever:

    let j = i + $1 &- 2 // if this wraps, we'll fail the check below
    if j < a.count {
      $0 += weights[$1]*a[j]
    }

These aren't a lot more code, but how they work is significantly less obvious to a new reader.
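The following self-contained sketch puts the signed variant and the wrapping unsigned variant side by side so they can be compared directly. The function names and the choice of five weights are assumptions for illustration, and unlike the snippets above it writes to a new array rather than updating `a` in place.

```swift
/// Signed indices: form a possibly out-of-range index, then check it.
func convolveSigned(_ a: [Double], weights: [Double]) -> [Double] {
  a.indices.map { i in
    weights.indices.reduce(into: 0.0) { sum, k in
      let j = i - 2 + k                  // may be negative or past the end
      if a.indices.contains(j) { sum += weights[k] * a[j] }
    }
  }
}

/// Unsigned indices: rely on wrapping subtraction to fail the range check.
func convolveUnsigned(_ a: [Double], weights: [Double]) -> [Double] {
  let count = UInt(a.count)
  return (0 ..< count).map { i in
    (0 ..< UInt(weights.count)).reduce(into: 0.0) { sum, k in
      let j = i + k &- 2                 // wraps to a huge value when i + k < 2
      if j < count { sum += weights[Int(k)] * a[Int(j)] }
    }
  }
}

let sampleWeights: [Double] = [1, 2, 3, 2, 1]
let sampleInput: [Double] = [1, 0, 0, 0, 2]
let signedResult = convolveSigned(sampleInput, weights: sampleWeights)
let unsignedResult = convolveUnsigned(sampleInput, weights: sampleWeights)
```

Both functions produce the same values; the difference is only in how much the reader has to know about wrapping arithmetic to see why.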

Sajjon · 2020-06-25T18:02:38Z

@stephentyrone Excellent answer, description and motivation, thank you!

@xwu Would be great if the API of BigInt could somewhat reflect the one of attaswift/BigInt - it would allow for easy transition - I can help out if you want to? :) Just let me know when the current implementation is somewhat stable (or maybe it is?).

xwu · 2020-06-25T18:28:05Z

> @xwu Would be great if the API of BigInt could somewhat reflect the one of attaswift/BigInt - it would allow for easy transition - I can help out if you want to? :) Just let me know when the current implementation is somewhat stable (or maybe it is?).

All implementations are subject to change, I think.

There are only seven APIs here that are introduced beyond what's required by existing protocols, and any differences in their naming as compared to attaswift/BigInt are deliberate, often reflecting what's been formalized in the Swift Evolution process since the time that attaswift/BigInt was created as well as the design of other modules in Swift Numerics :) They are subject to further discussion, of course, but I think at this stage what's much more important is nailing down the core implementation. In fact, I hesitated to add any APIs beyond the integer protocols at all for this very reason, because I'd rather not have that discussion quite yet. If the naming of any of the seven APIs I've added is controversial I'd rather strip it from this PR.

stephentyrone · 2020-06-25T18:36:28Z

> There are only seven APIs here that are introduced beyond what's required by existing protocols

Yeah, the space of integer API is extremely constrained by what's in the stdlib protocols, so there's very little room for variation.

Sajjon · 2020-06-25T18:50:17Z

Ok got it! I saw you've implemented pow(_ base: Self, _ exponent: Self, modulo modulus: Self) -> Self? which is great 👍, I've used that a lot in attaswift/BigInt when implementing ECC.

Can't wait to replace (awesome) attaswift/BigInt with just Swift Numerics package everywhere.

Sajjon · 2020-06-25T18:52:09Z

Also, should maybe the BigInt prototype in Swift repo under Prototypes be deleted after this PR is merged? Seems like no need to keep it around? Thoughts on that? :)

stephentyrone · 2020-06-25T19:54:45Z

It's used for testing a bunch of Stdlib functionality, so it should stay put for now.

dwaite · 2020-07-27T05:48:42Z

> Suppose I don't know the relative order of i and j, but want to know how far apart they are. If everything is signed, even though they're "always positive", I do abs(i - j). To work with unsigned indices, I would have to use i < j ? j - i : i - j or UInt(abs(Int(i) - Int(j))).

That and the edge cases - for instance, if the value of the UInts i or j is greater than Int.max, the conversion to Int will overflow. Since you may already be using the 64-bit unsigned integer type, dealing with that edge case requires more work. Work that you don't typically deal with in C (not that the compiler there will help you by doing the "right" thing).
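A small sketch of that edge case, using a hypothetical helper named `absoluteDifference`: the checked conversion shows why going through `Int` is unsafe for large values, while branching on the order stays entirely within `UInt`.

```swift
/// Distance between two unsigned values without any signed conversion.
func absoluteDifference(_ i: UInt, _ j: UInt) -> UInt {
  // The larger operand is always on the left, so this cannot underflow.
  i > j ? i - j : j - i
}

// Values above Int.max have no Int representation; the checked
// initializer returns nil where Int(_:) would trap at run time.
let unrepresentable = Int(exactly: UInt.max)

let farApart = absoluteDifference(.max, 1)
```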

swanysteve · 2020-10-01T13:55:15Z

I've been working with the code in this PR. I've found a problem in the random number generation.

In my case,
BigInt.random(in: BigInt(0)...BigInt("12345678901234567890")! )
works but
BigInt.random(in: BigInt(0)...BigInt("98765432109876543210")! )
hangs.

It turns out the first number occupies 64 bits but the second needs 67 bits.

After debugging, in the implementation of
extension RandomNumberGenerator { @usableFromInline internal mutating func _next(upperBound: BigInt) -> BigInt {
in RandomNumberGenerator.swift,
the two cases where results are OR-ed with mask should instead be AND-ed with mask. I have seen the repeat loop run multiple times, but I no longer see a hang.
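To show why the choice of operator matters, here is a single-word analogue of masked rejection sampling. It is only a sketch: `randomBelow` and `StepGenerator` are hypothetical names, and the PR's `_next(upperBound:)` works on multi-word `BigInt` values rather than `UInt64`.

```swift
/// Uniform value in 0 ..< upperBound by masking and rejecting.
func randomBelow<G: RandomNumberGenerator>(
  _ upperBound: UInt64, using generator: inout G
) -> UInt64 {
  precondition(upperBound > 0, "upperBound must be positive")
  // Ones in every bit position that (upperBound - 1) can occupy.
  let mask = UInt64.max >> (upperBound - 1).leadingZeroBitCount
  var candidate: UInt64
  repeat {
    // AND discards the high bits, so candidates fall in 0 ... mask.
    // OR would instead force every masked bit to 1; the candidate would
    // never be below mask, and the loop could spin forever.
    candidate = generator.next() & mask
  } while candidate >= upperBound
  return candidate
}

/// Deterministic generator for repeatable checks (not for real use).
struct StepGenerator: RandomNumberGenerator {
  var state: UInt64 = 0
  mutating func next() -> UInt64 {
    state &+= 0x9E37_79B9_7F4A_7C15
    return state
  }
}

var stepGenerator = StepGenerator()
let sample = randomBelow(10, using: &stepGenerator)
```

With AND, each iteration accepts with probability above one half, so the repeat loop running a few times, as observed above, is expected behavior.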

I'm very new to Swift and don't yet know how to run the unit tests, but will work on that next.

swanysteve · 2020-10-08T16:50:49Z

While working on a unit test for LCM, I've found a problem with division.

The following test snippet fails

    let s = "3182990411991163758463334586237477812181413821081264903177942613807541528844305823312360947978873366530408681494780"
    let t = "16012987498029214696648267754993196043335730812460137832692217701508007910953126091749743662257786118530085930814191"
    let m = (BigInt(s)!, AttaswiftBigInt(s)!)
    let n = (BigInt(t)!, AttaswiftBigInt(t)!)
    XCTAssertEqual((m.0*n.0) / m.0, BigInt((m.1*n.1) / m.1))

xwu · 2020-10-08T23:31:57Z

@swanysteve Yikes, that was a think-o in the implementation, which I've now corrected.

swanysteve · 2020-10-09T11:43:02Z

Thanks @xwu . My LCM and Pow tests now pass.

I'm still getting a hang in my random test.

  func testRandom() {
    let b1 = BigInt(0)
    let t1 = BigInt("12345678901234567890")! //64 bits
    let t2 = BigInt("98765432109876543210")! //67 bits

    func testClosedRanges() {
      let n1 = BigInt.random(in: b1...t1)
      XCTAssert( n1 >= b1 && n1 <= t1);
      let n2 = BigInt.random(in: b1...t2)
      XCTAssert( n2 >= b1 && n2 <= t2);
    }

    func testOpenRanges() {
      let n1 = BigInt.random(in: b1..<t1)
      XCTAssert( n1 >= b1 && n1 < t1);
      let n2 = BigInt.random(in: b1..<t2)
      XCTAssert( n2 >= b1 && n2 < t2);
    }

    testClosedRanges()
    testOpenRanges()
  }

xwu · 2020-10-09T11:48:25Z

~~As you've noted, that's a standard library bug :)~~

swanysteve · 2020-10-09T12:07:36Z

Sorry, I don't understand. The bug is in BigIntModule/RandomNumberGenerator.swift which is part of this PR, AFAICT.

xwu · 2020-10-09T12:38:18Z

As you can tell, I haven't been focusing on this much. You're absolutely right; will fix that think-o tonight. Thanks for exercising this code, and keep these coming!

1oo7 · 2021-02-06T06:42:08Z

What's the hold-up on this PR?

xwu · 2021-02-06T16:34:47Z

@1oo7 Glad you're enthusiastic about this PR. As you can see, there is some additional work I'd like to do before this gets merged. Karatsuba multiplication is excruciatingly slow, for instance, due to some design decisions that need to be revised. What's more, the PR is waiting on @stephentyrone's comments.

Do feel free to contribute any notes about the implementation, particularly bugs you find. However, in general, it's helpful not to bump PRs, and please avoid using the approval features on GitHub as an upvote.

1oo7 · 2021-02-09T20:53:59Z

> @1oo7 Glad you're enthusiastic about this PR. As you can see, there is some additional work I'd like to do before this gets merged. Karatsuba multiplication is excruciatingly slow, for instance, due to some design decisions that need to be revised. What's more, the PR is waiting on @stephentyrone's comments.
>
> Do feel free to contribute any notes about the implementation, particularly bugs you find. However, in general, it's helpful not to bump PRs, and please avoid using the approval features on GitHub as an upvote.

Thanks Xiaodi. I recognize your name from the Swift.org forums, where we have had some conversations. Hope to see this progress and also eventually get BigDecimal as well. Thanks for all your hard work.

Sajjon · 2022-09-16T10:25:59Z

Will StaticBigInt change the status of this PR, as in help it progress? https://github.com/apple/swift-evolution/blob/main/proposals/0368-staticbigint.md

xwu · 2022-09-16T12:50:53Z

No doubt!

Add sketch for BigIntModule.

d91a39c

xwu mentioned this pull request Jun 8, 2020

[Sketch]Arbitrary-precision BigInt #84

Merged

xwu added 2 commits June 8, 2020 20:14

[BigIntModule] Add copyright headers and update an outdated comment.

1b6cca7

[BigInt] Implement division (Knuth).

87b62a9

xwu force-pushed the BigInt branch from 15cd858 to 87b62a9 Compare June 9, 2020 12:33

xwu changed the title ~~[Sketch] BigInt~~ BigInt Jun 9, 2020

[NFC][BigInt] Add some comments and make two internal methods @inlinable.

182a140

xwu force-pushed the BigInt branch from 8f819ff to 182a140 Compare June 10, 2020 03:35

[BigIntModule] Update README.

4c0d382

[BigInt] Disable custom _Significand and Karatsuba multiplication, add "pidigits" performance test.

5976abc

[BigInt] Disable custom _Significand and Karatsuba multiplication. [BigIntModule] Fix up tests and add "pidigits" performance test.

xwu force-pushed the BigInt branch from f9ec64e to 5976abc Compare June 10, 2020 17:55

xwu added 2 commits June 11, 2020 11:10

[BigIntModule] Add BigIntModule dependency to Numerics target.

c8b2e1a

[BigIntModule] Implement integer algorithms.

7bc7910

xwu added 2 commits October 8, 2020 20:29

[BigInt] Fix a bug in the implementation of Knuth's Algorithm D.

86e72f9

Merge branch 'master' into BigInt

0f57bb1

xwu changed the base branch from master to main October 9, 2020 00:51

xwu added 6 commits October 9, 2020 21:02

[BigInt] Fix a bug in the implementation of 'RandomNumberGenerator._next'.

7ed3ddf

[BigInt] Simplify implementation by always normalizing newly created values and adding '_isZero'.

8364081

[BigInt] Excise obsolete attempt at inline storage of lowest word.

f0e2571

[BigInt] Remove custom 'BigInt._Significand(_:_:)' initializer.

713734f

[BigInt] Fix example in documentation.

4102a08

[BigInt] Change normalized representation of zero, simplify division, and restore a few stylistic choices.

775b898

1oo7 approved these changes Feb 6, 2021

LiarPrincess mentioned this pull request Feb 22, 2023

[BigInt tests][No merge] 🐰 How NOT to write performance tests #256

Open
