Understanding usage of System.Numerics APIs #254
Comments
Thanks!
Sure! How would you want to do that?
Shoot, I get distracted by AI stuff for a few months and you're already this far into 8's cycle :P Sounds like I've got some catching up to do. I suspect there are quite a few legacy-informed decisions I'll need to revisit (in a good delete-heavy way).
@tannergooding! I saw your work but forgot to ask you or check lately. Glad you got pointed here, because this did come up and I was going to ask for it: Vector2/3/4, Matrix3x2, and similar physics- and graphics-related real-world types having SIMD intrinsics. I believe Julia and Swift might have this, but I'm absolutely hooked on .NET 7 and the JIT. Unity has some more homogeneous vector and matrix types; some have special extensions for Vector2i and Vector2long, but ideally it would be T with generalized SIMD, as has gone nicely with Vector128, which is amazing progress and lets pages of old code just get erased :) So even if not in 3D, maybe in 2D, because that is what maps to pixels. It should, IMO, be generalized in the core math. It's practical but admittedly still niche, though trending up in use cases, and maybe out of the scope of bepu near term. Also, as Ross said, it's late in the cycle for both I guess, but if .NET 9... or to keep in mind, or if it has already been requested enough or is implementable without huge impact... I'm also sidetracked,
but here are the additional cases I see: quantum gravity theorists use the types. Cryptography. Floating-point errors have crashed rockets already. While space games with large coordinate systems often do local frames via floats, either for local physics or as a workaround, the universe state is best stored as ulong or even bigint. But there are good reasons to use Int32 / fixed / posit / unum homogeneous math for integration steps. In large worlds ulongs are often used; the floats are taken from the data for convenience, but the data is kept as countables even for sparse fields. So in summary, the other game frameworks have it, as does CAD, and there's a huge cry for 3D Studio Max to store doubles because models drift after multiple edits. In 1981 AutoCAD visionary founder John Walker decided on doubles at the database level (floats at the display list level), or civil 1. There's also determinism and reversibility, for multiplayer, for energy conservation, for Eulerian fluid math, and for distributed computing. DSP and fixed-point math are standard for this stuff. The issues with IEEE floats are just unsolvable, and doubles are generally too big and the issues remain.
Sorry about the typos and edits. If homogeneous coordinate systems are accommodated, I'm sure I
Not sure if BepuPhysics would benefit from it, but I do miss a Matrix4x3, because it's exactly what's needed to represent a model in 3D space. In a 4x4 matrix the 4th column is usually left as 0,0,0,1, so a 4x3 would mean less bandwidth, and I guess it could be further optimized.

@tannergooding does numeric.vectors have a specific place for discussion? I also use vectors extensively but don't want to steal the discussion.
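To make the Matrix4x3 point concrete, here is a minimal Python sketch (not C#, and not an existing System.Numerics or BepuPhysics API). It assumes the row-vector convention, where a point is multiplied as [x, y, z, 1] * M and the translation sits in the fourth row. All function names are hypothetical and only for illustration.

```python
# Hypothetical helpers for illustration; not part of System.Numerics or BepuPhysics.

def transform_point_4x4(p, m):
    """Row-vector convention: [x, y, z, 1] * M, with M stored as 4 rows x 4 columns."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(v[r] * m[r][c] for r in range(4)) for c in range(3))


def to_4x3(m):
    """Drop the fourth column, which is (0, 0, 0, 1) for an affine transform."""
    assert [row[3] for row in m] == [0.0, 0.0, 0.0, 1.0]
    return [row[:3] for row in m]


def transform_point_4x3(p, m43):
    """Same transform using only the 12 stored values; the implicit w is 1."""
    x, y, z = p
    return tuple(
        x * m43[0][c] + y * m43[1][c] + z * m43[2][c] + m43[3][c]
        for c in range(3)
    )


# 90 degree rotation about Z followed by a translation of (10, 20, 30).
model = [
    [0.0, 1.0, 0.0, 0.0],
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [10.0, 20.0, 30.0, 1.0],
]
compact = to_4x3(model)

point = (1.0, 2.0, 3.0)
full_result = transform_point_4x4(point, model)
compact_result = transform_point_4x3(point, compact)

floats_full = sum(len(row) for row in model)       # 16 stored values
floats_compact = sum(len(row) for row in compact)  # 12 stored values
```

Both paths give the same point, and the compact form stores 12 values instead of 16, which is where the bandwidth saving would come from.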
@RossNordby. That's really up to you. We could set up something informal over Teams or Discord, we could just have an async discussion on GitHub here, or something else. In general whatever is easiest on you all.
I've looked at providing a For supporting floating-point There is also the consideration that accelerating
IEEE 754 floating-point as spec'd is deterministic, if nothing else. The main issue is that many scenarios, particularly games, compile with features like "fast-math" which tells the compiler "I don't care about deterministic behavior, give me speed instead". The other of which is that many core math functions (such as The other issue is that floating-point is an approximation in general, so even if deterministic, there is natural error that is introduced and which must be handled. You therefore cannot think of things in terms of "regular math", but instead must modify the domain to explicitly account for this error. There are many tricks and ways this can be done, including without losing significant performance, but it is in general something that must be accounted for.
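A small Python sketch of the two points above (Python floats are IEEE 754 binary64). It assumes that reassociating additions is the kind of rewrite a fast-math style mode may perform; the variable names are illustrative only.

```python
import math

a, b, c = 0.1, 0.2, 0.3

# Same operations in the same order: the result is reproducible.
left_to_right = (a + b) + c
left_to_right_again = (a + b) + c

# Mathematically equal, but a different evaluation order. A compiler that is
# allowed to reassociate may pick this form instead.
reassociated = a + (b + c)

# Exact equality is the wrong question once rounding error is involved;
# compare against a tolerance chosen for the problem domain instead.
close_enough = math.isclose(left_to_right, reassociated, rel_tol=1e-9)
```

Here `left_to_right` is `0.6000000000000001` while `reassociated` is `0.6`. Each order is deterministic on its own; the difference only appears when the order changes.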
They certainly aren't unsolvable. Just requires a little bit of math to understand the limits and fit something into the general model that works. Using something with more precision, like The best approaches tend to be a little bit of each, so you have the right balance between speed and accuracy. -- Keep in mind that in practice, no one actually works in infinite precision. NASA uses 16-digits for PI and you need something The main consideration is that of the 4 billion representable 32-bit floating-point values, ~50% of them exist in the domain of This leads to several commonly used solutions for handling things both efficiently and correctly to account for the precision/domain limitations without losing perf or overall accuracy.
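A rough Python sketch of why the uneven distribution of 32-bit floats matters: the gap between neighbouring representable values grows with magnitude. It also shows one approach mentioned earlier in the thread, keeping positions relative to a local origin. The helper names are hypothetical, and `float32_spacing` assumes a positive finite input.

```python
import struct


def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_spacing(x):
    """Gap between x (rounded to float32) and the next larger float32.

    Assumes x is positive and finite.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    above = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return above - here


spacing_near_one = float32_spacing(1.0)           # 2**-23
spacing_near_million = float32_spacing(1.0e6)     # 0.0625
spacing_far_out = float32_spacing(16777216.0)     # 2.0

# A half-unit offset is lost when stored as an absolute position far from zero.
absolute = to_float32(16777216.0 + 0.5)

# Storing the same offset relative to a local origin keeps it exactly.
local_origin = 16777216.0
local_offset = to_float32(0.5)
```

Far from the origin, anything finer than two units is gone in the absolute form, while the local offset survives unchanged.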
Depends on what the discussion is about. In general we allow discussion threads to be opened on https://github.com/dotnet/runtime/discussions You can also open up API proposals there as well: https://github.com/dotnet/runtime/issues/new?assignees=&labels=api-suggestion&template=02_api_proposal.yml&title=%5BAPI+Proposal%5D%3A+ - Just keep in mind we have a general process and following the template helps ensure all the information we need for API review exists. If the person opening the proposal doesn't fill it out, then the .NET team has to instead, and that generally means it doesn't get reviewed or considered as quickly. I am personally also on Discord for both the C# Community and the .NET Evolution servers. The former is for general discussion and the latter is more for working on
Alright, suppose I'll start here since async's easy to schedule, and we can hop elsewhere if useful.

What's being used, where, and how it's worked

The biggest use of the numerics APIs by execution time tends to be in constraint solving and narrow phase execution, which look like this:

All of these implementations are fed with bundles of packed constraints or collision pairs, and every lane of execution is fully independent. They're heavily reliant on AoSoA-packed struct types containing A lot of these implementations are a bit old and were built with much older JIT versions in mind, hence the quantity of

Historically, between the struct representation and refness causing aliasing difficulties, codegen was often extremely stack shuffley. It's improved hugely (as of my last close look in the 7 previews, probably moreso now), so newer parts of the codebase tend to use far less ref and more operators. During my last relevant testing in the 6-7 timeframe, I did notice operators and other functions that got inlined still sometimes produced worse instructions than manual inlining, for example: (I have not yet revisited these for the latest previews; no idea if this is still relevant.)

Intrinsics have started sneaking their way into parts of the codebase as I revisit things. The gather/scatter used by constraints is a notable case: This was built before the cross-platform helpers were added; I did take a quick swing at porting as much over as I could for the sake of accelerating ARM (rather than using the current rather bad fallback) but ran into some difficulties with getting efficient permutes. Unfortunately, this was several months ago, so I've mostly forgotten the details. There are a number of other places in the codebase that would benefit from similar fast gather/scatter/transpositions, so that's probably going to see more work.

I've adopted the cross-platform helpers quite thoroughly... somewhere... but I can't find any big examples in bepuphysics2, so apparently that was in another project. They're good and I like them! I find most new intrinsics-y codechunks I write now are cross platform first with sprinkled platform specific fine-tuning as needed.

There are quite a few uses of the more traditional types like Some of them are mostly copied from the ancient bepuphysics1 codebase. I've also accumulated some custom types like

I also anticipate using the improved non-AoSoA types more, and in more performance sensitive areas, soonishly. There are places where the wide AoSoA representation is not ideal due to extremely high divergence (e.g. hull-hull collisions); providing a fast narrow path will likely be a significant win there. I'd also like to provide certain types of queries (like those mentioned in #150) without batching (because holy moly the As a side note, the interop between

I guess in summary, I use/will use most stuff you've stuck in there, and the improvements have been really nice to have!

Additional things that could be useful

Given that so much of the library is based on executing machine-width bundles, AVX512 is a promising feature for the future (assuming Intel gets consistent support). I've seen some of the work towards this so... thumbs up, I suppose!

Otherwise, I find it a bit difficult to point out specific APIs that I want. The discrete features I've usually wanted most are those that unlock new capabilities that weren't feasible before, or which make things systematically easier, as opposed to, say, another helper method. (I'm not opposed to helper methods, of course, but I don't mind implementing them myself :P)
I'm not immediately sure what the next most significant unlock would be (apart from the obvious ones like "supporting more instruction sets"). Maybe indirect stuff, like codegen quality: the ability to realistically use operators on largeish custom types was extremely valuable for sanity. (And, again outside of numerics, I admit I've sometimes wanted memory aliasing hints. I'm not convinced adding such a thing would actually be a good idea, because oof, but I've thought about it.)

That's about it off the top of my head. Might remember some other things later. Let me know if you've got any questions!
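For readers unfamiliar with the AoSoA bundles described in this comment, here is a rough Python sketch of the layout idea only. It is not the library's actual code. `Vec3Bundle`, `pack`, `dot_wide`, and the width of 4 are assumptions for illustration; a real implementation would tie the width to the hardware vector size and execute all lanes with SIMD instructions.

```python
LANES = 4  # assumed bundle width for this sketch


class Vec3Bundle:
    """One AoSoA bundle: each component holds LANES values, one per lane."""

    def __init__(self, xs, ys, zs):
        assert len(xs) == len(ys) == len(zs) == LANES
        self.x, self.y, self.z = list(xs), list(ys), list(zs)


def pack(vectors):
    """Pack (x, y, z) tuples into bundles, padding the last bundle with zeros."""
    bundles = []
    for start in range(0, len(vectors), LANES):
        chunk = list(vectors[start:start + LANES])
        chunk += [(0.0, 0.0, 0.0)] * (LANES - len(chunk))
        xs, ys, zs = zip(*chunk)
        bundles.append(Vec3Bundle(xs, ys, zs))
    return bundles


def dot_wide(a, b):
    """Lane-wise dot product; every lane is independent of the others."""
    return [
        a.x[i] * b.x[i] + a.y[i] * b.y[i] + a.z[i] * b.z[i]
        for i in range(LANES)
    ]


points = [
    (1.0, 0.0, 0.0),
    (0.0, 2.0, 0.0),
    (0.0, 0.0, 3.0),
    (1.0, 1.0, 1.0),
    (2.0, 2.0, 2.0),
]
bundles = pack(points)
squared_lengths = [dot_wide(b, b) for b in bundles]
```

Five inputs become two bundles, with the second one mostly padding. That padding cost is one reason a wide layout can be a poor fit for highly divergent work.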
Hey, was pointed at this repo by some other people and just wanted to say looks awesome!
As the owner of the System.Numerics and System.Runtime.Intrinsics APIs on the .NET libraries side of things, it would be great if we could have a sync so I could better understand how you're using things here and any changes or improvements that are needed in the space.
For .NET 8, I added/improved acceleration for Vector2/3/4, added SIMD acceleration for Quaternion/Plane, and rewrote Matrix3x2 and Matrix4x4. This resulted in 8-48x perf improvements in many core scenarios.
We're also looking to add some more APIs to these types to help cover some "missing" functionality, but getting some additional input from real world use cases will help justify the work and ensure it's being prioritized correctly (as well as ensuring any cases we haven't thought of yet are tracked).
Look forward to hearing from you!