Potential Performance Issue Tracking #177

Open
bryanedds opened this issue Feb 16, 2017 · 8 comments
Labels
discussion performance issues or potential issues relating to performance

Comments

@bryanedds
Owner

bryanedds commented Feb 16, 2017

Current potential performance issues in Nu, in no particular order -

Potential Issue - Event handlers stored in a dictionary are slower than handlers stored on the subscribed object a la C#, since every publish requires a look-up. However, this is an artifact of a publisher-neutral event system rather than anything related to FP.

Possible Solution - A lot of optimization is already done to avoid publish calls that won't have a useful effect. Beyond these, I have yet to think of further solutions.
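
To make the cost concrete, here is a minimal sketch of a publisher-neutral event table with an early-out guard. `EventTable`, `publishChange`, and the address strings are hypothetical names for illustration and are not Nu's actual API; the sketch only assumes that handlers are keyed by an event address and that publishing can be skipped via a flag.

```fsharp
open System.Collections.Generic

// Hypothetical, simplified stand-in for a publisher-neutral event table:
// handlers are keyed by event address rather than stored on the publisher.
type EventTable () =
    let subscriptions = Dictionary<string, ResizeArray<obj -> unit>> ()

    member this.Subscribe (address : string) (handler : obj -> unit) =
        match subscriptions.TryGetValue address with
        | (true, handlers) -> handlers.Add handler
        | (false, _) ->
            let handlers = ResizeArray<obj -> unit> ()
            handlers.Add handler
            subscriptions.[address] <- handlers

    // Every publish pays one dictionary look-up, even when nobody is listening.
    member this.Publish (address : string) (data : obj) =
        match subscriptions.TryGetValue address with
        | (true, handlers) ->
            for handler in handlers do
                handler data
        | (false, _) -> ()

// Early-out guard: skip the look-up entirely when publishing is switched off.
let publishChange (publishChangeEvents : bool) (table : EventTable) (address : string) (data : obj) =
    if publishChangeEvents then table.Publish address data
```

The guard is cheap compared to hashing the address string, which is why avoiding the call altogether pays off when nothing is subscribed.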

Potential Issue - Farseer Physics Engine doesn't scale to 1000s of interacting bodies - Genbox/VelcroPhysics#29

Possible Solution - Presumably Farseer could be replaced with a much faster 2D physics lib, perhaps one written in C or C++. Of course, the question then becomes about the overhead of the required marshalling.

Potential Issue - The string hashing required for each Xtension property look-up is suboptimal.

Possible Solution - Not many practical ones. This issue wouldn't exist if .NET lazy-cached hashes in strings, but there's no reason to believe it ever will. At one point I used an alternative type to string called 'Lun' (later called 'Name') which contained a string and its lazily-computed hash, but it wasn't very friendly to use. I decided to get rid of it in favor of .NET strings to simplify Nu's API. I'm pretty sure this was the right decision, but I can't prove it one way or another without making large speculative changes to the engine.

Update - Now that F# finally has implicit ctors, reintroducing the Name type shouldn't cause as many changes as it previously would have. This might now be a practical experiment to run.
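
For reference, a rough sketch of what such a type could look like. This is not the original 'Lun' / 'Name' implementation; the member names and the `op_Implicit` conversion are assumptions made for illustration.

```fsharp
open System
open System.Collections.Generic

// Hypothetical sketch of a string wrapper that computes its hash only once.
type Name (nameStr : string) =
    let mutable hashComputed = false
    let mutable hashCode = 0

    member this.NameStr = nameStr

    // The string is hashed on first use; later calls return the cached value.
    override this.GetHashCode () =
        if not hashComputed then
            hashCode <- nameStr.GetHashCode ()
            hashComputed <- true
        hashCode

    override this.Equals (that : obj) =
        match that with
        | :? Name as other -> String.Equals (nameStr, other.NameStr, StringComparison.Ordinal)
        | _ -> false

    override this.ToString () = nameStr

    // Assumed conversion so that a string can be passed where a Name is expected.
    static member op_Implicit (nameStr : string) : Name = Name nameStr

// Usage: a long-lived key is hashed once no matter how many look-ups are made with it.
let properties = Dictionary<Name, obj> ()
let positionName = Name "Position"
properties.[positionName] <- box 0
```

The saving only materializes when `Name` instances are reused across look-ups; constructing a fresh `Name` per look-up would hash the string every time anyway.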

Potential Issue - LOH threshold is perhaps too small.

Possible Solution - Upgrading to >= .NET 4.8 will allow us to configure it via GCLOHThreshold - https://docs.microsoft.com/en-us/dotnet/framework/configure-apps/file-schema/runtime/gclohthreshold-element

Update - I've tried increasing the LOH threshold, but I cannot observe it having any effect. It's as if my setting is being ignored by the runtime.

Potential Issue - Potentially a lot of events when a subscribed entity transforms -

let world = World.publishEntityChange (nameof Transform) () () publishChangeEvents entity world // OPTIMIZATION: eliding data for computed change events for speed.
let world =
    if is2d then
        let perimeterCenteredChanged = transformNew.PerimeterCentered <> transformOld.PerimeterCentered
        let perimeterChanged = positionChanged || scaleChanged || offsetChanged || sizeChanged || perimeterCenteredChanged
        let boundsChanged = perimeterChanged || rotationChanged
        if boundsChanged then
            let world = World.publishEntityChange Constants.Engine.BoundsPropertyName () () publishChangeEvents entity world
            let world =
                if perimeterChanged then
                    let world = World.publishEntityChange (nameof transformNew.Perimeter) () () publishChangeEvents entity world
                    let world = World.publishEntityChange (nameof transformNew.PerimeterUnscaled) () () publishChangeEvents entity world
                    let world = World.publishEntityChange (nameof transformNew.PerimeterCenter) () () publishChangeEvents entity world
                    let world = World.publishEntityChange (nameof transformNew.PerimeterBottom) () () publishChangeEvents entity world
                    let world = World.publishEntityChange (nameof transformNew.PerimeterBottomLeft) () () publishChangeEvents entity world
                    let world = World.publishEntityChange (nameof transformNew.PerimeterMin) () () publishChangeEvents entity world
                    let world = World.publishEntityChange (nameof transformNew.PerimeterMax) () () publishChangeEvents entity world
                    world
                else world
            let world = if positionChanged then World.publishEntityChange (nameof transformNew.Position) transformOld.Position transformNew.Position publishChangeEvents entity world else world
            let world = if scaleChanged then World.publishEntityChange (nameof transformNew.Scale) transformOld.Scale transformNew.Scale publishChangeEvents entity world else world
            let world = if offsetChanged then World.publishEntityChange (nameof transformNew.Offset) transformOld.Offset transformNew.Offset publishChangeEvents entity world else world
            let world = if sizeChanged then World.publishEntityChange (nameof transformNew.Size) transformOld.Size transformNew.Size publishChangeEvents entity world else world
            let world = if perimeterCenteredChanged then World.publishEntityChange (nameof transformNew.PerimeterCentered) transformOld.PerimeterCentered transformNew.PerimeterCentered publishChangeEvents entity world else world
            world
        else world
    else
        let boundsChanged = positionChanged || rotationChanged || scaleChanged || offsetChanged || sizeChanged
        if boundsChanged then
            let world = World.publishEntityChange Constants.Engine.BoundsPropertyName () () publishChangeEvents entity world
            let world = if positionChanged then World.publishEntityChange (nameof transformNew.Position) transformOld.Position transformNew.Position publishChangeEvents entity world else world
            let world = if scaleChanged then World.publishEntityChange (nameof transformNew.Scale) transformOld.Scale transformNew.Scale publishChangeEvents entity world else world
            let world = if offsetChanged then World.publishEntityChange (nameof transformNew.Offset) transformOld.Offset transformNew.Offset publishChangeEvents entity world else world
            let world = if sizeChanged then World.publishEntityChange (nameof transformNew.Size) transformOld.Size transformNew.Size publishChangeEvents entity world else world
            world
        else world
let world =
    if rotationChanged then
        let world = World.publishEntityChange (nameof transformNew.Rotation) transformOld.Rotation transformNew.Rotation publishChangeEvents entity world
        let world = World.publishEntityChange (nameof transformNew.Angles) () () publishChangeEvents entity world
        let world = World.publishEntityChange (nameof transformNew.Degrees) () () publishChangeEvents entity world
        world
    else world
let world =
    if elevationChanged
    then World.publishEntityChange (nameof transformNew.Elevation) transformOld.Elevation transformNew.Elevation publishChangeEvents entity world
    else world
let world =
    if overflowChanged
    then World.publishEntityChange (nameof transformNew.Overflow) transformOld.Overflow transformNew.Overflow publishChangeEvents entity world
    else world

Possible Solution - Probably nothing great. We could selectively disable a chunk of transform events depending on the application. I'm not really sure what to do here other than accept that this is part of the cost of doing business declaratively.

Potential Issue - Synchronizing entity properties via World.setEntityPropertyFast requires a small and likely cache-local dictionary look-up via WorldModuleEntity.EntitySetters, which is surprisingly fast.

Possible Solution - A faster alternative might be hard-coding a duplicate of the EntitySetters table in a match expression or using a loftier technique such as code generation in the MVU implementation.
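
A small sketch of the two dispatch styles being compared. `EntityState`, `setViaTable`, and `setViaMatch` are hypothetical and much simpler than Nu's actual EntitySetters machinery; whether the match version is really faster would have to be measured.

```fsharp
open System.Collections.Generic

// Hypothetical, heavily simplified entity state; not Nu's actual types.
type EntityState =
    { mutable Elevation : single
      mutable Visible : bool }

// Table-driven dispatch: one dictionary look-up (and string hash) per property set.
let setters = Dictionary<string, obj -> EntityState -> unit> ()
setters.["Elevation"] <- fun value state -> state.Elevation <- unbox<single> value
setters.["Visible"] <- fun value state -> state.Visible <- unbox<bool> value

let setViaTable (name : string) (value : obj) (state : EntityState) =
    match setters.TryGetValue name with
    | (true, setter) ->
        setter value state
        true
    | (false, _) -> false

// Hard-coded dispatch: a duplicate of the table written as a match expression,
// which trades the hash look-up for string comparisons.
let setViaMatch (name : string) (value : obj) (state : EntityState) =
    match name with
    | "Elevation" ->
        state.Elevation <- unbox<single> value
        true
    | "Visible" ->
        state.Visible <- unbox<bool> value
        true
    | _ -> false
```

The obvious downside of the match version is that it has to be kept in sync with the table by hand, which is where code generation would come in.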

Potential Issue - Nu Text rendering might be quite inefficient due to not caching target render buffers. IIRC, render buffers used for text are allocated and deallocated on a one-off basis. I do not see how that could possibly scale well.

Possible Solution - Code it properly. :)

Potential Issue - .NET GC compaction is not yet parallelized and can therefore cause stalls while it does its thing. This only seems to cause a couple of small hiccups at the start of programs, and doesn't seem to happen once Nu programs hit their steady state after a couple of seconds. Fortunately, according to the .NET team, it appears that parallel compacting is being implemented.

Possible Solution - Wait for parallel compacting GC to ship (.NET 9?). Otherwise, issue GC.Collect between scenes if needed.

Update - On .NET 9 now and it seems like it has helped with the issue. However, we need to do concrete measurements to make sure.
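
If the stalls do need to be pushed out of gameplay, the GC.Collect fallback could look roughly like this. `collectBetweenScenes` is a hypothetical hook, assumed to be called at a point where a pause is acceptable, such as behind a loading screen.

```fsharp
open System

// Hypothetical hook; assumed to be called during a scene transition.
let collectBetweenScenes () =
    // Blocking, compacting collection of all generations.
    GC.Collect (GC.MaxGeneration, GCCollectionMode.Forced, true, true)
    // Let finalizers run so their objects can be reclaimed by a later collection.
    GC.WaitForPendingFinalizers ()
```

This only moves the pause to a moment where it is less noticeable; it does not make the compaction itself any cheaper.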

Potential Issue - Setting physics properties after creating an entity, such as is done by the MMCC initializers, can cause a lot of body recreation inside the physics engines due to the way that RigidBodyFacet's property change handlers work.

Possible Solution - Instead of recreating the physics bodies, create additional body property synchronization messages so that body recreation is necessary less often.

@bryanedds
Owner Author

bryanedds commented Jul 4, 2017

On writing a compiler for the AMSL -

This is a large task, even for partial compilation.

I estimate 400 hours worth of work if I were to do it myself.

It would probably take a fair bit longer for someone who doesn't have as much knowledge about the interpreter's implementation.

@bryanedds
Owner Author

bryanedds commented Jun 2, 2019

I read somewhere that, due to security checks, it's significantly slower to get / set a property with reflection than to get / set its backing field. So if we can get / set the backing fields directly, that could speed up serialization.
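
A sketch of what going to the backing field directly might look like. `Sample` and `tryFindBackingField` are hypothetical, and the field-naming conventions used here (`Name@` for F# record fields, `<Name>k__BackingField` for C# auto-properties) are compiler implementation details rather than a documented contract, so this should be treated as an experiment rather than a safe default.

```fsharp
open System
open System.Reflection

// Hypothetical record standing in for a serialized type.
type Sample = { Health : int }

let fieldFlags = BindingFlags.Instance ||| BindingFlags.Public ||| BindingFlags.NonPublic

// Try the F# record convention first, then the C# auto-property convention.
let tryFindBackingField (ty : Type) (propertyName : string) : FieldInfo option =
    [propertyName + "@"; "<" + propertyName + ">k__BackingField"]
    |> List.tryPick (fun fieldName -> Option.ofObj (ty.GetField (fieldName, fieldFlags)))

// Read a value through the backing field when one is found, otherwise through the property.
let getValueFast (propertyName : string) (target : obj) : obj =
    let ty = target.GetType ()
    match tryFindBackingField ty propertyName with
    | Some field -> field.GetValue target
    | None -> ty.GetProperty(propertyName).GetValue target
```

In a real serializer the `FieldInfo` would be looked up once per type and cached, since the look-up itself is likely to cost more than the read.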

@bryanedds
Owner Author

I'm currently working on putting the main subsystem processing on separate threads. If this works well, it should at least double performance.

@bryanedds
Owner Author

I've managed to get rendering and audio onto separate threads, but not physics. Putting physics on a separate thread may play hell with certain semantic guarantees that are highly desirable. Additionally, I'm not sure if Farseer was even built for this, since I don't know if we can do raycasts and such while it is integrating.

Maybe better is a physics engine that internally threads itself across cores. Unfortunately, I can't find a .NET wrapper for Box2D, which I think would do this.

@bryanedds
Owner Author

bryanedds commented Jul 7, 2019

I just found out that threading does not work with the out-of-box SDL renderer, so I have to temporarily put the rendering and audio code back on the main thread. The only way to get rendering on another thread is to write an OpenGL renderer from scratch, which I don't immediately have time for.

@bryanedds
Owner Author

Today I attempted to utilize WeakReference in ComponentRef to lighten the load on the GC's scan process. This was not a good idea since it crushed performance. Apparently there is more than enough compute in WeakReference.TryGetTarget to obliterate any potential gains from the hypothetical reduction in GC scan work. Bummer.

@bryanedds
Owner Author

Another solution to the rendering performance problem is the use of SDL_gpu to do rendering. I will be playing with this possibility over the next week.

@bryanedds
Owner Author

I was unable to utilize SDL_gpu due to this issue - grimfang4/sdl-gpu#15 (comment)

I don't know if the maintainer, @grimfang4, is aware of the issue tho?
