
Substantial Python loop overhead for VBBL #104

Open · kmzzhang opened this issue Oct 2, 2023 · 13 comments

kmzzhang commented Oct 2, 2023

Currently, the binary lens light curve is calculated in a python loop where each point is evaluated separately.

https://github.com/rpoleski/MulensModel/blob/master/source/MulensModel/magnificationcurve.py#L421

When using VBBL as the finite source method, the python loop overhead can slow things down by up to ~7 times compared to VBBL's own python wrapper, where the loop occurs in C++. This is most apparent when VBBL is used for the full light curve (VBBL itself decides automatically whether to trigger the full FS calculation). Perhaps one could aggregate the points that use VBBL and move the loop into C++? I considered making a pull request, but I realize this may involve refactoring larger parts of the code.
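To illustrate the idea (a schematic sketch only, not MulensModel's actual code; the batched function in the second half is hypothetical):

```python
# Current pattern: one Python -> C++ call per epoch.
# BinaryMag2() is VBBL's single-point routine.
magnifications = [
    vbbl.BinaryMag2(s, q, y1, y2, rho)
    for y1, y2 in zip(trajectory_x, trajectory_y)
]

# Proposed pattern: one call for the whole trajectory, so the loop
# runs in C++ (BinaryMag2_array is a hypothetical batched function).
magnifications = vbbl.BinaryMag2_array(s, q, trajectory_x, trajectory_y, rho)
```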

jenniferyee (Collaborator) commented

@kmzzhang We are about to do a major refactor for Version 3, so we can add this to the list. In particular, we are talking about having subclasses for models, which could include creating MagnificationCurve subclasses such as MagnificationCurveVBBL().

rpoleski (Owner) commented Oct 3, 2023

We don't have to wait until v3; this seems like a relatively easy fix. I'm guessing we did it on an epoch-by-epoch basis because it's easier to pass a specific number of floats between Python and C++ than arrays of unknown sizes.
Thanks @kmzzhang for bringing that up!

kmzzhang (Author) commented Oct 3, 2023

You're welcome. I believe this also applies to the 2L1S point-source case when the SG12 solver in VBBL is used.

Since the finite source method is specified by time intervals in MulensModel, perhaps one way to refactor the code is to aggregate epochs by those intervals and calculate the magnifications per group, as sketched below?
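A rough sketch of that grouping, with a hypothetical helper name (`methods` is the alternating boundary/method-name list that MulensModel already accepts):

```python
import numpy as np

def group_epochs_by_method(times, methods):
    """Yield (method_name, epochs) groups; hypothetical helper.

    `methods` alternates interval boundaries and method names, e.g.
    [t_start, 'point_source', t_1, 'VBBL', t_2, 'point_source', t_end].
    """
    times = np.asarray(times, dtype=float)
    boundaries = np.asarray(methods[0::2], dtype=float)
    names = methods[1::2]
    # Index of the interval each epoch falls into:
    indices = np.searchsorted(boundaries, times, side='right') - 1
    for i, name in enumerate(names):
        mask = indices == i
        if mask.any():
            yield name, times[mask]

# Each ('VBBL', epochs) group could then go into a single batched C++ call.
```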

rpoleski (Owner) commented Oct 4, 2023

Yes, that's what I plan to do.

rpoleski (Owner) commented

@kmzzhang Can you please share how you pass pointers/vectors of floats between Python and C++? I see that there are different approaches to the PyArg_Parse*() functions and would prefer not to re-invent the wheel if you have already done so.

kmzzhang (Author) commented

@rpoleski Apologies for the late reply. I don't have a particular way of doing this, but perhaps you could use Valerio's way of making python wrappers: https://github.com/valboz/VBBinaryLensing/tree/master/VBBinaryLensing/lib. He has a python wrapper for the BinaryLightCurve function that takes an array of times (ts). One could easily modify this function (and its wrapper) to take arrays of source locations y1s and y2s instead of times; everything else would stay the same.
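Alternatively, if hand-written PyArg_Parse*() code is a pain, ctypes plus numpy.ctypeslib can pass double arrays directly. A sketch of the Python side, assuming a hypothetical C-linkage batched function compiled into the library (the function and library names are made up):

```python
import ctypes

import numpy as np
from numpy.ctypeslib import ndpointer

# Hypothetical C function in the compiled library:
#   void binary_mag_array(double s, double q, const double* y1,
#                         const double* y2, double rho, int n, double* out);
lib = ctypes.CDLL("./libvbbl_wrapper.so")  # hypothetical library name
double_array = ndpointer(dtype=np.float64, flags="C_CONTIGUOUS")
lib.binary_mag_array.restype = None
lib.binary_mag_array.argtypes = [
    ctypes.c_double, ctypes.c_double, double_array, double_array,
    ctypes.c_double, ctypes.c_int, double_array,
]

def binary_mag_array(s, q, y1, y2, rho):
    # Ensure contiguous float64 buffers before handing pointers to C++.
    y1 = np.ascontiguousarray(y1, dtype=np.float64)
    y2 = np.ascontiguousarray(y2, dtype=np.float64)
    out = np.empty_like(y1)
    lib.binary_mag_array(s, q, y1, y2, rho, y1.size, out)
    return out
```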

rpoleski removed the version3 label on Dec 30, 2023
rpoleski (Owner) commented

@kmzzhang Can you provide example code that shows significant differences in execution time?

kmzzhang (Author) commented Nov 4, 2024

@rpoleski I did some quick tests:
https://gist.github.com/kmzzhang/c174cb3586ebab69dde5388efd2868c4

If MM only uses VBBL for the points that need full finite source, it is around 0%–40% slower.
If MM uses VBBL for the full light curve, it is up to an order of magnitude slower, depending on how many points need the full FS calculation.

Note that in the "Point source" example, VBBL's native python wrapper is faster than MM's point-source method, even when VBBL automatically decides whether to do FS for each point (I kept a non-zero rho, but none of the points actually needed FS).

So once the loop is moved into C++, one doesn't need to specify the time interval that needs full FS, and it's still a lot faster.
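The timing is done roughly along these lines (a schematic sketch with placeholder parameter values, not the exact benchmark; see the gist for details):

```python
import time

import numpy as np
import MulensModel as mm

# Placeholder 2L1S parameters:
model = mm.Model({'t_0': 0., 'u_0': 0.1, 't_E': 30., 'rho': 1e-3,
                  's': 1.1, 'q': 0.2, 'alpha': 270.})
times = np.linspace(-50., 50., 10000)

# Case 1: VBBL only inside a window around the anomaly.
model.set_magnification_methods([-5., 'VBBL', 5.])
# Case 2: VBBL for the full light curve (far more Python-loop overhead):
# model.set_magnification_methods([times[0], 'VBBL', times[-1]])

start = time.perf_counter()
model.get_magnification(times)
print(f"{time.perf_counter() - start:.3f} s for {times.size} epochs")
```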

kmzzhang (Author) commented Nov 5, 2024

Also, if I recall correctly, VBBL initializes the lens-equation root finding for each epoch from the roots found at the previous epoch. But since MulensModel calls the VBBL python wrapper separately for each timestamp, VBBL has to initialize the roots from scratch every single time, making it much slower. So the cost is most likely more than just loop overhead.

rpoleski (Owner) commented Nov 8, 2024

There is one more aspect: MM still uses an older version of VBBL, not VBM. I don't know how much they differ in speed.

CoastEgo commented Dec 6, 2024

I guess one reason is that MulensModel doesn't have a keyword for relative tolerance control (right?), which significantly affects the speed.

rpoleski (Owner) commented Dec 9, 2024

I've started a new branch for that: vbbl_reltol

rpoleski (Owner) commented

Setting the relative tolerance is ready to be merged. The only remaining question is what default values we should set for the absolute and relative tolerance. I'd like the code to work well for Roman data, so I propose RelTol = 0.001 and Tol = 0 (i.e., the absolute tolerance is effectively ignored).
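For reference, in VBBL's own python wrapper these are plain attributes; as I understand it, the calculation stops refining once the estimated error drops below max(Tol, RelTol * magnification), so Tol = 0 makes the control purely relative:

```python
import VBBinaryLensing

VBBL = VBBinaryLensing.VBBinaryLensing()
VBBL.RelTol = 1e-3  # proposed default: 0.1% relative precision
VBBL.Tol = 0.       # absolute tolerance effectively disabled
# Placeholder (s, q, y1, y2, rho) values:
mag = VBBL.BinaryMag2(1.1, 0.2, 0.05, 0.02, 1e-3)
```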
@jenniferyee what do you think?
