Replies: 11 comments
-
Thank you for your issue. Note that TorchJD is a framework for doing JD (not necessarily a framework for doing UPGrad). We could in principle implement ConFIG in TorchJD as an aggregator; is that what you are asking? If ... This aggregator feels like a mix of IMTL-G and PCGrad, neither of which typically performs well, so I doubt that it would lead to better performance than UPGrad, but we can try if you think it is worth a shot! When trying to make the sum non-conflicting, there are essentially two very natural ways of projecting onto the non-conflicting cone (the dual cone of the rows of the Jacobian). One consists of projecting the sum of the gradients onto the non-conflicting cone; this is ...
-
Then it would make sense to ping the original authors, as they can respond better.
-
Hi, thanks for the interesting comment! I am not sure whether I fully understand the notation there, but I think there are some interesting points I would like to discuss.
Once this equation is solved, x should have positive dot products with any ... I have just checked the TorchJD package, and I think it is great work! I really like your ideas and will take a deeper look at your work to see whether we can make some comparisons! Cheers,
-
@qiauil in that case, your formulation is equivalent to the one with the Moore-Penrose pseudoinverse, and in this case I agree that it is non-conflicting. However, consider a point on the Pareto front: it is Pareto stationary, and therefore there is ... That being said, I am curious about the performance of your aggregator, and I think this could be implemented in TorchJD. The advantage is that you would be able to parallelize the computation of the gradients (using vmap from torch), so it should be faster! I'm guessing that the function ... EDIT: I think that you could avoid the problem of dividing by zero by solving equivalently ...
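
The Pareto-stationarity obstruction mentioned here can be spelled out in a few lines. This is a sketch under assumed notation that does not appear in the thread: $J \in \mathbb{R}^{m \times n}$ is the Jacobian whose rows $g_1, \dots, g_m$ are the gradients, and Pareto stationarity is taken in its usual sense, namely that some convex combination of the gradients vanishes.

```latex
% Assumed notation (not from the thread): J has rows g_1, ..., g_m.
\begin{aligned}
&\text{Pareto stationarity: } \exists\, w \in \mathbb{R}^m,\ w \ge 0,\ \textstyle\sum_i w_i = 1,\ J^\top w = 0. \\
&\text{Then for every } x:\quad \sum_{i=1}^m w_i \,\langle g_i, x\rangle = \langle J^\top w, x\rangle = 0. \\
&\text{If } \langle g_i, x \rangle > 0 \text{ for all } i, \text{ the left-hand side would be } > 0, \text{ a contradiction.}
\end{aligned}
```

So at such a point no direction can have strictly positive dot products with all gradients; the most one can ask for is non-negative ones.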
-
Thanks for your comments! Yes, I agree that near the Pareto front it is hard to get a non-conflicting direction. We also considered this point during the implementation. However, we found that it actually works well near the Pareto front thanks to the pseudo-inverse/least-squares implementation, with which we try to get an approximate solution of the above equations. Near the Pareto front, we get a zero gradient. We can show a simple example here:

    import torch

    b = torch.ones(2)  # target: a dot product of 1 with each row of A
    xs = []
    for i in range(1000):
        A = torch.rand(1, 1000)        # one random gradient
        A = torch.cat((A, -A), dim=0)  # second gradient exactly opposes the first
        xs.append(torch.linalg.lstsq(A, b).solution)  # least-squares direction
    xs = torch.stack(xs)
    print(torch.mean(xs))
    print(torch.max(xs))
    print(torch.min(xs))

which gives us 4.3675e-11, 3.0981e-10, and -2.2936e-10. Yes, the ...
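
The near-zero values are what a minimum-norm least-squares solver should return for this system. The short derivation below assumes that `torch.linalg.lstsq` returns the minimum-norm solution for this rank-deficient input (the printed values are consistent with that) and writes $a$ for the random row, a symbol introduced here.

```latex
% a denotes the random row (notation introduced for this sketch).
\begin{aligned}
A &= \begin{pmatrix} a^\top \\ -a^\top \end{pmatrix},\quad
b = \begin{pmatrix} 1 \\ 1 \end{pmatrix},\quad t := a^\top x, \\
\|Ax - b\|^2 &= (t - 1)^2 + (-t - 1)^2 = 2t^2 + 2 .
\end{aligned}
```

The residual is minimized by any $x$ with $a^\top x = 0$, and the minimum-norm such $x$ is $x = 0$, up to floating-point error.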
-
@qiauil this corresponds to the example I gave with ...

    import torch

    b = torch.ones(2)  # target: a dot product of 1 with each row of A
    xs = []
    for i in range(1000):
        A = torch.rand(1, 1000)              # one random gradient
        A = torch.cat((A, -0.1 * A), dim=0)  # opposing gradient, 10x smaller
        xs.append(torch.linalg.lstsq(A, b).solution)
    xs = torch.stack(xs)
    print(torch.mean(xs))
    print(torch.max(xs))
    print(torch.min(xs))

Here I get ...
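
For this unnormalized case the least-squares solution can be worked out by hand, which shows why the result is no longer near zero. This is a sketch, assuming the solver returns the minimum-norm least-squares solution, and writing $a$ for the random row and $c = 0.1$ for the scaling (both symbols introduced here).

```latex
% a is the random row, c the scaling of the opposing row (assumed notation).
\begin{aligned}
A &= \begin{pmatrix} a^\top \\ -c\,a^\top \end{pmatrix},\quad
b = \begin{pmatrix} 1 \\ 1 \end{pmatrix},\quad t := a^\top x, \\
\|Ax - b\|^2 &= (t - 1)^2 + (c\,t + 1)^2
\;\Rightarrow\; t^\star = \frac{1 - c}{1 + c^2}, \qquad
x^\star = t^\star \frac{a}{\|a\|^2}, \\
c = 0.1:&\quad t^\star = \frac{0.9}{1.01} \approx 0.891, \qquad
(-c\,a)^\top x^\star = -c\,t^\star \approx -0.089 < 0 .
\end{aligned}
```

Under these assumptions the resulting direction has a negative dot product with the second row, i.e. it conflicts with it. Setting $c = 1$ gives $t^\star = 0$, the exactly opposed case.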
-
Hi @PierreQuinton, sorry, I forgot the unit-vector operation; the following example will work:

    import torch

    b = torch.ones(2)  # target: a dot product of 1 with each row of A
    xs = []
    for i in range(1000):
        A = torch.rand(1, 1000)
        A = torch.cat((A, -0.1 * A), dim=0)
        # Normalize each row to unit length; NaNs from zero-norm rows become 0.
        A = torch.nan_to_num(A / A.norm(dim=1).unsqueeze(1), nan=0.0)
        xs.append(torch.linalg.lstsq(A, b).solution)
    xs = torch.stack(xs)
    print(torch.mean(xs))
    print(torch.max(xs))
    print(torch.min(xs))

It gives 3.4520e-10, 9.2795e-09, and -1.0622e-08.
-
@qiauil I think that you are right: in 2 dimensions this will be non-conflicting! I still have doubts about higher dimensions. Also, I would rather see a proof than an example; I will try to prove or disprove it later.
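
For two gradients, the non-conflict claim can be checked in closed form. The sketch below assumes the formulation used in the snippets above (rows normalized to unit length, a right-hand side of ones, minimum-norm solution) and reads "2 dimensions" as "two gradients", which may not be what was meant. With unit rows $u_1, u_2$ and $c = u_1 \cdot u_2$, the minimum-norm solution of $Ux = \mathbf{1}$ is $x = (u_1 + u_2)/(1 + c)$, so both dot products equal 1 whenever the gradients are not exactly opposed. The names `two_gradient_direction` and `dot` are hypothetical helpers, not part of TorchJD or ConFIG, and only the standard library is used.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def two_gradient_direction(g1, g2, eps=1e-12):
    """Minimum-norm x with u1.x = u2.x = 1, where u_i = g_i / |g_i|.

    Hypothetical helper for illustration only. Assumes both gradients
    are nonzero. Returns the zero vector when the gradients are
    (numerically) exactly opposed, where no such x exists.
    """
    n1 = math.sqrt(dot(g1, g1))
    n2 = math.sqrt(dot(g2, g2))
    u1 = [v / n1 for v in g1]
    u2 = [v / n2 for v in g2]
    c = dot(u1, u2)  # cosine of the angle between the gradients
    if 1.0 + c < eps:  # opposed gradients: least squares gives x = 0
        return [0.0] * len(g1)
    # Closed form of U^T (U U^T)^{-1} 1 for two unit rows.
    return [(p + q) / (1.0 + c) for p, q in zip(u1, u2)]


g1 = [1.0, 0.0, 0.0]
g2 = [-1.0, 1.0, 0.0]  # conflicts with g1: dot(g1, g2) < 0
x = two_gradient_direction(g1, g2)
print(dot(g1, x), dot(g2, x))  # both dot products are positive
```

With `g2 = -0.1 * g1` the helper returns the zero vector, in line with the near-zero output of the normalized snippet earlier in the thread. This says nothing about three or more gradients, which is the case still in doubt here.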
-
I need more popcorn. Be right back...
-
FYI I just enabled discussions in the repo, and transferred this from issue to discussion.
-
Note that a PR to integrate ConFIG to torchjd is now open here, with some additional discussion.
-
Not super familiar with the literature lately. Can you explain the differences between your approach and ConFIG?
https://github.com/tum-pbs/ConFIG