-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX ProxNewton solver with fixpoint strategy #259
FIX ProxNewton solver with fixpoint strategy #259
Conversation
Frankly compared to the cost of the rest of the computation I expect this
to be totally negligible
Il sab 1 giu 2024, 12:15 Badr MOUFAD ***@***.***> ha scritto:
… ***@***.**** commented on this pull request.
------------------------------
In skglm/solvers/anderson_cd.py
<#259 (comment)>
:
> @@ -184,7 +184,7 @@ def solve(self, X, y, datafit, penalty, w_init=None, Xw_init=None):
opt_ws = penalty.subdiff_distance(w[:n_features], grad_ws, ws)
elif self.ws_strategy == "fixpoint":
opt_ws = dist_fix_point_cd(
- w[:n_features], grad_ws, lipschitz, datafit, penalty, ws
+ w[:n_features], grad_ws, lipschitz[ws], datafit, penalty, ws
I feel that we should expect performance regression in AnderonCD as after
this we will be making copies
WDYT @mathurinm <https://github.com/mathurinm>?
—
Reply to this email directly, view it on GitHub
<#259 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACETTQUUZKB4LRBG6JXI54TZFGNLTAVCNFSM6AAAAABIT65EP6VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDAOJSGA3TEMRQHA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have also
skglm/skglm/solvers/multitask_bcd.py
Line 234 in 9682660
def dist_fix_point_bcd(W, grad_ws, lipschitz, datafit, penalty, ws): |
We better adopt the same convention here as well.
I'm +1 with tackling it in this PR as this it touches the fixpoint ws strategy.
@Badr-MOUFAD merge if happy |
@Badr-MOUFAD With import numpy as np
from skglm.utils.jit_compilation import compiled_clone
from skglm import datafits
from skglm import penalties
from skglm.solvers import ProxNewton
from skglm.utils.data import make_correlated_data
X, y, _ = make_correlated_data(50, 100, random_state=0)
y = np.abs(y) // 1
datafit = compiled_clone(datafits.Quadratic())
penalty = compiled_clone(penalties.L0_5(alpha=1))
# penalty = compiled_clone(penalties.L1(alpha=1))
alpha_max = penalties.L1(alpha=1).alpha_max(datafit.gradient(X, y, np.zeros(len(y))))
penalty.alpha = alpha_max / 10
solver = ProxNewton(verbose=3, max_iter=20, warm_start=True, fit_intercept=False, tol=1e-4, ws_strategy="fixpoint", max_pn_iter=20, p0=10)
solver.solve(X, y, datafit, penalty) |
It seems that the solver get trapped in a working set that doesn't happen to be the support of the solution I have added a print in Iter 0 : [ 0 18 1 13 12 25 26 7 6 14]
Iter 1 : [12 11 10 6 18 25 7 3 2 1]
Iter 2 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 3 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 4 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 5 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 6 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 7 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 8 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 9 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 10 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 11 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 12 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 13 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 14 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 15 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 16 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 17 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 18 : [ 6 13 8 9 10 25 18 3 11 29]
Iter 19 : [ 6 13 8 9 10 25 18 3 11 29] Perhaps it is something related to the non-convexity of the penalty 🤔 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that the issue related to the L_05
penalty is not closely related to this PR and suggest investigating it later in another PR.
Thanks for the fix @mathurinm 🚀
fixes #256
Hard to test properly, but the following now works fine: