Unexpected result from flaml.default.LGBMClassifier on iris #1247

Open
amueller opened this issue Oct 18, 2023 · 2 comments

@amueller

I'm trying to benchmark the zero-shot flaml.default.LGBMClassifier and have seen some unexpected results. I'm using FLAML 2.1.1.

 
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from flaml.default import LGBMClassifier

iris = load_iris()
# 50/50 split; no random_state is set, so the split differs between runs
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, train_size=0.5)
# Zero-shot defaults: no hyperparameters are passed explicitly
lgbm = LGBMClassifier().fit(X_train, y_train)
print(lgbm.score(X_test, y_test))

produces a test score of 0.3, which is roughly chance level for three balanced classes. Using the standard 75/25 split, I get an accuracy of 0.92, which is around the expected value. A random forest with scikit-learn defaults gets 0.92 for both the 50/50 split in the example and the 75/25 split.
I assume a parameter configuration is being chosen that doesn't allow growing a tree at all.

@levscaut levscaut self-assigned this Oct 19, 2023
@levscaut
Collaborator

From what I observed, this small dataset does not closely match any of the existing datapoints, so no suitable hyperparameter combination was applied to LGBM. I'll add this dataset as a datapoint in the LGBM default configs so that the KNN matching can select a good hyperparameter combination for it. I'll raise a PR soon.
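
The matching idea described above can be sketched roughly as follows. Everything here is hypothetical: the meta-features (rows, columns, classes), the log-scaled Euclidean distance, the portfolio entries, and the hyperparameter values are made up for illustration and are not FLAML's actual meta-features, distance, or default configs.

```python
import numpy as np

# Hypothetical portfolio: meta-features (n_rows, n_features, n_classes) of
# datasets tuned offline, each paired with made-up hyperparameters.
portfolio_meta = np.array(
    [[50_000, 20, 2], [5_000, 100, 10], [1_000, 10, 2]], dtype=float
)
portfolio_configs = [
    {"n_estimators": 500, "min_child_samples": 100},
    {"n_estimators": 300, "min_child_samples": 50},
    {"n_estimators": 100, "min_child_samples": 20},
]


def nearest_index(meta, query):
    """Return (index, distance) of the portfolio row closest to query.

    Distances are Euclidean in log space so that row counts do not
    dominate the comparison (an assumption of this sketch).
    """
    dists = np.linalg.norm(np.log(meta) - np.log(query), axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])


# Half of iris: 75 rows, 4 features, 3 classes.
query = np.array([75, 4, 3], dtype=float)

idx_before, dist_before = nearest_index(portfolio_meta, query)
print("before:", portfolio_configs[idx_before], "distance", round(dist_before, 2))

# Adding a small dataset as a new datapoint gives the query a close match.
meta_after = np.vstack([portfolio_meta, query])
configs_after = portfolio_configs + [{"n_estimators": 50, "min_child_samples": 5}]

idx_after, dist_after = nearest_index(meta_after, query)
print("after:", configs_after[idx_after], "distance", round(dist_after, 2))
```

In this toy setup the query first inherits the configuration of a much larger dataset, whose made-up min_child_samples of 20 would be restrictive on 75 rows. Once a comparable datapoint exists, the lookup returns a configuration tuned for small data instead.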

@amueller
Author

Thank you! I have a benchmark containing a lot of small datasets, and I'd love to run it again once you've added the point.
