-
Notifications
You must be signed in to change notification settings - Fork 30
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Create rule S6974: Subclasses of Scikit-Learn's "BaseEstimator" shoul…
…d not set attributes ending with "_" in the "__init__" method (#3891)
- Loading branch information
1 parent
86d6b7c
commit 2ad4854
Showing
3 changed files
with
116 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
{ | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
{ | ||
"title": "Subclasses of Scikit-Learn's \"BaseEstimator\" should not set attributes ending with \"_\" in the \"__init__\" method", | ||
"type": "CODE_SMELL", | ||
"status": "ready", | ||
"remediation": { | ||
"func": "Constant\/Issue", | ||
"constantCost": "5min" | ||
}, | ||
"tags": [ | ||
], | ||
"defaultSeverity": "Major", | ||
"ruleSpecification": "RSPEC-6974", | ||
"sqKey": "S6974", | ||
"scope": "All", | ||
"defaultQualityProfiles": ["Sonar way"], | ||
"quickfix": "partial", | ||
"code": { | ||
"impacts": { | ||
"MAINTAINABILITY": "MEDIUM", | ||
"RELIABILITY": "HIGH" | ||
}, | ||
"attribute": "CONVENTIONAL" | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
This rule raises an issue when an attributes ending with ``++_++`` is set in the ``++__init__++`` method of a class inheriting from | ||
Scikit-Learn `BaseEstimator` | ||
|
||
|
||
== Why is this an issue? | ||
|
||
On a Scikit-Learn estimator, attributes that have a trailing underscore represents attributes that are estimated. | ||
These attributes have to be set in the fit method. Their presence are used to verify if an estimator has been fitted. | ||
|
||
[source,python] | ||
---- | ||
from sklearn.neighbors import KNeighborsClassifier | ||
X = [[0], [1], [2], [3]] | ||
y = [0, 0, 1, 1] | ||
knn = KNeighborsClassifier(n_neighbors=1) | ||
knn.fit(X, y) | ||
knn.n_samples_fit_ | ||
---- | ||
|
||
In the example above the attributes of the `KNeighborsClassifier`, ``++n_samples_fit_++`` | ||
is set only after the estimator's `fit` method is called. Calling ``++n_samples_fit_++`` before the estimator is fitted | ||
would raise an `AttributeError` exception. | ||
|
||
When implementing a custom estimator by subclassing Scikit-Learn's `BaseEstimator`, | ||
it is important to follow the above convention and not set attributes with a trailing underscore inside the ``++__init__++`` method. | ||
|
||
== How to fix it | ||
|
||
To fix this issue, move the attributes with a trailing underscore from the ``++__init__++`` method to the `fit` method. | ||
|
||
=== Code examples | ||
|
||
==== Noncompliant code example | ||
|
||
[source,python,diff-id=1,diff-type=noncompliant] | ||
---- | ||
from sklearn.base import BaseEstimator | ||
class MyEstimator(BaseEstimator): | ||
def __init__(self): | ||
self.estimated_attribute_ = None # Noncompliant: an estimated attribute is set in the __init__ method. | ||
---- | ||
|
||
==== Compliant solution | ||
|
||
[source,python,diff-id=1,diff-type=compliant] | ||
---- | ||
from sklearn.base import BaseEstimator | ||
class MyEstimator(BaseEstimator): | ||
def fit(self, X, y): | ||
self.estimated_attribute_ = some_estimation(X) # Compliant | ||
---- | ||
|
||
== Resources | ||
=== Documentation | ||
|
||
* Scikit-Learn documentation - https://scikit-learn.org/stable/developers/develop.html#parameters-and-init[Parameters and init] | ||
* S6970 - The Scikit-learn `fit` method should be called before methods yielding results | ||
|
||
|
||
ifdef::env-github,rspecator-view[] | ||
|
||
(visible only on this page) | ||
|
||
== Implementation specification | ||
|
||
Verify that subclasses of BaseEstimator do not have attributes with a trailing underscore in the __init__ method. | ||
Verify that inherited alongside BaseEstimator do not have attributes with a trailing underscore in the __init__ method. | ||
|
||
=== Message | ||
|
||
Primary : Move this estimated attribute inside the `fit` method. | ||
|
||
Secondary: this attributes is used in this estimator (in the case of mixins) | ||
|
||
|
||
=== Issue location | ||
|
||
Primary : name of the attribute | ||
|
||
Secondary : the name of the estimator subclassing the mixin and the BaseEstimator | ||
|
||
=== Quickfix | ||
|
||
Possible quickfix if the `fit` method does not exist. (Add the fit method with the estimated attribute) and delete the estimated attributes | ||
from the __init__ method. This is only possible in an estimator directly subclassing BaseEstimator. | ||
|
||
endif::env-github,rspecator-view[] |