
Sourcery refactored main branch #1

Open · sourcery-ai[bot] wants to merge 1 commit into main from sourcery/main

Conversation

sourcery-ai[bot] commented on Apr 29, 2022

Branch main refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin:

Review changes via command line

To manually merge these changes, make sure you're on the main branch, then run:

git fetch origin sourcery/main    # fetch the refactored branch
git merge --ff-only FETCH_HEAD    # fast-forward your local main onto it
git reset HEAD^                   # step back one commit, keeping the changes unstaged in the working tree

Help us improve this pull request!

sourcery-ai[bot] requested a review from RemilYoucef on Apr 29, 2022 at 08:57
Comment on lines -8 to +45
-def plot_explanations(W, S_list, nb_subgroups, c, att_names, patt_descriptions) :
+def plot_explanations(W, S_list, nb_subgroups, c, att_names, patt_descriptions):

-    for j in range(0, nb_subgroups) :
+    for j in range(nb_subgroups):

         print(j,'------------------------------------------------')
         print(patt_descriptions[S_list[j]])
         coefficients = W[S_list[j]].coef_
         logic = coefficients > 0
         coefficients_abs = np.abs(coefficients)
         contributions = coefficients_abs / np.sum(coefficients_abs, axis = 1).reshape(-1,1)
         features_importance = contributions[c] * 100
         limit = 0.75

         f_importance = features_importance[features_importance > limit]
         f_importance = f_importance / np.sum(f_importance) * 100
         f_importance = f_importance.round(2)
         att_names_ = list(pd.Series(att_names[:362])[features_importance > limit])

         f_importance_1 = f_importance[logic[c][features_importance > limit]]
         att_names_1 = [x for i,x in enumerate (att_names_) if logic[c][features_importance > limit][i]]

         f_importance_2 = f_importance[~logic[c][features_importance > limit]]
         att_names_2 = [x for i,x in enumerate (att_names_) if not logic[c][features_importance > limit][i]]

         plt.style.use('fivethirtyeight')
         plt.figure(figsize =(3, 4))
         plt.barh(att_names_2, f_importance_2,color='#e74c3c',height=0.65)
         plt.barh(att_names_1, f_importance_1,color='#1abc9c',height=0.65)
         all_f_importance = np.concatenate((f_importance_2,f_importance_1))
-        for i, v in enumerate(all_f_importance) :
-            plt.text(v + 0.4, i, str(v)+'%', fontsize = 9)
+        for i, v in enumerate(all_f_importance):
+            plt.text(v + 0.4, i, f'{str(v)}%', fontsize = 9)

         plt.xlabel("Features Importance",fontsize=10)
         plt.xticks(fontsize=10)
         plt.yticks(fontsize=10)
         plt.grid(True, linestyle='--', which='major',color='grey', alpha=0.75)
-        plt.savefig('FIGURES/f_'+str(j))
+        plt.savefig(f'FIGURES/f_{str(j)}')

Function plot_explanations refactored with the following changes:

  • Replace range(0, x) with range(x)
  • Use f-strings instead of string concatenation
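
For context, the numeric core of this function (unchanged by the PR) turns each local model's coefficients into percentage contributions by row-normalizing their absolute values. A minimal runnable sketch, with hypothetical toy values standing in for W[S_list[j]].coef_:

import numpy as np

# Toy stand-in for W[S_list[j]].coef_: one row per class, one column per feature.
coefficients = np.array([[0.5, -1.5, 2.0],
                         [1.0,  0.0, -1.0]])
logic = coefficients > 0                    # True where a feature pushes towards the class
coefficients_abs = np.abs(coefficients)
# Row-normalize so each class's contributions sum to 1, then scale to percent.
contributions = coefficients_abs / np.sum(coefficients_abs, axis=1).reshape(-1, 1)
print(contributions[0] * 100)               # [12.5 37.5 50. ]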

Comment on lines 47 to 62
-def sort_subgroups_support(S,K) :
+def sort_subgroups_support(S,K):
     S_copy = S.copy()
     l_best_s = []
-    for i in range(0,K) :
+    for _ in range(K):
         inter = 0
         s_best = None

         for s in S_copy :
             if len(s) > inter :
                 inter = len(s)
                 s_best = s
         l_best_s.append(s_best)
         S_copy.remove(s_best)
-    return l_best_s
\ No newline at end of file
+
+    return l_best_s

Function sort_subgroups_support refactored with the following changes:

  • Replace unused loop variable i with _ and range(0, K) with range(K)
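
Beyond the PR's changes: the selection loop above is a hand-rolled top-K by subgroup support (set size). An equivalent sketch using Python's built-in sorted(), assuming S is a list of sets; sorted() is stable, so ties keep their original order, matching the first-wins behaviour of the loop:

def sort_subgroups_support_alt(S, K):
    # Largest subgroups first; keep the K biggest.
    return sorted(S, key=len, reverse=True)[:K]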

Comment on lines -18 to +37
-def generate_all_neighbors (data, data_compressed, n_neigh, numerical_cols, numerical_cols_compressed, categ_unique, categ_unique_compressed,n_var, model) :
+def generate_all_neighbors(data, data_compressed, n_neigh, numerical_cols, numerical_cols_compressed, categ_unique, categ_unique_compressed,n_var, model):

     list_neighs = []
     num_size = numerical_cols.size
     num_size_compressed = numerical_cols_compressed.size
     n = np.size(data, 0)

     covn = cal_covn(data, num_size, n_var)
     covn_compressed = cal_covn(data_compressed, num_size_compressed, n_var)

     base = np.zeros(data.shape[1])
     neighbors_base = np.random.multivariate_normal(base, covn, n_neigh)

     base_compressed = np.zeros(data_compressed.shape[1])
     neighbors_base_compressed = np.random.multivariate_normal(base_compressed, covn_compressed, n_neigh)
-    for i in range(0,n) :
+
+    for i in range(n):
         neighbors = neighbors_base + data[i]
         neighbors_compressed = neighbors_base_compressed + data_compressed[i]


Function generate_all_neighbors refactored with the following changes:

  • Replace range(0, n) with range(n)
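
The context lines show the function's main trick, which the refactoring leaves intact: sample one block of zero-mean Gaussian noise per covariance, then translate that same block onto every instance instead of calling multivariate_normal once per row. A minimal sketch, with a diagonal covariance standing in for cal_covn(...):

import numpy as np

data = np.array([[1.0, 2.0], [3.0, 4.0]])   # toy instances
n_neigh = 5
covn = 0.01 * np.eye(data.shape[1])         # hypothetical stand-in for cal_covn(...)

base = np.zeros(data.shape[1])
neighbors_base = np.random.multivariate_normal(base, covn, n_neigh)  # sampled once

for i in range(np.size(data, 0)):
    neighbors = neighbors_base + data[i]    # same noise block, centred on data[i]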

Comment on lines -7 to +14
-def patterns (P, split_point1, split_point2, data, att_names_) :
+def patterns(P, split_point1, split_point2, data, att_names_):

-    patt_dict = dict()
-    rank = 0
-    for s,p in P.items() :
+    patt_dict = {}
+    for rank, (s, p) in enumerate(P.items()):

         description = ''
-        it = 0
-        d = dict ()
-        while (it < len(p)) :
+        d = {}
+        for it in range(0, len(p), 3):

Function patterns refactored with the following changes:

  • Replace dict() with the {} literal
  • Replace the manual rank counter with enumerate()
  • Convert the while loop over it into for it in range(0, len(p), 3)
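
The while-to-for conversion relies on range()'s step argument: the rewritten loop visits p three items at a time, which fits a flat list of (attribute, operator, value) triples. A small sketch with hypothetical pattern data:

p = ['age', '<=', 30, 'income', '>', 50000]   # hypothetical flat encoding
for it in range(0, len(p), 3):                # it = 0, 3, ...
    att, op, val = p[it], p[it + 1], p[it + 2]
    print(f'{att} {op} {val}')
# age <= 30
# income > 50000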

Comment on lines -18 to +22
-def loss_global_wb (data_test,list_neigh,model, limit) :
+def loss_global_wb(data_test,list_neigh,model, limit):

     n = np.size(data_test,0)
     data_neigh_O, target_neigh_O_proba = sampling_sb(data_test,np.arange(n),list_neigh,model)
-    global_loss = calc_loss(data_neigh_O, target_neigh_O_proba, limit)
-    return global_loss
+    return calc_loss(data_neigh_O, target_neigh_O_proba, limit)

Function loss_global_wb refactored with the following changes:

  • Inline variable that is immediately returned

Comment on lines -26 to +31
-def loss_local_models (n,list_neigh,model, limit) :
+def loss_local_models(n,list_neigh,model, limit):

     loss = 0
-    for i in range(0,n) :
+    for i in range(n):
         data_neigh_i= list_neigh[i][0]
         target_neigh_i_proba = list_neigh[i][1]
         loss += calc_loss(data_neigh_i, target_neigh_i_proba, limit)

Function loss_local_models refactored with the following changes:

  • Replace range(0, n) with range(n)

Comment on lines -46 to +47
-def fscore_sd (S,data_test,list_neigh,model,nb_classes) :
+def fscore_sd(S,data_test,list_neigh,model,nb_classes):

-    iteration = 0
-    for s in S :
+    for iteration, s in enumerate(S):

Function fscore_sd refactored with the following changes:

  • Replace the manual iteration counter with enumerate()
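
enumerate() yields the index alongside each element, so the manual counter disappears. A minimal sketch with hypothetical subgroups:

S = [{0, 1}, {2}, {0, 2, 3}]              # hypothetical list of subgroups
for iteration, s in enumerate(S):         # iteration = 0, 1, 2, ...
    print(iteration, len(s))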

Comment on lines -66 to +65
-def fscore_local_models (data_test,n,list_neigh,model,nb_classes) :
+def fscore_local_models(data_test,n,list_neigh,model,nb_classes):

-    iteration = 0
-    for i in range(0,n) :
+    for iteration, i in enumerate(range(n)):

Function fscore_local_models refactored with the following changes:

  • Replace the manual iteration counter with enumerate()
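
One quirk of this particular rewrite: enumerate(range(n)) produces pairs whose two values are always equal, so iteration and i are interchangeable in the loop body:

n = 3
for iteration, i in enumerate(range(n)):
    assert iteration == i                 # pairs are (0, 0), (1, 1), (2, 2)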

Comment on lines -99 to +92
-def similarity (W,nb_classes) :
+def similarity(W,nb_classes):

Function similarity refactored with the following changes:

Comment on lines -124 to +121
-def avg_non_similar (dist,treshold) :
+def avg_non_similar(dist,treshold):

     nb_non_sim = 0
     nb_sbgrps = np.size(dist,0)
-    for i in range (0, nb_sbgrps) :
+    for i in range(nb_sbgrps):

Function avg_non_similar refactored with the following changes:

  • Replace range(0, nb_sbgrps) with range(nb_sbgrps)


sourcery-ai bot commented Apr 29, 2022

Sourcery Code Quality Report

✅  Merging this PR will increase code quality in the affected files by 1.89%.

Quality metrics   Before      After       Change
Complexity        8.63 🙂     7.37 ⭐     -1.26 👍
Method Length     98.90 🙂    97.05 🙂    -1.85 👍
Working memory    11.03 😞    10.73 😞    -0.30 👍
Quality           53.83% 🙂   55.72% 🙂   1.89% 👍

Other metrics     Before      After       Change
Lines             336         331         -5
Changed files                      Quality Before   Quality After   Quality Change
packages/features_importance.py    55.51% 🙂        55.53% 🙂       0.02% 👍
packages/neighbors_generation.py   38.31% 😞        38.33% 😞       0.02% 👍
packages/patterns_extraction.py    22.03% ⛔        27.19% 😞       5.16% 👍
packages/performances.py           68.80% 🙂        69.22% 🙂       0.42% 👍

Here are some functions in these files that still need a tune-up:

File                               Function                 Complexity   Length    Working Memory   Quality      Recommendation
packages/patterns_extraction.py    patterns                 22 😞        270 ⛔    15 😞            27.19% 😞    Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions
packages/neighbors_generation.py   generate_all_neighbors   13 🙂        350 ⛔    16 ⛔            30.51% 😞    Try splitting into smaller methods. Extract out complex expressions
packages/features_importance.py    plot_explanations        3 ⭐         283 ⛔    13 😞            45.62% 😞    Try splitting into smaller methods. Extract out complex expressions
packages/performances.py           similarity               12 🙂        150 😞    10 😞            51.41% 🙂    Try splitting into smaller methods. Extract out complex expressions
packages/performances.py           fscore_local_models      4 ⭐         160 😞    10 😞            57.91% 🙂    Try splitting into smaller methods. Extract out complex expressions

Legend and Explanation

The emojis denote the absolute quality of the code:

  • ⭐ excellent
  • 🙂 good
  • 😞 poor
  • ⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.


Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Help us improve this quality report!
