Compounding Factor (KL divergence) Computation in results.ipynb #1

Bernhard-Steindl · 2024-03-05T11:41:27Z

Hello! 😊

Thank you very much for sharing the code you used for your paper on gender fairness in music recommendations. 🙏🏻
I have one question regarding the computation of the Compounding Factor metric, i.e. the Kullback-Leibler divergence between the population distribution and the metric score distribution.

In the notebooks/results.ipynb file you define two functions in one cell:

def kl_event_diff(p, q):
    return - (p.loc['m'] * np.log2(p.loc['m']/q.loc['m'])) + (p.loc['f'] * np.log2(p.loc['f']/q.loc['f']))
def kl_divergence(p, q):
    return (p.loc['m'] * np.log2(p.loc['m']/q.loc['m'])) + (p.loc['f'] * np.log2(p.loc['f']/q.loc['f']))

The only difference between the two functions is that the first summand is preceded by a minus sign in the kl_event_diff function.
Later you just use the kl_event_diff function for computing the KL Divergence, and the kl_divergence function is not used at all in your notebook.

kl_diffs = groups_percentages_tmp.combine(compounding_factors_tmp,overwrite=False,func=kl_event_diff)

I could find the “Score Dist. (M/F)” values from the tables in your paper in the notebook output as well.
However, I could not find the “CompFct” values from your tables in the Jupyter notebook output.

The output for the Compounding Factor in the Jupyter notebook indicates that you indeed used the kl_event_diff function instead of the kl_divergence function.
However, I only get the same "CompFct" values as in your paper if I use the formula of the kl_divergence function.

I wonder if the kl_event_diff function was inadvertently used in results.ipynb, and if I am correct that the kl_divergence function was actually used for the data analysis and reporting in the paper.
Is there a reason why the kl_event_diff function is used in the file instead of the function kl_divergence?

I guess the kl_divergence should actually be used for computing the Compounding Factor metric.

${\displaystyle CompFct^\mu = KL\Bigl(B \,\|\, C^\mu\Bigr)}$
${\displaystyle KL\Bigl(P \,\|\, Q\Bigr)=\sum _{x\in X}p(x)\log \left({p(x) \over q(x)}\right)}$

I appreciate your response. Thank you! ☺️
Best regards,
Bernhard

Example for computing CompFct for NDCG@10 for model ALS and STANDARD scenario (Paper Table 4).

Population distribution (B) = group percentages:
P = [m= 0.778929, f= 0.221071] (from the notebook output)
P = [m= 0.779, f= 0.221] (from the paper, rounded to 3 decimal digits)

ALS Score Distribution (Standard scenario):
Q = [m= 0.811982, f= 0.188018] (from the notebook output)
Q = [m= 0.812, f= 0.188] (from the paper, rounded to 3 decimal digits)

CompFct = ( P[m] * LOG2(P[m] / Q[m]) ) + ( P[f] * LOG2(P[f] / Q[f]) )
= ( 0.778929 * log2(0.778929 / 0.811982) ) + ( 0.221071 * log2(0.221071 / 0.188018) )
= (-0.0467014006470215) + (0.0516508073193989)
= 0.0049494066723774 ~= 0.005

The CompFct value 0.005 is written in the paper and the formula corresponds to kl_divergence.
But if I were to use the kl_event_diff formula, the result would be 0.09835220797.
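
To make the comparison easy to re-run, here is a minimal sketch of the same calculation using only the Python standard library. The function names `kl_divergence_bits` and `sign_flipped_variant` are my own and do not appear in the notebook; they are only meant to mirror the two formulas (kl_divergence and kl_event_diff) on plain dictionaries instead of pandas objects.

```python
import math

def kl_divergence_bits(p, q):
    """KL(P || Q) in bits: sum over groups of p(x) * log2(p(x) / q(x))."""
    return sum(p[g] * math.log2(p[g] / q[g]) for g in p)

def sign_flipped_variant(p, q):
    """Same two summands, but the 'm' term is negated (mirrors kl_event_diff)."""
    return -(p["m"] * math.log2(p["m"] / q["m"])) + p["f"] * math.log2(p["f"] / q["f"])

# Values copied from the notebook output quoted above (ALS, Standard scenario).
P = {"m": 0.778929, "f": 0.221071}  # population distribution B
Q = {"m": 0.811982, "f": 0.188018}  # NDCG@10 score distribution

comp_fct = kl_divergence_bits(P, Q)
flipped = sign_flipped_variant(P, Q)

print(f"KL divergence:        {comp_fct:.3f}")
print(f"sign-flipped variant: {flipped:.3f}")
```

Rounded to three decimals, the first value should agree with the CompFct reported in the paper, and the second with the value I obtain from the kl_event_diff formula.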

Image source:

Melchiorre, A.B., Rekabsaz, N., Parada-Cabaleiro, E., Brandl, S., Lesota, O., Schedl, M., 2021. Investigating gender fairness of recommendation algorithms in the music domain. Information Processing & Management 58, 102666. https://doi.org/10.1016/j.ipm.2021.102666
