Gender shades: intersectional accuracy disparities in commercial gender classification, Buolamwini and Gebru, 2018

Paper, Tags: #machine-learning

We present an approach to evaluate bias present in automated facial analysis algorithms and datasets with respect to phenotypic subgroups. We create a facial analysis dataset, balanced by gender and skin type. We evaluate 3 commercial gender classification systems using our dataset and show that darker-skinned females are the most misclassified group.

Since race and ethnicity labels are unstable, we use skin type (the six-point Fitzpatrick scale) as a more visually precise label to measure dataset diversity. For gender, we use binary female and male labels.

In the commercial systems, male subjects are classified more accurately than female subjects, and lighter-skinned subjects more accurately than darker-skinned subjects. Darker-skinned females are the group with the highest error rate.
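A minimal sketch of the kind of disaggregated evaluation the paper performs, computing error rates per intersectional subgroup (gender × binarized skin type, where Fitzpatrick I–III is "lighter" and IV–VI is "darker") instead of a single aggregate number. The data frame and its column names here are hypothetical, and pandas is assumed:

```python
import pandas as pd

# Hypothetical evaluation log: one row per test image, recording
# the benchmark's phenotype labels and whether the commercial
# classifier's gender prediction was correct.
results = pd.DataFrame({
    "gender":    ["female", "female", "male", "male", "female", "male"],
    "skin_type": ["darker", "lighter", "darker", "lighter", "darker", "darker"],
    "correct":   [False, True, True, True, False, True],
})

# Disaggregate over intersectional subgroups rather than reporting
# one overall accuracy, which can hide subgroup disparities.
accuracy_by_subgroup = results.groupby(["gender", "skin_type"])["correct"].mean()
print((1.0 - accuracy_by_subgroup).rename("error_rate"))
```

A single aggregate accuracy over this table would mask the pattern the paper highlights: the intersectional breakdown is what exposes that darker-skinned females fare worst.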

We need rigorous reporting on the performance metrics on which algorithmic fairness debates center. Algorithmic transparency and accountability reach beyond technical reports and should include mechanisms for consent and redress.