Bias scan tool FAQ – What is it?
What is a bias scan tool?
An algorithm that helps detect bias in AI systems. It identifies groups of similar users who are potentially discriminated against.
Bias scan tool – Input data
What input does the bias scan tool need?
A .csv file of max. 10 MB, with columns in the following order: features, predicted labels, truth labels. Only the order of the columns matters, not their names.
Features: unscaled numeric values, e.g., feat_1, feat_2, ..., feat_n;
Predicted label: 0 or 1;
Truth label: 0 or 1.
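To check a file against the layout above before uploading, a minimal sketch along the following lines may help. It is not part of the bias scan tool. The function name validate_bias_scan_csv is hypothetical, and the sketch assumes that the first row is a header row and that 10 MB means 10 * 1024 * 1024 bytes.

```python
import csv
import io

MAX_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 * 1024 * 1024 bytes


def validate_bias_scan_csv(text):
    """Return a list of problems found; an empty list means the layout looks fine.

    Hypothetical helper. Assumes the first row is a header and that the last
    two columns are the predicted label and the truth label, in that order.
    """
    problems = []
    if len(text.encode("utf-8")) > MAX_BYTES:
        problems.append("file is larger than 10 MB")
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) < 2:
        return problems + ["expected a header row and at least one data row"]
    header, data = rows[0], rows[1:]
    if len(header) < 3:
        return problems + ["need at least one feature column plus two label columns"]
    for line_no, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} values")
            continue
        *features, predicted, truth = row
        for value in features:
            try:
                float(value)  # features must be numeric
            except ValueError:
                problems.append(f"line {line_no}: non-numeric feature {value!r}")
        for name, label in (("predicted", predicted), ("truth", truth)):
            if label.strip() not in {"0", "1"}:
                problems.append(f"line {line_no}: {name} label must be 0 or 1")
    return problems


example = "feat_1,feat_2,pred,truth\n0.5,3,1,0\n1.2,7,0,0\n"
print(validate_bias_scan_csv(example) or "layout looks fine")
```

The check only looks at column order and value types, in line with the note that column names do not matter.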
Bias scan tool – Generate a report
If your report cannot be downloaded, please check your column structure and numeric values, or contact us.
Algorithm – The implemented bias scan tool is based on k-means Hierarchical Bias-Aware Clustering (HBAC), as described by Misztal-Radecka and Indurkhya in Information Processing and Management (2021) [link]. An implementation of the HBAC algorithm can be found on GitHub.
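The general idea of bias-aware hierarchical clustering can be illustrated with a heavily simplified sketch. It is not the implementation referred to above, and it makes several assumptions: the bias of a cluster is taken to be the accuracy inside the cluster minus the accuracy in the rest of the dataset, the cluster with the most negative bias is split with a plain 2-means step, and a split is kept only if it uncovers a cluster with an even more negative bias. All function names, the deterministic centroid initialisation and the stopping parameters are hypothetical.

```python
from math import dist

# Each row is (features, predicted_label, truth_label).


def accuracy(rows):
    """Share of rows whose predicted label equals the truth label."""
    return sum(pred == truth for _, pred, truth in rows) / len(rows)


def bias(cluster, rest):
    """Assumed bias metric: accuracy inside the cluster minus accuracy outside it."""
    return accuracy(cluster) - accuracy(rest)


def cluster_biases(clusters):
    """Bias of every cluster against the union of all other clusters."""
    result = []
    for i, cluster in enumerate(clusters):
        rest = [r for j, other in enumerate(clusters) if j != i for r in other]
        result.append(bias(cluster, rest) if rest else 0.0)
    return result


def two_means(rows, iterations=20):
    """Split rows in two with a plain 2-means on the feature vectors."""
    first = rows[0][0]
    far = max(rows, key=lambda r: dist(r[0], first))[0]
    centroids = [first, far]  # deterministic start: first point and the one farthest from it
    groups = [list(rows), []]
    for _ in range(iterations):
        groups = [[], []]
        for row in rows:
            nearest = min((0, 1), key=lambda i: dist(row[0], centroids[i]))
            groups[nearest].append(row)
        if not groups[0] or not groups[1]:
            break
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*(r[0] for r in g)))
            for g in groups
        ]
    return groups


def hbac(rows, max_splits=3, min_size=2):
    """Keep splitting the most negatively biased cluster while that finds a worse one."""
    clusters = [list(rows)]
    for _ in range(max_splits):
        current = cluster_biases(clusters)
        worst = current.index(min(current))
        parts = two_means(clusters[worst])
        if any(len(p) < min_size for p in parts):
            break  # split would create a cluster that is too small
        candidate = clusters[:worst] + clusters[worst + 1:] + parts
        if min(cluster_biases(candidate)) >= min(current):
            break  # split did not uncover a more negatively biased cluster
        clusters = candidate
    return clusters


rows = [
    ((0.0, 0.0), 1, 1), ((0.0, 1.0), 0, 0), ((1.0, 0.0), 1, 1), ((1.0, 1.0), 0, 0),
    ((10.0, 10.0), 1, 0), ((10.0, 11.0), 0, 1), ((11.0, 10.0), 1, 1), ((11.0, 11.0), 0, 1),
]
clusters = hbac(rows)
for cluster, b in zip(clusters, cluster_biases(clusters)):
    print(len(cluster), "rows, bias", round(b, 2))
```

In this toy data the classifier is wrong mostly for the points around (10, 10), so the sketch separates them into a cluster with a negative bias. A negative value under this assumed metric means the classifier performs worse for that cluster than for the rest.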
Explanatory note on example report
Statistically significant differences in features between the cluster with the most negative bias (cluster 4, bias=-0.27) and the rest of the dataset:
So, tweets of users with a verified profile, an above-average sentiment score and a below-average number of URLs in their tweets are classified as disinformation significantly more often by the BERT-based classifier. A qualitative assessment with the help of subject matter experts is now needed to examine the measured quantitative disparities further.
More details on this case study can be found here.
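A comparison of one feature between a cluster and the rest of the dataset, as in the explanatory note above, could be sketched as follows. The text does not state which statistical test the tool uses, so Welch's t-test is an assumption here. The p-value uses a normal approximation of the t-distribution, which is only reasonable for large samples. The function name welch_t and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(cluster_values, rest_values):
    """Welch's t statistic and an approximate two-sided p-value for one feature.

    Assumption: the p-value is taken from the standard normal distribution
    instead of the t-distribution, so it is too optimistic for small samples.
    """
    m1, m2 = mean(cluster_values), mean(rest_values)
    v1, v2 = variance(cluster_values), variance(rest_values)  # sample variances
    n1, n2 = len(cluster_values), len(rest_values)
    standard_error = sqrt(v1 / n1 + v2 / n2)
    if standard_error == 0:
        raise ValueError("both groups are constant; the statistic is undefined")
    t = (m1 - m2) / standard_error
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p_approx


# Hypothetical values of one feature (e.g. a sentiment score) in the two groups.
cluster_scores = [5.0, 5.5, 6.0, 6.5, 7.0]
rest_scores = [1.0, 1.5, 2.0, 2.5, 3.0]
t, p = welch_t(cluster_scores, rest_scores)
print(f"t = {t:.2f}, approximate p = {p:.4f}")
```

A positive t means the feature is on average higher in the cluster than in the rest of the dataset, which corresponds to "above average" in the note above.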