Bias scan

Unsupervised bias detection identifies groups of similar users that an AI system may be discriminating against. It works in tandem with the qualitative doctrine of law and ethics to assess whether AI is fair.

Bias scan tool FAQ – What is it?

What is a bias scan tool?
An algorithm that detects bias in AI systems by identifying groups of similar users who may be discriminated against.

Bias scan tool – Input data

What input does the bias scan tool need?
A .csv file of at most 10 MB, with columns in the following order: features, predicted label, truth label. Only the order of the columns matters, not their names.

Features: unscaled numeric values, e.g., feat_1, feat_2, ..., feat_n;

Predicted label: 0 or 1;

Truth label: 0 or 1.


feat_1  feat_2  ...  feat_n  pred_label  truth_label
10      1       ...  0.1     1           1
20      2       ...  0.2     1           0
30      3       ...  0.3     0           0
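
Before uploading, the column layout described above can be checked locally. The following is a minimal sketch, not part of the bias scan tool: the function name `validate_bias_scan_csv` is hypothetical, and it assumes a comma-delimited file with one header row and labels written exactly as 0 or 1.

```python
import csv
import io


def validate_bias_scan_csv(text, max_bytes=10 * 1024 * 1024):
    """Return a list of problems found; an empty list means the layout looks fine."""
    problems = []
    if len(text.encode("utf-8")) > max_bytes:
        problems.append("file is larger than 10 MB")

    # Assumption: comma-delimited, first row is a header.
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) < 2:
        return problems + ["expected a header row and at least one data row"]

    header, data = rows[0], rows[1:]
    if len(header) < 3:
        return problems + ["need at least one feature column plus the two label columns"]

    for line_no, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} values, got {len(row)}")
            continue
        # Column order matters: features first, then predicted label, then truth label.
        *features, pred, truth = row
        for value in features:
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_no}: feature value {value!r} is not numeric")
        for name, label in (("predicted", pred), ("truth", truth)):
            if label.strip() not in {"0", "1"}:
                problems.append(f"line {line_no}: {name} label must be 0 or 1, got {label!r}")
    return problems


example = "feat_1,feat_2,pred_label,truth_label\n10,1,1,1\n20,2,1,0\n30,3,0,0\n"
print(validate_bias_scan_csv(example))
```

An empty list suggests the file matches the expected structure. Each reported problem names the offending line.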

Bias scan tool – Generate a report


If your report cannot be downloaded, please check your column structure and numeric values, or contact us.

Algorithm – The bias scan tool is based on k-means Hierarchical Bias-Aware Clustering (HBAC), as described by Misztal-Radecka and Indurkhya in Information Processing and Management (2021). An implementation of the HBAC algorithm is available on GitHub.
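
To give a feel for the idea, here is a heavily simplified sketch of bias-aware hierarchical splitting with k-means (k=2). It is not the tool's implementation or the published algorithm. Several choices are assumptions made for illustration: bias is taken to be the accuracy inside a cluster minus the accuracy in the rest of the dataset, the cluster with the most negative bias is the one split next, and a split is rejected when a child falls below `min_size`. All function names and the toy data are hypothetical.

```python
import random


def accuracy(idx, pred, truth):
    """Share of the given rows whose predicted label equals the truth label."""
    return sum(pred[i] == truth[i] for i in idx) / len(idx)


def bias(cluster, n, pred, truth):
    """Assumed bias definition: accuracy in the cluster minus accuracy in the rest."""
    members = set(cluster)
    rest = [i for i in range(n) if i not in members]
    return accuracy(cluster, pred, truth) - accuracy(rest, pred, truth)


def two_means(idx, X, iterations=20, seed=0):
    """Split the rows in idx into two groups with plain k-means (k=2)."""
    rng = random.Random(seed)
    centers = [X[i] for i in rng.sample(idx, 2)]
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for i in idx:
            d = [sum((a - b) ** 2 for a, b in zip(X[i], c)) for c in centers]
            groups[d.index(min(d))].append(i)
        if not groups[0] or not groups[1]:
            break  # degenerate split; the caller rejects it via min_size
        centers = [[sum(col) / len(g) for col in zip(*(X[i] for i in g))] for g in groups]
    return groups


def hbac_sketch(X, pred, truth, max_splits=2, min_size=3):
    """Repeatedly split the cluster with the most negative bias."""
    n = len(X)
    clusters = [list(range(n))]
    for _ in range(max_splits):
        if len(clusters) == 1:
            target = clusters[0]
        else:
            target = min(clusters, key=lambda c: bias(c, n, pred, truth))
        if len(target) < 2 * min_size:
            break
        left, right = two_means(target, X)
        if min(len(left), len(right)) < min_size:
            break
        clusters = [c for c in clusters if c is not target] + [left, right]
    return clusters


# Toy data: the classifier is accurate near (0, 0) and mostly wrong near (10, 10).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
pred = [1, 0, 1, 0, 1, 1, 1, 1]
truth = [1, 0, 1, 0, 0, 0, 1, 0]

clusters = hbac_sketch(X, pred, truth)
worst = min(clusters, key=lambda c: bias(c, len(X), pred, truth))
print(sorted(worst), bias(worst, len(X), pred, truth))
```

On this toy data the sketch separates the two groups of points and flags the group near (10, 10) as the cluster with the most negative bias. Real data would typically need feature scaling and a more careful stopping rule.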

Download an example report

Explanatory note on example report
Statistically significant differences in features between the cluster with the most negative bias (cluster 4, bias = -0.27) and the rest of the dataset:

feature          difference  p-value
verified         0.53468     0.00000
sentiment_score  0.95686     0.00005
#URLs            -0.74095    0.00005

So, tweets from users with a verified profile, an above-average sentiment score, and a below-average number of URLs are classified as disinformation significantly more often by the BERT-based classifier. A qualitative assessment with the help of subject matter experts is now needed to examine these measured quantitative disparities further.
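
A per-feature comparison of this kind can be sketched as follows. The source does not state which statistical test the tool uses, so this example is an assumption: it computes the difference in means between the cluster and the rest of the dataset, and an approximate two-sided p-value from a Welch-style standard error with a normal approximation. The function name `compare_feature` and the sample values are made up for illustration.

```python
from statistics import NormalDist, mean, variance


def compare_feature(cluster_values, rest_values):
    """Return (difference in means, approximate two-sided p-value)."""
    diff = mean(cluster_values) - mean(rest_values)
    # Welch-style standard error; needs at least two values per group
    # and non-zero spread in at least one group.
    se = (variance(cluster_values) / len(cluster_values)
          + variance(rest_values) / len(rest_values)) ** 0.5
    z = diff / se
    # Normal approximation; a t-distribution would be more accurate for small samples.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, p


# Made-up values of one feature inside the flagged cluster and in the rest of the data.
cluster_values = [0.9, 0.8, 1.0, 0.7, 0.9, 0.8]
rest_values = [0.2, 0.1, 0.3, 0.2, 0.0, 0.1]

diff, p = compare_feature(cluster_values, rest_values)
print(f"difference={diff:.5f} p-value={p:.5f}")
```

A positive difference means the feature is higher in the flagged cluster than elsewhere, and a negative one means it is lower, which matches how the signs in the table above are read.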

More details on this case study can be found here.