Revealing another dark side of trained Artificial Intelligence (AI) models, new research claims that Google's AI image recognition labeled many women wearing masks as if their mouths were covered by duct tape.
Google was not alone. When put to the test, the AI-powered IBM Watson virtual assistant was not far behind on gender bias.
In 23 per cent of cases, Watson saw a woman wearing a gag while in another 23 per cent, it was sure the woman was “wearing a restraint or chains”.
To reach this conclusion, Ilinca Barsan, Director of Data Science at Wunderman Thompson Data, used 265 images of men in masks and 265 images of women in masks, of varying picture quality, mask style and context: from outdoor pictures to office snapshots, from stock images to iPhone selfies, and from DIY cotton masks to N95 respirators.
The results showed that AI algorithms are, indeed, written by “men”.
Out of the 265 images of men in masks, Google correctly identified 36 per cent as containing PPE. It also mislabeled 27 per cent of the images as depicting facial hair.
“While inaccurate, this makes sense, as the model was likely trained on thousands and thousands of images of bearded men.
“Despite not explicitly receiving the label man, the AI seemed to make the association that something covering a man’s lower half of the face was likely to be facial hair,” said Barsan, who deciphers data at Wunderman Thompson, a New York-based global marketing communications agency.
Beyond that, 15 per cent of images were misclassified as duct tape.
“This suggested that it may be an issue for both men and women. We needed to learn if the misidentification was more likely to happen to women,” she said in a statement.
Most interestingly (and worryingly), the tool mislabeled 28 per cent of the images of women as depicting duct tape.
At almost twice the rate for men, it was the single most common “bad guess” for labeling masks.
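The "almost twice" comparison can be checked with a little arithmetic. The sketch below uses only the two duct-tape percentages and the 265-image group size reported here. The image counts it derives are approximations from rounded percentages, not figures from the study, and the variable and function names are illustrative.

```python
# Percentages reported in the article for Google's duct-tape label.
IMAGES_PER_GROUP = 265
DUCT_TAPE_PCT = {"women": 28, "men": 15}


def approx_count(pct, total=IMAGES_PER_GROUP):
    """Approximate image count implied by a rounded percentage (an estimate)."""
    return round(pct / 100 * total)


# How many times more often the label was applied to women than to men.
ratio = DUCT_TAPE_PCT["women"] / DUCT_TAPE_PCT["men"]

# Rough number of images per group that received the label.
counts = {group: approx_count(pct) for group, pct in DUCT_TAPE_PCT.items()}

print(f"women-to-men ratio: {ratio:.2f}")
print(f"approximate image counts: {counts}")
```

On these figures the label was applied to women roughly 1.9 times as often as to men, or about 74 images against about 40.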
When Microsoft’s Computer Vision looked at the image sets, it suggested that 40 per cent of the women were wearing a fashion accessory, while 14 per cent were wearing lipstick, instead of spotting the face masks.
“Even as a data scientist, who spends big chunks of her time scrubbing and prepping datasets, the idea of potentially harmful AI bias can feel a little abstract; like something that happens to other people’s models, and accidentally gets embedded into other people’s data products,” Barsan elaborated.
IBM Watson correctly identified 12 per cent of men as wearing masks, while it was right only 5 per cent of the time for women.
Overall, for 40 per cent of images of women, Microsoft Azure Cognitive Services identified the mask as a fashion accessory compared to only 13 per cent of images of men.
“Going one step further, the computer vision model suggested that 14 per cent of images of masked women featured lipstick, while 12 per cent of images of men mistook the mask for a beard,” Barsan said.
These labels seem harmless in comparison, she added, but they are still a sign of underlying bias and of the model’s expectations about what it will and won’t see when fed an image of a woman.
“I was baffled by the duct-tape label because I’m a woman and, therefore, more likely to receive a duct-tape label back from Google in the first place. But gender is not even close to the only dimension we must consider here,” she lamented.
The researchers wrote that the machines were looking for inspiration in “a darker corner of the web where women are perceived as victims of violence or silenced.”