Facial recognition systems trained on millions of photos of people without their consent

Facial recognition algorithms are being trained using photos of people who have not given their consent, legal experts have warned.

Companies such as IBM are scraping millions of publicly available images from Flickr and other sites to improve the technology, though the people in the photos have no idea it is happening.

Civil rights activists warn that this technology could one day be used to track and spy on the same people whose faces have been used to train it.

"This is the dirty little secret of AI training sets. Researchers often just grab whatever images are available in the wild," NYU School of Law professor Jason Schultz told NBC, who first reported on the issue.

Yahoo's YFCC100M dataset makes around 100 million Creative Commons-licensed images available for artificial intelligence researchers to draw upon when training facial recognition systems.

IBM used around one million images from the dataset in its 'Diversity in Faces' research, which aimed to address facial recognition's well-documented difficulty in identifying women and people with darker skin.

"We are harnessing the power of science to create AI systems that are more fair and accurate," IBM researcher John Smith wrote in a blog that detailed the research.

"The AI systems learn what they're taught, and if they are not taught with robust and diverse datasets, accuracy and fairness could be at risk. For that reason, IBM, along with AI developers and the research community, need to be thoughtful about what data we use for training."

The researcher argues that publicly available images are the best way of ensuring training data is large and diverse enough to reflect the distribution of face types around the world.

People who have since discovered that their pictures appear in the dataset used by IBM have taken to Twitter to question the ethics of using such images.

"IBM is using 14 of my photos," said Flickr co-founder Caterina Fake. "IBM says people can opt out, but is making it impossible to do so."

The Independent has contacted IBM for comment.