Michael Mandiberg, Professor of Media Culture at the College of Staten Island, is in the midst of a unique investigation into stock photo images, which we see every day in ads and other media.
Their project, Taking Stock, used artificial intelligence (AI) to analyze 130 million stock images, not only to create new works of art but also to understand the characteristics that these images have in common. The results are a series of fascinating images and videos.
Explaining their motivation for undertaking this project, Mandiberg said, “AI tools that create images and text, like chatbots, have gained a lot of attention by figuring out patterns in huge amounts of data. News reports often talk about how this data can be biased, but we don’t fully understand how big or important the biases in the data are. People know the data isn’t perfect, but no one has explained exactly what those biases look like. I wanted to see what I could learn by looking at the actual data using both creative and analytic approaches.”
Methodology and Findings
Mandiberg’s methodology involved creating a program that downloaded every image of an individual person from a wide variety of stock photo websites, from large sites like Shutterstock to smaller collections like Afripics. This process is referred to as “scraping.” In all, Mandiberg collected 130 million pictures.
As one would imagine, performing an analysis of that many images can be a daunting task, but Mandiberg was up to the challenge. “I started by writing code with AI and machine learning tools to check if an image had a face, where the face was, and which way it was looking,” they explained. “I focused on five million images where the face was looking straight at the camera. I also studied the position and gestures of the person’s hands.”
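The filtering step Mandiberg describes can be pictured with a short sketch. It is only an illustration, not the project's code: the record fields, the file names, and the 10-degree threshold are all assumptions, and it presumes a separate face-detection tool has already estimated which way each head is turned.

```python
from dataclasses import dataclass


@dataclass
class FaceRecord:
    """Hypothetical per-image output of a face-detection step."""
    filename: str
    has_face: bool
    yaw_degrees: float    # left/right head turn; 0 means facing the camera
    pitch_degrees: float  # up/down head tilt; 0 means level


def is_frontal(record, max_angle=10.0):
    """Keep images with a detected face turned at most max_angle from the camera."""
    return (
        record.has_face
        and abs(record.yaw_degrees) <= max_angle
        and abs(record.pitch_degrees) <= max_angle
    )


# Made-up records standing in for detector output.
records = [
    FaceRecord("one.jpg", True, 2.0, -3.0),
    FaceRecord("two.jpg", True, 35.0, 0.0),
    FaceRecord("three.jpg", False, 0.0, 0.0),
]
frontal = [r.filename for r in records if is_frontal(r)]
print(frontal)
```

Only the first record passes: the second face is turned too far to the side, and the third image has no face at all.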
Mandiberg also looked at the metadata with which photographers tag their images, such as keywords, gender, ethnicity, age, and location. “I used an AI tool to group images by similar keywords and find topics or themes. As a result, the program created 64 topics, which were each named after the first three keywords in each group.”
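The naming rule is simple enough to sketch in a few lines. The example below assumes that each topic comes with keywords ranked by a weight, which the article does not specify. The weights shown are invented for illustration.

```python
def name_topic(keyword_weights, n=3):
    """Name a topic after its n highest-weighted keywords.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(keyword_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ", ".join(word for word, _ in ranked[:n])


# Hypothetical keyword weights for a single topic.
topic = {"shirt": 0.31, "clothes": 0.22, "jeans": 0.18, "casual": 0.09, "denim": 0.05}
label = name_topic(topic)
print(label)  # shirt, clothes, jeans
```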
They noted that “The largest group was about clothing, like ‘shirt, clothes, jeans’. The second largest was ‘business, corporate, executive’, which was expected. But the third largest surprised me: ‘finger, gesture, point.’ These are images of people pointing at blank spaces or holding empty phones, which is an unusual gesture that AI is mistakenly learning we do all the time.”
During the process, Mandiberg made other discoveries that shed light on commonalities among the images. “Some of what I found wasn’t surprising: most of the images show young white people smiling happily. But other things stood out: about two-thirds of the images are of women, while only one-third are men. I wasn’t surprised that most of the images come from North American and European photographers. But I was surprised to learn most of these were made in Eastern European countries that used to be part of the Soviet Union. Ukraine and Russia alone each created more than twice as many images as the United States. These countries have predominantly white populations, and their images reflect that. Less than 10% of the images from these countries include ethnicities other than white/Caucasian/European.”
Creating New Images and Videos
“To make the prints for Taking Stock,” Mandiberg stated, “the code selects images from one of the themes (like ‘finger, gesture, point’). It picks images where the person is looking forward, their right hand is near their mouth in a pointing gesture, and their left hand isn’t visible. From thousands of images, the code combines the 64 images that are the most similar in the group.
“When analyzing faces, I used face recognition code to calculate a unique ‘face signature’ for each person. To create the videos, the code looks at body poses and gestures and chooses each image based on how similar the faces are. It finds multiple images of the same person or someone who looks very similar. This creates a smooth transition between the fast-moving images.”
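One way to picture the face-similarity step is as a chain in which each image is followed by the unused image whose face signature is closest to it. The sketch below is an assumption about how such an ordering could work, not a description of Mandiberg's code. It treats a signature as a short list of numbers, compares signatures with cosine similarity, and uses invented file names and values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def order_by_face_similarity(signatures, start):
    """Greedy ordering: always step to the unused image most similar to the current one."""
    order = [start]
    remaining = set(signatures) - {start}
    while remaining:
        current = signatures[order[-1]]
        # sorted() makes tie-breaking deterministic.
        nxt = max(
            sorted(remaining),
            key=lambda name: cosine_similarity(current, signatures[name]),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Toy three-number signatures; real face signatures would be much longer.
signatures = {
    "a.jpg": [1.0, 0.0, 0.0],
    "b.jpg": [0.0, 1.0, 0.0],
    "c.jpg": [0.9, 0.1, 0.0],
    "d.jpg": [0.1, 0.9, 0.0],
}
sequence = order_by_face_similarity(signatures, "a.jpg")
print(sequence)  # ['a.jpg', 'c.jpg', 'd.jpg', 'b.jpg']
```

Because each step moves to the nearest remaining face, consecutive frames differ as little as possible, which is one plausible source of the smooth transitions described above.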
Each set of 64 images produces what Mandiberg calls a “ghostly image of a person from the waist up that represents the much larger cluster, but if you look closely, you can see halos of each person from the individual layers, as well as their watermarks and square edges. The image is sharp around the nose, mouth, and eyes but gets blurry toward the edges. This is a blur unlike any I have seen before. It is a statistical blur that visualizes how probability impacts the process of generating AI images.”
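The “statistical blur” can be illustrated with a toy calculation. The article does not say exactly how the 64 layers are combined, so the sketch below assumes a plain pixel-by-pixel average of grayscale images. Where the layers agree, the average stays crisp. Where they disagree, it drifts toward an in-between gray.

```python
def average_images(images):
    """Pixel-wise mean of equally sized grayscale images (each a list of rows)."""
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    count = len(images)
    return [
        [sum(img[y][x] for img in images) / count for x in range(width)]
        for y in range(height)
    ]


# Three tiny 2x3 "images": the left column agrees across layers, the right column varies.
images = [
    [[200, 100, 0], [200, 100, 0]],
    [[200, 120, 255], [200, 80, 255]],
    [[200, 110, 0], [200, 120, 255]],
]
composite = average_images(images)
print(composite)  # [[200.0, 110.0, 85.0], [200.0, 100.0, 170.0]]
```

In the output, the left column keeps its exact value of 200, while the right column, where the layers hold either 0 or 255, averages out to intermediate grays that appear in none of the source images.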
The images contain a variety of gender presentations, depending on the gender breakdown within each topic. “The ‘fashion, beautiful, pose’ image has traces of lipstick and tumbling long hair,” they explained, “while the ‘business, corporate, executive’ image has short hair, the traces of a moustache, and the clear contours of a business suit and tie. Most images, however, show varying degrees of androgyny.”
What does Mandiberg hope the viewing public will get from these works? “I hope the average viewer will gain an understanding of the hidden patterns in the images that shape our visual culture, and how AI learns and then replicates these patterns. This understanding may be an emotional one, in response to the visual images, or it may be more analytic. Both lead to different forms of understanding.”
Looking to the Future
Taking Stock has already received press attention in the Brooklyn Rail, as well as noteworthy exposure at the Paris Photo fair last fall. Mandiberg commented that they now face the monumental task of analyzing all of the project’s data. They also reported that in May a Swiss museum will feature the project in an exhibition and in an essay in the accompanying catalog.
By Terry Mares