October 10, 2025 | 4 minute read
AI Literacy for All: Teaching & Learning from Youth Auditing Black-Box AI Systems
Notes from talk
AI was not always as ubiquitous as it is now. In 2014, Facebook was “manipulating emotions”: showing more positive or more negative posts in users’ newsfeeds and observing that the emotional tone of the content became contagious. This came as a shock to many people, who had not realized an algorithm curated the feed at all; when they found out, they were surprised and angry.
AI is different now. It’s pervasive and people are used to it.
But computing education looks much as it did before. This isn’t necessarily a bad thing, but it only makes sense if “we are only educating people to become programmers.” Today, we should be educating all people to be aware of these systems, to be informed “users” of them.
AI auditing is one way this happens. It is a technical method for understanding “black-box” systems: auditors query a system and then attempt to infer the algorithm’s behavior. Audits are used to evaluate systems against standards, often related to equity and justice, and may connect to regulation, liability, and social norms. For example, a Google Search for “computer programmer” may return homogeneous results. In one study, image search results were benchmarked against US Bureau of Labor Statistics data: 100 common occupations were identified, automated searches collected the top 100 images for each, and crowdsourced questionnaires identified who appeared in each image. Women were slightly underrepresented in the results, and people of color were more strongly underrepresented.
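The comparison at the heart of that study can be sketched as a small calculation. The numbers and occupations below are invented for illustration, not taken from the actual study; the idea is simply to compare the demographic makeup of search results against the BLS benchmark.

```python
# Sketch of the occupation-image audit described above (hypothetical data).
# For each occupation we compare the fraction of women among the top
# search-result images (from crowdsourced labeling) with the fraction of
# women in that occupation per BLS statistics.

def representation_gap(search_pct, bls_pct):
    """Signed gap: negative means the group is underrepresented in results."""
    return search_pct - bls_pct

# Illustrative values only, not figures from the study:
occupations = {
    "computer programmer": {"search_pct_women": 0.15, "bls_pct_women": 0.22},
    "nurse": {"search_pct_women": 0.92, "bls_pct_women": 0.88},
}

for job, d in occupations.items():
    gap = representation_gap(d["search_pct_women"], d["bls_pct_women"])
    print(f"{job}: gap = {gap:+.2f}")
```

A negative gap flags underrepresentation relative to the real-world baseline, which is the pattern the study reported for women and, more strongly, for people of color.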
AI auditors are typically researchers or journalists, and audits take many forms. A scraping audit collects data from public sources. A “sock puppet” audit is one in which the auditor poses as a user. A third type involves everyday users and often surfaces on social media: an “everyday audit,” in which ordinary users detect, or raise awareness of, harmful behavior they encounter.
TikTok has generative AI filters: users can define filters and deploy them on the platform. The AI Manga filter was a popular one; it takes an input image and produces a manga-style depiction of it. Users started to notice strange things happening with the filter, such as people being added to an image or body parts being changed. Researchers collected 400 videos using the filter and coded them for actions, outcomes, trends, and emotions; eight themes emerged.
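The qualitative-coding step can be sketched as a simple tally: each collected video is assigned one or more codes, and the counts per theme surface the dominant patterns. Video IDs and code labels here are invented for illustration.

```python
# Minimal sketch of tallying coded themes across collected videos.
# The videos and codes below are hypothetical placeholders.
from collections import Counter

coded_videos = {
    "video_001": ["anthropomorphization"],
    "video_002": ["race_gender_bias", "sexualization"],
    "video_003": ["anthropomorphization", "race_gender_bias"],
}

# Flatten all codes and count occurrences per theme.
theme_counts = Counter(code for codes in coded_videos.values() for code in codes)
print(theme_counts.most_common())
```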
One theme was anthropomorphization: uploading a photo without a person might add a person. Another was race and gender bias, including lightening and whitening of skin. A third was sexualization of images; users went to great lengths to solicit this result, producing trends like showing a fist and having it rendered as a chest.
A key finding is that “AI impacts are complex.” People were sometimes satisfied with results and found them fun and funny. They motivated tests in a spirit of fun, engaging playfully and creatively to try to make the AI respond in a particular inaccurate way. Auditing can be fun, creative, and organic; users, especially younger users on TikTok, were already doing it. And this kind of engagement requires no technical expertise.
These types of audits are a way to prepare kids both to use systems and to audit them. One example was a youth-auditing experiment: 14 high schoolers participated in a two-week program using TikTok’s Effect House, a tool for setting up a flow that takes a text prompt and an input image and re-renders the input according to the prompt. The research team decomposed auditing into five steps: creating a hypothesis, selecting inputs, running tests, analyzing the data, and reporting the findings. The participants experimented with things like athlete rendering. One student observed that “tennis player” produced white skin and “basketball player” produced black skin, and students then audited against that hypothesis. They generated 1,200 images and analyzed the results. Findings included observations like “nail technicians appeared female,” “priests looked 100% male,” and “senators all had white aesthetics” and all looked older. The researchers then ran their own assessment, and the results matched the teenagers’; they concluded that teenagers are able to run successful AI audits.
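The five-step workflow above can be sketched end to end. Since the real filter lives inside TikTok's Effect House, the `render_image` function below is a stub that stands in for the generative pipeline; its random output is entirely hypothetical, and a real audit would label actual filter outputs.

```python
# Sketch of the five-step audit workflow, with a stubbed-out filter so the
# example is self-contained. render_image is a hypothetical stand-in for
# TikTok's Effect House pipeline.
import random

def render_image(prompt):
    # Stub: pretend the filter returns an image whose subject's perceived
    # skin tone we can label. Real audits inspect real outputs.
    return {"prompt": prompt, "skin_tone": random.choice(["light", "dark"])}

# 1. Hypothesis: "tennis player" yields lighter skin than "basketball player".
# 2. Select inputs.
prompts = ["tennis player"] * 50 + ["basketball player"] * 50

# 3. Run tests.
outputs = [render_image(p) for p in prompts]

# 4. Analyze the data: rate of light-skinned outputs per prompt.
def light_rate(prompt):
    subset = [o for o in outputs if o["prompt"] == prompt]
    return sum(o["skin_tone"] == "light" for o in subset) / len(subset)

# 5. Report the findings.
for p in ("tennis player", "basketball player"):
    print(f"{p}: {light_rate(p):.0%} light-skinned outputs")
```

The students' version of steps 3 and 4 generated 1,200 images and tallied attributes like perceived gender, skin tone, and age across occupations.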
In the next phase, the researchers worked with teenagers and teachers to develop lesson plans. Teenagers shared technical expertise, feedback on the project itself, and details about teen interests. They also described how auditing changed their perspectives and behaviors. A final phase will focus on training teachers to teach other teachers, scaling AI-auditing education practices.
