It acknowledged ‘inaccuracies’ in historical prompts.

  • Ferk@kbin.social
    8 months ago

    While the result of generating an image through AI is not meant to be “factually” accurate, it is meant to match the provided prompt as closely as possible. And the prompt “1943 German Soldier” carries implications about what kinds of images would be expected and which kinds wouldn’t.

    I’m not convinced that attempting to “balance” a biased training dataset in the way it apparently is being done here is attainable or worthwhile.

    An AI can only work from biases, and it’s impossible to correct or balance the dataset without simply introducing a different bias, because the model is nothing more than a collection of biases that discriminate between how different descriptions relate to pictures. If there were no bias for the AI to rely on, it would have no basis for picking anything to show.
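
    To illustrate what I mean (a toy sketch with made-up numbers, not how any real model is implemented): all a text-to-image model has to go on is learned association scores between text and image features, and those associations are the bias. Take them away and there is nothing left to rank candidates with.

```python
import numpy as np

# Toy, hand-made feature vectors standing in for learned embeddings.
# The numbers ARE the "bias": they encode which image features the model
# has learned to associate with the word "soldier".
text_soldier = np.array([0.9, 0.1, 0.8])  # uniform-ness, beach-ness, weapon-ness

candidate_images = {
    "person in army uniform": np.array([0.95, 0.05, 0.7]),
    "person on a beach":      np.array([0.05, 0.9,  0.0]),
    "person in a lab coat":   np.array([0.4,  0.1,  0.05]),
}

def similarity(a, b):
    """Cosine similarity: how strongly the learned associations link text and image."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

scores = {name: similarity(text_soldier, vec) for name, vec in candidate_images.items()}
print(scores)
print("picked:", max(scores, key=scores.get))  # the uniform wins only because of the learned association

# Strip the learned association away and every candidate scores the same,
# so there is nothing left to pick from:
no_bias = np.zeros(3)
print({name: float(no_bias @ vec) for name, vec in candidate_images.items()})
```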

    For example, the AI does not know whether the word “Soldier” really corresponds to someone in a uniform like the one in the picture; it is simply biased to expect that kind of picture to match the description “Soldier”, regardless of whether the image really shows a soldier or just someone wearing an army uniform.

    Describing a picture is, in itself, an exercise in assumptions, biases and appearances based on preconceived notions of what we expect when comparing the picture to our own reality. So the AI needs to show whatever corresponds to those biases in order to match, as accurately as possible, our biased expectations of what those descriptions mean.

    If the dataset is complete enough, yet biased to show predominantly a particular gender or ethnicity for “Soldier” because that happens to be the most common image of what a “Soldier” is, and you want a different ethnicity or gender, then add that ethnicity/gender to the prompt (like you said in the first point), instead of supporting the idea of having the developers artificially bias the results in a different direction that contradicts the dataset just because the results aren’t politically correct. It would be more honest to add a disclaimer and still show the result as it is, rather than manipulating it in a way that actively pushes the AI to hallucinate concepts that could not possibly have been found in the dataset.
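
    To make the distinction concrete, here is a purely hypothetical sketch (not any vendor’s actual pipeline; the function names and term list are made up) of the difference between the service silently rewriting the prompt and the user explicitly adding the attributes they actually want:

```python
import random

# Hypothetical illustration only: NOT anyone's real code.
DIVERSITY_TERMS = ["East Asian", "Black", "South Asian", "female"]

def silent_rewrite(user_prompt: str) -> str:
    """Developer-side rewriting: the user never sees that the prompt was changed,
    even when the injected attribute contradicts the historical context asked for."""
    return f"{random.choice(DIVERSITY_TERMS)} {user_prompt}"

def explicit_prompt(user_prompt: str, attributes: list[str]) -> str:
    """User-side control: attributes are only added because the user asked for them."""
    return " ".join(attributes + [user_prompt])

prompt = "1943 German Soldier"
print(silent_rewrite(prompt))                        # e.g. "female 1943 German Soldier"
print(explicit_prompt(prompt, ["Black", "female"]))  # the user explicitly chose these
```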

    Alternatively: expand the dataset with more valuable data in a direction that does not contradict reality (e.g. introduce more pictures of soldiers of different ethnicities from situations that actually exist in our reality). You would still be altering the data, but without distorting the bias, and with examples grounded in reality.
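
    A rough sketch of what I mean (the filenames and captions are made up, only the idea matters): add new, accurately captioned examples of things that really existed, instead of relabeling or reweighting what is already there.

```python
# Hypothetical dataset-expansion sketch: grow the training set with real,
# accurately captioned examples rather than rewriting prompts after the fact.
training_data = [
    {"image": "soldier_001.jpg", "caption": "US soldier, 1943"},
    {"image": "soldier_002.jpg", "caption": "German soldier, 1943"},
]

# New examples drawn from situations that actually occurred, with captions
# that describe them accurately (existing images are left untouched).
new_examples = [
    {"image": "soldier_101.jpg", "caption": "Tuskegee Airman, US Army Air Forces, 1943"},
    {"image": "soldier_102.jpg", "caption": "Soldier of the British Indian Army, 1943"},
]

training_data.extend(new_examples)
print(len(training_data), "captioned examples")
```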