
This week, OpenAI announced its latest models: o3 and o4-mini. These are reasoning models, which break down a prompt into multiple parts that are then addressed one at a time. The goal is for the bot to “think” through a request more deeply than other models might, and arrive at a deeper, more accurate result.
While there are many possible functions for OpenAI’s “most powerful” reasoning model, one use that has blown up a bit on social media is geoguessing, the act of identifying a location by analyzing only what you can see in an image. As TechCrunch reported, users on X are posting about their experiences asking o3 to pinpoint locations from random photos, and showing glowing results. The bot will guess where in the world it thinks the photo was taken, and break down its reasons for thinking so. For example, it might say it zeroed in on a license plate color that denotes a particular country, or that it noticed a particular language or writing style on a sign.
According to some of these users, ChatGPT isn’t using any metadata hidden in the images to help it identify the locations: Some testers are stripping that data out of the photos before sharing them with the model, so, theoretically, it’s working off of reasoning and web search alone.
On the one hand, this is a fun task to put ChatGPT through. Geoguessing is all the rage online, so making the practice more accessible could be a good thing. On the other, there are clear privacy and security implications here: Someone with access to ChatGPT’s o3 model could use the reasoning model to identify where someone lives or is staying based on an otherwise anonymous image of theirs.
I decided to test out o3’s geoguessing capabilities with some stills from Google Street View, to see whether the tool lived up to the internet hype. The good news is that, from my own experience, this is far from a perfect tool. In fact, it doesn’t seem like it’s much better at the task than OpenAI’s non-reasoning models, like 4o.
Testing o3’s geoguessing skills
o3 can handle clear landmarks with relative ease: I first tested a view from a highway in Minnesota, facing the skyline of Minneapolis in the foreground. It only took the bot a minute and six seconds to identify the city, and it worked out that we were looking down I-35W. It also instantly identified the Panthéon in Paris, noting that the screenshot was from the time it was under renovation in 2015. (I didn’t know that when I submitted it!)
Credit: Lifehacker
Next, I wanted to try non-famous landmarks and locations. I found a random street corner in Springfield, Illinois, featuring the city’s Central Baptist Church, a red brick building with a steeple. This is when things started to get interesting: o3 cropped the image into multiple parts, looking for identifying characteristics in each. Since this is a reasoning model, you can see what it’s looking for in certain crops, too. Like other times I’ve tested out reasoning models, it’s weird to see the bot “thinking” with human-like interjections (e.g., “Hmm,” “but wait,” and “I remember”). It’s also interesting to see how it picks out specific details, like noting the architectural style of a section of a building, or where in the world a certain park bench is most commonly seen. Depending on where the bot is in its thinking process, it may start to search the web for more information, and you can click those links to investigate what it’s referencing yourself.
Despite all this reasoning, this location stumped the bot, and it wasn’t able to complete the analysis. After three minutes and 47 seconds, the bot seemed like it was getting close to figuring it out, saying: “The location at 400 E Jackson Street in Springfield, IL could be near the Cathedral Church of St. Paul. My crop didn’t capture the whole board, so I need to adjust the coordinates and test the bounding box. Alternatively, the architecture might help identify it: a red brick Greek Revival with a white steeple, combined with a high-rise that could be ‘Embassy Plaza.’ The term ‘Redeemer’ could relate to ‘Redeemer Lutheran Church.’ I’ll search my memory for more details about landmarks near this address.”
Credit: Lifehacker
The bot correctly identified the street and, more impressively, the city itself. I was also impressed by its analysis of the church. While it was struggling to identify the specific church, it was able to analyze its style, which could have put it on the right path. However, the analysis quickly fell apart. The next “thought” was about how the location might be in Springfield, Missouri or Kansas City. This is the first time I saw anything about Missouri, which made me wonder whether the bot had confused the two Springfields. From here, the bot lost the plot, wondering if the church was in Omaha, or maybe that it was the Topeka Governor’s Mansion (which doesn’t really look anything like the church).
It kept thinking for another couple of minutes, speculating about other locations the block could be in, before pausing the analysis altogether. This tracked with a subsequent experience I had testing a random town in Kansas: After three minutes of thinking, the bot thought my image was from Fulton, Illinois, though, to its credit, it was pretty sure the picture was from somewhere in the Midwest. I asked it to try again, and it thought for a while, again guessing wildly different cities in various states, before pausing the analysis for good.
Now is not the time for fear
The thing is, GPT-4o seems to be about even with o3 when it comes to location recognition. It was able to instantly identify that skyline of Minneapolis and immediately guessed that the Kansas photo was actually in Iowa. (It was incorrect, of course, but it was quick about it.) That seems to align with others’ experiences with the models: TechCrunch was able to get o3 to identify one location 4o couldn’t, but the models were evenly matched other than that.
While there are certainly some privacy and security concerns with AI in general, I don't think o3 in particular needs to be singled out as a specific threat. It can be used to correctly guess where an image was taken, sure, but it can also easily get it wrong, or crash out entirely. Seeing as 4o is capable of a similar level of accuracy, I'd say there's as much concern today as there was over the past year or so. It's not great, but it's also not dire. I'd save the panic for an AI model that gets it right almost every time, especially when the image is obscure.
Regarding the privacy and security concerns, OpenAI shared the following with TechCrunch: “OpenAI o3 and o4-mini bring visual reasoning to ChatGPT, making it more helpful in areas like accessibility, research, or identifying locations in emergency response. We’ve worked to train our models to refuse requests for private or sensitive information, added safeguards intended to prohibit the model from identifying private individuals in images, and actively monitor for and take action against abuse of our usage policies on privacy.”