o1 news - Search News

ChatGPT o3 hallucinates more than o1, and OpenAI has no idea why

OpenAI delivered advanced ChatGPT reasoning models this month that are more capable than o1, but they also hallucinate more.

OpenAI's most capable models hallucinate more than earlier ones

OpenAI says its latest models, o3 and o4-mini, are its most powerful yet. However, research shows the models also hallucinate ...

5d on MSN

OpenAI's o3 and o4-mini hallucinate way higher than previous models

By OpenAI's own testing, its newest reasoning models, o3 and o4-mini, hallucinate significantly higher than o1.

Hosted on MSN 24d

OpenAI’s o1 model sure tries to deceive humans a lot

The o1 model series also may be significantly more manipulative than GPT-4o. According to OpenAI’s tests using an open-source test evaluation called MakeMePay, o1 was approximately 20% more ...

What the heck is OpenAI doing with its ChatGPT models?

OpenAI unleashed a flurry of new ChatGPT variants over the week, each featuring interesting new features and very confusing ...

Open AI’s new models hallucinate more than the old ones

You would think that the number of hallucinations would decrease over time, but according to internal tests from Open AI, the ...

7d on MSN

OpenAI’s new reasoning AI models hallucinate more

OpenAI's reasoning AI models are getting better, but their hallucinating isn't, according to benchmark results.

11d on MSN

Sam Altman says OpenAI deserves to be mocked for its confusing AI names — and a 'fix' is coming

"How about we fix our model naming by this summer and everyone gets a few more months to make fun of us," OpenAI's Sam Altman ...

Crypto Briefing 10d

OpenAI unveils o3 and o4-mini with breakthrough image reasoning

OpenAI's o3 and o4-mini models introduce breakthrough image reasoning for enhanced performance in reasoning, visual, and ...

OpenAI's newest o3 and o4-mini models excel at coding and math – but hallucinate more often

Historically, each new generation of OpenAI's models has delivered incremental improvements in factual accuracy, with ...
