OpenAI gets caught vibe graphing

During its big GPT-5 livestream on Thursday, OpenAI showed off a few charts that made the model seem quite impressive, but if you look closely, some graphs were a little bit off.

In one, ironically showing how well GPT-5 does in “deception evals across models,” the scale is all over the place. For “coding deception,” for example, GPT-5 apparently gets a 50.0 percent deception rate, but that’s compared to OpenAI’s smaller 47.4 percent o3 score, which somehow has a larger bar.

In another chart, one of GPT-5’s scores is lower than o3’s but is shown with a bigger bar. In this same chart, o3’s and GPT-4o’s scores are different but shown with equally sized bars. That chart was bad enough that CEO Sam Altman commented on it, calling it a “mega chart screwup.” An OpenAI marketing staffer also apologized for the “unintentional chart crime.”

OpenAI didn’t immediately reply to a request for comment. And while it’s unclear if OpenAI used GPT-5 to actually make the charts, it’s still not a great look for the company on its big launch day, especially when it’s touting the “significant advances in reducing hallucinations” with its new model.
