OpenAI has just introduced GPT-4.5, its latest large language model. CEO Sam Altman describes it as the first AI that feels like talking to a "thoughtful person." However, this model marks an important shift: OpenAI confirms that GPT-4.5 will be its last non-reasoning model. The next step? Merging the GPT-series and o-series into a unified AI system, likely GPT-5.
But what does GPT-4.5 bring to the table? How does it compare to previous models, and where exactly does it fit in OpenAI's lineup? Let's break it all down.
What is GPT-4.5?
GPT-4.5, internally codenamed 'Orion,' is OpenAI's latest large language model, an upgrade over previous GPT models like GPT-4o. It is designed to improve conversation quality, writing assistance, and problem-solving. It's bigger and more expensive to run than any previous OpenAI model and has been trained with more data and computing power than ever before.
The GPT-4.5 model can be summed up with these key highlights:
Despite being the latest model, it still has several limitations and weaknesses:
How Does GPT-4.5 Perform in Benchmarks?
Benchmarks help us assess how well AI models perform in different areas. Here's how GPT-4.5 stacks up against both OpenAI's previous models and its competitors:
GPT-4.5 shows solid improvements over GPT-4o in factual accuracy, scoring 62.5% on the SimpleQA benchmark compared to GPT-4o's 38.2% and OpenAI's o1 at 47%. It also has a lower hallucination rate (37.1%) than GPT-4o (61.8%) and o1 (44%), though models like DeepSeek's R1 and Perplexity's Deep Research still do better at fact-checking. When it comes to coding, GPT-4.5 performed better than GPT-4o and o3-mini on OpenAI's SWE-Lancer benchmark but lagged behind Anthropic's Claude 3.7 Sonnet and OpenAI's deep research models.
However, it struggles with complex problem-solving, scoring just 36.7% on AIME (math) compared to o3-mini's 87.3%, and 71.4% on GPQA (science), trailing DeepSeek's R1. On the bright side, it beats GPT-4o in multilingual (85.1% vs. 81.5%) and multimodal (74.4% vs. 69.1%) tasks, but it still lacks full multimodal capabilities like voice and video support.
Human preference ratings are a method OpenAI uses to evaluate how real users perceive the quality of responses from different AI models. In this evaluation, users preferred GPT-4.5 over GPT-4o for creative writing, professional queries, and everyday conversations, but it still trails behind models like Claude 3.7 Sonnet in structured reasoning and legal document drafting.
Overall, GPT-4.5 is a step up in factual knowledge and conversational ability, but when it comes to deep reasoning and structured problem-solving, models like DeepSeek's R1 and Claude 3.7 Sonnet still have the edge.
How to Access GPT-4.5
Starting today, ChatGPT Pro users ($200/month) can access GPT-4.5. OpenAI plans to roll it out to ChatGPT Plus ($20/month) and Team ($30/month) users next week once more GPUs are available. Users can try GPT-4.5 by simply picking it from the model picker.
For developers, GPT-4.5 is being made available through OpenAI's API, including the Chat Completions API, Assistants API, and Batch API. It supports key features like function calling, structured outputs, streaming, system messages, and image input, making it a versatile tool for various AI-driven applications. However, it currently does not support multimodal capabilities such as Voice Mode, video, or screen sharing.
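As a rough idea of what an API call might look like, here is a minimal sketch using the official OpenAI Python SDK with the Chat Completions API, a system message, and streaming. The model identifier `gpt-4.5-preview` and the example prompt are assumptions for illustration; check OpenAI's model list for the exact name available to your account.

```python
# Minimal sketch: calling GPT-4.5 via the Chat Completions API (OpenAI Python SDK).
# NOTE: the model name "gpt-4.5-preview" and the prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# A system message steers tone; stream=True returns tokens as they are generated.
stream = client.chat.completions.create(
    model="gpt-4.5-preview",
    messages=[
        {"role": "system", "content": "You are a concise writing assistant."},
        {"role": "user", "content": "Rewrite this more clearly: the meeting got moved because of scheduling stuff."},
    ],
    stream=True,
)

# Print the reply incrementally as streamed chunks arrive.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

Function calling, structured outputs, and image input are passed through the same endpoint via additional request parameters, so existing GPT-4o integrations should generally only need the model name swapped.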
GPT-4.5 is an exciting step forward, especially for writing, general conversation, and factual accuracy. However, it falls short in deep reasoning and structured problem-solving, which the company strongly hints will not be the case for later models.