Yesterday, Meta launched its latest AI model, Llama 3.1. At first glance, it looks like an iterative update to the Llama 3 model. However, Meta claims this new model will outperform all current models, including GPT-4 and even Claude 3.5 Sonnet, on benchmarks. To try it out, we took a deep dive into Meta AI's Llama 3.1 to see how it stacks up against ChatGPT and Claude.
What Is Meta AI's Llama 3.1?
ChatGPT is built on GPT (Generative Pre-trained Transformer), a family of language models designed to handle a broad range of tasks. Similarly, the AI model behind Meta AI is Llama. For each new version of Llama, Meta typically releases several variants for different purposes. With Llama 3.1, you can choose from models with 8B, 70B, and 405B parameters.
The Llama 3.1 405B model is on par with GPT-4o and Claude 3.5 Sonnet, and even outperforms them in categories like math and long context.

The 8B and 70B versions outperform Gemma 2 9B and GPT-3.5 Turbo, respectively.
Since Llama is an open-source model, it is completely free for everyone. You can download the model and use it offline without any limitations. Developers can also integrate Llama 3.1 into their apps for free, as long as their apps have fewer than 700 million monthly users. To put this in perspective, building an AI model with the same capabilities could cost over $100 million.
Key Features of Llama 3.1
Comparing Llama 3.1 With Claude and ChatGPT
I compared Meta AI (the Llama 3.1 405B parameter variant) with the ChatGPT (GPT-4o) and Claude (3.5 Sonnet) models on various aspects like code generation, speed, reasoning skills, etc. For the 405B version, I used the Hugging Face app, as the Meta AI site uses the 70B parameter model. Here are the results:
1. Code Generation
I asked Meta AI (Llama 3.1 405B variant), ChatGPT (GPT-4o), and Claude 3.5 Sonnet to create a snake game using Python, including a scoring system.

In this first test, Meta's performance was disappointing compared to ChatGPT and Claude. Meta's model generated code with three to four naming errors that I had to fix manually. Even after correcting these errors, I couldn't control the snake using my keyboard input. After several attempts to generate and refine the code, I finally got the game to run. But it still lacked the scoring system.

On the other hand, ChatGPT and Claude produced code that worked without any issues and included the requested scoring system. Claude's game was the best overall, with smooth controls compared to ChatGPT's version, which had somewhat finicky controls. Overall, Claude is the best AI model for coding because its generated UI is often cleaner, and it also provides the option to add more instructions and improve the code with the help of the Artifacts feature.
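For reference, here is a minimal, GUI-free sketch of the core snake logic and scoring system the prompt asks for. This is our own illustration of the task, not output from any of the tested models, and it deliberately omits the rendering loop and food respawning:

```python
# Minimal grid-based snake logic with a score counter (no GUI).
# Our own illustrative sketch, not code produced by any tested model.

DIRECTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

class SnakeGame:
    def __init__(self, width=10, height=10):
        self.width, self.height = width, height
        self.snake = [(width // 2, height // 2)]  # head is snake[0]
        self.direction = "right"
        self.food = (0, 0)          # respawn logic omitted for brevity
        self.score = 0
        self.game_over = False

    def step(self):
        """Advance one tick: move the head, then eat food or drop the tail."""
        if self.game_over:
            return
        dx, dy = DIRECTIONS[self.direction]
        hx, hy = self.snake[0]
        head = (hx + dx, hy + dy)
        # Hitting a wall or the snake's own body ends the game.
        if (not (0 <= head[0] < self.width and 0 <= head[1] < self.height)
                or head in self.snake):
            self.game_over = True
            return
        self.snake.insert(0, head)
        if head == self.food:
            self.score += 1          # grow and score on food
        else:
            self.snake.pop()         # otherwise keep length constant
```

A complete game would wrap this in a pygame or tkinter loop that reads keyboard events and sets `direction`; the keyboard-control and scoring problems described above lived in exactly that kind of plumbing.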
We repeated the coding test with JavaScript and other languages. While Meta's output occasionally matched the other models, its code generation was hit or miss. I also tested code generation with the smaller 8B and 70B variants of Llama 3.1, and the experience was worse than expected. The 8B model, in particular, often produced output that got stuck in loops no matter how many times I tried.
2. Writing Stories and Emails
With the release of Claude 3.5 Sonnet, Claude has become the best model for generating human-like text and stories. It still stands out as the top option for such work.

On the other hand, ChatGPT is better at generating articles, essays, and similar content. Meta's writing style often feels uneven and is difficult to fine-tune with prompts.

However, these preferences can be subjective, so I recommend you try all three models yourself, as you can test them for free. One notable capability of Meta AI is its ability to write ten sentences ending with a specific word. While this might seem simple, other language models like Claude and ChatGPT struggle to achieve it consistently.
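The "ten sentences ending in a given word" constraint is easy to verify mechanically if you want to score model outputs yourself. Here is a small helper we use for that check (our own utility, not part of any model's tooling; the sentence splitting is a naive period/question/exclamation split):

```python
# Checker for the "N sentences ending in a given word" test.
# Our own scoring helper; the sentence splitter is intentionally naive.
import re

def ends_with_word(text, word, expected=10):
    """True if `text` has `expected` sentences, each ending in `word`."""
    # Split after sentence-final punctuation followed by whitespace.
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", text.strip())
                 if s.strip()]
    if len(sentences) != expected:
        return False
    # Each sentence must end with the word, optionally plus punctuation.
    pattern = rf"\b{re.escape(word)}\b[.!?]?$"
    return all(re.search(pattern, s, re.IGNORECASE) for s in sentences)
```

For example, `ends_with_word(model_output, "apples", expected=10)` returns True only when every one of the ten sentences actually closes with "apples".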
3. Testing Reasoning Skills
Meta AI has outperformed Claude and ChatGPT in benchmarks for the reasoning and long-context categories. This suggests it should be much better at solving riddles or understanding complex questions. To test this, I gave the models a few riddles and quizzes. Here's one example riddle I gave as a prompt:

In our testing, all three services performed similarly.

However, we observed that Meta AI provided exact answers more often when solving complex math problems compared to the other options. Here's one instance of a functions and graphs question I asked all three models:

While the other chatbots have successfully solved even complex math problems, Meta AI was the only model that accurately answered this question and also provided detailed steps.
4. Conversational Skills
The biggest downside of Meta AI is its limited conversational ability. Meta focuses more on creating an open-source language model for developers rather than a consumer-focused AI chatbot. As a result, its tone is often bland and mechanical. On the other hand, Claude adopts a more human-like approach, and ChatGPT falls somewhere in between.

However, when it comes to remembering the context of a conversation, Meta AI and Claude stand out compared to ChatGPT. This becomes evident when providing a series of commands to the AI. While both Claude and Meta AI can follow all the instructions, ChatGPT often forgets older instructions or struggles to incorporate new ones properly.
5. Generating Speed
When it comes to speed, Meta AI undoubtedly takes the crown. Its 8B parameter variant is the fastest AI model, returning results in a split second, whether it's creating tables, finding information, or generating an email template. This 8B parameter model might be less capable at solving math or coding problems, but it is just as effective as models like GPT-3.5 Turbo or Gemini 1.5 Flash in many tasks.

I recommend using the Llama 3.1 8B variant on the Groq website, which focuses on delivering results as quickly as possible. Although there is no official data on output speed, Groq says it is around 450 tokens per second.
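Throughput figures like Groq's quoted ~450 tokens per second are easy to sanity-check yourself when a provider streams tokens: count tokens and divide by wall-clock time. A sketch of the timing logic (the streaming source here is a stand-in generator, not a real API client):

```python
# Measure token throughput of any iterable token stream.
# fake_stream is a stand-in for a real streaming API response.
import time

def measure_tokens_per_second(stream):
    """Consume a token stream; return (token_count, tokens_per_second)."""
    start = time.perf_counter()
    count = 0
    for _ in stream:
        count += 1
    elapsed = time.perf_counter() - start
    return count, count / elapsed if elapsed > 0 else float("inf")

def fake_stream(n_tokens=450, delay=0.0):
    """Yield placeholder tokens, optionally throttled to simulate latency."""
    for i in range(n_tokens):
        if delay:
            time.sleep(delay)
        yield f"tok{i}"
```

Swap `fake_stream(...)` for the token iterator your client library returns, and the same function reports the real-world rate you are actually getting.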
6. Running Locally Without Restrictions
As Llama is an open-source model, you can tweak or jailbreak it to generate results without censorship. More than the 405B and 70B parameter variants, I am excited about the 8B variant because it is so lightweight that I can even run it on my MacBook. However, output generation can slow down if you don't have enough RAM and VRAM on your laptop.

You can download the AI models directly from the Meta AI website. They provide you with the model weights, which you can interact with either from the Terminal using commands or by integrating them into your apps. Alternatively, you can download the Llama 3.1 models from the LM Studio app. This app lets you download open-source AI models, including Meta's Llama 3.1, and provides a chatbot interface to interact with them. This setup is completely local, and you can turn off the internet if you want to. By default, the model is not jailbroken and may not provide all answers without censoring. You can fine-tune the model if needed, but the process can be a bit technical.
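Besides the chat interface, LM Studio can also serve the downloaded model over an OpenAI-compatible local HTTP endpoint, which is how you would wire it into your own app. The sketch below builds such a request; the port (1234 is LM Studio's usual default) and the model identifier are assumptions that you should check against your own local setup:

```python
# Sketch of querying a locally hosted Llama 3.1 via an OpenAI-compatible
# endpoint such as LM Studio's local server. The base_url port and the
# model id below are assumptions; confirm them in your local setup.
import json
import urllib.request

def build_chat_request(prompt, model="llama-3.1-8b-instruct",
                       base_url="http://localhost:1234/v1"):
    """Build the URL and JSON payload for a local chat-completions call."""
    url = f"{base_url}/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return url, payload

def send(url, payload):
    """POST the payload; this only works while the local server is running."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Because the whole exchange stays on localhost, this works with the internet switched off, which is the main appeal of the local setup described above.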
Is Llama 3.1 Better Than Other Models?
Its 8B model is quite surprising with its speed, but aside from that, Llama 3.1 isn't better than GPT-4o or Claude 3.5 Sonnet in most aspects. However, Meta AI is free and open-source, unlike other models, which have a message limit per day. If you are a developer looking to incorporate AI into your app or website, Llama 3.1 is a better choice because it allows you to fine-tune the model, which isn't an option with other models at the moment.