FreeWilly1 and its successor FreeWilly2 are new open-access Large Language Models (LLMs) developed by Stability AI’s CarperAI team. Both models perform strongly on reasoning benchmarks across a range of metrics. FreeWilly1 was built on the original LLaMA 65B foundation model and fine-tuned with supervised fine-tuning (SFT) on data in the standard Alpaca format. FreeWilly2 builds on the LLaMA 2 70B base model and reaches performance on par with GPT-3.5 on some tasks.
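For readers unfamiliar with the Alpaca format, the following is a minimal sketch of how one instruction record can be rendered into a training string. The two templates are the ones published with the Stanford Alpaca project; whether the FreeWilly team used exactly this wording is not stated in the announcement, and the function name `format_alpaca` is a hypothetical helper introduced only for illustration.

```python
# Prompt templates as published with the Stanford Alpaca project.
# Assumption: FreeWilly's exact template wording is not given in the article.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def format_alpaca(record: dict) -> str:
    """Render one {instruction, input, output} record as an SFT training string.

    Hypothetical helper: the 'input' field is optional, and an empty or
    missing value selects the shorter template.
    """
    if record.get("input"):
        prompt = PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    else:
        prompt = PROMPT_NO_INPUT.format(instruction=record["instruction"])
    # The target completion is appended directly after the response marker.
    return prompt + record["output"]


if __name__ == "__main__":
    example = {
        "instruction": "Summarize the text in one sentence.",
        "input": "FreeWilly2 is fine-tuned from LLaMA 2 70B.",
        "output": "FreeWilly2 is a fine-tuned LLaMA 2 70B model.",
    }
    print(format_alpaca(example))
```

During fine-tuning, the loss is typically computed on the tokens after the `### Response:` marker, so the model learns to produce the answer rather than to repeat the prompt.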
The FreeWilly models’ training was heavily influenced by Microsoft’s approach, described in the paper “Orca: Progressive Learning from Complex Explanation Traces of GPT-4.” The team prompted language models with high-quality instructions to generate their own version of the dataset, which contains 600,000 data points (roughly 10% of the dataset size used in the original Orca work).
Using this method, the researchers generated 500,000 examples with a simpler LLM and an additional 100,000 with a more sophisticated LLM. They carefully filtered these datasets, removing examples that originated from evaluation benchmarks to ensure fair comparisons. The FreeWilly models’ strong results across multiple benchmarks, despite training on only a tenth of the sample size used in the original Orca paper, support the team’s approach to synthetically generated datasets.
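The article does not say how the benchmark-derived examples were detected. As one possible illustration, here is a minimal sketch that drops any training example whose instruction matches a benchmark question after simple normalization. Exact matching on normalized text is an assumption made for this sketch, not the team's documented method, and the names `normalize` and `decontaminate` are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    no_punct = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", no_punct).strip()


def decontaminate(examples: list[dict], benchmark_questions: list[str]) -> list[dict]:
    """Keep only examples whose instruction does not match a benchmark question.

    Assumption: matching is exact after normalization. Real pipelines often
    use fuzzier checks such as n-gram overlap.
    """
    blocked = {normalize(q) for q in benchmark_questions}
    return [ex for ex in examples if normalize(ex["instruction"]) not in blocked]


if __name__ == "__main__":
    train = [
        {"instruction": "What is the capital of France?", "output": "Paris"},
        {"instruction": "Explain photosynthesis briefly.", "output": "..."},
    ]
    benchmark = ["what is the capital of   france"]
    clean = decontaminate(train, benchmark)
    print(len(train), "->", len(clean))
```

Exact matching is cheap but misses paraphrased or partially overlapping items, which is why a production filter would likely need a looser similarity test.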
The researchers evaluated the models with EleutherAI’s lm-eval-harness, to which they added AGIEval. The results show that both FreeWilly models excel at intricate reasoning, at recognizing linguistic subtleties, and at solving difficult problems in specialized domains such as law and mathematics.
The team believes the two models advance natural language understanding and enable applications that were not previously possible. They hope to see the models put to innovative use across the AI community.
Check out the Reference Article and Project Page for FreeWilly1 and its successor FreeWilly2. All credit for this research goes to the researchers on this project.
Dhanshree Shenwai is a Computer Science Engineer with experience at FinTech companies in the financial, cards and payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone’s life easier.