AWESOMEDEEPLEARNING Telegram 230
How big do LLMs need to be to reason? 🤔 Microsoft released Orca 2 this week, a 13B Llama-based LLM trained on complex tasks and reasoning. 🧠 Orca 2's performance comes from training on synthetically generated data from bigger LLMs. I took a deeper look at the paper and extracted the implementation details and other insights.

๐—œ๐—บ๐—ฝ๐—น๐—ฒ๐—บ๐—ฒ๐—ป๐˜๐—ฎ๐˜๐—ถ๐—ผ๐—ป:
1๏ธโƒฃ Constructed a new dataset (Orca 2) with ~817K samples using prompts from FLAN, and GPT-4 to generate reasoning responses with the help of detailed system prompts.
2๏ธโƒฃ Grouped prompts into categories based on similarity to assign tailored system prompt that demonstrate different reasoning techniques.
3๏ธโƒฃ Replaced the original system prompt with a more generic one, to have the model learn the underlying reasoning strategy (Prompt erasing).
4๏ธโƒฃ Used progressive learning, starting with finetune Llama on FLAN-v2 (1 ep) , retrain on 5M ChatGPT data from Orca 1 (3 ep), combine 1M GPT-4 data from Orca 1 & 800k new Orca 2 data for final training (4 ep).

๐—œ๐—ป๐˜€๐—ถ๐—ด๐—ต๐˜๐˜€:
📊 Imitation learning can improve capabilities given enough data.
🔬 Reasoning through longer generations to reach the correct answer helps smaller models compete with bigger LLMs.
💫 Prompt erasing helped Orca 2 "learn" reasoning.
🎯 Lowest hallucination rate among comparable models on summarization.
⚙️ Used packing for training, concatenating multiple examples into one sequence.
👨‍🦯 Masked user & system inputs (the prompt) and computed the loss only on the generated tokens.
🖥 Trained on 32 A100 GPUs for 80 hours.
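The packing and loss-masking details can be sketched together. The -100 ignore index follows the common Hugging Face / PyTorch convention, and the token IDs and max length are toy values, not details from the paper:

```python
# Sketch of example packing with prompt masking, assuming the common
# convention that labels == -100 are excluded from the loss (as in
# PyTorch's CrossEntropyLoss ignore_index). Token IDs are toy values.

IGNORE_INDEX = -100
MAX_LEN = 12  # real training would use the model's context length

def pack_examples(examples, max_len=MAX_LEN):
    """Concatenate (prompt_ids, response_ids) pairs into sequences of at
    most max_len tokens; loss is computed only on response tokens."""
    input_ids, labels, packed = [], [], []
    for prompt_ids, response_ids in examples:
        ids = prompt_ids + response_ids
        lab = [IGNORE_INDEX] * len(prompt_ids) + response_ids  # mask prompt
        if input_ids and len(input_ids) + len(ids) > max_len:  # start new pack
            packed.append((input_ids, labels))
            input_ids, labels = [], []
        input_ids += ids
        labels += lab
    if input_ids:
        packed.append((input_ids, labels))
    return packed

packs = pack_examples([
    ([1, 2, 3], [4, 5]),          # (prompt, response)
    ([6, 7], [8, 9, 10]),
    ([11, 12, 13, 14], [15, 16]),
])
# First two examples fit in one 10-token pack; the third starts a new one.
```

Packing avoids wasting compute on padding, while the -100 labels ensure the model is never penalized for failing to predict its own prompt.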

Paper: https://huggingface.co/papers/2311.11045
Model: https://huggingface.co/microsoft/Orca-2-13b
โค5๐Ÿ‘1




By GenAi, Deep Learning and Computer Vision



