DL in NLP@dlinnlp P.1710



AIF + DPO: Distilling Zephyr and friends
youtube.com/watch?v=cuObPxCOBCw&si

A great video from Sasha Rush on current approaches to LM alignment, specifically on how to turn a plain LM into a chatbot that solves your tasks well.

What's especially nice is that it covers how to do this within the current constraints of open source: without a large annotation team and with minimal compute (spoiler: of course it's still expensive, and it helps to have a stack of GPUs).

Short summary:
1. Start with a small seed dataset of high-quality dialogues
2. Use your model (or an API) to generate more dialogues
3. Use your model instead of humans to create and label a preference dataset
4. No RL, use DPO
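
To make step 4 a bit more concrete, here is a minimal sketch of the per-example DPO loss in plain Python. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions for illustration and are not taken from the talk or the Alignment Handbook. The inputs are assumed to be summed token log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_margin - rejected_margin))).

    All names and the default beta are illustrative assumptions.
    """
    # How much more (in log-prob) the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(x)) == log(1 + exp(-x)); branch on the sign to avoid overflow.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Policy identical to the reference: no preference learned yet.
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)

# Policy has shifted probability toward the chosen response.
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)
```

In this sketch `baseline` equals log 2, and `improved` is smaller because the policy favors the chosen response more strongly than the reference does. In practice you would average this loss over a batch and backpropagate through the policy log-probabilities only.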

I think this recipe will keep changing over the next year, but for now it sounds like a good summary of current best practices.

AIF + DPO: Distilling Zephyr and friends

Technical overview of the Zephyr model (https://arxiv.org/abs/2310.16944)
Code and Alignment Handbook: https://github.com/huggingface/alignment-handbook

This talk builds on many amazing Open LLM projects including:

Mistral: https://huggingface.co/mistralai/Mistral…


www.tgoop.com/dlinnlp/1710

6.59K views · Vlad Lialin, Dec 4, 2023 at 17:30


Create: 2023-12-04
Last Update: 2025-07-24 10:29:18
