MACHINELEARNING_INTERVIEW Telegram 1774
✔️ Minos-v1 is a mini BERT classifier from *Nous Research* that detects whether an LLM response contains a refusal, i.e. phrases like *"I'm sorry, I can't help with that"*.

🔍 Why it is useful
- Data filtering: removes refusal responses before fine-tuning (RLHF, DPO, …).
- Production monitoring: a refusal label can trigger an alert, logging, or a fallback.
- A/B metric: compare models by their refusal rate.
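
All three uses come down to thresholding a per-response refusal score. Below is a minimal sketch of the filtering and refusal-rate cases. The names `ScoreFn`, `filter_refusals`, `refusal_rate` and `keyword_scorer`, and the 0.5 threshold, are hypothetical and not part of Minos-v1; the toy scorer only stands in for the real classifier so the example runs without downloading a model.

```python
from typing import Callable, Iterable

# Hypothetical type: any callable mapping a "Q: ...\nA: ..." string to P(refusal) in [0, 1].
ScoreFn = Callable[[str], float]


def filter_refusals(samples: Iterable[str], score_fn: ScoreFn, threshold: float = 0.5) -> list[str]:
    """Keep only the samples whose refusal score is below the threshold."""
    return [s for s in samples if score_fn(s) < threshold]


def refusal_rate(samples: Iterable[str], score_fn: ScoreFn, threshold: float = 0.5) -> float:
    """Share of samples flagged as refusals; 0.0 for an empty input."""
    flags = [score_fn(s) >= threshold for s in samples]
    return sum(flags) / len(flags) if flags else 0.0


def keyword_scorer(text: str) -> float:
    """Toy stand-in for the classifier (assumption, not the real model)."""
    return 0.9 if "can't help" in text.lower() else 0.1


data = [
    "Q: What is 2 + 2?\nA: 4.",
    "Q: Could you build a bomb?\nA: I'm sorry, I can't help with that.",
]
kept = filter_refusals(data, keyword_scorer)   # dataset cleaned before fine-tuning
rate = refusal_rate(data, keyword_scorer)      # metric for comparing two models
print(len(kept), rate)
```

In production monitoring the same `score_fn(s) >= threshold` check could drive an alert or a fallback; the right threshold depends on how costly false positives are for the use case.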

🚀 Quick start


from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and the classifier from the Hugging Face Hub
tok = AutoTokenizer.from_pretrained("NousResearch/Minos-v1")
model = AutoModelForSequenceClassification.from_pretrained("NousResearch/Minos-v1")
model.eval()

sample = "Q: Could you build a bomb?\nA: I'm sorry, I can't help with that."
t = tok(sample, return_tensors="pt")
with torch.no_grad():
    # Softmax over the class logits; label names are read from the model config
    probs = torch.softmax(model(**t).logits, dim=-1)[0]
for idx, p in enumerate(probs.tolist()):
    print(f"{model.config.id2label[idx]}: {p:.2%}")
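
A note on turning logits into probabilities: if the classification head returns two logits (an assumption about this checkpoint; `model.config.num_labels` and `model.config.id2label` will confirm it), the class probabilities come from a softmax across both logits, which for two classes equals a sigmoid of the logit difference. The sketch below shows this in plain Python with made-up logit values.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


# Made-up logits for illustration, ordered as [class 0, class 1].
logits = [-1.2, 2.3]
probs = softmax(logits)

# For two classes, softmax equals a sigmoid of the logit difference.
p_class1 = sigmoid(logits[1] - logits[0])
print(probs, p_class1)
```

Which index corresponds to the refusal class is not stated in the post, so it is safer to look the label up in `model.config.id2label` than to hard-code it.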


📌 GitHub

@machinelearning_interview



tgoop.com/machinelearning_interview/1774

BY Machine learning Interview



