КПД@quant_prune_distill P.406


[Model page]

A couple of hours ago DeepSeek 🐳 released the DeepSeek-R1 weights publicly on Hugging Face 🤗!

As a reminder, this is a reasoning model, built to produce chains of thought in the style of o1, o1-mini, and o3.

The model has 685B parameters, and the weights are published in fp8-E4M3.
The architecture is almost identical to DeepSeek-V3.
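For readers unfamiliar with the format: E4M3 packs one sign bit, four exponent bits, and three mantissa bits into a single byte. Below is a minimal decoding sketch in Python. It assumes the common "FN" variant of E4M3 (exponent bias 7, no infinities, a single NaN bit pattern per sign); the post does not say which variant the checkpoint uses, and the function name is illustrative.

```python
import math


def decode_e4m3fn(byte: int) -> float:
    """Decode one fp8 E4M3 byte (assumed 'FN' variant: bias 7, no inf)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte in 0..255")

    sign = -1.0 if byte >> 7 else 1.0
    exponent = (byte >> 3) & 0xF  # 4 exponent bits
    mantissa = byte & 0x7         # 3 mantissa bits

    # In the FN variant, only exponent=1111 with mantissa=111 encodes NaN.
    if exponent == 0xF and mantissa == 0x7:
        return math.nan

    # Subnormals: no implicit leading one, fixed exponent of 1 - bias = -6.
    if exponent == 0:
        return sign * (mantissa / 8) * 2.0 ** -6

    # Normal numbers: implicit leading one, exponent bias of 7.
    return sign * (1 + mantissa / 8) * 2.0 ** (exponent - 7)


print(decode_e4m3fn(0x38))  # exponent field 7, mantissa 0
print(decode_e4m3fn(0x7E))  # largest finite value under this variant
```

With only three mantissa bits, each weight carries very little precision on its own, which is why fp8 checkpoints are normally paired with scaling factors.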

So, lucky owners of 8+1 H100s, enjoy :)
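The "8+1" quip follows from simple arithmetic: at one byte per fp8 weight, 685B parameters come to roughly 685 GB, slightly more than eight 80 GB cards can hold. A rough sketch in Python, assuming the 80 GB H100 variant and counting weights only (no KV cache, activations, or framework overhead); the function names are illustrative:

```python
import math


def weight_footprint_gb(num_params: float, bytes_per_param: float = 1.0) -> float:
    """Approximate size of the raw weights in GB (fp8 = 1 byte per parameter)."""
    return num_params * bytes_per_param / 1e9


def gpus_needed(footprint_gb: float, gpu_memory_gb: float = 80.0) -> int:
    """Minimum number of GPUs whose combined memory fits the weights alone."""
    return math.ceil(footprint_gb / gpu_memory_gb)


footprint = weight_footprint_gb(685e9)  # 685B parameters, as stated in the post
print(f"weights: ~{footprint:.0f} GB")
print(f"8 GPUs provide {8 * 80} GB; GPUs needed: {gpus_needed(footprint)}")
```

In practice the real requirement is higher, since inference also needs room for the KV cache and activations.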


www.tgoop.com/quant_prune_distill/406

1.88K views, edited Jan 20 at 08:52


Create: 2025-01-20
Last Update: 2025-08-28 22:11:42


BY КПД
