Flash attention in practice 🔥
PyTorch 2.0 has flash attention built in; here's how to use it:
1. Replace your attention op with torch.nn.functional.scaled_dot_product_attention
2. Use 16-bit floats (which you should be using for training anyway)
3. Make sure your head dim is a multiple of 8 and no larger than 128
See the git diff above for an example, and the sketch below.
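Below is a minimal sketch of what the swap looks like, assuming a standard multi-head self-attention module; the class name, shapes, and hyperparameters are illustrative, not taken from the actual diff attached to the post:

```python
import torch
import torch.nn.functional as F


class SelfAttention(torch.nn.Module):
    """Standard multi-head self-attention with the manual softmax replaced by SDPA."""

    def __init__(self, dim: int = 768, n_heads: int = 12):
        super().__init__()
        head_dim = dim // n_heads
        assert head_dim % 8 == 0 and head_dim <= 128  # step 3
        self.n_heads = n_heads
        self.qkv = torch.nn.Linear(dim, 3 * dim)
        self.proj = torch.nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, dim)
        b, s, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape to (batch, n_heads, seq, head_dim)
        q, k, v = [t.view(b, s, self.n_heads, d // self.n_heads).transpose(1, 2) for t in (q, k, v)]

        # Before: manual attention
        #   att = (q @ k.transpose(-2, -1)) / (q.size(-1) ** 0.5)
        #   y = att.softmax(dim=-1) @ v
        # After: one fused op; the flash kernel is dispatched automatically when inputs qualify
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)

        return self.proj(y.transpose(1, 2).reshape(b, s, d))


# 16-bit floats (step 2) via autocast; batch size and sequence length here are made up
model = SelfAttention().cuda()
x = torch.randn(8, 256, 768, device="cuda")
with torch.autocast("cuda", dtype=torch.bfloat16):
    out = model(x)  # (8, 256, 768)
```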
Results:
1. 2010 examples/sec ⟼ 2790 examples/sec, a ~40% speedup (8x4090 setup)
2. Memory: 22 GB ⟼ 16 GB at sequence length 256
3. Exactly the same model, no approximations
(In my case, a big chunk of the improvement also came at the cost of reducing softmax precision from fp32 to bf16, but to hell with that.)
Flash attention should yield even larger gains at longer sequence lengths.
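One way to confirm the flash kernel is actually being used (rather than the math fallback) is to temporarily disable the other SDPA backends; the call then errors out instead of silently falling back when your dtype or head dim doesn't qualify. This is a sketch against the PyTorch 2.0 API; newer releases expose the same switch as torch.nn.attention.sdpa_kernel.

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq, head_dim); bf16 and head_dim=64 satisfy the requirements above
q = torch.randn(8, 12, 256, 64, device="cuda", dtype=torch.bfloat16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Allow only the flash backend: if flash attention can't be dispatched for these
# inputs, this raises instead of quietly using a slower kernel.
with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

print(out.shape)  # torch.Size([8, 12, 256, 64])
```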