NoML Digest (@noml_digest), post 425

From A/B tests to multi-armed bandits and RL

Introductory posts on how multi-armed bandits and RL can help with A/B tests and controlled experiments:
📃 Multi-armed Bandits: an alternative to A/B testing, 2021 (5 minutes).
📃 Beyond A/B Testing: Multi-armed Bandit Experiments, 2019 (7 minutes).
📃 Supercharge your A/B Testing using Reinforcement Learning, 2021 (7 minutes).

An introduction to RL and Q-learning (QL) with a business case from marketing:
📃 How Reinforcement Learning Helps Retailers (in Russian; original title "Как Reinforcement Learning помогает ритейлерам"), 2020 (17 minutes).

On Thompson sampling and the ε-greedy strategy:
📃 Solving multiarmed bandits: A comparison of epsilon-greedy and Thompson sampling (a Russian translation is available), 2018 (12 minutes).
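
For a quick feel of the two strategies before opening the article, here is a minimal sketch on a Bernoulli bandit. It is not taken from the linked post: the function names (`run_bandit`, `epsilon_greedy`, `thompson_sampling`), the conversion rates and the optimistic start for untried arms are all assumptions made for illustration.

```python
import random


def run_bandit(choose_arm, true_rates, steps, seed=0):
    """Simulate a Bernoulli bandit; return per-arm win and loss counts."""
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    losses = [0] * len(true_rates)
    for _ in range(steps):
        arm = choose_arm(wins, losses, rng)
        # Reward is 1 with probability true_rates[arm], else 0.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses


def epsilon_greedy(epsilon):
    """Build a chooser: explore uniformly with probability epsilon, else exploit."""
    def choose(wins, losses, rng):
        n = len(wins)
        if rng.random() < epsilon:
            return rng.randrange(n)
        # Assumption: untried arms get an optimistic mean of 1.0 so each is tried.
        means = [w / (w + l) if w + l else 1.0 for w, l in zip(wins, losses)]
        return max(range(n), key=means.__getitem__)
    return choose


def thompson_sampling(wins, losses, rng):
    """Sample each arm's rate from its Beta(wins + 1, losses + 1) posterior."""
    samples = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
    return max(range(len(samples)), key=samples.__getitem__)


if __name__ == "__main__":
    rates = [0.05, 0.10, 0.50]  # hypothetical conversion rates per variant
    for name, chooser in [("epsilon-greedy", epsilon_greedy(0.1)),
                          ("Thompson", thompson_sampling)]:
        wins, losses = run_bandit(chooser, rates, steps=5000, seed=1)
        pulls = [w + l for w, l in zip(wins, losses)]
        print(name, "pulls per arm:", pulls)
```

In this sketch ε-greedy keeps spending a fixed share of traffic on random arms, while Thompson sampling explores less as its posteriors narrow; the article compares the two in more detail.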

A detailed introduction to bandits, the Multi-Armed Bandits series, 2020 (70-80 minutes):
📃 Part 1 Mathematical Framework and Terminology.
📃 Part 2 The Bandit Framework.
📃 Part 3 Bandit Algorithms.
📃 Part 4 The Upper Confidence Bound (UCB) Bandit Algorithm.
📃 Part 5 Thompson Sampling (a Russian translation is available).
📃 Part 6 A Comparison of Bandit Algorithms.

In depth on Q-learning, the Applied Reinforcement Learning series, 2022 (40-50 minutes):
📃 Part I: Q-Learning.
📃 Part II: Implementation of Q-Learning.
📃 Part III: Deep Q-Networks (DQN).
📃 Part IV: Implementation of DQN.
📃 Part V: Normalized Advantage Function (NAF) for Continuous Control.

In depth on RL, the Reinforcement Learning Made Simple series, 2020-2021 (70-80 minutes):
📃 Part 1: Intro to Basic Concepts and Terminology.
📃 Part 2: Solution Approaches.
📃 Part 3: Model-free solutions, step-by-step.
📃 Part 4: Q Learning, step-by-step.
📃 Part 5: Deep Q Networks, step-by-step.
📃 Part 6: Policy Gradients, step-by-step.

www.tgoop.com/noml_digest/425

1.91K views, Pavel Snurnitsyn, Jan 24, 2023 at 18:08

Create: 2023-01-24
Last Update: 2025-12-08 08:38:59
