Notice: file_put_contents(): Write of 14803 bytes failed with errno=28 No space left on device in /var/www/tgoop/post.php on line 50

Warning: file_put_contents(): Only 4096 of 18899 bytes written, possibly out of free disk space in /var/www/tgoop/post.php on line 50
Микросервисы / распределенные системы@microservices_arch P.515
MICROSERVICES_ARCH Telegram 515
Микросервисы / распределенные системы
Измерение и обучение - Встраивание механизмов измерения - Регулярный анализ данных - Проведение ретроспектив - Выявление возможностей улучшения - Непрерывное улучшение Основные метрики доступности: mean time between failures (MTBF) and mean time to recover…
Механизмы обеспечения Resilience

Обнаружение проблемы
- Health Checks - рекомендуются синтетические транзакции
- Watchdogs and Alerts(A watchdog is a piece of software whose only responsibility is to watch for a specific condition and then perform an action, usually creating some form of alert, in response.)
- monitoring (metrics, traces, and logs)Metrics are the numeric measurements tracked over time. Traces are sequences of related events that reveal how requests flow through the system. Logs are the timestamped records of events.
- observability(extending monitoring to provide insight into the internal state of the system to allow its failure modes to be better predicted and understood)
(we need to understand how the system works and the various states it can be in, and we must be able to correlate from the data we have collected to what state the system is in (or was in) and how it got there)

Изоляция проблемы
- Synchronous versus Asynchronous Communication: RPCs versus Messages
- Limit the scope of a failure- Bulkheads (по сути - не клади все яйца в одну корзину, тут все от отдельных потоков до зон доступности)
- Defaults and Caches

Защита компонентов системы от перегргузки
– Back Pressure(some sort of signaling back through the system so that the clients can tell that the servers are over- loaded and there is no point in sending more requests yet)
- Load Shedding(reject workload that can’t be processed or that would cause the system to become unstable)
– rate limiting usually defined in terms of the rate of requests arriving from a particular source (e.g., a client ID, a user, or an IP address) in a time period
- Timeouts(defines how long the caller will wait, and we need a mechanism to interrupt or notify the caller that the timeout has occurred so that it can abandon the request and perform whatever logic is needed to clean things up)
- Circuit Breakers(small, state machine–based proxy that sits in front of our service request code;)

Смягчение проблемы
- Data Consistency: Compensation(for any change to a database (a transaction), the caller has the ability to make another change that undoes the change (a compensating transaction))
- Data Availability: Replication(mitigate node failure by ensuring that the data from the failed node is immediately available on other nodes)
- Data Availability: Checks (checking the underlying storage mecha- nism’s integrity and checking the integrity of the system’s data) (tactics to use to mitigate data corruption are regular checking of database integrity, regular backup of the databases, and fix- ing corruption in place)
- Data Availability: Backups

Решение проблемы
👍3🔥31



tgoop.com/microservices_arch/515
Create:
Last Update:

Механизмы обеспечения Resilience

Обнаружение проблемы
- Health Checks - рекомендуются синтетические транзакции
- Watchdogs and Alerts(A watchdog is a piece of software whose only responsibility is to watch for a specific condition and then perform an action, usually creating some form of alert, in response.)
- monitoring (metrics, traces, and logs)Metrics are the numeric measurements tracked over time. Traces are sequences of related events that reveal how requests flow through the system. Logs are the timestamped records of events.
- observability(extending monitoring to provide insight into the internal state of the system to allow its failure modes to be better predicted and understood)
(we need to understand how the system works and the various states it can be in, and we must be able to correlate from the data we have collected to what state the system is in (or was in) and how it got there)

Изоляция проблемы
- Synchronous versus Asynchronous Communication: RPCs versus Messages
- Limit the scope of a failure- Bulkheads (по сути - не клади все яйца в одну корзину, тут все от отдельных потоков до зон доступности)
- Defaults and Caches

Защита компонентов системы от перегргузки
– Back Pressure(some sort of signaling back through the system so that the clients can tell that the servers are over- loaded and there is no point in sending more requests yet)
- Load Shedding(reject workload that can’t be processed or that would cause the system to become unstable)
– rate limiting usually defined in terms of the rate of requests arriving from a particular source (e.g., a client ID, a user, or an IP address) in a time period
- Timeouts(defines how long the caller will wait, and we need a mechanism to interrupt or notify the caller that the timeout has occurred so that it can abandon the request and perform whatever logic is needed to clean things up)
- Circuit Breakers(small, state machine–based proxy that sits in front of our service request code;)

Смягчение проблемы
- Data Consistency: Compensation(for any change to a database (a transaction), the caller has the ability to make another change that undoes the change (a compensating transaction))
- Data Availability: Replication(mitigate node failure by ensuring that the data from the failed node is immediately available on other nodes)
- Data Availability: Checks (checking the underlying storage mecha- nism’s integrity and checking the integrity of the system’s data) (tactics to use to mitigate data corruption are regular checking of database integrity, regular backup of the databases, and fix- ing corruption in place)
- Data Availability: Backups

Решение проблемы

BY Микросервисы / распределенные системы


Share with your friend now:
tgoop.com/microservices_arch/515

View MORE
Open in Telegram


Telegram News

Date: |

5Telegram Channel avatar size/dimensions More>> Ng, who had pleaded not guilty to all charges, had been detained for more than 20 months. His channel was said to have contained around 120 messages and photos that incited others to vandalise pro-government shops and commit criminal damage targeting police stations. The court said the defendant had also incited people to commit public nuisance, with messages calling on them to take part in rallies and demonstrations including at Hong Kong International Airport, to block roads and to paralyse the public transportation system. Various forms of protest promoted on the messaging platform included general strikes, lunchtime protests and silent sit-ins. Healing through screaming therapy
from us


Telegram Микросервисы / распределенные системы
FROM American