Merge branch 'main' of https://github.com/KerradKerridi/prod

feat: add coverage test targets for Telegram bot and AnonBot in Makefile
Merge pull request #5 from KerradKerridi/dev-5
2026-02-01 22:31:08 +03:00 · 2026-02-01 22:31:03 +03:00 · 2026-01-25 22:26:16 +03:00 · 2026-01-25 22:24:12 +03:00 · 2026-01-25 20:51:27 +03:00 · 2026-01-25 20:43:12 +03:00
62 changed files with 25402 additions and 4599 deletions
--- a/.cursor/rules/my-custom-rule.mdc
+++ b/.cursor/rules/my-custom-rule.mdc
@@ -0,0 +1,409 @@
+---
+name: prod-project-rules
+description: Rules for working with the prod project - infrastructure, bots, CI/CD
+---
+
+# Rules for working with the prod project
+
+## 📋 Project overview
+
+**prod** is a project for managing Telegram bots and monitoring infrastructure in production.
+
+### Main components:
+- **Monitoring infrastructure**: Prometheus, Grafana, Alertmanager, Uptime Kuma
+- **Telegram bots**: telegram-helper-bot, AnonBot (in separate subdirectories)
+- **CI/CD**: GitHub Actions with automated testing, PR creation, and deployment
+- **Containerization**: Docker Compose for service orchestration
+
+---
+
+## 🌿 Branches and Git
+
+### Branch structure:
+- **`main`**: production branch, protected, changes only through PRs
+- **`develop`**: development branch (optional)
+- **`dev-*`**: development branches (for example, `dev-4`)
+- **`feature/**`**: branches for new features
+
+### Development workflow:
+
+1. **Create a development branch:**
+   ```bash
+   git checkout -b dev-4  # or feature/my-feature
+   ```
+
+2. **Before committing, check code quality:**
+   ```bash
+   make code-quality  # Checks formatting, imports, linting
+   # Or fix automatically:
+   make format        # Fix formatting
+   make import-fix    # Fix import sorting
+   ```
+
+3. **Commit and push:**
+   ```bash
+   git add .
+   git commit -m "feat: description of changes"
+   git push -u origin dev-4
+   ```
+
+4. **Automatic actions after push:**
+   - ✅ Checks run (Black, isort, flake8, pytest)
+   - ✅ If the tests pass, a PR into `main` is created or updated automatically
+   - ✅ A notification is sent to Telegram
+
+5. **After the PR is merged into `main`:**
+   - ✅ The production deploy starts automatically (`deploy.yml`)
+   - ✅ Bot tokens are validated
+   - ✅ The deploy to the server runs
+   - ✅ Health checks and smoke tests run
+   - ✅ If the smoke tests fail, an automatic rollback runs
+
+---
+
+## 🎨 Code standards
+
+### Formatting (Black):
+- **Required**: All Python files must be formatted with Black
+- **Check**: `make format-check` or `black --check .`
+- **Fix**: `make format` or `black .`
+- **Rules**:
+  - Double quotes `"` instead of single quotes `'`
+  - 2 blank lines between imports and function/class definitions
+  - Automatic wrapping of long lines
+
+### Import sorting (isort):
+- **Required**: Imports must be sorted
+- **Check**: `make import-check` or `isort --check-only .`
+- **Fix**: `make import-fix` or `isort .`
+- **Order**: standard library → third-party → local
+
+### Linting (flake8):
+- **Critical errors** (E9, F63, F7, F82): block the pipeline
+- **Warnings** (F821, F822, F824): ignored in CI
+- **Check**: `make lint-check`
+- **Exclusions**: `.venv`, `venv`, `__pycache__`, `.git`
+
+### Before committing:
+```bash
+make code-quality  # Checks everything at once
+```
+
+---
+
+## 🧪 Testing
+
+### Test layout:
+- **`tests/`**: project infrastructure tests
+- **`bots/*/tests/`**: bot tests (in their own repositories)
+
+### Running tests:
+```bash
+make test              # All tests
+make test-infra        # Infrastructure tests only
+make test-coverage     # With a coverage report
+make test-clean        # Clear cache and reports
+```
+
+### pytest configuration:
+- File: `pytest.ini` in the project root
+- Automatic test discovery in `tests/`
+- Markers: `slow`, `integration`, `unit`
+- Asyncio mode: automatic
+
+### Rules for writing tests:
+- Use descriptive names: `test_prometheus_config_is_valid`
+- Group related tests into classes
+- Use fixtures for shared setup/teardown
+- Tests must be independent and idempotent
+
+---
+
+## 🐳 Docker and containerization
+
+### Layout:
+- **`docker-compose.yml`**: main orchestration file
+- **`Dockerfile`**: base image (if needed)
+- **`bots/*/Dockerfile`**: Dockerfile for each bot
+
+### Services in docker-compose:
+- `prometheus`: metrics collection (port 9090)
+- `grafana`: dashboards (port 3000)
+- `alertmanager`: alert management (port 9093)
+- `uptime-kuma`: availability monitoring (port 3001)
+- `telegram-bot`: Telegram Helper Bot (port 8080)
+- `anon-bot`: AnonBot (port 8081)
+
+### Commands:
+```bash
+make build            # Build all containers
+make up               # Start all services
+make down             # Stop all services
+make restart          # Restart all services
+make logs             # Logs of all services
+make logs-bot         # Telegram bot logs
+```
+
+### Important rules:
+- **Bot tokens**: Taken from GitHub Secrets via environment variables
+- **Format**: `TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN:-${BOT_TOKEN}}` (Secrets take priority)
+- **Build**: Use `docker-compose build --pull` (not `--no-cache`) for faster builds
+- **Graceful shutdown**: `docker-compose down -t 30` for a clean stop
+
+---
+
+## 🔐 Security and secrets
+
+### GitHub Secrets (required):
+- `TELEGRAM_BOT_TOKEN`: Telegram Helper Bot token
+- `TELEGRAM_TEST_BOT_TOKEN`: test bot token (optional)
+- `ANON_BOT_TOKEN`: AnonBot token
+- `SSH_PRIVATE_KEY`: private key for SSH access to the server
+- `SERVER_HOST`, `SERVER_USER`, `SSH_PORT`: server connection details
+- `TELEGRAM_CHAT_ID`: chat ID for notifications
+
+### Local development:
+- Use `.env` files for local variables
+- `.env` files are in `.gitignore`; never commit them!
+- Tokens from Secrets take priority over `.env` in production
+
+### Rules:
+- ❌ **Do NOT commit** tokens, passwords, secrets
+- ❌ **Do NOT commit** `.env` files
+- ✅ Use `env.template` as a template
+- ✅ Keep all secrets in GitHub Secrets
+
+---
+
+## 🚀 CI/CD Pipeline
+
+### Two main workflows:
+
+#### 1. `pipeline.yml` (CI):
+- **Trigger**: Push to `main`, `develop`, `dev-*`, `feature/**`
+- **Jobs**:
+  - `test`: code quality checks and tests
+  - `create-pr`: automatic PR creation/update (only for `dev-*` and `feature/**`)
+  - `rollback`: manual rollback via `workflow_dispatch`
+
+#### 2. `deploy.yml` (CD):
+- **Trigger**: PR merged into `main`
+- **Jobs**:
+  - `deploy`: deploy to the server
+  - `smoke-tests`: checks that the bots work
+  - `auto-rollback`: automatic rollback when smoke tests fail
+
+### Rules for working with the pipeline:
+
+1. **Before push**: always run `make code-quality` locally
+2. **After the tests pass**: the PR is created or updated automatically
+3. **After the PR is merged**: the deploy starts automatically
+4. **If something goes wrong**: use the manual rollback via Actions → Run workflow
+
+---
+
+## 📁 Project structure
+
+```
+prod/
+├── .github/workflows/     # CI/CD pipelines
+│   ├── pipeline.yml       # CI: tests, PR creation
+│   └── deploy.yml         # CD: deploy, smoke tests, rollback
+├── bots/                  # Bots directory (submodules)
+│   ├── telegram-helper-bot/  # Telegram Helper Bot
+│   └── AnonBot/              # AnonBot
+├── infra/                 # Infrastructure
+│   ├── prometheus/        # Prometheus configuration
+│   ├── grafana/          # Grafana dashboards and provisioning
+│   ├── alertmanager/      # Alertmanager configuration
+│   ├── nginx/            # Nginx configuration
+│   └── ansible/          # Ansible playbooks
+├── scripts/              # Deployment scripts
+├── tests/                # Infrastructure tests
+│   └── infra/           # Infrastructure tests
+├── docker-compose.yml    # Docker Compose configuration
+├── Makefile             # Project management commands
+├── pytest.ini           # pytest configuration
+└── README.md            # Project documentation
+```
+
+### Important files:
+- **`docker-compose.yml`**: main service configuration
+- **`Makefile`**: development and management commands
+- **`pytest.ini`**: test configuration
+- **`.gitignore`**: excludes `.env`, `.venv`, logs, cache
+
+---
+
+## 🔧 Development
+
+### Local setup:
+
+1. **Clone and configure:**
+   ```bash
+   git clone <repo>
+   cd prod
+   cp env.template .env
+   # Edit .env with your local values
+   ```
+
+2. **Install dependencies:**
+   ```bash
+   python3 -m venv .venv
+   source .venv/bin/activate
+   pip install black isort flake8 pytest
+   ```
+
+3. **Check before committing:**
+   ```bash
+   make code-quality
+   ```
+
+### Working with bots:
+
+- Bots live in `bots/` as separate repositories (submodules or clones)
+- Each bot has its own Dockerfile
+- Bot tokens are passed through environment variables in docker-compose
+
+---
+
+## 📝 Commits and PRs
+
+### Commit format:
+Use clear messages:
+```
+feat: added feature X
+fix: fixed bug Y
+chore: updated dependencies
+docs: updated documentation
+refactor: refactored module Z
+```
+
+### Pull Request:
+- **Automatic creation**: For `dev-*` and `feature/**` branches after the tests pass
+- **Update**: The PR is updated automatically on new commits to the same branch
+- **Merge**: A merge into `main` triggers the automatic deploy
+
+---
+
+## 🚨 Deploy and rollback
+
+### Automatic deploy:
+1. PR merged into `main` → `deploy.yml` starts
+2. Bot token validation
+3. Deploy to the server (SSH)
+4. Container rebuild with `--pull`
+5. Health checks with exponential retry
+6. Smoke tests (sending messages to Telegram)
+7. On success, the deploy history is updated
+
+### Automatic rollback:
+- Triggered when smoke tests fail
+- Rolls back to the last successful commit in the history
+- Containers are rebuilt
+- Health checks are verified
+
+### Manual rollback:
+- Actions → CI & CD pipeline → Run workflow
+- Choose `rollback` and optionally provide a commit hash
+- If no commit is given, the last successful deploy is used
+
+---
+
+## 🛠️ Useful Makefile commands
+
+### Code quality:
+```bash
+make code-quality    # All checks (Black, isort, flake8)
+make format         # Auto-fix formatting
+make import-fix     # Auto-fix imports
+make format-diff    # Show what would change
+```
+
+### Docker:
+```bash
+make build          # Build containers
+make up             # Start services
+make down           # Stop services
+make restart        # Restart
+make logs           # Logs of all services
+make logs-bot       # Bot logs
+```
+
+### Testing:
+```bash
+make test           # All tests
+make test-infra     # Infrastructure tests
+make test-coverage  # With coverage
+make test-clean     # Clear cache
+```
+
+### Monitoring:
+```bash
+make monitoring     # Open Grafana
+make prometheus     # Open Prometheus
+make status         # Container status
+make health         # Health checks
+```
+
+---
+
+## ⚠️ Important notes
+
+### Do NOT:
+- ❌ Commit `.env` files with secrets
+- ❌ Commit bot tokens in code
+- ❌ Use `docker-compose build --no-cache` unless necessary
+- ❌ Push to `main` directly (only through PRs)
+- ❌ Ignore formatting errors before committing
+
+### Always:
+- ✅ Run `make code-quality` before committing
+- ✅ Use `dev-*` or `feature/**` branches for development
+- ✅ Check that the tests pass locally
+- ✅ Use GitHub Secrets for tokens in production
+- ✅ Check the logs after a deploy
+
+---
+
+## 📚 Additional resources
+
+- **README.md**: main project documentation
+- **`.cursor/rules/release-notes-template.md`**: template for Release Notes
+- **`pytest.ini`**: test configuration
+- **`Makefile`**: all available commands (`make help`)
+
+---
+
+## 🔄 Workflow diagram
+
+```
+1. Create a branch (dev-* or feature/**)
+   ↓
+2. Development + local checks (make code-quality)
+   ↓
+3. Git commit + push
+   ↓
+4. GitHub Actions: automated tests
+   ↓
+5. On success: automatic PR creation/update
+   ↓
+6. Manual review and merge of the PR into main
+   ↓
+7. GitHub Actions: automatic deploy
+   ↓
+8. Health checks + Smoke tests
+   ↓
+9. On success: ✅ Deploy complete
+   On failure: 🔄 Automatic rollback
+```
+
+---
+
+## 💡 Tips
+
+1. **Use the Makefile**: all commands live there, so there is no need to memorize long commands
+2. **Check locally**: run `make code-quality` before every commit
+3. **Watch the notifications**: Telegram notifications show the deploy status
+4. **Use the right branches**: `dev-*` for automatic PR creation
+5. **Read the logs**: when something breaks, check the logs in GitHub Actions and on the server
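Note on the token format in `my-custom-rule.mdc`: the expression `${TELEGRAM_BOT_TOKEN:-${BOT_TOKEN}}` follows standard shell parameter expansion, where the first variable wins when it is set and non-empty. Compose files use a similar `${VAR:-default}` syntax, though whether nested defaults are accepted may depend on the Compose version. A minimal sketch of how the expansion resolves, assuming a hypothetical helper `resolve_token` and placeholder values that are not part of the repository:

```shell
#!/usr/bin/env bash
# Sketch only: resolve_token is a hypothetical helper, not part of the repository.
# ${A:-B} expands to A when A is set and non-empty, otherwise to B.
resolve_token() {
  local primary="$1"   # e.g. a value injected from GitHub Secrets
  local fallback="$2"  # e.g. BOT_TOKEN from a local .env file
  printf '%s\n' "${primary:-${fallback}}"
}

resolve_token "secret-token" "env-token"   # prints: secret-token
resolve_token ""             "env-token"   # prints: env-token
```

An empty primary value falls through to the fallback, so an unset secret would silently pick up the `.env` value rather than fail.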
--- a/.cursor/rules/release-notes-template.md
+++ b/.cursor/rules/release-notes-template.md
@@ -0,0 +1,124 @@
+# Guide to writing Release Notes
+
+## Purpose
+This document describes the structure and format for Release Notes files (for example, `docs/RELEASE_NOTES_DEV-XX.md`).
+
+## Document structure
+
+### 1. Title
+```markdown
+# Release Notes: [branch-name]
+```
+
+### 2. Overview
+A short paragraph (1-2 sentences) describing:
+- The number of commits in the branch
+- The main areas of change
+
+**Format:**
+```markdown
+## Overview
+Branch [name] contains [N] commits with key improvements: [brief list of the main changes].
+```
+
+### 3. Key changes
+The main section, with a numbered subsection for each significant change.
+
+**Structure of each subsection:**
+```markdown
+### [Number]. [Change title]
+
+**Commit:** `[hash]`
+
+**What was done:**
+- [Short description of change 1]
+- [Short description of change 2]
+- [Short description of change 3]
+```
+
+**Rules:**
+- Each change = a separate subsection
+- The title must be short and clear
+- Use bulleted lists in the "What was done" section
+- Do NOT list the affected files
+- Do NOT include line-count statistics
+- Focus on the substance of the changes, not on technical details
+- Separate subsections with a horizontal rule `---`
+
+### 4. Main achievements
+A checklist section that sums up the release.
+
+**Format:**
+```markdown
+## 🎯 Main achievements
+
+✅ [Achievement 1]  
+✅ [Achievement 2]  
+✅ [Achievement 3]  
+```
+
+**Rules:**
+- Use the ✅ emoji for each achievement
+- Each achievement on its own line
+- Short wording (3-5 words)
+- Focus on key features and improvements
+
+### 5. Development timeline
+A section with information about the development schedule.
+
+**Format:**
+```markdown
+## 📅 Development timeline
+
+**Latest changes:** [date]  
+**Main development:** [period]  
+**Previous improvements:** [context of previous branches/changes]
+
+**Commit chronology:**
+- `[hash]` - [date and time] - [short description]
+- `[hash]` - [date and time] - [short description]
+```
+
+**Rules:**
+- Use real dates from the commits
+- Date format: "DD month YYYY" (for example, "25 January 2026")
+- For time, use the "HH:MM" format
+- The chronology must be in chronological order (oldest to newest)
+
+## Writing style
+
+### General rules:
+- **Brevity**: Focus on the substance, avoid unnecessary detail
+- **Clarity**: Use simple, clear wording
+- **Structure**: The information must be easy to read and scan
+- **No technical details**: Do not list files, classes, or methods (unless they are a key feature)
+- **No statistics**: Do not include counts of code lines, files, and so on
+
+### Language:
+- Use the past tense to describe changes ("Added", "Implemented", "Updated")
+- Avoid technical jargon unless it is necessary
+- Use the active voice
+
+### Emoji:
+- 🔥 for the "Key changes" section
+- 🎯 for the "Main achievements" section
+- 📅 for the "Development timeline" section
+- ✅ for achievement checkmarks
+
+## Usage example
+
+When creating Release Notes for a new branch:
+
+1. Get the list of commits: `git log [base-branch]..[target-branch] --oneline`
+2. For each significant commit, create a subsection in "Key changes"
+3. Collect the main achievements in the "Main achievements" section
+4. Add a timeline with the real commit dates
+5. Check that the document follows the structure and style
+
+## Important notes
+
+- **Do NOT include** information about commits that were already in the base branch (master/main)
+- **Do NOT list** every file that was changed
+- **Do NOT include** line-count statistics
+- **Focus** on functional changes, not on technical implementation details
+- Use the **real dates** from the commits, not estimated ones
--- a/.dockerignore
+++ b/.dockerignore
@@ -0,0 +1,77 @@
+# .dockerignore in the prod/ directory
+
+# Ignore ALL bots: they are not needed in this container
+bots/
+
+# Ignore bot logs
+bots/*/logs/
+bots/*/logs/**
+
+# Ignore ALL hidden files and directories (except .gitignore)
+.*
+!.gitignore
+
+# Other standard exclusions
+Dockerfile
+docker-compose.yml
+docker-compose.*.yml
+README.md
+LICENSE
+.env
+.dockerignore
+
+__pycache__
+*.pyc
+*.pyo
+*.pyd
+.pytest_cache
+.coverage
+htmlcov/
+
+*.log
+logs/
+
+venv/
+.venv/
+env/
+.env/
+requirements-dev.txt
+
+tests/
+test/
+docs/
+doc/
+
+.vscode/
+.idea/
+
+data/
+*.bin
+*.dat
+*.model
+
+# Ignore Docker volume data
+/var/lib/docker/volumes/
+uptime_kuma_data/
+prometheus_data/
+grafana_data/
+alertmanager_data/
+
+# Ignore temporary Docker files
+.docker/
+
+# Ignore databases and data files
+*.db
+*.db-shm
+*.db-wal
+*.sqlite
+*.sqlite3
+
+# Ignore backup files
+*.backup
+*.bak
+*.old
+
+# Ignore migration files and temporary scripts
+migration.log
+fix_dates.log
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -0,0 +1,99 @@
+name: CI pipeline
+
+on:
+  push:
+    branches: [ 'dev-*', 'feature/**' ]
+  workflow_dispatch:
+    inputs:
+      action:
+        description: 'Action to perform'
+        required: true
+        type: choice
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    name: Test & Code Quality
+    
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+      
+      - name: Set up Python 3.11
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+          cache: 'pip'
+      
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r tests/infra/requirements-test.txt
+          pip install flake8 black isort mypy || true
+      
+      - name: Code formatting check (Black)
+        run: |
+          echo "🔍 Checking code formatting with Black..."
+          black --check . || (echo "❌ Code formatting issues found. Run 'black .' to fix." && exit 1)
+      
+      - name: Import sorting check (isort)
+        run: |
+          echo "🔍 Checking import sorting with isort..."
+          isort --check-only . || (echo "❌ Import sorting issues found. Run 'isort .' to fix." && exit 1)
+      
+      - name: Linting (flake8) - Critical errors
+        run: |
+          echo "🔍 Running flake8 linter (critical errors only)..."
+          flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
+      
+      - name: Linting (flake8) - Warnings
+        run: |
+          echo "🔍 Running flake8 linter (warnings)..."
+          flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics || true
+        continue-on-error: true
+      
+      - name: Run infrastructure tests
+        run: |
+          python -m pytest tests/infra/ -v --tb=short
+      
+      - name: Validate Prometheus config
+        run: |
+          python -m pytest tests/infra/test_prometheus_config.py -v
+      
+      - name: Send test success notification
+        if: success()
+        uses: appleboy/telegram-action@v1.0.0
+        with:
+          to: ${{ secrets.TELEGRAM_CHAT_ID }}
+          token: ${{ secrets.TELEGRAM_BOT_TOKEN }}
+          message: |
+            ✅ CI Tests Passed
+            
+            📦 Repository: prod
+            🌿 Branch: ${{ github.ref_name }}
+            📝 Commit: ${{ github.sha }}
+            👤 Author: ${{ github.actor }}
+            
+            ✅ All tests passed! Code quality checks completed successfully.
+            
+            🔗 View details: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
+        continue-on-error: true
+      
+      - name: Send test failure notification
+        if: failure()
+        uses: appleboy/telegram-action@v1.0.0
+        with:
+          to: ${{ secrets.TELEGRAM_CHAT_ID }}
+          token: ${{ secrets.TELEGRAM_BOT_TOKEN }}
+          message: |
+            ❌ CI Tests Failed
+            
+            📦 Repository: prod
+            🌿 Branch: ${{ github.ref_name }}
+            📝 Commit: ${{ github.sha }}
+            👤 Author: ${{ github.actor }}
+            
+            ❌ Tests failed! Deployment blocked. Please fix the issues and try again.
+            
+            🔗 View details: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
+        continue-on-error: true
--- a/.github/workflows/deploy.yml
+++ b/.github/workflows/deploy.yml
@@ -0,0 +1,276 @@
+name: Deploy to Production
+
+on:
+  push:
+    branches: [ main ]
+  workflow_dispatch:
+    inputs:
+      action:
+        description: 'Action to perform'
+        required: true
+        type: choice
+        options:
+          - deploy
+          - rollback
+      rollback_commit:
+        description: 'Commit hash to rollback to (optional, uses last successful if empty)'
+        required: false
+        type: string
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    name: Deploy to Production
+    if: |
+      github.event_name == 'push' || 
+      (github.event_name == 'workflow_dispatch' && github.event.inputs.action == 'deploy')
+    concurrency:
+      group: production-deploy
+      cancel-in-progress: false
+    environment:
+      name: production
+    
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          ref: main
+      
+      - name: Deploy to server
+        uses: appleboy/ssh-action@v1.0.0
+        with:
+          host: ${{ vars.SERVER_HOST || secrets.SERVER_HOST }}
+          username: ${{ vars.SERVER_USER || secrets.SERVER_USER }}
+          key: ${{ secrets.SSH_PRIVATE_KEY }}
+          port: ${{ vars.SSH_PORT || secrets.SSH_PORT || 22 }}
+          script: |
+            set -e
+            export TELEGRAM_BOT_TOKEN="${{ secrets.TELEGRAM_BOT_TOKEN }}"
+            export TELEGRAM_TEST_BOT_TOKEN="${{ secrets.TELEGRAM_TEST_BOT_TOKEN }}"
+            export ANON_BOT_TOKEN="${{ secrets.ANON_BOT_TOKEN }}"
+            
+            echo "🚀 Starting deployment to production..."
+            
+            cd /home/prod
+            
+            # Save commit information
+            CURRENT_COMMIT=$(git rev-parse HEAD)
+            COMMIT_MESSAGE=$(git log -1 --pretty=format:"%s" || echo "Unknown")
+            COMMIT_AUTHOR=$(git log -1 --pretty=format:"%an" || echo "Unknown")
+            TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")
+            
+            echo "📝 Current commit: $CURRENT_COMMIT"
+            echo "📝 Commit message: $COMMIT_MESSAGE"
+            echo "📝 Author: $COMMIT_AUTHOR"
+            
+            # Write to the deploy history
+            HISTORY_FILE="/home/prod/.deploy_history.txt"
+            HISTORY_SIZE="${DEPLOY_HISTORY_SIZE:-10}"
+            echo "${TIMESTAMP}|${CURRENT_COMMIT}|${COMMIT_MESSAGE}|${COMMIT_AUTHOR}|deploying" >> "$HISTORY_FILE"
+            tail -n "$HISTORY_SIZE" "$HISTORY_FILE" > "${HISTORY_FILE}.tmp" && mv "${HISTORY_FILE}.tmp" "$HISTORY_FILE"
+            
+            # Update the code
+            echo "📥 Pulling latest changes from main..."
+            sudo chown -R deploy:deploy /home/prod/bots || true
+            git fetch origin main
+            git reset --hard origin/main
+            sudo chown -R deploy:deploy /home/prod/bots || true
+            
+            NEW_COMMIT=$(git rev-parse HEAD)
+            echo "✅ Code updated: $CURRENT_COMMIT → $NEW_COMMIT"
+            
+            # Validate docker-compose
+            echo "🔍 Validating docker-compose configuration..."
+            docker-compose config > /dev/null || exit 1
+            echo "✅ docker-compose.yml is valid"
+            
+            # Check disk space
+            MIN_FREE_GB=5
+            AVAILABLE_SPACE=$(df -BG /home/prod 2>/dev/null | tail -1 | awk '{print $4}' | sed 's/G//' || echo "0")
+            echo "💾 Available disk space: ${AVAILABLE_SPACE}GB"
+            
+            if [ "$AVAILABLE_SPACE" -lt "$MIN_FREE_GB" ]; then
+              echo "⚠️  Insufficient disk space! Cleaning up Docker resources..."
+              docker system prune -f --volumes || true
+            fi
+            
+            # Build and start containers (excluding bots to speed up the deploy)
+            echo "🔨 Rebuilding infrastructure containers (excluding bots)..."
+            docker-compose stop prometheus grafana uptime-kuma alertmanager || true
+            
+            export TELEGRAM_BOT_TOKEN TELEGRAM_TEST_BOT_TOKEN ANON_BOT_TOKEN
+            docker-compose build --pull prometheus grafana uptime-kuma alertmanager
+            docker-compose up -d prometheus grafana uptime-kuma alertmanager
+            
+            echo "✅ Infrastructure containers rebuilt and started (bots remain running)"
+      
+      - name: Update deploy history
+        if: always()
+        uses: appleboy/ssh-action@v1.0.0
+        with:
+          host: ${{ vars.SERVER_HOST || secrets.SERVER_HOST }}
+          username: ${{ vars.SERVER_USER || secrets.SERVER_USER }}
+          key: ${{ secrets.SSH_PRIVATE_KEY }}
+          port: ${{ vars.SSH_PORT || secrets.SSH_PORT || 22 }}
+          script: |
+            HISTORY_FILE="/home/prod/.deploy_history.txt"
+            
+            if [ -f "$HISTORY_FILE" ]; then
+              DEPLOY_STATUS="failed"
+              if [ "${{ job.status }}" = "success" ]; then
+                DEPLOY_STATUS="success"
+              fi
+              
+              sed -i '$s/|deploying$/|'"$DEPLOY_STATUS"'/' "$HISTORY_FILE"
+              echo "✅ Deploy history updated: $DEPLOY_STATUS"
+            fi
+      
+      - name: Send deployment notification
+        if: always()
+        uses: appleboy/telegram-action@v1.0.0
+        with:
+          to: ${{ secrets.TELEGRAM_CHAT_ID }}
+          token: ${{ secrets.TELEGRAM_BOT_TOKEN }}
+          message: |
+            ${{ job.status == 'success' && '✅' || '❌' }} Deployment: ${{ job.status }}
+            
+            📦 Repository: prod
+            🌿 Branch: main
+            📝 Commit: ${{ github.event.pull_request.merge_commit_sha || github.sha }}
+            👤 Author: ${{ github.event.pull_request.user.login || github.actor }}
+            ${{ github.event.pull_request.number && format('🔀 PR: #{0}', github.event.pull_request.number) || '' }}
+            
+            ${{ job.status == 'success' && '✅ Deployment successful! Containers started.' || '❌ Deployment failed! Check logs for details.' }}
+            
+            🔗 View details: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
+        continue-on-error: true
+
+  rollback:
+    runs-on: ubuntu-latest
+    name: Rollback to Previous Version
+    if: |
+      github.event_name == 'workflow_dispatch' && 
+      github.event.inputs.action == 'rollback'
+    environment:
+      name: production
+    
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          ref: main
+      
+      - name: Rollback on server
+        uses: appleboy/ssh-action@v1.0.0
+        with:
+          host: ${{ vars.SERVER_HOST || secrets.SERVER_HOST }}
+          username: ${{ vars.SERVER_USER || secrets.SERVER_USER }}
+          key: ${{ secrets.SSH_PRIVATE_KEY }}
+          port: ${{ vars.SSH_PORT || secrets.SSH_PORT || 22 }}
+          script: |
+            set -e
+            export TELEGRAM_BOT_TOKEN="${{ secrets.TELEGRAM_BOT_TOKEN }}"
+            export TELEGRAM_TEST_BOT_TOKEN="${{ secrets.TELEGRAM_TEST_BOT_TOKEN }}"
+            export ANON_BOT_TOKEN="${{ secrets.ANON_BOT_TOKEN }}"
+            
+            echo "🔄 Starting rollback..."
+            
+            cd /home/prod
+            
+            # Determine the commit to roll back to
+            ROLLBACK_COMMIT="${{ github.event.inputs.rollback_commit }}"
+            HISTORY_FILE="/home/prod/.deploy_history.txt"
+            
+            if [ -z "$ROLLBACK_COMMIT" ]; then
+              echo "📝 No commit specified, finding last successful deploy..."
+              if [ -f "$HISTORY_FILE" ]; then
+                ROLLBACK_COMMIT=$(grep "|success$" "$HISTORY_FILE" | tail -1 | cut -d'|' -f2 || echo "")
+              fi
+              
+              if [ -z "$ROLLBACK_COMMIT" ]; then
+                echo "❌ No successful deploy found in history!"
+                echo "💡 Please specify commit hash manually or check deploy history"
+                exit 1
+              fi
+            fi
+            
+            echo "📝 Rolling back to commit: $ROLLBACK_COMMIT"
+            
+            # Check that the commit exists
+            if ! git cat-file -e "$ROLLBACK_COMMIT" 2>/dev/null; then
+              echo "❌ Commit $ROLLBACK_COMMIT not found!"
+              exit 1
+            fi
+            
+            # Save the current commit
+            CURRENT_COMMIT=$(git rev-parse HEAD)
+            COMMIT_MESSAGE=$(git log -1 --pretty=format:"%s" "$ROLLBACK_COMMIT" || echo "Rollback")
+            TIMESTAMP=$(date +"%Y-%m-%d %H:%M:%S")
+            
+            echo "📝 Current commit: $CURRENT_COMMIT"
+            echo "📝 Target commit: $ROLLBACK_COMMIT"
+            echo "📝 Commit message: $COMMIT_MESSAGE"
+            
+            # Fix permissions before the rollback
+            sudo chown -R deploy:deploy /home/prod/bots || true
+            
+            # Roll back the code
+            echo "🔄 Rolling back code..."
+            git fetch origin main
+            git reset --hard "$ROLLBACK_COMMIT"
+            
+            # Fix permissions after the rollback
+            sudo chown -R deploy:deploy /home/prod/bots || true
+            
+            echo "✅ Code rolled back: $CURRENT_COMMIT → $ROLLBACK_COMMIT"
+            
+            # Validate docker-compose
+            echo "🔍 Validating docker-compose configuration..."
+            docker-compose config > /dev/null || exit 1
+            echo "✅ docker-compose.yml is valid"
+            
+            # Check disk space
+            MIN_FREE_GB=5
+            AVAILABLE_SPACE=$(df -BG /home/prod 2>/dev/null | tail -1 | awk '{print $4}' | sed 's/G//' || echo "0")
+            echo "💾 Available disk space: ${AVAILABLE_SPACE}GB"
+            
+            if [ "$AVAILABLE_SPACE" -lt "$MIN_FREE_GB" ]; then
+              echo "⚠️  Insufficient disk space! Cleaning up Docker resources..."
+              docker system prune -f --volumes || true
+            fi
+            
+            # Rebuild and start containers (excluding bots to speed up the rollback)
+            echo "🔨 Rebuilding infrastructure containers (excluding bots)..."
+            docker-compose stop prometheus grafana uptime-kuma alertmanager || true
+            
+            export TELEGRAM_BOT_TOKEN TELEGRAM_TEST_BOT_TOKEN ANON_BOT_TOKEN
+            docker-compose build --pull prometheus grafana uptime-kuma alertmanager
+            docker-compose up -d prometheus grafana uptime-kuma alertmanager
+            
+            echo "✅ Infrastructure containers rebuilt and started (bots remain running)"
+            
+            # Record the rollback in the deploy history
+            echo "${TIMESTAMP}|${ROLLBACK_COMMIT}|Rollback to: ${COMMIT_MESSAGE}|github-actions|rolled_back" >> "$HISTORY_FILE"
+            HISTORY_SIZE="${DEPLOY_HISTORY_SIZE:-10}"
+            tail -n "$HISTORY_SIZE" "$HISTORY_FILE" > "${HISTORY_FILE}.tmp" && mv "${HISTORY_FILE}.tmp" "$HISTORY_FILE"
+            
+            echo "✅ Rollback completed successfully"
+      
+      - name: Send rollback notification
+        if: always()
+        uses: appleboy/telegram-action@v1.0.0
+        with:
+          to: ${{ secrets.TELEGRAM_CHAT_ID }}
+          token: ${{ secrets.TELEGRAM_BOT_TOKEN }}
+          message: |
+            ${{ job.status == 'success' && '🔄' || '❌' }} Rollback: ${{ job.status }}
+            
+            📦 Repository: prod
+            🌿 Branch: main
+            📝 Rolled back to: ${{ github.event.inputs.rollback_commit || 'Last successful commit' }}
+            👤 Triggered by: ${{ github.actor }}
+            
+            ${{ job.status == 'success' && '✅ Rollback completed successfully! Services restored to previous version.' || '❌ Rollback failed! Check logs for details.' }}
+            
+            🔗 View details: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
+        continue-on-error: true
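Note on the history step above: the script appends one pipe-delimited record to `$HISTORY_FILE` and then trims the file to the newest `DEPLOY_HISTORY_SIZE` lines (default 10) via a temporary file. A minimal Python sketch of that append-and-trim logic follows, for illustration only. `record_deploy` and its parameters are hypothetical names, not part of the workflow.

```python
from __future__ import annotations

from pathlib import Path


def record_deploy(history_file: Path, entry: str, keep: int = 10) -> list[str]:
    """Append one entry and keep only the newest `keep` lines (like `tail -n`)."""
    if keep < 1:
        raise ValueError("keep must be a positive number of lines")
    lines: list[str] = []
    if history_file.exists():
        lines = history_file.read_text(encoding="utf-8").splitlines()
    lines.append(entry)
    lines = lines[-keep:]
    # Write to a temporary file first, then rename it over the original,
    # mirroring the `tail ... > file.tmp && mv file.tmp file` pattern.
    tmp = history_file.with_name(history_file.name + ".tmp")
    tmp.write_text("\n".join(lines) + "\n", encoding="utf-8")
    tmp.replace(history_file)
    return lines
```

Writing to a temporary file and renaming it avoids truncating the history file while it is still being read.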
--- a/.gitignore
+++ b/.gitignore
@@ -64,4 +64,12 @@ build/

 # Bots
 /bots/*
-!/bots/.gitkeep
+!/bots/.gitkeep
+
+# Ansible inventory files (contain sensitive server info)
+infra/ansible/inventory.ini
+infra/ansible/inventory_*.ini
+
+# Ansible vars files (contain passwords)
+infra/ansible/vars.yml
+infra/ansible/vars_*.yml
--- a/53
+++ b/53
@@ -1,27 +1,46 @@
-FROM python:3.9-slim
+###########################################
+# Stage 1: Builder
+###########################################
+FROM python:3.11.9-slim AS builder

-# Install system dependencies
-RUN apt-get update && apt-get install -y \
-    procps \
+# Install only what is needed to compile dependencies
+RUN apt-get update && apt-get install --no-install-recommends -y \
+    gcc \
+    python3-dev \
    && rm -rf /var/lib/apt/lists/*

-# Set the working directory
 WORKDIR /app
-
-# Copy the dependency files
 COPY requirements.txt .

-# Install Python dependencies
-RUN pip install --no-cache-dir -r requirements.txt
+# Key step: install into a separate directory so it can be copied into the runtime stage
+RUN pip install --no-cache-dir --target /install -r requirements.txt

-# Copy the source code
-COPY . .

-# Create a non-root user for security
-RUN groupadd -g 1000 monitor && \
-    useradd -m -u 1000 -g monitor monitor && \
-    chown -R 1000:1000 /app
+###########################################
+# Stage 2: Final image (Runtime)
+###########################################
+# Use a very lightweight base image (caution: packages compiled against glibc in the slim builder may not load on musl-based Alpine)
+FROM python:3.11.9-alpine AS runtime
+
+# Alpine Linux has its own packages: apk instead of apt.
+# Install the minimal runtime dependencies
+RUN apk add --no-cache libstdc++
+
+# Create a non-root user (Alpine uses different commands)
+RUN addgroup -g 1000 app && \
+    adduser -D -u 1000 -G app app
+
+WORKDIR /app
+
+# Copy the dependencies from the builder stage (if any)
+COPY --from=builder --chown=1000:1000 /install /usr/local/lib/python3.11/site-packages
+# Copy the source code
+COPY --chown=1000:1000 . .
+
 USER 1000

-# Default command: start the monitoring service
-CMD ["python", "infra/monitoring/main.py"]
+# Important: explicitly tell Python to look for dependencies in the copied directory
+ENV PYTHONPATH="/usr/local/lib/python3.11/site-packages:${PYTHONPATH}"
+
+# Keep a basic default command for compatibility
+CMD ["python", "-c", "print('Dockerfile is ready to use')"]
--- a/214
+++ b/214
@@ -1,4 +1,4 @@
-.PHONY: help build up down logs clean restart status deploy backup restore update clean-monitoring monitoring check-deps check-bot-deps
+.PHONY: help build up down logs clean restart status deploy backup restore update clean-monitoring monitoring check-deps check-bot-deps check-anonBot-deps auth-setup auth-add-user auth-reset format-check format format-diff import-check import-fix lint-check code-quality

 help: ## Show help
 	@echo "🏗️  Production Infrastructure - Available commands:"
@@ -9,8 +9,11 @@ help: ## Показать справку
 	@echo "📊 Monitoring:"
 	@echo "  Prometheus: http://localhost:9090"
 	@echo "  Grafana: http://localhost:3000 (admin/admin)"
+	@echo "  Uptime Kuma: http://localhost:3001"
+	@echo "  Alertmanager: http://localhost:9093"
 	@echo "  Server Monitor: http://localhost:9091/health"
 	@echo "  Bot Health: http://localhost:8080/health"
+	@echo "  AnonBot Health: http://localhost:8081/health"

 build: ## Build all containers
 	docker-compose build
@@ -24,9 +27,6 @@ down: ## Остановить все сервисы
 logs: ## Show logs for all services
 	docker-compose logs -f

-logs-monitor: ## Show monitoring logs
-	docker-compose logs -f server_monitor
-
 logs-prometheus: ## Show Prometheus logs
 	docker-compose logs -f prometheus

@@ -36,14 +36,20 @@ logs-grafana: ## Показать логи Grafana
 logs-bot: ## Show Telegram bot logs
 	docker-compose logs -f telegram-bot

+logs-anonBot: ## Show AnonBot logs
+	docker-compose logs -f anon-bot
+
+logs-uptime-kuma: ## Show Uptime Kuma logs
+	docker-compose logs -f uptime-kuma
+
+logs-alertmanager: ## Show Alertmanager logs
+	docker-compose logs -f alertmanager
+
 restart: ## Restart all services
 	docker-compose down
 	docker-compose build --no-cache
 	docker-compose up -d

-restart-monitor: ## Restart monitoring only
-	docker-compose restart server_monitor
-
 restart-prometheus: ## Restart Prometheus only
 	docker-compose restart prometheus

@@ -53,14 +59,26 @@ restart-grafana: ## Перезапустить только Grafana
 restart-bot: ## Restart the Telegram bot only
 	docker-compose restart telegram-bot

+restart-anonBot: ## Restart AnonBot only
+	docker-compose restart anon-bot
+
+restart-uptime-kuma: ## Restart Uptime Kuma only
+	docker-compose restart uptime-kuma
+
+restart-alertmanager: ## Restart Alertmanager only
+	docker-compose restart alertmanager
+
 status: ## Show container status
 	docker-compose ps

 health: ## Check service health
 	@echo "🏥 Checking service health..."
 	@curl -f http://localhost:8080/health || echo "❌ Bot health check failed"
+	@curl -f http://localhost:8081/health || echo "❌ AnonBot health check failed"
 	@curl -f http://localhost:9090/-/healthy || echo "❌ Prometheus health check failed"
 	@curl -f http://localhost:3000/api/health || echo "❌ Grafana health check failed"
+	@curl -f http://localhost:3001 || echo "❌ Uptime Kuma health check failed"
+	@curl -f http://localhost:9093/-/healthy || echo "❌ Alertmanager health check failed"
 	@curl -f http://localhost:9091/health || echo "❌ Server monitor health check failed"

 deploy: ## Full production deploy
@@ -74,7 +92,6 @@ backup: ## Создать backup данных
 	@tar -czf "backups/backup-$(date +%Y%m%d-%H%M%S).tar.gz" \
 		infra/grafana/provisioning/ \
 		infra/prometheus/ \
-		infra/monitoring/ \
 		.env \
 		docker-compose.yml
 	@echo "✅ Backup created in backups/"
@@ -97,7 +114,7 @@ clean: ## Очистить все контейнеры и образы

 clean-monitoring: ## Remove monitoring data only
 	docker-compose down -v
-	docker volume rm prod_prometheus_data prod_grafana_data 2>/dev/null || true
+	docker volume rm prod_prometheus_data prod_grafana_data prod_uptime_kuma_data prod_alertmanager_data 2>/dev/null || true

 security-scan: ## Scan images for vulnerabilities
 	@echo "🔍 Scanning Docker images for vulnerabilities..."
@@ -119,23 +136,29 @@ start: build up ## Собрать и запустить все сервисы
 	@echo "🏗️  Production Infrastructure is up!"
 	@echo "📊 Prometheus: http://localhost:9090"
 	@echo "📈 Grafana: http://localhost:3000 (admin/admin)"
+	@echo "📊 Uptime Kuma: http://localhost:3001"
+	@echo "🚨 Alertmanager: http://localhost:9093"
 	@echo "🤖 Bot Health: http://localhost:8080/health"
+	@echo "🔒 AnonBot Health: http://localhost:8081/health"
 	@echo "📡 Server Monitor: http://localhost:9091/health"
 	@echo "📝 Logs: make logs"

 stop: down ## Stop all services
 	@echo "🛑 All services stopped"

-test: check-deps check-bot-deps ## Run all tests in the project
+test: check-deps check-bot-deps check-anonBot-deps ## Run all tests in the project
 	@echo "🧪 Running all tests in the project..."
 	@echo "📊 Infrastructure tests..."
 	@python3 -m pytest tests/infra/ -q --tb=no
 	@echo "🤖 Telegram bot tests..."
 	@cd bots/telegram-helper-bot && source .venv/bin/activate && python3 -m pytest tests/ -q --tb=no
+	@echo "🔒 AnonBot tests..."
+	@cd bots/AnonBot && python3 -m pytest tests/ -q --tb=no
 	@echo "✅ All tests finished!"
 	@echo "📈 Overall statistics:"
 	@echo "   - Infrastructure: $(shell python3 count_tests.py | head -1) tests"
 	@echo "   - Telegram bot: $(shell python3 count_tests.py | head -2 | tail -1) tests"
+	@echo "   - AnonBot: $(shell python3 count_tests.py | head -3 | tail -1) tests"
 	@echo "   - Total: $(shell python3 count_tests.py | tail -1) tests"

 test-all: ## Run all tests in a single process (developers only)
@@ -146,22 +169,42 @@ test-all: ## Запустить все тесты в одном процессе

 test-infra: check-deps ## Run infrastructure tests
 	@echo "🏗️  Running infrastructure tests..."
-	@python3 -m pytest tests/infra/ -v
+	@source .venv/bin/activate && python3 -m pytest tests/infra/ -v

 test-bot: check-bot-deps ## Run Telegram bot tests
 	@echo "🤖 Running Telegram bot tests..."
 	@cd bots/telegram-helper-bot && source .venv/bin/activate && python3 -m pytest tests/ -v

-test-coverage: check-deps check-bot-deps ## Run all tests with a coverage report
+test-bot-coverage: check-bot-deps ## Run Telegram bot tests with a coverage report
+	@echo "🤖 Running Telegram bot tests..."
+	@cd bots/telegram-helper-bot && source .venv/bin/activate && python3 -m pytest tests/ --cov=helper_bot --cov-report=term-missing --cov-report=html:htmlcov/bot
+	@echo "📊 Coverage reports saved to htmlcov/"
+	@echo "   - Telegram bot: $(shell python3 count_tests.py | head -2 | tail -1) tests"
+
+test-anonBot: check-anonBot-deps ## Run AnonBot tests
+	@echo "🔒 Running AnonBot tests..."
+	@cd bots/AnonBot && python3 -m pytest tests/ -v
+
+
+test-anonBot-coverage: check-anonBot-deps ## Run AnonBot tests with a coverage report
+	@echo "🔒 Running AnonBot tests..."
+	@cd bots/AnonBot && python3 -m pytest tests/ --cov=. --cov-report=term-missing --cov-report=html:htmlcov/anonbot
+	@echo "📊 Coverage reports saved to htmlcov/"
+	@echo "   - AnonBot: $(shell python3 count_tests.py | head -3 | tail -1) tests"
+
+test-coverage: check-deps check-bot-deps check-anonBot-deps ## Run all tests with a coverage report
 	@echo "📊 Running all tests with a coverage report..."
 	@echo "📈 Infrastructure coverage..."
 	@python3 -m pytest tests/infra/ --cov=infra --cov-report=term-missing --cov-report=html:htmlcov/infra
 	@echo "🤖 Telegram bot coverage..."
 	@cd bots/telegram-helper-bot && source .venv/bin/activate && python3 -m pytest tests/ --cov=helper_bot --cov-report=term-missing --cov-report=html:htmlcov/bot
+	@echo "🔒 AnonBot coverage..."
+	@cd bots/AnonBot && python3 -m pytest tests/ --cov=. --cov-report=term-missing --cov-report=html:htmlcov/anonbot
 	@echo "📊 Coverage reports saved to htmlcov/"
 	@echo "📈 Overall statistics:"
 	@echo "   - Infrastructure: $(shell python3 count_tests.py | head -1) tests"
 	@echo "   - Telegram bot: $(shell python3 count_tests.py | head -2 | tail -1) tests"
+	@echo "   - AnonBot: $(shell python3 count_tests.py | head -3 | tail -1) tests"
 	@echo "   - Total: $(shell python3 count_tests.py | tail -1) tests"

 test-clean: ## Remove all test files and reports
@@ -173,9 +216,13 @@ test-clean: ## Очистить все файлы тестирования и о
 	@rm -rf bots/telegram-helper-bot/.pytest_cache/
 	@rm -rf bots/telegram-helper-bot/htmlcov/
 	@rm -rf bots/telegram-helper-bot/.coverage
+	@rm -rf bots/AnonBot/.pytest_cache/
+	@rm -rf bots/AnonBot/htmlcov/
+	@rm -rf bots/AnonBot/.coverage
 	@find . -name "*.pyc" -delete 2>/dev/null || true
 	@find . -name "__pycache__" -type d -exec rm -rf {} + 2>/dev/null || true
 	@echo "✅ Test files removed"
+	

 check-ports: ## Check occupied ports
 	@echo "🔍 Checking occupied ports..."
@@ -187,14 +234,13 @@ check-ports: ## Проверить занятые порты
 	@lsof -i :9091 2>/dev/null || echo "  Free"
 	@echo "Port 8080 (Telegram Bot):"
 	@lsof -i :8080 2>/dev/null || echo "  Free"
+	@echo "Port 8081 (AnonBot):"
+	@lsof -i :8081 2>/dev/null || echo "  Free"

-check-grafana: ## Check Grafana status
-	@echo "📊 Checking Grafana status..."
-	@cd infra/monitoring && python3 check_grafana.py

 check-deps: ## Check infrastructure dependencies
 	@echo "🔍 Checking infrastructure dependencies..."
-	@python3 -c "import pytest, prometheus_client, psutil, aiohttp" 2>/dev/null || (echo "❌ Infrastructure dependencies are missing. Install: pip install pytest prometheus-client psutil aiohttp" && exit 1)
+	@source .venv/bin/activate && python3 -c "import pytest" 2>/dev/null || (echo "❌ Infrastructure dependencies are missing. Install: source .venv/bin/activate && pip install pytest" && exit 1)
 	@echo "✅ Infrastructure dependencies are installed"

 check-bot-deps: ## Check Telegram bot dependencies
@@ -202,6 +248,11 @@ check-bot-deps: ## Проверить зависимости Telegram бота
 	@cd bots/telegram-helper-bot && source .venv/bin/activate && python3 -c "import aiogram, aiosqlite, pytest" 2>/dev/null || (echo "❌ Bot dependencies are missing. Install: cd bots/telegram-helper-bot && source .venv/bin/activate && pip install -r requirements.txt" && exit 1)
 	@echo "✅ Telegram bot dependencies are installed"

+check-anonBot-deps: ## Check AnonBot dependencies
+	@echo "🔍 Checking AnonBot dependencies..."
+	@cd bots/AnonBot && python3 -c "import aiogram, aiosqlite, pytest, loguru, pydantic" 2>/dev/null || (echo "❌ AnonBot dependencies are missing. Install: cd bots/AnonBot && pip install -r requirements.txt" && exit 1)
+	@echo "✅ AnonBot dependencies are installed"
+
 logs-tail: ## Show recent logs for all services
 	@echo "📝 Recent logs from all services:"
 	@docker-compose logs --tail=50
@@ -223,3 +274,134 @@ reload-prometheus: ## Перезагрузить конфигурацию Promet
 reload-grafana: ## Reload Grafana configuration
 	@echo "🔄 Reloading Grafana configuration..."
 	@docker-compose restart grafana
+
+ssl-setup: ## Set up SSL certificates (self-signed)
+	@echo "🔒 Setting up self-signed SSL certificates..."
+	@if [ -z "$(SERVER_IP)" ]; then echo "❌ Please set SERVER_IP variable in .env file"; exit 1; fi
+	@mkdir -p /etc/letsencrypt/live/$(SERVER_IP)
+	@openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
+		-keyout /etc/letsencrypt/live/$(SERVER_IP)/privkey.pem \
+		-out /etc/letsencrypt/live/$(SERVER_IP)/fullchain.pem \
+		-subj "/CN=$(SERVER_IP)"
+	@echo "✅ Self-signed certificate created for $(SERVER_IP)"
+
+ssl-renew: ## Renew SSL certificates
+	@echo "🔄 Renewing SSL certificates..."
+	@sudo /usr/local/bin/ssl-renewal.sh
+
+ssl-status: ## Check SSL certificate status
+	@echo "🔍 Checking SSL certificate status..."
+	@sudo certbot certificates
+
+uptime-kuma: ## Open Uptime Kuma in the browser
+	@echo "📊 Opening Uptime Kuma..."
+	@open http://localhost:3001 || xdg-open http://localhost:3001 || echo "Please open manually: http://localhost:3001"
+
+alertmanager: ## Open Alertmanager in the browser
+	@echo "🚨 Opening Alertmanager..."
+	@open http://localhost:9093 || xdg-open http://localhost:9093 || echo "Please open manually: http://localhost:9093"
+
+monitoring-all: ## Open all monitoring services
+	@echo "📊 Opening all monitoring services..."
+	@echo "  - Grafana: http://localhost:3000"
+	@echo "  - Prometheus: http://localhost:9090"
+	@echo "  - Uptime Kuma: http://localhost:3001"
+	@echo "  - Alertmanager: http://localhost:9093"
+	@open http://localhost:3000 || xdg-open http://localhost:3000 || echo "Please open manually"
+
+# ========================================
+# 🔐 MONITORING AUTHENTICATION
+# ========================================
+
+auth-setup: ## Set up authentication for monitoring
+	@echo "🔐 Setting up monitoring authentication..."
+	@sudo mkdir -p /etc/nginx/passwords
+	@sudo cp scripts/generate_auth_passwords.sh /usr/local/bin/generate_auth_passwords.sh
+	@sudo chmod +x /usr/local/bin/generate_auth_passwords.sh
+	@echo "✅ Authentication setup complete!"
+	@echo "💡 Use 'make auth-add-user' to add users"
+
+auth-add-user: ## Add a monitoring user (make auth-add-user USER=username)
+	@if [ -z "$(USER)" ]; then \
+		echo "❌ Please specify USER: make auth-add-user USER=username"; \
+		exit 1; \
+	fi
+	@echo "🔐 Adding user $(USER) for monitoring..."
+	@sudo /usr/local/bin/generate_auth_passwords.sh $(USER)
+	@echo "✅ User $(USER) added successfully!"
+
+auth-reset: ## Reset a user's password (make auth-reset USER=username)
+	@if [ -z "$(USER)" ]; then \
+		echo "❌ Please specify USER: make auth-reset USER=username"; \
+		exit 1; \
+	fi
+	@echo "🔐 Resetting password for user $(USER)..."
+	@sudo htpasswd /etc/nginx/passwords/monitoring.htpasswd $(USER)
+	@echo "✅ Password reset for user $(USER)!"
+
+auth-list: ## List monitoring users
+	@echo "👥 Monitoring users:"
+	@sudo cat /etc/nginx/passwords/monitoring.htpasswd 2>/dev/null | cut -d: -f1 || echo "❌ No users found"
+
+# ========================================
+# Code Quality & Formatting
+# ========================================
+
+format-check: ## Check code formatting (Black)
+	@echo "🔍 Checking code formatting with Black..."
+	@if [ -f .venv/bin/python ]; then \
+		.venv/bin/python -m black --check . || (echo "❌ Code formatting issues found. Run 'make format' to fix." && exit 1); \
+	else \
+		python3 -m black --check . || (echo "❌ Code formatting issues found. Run 'make format' to fix." && exit 1); \
+	fi
+	@echo "✅ Code formatting is correct!"
+
+format: ## Automatically fix code formatting (Black)
+	@echo "🎨 Formatting code with Black..."
+	@if [ -f .venv/bin/python ]; then \
+		.venv/bin/python -m black .; \
+	else \
+		python3 -m black .; \
+	fi
+	@echo "✅ Code formatted!"
+
+format-diff: ## Show what Black would change (without applying it)
+	@echo "📋 Showing Black diff (no changes applied)..."
+	@if [ -f .venv/bin/python ]; then \
+		.venv/bin/python -m black --diff .; \
+	else \
+		python3 -m black --diff .; \
+	fi
+
+import-check: ## Check import sorting (isort)
+	@echo "🔍 Checking import sorting with isort..."
+	@if [ -f .venv/bin/python ]; then \
+		.venv/bin/python -m isort --check-only . || (echo "❌ Import sorting issues found. Run 'make import-fix' to fix." && exit 1); \
+	else \
+		python3 -m isort --check-only . || (echo "❌ Import sorting issues found. Run 'make import-fix' to fix." && exit 1); \
+	fi
+	@echo "✅ Import sorting is correct!"
+
+import-fix: ## Automatically fix import sorting (isort)
+	@echo "📦 Fixing import sorting with isort..."
+	@if [ -f .venv/bin/python ]; then \
+		.venv/bin/python -m isort .; \
+	else \
+		python3 -m isort .; \
+	fi
+	@echo "✅ Imports sorted!"
+
+lint-check: ## Run the linter (flake8), critical errors only
+	@echo "🔍 Running flake8 linter (critical errors only)..."
+	@if [ -f .venv/bin/python ]; then \
+		.venv/bin/python -m flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics --exclude=".venv,venv,__pycache__,.git,*.pyc" || true; \
+	else \
+		python3 -m flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics --exclude=".venv,venv,__pycache__,.git,*.pyc" || true; \
+	fi
+	@echo "✅ Linting check completed (non-critical warnings in dependencies ignored)!"
+
+code-quality: format-check import-check lint-check ## Check code quality (all checks)
+	@echo ""
+	@echo "✅ All code quality checks passed!"
+	@echo ""
+	@echo "ℹ️  Note: F821/F822/F824 warnings in bots/ are non-critical and ignored in CI"
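Note on the `.PHONY` line in this Makefile diff: it gains the new code-quality and auth targets, but several targets added in the same change (for example `auth-list` and `ssl-setup`) are not listed there. A rough Python sketch of how such gaps could be detected is shown below. `missing_phony` is a hypothetical helper, and the target regex is a simplification that only handles plain `name:` rules.

```python
from __future__ import annotations

import re

# A rule line starts in column 0 with a plain target name followed by ":".
# The negative lookahead skips ":=" variable assignments.
TARGET = re.compile(r"^([A-Za-z0-9_-]+):(?!=)", re.MULTILINE)


def missing_phony(makefile_text: str) -> list[str]:
    """Return targets that are defined but not declared in any .PHONY line."""
    phony: set[str] = set()
    for line in makefile_text.splitlines():
        if line.startswith(".PHONY:"):
            phony.update(line.split(":", 1)[1].split())
    targets = set(TARGET.findall(makefile_text))
    return sorted(targets - phony)
```

Since none of these targets produce a file with their own name, listing them under `.PHONY` keeps a stray file such as `format` or `health` from silently disabling the rule.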
--- a/README.md
+++ b/README.md
@@ -19,10 +19,6 @@ prod/

 ## 🚀 Quick start

-### ⚠️ Important note
-**Make sure you have deleted the `docker-compose.yml` file from the `bots/telegram-helper-bot/` folder** 
-to avoid port conflicts. Use only the root `docker-compose.yml`.
-
 ### 1. Configure environment variables

 Copy the template and set the variables:
@@ -57,12 +53,25 @@ GRAFANA_ADMIN_PASSWORD=admin
 docker-compose up -d
 ```

+### 2.1 Start only the main bot (with its dependencies); AnonBot can be substituted
+
+```bash
+docker-compose up -d prometheus telegram-bot
+```
+
 ### 3. Check status

 ```bash
 docker-compose ps
 ```

+
+### 4. Restart a container
+
+```bash
+docker-compose rm -sf telegram-bot && docker-compose build --no-cache telegram-bot && docker-compose up -d telegram-bot
+```
+
 ## 📊 Services

 - **Prometheus** (port 9090) - metrics collection
@@ -100,19 +109,11 @@ docker-compose ps
 - **Purpose**: Collects and stores metrics; exposes a query API
 - **Access**: Public (port published from the container)
 - **Functions**:
-  - Scrapes metrics from server_monitor (port 9091)
   - Scrapes metrics from telegram-bot (port 8080)
+  - Scrapes metrics from anon-bot (port 8081)
+  - Scrapes metrics from node_exporter (port 9100)
   - Stores historical data

-#### **Port 9091 - Server Monitor**
-- **Container**: `bots_server_monitor`
-- **Purpose**: Monitors the server's system resources
-- **Access**: Internal (Docker network only)
-- **Functions**:
-  - Collects CPU, RAM, and disk metrics
-  - Sends alerts to Telegram
-  - Exposes metrics to Prometheus
-
 #### **Port 8080 - Telegram Bot**
 - **Container**: `bots_telegram_bot`
 - **Purpose**: Core Telegram bot functionality
@@ -152,7 +153,6 @@ docker-compose ps
 docker-compose logs

 # Monitoring only
-docker-compose logs -f server_monitor

 # Prometheus
 docker logs bots_prometheus
@@ -165,7 +165,6 @@ docker logs bots_grafana

 ### Automated check
 ```bash
-cd infra/monitoring
 python3 check_grafana.py
 ```

@@ -204,7 +203,6 @@ make health        # Проверить здоровье всех сервисо
 ### 📊 Monitoring and logs
 ```bash
 make logs          # Logs for all services
-make logs-monitor  # Monitoring logs only
 make logs-bot      # Telegram bot logs
 make logs-errors   # Errors from the logs only
 make monitoring    # Open Grafana in the browser
@@ -213,7 +211,6 @@ make prometheus    # Открыть Prometheus в браузере

 ### 🔧 Managing individual services
 ```bash
-make restart-monitor    # Restart monitoring only
 make restart-grafana    # Restart Grafana only
 make restart-prometheus # Restart Prometheus only
 make restart-bot        # Restart the Telegram bot only
--- a/count_tests.py
+++ b/count_tests.py
@@ -1,77 +0,0 @@
-#!/usr/bin/env python3
-"""
-Script that counts the number of tests in the project
-"""
-
-import subprocess
-import sys
-import os
-
-def count_tests_in_directory(directory):
-    """Count the tests in the given directory"""
-    try:
-        # Run pytest --collect-only to count the tests
-        result = subprocess.run(
-            [sys.executable, '-m', 'pytest', directory, '--collect-only', '-q'],
-            capture_output=True,
-            text=True,
-            cwd=os.getcwd()
-        )
-        
-        if result.returncode == 0:
-            # Look for the line that reports the number of collected tests
-            for line in result.stdout.split('\n'):
-                if 'collected' in line:
-                    # Extract the number from a line such as "78 collected"
-                    parts = line.strip().split()
-                    for part in parts:
-                        if part.isdigit():
-                            return int(part)
-        return 0
-    except Exception as e:
-        print(f"Error counting tests in {directory}: {e}", file=sys.stderr)
-        return 0
-
-def count_bot_tests():
-    """Count the bot's tests"""
-    try:
-        # Run pytest from inside the bot directory
-        bot_dir = os.path.join(os.getcwd(), 'bots', 'telegram-helper-bot')
-        result = subprocess.run(
-            [sys.executable, '-m', 'pytest', 'tests/', '--collect-only', '-q'],
-            capture_output=True,
-            text=True,
-            cwd=bot_dir
-        )
-        
-        if result.returncode == 0:
-            # Look for the line that reports the number of collected tests
-            for line in result.stdout.split('\n'):
-                if 'collected' in line:
-                    # Extract the number from a line such as "201 collected"
-                    parts = line.strip().split()
-                    for part in parts:
-                        if part.isdigit():
-                            return int(part)
-        return 0
-    except Exception as e:
-        print(f"Error counting bot tests: {e}", file=sys.stderr)
-        return 0
-
-def main():
-    """Entry point"""
-    # Count the infrastructure tests
-    infra_tests = count_tests_in_directory('tests/infra/')
-    
-    # Count the bot tests
-    bot_tests = count_bot_tests()
-    
-    total_tests = infra_tests + bot_tests
-    
-    # Print the result in the format the Makefile expects
-    print(f"{infra_tests}")
-    print(f"{bot_tests}")
-    print(f"{total_tests}")
-
-if __name__ == '__main__':
-    main()
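Note on this deletion: the Makefile hunks in the same change still call `python3 count_tests.py` and now read a third line for AnonBot (`head -3 | tail -1`), while the removed script printed only three lines (infra, bot, total). A minimal sketch of a parser and a four-line report that would match the Makefile's expectations follows. The names `parse_collected` and `format_report` are hypothetical, and the two summary-line formats in the regex are assumptions about pytest's output that may differ between versions.

```python
import re

# Assumed summary formats: "collected 78 items" and "78 tests collected in 0.12s".
_COLLECTED = re.compile(r"collected (\d+) items?|(\d+) tests? collected")


def parse_collected(output: str) -> int:
    """Return the number of collected tests found in pytest output, or 0."""
    for line in output.splitlines():
        match = _COLLECTED.search(line)
        if match:
            return int(match.group(1) or match.group(2))
    return 0


def format_report(infra: int, bot: int, anon: int) -> str:
    """Four lines in the order the Makefile reads: infra, bot, AnonBot, total."""
    return "\n".join(str(n) for n in (infra, bot, anon, infra + bot + anon))
```

With this layout `tail -1` still yields the total, and `head -3 | tail -1` yields the AnonBot count rather than the total.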
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -12,18 +12,31 @@ services:
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=${PROMETHEUS_RETENTION_DAYS:-30}d'
      - '--web.enable-lifecycle'
+      - '--web.external-url=https://${SERVER_IP}/prometheus/'
+      # Memory optimization
+      - '--storage.tsdb.max-block-duration=2h'
+      - '--storage.tsdb.min-block-duration=2h'
    ports:
      - "9090:9090"
    volumes:
      - ./infra/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
+      - ./infra/prometheus/alert_rules.yml:/etc/prometheus/alert_rules.yml:ro
      - prometheus_data:/prometheus
    networks:
      - bots_network
    healthcheck:
-      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:9090/-/healthy"]
+      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:9090/prometheus/-/healthy"]
      interval: 30s
      timeout: 10s
      retries: 3
+    deploy:
+      resources:
+        limits:
+          memory: 128M
+          cpus: '0.5'
+        reservations:
+          memory: 64M
+          cpus: '0.25'

  # Grafana Dashboard
  grafana:
@@ -35,6 +48,12 @@ services:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:-admin}
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_INSTALL_PLUGINS=grafana-clock-panel,grafana-simple-json-datasource
+      - GF_SERVER_ROOT_URL=https://${SERVER_IP}/grafana/
+      - GF_SERVER_SERVE_FROM_SUB_PATH=true
+      # Memory optimization
+      - GF_DATABASE_MAX_IDLE_CONN=2
+      - GF_DATABASE_MAX_OPEN_CONN=5
+      - GF_DASHBOARDS_DEFAULT_HOME_DASHBOARD_PATH=/etc/grafana/provisioning/dashboards/node-exporter-full-dashboard.json
    ports:
      - "3000:3000"
    volumes:
@@ -45,35 +64,69 @@ services:
    depends_on:
      - prometheus
    healthcheck:
-      test: ["CMD-SHELL", "curl -f http://localhost:3000/api/health || exit 1"]
+      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
+    deploy:
+      resources:
+        limits:
+          memory: 200M
+          cpus: '0.5'
+        reservations:
+          memory: 100M
+          cpus: '0.25'

-  # Server Monitoring Service
-  server_monitor:
-    build: .
-    container_name: bots_server_monitor
+  # Uptime Kuma Status Page
+  uptime-kuma:
+    image: louislam/uptime-kuma:latest
+    container_name: bots_uptime_kuma
    restart: unless-stopped
-    ports:
-      - "9091:9091"
-    environment:
-      - TELEGRAM_BOT_TOKEN=${TELEGRAM_MONITORING_BOT_TOKEN}
-      - GROUP_FOR_LOGS=${GROUP_MONITORING_FOR_LOGS}
-      - IMPORTANT_LOGS=${IMPORTANT_MONITORING_LOGS}
-      - THRESHOLD=${THRESHOLD:-80.0}
-      - RECOVERY_THRESHOLD=${RECOVERY_THRESHOLD:-75.0}
    volumes:
-      - /proc:/host/proc:ro
-      - /sys:/host/sys:ro
-      - /var/run:/host/var/run:ro
+      - uptime_kuma_data:/app/data
+    ports:
+      - "3001:3001"
+    environment:
+      - UPTIME_KUMA_PORT=3001
+    networks:
+      - bots_network
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:3001"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 40s
+    deploy:
+      resources:
+        limits:
+          memory: 150M
+          cpus: '0.5'
+        reservations:
+          memory: 80M
+          cpus: '0.25'
+
+  # Alertmanager
+  alertmanager:
+    image: prom/alertmanager:latest
+    container_name: bots_alertmanager
+    restart: unless-stopped
+    command:
+      - '--config.file=/etc/alertmanager/alertmanager.yml'
+      - '--storage.path=/alertmanager'
+      - '--web.external-url=https://${SERVER_IP}/alertmanager/'
+      - '--web.route-prefix=/'
+    ports:
+      - "9093:9093"
+    volumes:
+      - alertmanager_data:/alertmanager
+      - ./infra/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
    networks:
      - bots_network
    depends_on:
      - prometheus
    healthcheck:
-      test: ["CMD-SHELL", "ps aux | grep python | grep server_monitor || exit 1"]
-      interval: 60s
+      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:9093/-/healthy"]
+      interval: 30s
      timeout: 10s
      retries: 3

@@ -81,7 +134,7 @@ services:
  telegram-bot:
    build:
      context: ./bots/telegram-helper-bot
-      dockerfile: Dockerfile.bot
+      dockerfile: Dockerfile
    container_name: bots_telegram_bot
    restart: unless-stopped
    env_file:
@@ -95,10 +148,10 @@ services:
      - LOG_RETENTION_DAYS=${LOG_RETENTION_DAYS:-30}
      - METRICS_HOST=${METRICS_HOST:-0.0.0.0}
      - METRICS_PORT=${METRICS_PORT:-8080}
-      # Telegram settings
-      - TELEGRAM_BOT_TOKEN=${BOT_TOKEN}
-      - TELEGRAM_LISTEN_BOT_TOKEN=${LISTEN_BOT_TOKEN}
-      - TELEGRAM_TEST_BOT_TOKEN=${TEST_BOT_TOKEN}
+      # Telegram settings (tokens from GitHub Secrets take precedence over .env)
+      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN:-${BOT_TOKEN}}
+      - TELEGRAM_LISTEN_BOT_TOKEN=${TELEGRAM_LISTEN_BOT_TOKEN:-${LISTEN_BOT_TOKEN}}
+      - TELEGRAM_TEST_BOT_TOKEN=${TELEGRAM_TEST_BOT_TOKEN:-${TEST_BOT_TOKEN}}
      - TELEGRAM_PREVIEW_LINK=${PREVIEW_LINK:-false}
      - TELEGRAM_MAIN_PUBLIC=${MAIN_PUBLIC}
      - TELEGRAM_GROUP_FOR_POSTS=${GROUP_FOR_POSTS}
@@ -122,7 +175,7 @@ services:
    depends_on:
      - prometheus
    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
+      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
@@ -130,21 +183,79 @@ services:
    deploy:
      resources:
        limits:
-          memory: 512M
+          memory: 256M
          cpus: '0.5'
        reservations:
+          memory: 128M
+          cpus: '0.25'
+
+  # AnonBot - Anonymous Q&A Bot
+  anon-bot:
+    build:
+      context: ./bots/AnonBot
+      dockerfile: Dockerfile
+    container_name: bots_anon_bot
+    restart: unless-stopped
+    env_file:
+      - ./bots/AnonBot/.env
+    ports:
+      - "8081:8081"
+    environment:
+      - PYTHONPATH=/app
+      - PYTHONUNBUFFERED=1
+      - DOCKER_CONTAINER=true
+      - LOG_LEVEL=${LOG_LEVEL:-INFO}
+      # AnonBot settings (the token from GitHub Secrets takes precedence over .env)
+      - ANON_BOT_TOKEN=${ANON_BOT_TOKEN:-${BOT_TOKEN}}
+      - ANON_BOT_ADMINS=${ADMINS}
+      - ANON_BOT_DATABASE_PATH=/app/database/anon_qna.db
+      - ANON_BOT_DEBUG=${DEBUG:-false}
+      - ANON_BOT_MAX_QUESTION_LENGTH=${MAX_QUESTION_LENGTH:-1000}
+      - ANON_BOT_MAX_ANSWER_LENGTH=${MAX_ANSWER_LENGTH:-2000}
+      # Rate limiting settings
+      - RATE_LIMIT_ENV=${RATE_LIMIT_ENV:-production}
+      - RATE_LIMIT_MESSAGES_PER_SECOND=${RATE_LIMIT_MESSAGES_PER_SECOND:-0.5}
+      - RATE_LIMIT_BURST_LIMIT=${RATE_LIMIT_BURST_LIMIT:-2}
+      - RATE_LIMIT_RETRY_MULTIPLIER=${RATE_LIMIT_RETRY_MULTIPLIER:-1.5}
+      - RATE_LIMIT_MAX_RETRY_DELAY=${RATE_LIMIT_MAX_RETRY_DELAY:-30.0}
+      - RATE_LIMIT_MAX_RETRIES=${RATE_LIMIT_MAX_RETRIES:-3}
+    volumes:
+      - ./bots/AnonBot/database:/app/database:rw
+      - ./bots/AnonBot/logs:/app/logs:rw
+      - ./bots/AnonBot/.env:/app/.env:ro
+    networks:
+      - bots_network
+    depends_on:
+      - prometheus
+    healthcheck:
+      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8081/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 40s
+    deploy:
+      resources:
+        limits:
          memory: 256M
          cpus: '0.25'
+        reservations:
+          memory: 128M
+          cpus: '0.1'

 volumes:
  prometheus_data:
    driver: local
  grafana_data:
    driver: local
+  uptime_kuma_data:
+    driver: local
+  alertmanager_data:
+    driver: local

 networks:
  bots_network:
    driver: bridge
    ipam:
      config:
-        - subnet: 192.168.100.0/24
+        - subnet: 172.20.0.0/16
+          gateway: 172.20.0.1
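Note on the token fallback in this file: `${TELEGRAM_BOT_TOKEN:-${BOT_TOKEN}}` uses Compose's shell-style default syntax, where the inner variable is used only when the outer one is unset or empty. The sketch below emulates that rule in Python to make the precedence explicit. The `resolve` helper and the sample values are hypothetical, and it assumes the GitHub Secrets value reaches the Compose process as an ordinary environment variable, which this diff does not show.

```python
def resolve(env, primary, fallback):
    """Emulate ${primary:-${fallback}}: primary wins if set and non-empty."""
    value = env.get(primary, "")
    if value:
        return value
    # Primary is unset or empty, so fall back to the second variable.
    return env.get(fallback, "")


# Hypothetical environments, for illustration only.
with_secret = {"TELEGRAM_BOT_TOKEN": "from-secrets", "BOT_TOKEN": "from-dotenv"}
dotenv_only = {"BOT_TOKEN": "from-dotenv"}
empty_secret = {"TELEGRAM_BOT_TOKEN": "", "BOT_TOKEN": "from-dotenv"}

for env in (with_secret, dotenv_only, empty_secret):
    print(resolve(env, "TELEGRAM_BOT_TOKEN", "BOT_TOKEN"))
```

An empty primary variable behaves like an unset one here, so exporting a blank secret would silently fall back to the `.env` value.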
--- a/env.template
+++ b/env.template
@@ -21,3 +21,9 @@ PROMETHEUS_RETENTION_DAYS=30
 # Grafana Configuration
 GRAFANA_ADMIN_USER=admin
 GRAFANA_ADMIN_PASSWORD=admin
+
+# Server Configuration
+SERVER_IP=your_server_ip_here
+
+# Status Page Configuration
+STATUS_PAGE_PASSWORD=admin123
--- a/infra/alertmanager/alertmanager-simple.yml
+++ b/infra/alertmanager/alertmanager-simple.yml
@@ -0,0 +1,17 @@
+# Simplified Alertmanager Configuration
+global:
+  smtp_smarthost: 'localhost:587'
+  smtp_from: 'alerts@localhost'
+
+route:
+  group_by: ['alertname']
+  group_wait: 10s
+  group_interval: 10s
+  repeat_interval: 1h
+  receiver: 'web.hook'
+
+receivers:
+  - name: 'web.hook'
+    webhook_configs:
+      - url: 'http://localhost:5001/'
+        send_resolved: true
--- a/infra/alertmanager/alertmanager.yml
+++ b/infra/alertmanager/alertmanager.yml
@@ -0,0 +1,119 @@
+# Alertmanager Configuration
+# This file configures how alerts are handled and routed
+
+global:
+  # SMTP configuration for email notifications
+  smtp_smarthost: 'localhost:587'
+  smtp_from: 'alerts@{{DOMAIN}}'
+  smtp_auth_username: 'alerts@{{DOMAIN}}'
+  smtp_auth_password: '{{SMTP_PASSWORD}}'
+  smtp_require_tls: true
+
+  # Resolve timeout
+  resolve_timeout: 5m
+
+# Templates for alert formatting
+templates:
+  - '/etc/alertmanager/templates/*.tmpl'
+
+# Route configuration - defines how alerts are routed
+route:
+  group_by: ['alertname', 'cluster', 'service']
+  group_wait: 10s
+  group_interval: 10s
+  repeat_interval: 1h
+  receiver: 'web.hook'
+  routes:
+    # Critical alerts - immediate notification
+    - match:
+        severity: critical
+      receiver: 'critical-alerts'
+      group_wait: 5s
+      repeat_interval: 5m
+      
+    # Warning alerts - grouped notification
+    - match:
+        severity: warning
+      receiver: 'warning-alerts'
+      group_wait: 30s
+      repeat_interval: 30m
+      
+    # Bot-specific alerts
+    - match:
+        service: telegram-bot
+      receiver: 'bot-alerts'
+      group_wait: 10s
+      repeat_interval: 15m
+      
+    - match:
+        service: anon-bot
+      receiver: 'bot-alerts'
+      group_wait: 10s
+      repeat_interval: 15m
+      
+    # Infrastructure alerts
+    - match:
+        service: prometheus
+      receiver: 'infrastructure-alerts'
+      group_wait: 30s
+      repeat_interval: 1h
+      
+    - match:
+        service: grafana
+      receiver: 'infrastructure-alerts'
+      group_wait: 30s
+      repeat_interval: 1h
+      
+    - match:
+        service: nginx
+      receiver: 'infrastructure-alerts'
+      group_wait: 30s
+      repeat_interval: 1h
+
+# Inhibition rules - suppress certain alerts when others are firing
+inhibit_rules:
+  # Suppress warning alerts when critical alerts are firing
+  - source_match:
+      severity: 'critical'
+    target_match:
+      severity: 'warning'
+    equal: ['alertname', 'cluster', 'service']
+    
+  # Suppress individual instance alerts when the entire service is down
+  - source_match:
+      alertname: 'ServiceDown'
+    target_match:
+      alertname: 'InstanceDown'
+    equal: ['service']
+
+# Receiver configurations
+receivers:
+  # Default webhook receiver (for testing)
+  - name: 'web.hook'
+    webhook_configs:
+      - url: 'http://localhost:5001/'
+        send_resolved: true
+
+  # Critical alerts - immediate notification via webhook
+  - name: 'critical-alerts'
+    webhook_configs:
+      - url: 'http://localhost:5001/critical'
+        send_resolved: true
+
+  # Warning alerts - less urgent notification
+  - name: 'warning-alerts'
+    webhook_configs:
+      - url: 'http://localhost:5001/warning'
+        send_resolved: true
+
+  # Bot-specific alerts
+  - name: 'bot-alerts'
+    webhook_configs:
+      - url: 'http://localhost:5001/bot'
+        send_resolved: true
+
+  # Infrastructure alerts
+  - name: 'infrastructure-alerts'
+    webhook_configs:
+      - url: 'http://localhost:5001/infrastructure'
+        send_resolved: true
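Note on route ordering in this file: Alertmanager checks child routes in the order listed and stops at the first match unless `continue: true` is set, which none of these routes do. An alert labelled both `severity: critical` and `service: anon-bot` therefore goes to `critical-alerts`, not `bot-alerts`. The sketch below is a simplified model of that selection, not Alertmanager's implementation; `pick_receiver` and the sample label sets are hypothetical, and grouping, timing, and inhibition are ignored.

```python
# Child routes in the same order as alertmanager.yml: (match labels, receiver).
ROUTES = [
    ({"severity": "critical"}, "critical-alerts"),
    ({"severity": "warning"}, "warning-alerts"),
    ({"service": "telegram-bot"}, "bot-alerts"),
    ({"service": "anon-bot"}, "bot-alerts"),
    ({"service": "prometheus"}, "infrastructure-alerts"),
    ({"service": "grafana"}, "infrastructure-alerts"),
    ({"service": "nginx"}, "infrastructure-alerts"),
]
DEFAULT_RECEIVER = "web.hook"  # receiver of the top-level route


def pick_receiver(labels):
    """Return the receiver of the first child route whose labels all match."""
    for match, receiver in ROUTES:
        if all(labels.get(key) == value for key, value in match.items()):
            return receiver
    return DEFAULT_RECEIVER


# Hypothetical alerts, for illustration only.
print(pick_receiver({"severity": "critical", "service": "anon-bot"}))
print(pick_receiver({"severity": "info", "service": "anon-bot"}))
print(pick_receiver({"alertname": "SomethingElse"}))
```

If bot alerts should always reach `bot-alerts` regardless of severity, the service routes would need to come first or the severity routes would need `continue: true`.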
--- a/infra/ansible/inventory.ini
+++ b/infra/ansible/inventory.ini
@@ -0,0 +1,5 @@
+[new_server]
+127.0.0.1 ansible_user=root ansible_ssh_common_args='-o StrictHostKeyChecking=no'
+
+[all:vars]
+ansible_python_interpreter=/usr/bin/python3
--- a/infra/ansible/playbook.yml
+++ b/infra/ansible/playbook.yml
--- a/infra/grafana/dashboards/bot-monitoring.json
+++ b/infra/grafana/dashboards/bot-monitoring.json
@@ -0,0 +1,529 @@
+{
+  "annotations": {
+    "list": [
+      {
+        "builtIn": 1,
+        "datasource": "-- Grafana --",
+        "enable": true,
+        "hide": true,
+        "iconColor": "rgba(0, 211, 255, 1)",
+        "name": "Annotations & Alerts",
+        "type": "dashboard"
+      }
+    ]
+  },
+  "editable": true,
+  "gnetId": null,
+  "graphTooltip": 0,
+  "id": null,
+  "links": [],
+  "panels": [
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "reqps"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 0
+      },
+      "id": 1,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(http_requests_total{job=~\"telegram-bot|anon-bot\"}[5m])",
+          "interval": "",
+          "legendFormat": "{{job}} - {{method}} {{status}}",
+          "refId": "A"
+        }
+      ],
+      "title": "Bot Request Rate",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "s"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 0
+      },
+      "id": 2,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{job=~\"telegram-bot|anon-bot\"}[5m]))",
+          "interval": "",
+          "legendFormat": "{{job}} - 95th percentile",
+          "refId": "A"
+        },
+        {
+          "expr": "histogram_quantile(0.50, rate(http_request_duration_seconds_bucket{job=~\"telegram-bot|anon-bot\"}[5m]))",
+          "interval": "",
+          "legendFormat": "{{job}} - 50th percentile",
+          "refId": "B"
+        }
+      ],
+      "title": "Bot Response Time",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "percent"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 8
+      },
+      "id": 3,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "sum by (job) (rate(http_requests_total{job=~\"telegram-bot|anon-bot\",status=~\"5..\"}[5m])) / sum by (job) (rate(http_requests_total{job=~\"telegram-bot|anon-bot\"}[5m])) * 100",
+          "interval": "",
+          "legendFormat": "{{job}} - Error Rate",
+          "refId": "A"
+        }
+      ],
+      "title": "Bot Error Rate",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "bytes"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 8
+      },
+      "id": 4,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "process_resident_memory_bytes{job=~\"telegram-bot|anon-bot\"}",
+          "interval": "",
+          "legendFormat": "{{job}} - Memory Usage",
+          "refId": "A"
+        }
+      ],
+      "title": "Bot Memory Usage",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 16
+      },
+      "id": 5,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "up{job=~\"telegram-bot|anon-bot\"}",
+          "interval": "",
+          "legendFormat": "{{job}} - Status",
+          "refId": "A"
+        }
+      ],
+      "title": "Bot Health Status",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 16
+      },
+      "id": 6,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(process_cpu_seconds_total{job=~\"telegram-bot|anon-bot\"}[5m]) * 100",
+          "interval": "",
+          "legendFormat": "{{job}} - CPU Usage",
+          "refId": "A"
+        }
+      ],
+      "title": "Bot CPU Usage",
+      "type": "timeseries"
+    }
+  ],
+  "schemaVersion": 27,
+  "style": "dark",
+  "tags": ["bots", "monitoring"],
+  "templating": {
+    "list": []
+  },
+  "time": {
+    "from": "now-1h",
+    "to": "now"
+  },
+  "timepicker": {},
+  "timezone": "",
+  "title": "Bot Monitoring Dashboard",
+  "uid": "bot-monitoring",
+  "version": 1
+}
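Note on the "Bot Error Rate" panel: the intended figure is the share of 5xx responses among all requests for each bot. The sketch below works through that arithmetic with per-series rates aggregated by `job`. The `error_rate_percent` helper and the sample rates are hypothetical and only illustrate the calculation; they are not data from this project.

```python
from collections import defaultdict


def error_rate_percent(series):
    """series: iterable of (job, status, requests_per_second) tuples."""
    total = defaultdict(float)
    errors = defaultdict(float)
    for job, status, rps in series:
        total[job] += rps
        # Same idea as the status=~"5.." matcher: three characters, leading 5.
        if len(status) == 3 and status[0] == "5":
            errors[job] += rps
    # Jobs with no traffic are skipped to avoid dividing by zero.
    return {job: 100.0 * errors[job] / total[job] for job in total if total[job] > 0}


# Hypothetical per-series request rates, for illustration only.
sample = [
    ("telegram-bot", "200", 9.0),
    ("telegram-bot", "500", 1.0),
    ("anon-bot", "200", 4.0),
]
print(error_rate_percent(sample))
```

Summing per job before dividing matters: dividing each 5xx series only by itself would always yield 100 percent.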
--- a/infra/grafana/dashboards/infrastructure-monitoring.json
+++ b/infra/grafana/dashboards/infrastructure-monitoring.json
@@ -0,0 +1,523 @@
+{
+  "annotations": {
+    "list": [
+      {
+        "builtIn": 1,
+        "datasource": "-- Grafana --",
+        "enable": true,
+        "hide": true,
+        "iconColor": "rgba(0, 211, 255, 1)",
+        "name": "Annotations & Alerts",
+        "type": "dashboard"
+      }
+    ]
+  },
+  "editable": true,
+  "gnetId": null,
+  "graphTooltip": 0,
+  "id": null,
+  "links": [],
+  "panels": [
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "percent"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 0
+      },
+      "id": 1,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "100 - (avg by(instance) (rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
+          "interval": "",
+          "legendFormat": "CPU Usage - {{instance}}",
+          "refId": "A"
+        }
+      ],
+      "title": "System CPU Usage",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "percent"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 0
+      },
+      "id": 2,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100",
+          "interval": "",
+          "legendFormat": "Memory Usage - {{instance}}",
+          "refId": "A"
+        }
+      ],
+      "title": "System Memory Usage",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "percent"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 8
+      },
+      "id": 3,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "(1 - (node_filesystem_avail_bytes / node_filesystem_size_bytes)) * 100",
+          "interval": "",
+          "legendFormat": "Disk Usage - {{instance}} {{mountpoint}}",
+          "refId": "A"
+        }
+      ],
+      "title": "Disk Usage",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 8
+      },
+      "id": 4,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "up{job=~\"prometheus|grafana|nginx|alertmanager|uptime-kuma\"}",
+          "interval": "",
+          "legendFormat": "{{job}} - Status",
+          "refId": "A"
+        }
+      ],
+      "title": "Service Health Status",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "reqps"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 16
+      },
+      "id": 5,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(nginx_http_requests_total[5m])",
+          "interval": "",
+          "legendFormat": "Nginx - {{status}}",
+          "refId": "A"
+        }
+      ],
+      "title": "Nginx Request Rate",
+      "type": "timeseries"
+    },
+    {
+      "datasource": "Prometheus",
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "bytes"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 16
+      },
+      "id": 6,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single"
+        }
+      },
+      "targets": [
+        {
+          "expr": "container_memory_usage_bytes{name=~\"bots_.*\"}",
+          "interval": "",
+          "legendFormat": "{{name}} - Memory",
+          "refId": "A"
+        }
+      ],
+      "title": "Container Memory Usage",
+      "type": "timeseries"
+    }
+  ],
+  "schemaVersion": 27,
+  "style": "dark",
+  "tags": ["infrastructure", "monitoring"],
+  "templating": {
+    "list": []
+  },
+  "time": {
+    "from": "now-1h",
+    "to": "now"
+  },
+  "timepicker": {},
+  "timezone": "",
+  "title": "Infrastructure Monitoring Dashboard",
+  "uid": "infrastructure-monitoring",
+  "version": 1
+}
--- a/infra/grafana/provisioning/dashboards/anonbot-overview-dashboard.json
+++ b/infra/grafana/provisioning/dashboards/anonbot-overview-dashboard.json
@@ -0,0 +1,874 @@
+{
+  "annotations": {
+    "list": [
+      {
+        "builtIn": 1,
+        "datasource": {
+          "type": "grafana",
+          "uid": "-- Grafana --"
+        },
+        "enable": true,
+        "hide": true,
+        "iconColor": "rgba(0, 211, 255, 1)",
+        "name": "Annotations & Alerts",
+        "type": "dashboard"
+      }
+    ]
+  },
+  "editable": true,
+  "fiscalYearStartMonth": 0,
+  "graphTooltip": 0,
+  "id": null,
+  "links": [],
+  "liveNow": false,
+  "panels": [
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "thresholds"
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 6,
+        "x": 0,
+        "y": 0
+      },
+      "id": 1,
+      "options": {
+        "colorMode": "value",
+        "graphMode": "area",
+        "justifyMode": "auto",
+        "orientation": "auto",
+        "reduceOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "fields": "",
+          "values": false
+        },
+        "textMode": "auto"
+      },
+      "pluginVersion": "8.5.0",
+      "targets": [
+        {
+          "expr": "anon_bot_active_users",
+          "interval": "",
+          "legendFormat": "Active Users",
+          "refId": "A"
+        }
+      ],
+      "title": "Active Users",
+      "type": "stat"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "thresholds"
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 6,
+        "x": 6,
+        "y": 0
+      },
+      "id": 2,
+      "options": {
+        "colorMode": "value",
+        "graphMode": "area",
+        "justifyMode": "auto",
+        "orientation": "auto",
+        "reduceOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "fields": "",
+          "values": false
+        },
+        "textMode": "auto"
+      },
+      "pluginVersion": "8.5.0",
+      "targets": [
+        {
+          "expr": "anon_bot_active_questions",
+          "interval": "",
+          "legendFormat": "Active Questions",
+          "refId": "A"
+        }
+      ],
+      "title": "Active Questions",
+      "type": "stat"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 0
+      },
+      "id": 3,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(anon_bot_questions_total{status=\"created\"}[5m]) * 60",
+          "interval": "",
+          "legendFormat": "Questions Created/min",
+          "refId": "A"
+        },
+        {
+          "expr": "rate(anon_bot_questions_total{status=\"processed\"}[5m]) * 60",
+          "interval": "",
+          "legendFormat": "Questions Processed/min",
+          "refId": "B"
+        },
+        {
+          "expr": "rate(anon_bot_questions_total{status=\"rejected\"}[5m]) * 60",
+          "interval": "",
+          "legendFormat": "Questions Rejected/min",
+          "refId": "C"
+        },
+        {
+          "expr": "rate(anon_bot_questions_total{status=\"deleted\"}[5m]) * 60",
+          "interval": "",
+          "legendFormat": "Questions Deleted/min",
+          "refId": "D"
+        }
+      ],
+      "title": "Questions Flow (per minute)",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 8
+      },
+      "id": 4,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(anon_bot_answers_total{status=\"sent\"}[5m]) * 60",
+          "interval": "",
+          "legendFormat": "Answers Sent/min",
+          "refId": "A"
+        },
+        {
+          "expr": "rate(anon_bot_answers_total{status=\"delivered\"}[5m]) * 60",
+          "interval": "",
+          "legendFormat": "Answers Delivered/min",
+          "refId": "B"
+        },
+        {
+          "expr": "rate(anon_bot_answers_total{status=\"delivery_failed\"}[5m]) * 60",
+          "interval": "",
+          "legendFormat": "Delivery Failed/min",
+          "refId": "C"
+        }
+      ],
+      "title": "Answers Flow (per minute)",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 8
+      },
+      "id": 5,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(anon_bot_users_total{action=\"created\"}[5m]) * 60",
+          "interval": "",
+          "legendFormat": "New Users/min",
+          "refId": "A"
+        },
+        {
+          "expr": "rate(anon_bot_users_total{action=\"updated\"}[5m]) * 60",
+          "interval": "",
+          "legendFormat": "Updated Users/min",
+          "refId": "B"
+        }
+      ],
+      "title": "User Activity (per minute)",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 16
+      },
+      "id": 6,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "sum(increase(anon_bot_messages_total[1d])) by (message_type)",
+          "interval": "",
+          "legendFormat": "{{message_type}} (daily)",
+          "refId": "A"
+        }
+      ],
+      "title": "Daily Trends - Messages",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 16
+      },
+      "id": 7,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "sum(increase(anon_bot_questions_total[1d])) by (status)",
+          "interval": "",
+          "legendFormat": "{{status}} (daily)",
+          "refId": "A"
+        }
+      ],
+      "title": "Daily Trends - Questions",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "thresholds"
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 6,
+        "x": 0,
+        "y": 24
+      },
+      "id": 8,
+      "options": {
+        "colorMode": "value",
+        "graphMode": "area",
+        "justifyMode": "auto",
+        "orientation": "auto",
+        "reduceOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "fields": "",
+          "values": false
+        },
+        "textMode": "auto"
+      },
+      "pluginVersion": "8.5.0",
+      "targets": [
+        {
+          "expr": "anon_bot_active_users",
+          "interval": "",
+          "legendFormat": "Live Active Users",
+          "refId": "A"
+        }
+      ],
+      "title": "Live Activity - Active Users",
+      "type": "stat"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "thresholds"
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 6,
+        "x": 6,
+        "y": 24
+      },
+      "id": 9,
+      "options": {
+        "colorMode": "value",
+        "graphMode": "area",
+        "justifyMode": "auto",
+        "orientation": "auto",
+        "reduceOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "fields": "",
+          "values": false
+        },
+        "textMode": "auto"
+      },
+      "pluginVersion": "8.5.0",
+      "targets": [
+        {
+          "expr": "sum(rate(anon_bot_messages_total[1m])) * 60",
+          "interval": "",
+          "legendFormat": "Messages/min",
+          "refId": "A"
+        }
+      ],
+      "title": "Messages per Minute",
+      "type": "stat"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "thresholds"
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 6,
+        "x": 12,
+        "y": 24
+      },
+      "id": 10,
+      "options": {
+        "colorMode": "value",
+        "graphMode": "area",
+        "justifyMode": "auto",
+        "orientation": "auto",
+        "reduceOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "fields": "",
+          "values": false
+        },
+        "textMode": "auto"
+      },
+      "pluginVersion": "8.5.0",
+      "targets": [
+        {
+          "expr": "sum(rate(anon_bot_questions_total[1h])) * 3600",
+          "interval": "",
+          "legendFormat": "Questions/hour",
+          "refId": "A"
+        }
+      ],
+      "title": "Questions per Hour",
+      "type": "stat"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "thresholds"
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 6,
+        "x": 18,
+        "y": 24
+      },
+      "id": 11,
+      "options": {
+        "colorMode": "value",
+        "graphMode": "area",
+        "justifyMode": "auto",
+        "orientation": "auto",
+        "reduceOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "fields": "",
+          "values": false
+        },
+        "textMode": "auto"
+      },
+      "pluginVersion": "8.5.0",
+      "targets": [
+        {
+          "expr": "sum(rate(anon_bot_answers_total[1m])) * 60",
+          "interval": "",
+          "legendFormat": "Answers/min",
+          "refId": "A"
+        }
+      ],
+      "title": "Answers per Minute",
+      "type": "stat"
+    }
+  ],
+  "refresh": "5s",
+  "schemaVersion": 30,
+  "style": "dark",
+  "tags": [
+    "anonbot",
+    "overview",
+    "monitoring"
+  ],
+  "templating": {
+    "list": []
+  },
+  "time": {
+    "from": "now-1h",
+    "to": "now"
+  },
+  "timepicker": {},
+  "timezone": "",
+  "title": "AnonBot Overview",
+  "uid": "anonbot-overview",
+  "version": 1,
+  "weekStart": ""
+}
--- a/infra/grafana/provisioning/dashboards/anonbot-performance-dashboard.json
+++ b/infra/grafana/provisioning/dashboards/anonbot-performance-dashboard.json
@@ -0,0 +1,877 @@
+{
+  "annotations": {
+    "list": [
+      {
+        "builtIn": 1,
+        "datasource": {
+          "type": "grafana",
+          "uid": "-- Grafana --"
+        },
+        "enable": true,
+        "hide": true,
+        "iconColor": "rgba(0, 211, 255, 1)",
+        "name": "Annotations & Alerts",
+        "type": "dashboard"
+      }
+    ]
+  },
+  "editable": true,
+  "fiscalYearStartMonth": 0,
+  "graphTooltip": 0,
+  "id": null,
+  "links": [],
+  "liveNow": false,
+  "panels": [
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "s"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 0
+      },
+      "id": 1,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "histogram_quantile(0.95, rate(anon_bot_message_processing_seconds_bucket[5m]))",
+          "interval": "",
+          "legendFormat": "Message Processing 95th percentile",
+          "refId": "A"
+        },
+        {
+          "expr": "histogram_quantile(0.95, rate(anon_bot_question_processing_seconds_bucket[5m]))",
+          "interval": "",
+          "legendFormat": "Question Processing 95th percentile",
+          "refId": "B"
+        },
+        {
+          "expr": "histogram_quantile(0.95, rate(anon_bot_answer_processing_seconds_bucket[5m]))",
+          "interval": "",
+          "legendFormat": "Answer Processing 95th percentile",
+          "refId": "C"
+        }
+      ],
+      "title": "Response Time - 95th Percentile",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "ops"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 0
+      },
+      "id": 2,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(anon_bot_message_processing_seconds_bucket[5m])",
+          "interval": "",
+          "legendFormat": "{{le}}",
+          "refId": "A"
+        }
+      ],
+      "title": "Latency Bucket Rates - Message Processing",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "s"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 8
+      },
+      "id": 3,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "histogram_quantile(0.95, rate(anon_bot_db_query_duration_seconds_bucket[5m]))",
+          "interval": "",
+          "legendFormat": "DB Query 95th percentile - {{operation}}/{{table}}",
+          "refId": "A"
+        }
+      ],
+      "title": "Database Performance - Query Duration",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "thresholds"
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 6,
+        "x": 0,
+        "y": 16
+      },
+      "id": 4,
+      "options": {
+        "colorMode": "value",
+        "graphMode": "area",
+        "justifyMode": "auto",
+        "orientation": "auto",
+        "reduceOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "fields": "",
+          "values": false
+        },
+        "textMode": "auto"
+      },
+      "pluginVersion": "8.5.0",
+      "targets": [
+        {
+          "expr": "anon_bot_db_connections_active",
+          "interval": "",
+          "legendFormat": "Active DB Connections",
+          "refId": "A"
+        }
+      ],
+      "title": "Database Connections - Active",
+      "type": "stat"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "thresholds"
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "yellow",
+                "value": 80
+              },
+              {
+                "color": "red",
+                "value": 100
+              }
+            ]
+          },
+          "unit": "percent"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 6,
+        "x": 6,
+        "y": 16
+      },
+      "id": 5,
+      "options": {
+        "colorMode": "value",
+        "graphMode": "area",
+        "justifyMode": "auto",
+        "orientation": "auto",
+        "reduceOptions": {
+          "calcs": [
+            "lastNotNull"
+          ],
+          "fields": "",
+          "values": false
+        },
+        "textMode": "auto"
+      },
+      "pluginVersion": "8.5.0",
+      "targets": [
+        {
+          "expr": "anon_bot_db_pool_utilization_percent",
+          "interval": "",
+          "legendFormat": "Pool Utilization %",
+          "refId": "A"
+        }
+      ],
+      "title": "DB Pool Utilization",
+      "type": "stat"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "percent"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 8
+      },
+      "id": 6,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "sum(rate(anon_bot_messages_total{status=\"success\"}[5m])) / sum(rate(anon_bot_messages_total[5m])) * 100",
+          "interval": "",
+          "legendFormat": "Messages Success Rate",
+          "refId": "A"
+        },
+        {
+          "expr": "sum(rate(anon_bot_questions_total{status=\"processed\"}[5m])) / sum(rate(anon_bot_questions_total[5m])) * 100",
+          "interval": "",
+          "legendFormat": "Questions Success Rate",
+          "refId": "B"
+        },
+        {
+          "expr": "sum(rate(anon_bot_answers_total{status=\"sent\"}[5m])) / sum(rate(anon_bot_answers_total[5m])) * 100",
+          "interval": "",
+          "legendFormat": "Answers Success Rate",
+          "refId": "C"
+        }
+      ],
+      "title": "Success/Error Rates",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 16
+      },
+      "id": 7,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(anon_bot_errors_total[5m])",
+          "interval": "",
+          "legendFormat": "{{component}} - {{error_type}}",
+          "refId": "A"
+        }
+      ],
+      "title": "Error Rate by Component",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 24
+      },
+      "id": 8,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "sum(rate(anon_bot_errors_total[5m])) by (error_type)",
+          "interval": "",
+          "legendFormat": "{{error_type}}",
+          "refId": "A"
+        }
+      ],
+      "title": "Error Types Distribution",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 24
+      },
+      "id": 9,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(anon_bot_db_queries_total{status=\"error\"}[5m])",
+          "interval": "",
+          "legendFormat": "{{operation}}/{{table}}",
+          "refId": "A"
+        }
+      ],
+      "title": "Database Errors",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 32
+      },
+      "id": 10,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "expr": "rate(anon_bot_pagination_errors_total[5m])",
+          "interval": "",
+          "legendFormat": "{{entity_type}} - {{error_type}}",
+          "refId": "A"
+        }
+      ],
+      "title": "Pagination Errors",
+      "type": "timeseries"
+    }
+  ],
+  "refresh": "5s",
+  "schemaVersion": 30,
+  "style": "dark",
+  "tags": [
+    "anonbot",
+    "performance",
+    "monitoring"
+  ],
+  "templating": {
+    "list": []
+  },
+  "time": {
+    "from": "now-1h",
+    "to": "now"
+  },
+  "timepicker": {},
+  "timezone": "",
+  "title": "Performance AnonBot",
+  "uid": "anonbot-performance",
+  "version": 1,
+  "weekStart": ""
+}
--- a/infra/grafana/provisioning/dashboards/dashboards.yml
+++ b/infra/grafana/provisioning/dashboards/dashboards.yml
@@ -0,0 +1,16 @@
+# Grafana Dashboard Provisioning Configuration
+# This file configures automatic dashboard import
+
+apiVersion: 1
+
+providers:
+  - name: 'default'
+    orgId: 1
+    folder: ''
+    type: file
+    disableDeletion: false
+    updateIntervalSeconds: 10
+    allowUiUpdates: true
+    options:
+      path: /etc/grafana/provisioning/dashboards
+      foldersFromFilesStructure: true
--- a/infra/grafana/provisioning/dashboards/grafana-rate-limiting-dashboard.json
+++ b/infra/grafana/provisioning/dashboards/grafana-rate-limiting-dashboard.json
--- a/infra/grafana/provisioning/dashboards/node-exporter-full-dashboard.json
+++ b/infra/grafana/provisioning/dashboards/node-exporter-full-dashboard.json
--- a/infra/grafana/provisioning/dashboards/server-dashboard.json
+++ b/infra/grafana/provisioning/dashboards/server-dashboard.json
@@ -1,224 +0,0 @@
-{
-  "id": null,
-  "title": "Server Monitoring",
-  "tags": ["monitoring", "server"],
-  "style": "dark",
-  "timezone": "browser",
-  "panels": [
-    {
-      "id": 1,
-      "title": "CPU Usage",
-      "type": "stat",
-      "targets": [
-        {
-          "expr": "cpu_usage_percent",
-          "legendFormat": "CPU %"
-        }
-      ],
-      "fieldConfig": {
-        "defaults": {
-          "color": {
-            "mode": "thresholds"
-          },
-          "thresholds": {
-            "steps": [
-              {"color": "green", "value": null},
-              {"color": "yellow", "value": 70},
-              {"color": "red", "value": 90}
-            ]
-          },
-          "unit": "percent"
-        }
-      },
-      "gridPos": {"h": 8, "w": 6, "x": 0, "y": 0}
-    },
-    {
-      "id": 2,
-      "title": "RAM Usage",
-      "type": "stat",
-      "targets": [
-        {
-          "expr": "ram_usage_percent",
-          "legendFormat": "RAM %"
-        }
-      ],
-      "fieldConfig": {
-        "defaults": {
-          "color": {
-            "mode": "thresholds"
-          },
-          "thresholds": {
-            "steps": [
-              {"color": "green", "value": null},
-              {"color": "yellow", "value": 70},
-              {"color": "red", "value": 90}
-            ]
-          },
-          "unit": "percent"
-        }
-      },
-      "gridPos": {"h": 8, "w": 6, "x": 6, "y": 0}
-    },
-    {
-      "id": 3,
-      "title": "Disk Usage",
-      "type": "stat",
-      "targets": [
-        {
-          "expr": "disk_usage_percent",
-          "legendFormat": "Disk %"
-        }
-      ],
-      "fieldConfig": {
-        "defaults": {
-          "color": {
-            "mode": "thresholds"
-          },
-          "thresholds": {
-            "steps": [
-              {"color": "green", "value": null},
-              {"color": "yellow", "value": 80},
-              {"color": "red", "value": 95}
-            ]
-          },
-          "unit": "percent"
-        }
-      },
-      "gridPos": {"h": 8, "w": 6, "x": 12, "y": 0}
-    },
-    {
-      "id": 4,
-      "title": "Load Average",
-      "type": "timeseries",
-      "targets": [
-        {
-          "expr": "load_average_1m",
-          "legendFormat": "1m"
-        },
-        {
-          "expr": "load_average_5m",
-          "legendFormat": "5m"
-        },
-        {
-          "expr": "load_average_15m",
-          "legendFormat": "15m"
-        }
-      ],
-      "fieldConfig": {
-        "defaults": {
-          "color": {
-            "mode": "palette-classic"
-          },
-          "custom": {
-            "axisLabel": "",
-            "axisPlacement": "auto",
-            "barAlignment": 0,
-            "drawStyle": "line",
-            "fillOpacity": 10,
-            "gradientMode": "none",
-            "hideFrom": {
-              "legend": false,
-              "tooltip": false,
-              "vis": false
-            },
-            "lineInterpolation": "linear",
-            "lineWidth": 1,
-            "pointSize": 5,
-            "scaleDistribution": {
-              "type": "linear"
-            },
-            "showPoints": "never",
-            "spanNulls": false,
-            "stacking": {
-              "group": "A",
-              "mode": "none"
-            },
-            "thresholdsStyle": {
-              "mode": "off"
-            }
-          }
-        }
-      },
-      "gridPos": {"h": 8, "w": 12, "x": 0, "y": 8}
-    },
-    {
-      "id": 5,
-      "title": "System Uptime",
-      "type": "stat",
-      "targets": [
-        {
-          "expr": "system_uptime_seconds",
-          "legendFormat": "Uptime"
-        }
-      ],
-      "fieldConfig": {
-        "defaults": {
-          "color": {
-            "mode": "thresholds"
-          },
-          "unit": "s"
-        }
-      },
-      "gridPos": {"h": 8, "w": 6, "x": 12, "y": 8}
-    },
-    {
-      "id": 6,
-      "title": "Disk I/O Usage",
-      "type": "stat",
-      "targets": [
-        {
-          "expr": "disk_io_percent",
-          "legendFormat": "Disk I/O %"
-        }
-      ],
-      "fieldConfig": {
-        "defaults": {
-          "color": {
-            "mode": "thresholds"
-          },
-          "thresholds": {
-            "steps": [
-              {"color": "green", "value": null},
-              {"color": "yellow", "value": 50},
-              {"color": "red", "value": 80}
-            ]
-          },
-          "unit": "percent"
-        }
-      },
-      "gridPos": {"h": 8, "w": 6, "x": 0, "y": 16}
-    },
-    {
-      "id": 7,
-      "title": "Swap Usage",
-      "type": "stat",
-      "targets": [
-        {
-          "expr": "swap_usage_percent",
-          "legendFormat": "Swap %"
-        }
-      ],
-      "fieldConfig": {
-        "defaults": {
-          "color": {
-            "mode": "thresholds"
-          },
-          "thresholds": {
-            "steps": [
-              {"color": "green", "value": null},
-              {"color": "yellow", "value": 50},
-              {"color": "red", "value": 80}
-            ]
-          },
-          "unit": "percent"
-        }
-      },
-      "gridPos": {"h": 8, "w": 6, "x": 6, "y": 16}
-    }
-  ],
-  "time": {
-    "from": "now-1h",
-    "to": "now"
-  },
-  "refresh": "30s"
-}
--- a/infra/grafana/provisioning/dashboards/telegram-bot-dashboards.json
+++ b/infra/grafana/provisioning/dashboards/telegram-bot-dashboards.json
@@ -899,6 +899,190 @@
      ],
      "title": "Database Query Time (P95)",
      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "short"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 40
+      },
+      "id": 11,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus",
+            "uid": "PBFA97CFB590B2093"
+          },
+          "expr": "topk(5, sum by(content_type) (rate(media_processing_total[5m])))",
+          "refId": "A"
+        }
+      ],
+      "title": "Top Media Types",
+      "type": "timeseries"
+    },
+    {
+      "datasource": {
+        "type": "prometheus",
+        "uid": "PBFA97CFB590B2093"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "vis": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "bytes"
+        },
+        "overrides": []
+      },
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 40
+      },
+      "id": 12,
+      "options": {
+        "legend": {
+          "calcs": [],
+          "displayMode": "list",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "single",
+          "sort": "none"
+        }
+      },
+      "targets": [
+        {
+          "datasource": {
+            "type": "prometheus",
+            "uid": "PBFA97CFB590B2093"
+          },
+          "expr": "histogram_quantile(0.5, sum by (le, content_type) (rate(file_download_size_bytes_bucket[5m])))",
+          "refId": "A",
+          "legendFormat": "{{content_type}} P50"
+        },
+        {
+          "datasource": {
+            "type": "prometheus",
+            "uid": "PBFA97CFB590B2093"
+          },
+          "expr": "histogram_quantile(0.95, sum by (le, content_type) (rate(file_download_size_bytes_bucket[5m])))",
+          "refId": "B",
+          "legendFormat": "{{content_type}} P95"
+        }
+      ],
+      "title": "File Download Size Distribution",
+      "type": "timeseries"
    }
  ],
  "refresh": "5s",
--- a/infra/grafana/provisioning/datasources/prometheus.yml
+++ b/infra/grafana/provisioning/datasources/prometheus.yml
@@ -4,5 +4,13 @@ datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
-    url: http://prometheus:9090
+    url: http://prometheus:9090/prometheus
    isDefault: true
+    jsonData:
+      httpMethod: POST
+      manageAlerts: true
+      prometheusType: Prometheus
+      prometheusVersion: 2.40.0
+      cacheLevel: 'High'
+      disableRecordingRules: false
+      incrementalQueryOverlapWindow: 10m
--- a/infra/logrotate/README.md
+++ b/infra/logrotate/README.md
@@ -0,0 +1,77 @@
+# Logrotate Configuration
+
+This directory contains configuration files for automatic log rotation.
+
+## Files
+
+### `logrotate_bots.conf.j2`
+Configuration template for bot and Docker container logs:
+- Bot logs in `{{ project_root }}/bots/*/logs/*.log`
+- Bot stderr logs in `{{ project_root }}/bots/*/bot_stderr.log`
+- Docker container logs in `/var/lib/docker/containers/*/*.log`
+
+### `logrotate_system.conf.j2`
+Configuration template for system services:
+- Nginx logs in `/var/log/nginx/*.log`
+- System logs (syslog, mail, auth, cron, and others)
+- Fail2ban logs
+- Docker daemon logs
+- Prometheus node exporter logs
+
+## Environment variables
+
+The configurations use the following variables from the `.env` file:
+
+```bash
+# Logrotate settings
+LOGROTATE_RETENTION_DAYS=30        # Number of days to keep logs
+LOGROTATE_COMPRESS=true            # Compress old logs
+LOGROTATE_DELAYCOMPRESS=true       # Delay compression by one rotation
+```
+
+## Usage
+
+These templates are applied automatically when the Ansible playbook runs. They produce configuration files in `/etc/logrotate.d/` on the server.
+
+### Manual application
+
+To apply the configurations manually:
+
+```bash
+# Copy the configurations (the .j2 files are templates: substitute {{ project_root }} and {{ deploy_user }} before use)
+sudo cp logrotate_bots.conf.j2 /etc/logrotate.d/bots
+sudo cp logrotate_system.conf.j2 /etc/logrotate.d/system
+
+# Check the configuration (debug mode, nothing is rotated)
+sudo logrotate -d /etc/logrotate.conf
+
+# Force rotation
+sudo logrotate -f /etc/logrotate.conf
+```
+
+## Default settings
+
+- **Daily rotation**: all logs are rotated every day
+- **Compression**: old logs are compressed with gzip
+- **Retention**: 30 days (configurable via the variable)
+- **Automatic service restart**: after log rotation
+
+## Log structure
+
+After setup, the logs are organized as follows:
+
+```
+/var/log/
+├── nginx/
+│   ├── access.log
+│   ├── access.log.1
+│   ├── error.log
+│   └── error.log.1
+└── ...
+
+{{ project_root }}/bots/*/logs/
+├── bot.log
+├── bot.log.1
+├── bot.log.2.gz
+└── ...
+```
--- a/infra/logrotate/logrotate_bots.conf.j2
+++ b/infra/logrotate/logrotate_bots.conf.j2
@@ -0,0 +1,49 @@
+# Logrotate configuration for bot applications
+# This file manages log rotation for all bot services
+
+{{ project_root }}/bots/*/logs/*.log {
+    daily
+    missingok
+    rotate 30
+    compress
+    delaycompress
+    notifempty
+    create 0644 {{ deploy_user }} {{ deploy_user }}
+    postrotate
+        # Restart bot services if they are running
+        if [ -f /home/{{ deploy_user }}/.docker-compose-pid ]; then
+            cd {{ project_root }} && docker-compose restart
+        fi
+    endscript
+}
+
+{{ project_root }}/bots/*/bot_stderr.log {
+    daily
+    missingok
+    rotate 30
+    compress
+    delaycompress
+    notifempty
+    create 0644 {{ deploy_user }} {{ deploy_user }}
+    postrotate
+        # Restart bot services if they are running
+        if [ -f /home/{{ deploy_user }}/.docker-compose-pid ]; then
+            cd {{ project_root }} && docker-compose restart
+        fi
+    endscript
+}
+
+# Docker container logs
+/var/lib/docker/containers/*/*.log {
+    daily
+    missingok
+    rotate 7
+    compress
+    delaycompress
+    notifempty
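+    # Docker keeps each container log open and does not reopen it on
+    # reload, so copy and truncate in place instead of renaming the
+    # file and creating a new one (assumes the json-file log driver;
+    # a few lines may be lost during truncation).
+    copytruncate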
+}
--- a/infra/logrotate/logrotate_system.conf.j2
+++ b/infra/logrotate/logrotate_system.conf.j2
@@ -0,0 +1,100 @@
+# Logrotate configuration for system services
+# This file manages log rotation for system services
+
+# Nginx logs
+/var/log/nginx/*.log {
+    daily
+    missingok
+    rotate 30
+    compress
+    delaycompress
+    notifempty
+    create 0644 www-data adm
+    sharedscripts
+    postrotate
+        if [ -f /var/run/nginx.pid ]; then
+            kill -USR1 `cat /var/run/nginx.pid`
+        fi
+    endscript
+}
+
+# System logs
+/var/log/syslog {
+    daily
+    missingok
+    rotate 7
+    compress
+    delaycompress
+    notifempty
+    create 0644 syslog adm
+    postrotate
+        /usr/lib/rsyslog/rsyslog-rotate
+    endscript
+}
+
+/var/log/mail.info
+/var/log/mail.warn
+/var/log/mail.err
+/var/log/mail.log
+/var/log/daemon.log
+/var/log/kern.log
+/var/log/auth.log
+/var/log/user.log
+/var/log/lpr.log
+/var/log/cron.log
+/var/log/debug
+/var/log/messages {
+    daily
+    missingok
+    rotate 7
+    compress
+    delaycompress
+    notifempty
+    create 0644 syslog adm
+    sharedscripts
+    postrotate
+        /usr/lib/rsyslog/rsyslog-rotate
+    endscript
+}
+
+# Fail2ban logs
+/var/log/fail2ban.log {
+    daily
+    missingok
+    rotate 7
+    compress
+    delaycompress
+    notifempty
+    create 0644 root root
+    postrotate
+        systemctl reload fail2ban
+    endscript
+}
+
+# Docker daemon logs
+/var/log/docker.log {
+    daily
+    missingok
+    rotate 7
+    compress
+    delaycompress
+    notifempty
+    create 0644 root root
+    postrotate
+        systemctl reload docker
+    endscript
+}
+
+# Prometheus node exporter logs
+/var/log/prometheus-node-exporter.log {
+    daily
+    missingok
+    rotate 7
+    compress
+    delaycompress
+    notifempty
+    create 0644 prometheus prometheus
+    postrotate
+        systemctl reload prometheus-node-exporter
+    endscript
+}
--- a/infra/monitoring/README_PID_MANAGER.md
+++ b/infra/monitoring/README_PID_MANAGER.md
@@ -1,188 +0,0 @@
-# PID Manager - Управление процессами ботов
-
-## Описание
-
-`pid_manager.py` - это общий модуль для управления PID файлами всех ботов в проекте. Он обеспечивает создание, отслеживание и очистку PID файлов для мониторинга состояния процессов.
-
-## Использование
-
-### Для новых ботов
-
-Чтобы добавить PID мониторинг в новый бот, выполните следующие шаги:
-
-1. **Импортируйте PID менеджер в ваш скрипт запуска:**
-
-```python
-import sys
-import os
-
-# Добавляем путь к инфраструктуре в sys.path
-infra_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))), 'infra', 'monitoring')
-if infra_path not in sys.path:
-    sys.path.insert(0, infra_path)
-
-from pid_manager import get_bot_pid_manager
-```
-
-2. **Создайте PID менеджер в начале main функции:**
-
-```python
-async def main():
-    # Создаем PID менеджер для отслеживания процесса (если доступен)
-    pid_manager = None
-    if get_bot_pid_manager:
-        pid_manager = get_bot_pid_manager("your_bot_name")  # Замените на имя вашего бота
-        if not pid_manager.create_pid_file():
-            logger.error("Не удалось создать PID файл, завершаем работу")
-            return
-    else:
-        logger.info("PID менеджер недоступен, запуск без PID файла")
-    
-    # Ваш код запуска бота...
-```
-
-3. **Очистите PID файл при завершении:**
-
-```python
-try:
-    # Ваш код работы бота...
-finally:
-    # Очищаем PID файл (если PID менеджер доступен)
-    if pid_manager:
-        pid_manager.cleanup_pid_file()
-```
-
-### Для мониторинга
-
-Чтобы добавить новый бот в систему мониторинга:
-
-```python
-from infra.monitoring.metrics_collector import MetricsCollector
-
-# Создаем экземпляр коллектора метрик
-collector = MetricsCollector()
-
-# Добавляем новый бот в мониторинг
-collector.add_bot_to_monitoring("your_bot_name")
-
-# Теперь можно проверять статус
-status, uptime = collector.check_process_status("your_bot_name")
-```
-
-## Структура файлов
-
-```
-prod/
-├── infra/
-│   └── monitoring/
-│       ├── pid_manager.py          # Основной модуль
-│       ├── metrics_collector.py    # Мониторинг процессов
-│       └── README_PID_MANAGER.md   # Эта документация
-├── bots/
-│   ├── telegram-helper-bot/
-│   │   └── run_helper.py           # Использует PID менеджер
-│   └── your-new-bot/
-│       └── run_your_bot.py         # Будет использовать PID менеджер
-├── helper_bot.pid                  # PID файл helper_bot
-├── your_bot.pid                    # PID файл вашего бота
-└── .gitignore                      # Содержит *.pid
-```
-
-## API
-
-### PIDManager
-
- `create_pid_file()` - Создает PID файл
- `cleanup_pid_file()` - Удаляет PID файл
- `is_running()` - Проверяет, запущен ли процесс
- `get_pid()` - Получает PID из файла
-
-### Функции
-
- `get_bot_pid_manager(bot_name)` - Создает PID менеджер для бота
- `create_pid_manager(process_name, project_root)` - Создает PID менеджер с настройками
-
-## Examples
-
-### Simple bot
-
-```python
-import asyncio
-from pid_manager import get_bot_pid_manager
-
-async def main():
-    # Create the PID manager
-    pid_manager = get_bot_pid_manager("simple_bot")
-    if not pid_manager.create_pid_file():
-        print("Failed to create the PID file")
-        return
-    
-    try:
-        # Your bot code
-        print("Bot started...")
-        await asyncio.sleep(3600)  # Run for one hour
-    finally:
-        # Remove the PID file
-        pid_manager.cleanup_pid_file()
-
-if __name__ == '__main__':
-    asyncio.run(main())
-```
-
-### Bot with signal handling
-
-```python
-import asyncio
-import signal
-from pid_manager import get_bot_pid_manager
-
-async def main():
-    pid_manager = get_bot_pid_manager("advanced_bot")
-    if not pid_manager.create_pid_file():
-        return
-    
-    # Event used to request a graceful shutdown
-    shutdown_event = asyncio.Event()
-    
-    def signal_handler(signum, frame):
-        print(f"Received signal {signum}, shutting down...")
-        shutdown_event.set()
-    
-    # Register the signal handlers
-    signal.signal(signal.SIGINT, signal_handler)
-    signal.signal(signal.SIGTERM, signal_handler)
-    
-    try:
-        # Your bot code
-        await shutdown_event.wait()
-    finally:
-        pid_manager.cleanup_pid_file()
-
-if __name__ == '__main__':
-    asyncio.run(main())
-```
-
-## Notes
-
- PID files are created in the project root
- All PID files are ignored by Git (see `.gitignore`)
- The PID manager handles SIGTERM and SIGINT automatically
- The PID file is removed automatically when the process exits
- The monitoring system automatically discovers PID files in the project root
-
-## Isolated run
-
-When a bot is started in isolation (without access to the main project):
-
- The PID manager is not created
- The bot starts without a PID file
- The log shows the message "PID manager unavailable (isolated run), PID file not created"
- This lets the bot run in any environment without errors
-
-## Automatic detection
-
-The system automatically detects whether the PID manager is available:
-
-1. **In the main project**: the PID manager is available and a PID file is created for monitoring
-2. **In isolation**: the PID manager is unavailable and the bot runs without a PID file
-3. **Fallback**: if the PID manager is unavailable, the bot keeps working normally
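The automatic detection described in the removed README can be sketched as an optional import. This is a minimal illustration, not the project's actual code: `load_pid_manager` is a hypothetical helper, and the module name is a parameter only so that the fallback branch can be exercised without the project on the import path. It assumes the module exposes `get_bot_pid_manager(bot_name)`, as the README's API section lists.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_pid_manager(bot_name, module_name="pid_manager"):
    """Return a PID manager for bot_name, or None when running in isolation.

    Hypothetical helper: assumes the module exposes
    get_bot_pid_manager(bot_name), as listed in the README's API section.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Isolated run: the main project is not importable, so skip the PID file.
        logger.info("PID manager unavailable (isolated run), PID file not created")
        return None
    return module.get_bot_pid_manager(bot_name)


# Fallback path: a module that does not exist yields None instead of an error.
pid_manager = load_pid_manager("demo_bot", module_name="no_such_pid_manager_module")
if pid_manager is not None:
    pid_manager.create_pid_file()
```

Callers then guard every use with `if pid_manager:`, which is the pattern the README's cleanup step already follows.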
--- a/infra/monitoring/__init__.py
+++ b/infra/monitoring/__init__.py
@@ -1,7 +0,0 @@
-# Infrastructure Monitoring Module
-
-from .metrics_collector import MetricsCollector
-from .message_sender import MessageSender
-from .server_monitor import ServerMonitor
-
-__all__ = ['MetricsCollector', 'MessageSender', 'ServerMonitor']
--- a/infra/monitoring/check_grafana.py
+++ b/infra/monitoring/check_grafana.py
@@ -1,127 +0,0 @@
-#!/usr/bin/env python3
-"""
-Script for checking the status of Grafana and its dashboards
-"""
-
-import requests
-import json
-import sys
-from datetime import datetime
-
-def check_grafana_status():
-    """Check the Grafana status"""
-    try:
-        response = requests.get("http://localhost:3000/api/health", timeout=5)
-        if response.status_code == 200:
-            data = response.json()
-            print(f"✅ Grafana is running (version: {data.get('version', 'unknown')})")
-            return True
-        else:
-            print(f"❌ Grafana: HTTP {response.status_code}")
-            return False
-    except Exception as e:
-        print(f"❌ Grafana: connection error - {e}")
-        return False
-
-def check_prometheus_connection():
-    """Check that Prometheus is reachable for Grafana"""
-    try:
-        # Make sure Prometheus is reachable
-        response = requests.get("http://localhost:9090/api/v1/targets", timeout=5)
-        if response.status_code == 200:
-            print("✅ Prometheus is reachable for Grafana")
-            return True
-        else:
-            print(f"❌ Prometheus: HTTP {response.status_code}")
-            return False
-    except Exception as e:
-        print(f"❌ Prometheus: connection error - {e}")
-        return False
-
-def check_metrics_availability():
-    """Check that the metrics endpoint is available"""
-    try:
-        response = requests.get("http://localhost:9091/metrics", timeout=5)
-        if response.status_code == 200:
-            content = response.text
-            if "cpu_usage_percent" in content and "ram_usage_percent" in content:
-                print("✅ Metrics are available and contain data")
-                return True
-            else:
-                print("⚠️  Metrics are available, but the data is incomplete")
-                return False
-        else:
-            print(f"❌ Metrics: HTTP {response.status_code}")
-            return False
-    except Exception as e:
-        print(f"❌ Metrics: connection error - {e}")
-        return False
-
-def check_prometheus_targets():
-    """Check the status of the Prometheus targets"""
-    try:
-        response = requests.get("http://localhost:9090/api/v1/targets", timeout=5)
-        if response.status_code == 200:
-            data = response.json()
-            targets = data.get('data', {}).get('activeTargets', [])
-            
-            print("\n📊 Prometheus target status:")
-            for target in targets:
-                job = target.get('labels', {}).get('job', 'unknown')
-                instance = target.get('labels', {}).get('instance', 'unknown')
-                health = target.get('health', 'unknown')
-                last_error = target.get('lastError', '')
-                
-                status_emoji = "✅" if health == "up" else "❌"
-                print(f"  {status_emoji} {job} ({instance}): {health}")
-                
-                if last_error:
-                    print(f"    Error: {last_error}")
-            
-            return True
-        else:
-            print(f"❌ Prometheus API: HTTP {response.status_code}")
-            return False
-    except Exception as e:
-        print(f"❌ Prometheus API: connection error - {e}")
-        return False
-
-def main():
-    """Main check routine"""
-    print(f"🔍 Checking Grafana and the monitoring stack - {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
-    print("=" * 70)
-    
-    # Check every component
-    all_ok = True
-    
-    if not check_grafana_status():
-        all_ok = False
-    
-    if not check_prometheus_connection():
-        all_ok = False
-    
-    if not check_metrics_availability():
-        all_ok = False
-    
-    if not check_prometheus_targets():
-        all_ok = False
-    
-    print("\n" + "=" * 70)
-    if all_ok:
-        print("🎉 All components are working correctly!")
-        print("\n📋 Available endpoints:")
-        print("  • Grafana: http://localhost:3000 (admin/admin)")
-        print("  • Prometheus: http://localhost:9090")
-        print("  • Metrics: http://localhost:9091/metrics")
-        print("\n📊 The following dashboards should be available in Grafana:")
-        print("  • Server Monitoring")
-        print("  • Server Monitoring Dashboard")
-        print("\n💡 If the dashboards are not visible, set them up manually:")
-        print("  • See: GRAFANA_MANUAL_SETUP.md")
-    else:
-        print("⚠️  Problems were detected in the monitoring stack")
-        print("   Check the logs and settings")
-        sys.exit(1)
-
-if __name__ == "__main__":
-    main()
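The removed `main()` runs each check unconditionally so that every failure gets printed, and only then exits non-zero. The same pattern can be written more compactly. The sketch below is an illustration only; `run_checks` and the stub checks are hypothetical names, not part of the script.

```python
from typing import Callable, Iterable


def run_checks(checks: Iterable[Callable[[], bool]]) -> bool:
    """Run every check and report whether all of them passed.

    The list comprehension forces each check to run; passing a generator
    straight to all() would stop at the first failure and hide later ones.
    """
    results = [check() for check in checks]
    return all(results)


# Stub checks standing in for check_grafana_status() and the other checks.
calls = []


def passing_check() -> bool:
    calls.append("pass")
    return True


def failing_check() -> bool:
    calls.append("fail")
    return False


all_ok = run_checks([failing_check, passing_check])
```

Here `all_ok` is `False`, yet both stubs were called, which matches the behavior of the four separate `if not ...:` blocks.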
--- a/infra/monitoring/main.py
+++ b/infra/monitoring/main.py
@@ -1,50 +0,0 @@
-#!/usr/bin/env python3
-"""
-Main entry point for running the server monitoring module
-"""
-
-import asyncio
-import logging
-import os
-import sys
-
-# Add the project root to the import path
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..'))
-
-from dotenv import load_dotenv
-from infra.monitoring.server_monitor import ServerMonitor
-
-# Load environment variables from the .env file
-load_dotenv()
-
-# Configure logging
-logging.basicConfig(
-    level=logging.INFO,
-    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
-)
-
-logger = logging.getLogger(__name__)
-
-
-async def main():
-    """Main monitoring entry point"""
-    try:
-        # Create the monitor instance
-        monitor = ServerMonitor()
-        
-        # Send a status report on startup
-        await monitor.send_startup_status()
-        
-        # Run the main monitoring loop
-        await monitor.monitor_loop()
-        
-    except KeyboardInterrupt:
-        logger.info("Monitoring stopped by the user")
-    except Exception as e:
-        logger.error(f"Critical error in monitoring: {e}")
-        raise
-
-
-if __name__ == "__main__":
-    # Run the async entry point
-    asyncio.run(main())
--- a/infra/monitoring/message_sender.py
+++ b/infra/monitoring/message_sender.py
@@ -1,331 +0,0 @@
-import os
-import aiohttp
-import logging
-from datetime import datetime
-from typing import Dict, List, Tuple
-try:
-    from .metrics_collector import MetricsCollector
-except ImportError:
-    from metrics_collector import MetricsCollector
-
-logger = logging.getLogger(__name__)
-
-
-class MessageSender:
-    def __init__(self):
-        # Read configuration from environment variables
-        self.telegram_bot_token = os.getenv('TELEGRAM_MONITORING_BOT_TOKEN')
-        self.group_for_logs = os.getenv('GROUP_MONITORING_FOR_LOGS')
-        self.important_logs = os.getenv('IMPORTANT_MONITORING_LOGS')
-        
-        # Status report interval in minutes (defaults to 2)
-        self.status_update_interval_minutes = int(os.getenv('STATUS_UPDATE_INTERVAL_MINUTES', 2))
-        
-        # Create the metrics collector instance
-        self.metrics_collector = MetricsCollector()
-        
-        # Time the last status report was sent
-        self.last_status_time = None
-        
-        if not self.telegram_bot_token:
-            logger.warning("TELEGRAM_MONITORING_BOT_TOKEN is not set in the environment")
-        if not self.group_for_logs:
-            logger.warning("GROUP_MONITORING_FOR_LOGS is not set in the environment")
-        if not self.important_logs:
-            logger.warning("IMPORTANT_MONITORING_LOGS is not set in the environment")
-        
-        logger.info(f"Status report interval set to {self.status_update_interval_minutes} minutes")
-    
-    async def send_telegram_message(self, chat_id: str, message: str) -> bool:
-        """Send a message to Telegram by calling the Bot API directly"""
-        if not self.telegram_bot_token:
-            logger.error("TELEGRAM_MONITORING_BOT_TOKEN is not set")
-            return False
-        
-        try:
-            async with aiohttp.ClientSession() as session:
-                url = f"https://api.telegram.org/bot{self.telegram_bot_token}/sendMessage"
-                payload = {
-                    "chat_id": chat_id,
-                    "text": message,
-                    "parse_mode": "HTML"
-                }
-                
-                async with session.post(url, json=payload) as response:
-                    if response.status == 200:
-                        logger.info(f"Message sent to chat {chat_id}")
-                        return True
-                    else:
-                        response_text = await response.text()
-                        logger.error(f"Telegram send error: {response.status} - {response_text}")
-                        return False
-                        
-        except Exception as e:
-            logger.error(f"Error while sending a message to Telegram: {e}")
-            return False
-    
-    def should_send_status(self) -> bool:
-        """Return True if a status report is due (every N minutes)"""
-        now = datetime.now()
-        
-        # Log each decision for diagnostics. The module-level `logger`
-        # defined at the top of this file is used, so no local import
-        # or re-binding of the logger is needed here.
-        
-        if self.last_status_time is None:
-            logger.info("should_send_status: last_status_time is None, sending status")
-            self.last_status_time = now
-            return True
-        
-        # Compute the elapsed time in minutes
-        time_diff_minutes = (now - self.last_status_time).total_seconds() / 60
-        logger.info(f"should_send_status: {time_diff_minutes:.1f} min since the last report, {self.status_update_interval_minutes} min required")
-        
-        # Send once N minutes have passed since the last report
-        if time_diff_minutes >= self.status_update_interval_minutes:
-            logger.info(f"should_send_status: sending status ({time_diff_minutes:.1f} min elapsed)")
-            self.last_status_time = now
-            return True
-        
-        logger.info(f"should_send_status: not sending status ({time_diff_minutes:.1f} min elapsed)")
-        return False
-    
-    def should_send_startup_status(self) -> bool:
-        """Return True if the startup status report should be sent"""
-        return self.last_status_time is None
-    
-    def _get_disk_space_emoji(self, disk_percent: float) -> str:
-        """Return the emoji for disk space usage"""
-        if disk_percent < 60:
-            return "🟢"
-        elif disk_percent < 90:
-            return "⚠️"
-        else:
-            return "🚨"
-    
-    def _get_cpu_emoji(self, cpu_percent: float) -> str:
-        """Return the emoji for CPU usage"""
-        if cpu_percent < 50:
-            return "🟢"
-        elif cpu_percent < 80:
-            return "⚠️"
-        else:
-            return "🚨"
-    
-    def _get_memory_emoji(self, memory_percent: float) -> str:
-        """Return the emoji for memory usage (RAM/Swap)"""
-        if memory_percent < 60:
-            return "🟢"
-        elif memory_percent < 85:
-            return "⚠️"
-        else:
-            return "🚨"
-    
-    def _get_load_average_emoji(self, load_avg: float, cpu_count: int) -> str:
-        """Return the emoji for the load average"""
-        # Load average is considered normal below 1.0 per core
-        # and critical at 2.0 per core or above
-        load_per_core = load_avg / cpu_count
-        if load_per_core < 1.0:
-            return "🟢"
-        elif load_per_core < 2.0:
-            return "⚠️"
-        else:
-            return "🚨"
-    
-    def _get_io_wait_emoji(self, io_wait_percent: float) -> str:
-        """Return the emoji for IO wait"""
-        # IO wait is considered normal below 5%
-        # and critical at 20% or above
-        if io_wait_percent < 5:
-            return "🟢"
-        elif io_wait_percent < 20:
-            return "⚠️"
-        else:
-            return "🚨"
-    
-    def get_status_message(self, system_info: Dict) -> str:
-        """Build the server status message"""
-        try:
-            helper_bot_status, helper_bot_uptime = self.metrics_collector.check_process_status('helper_bot')
-            
-            # Pick an emoji for each metric
-            cpu_emoji = self._get_cpu_emoji(system_info['cpu_percent'])
-            ram_emoji = self._get_memory_emoji(system_info['ram_percent'])
-            swap_emoji = self._get_memory_emoji(system_info['swap_percent'])
-            la_emoji = self._get_load_average_emoji(system_info['load_avg_1m'], system_info['cpu_count'])
-            io_wait_emoji = self._get_io_wait_emoji(system_info['io_wait_percent'])
-            disk_emoji = self._get_disk_space_emoji(system_info['disk_percent'])
-            
-            # Determine the monitoring level
-            monitoring_level = system_info.get('monitoring_level', 'unknown')
-            level_emoji = "🖥️" if monitoring_level == 'host' else "📦"
-            level_text = "Host" if monitoring_level == 'host' else "Container"
-            
-            message = f"""{level_emoji} <b>{level_text} status</b> | <code>{system_info['current_time']}</code>
---------------------------------
-<b>📊 Overall load:</b>
-CPU: <b>{system_info['cpu_percent']}%</b> {cpu_emoji} | LA: <b>{system_info['load_avg_1m']} / {system_info['cpu_count']}</b> {la_emoji} | IO Wait: <b>{system_info['io_wait_percent']}%</b> {io_wait_emoji}
-
-<b>💾 Memory:</b>
-RAM: <b>{system_info['ram_used']}/{system_info['ram_total']} GB</b> ({system_info['ram_percent']}%) {ram_emoji}
-Swap: <b>{system_info['swap_used']}/{system_info['swap_total']} GB</b> ({system_info['swap_percent']}%) {swap_emoji}
-
-<b>🗂️ Disk space:</b>
-Disk (/): <b>{system_info['disk_used']}/{system_info['disk_total']} GB</b> ({system_info['disk_percent']}%) {disk_emoji}
-
-<b>💿 Disk I/O:</b>
-Read: <b>{system_info['disk_read_speed']}</b> | Write: <b>{system_info['disk_write_speed']}</b>
-Disk utilization: <b>{system_info['disk_io_percent']}%</b>
-
-<b>🤖 Processes:</b>
-{helper_bot_status} helper-bot - {helper_bot_uptime}
---------------------------------
-⏰ Server uptime: {system_info['system_uptime']}
-🔍 Monitoring level: {level_text} ({monitoring_level})"""
-            
-            return message
-            
-        except Exception as e:
-            logger.error(f"Error while building the server status: {e}")
-            return f"Error while getting the server status: {e}"
-    
-    def get_alert_message(self, metric_name: str, current_value: float, details: str) -> str:
-        """Build the alert message"""
-        try:
-            # Look up the alert delay configured for this metric
-            delay_info = ""
-            if hasattr(self.metrics_collector, 'alert_delays'):
-                metric_type = metric_name.lower().replace('использование ', '').replace('заполнение диска (/)', 'disk')
-                if 'cpu' in metric_type:
-                    delay_info = f"⏱️ Alert delay: {self.metrics_collector.alert_delays['cpu']} s"
-                elif 'памят' in metric_type or 'ram' in metric_type:
-                    delay_info = f"⏱️ Alert delay: {self.metrics_collector.alert_delays['ram']} s"
-                elif 'диск' in metric_type or 'disk' in metric_type:
-                    delay_info = f"⏱️ Alert delay: {self.metrics_collector.alert_delays['disk']} s"
-            
-            message = f"""🚨  <b>ALERT: High server load!</b>
---------------------------------
-<b>Metric:</b> {metric_name}
-<b>Current value:</b> <b>{current_value}%</b> ⚠️
-<b>Threshold:</b> {self.metrics_collector.threshold}%
-
-<b>Details:</b>
-{details}
-
-{delay_info}
-
-<b>Server:</b> <code>{self.metrics_collector.os_type.upper()}</code>
-<b>Time:</b> <code>{datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</code>
---------------------------------"""
-            
-            return message
-            
-        except Exception as e:
-            logger.error(f"Error while building the alert: {e}")
-            return f"Error while building the alert: {e}"
-    
-    def get_recovery_message(self, metric_name: str, current_value: float, peak_value: float) -> str:
-        """Build the recovery message"""
-        try:
-            message = f"""✅  <b>RECOVERY: Load is back to normal</b>
---------------------------------
-<b>Metric:</b> {metric_name}
-<b>Current value:</b> <b>{current_value}%</b> ✔️
-<b>Previously exceeded:</b> up to {peak_value}%
-
-<b>Server:</b> <code>{self.metrics_collector.os_type.upper()}</code>
-<b>Time:</b> <code>{datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</code>
---------------------------------"""
-            
-            return message
-            
-        except Exception as e:
-            logger.error(f"Error while building the recovery message: {e}")
-            return f"Error while building the recovery message: {e}"
-    
-    async def send_status_message(self) -> bool:
-        """Send the server status to the log group"""
-        if not self.group_for_logs:
-            logger.warning("GROUP_MONITORING_FOR_LOGS is not set, skipping the status report")
-            return False
-        
-        try:
-            system_info = self.metrics_collector.get_system_info()
-            if not system_info:
-                logger.error("Failed to get system information")
-                return False
-            
-            status_message = self.get_status_message(system_info)
-            return await self.send_telegram_message(self.group_for_logs, status_message)
-            
-        except Exception as e:
-            logger.error(f"Error while sending the status: {e}")
-            return False
-    
-    async def send_alert_message(self, metric_type: str, current_value: float, details: str) -> bool:
-        """Send an alert message to the important-logs chat"""
-        if not self.important_logs:
-            logger.warning("IMPORTANT_MONITORING_LOGS is not set, skipping the alert")
-            return False
-        
-        try:
-            metric_names = {
-                'cpu': 'CPU usage',
-                'ram': 'RAM usage',
-                'disk': 'Disk usage (/)'
-            }
-            
-            metric_name = metric_names.get(metric_type, metric_type)
-            alert_message = self.get_alert_message(metric_name, current_value, details)
-            return await self.send_telegram_message(self.important_logs, alert_message)
-            
-        except Exception as e:
-            logger.error(f"Error while sending the alert: {e}")
-            return False
-    
-    async def send_recovery_message(self, metric_type: str, current_value: float, peak_value: float) -> bool:
-        """Send a recovery message to the important-logs chat"""
-        if not self.important_logs:
-            logger.warning("IMPORTANT_MONITORING_LOGS is not set, skipping the recovery message")
-            return False
-        
-        try:
-            metric_names = {
-                'cpu': 'CPU usage',
-                'ram': 'RAM usage',
-                'disk': 'Disk usage (/)'
-            }
-            
-            metric_name = metric_names.get(metric_type, metric_type)
-            recovery_message = self.get_recovery_message(metric_name, current_value, peak_value)
-            return await self.send_telegram_message(self.important_logs, recovery_message)
-            
-        except Exception as e:
-            logger.error(f"Error while sending the recovery message: {e}")
-            return False
-    
-    async def process_alerts_and_recoveries(self) -> None:
-        """Process alerts and recoveries"""
-        try:
-            system_info = self.metrics_collector.get_system_info()
-            if not system_info:
-                return
-            
-            # Check for alerts
-            alerts, recoveries = self.metrics_collector.check_alerts(system_info)
-            
-            # Send alerts
-            for metric_type, value, details in alerts:
-                await self.send_alert_message(metric_type, value, details)
-                logger.warning(f"ALERT sent: {metric_type} - {value}% - {details}")
-            
-            # Send recovery messages
-            for metric_type, value in recoveries:
-                # The peak value is not tracked, so the alert threshold stands in for it
-                peak_value = self.metrics_collector.threshold
-                await self.send_recovery_message(metric_type, value, peak_value)
-                logger.info(f"RECOVERY sent: {metric_type} - {value}%")
-                
-        except Exception as e:
-            logger.error(f"Error while processing alerts and recoveries: {e}")
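The five `_get_*_emoji` helpers in the removed `message_sender.py` differ only in their two thresholds, so they could be driven by a table. The following is a sketch under that assumption: `SEVERITY_THRESHOLDS`, `severity_emoji`, and the three emoji constants are hypothetical names, and the threshold values are copied from the methods in the diff.

```python
OK = "🟢"
WARNING = "⚠️"
CRITICAL = "🚨"

# (warning_from, critical_from) per metric, copied from the _get_*_emoji methods.
SEVERITY_THRESHOLDS = {
    "disk": (60, 90),
    "cpu": (50, 80),
    "memory": (60, 85),
    "load_per_core": (1.0, 2.0),
    "io_wait": (5, 20),
}


def severity_emoji(metric: str, value: float) -> str:
    """Map a metric value to the OK / WARNING / CRITICAL emoji."""
    warning_from, critical_from = SEVERITY_THRESHOLDS[metric]
    if value < warning_from:
        return OK
    if value < critical_from:
        return WARNING
    return CRITICAL


# Load average is compared per core, so divide by the core count before the lookup.
load_emoji = severity_emoji("load_per_core", 3.0 / 4)
```

Keeping the thresholds in one mapping makes the boundaries easy to compare side by side, at the cost of losing the per-metric method names.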
--- a/infra/monitoring/metrics_collector.py
+++ b/infra/monitoring/metrics_collector.py
@@ -1,849 +0,0 @@
-import os
-import psutil
-import time
-import platform
-from datetime import datetime
-from typing import Dict, Optional, Tuple
-import logging
-from pid_manager import create_pid_manager
-
-logger = logging.getLogger(__name__)
-
-
-class MetricsCollector:
-    def __init__(self):
-        # Detect the OS
-        self.os_type = self._detect_os()
-        logger.info(f"Detected OS: {self.os_type}")
-        
-        # Check whether we run in Docker with access to the host
-        self.is_docker_host_monitoring = self._check_docker_host_access()
-        if self.is_docker_host_monitoring:
-            logger.info("Host access via Docker volumes detected - monitoring at host level")
-        else:
-            logger.warning("Monitoring at container level (not recommended for production)")
-        
-        # Alert thresholds
-        self.threshold = float(os.getenv('THRESHOLD', '80.0'))
-        self.recovery_threshold = float(os.getenv('RECOVERY_THRESHOLD', '75.0'))
-        
-        # Alert delays (in seconds) that prevent false positives
-        self.alert_delays = {
-            'cpu': int(os.getenv('CPU_ALERT_DELAY', '30')),      # 30 s for CPU
-            'ram': int(os.getenv('RAM_ALERT_DELAY', '45')),      # 45 s for RAM
-            'disk': int(os.getenv('DISK_ALERT_DELAY', '60'))     # 60 s for disk
-        }
-        
-        # Alert state, used to avoid repeated notifications
-        self.alert_states = {
-            'cpu': False,
-            'ram': False,
-            'disk': False
-        }
-        
-        # Time each metric first exceeded the threshold
-        self.alert_start_times = {
-            'cpu': None,
-            'ram': None,
-            'disk': None
-        }
-        
-        # PID files used to track processes
-        # Locate the project root, where the PID files are looked up
-        current_file = os.path.abspath(__file__)
-        self.project_root = os.path.dirname(os.path.dirname(current_file))
-        
-        self.pid_files = {
-            'helper_bot': os.path.join(self.project_root, 'helper_bot.pid')
-        }
-        
-        # State for computing disk throughput
-        self.last_disk_io = None
-        self.last_disk_io_time = None
-        
-        # State for computing the disk utilization percent (kept separately)
-        self.last_disk_io_for_percent = None
-        self.last_disk_io_time_for_percent = None
-        
-        # Take the baseline disk I/O reading
-        self._initialize_disk_io()
-        
-
-        
-        # Monitor start time, used to compute uptime
-        self.monitor_start_time = time.time()
-        
-        logger.info(f"Alert delays initialized: CPU={self.alert_delays['cpu']}s, RAM={self.alert_delays['ram']}s, Disk={self.alert_delays['disk']}s")
-    
-    def add_bot_to_monitoring(self, bot_name: str):
-        """
-        Add a new bot to monitoring
-        
-        Args:
-            bot_name: Bot name (for example, 'helper_bot', 'admin_bot', etc.)
-        """
-        pid_file_path = os.path.join(self.project_root, f"{bot_name}.pid")
-        self.pid_files[bot_name] = pid_file_path
-        logger.info(f"Added bot {bot_name} to monitoring: {pid_file_path}")
-        
-    def _detect_os(self) -> str:
-        """Detect the operating system type"""
-        system = platform.system().lower()
-        if system == "darwin":
-            return "macos"
-        elif system == "linux":
-            return "ubuntu"
-        else:
-            return "unknown"
-    
-    def _check_docker_host_access(self) -> bool:
-        """Check whether the host is accessible via Docker volumes"""
-        try:
-            # Check whether host files are exposed under /host/proc, which
-            # means the container was started with --privileged and volume mounts
-            if os.path.exists('/host/proc/stat') and os.path.exists('/host/proc/meminfo'):
-                return True
-            
-            # Alternative check: are we running inside Docker
-            # with access to the host's system files?
-            if os.path.exists('/.dockerenv'):
-                # Check whether the system files are readable
-                try:
-                    with open('/proc/stat', 'r') as f:
-                        f.read(100)  # Read a little to verify access
-                    return True
-                except (OSError, PermissionError):
-                    pass
-            
-            return False
-        except Exception as e:
-            logger.debug(f"Error while checking host access: {e}")
-            return False
-    
-    def _initialize_disk_io(self):
-        """Take the baseline readings for the disk throughput calculation"""
-        try:
-            disk_io = self._get_disk_io_counters()
-            if disk_io:
-                self.last_disk_io = disk_io
-                self.last_disk_io_time = time.time()
-                logger.debug("Baseline disk throughput readings initialized")
-        except Exception as e:
-            logger.error(f"Error while initializing disk I/O: {e}")
-    
-    def _get_disk_path(self) -> str:
-        """Return the disk path for the current OS"""
-        if self.os_type == "macos":
-            return "/"
-        elif self.os_type == "ubuntu":
-            return "/"
-        else:
-            return "/"
-    
-    def _get_disk_usage(self) -> Optional[object]:
-        """Return disk usage information for the current OS"""
-        try:
-            if self.os_type == "macos":
-                # On macOS, use diskutil to get the real disk usage
-                return self._get_macos_disk_usage()
-            else:
-                disk_path = self._get_disk_path()
-                return psutil.disk_usage(disk_path)
-        except Exception as e:
-            logger.error(f"Error while getting disk information: {e}")
-            return None
-    
-    def _get_macos_disk_usage(self) -> Optional[object]:
-        """Return disk usage on macOS via diskutil"""
-        try:
-            import subprocess
-            import re
-            
-            # Query disk information via diskutil
-            result = subprocess.run(['diskutil', 'info', '/'], capture_output=True, text=True)
-            if result.returncode != 0:
-                # Fall back to psutil
-                return psutil.disk_usage('/')
-            
-            output = result.stdout
-            
-            # Extract the sizes from the diskutil output
-            total_match = re.search(r'Container Total Space:\s+(\d+\.\d+)\s+GB', output)
-            free_match = re.search(r'Container Free Space:\s+(\d+\.\d+)\s+GB', output)
-            
-            if total_match and free_match:
-                total_gb = float(total_match.group(1))
-                free_gb = float(free_match.group(1))
-                used_gb = total_gb - free_gb
-                
-                # Build an object that mimics the result of psutil.disk_usage
-                class DiskUsage:
-                    def __init__(self, total, used, free):
-                        self.total = total * (1024**3)  # Convert to bytes
-                        self.used = used * (1024**3)
-                        self.free = free * (1024**3)
-                
-                return DiskUsage(total_gb, used_gb, free_gb)
-            else:
-                # Fall back to psutil
-                return psutil.disk_usage('/')
-                
-        except Exception as e:
-            logger.error(f"Error while getting macOS disk information: {e}")
-            # Fall back to psutil
-            return psutil.disk_usage('/')
-    
-    def _get_disk_io_counters(self):
-        """Return disk I/O statistics for the current OS"""
-        try:
-            if self.os_type == "macos":
-                # macOS may have several disks; use the system-wide totals
-                return psutil.disk_io_counters(perdisk=False)
-            elif self.os_type == "ubuntu":
-                # Ubuntu usually has a single disk
-                return psutil.disk_io_counters(perdisk=False)
-            else:
-                return psutil.disk_io_counters()
-        except Exception as e:
-            logger.error(f"Error while getting disk statistics: {e}")
-            return None
-    
-    def _get_system_uptime(self) -> float:
-        """Получение uptime системы с учетом ОС"""
-        try:
-            if self.os_type == "macos":
-                # На macOS используем boot_time
-                boot_time = psutil.boot_time()
-                return time.time() - boot_time
-            elif self.os_type == "ubuntu":
-                # On Ubuntu, also use boot_time
-                boot_time = psutil.boot_time()
-                return time.time() - boot_time
-            else:
-                boot_time = psutil.boot_time()
-                return time.time() - boot_time
-        except Exception as e:
-            logger.error(f"Ошибка при получении uptime системы: {e}")
-            return 0.0
-    
-    def get_monitor_uptime(self) -> str:
-        """Return the uptime of the monitor itself as a formatted string."""
-        uptime_seconds = time.time() - self.monitor_start_time
-        return self._format_uptime(uptime_seconds)
-    
-    def get_system_info(self) -> Dict:
-        """Collect system information."""
-        try:
-            # Decide which psutil to use
-            current_psutil = psutil
-            if self.is_docker_host_monitoring:
-                # For the host, use the dedicated helper methods
-                host_cpu = self._get_host_cpu_info()
-                host_memory = self._get_host_memory_info()
-                host_disk = self._get_host_disk_info()
-                
-                if host_cpu and host_memory and host_disk:
-                    # Use the host data
-                    cpu_count = host_cpu['cpu_count']
-                    load_avg = host_cpu['load_avg']
-                    
-                    # CPU percent is approximated from the load average.
-                    # A load average above 1.0 per core is treated as high load.
-                    load_per_core = load_avg[0] / cpu_count if cpu_count > 0 else 0
-                    cpu_percent = min(100, load_per_core * 100)  # Rough approximation
-                    
-                    # Host memory
-                    ram_total = host_memory['ram_total']
-                    ram_used = host_memory['ram_used']
-                    ram_percent = host_memory['ram_percent']
-                    swap_total = host_memory['swap_total']
-                    swap_used = host_memory['swap_used']
-                    swap_percent = host_memory['swap_percent']
-                    
-                    # Host disk
-                    disk_total = host_disk['total']
-                    disk_used = host_disk['used']
-                    disk_free = host_disk['free']
-                    disk_percent = host_disk['percent']
-                    
-                    # I/O wait is not computed from the host /proc data in this path, so report 0
-                    io_wait_percent = 0.0
-                    
-                    logger.debug("Используются метрики хоста через Docker volumes")
-                else:
-                    # Fall back to standard psutil
-                    logger.warning("Не удалось получить метрики хоста, используем контейнер")
-                    current_psutil = psutil
-                    host_cpu = host_memory = host_disk = None
-            else:
-                # Standard psutil for the container
-                host_cpu = host_memory = host_disk = None
-            
-            # If host data is not used, collect the standard metrics
-            if not host_cpu:
-                cpu_percent = current_psutil.cpu_percent(interval=1)
-                load_avg = current_psutil.getloadavg()
-                cpu_count = current_psutil.cpu_count()
-                
-                # CPU times, used to obtain I/O wait
-                cpu_times = current_psutil.cpu_times_percent(interval=1)
-                io_wait_percent = getattr(cpu_times, 'iowait', 0.0)
-                
-                # Memory
-                memory = current_psutil.virtual_memory()
-                swap = current_psutil.swap_memory()
-                
-                # Use the same calculation on every OS: used / total gives the percentage of memory in use
-                ram_percent = (memory.used / memory.total) * 100
-                ram_total = memory.total
-                ram_used = memory.used
-                swap_total = swap.total
-                swap_used = swap.used
-                swap_percent = swap.percent
-                
-                # Disk
-                disk = self._get_disk_usage()
-                disk_total = disk.total if disk else 0
-                disk_used = disk.used if disk else 0
-                disk_free = disk.free if disk else 0
-                disk_percent = (disk_used / disk_total * 100) if disk_total > 0 else 0
-            
-            # Disk I/O (may be unavailable for the host)
-            disk_io = self._get_disk_io_counters()
-            if disk_io:
-                disk_io_percent = self._calculate_disk_io_percent()
-                disk_read_speed, disk_write_speed = self._calculate_disk_speed(disk_io)
-            else:
-                disk_io_percent = 0
-                disk_read_speed = "0 B/s"
-                disk_write_speed = "0 B/s"
-            
-            # System
-            system_uptime = self._get_system_uptime()
-            
-            # Get the hostname
-            if self.is_docker_host_monitoring:
-                try:
-                    with open('/host/proc/sys/kernel/hostname', 'r') as f:
-                        hostname = f.read().strip()
-                except OSError:
-                    hostname = "host"
-            else:
-                hostname = os.uname().nodename
-            
-            return {
-                'cpu_percent': round(cpu_percent, 1),
-                'load_avg_1m': round(load_avg[0], 2),
-                'load_avg_5m': round(load_avg[1], 2),
-                'load_avg_15m': round(load_avg[2], 2),
-                'cpu_count': cpu_count,
-                'io_wait_percent': round(io_wait_percent, 1),
-                'ram_used': round(ram_used / (1024**3), 2),
-                'ram_total': round(ram_total / (1024**3), 2),
-                'ram_percent': round(ram_percent, 1),
-                'swap_used': round(swap_used / (1024**3), 2),
-                'swap_total': round(swap_total / (1024**3), 2),
-                'swap_percent': round(swap_percent, 1),
-                'disk_used': round(disk_used / (1024**3), 2),
-                'disk_total': round(disk_total / (1024**3), 2),
-                'disk_percent': round(disk_percent, 1),
-                'disk_free': round(disk_free / (1024**3), 2),
-                'disk_read_speed': disk_read_speed,
-                'disk_write_speed': disk_write_speed,
-                'disk_io_percent': disk_io_percent,
-                'system_uptime': self._format_uptime(system_uptime),
-                'monitor_uptime': self.get_monitor_uptime(),
-                'server_hostname': hostname,
-                'current_time': datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
-                'monitoring_level': 'host' if self.is_docker_host_monitoring else 'container'
-            }
-        except Exception as e:
-            logger.error(f"Ошибка при получении информации о системе: {e}")
-            return {}
-    
-    def _format_bytes(self, bytes_value: int) -> str:
-        """Format a byte count as a human-readable string."""
-        if bytes_value == 0:
-            return "0 B"
-        
-        size_names = ["B", "KB", "MB", "GB", "TB"]
-        i = 0
-        while bytes_value >= 1024 and i < len(size_names) - 1:
-            bytes_value /= 1024.0
-            i += 1
-        
-        return f"{bytes_value:.1f} {size_names[i]}"
-    
-    def _format_uptime(self, seconds: float) -> str:
-        """Format an uptime given in seconds."""
-        days = int(seconds // 86400)
-        hours = int((seconds % 86400) // 3600)
-        minutes = int((seconds % 3600) // 60)
-        
-        if days > 0:
-            return f"{days}д {hours}ч {minutes}м"
-        elif hours > 0:
-            return f"{hours}ч {minutes}м"
-        else:
-            return f"{minutes}м"
-    
-    def check_process_status(self, process_name: str) -> Tuple[str, str]:
-        """Check a process and return its status together with its uptime."""
-        try:
-            # helper_bot is checked through its HTTP endpoint
-            if process_name == 'helper_bot':
-                return self._check_helper_bot_status()
-            
-            # Other processes use the standard local check
-            return self._check_local_process_status(process_name)
-            
-        except Exception as e:
-            logger.error(f"Ошибка при проверке процесса {process_name}: {e}")
-            return "❌", "Выключен"
-    
-    def _check_local_process_status(self, process_name: str) -> Tuple[str, str]:
-        """Check a local process by PID file or by name."""
-        try:
-            # Check by PID file
-            pid_file = self.pid_files.get(process_name)
-            if pid_file and os.path.exists(pid_file):
-                try:
-                    with open(pid_file, 'r') as f:
-                        content = f.read().strip()
-                        if content and content != '# Этот файл будет автоматически обновляться при запуске бота':
-                            pid = int(content)
-                            if psutil.pid_exists(pid):
-                                proc = psutil.Process(pid)
-                                proc_uptime = time.time() - proc.create_time()
-                                uptime_str = self._format_uptime(proc_uptime)
-                                return "✅", f"Uptime {uptime_str}"
-                except (ValueError, FileNotFoundError):
-                    pass
-            
-            # Check by process name
-            for proc in psutil.process_iter(['pid', 'name', 'cmdline']):
-                try:
-                    proc_name = (proc.info['name'] or '').lower()
-                    cmdline = ' '.join(proc.info['cmdline']).lower() if proc.info['cmdline'] else ''
-                    
-                    if (process_name in proc_name or 
-                        process_name in cmdline or
-                        'python' in proc_name and process_name in cmdline):
-                        
-                        proc_uptime = time.time() - proc.create_time()
-                        uptime_str = self._format_uptime(proc_uptime)
-                        return "✅", f"Uptime {uptime_str}"
-                        
-                except (psutil.NoSuchProcess, psutil.AccessDenied):
-                    continue
-            
-            return "❌", "Выключен"
-            
-        except Exception as e:
-            logger.error(f"Ошибка при проверке локального процесса {process_name}: {e}")
-            return "❌", "Выключен"
-    
-    def _calculate_disk_speed(self, current_disk_io) -> Tuple[str, str]:
-        """Compute disk read/write speed."""
-        current_time = time.time()
-        
-        if self.last_disk_io is None or self.last_disk_io_time is None:
-            self.last_disk_io = current_disk_io
-            self.last_disk_io_time = current_time
-            return "0 B/s", "0 B/s"
-        
-        time_diff = current_time - self.last_disk_io_time
-        if time_diff < 1:  # Minimum interval is 1 second
-            return "0 B/s", "0 B/s"
-        
-        read_diff = current_disk_io.read_bytes - self.last_disk_io.read_bytes
-        write_diff = current_disk_io.write_bytes - self.last_disk_io.write_bytes
-        
-        read_speed = read_diff / time_diff
-        write_speed = write_diff / time_diff
-        
-        # Update the previous values
-        self.last_disk_io = current_disk_io
-        self.last_disk_io_time = current_time
-        
-        return self._format_bytes(read_speed) + "/s", self._format_bytes(write_speed) + "/s"
-    
-    def _calculate_disk_io_percent(self) -> int:
-        """Estimate disk load percentage from the measured I/O rate."""
-        try:
-            # Get the current disk statistics
-            current_disk_io = self._get_disk_io_counters()
-            if current_disk_io is None:
-                return 0
-            
-            current_time = time.time()
-            
-            # On the first measurement, only initialize the baseline
-            if self.last_disk_io_for_percent is None or self.last_disk_io_time_for_percent is None:
-                logger.debug("Первое измерение диска для процента, инициализируем базовые значения")
-                self.last_disk_io_for_percent = current_disk_io
-                self.last_disk_io_time_for_percent = current_time
-                return 0
-            
-            # Compute the time between measurements
-            time_diff = current_time - self.last_disk_io_time_for_percent
-            if time_diff < 0.1:  # Minimum interval is 0.1 s for more accurate measurements
-                logger.debug(f"Интервал между измерениями слишком мал: {time_diff:.3f}s, возвращаем 0%")
-                return 0
-            
-            # Compute operations per second
-            read_ops_diff = current_disk_io.read_count - self.last_disk_io_for_percent.read_count
-            write_ops_diff = current_disk_io.write_count - self.last_disk_io_for_percent.write_count
-            
-            read_ops_per_sec = read_ops_diff / time_diff
-            write_ops_per_sec = write_ops_diff / time_diff
-            total_ops_per_sec = read_ops_per_sec + write_ops_per_sec
-            
-            # Compute the data transfer rate in bytes per second
-            read_bytes_diff = current_disk_io.read_bytes - self.last_disk_io_for_percent.read_bytes
-            write_bytes_diff = current_disk_io.write_bytes - self.last_disk_io_for_percent.write_bytes
-            
-            read_bytes_per_sec = read_bytes_diff / time_diff
-            write_bytes_per_sec = write_bytes_diff / time_diff
-            total_bytes_per_sec = read_bytes_per_sec + write_bytes_per_sec
-            
-            # Update the previous values used for the percentage
-            self.last_disk_io_for_percent = current_disk_io
-            self.last_disk_io_time_for_percent = current_time
-            
-            # Choose the assumed maximum disk performance based on the OS
-            if self.os_type == "macos":
-                # macOS machines usually have a fast SSD
-                max_ops_per_sec = 50000  # Operations per second
-                max_bytes_per_sec = 3 * (1024**3)  # 3 GB/s
-            elif self.os_type == "ubuntu":
-                # Ubuntu may run on different disk types
-                max_ops_per_sec = 30000  # Operations per second
-                max_bytes_per_sec = 2 * (1024**3)  # 2 GB/s
-            else:
-                max_ops_per_sec = 40000
-                max_bytes_per_sec = 2.5 * (1024**3)
-            
-            # Compute the load percentage from operations and bytes
-            # Guard against division by zero
-            if max_ops_per_sec > 0:
-                ops_percent = min(100, (total_ops_per_sec / max_ops_per_sec) * 100)
-            else:
-                ops_percent = 0
-                
-            if max_bytes_per_sec > 0:
-                bytes_percent = min(100, (total_bytes_per_sec / max_bytes_per_sec) * 100)
-            else:
-                bytes_percent = 0
-            
-            # Weighted average (operation count matters more in most cases)
-            final_percent = (ops_percent * 0.7) + (bytes_percent * 0.3)
-            
-            # Log for debugging (only for high values)
-            if final_percent > 10:
-                logger.debug(f"Диск I/O: {total_ops_per_sec:.1f} ops/s, {total_bytes_per_sec/(1024**2):.1f} MB/s, "
-                           f"Загрузка: {final_percent:.1f}% (ops: {ops_percent:.1f}%, bytes: {bytes_percent:.1f}%)")
-            
-            # Round to an integer
-            return round(final_percent)
-            
-        except Exception as e:
-            logger.error(f"Ошибка при расчете процента загрузки диска: {e}")
-            return 0
-    
-    def get_metrics_data(self) -> Dict:
-        """Return data for the Prometheus metrics."""
-        system_info = self.get_system_info()
-        if not system_info:
-            return {}
-        
-        return {
-            'cpu_usage_percent': system_info.get('cpu_percent', 0),
-            'ram_usage_percent': system_info.get('ram_percent', 0),
-            'disk_usage_percent': system_info.get('disk_percent', 0),
-            'load_average_1m': system_info.get('load_avg_1m', 0),
-            'load_average_5m': system_info.get('load_avg_5m', 0),
-            'load_average_15m': system_info.get('load_avg_15m', 0),
-            'swap_usage_percent': system_info.get('swap_percent', 0),
-            'disk_io_percent': system_info.get('disk_io_percent', 0),
-            'system_uptime_seconds': self._get_system_uptime(),
-            'monitor_uptime_seconds': time.time() - self.monitor_start_time
-        }
-    
-    def check_alerts(self, system_info: Dict) -> Tuple[list, list]:
-        """Check whether alerts need to be sent, honoring the configured delays."""
-        current_time = time.time()
-        alerts = []
-        recoveries = []
-        
-        # CPU check with delay
-        if system_info['cpu_percent'] > self.threshold:
-            if not self.alert_states['cpu']:
-                # Threshold exceeded for the first time
-                if self.alert_start_times['cpu'] is None:
-                    self.alert_start_times['cpu'] = current_time
-                    logger.debug(f"CPU превысил порог {self.threshold}%: {system_info['cpu_percent']:.1f}% - начинаем отсчет задержки {self.alert_delays['cpu']}s")
-                
-                # Check whether the delay has elapsed
-                if self.alert_delays['cpu'] == 0 or current_time - self.alert_start_times['cpu'] >= self.alert_delays['cpu']:
-                    self.alert_states['cpu'] = True
-                    alerts.append(('cpu', system_info['cpu_percent'], f"Нагрузка за 1 мин: {system_info['load_avg_1m']}"))
-                    logger.warning(f"CPU ALERT: {system_info['cpu_percent']:.1f}% > {self.threshold}% (задержка {self.alert_delays['cpu']}s)")
-        else:
-            # CPU is below the threshold: reset the state
-            if self.alert_states['cpu']:
-                self.alert_states['cpu'] = False
-                recoveries.append(('cpu', system_info['cpu_percent']))
-                logger.info(f"CPU RECOVERY: {system_info['cpu_percent']:.1f}% < {self.recovery_threshold}%")
-            
-            # Reset the time the threshold was first exceeded
-            self.alert_start_times['cpu'] = None
-        
-        # RAM check with delay
-        if system_info['ram_percent'] > self.threshold:
-            if not self.alert_states['ram']:
-                # Threshold exceeded for the first time
-                if self.alert_start_times['ram'] is None:
-                    self.alert_start_times['ram'] = current_time
-                    logger.debug(f"RAM превысил порог {self.threshold}%: {system_info['ram_percent']:.1f}% - начинаем отсчет задержки {self.alert_delays['ram']}s")
-                
-                # Check whether the delay has elapsed
-                if self.alert_delays['ram'] == 0 or current_time - self.alert_start_times['ram'] >= self.alert_delays['ram']:
-                    self.alert_states['ram'] = True
-                    alerts.append(('ram', system_info['ram_percent'], f"Используется: {system_info['ram_used']} GB из {system_info['ram_total']} GB"))
-                    logger.warning(f"RAM ALERT: {system_info['ram_percent']:.1f}% > {self.threshold}% (задержка {self.alert_delays['ram']}s)")
-        else:
-            # RAM is below the threshold: reset the state
-            if self.alert_states['ram']:
-                self.alert_states['ram'] = False
-                recoveries.append(('ram', system_info['ram_percent']))
-                logger.info(f"RAM RECOVERY: {system_info['ram_percent']:.1f}% < {self.recovery_threshold}%")
-            
-            # Reset the time the threshold was first exceeded
-            self.alert_start_times['ram'] = None
-        
-        # Disk check with delay
-        if system_info['disk_percent'] > self.threshold:
-            if not self.alert_states['disk']:
-                # Threshold exceeded for the first time
-                if self.alert_start_times['disk'] is None:
-                    self.alert_start_times['disk'] = current_time
-                    logger.debug(f"Disk превысил порог {self.threshold}%: {system_info['disk_percent']:.1f}% - начинаем отсчет задержки {self.alert_delays['disk']}s")
-                
-                # Check whether the delay has elapsed
-                if self.alert_delays['disk'] == 0 or current_time - self.alert_start_times['disk'] >= self.alert_delays['disk']:
-                    self.alert_states['disk'] = True
-                    alerts.append(('disk', system_info['disk_percent'], f"Свободно: {system_info['disk_free']} GB на /"))
-                    logger.warning(f"DISK ALERT: {system_info['disk_percent']:.1f}% > {self.threshold}% (задержка {self.alert_delays['disk']}s)")
-        else:
-            # Disk is below the threshold: reset the state
-            if self.alert_states['disk']:
-                self.alert_states['disk'] = False
-                recoveries.append(('disk', system_info['disk_percent']))
-                logger.info(f"DISK RECOVERY: {system_info['disk_percent']:.1f}% < {self.recovery_threshold}%")
-            
-            # Reset the time the threshold was first exceeded
-            self.alert_start_times['disk'] = None
-        
-        return alerts, recoveries
-
-    def _get_host_psutil(self):
-        """Return psutil with access to the host."""
-        if self.is_docker_host_monitoring:
-            # Switch to the host directories
-            os.environ['PROC_ROOT'] = '/host/proc'
-            os.environ['SYS_ROOT'] = '/host/sys'
-            # Reload psutil so it picks up the new paths
-            import importlib
-            import psutil
-            importlib.reload(psutil)
-            return psutil
-        return psutil
-    
-    def _get_host_cpu_info(self):
-        """Return host CPU information."""
-        try:
-            if self.is_docker_host_monitoring:
-                # Read CPU information directly from /proc
-                with open('/host/proc/cpuinfo', 'r') as f:
-                    cpu_info = f.read()
-                
-                # Count the cores
-                cpu_count = cpu_info.count('processor')
-                
-                # Read the load average
-                with open('/host/proc/loadavg', 'r') as f:
-                    load_avg = f.read().strip().split()[:3]
-                    load_avg = [float(x) for x in load_avg]
-                
-                # Read the CPU statistics
-                with open('/host/proc/stat', 'r') as f:
-                    cpu_stat = f.readline().strip().split()[1:]
-                    cpu_stat = [int(x) for x in cpu_stat]
-                
-                # CPU percent (simplified method)
-                # A real calculation would have to compare against previous values
-                cpu_percent = 0.0  # Computed in get_system_info
-                
-                return {
-                    'cpu_count': cpu_count,
-                    'load_avg': load_avg,
-                    'cpu_stat': cpu_stat
-                }
-            else:
-                # Use standard psutil
-                return {
-                    'cpu_count': psutil.cpu_count(),
-                    'load_avg': psutil.getloadavg(),
-                    'cpu_stat': None
-                }
-        except Exception as e:
-            logger.error(f"Ошибка при получении информации о CPU хоста: {e}")
-            return None
-    
-    def _get_host_memory_info(self):
-        """Return host memory information."""
-        try:
-            if self.is_docker_host_monitoring:
-                # Read memory information from /proc/meminfo
-                with open('/host/proc/meminfo', 'r') as f:
-                    mem_info = f.read()
-                
-                # Parse the values
-                mem_lines = mem_info.split('\n')
-                mem_data = {}
-                for line in mem_lines:
-                    if ':' in line:
-                        key, value = line.split(':', 1)
-                        mem_data[key.strip()] = int(value.strip().split()[0]) * 1024  # Convert kB to bytes
-                
-                # Compute the percentages
-                total = mem_data.get('MemTotal', 0)
-                available = mem_data.get('MemAvailable', 0)
-                used = total - available
-                ram_percent = (used / total * 100) if total > 0 else 0
-                
-                # Swap
-                swap_total = mem_data.get('SwapTotal', 0)
-                swap_free = mem_data.get('SwapFree', 0)
-                swap_used = swap_total - swap_free
-                swap_percent = (swap_used / swap_total * 100) if swap_total > 0 else 0
-                
-                return {
-                    'ram_total': total,
-                    'ram_used': used,
-                    'ram_percent': ram_percent,
-                    'swap_total': swap_total,
-                    'swap_used': swap_used,
-                    'swap_percent': swap_percent
-                }
-            else:
-                # Use standard psutil
-                memory = psutil.virtual_memory()
-                swap = psutil.swap_memory()
-                return {
-                    'ram_total': memory.total,
-                    'ram_used': memory.used,
-                    'ram_percent': memory.percent,
-                    'swap_total': swap.total,
-                    'swap_used': swap.used,
-                    'swap_percent': swap.percent
-                }
-        except Exception as e:
-            logger.error(f"Ошибка при получении информации о памяти хоста: {e}")
-            return None
-    
-    def _get_host_disk_info(self):
-        """Return host disk information."""
-        try:
-            if self.is_docker_host_monitoring:
-                # Use df to get disk information
-                import subprocess
-                result = subprocess.run(['df', '/'], capture_output=True, text=True)
-                if result.returncode == 0:
-                    lines = result.stdout.strip().split('\n')
-                    if len(lines) >= 2:
-                        parts = lines[1].split()
-                        if len(parts) >= 4:
-                            total_kb = int(parts[1])
-                            used_kb = int(parts[2])
-                            available_kb = int(parts[3])
-                            
-                            total = total_kb * 1024
-                            used = used_kb * 1024
-                            available = available_kb * 1024
-                            percent = (used / total * 100) if total > 0 else 0
-                            
-                            return {
-                                'total': total,
-                                'used': used,
-                                'free': available,
-                                'percent': percent
-                            }
-                
-                # Fall back to standard psutil
-                return None
-            else:
-                # Use standard psutil
-                return None
-        except Exception as e:
-            logger.error(f"Ошибка при получении информации о диске хоста: {e}")
-            return None
-    
-    def _check_helper_bot_status(self) -> Tuple[str, str]:
-        """Check helper_bot status through its HTTP endpoint."""
-        try:
-            import requests
-            
-            logger.info("Проверяем статус helper_bot через HTTP endpoint /status")
-            
-            # Call the /status endpoint of helper_bot
-            url = 'http://bots_telegram_bot:8080/status'
-            logger.info(f"Отправляем HTTP запрос к: {url}")
-            
-            response = requests.get(url, timeout=5)
-            logger.info(f"Получен HTTP ответ: статус {response.status_code}")
-            
-            if response.status_code == 200:
-                try:
-                    data = response.json()
-                    logger.info(f"Получены данные: {data}")
-                    
-                    status = data.get('status', 'unknown')
-                    uptime = data.get('uptime', 'unknown')
-                    
-                    if status == 'running':
-                        result = "✅", f"Uptime {uptime}"
-                        logger.info(f"Helper_bot работает: {result}")
-                        return result
-                    elif status == 'starting':
-                        result = "🔄", f"Запуск: {uptime}"
-                        logger.info(f"Helper_bot запускается: {result}")
-                        return result
-                    else:
-                        result = "⚠️", f"Статус: {status}"
-                        logger.warning(f"Helper_bot необычный статус: {result}")
-                        return result
-                        
-                except (ValueError, KeyError) as e:
-                    # The JSON could not be parsed, but the status is 200
-                    logger.warning(f"Не удалось распарсить JSON ответ: {e}, но статус 200")
-                    result = "✅", "HTTP: доступен"
-                    logger.info(f"Helper_bot доступен: {result}")
-                    return result
-            else:
-                logger.warning(f"HTTP статус не 200: {response.status_code}")
-                return "⚠️", f"HTTP: {response.status_code}"
-                
-        except ImportError:  # Must come first: if the import failed, the name requests is unbound below
-            logger.debug("requests не доступен для HTTP проверки")
-            return "❌", "HTTP: requests недоступен"
-        except requests.exceptions.Timeout:
-            logger.error("HTTP запрос к helper_bot завершился таймаутом")
-            return "⚠️", "HTTP: таймаут"
-        except requests.exceptions.ConnectionError as e:
-            logger.error(f"HTTP ошибка соединения с helper_bot: {e}")
-            return "❌", "HTTP: нет соединения"
-        except Exception as e:
-            logger.error(f"Неожиданная ошибка при HTTP проверке helper_bot: {e}")
-            return "❌", "HTTP: ошибка"
--- a/infra/monitoring/pid_manager.py
+++ b/infra/monitoring/pid_manager.py
@@ -1,161 +0,0 @@
-"""
-Module for managing process PID files.
-Shared by all bots in the project.
-"""
-import os
-import sys
-import signal
-import atexit
-import logging
-from typing import Optional
-
-logger = logging.getLogger(__name__)
-
-
-class PIDManager:
-    """Manages a PID file."""
-    
-    def __init__(self, pid_file_path: str, process_name: str = "process"):
-        """
-        Initialize the PID manager
-        
-        Args:
-            pid_file_path: Path to the PID file
-            process_name: Process name used in log messages
-        """
-        self.pid_file_path = pid_file_path
-        self.process_name = process_name
-        self.pid = os.getpid()
-        
-    def create_pid_file(self) -> bool:
-        """
-        Create a PID file containing the PID of the current process
-        
-        Returns:
-            bool: True if the file was created successfully, False otherwise
-        """
-        try:
-            # Create the directory if it does not exist
-            pid_dir = os.path.dirname(self.pid_file_path)
-            if pid_dir and not os.path.exists(pid_dir):
-                os.makedirs(pid_dir, exist_ok=True)
-            
-            # Write the PID to the file
-            with open(self.pid_file_path, 'w') as f:
-                f.write(str(self.pid))
-            
-            logger.info(f"PID файл создан для {self.process_name}: {self.pid_file_path} (PID: {self.pid})")
-            
-            # Регистрируем функцию очистки при завершении
-            atexit.register(self.cleanup_pid_file)
-            
-            # Регистрируем обработчики сигналов для корректной очистки
-            signal.signal(signal.SIGTERM, self._signal_handler)
-            signal.signal(signal.SIGINT, self._signal_handler)
-            
-            return True
-            
-        except Exception as e:
-            logger.error(f"Ошибка при создании PID файла для {self.process_name}: {e}")
-            return False
-    
-    def cleanup_pid_file(self):
-        """Remove the PID file when the process exits"""
-        try:
-            if os.path.exists(self.pid_file_path):
-                os.remove(self.pid_file_path)
-                logger.info(f"PID file removed for {self.process_name}: {self.pid_file_path}")
-        except Exception as e:
-            logger.error(f"Failed to remove PID file for {self.process_name}: {e}")
-    
-    def _signal_handler(self, signum, frame):
-        """Signal handler for a clean shutdown"""
-        logger.info(f"Received signal {signum} for {self.process_name}, cleaning up PID file...")
-        self.cleanup_pid_file()
-        sys.exit(0)
-    
-    def is_running(self) -> bool:
-        """
-        Check whether the process whose PID is stored in the file is running
-        
-        Returns:
-            bool: True if the process is running, False otherwise
-        """
-        try:
-            if not os.path.exists(self.pid_file_path):
-                return False
-            
-            with open(self.pid_file_path, 'r') as f:
-                content = f.read().strip()
-                if not content:
-                    return False
-                
-                try:
-                    pid = int(content)
-                    # Check whether a process with this PID exists
-                    os.kill(pid, 0)  # Signal 0 only checks existence; nothing is delivered
-                    return True
-                except (ValueError, OSError):
-                    # The PID is invalid or the process does not exist
-                    return False
-                    
-        except Exception as e:
-            logger.error(f"Error while checking PID file for {self.process_name}: {e}")
-            return False
-    
-    def get_pid(self) -> Optional[int]:
-        """
-        Read the PID from the file
-        
-        Returns:
-            Optional[int]: The process PID, or None if the file is missing or invalid
-        """
-        try:
-            if not os.path.exists(self.pid_file_path):
-                return None
-            
-            with open(self.pid_file_path, 'r') as f:
-                content = f.read().strip()
-                if not content:
-                    return None
-                
-                return int(content)
-                
-        except (ValueError, FileNotFoundError) as e:
-            logger.error(f"Error while reading PID file for {self.process_name}: {e}")
-            return None
-
-
-def create_pid_manager(process_name: str, project_root: str = None) -> PIDManager:
-    """
-    Create a PID manager for the given process
-    
-    Args:
-        process_name: Process name (for example, 'helper_bot', 'admin_bot', etc.)
-        project_root: Project root directory. If None, it is detected automatically
-        
-    Returns:
-        PIDManager: A PID manager instance
-    """
-    if project_root is None:
-        # Detect the project root automatically
-        current_file = os.path.abspath(__file__)
-        # Go up two levels from infra/monitoring/pid_manager.py (resolves to infra/)
-        project_root = os.path.dirname(os.path.dirname(current_file))
-    
-    pid_file_path = os.path.join(project_root, f"{process_name}.pid")
-    
-    return PIDManager(pid_file_path, process_name)
-
-
-def get_bot_pid_manager(bot_name: str) -> PIDManager:
-    """
-    Convenience function that creates a PID manager for a bot
-    
-    Args:
-        bot_name: Bot name (for example, 'helper_bot', 'admin_bot', etc.)
-        
-    Returns:
-        PIDManager: A PID manager instance
-    """
-    return create_pid_manager(bot_name)
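
Note on the removed `is_running` check: it relies on `os.kill(pid, 0)` and treats every `OSError` as "not running". On POSIX, a `PermissionError` from that call means the process exists but belongs to another user, so a stricter check would separate the two cases. The following is a minimal sketch under that assumption (POSIX host only); `pid_is_alive` is a hypothetical helper and is not part of this commit.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only).

    Hypothetical helper for illustration; not part of the removed module.
    """
    if pid <= 0:
        # In os.kill, PID 0 and negative values address process groups,
        # so they are rejected here instead of being probed.
        return False
    try:
        os.kill(pid, 0)  # signal 0 performs the checks without delivering a signal
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True  # EPERM: the process exists but is owned by another user
    return True
```

A caller would typically read the PID file first and pass the parsed integer, for example `pid_is_alive(int(content))`.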
--- a/infra/monitoring/prometheus_server.py
+++ b/infra/monitoring/prometheus_server.py
@@ -1,143 +0,0 @@
-import asyncio
-import logging
-from aiohttp import web
-try:
-    from .metrics_collector import MetricsCollector
-except ImportError:
-    from metrics_collector import MetricsCollector
-
-logger = logging.getLogger(__name__)
-
-
-class PrometheusServer:
-    def __init__(self, host='0.0.0.0', port=9091):
-        self.host = host
-        self.port = port
-        self.metrics_collector = MetricsCollector()
-        self.app = web.Application()
-        self.setup_routes()
-        
-    def setup_routes(self):
-        """Set up the routes served to Prometheus"""
-        self.app.router.add_get('/', self.root_handler)
-        self.app.router.add_get('/metrics', self.metrics_handler)
-        self.app.router.add_get('/health', self.health_handler)
-        
-    async def root_handler(self, request):
-        """Root page"""
-        return web.Response(
-            text="Prometheus Metrics Server\n\n"
-                 "Available endpoints:\n"
-                 "- /metrics - Prometheus metrics\n"
-                 "- /health - Health check",
-            content_type='text/plain'
-        )
-        
-    async def health_handler(self, request):
-        """Health check endpoint"""
-        return web.Response(
-            text="OK",
-            content_type='text/plain'
-        )
-        
-    async def metrics_handler(self, request):
-        """Endpoint serving Prometheus metrics"""
-        try:
-            metrics_data = self.metrics_collector.get_metrics_data()
-            prometheus_metrics = self._format_prometheus_metrics(metrics_data)
-            
-            return web.Response(
-                text=prometheus_metrics,
-                content_type='text/plain'
-            )
-            
-        except Exception as e:
-            logger.error(f"Failed to collect metrics: {e}")
-            return web.Response(
-                text=f"Error: {str(e)}",
-                status=500,
-                content_type='text/plain'
-            )
-    
-    def _format_prometheus_metrics(self, metrics_data: dict) -> str:
-        """Format metrics in the Prometheus text exposition format"""
-        lines = []
-        
-        # System information
-        lines.append("# HELP system_info System information")
-        lines.append("# TYPE system_info gauge")
-        lines.append(f"system_info{{os=\"{self.metrics_collector.os_type}\"}} 1")
-        
-        # CPU metrics
-        if 'cpu_usage_percent' in metrics_data:
-            lines.append("# HELP cpu_usage_percent CPU usage percentage")
-            lines.append("# TYPE cpu_usage_percent gauge")
-            lines.append(f"cpu_usage_percent {metrics_data['cpu_usage_percent']}")
-            
-        if 'load_average_1m' in metrics_data:
-            lines.append("# HELP load_average_1m 1 minute load average")
-            lines.append("# TYPE load_average_1m gauge")
-            lines.append(f"load_average_1m {metrics_data['load_average_1m']}")
-            
-        if 'load_average_5m' in metrics_data:
-            lines.append("# HELP load_average_5m 5 minute load average")
-            lines.append("# TYPE load_average_5m gauge")
-            lines.append(f"load_average_5m {metrics_data['load_average_5m']}")
-            
-        if 'load_average_15m' in metrics_data:
-            lines.append("# HELP load_average_15m 15 minute load average")
-            lines.append("# TYPE load_average_15m gauge")
-            lines.append(f"load_average_15m {metrics_data['load_average_15m']}")
-        
-        # RAM metrics
-        if 'ram_usage_percent' in metrics_data:
-            lines.append("# HELP ram_usage_percent RAM usage percentage")
-            lines.append("# TYPE ram_usage_percent gauge")
-            lines.append(f"ram_usage_percent {metrics_data['ram_usage_percent']}")
-        
-        # Disk metrics
-        if 'disk_usage_percent' in metrics_data:
-            lines.append("# HELP disk_usage_percent Disk usage percentage")
-            lines.append("# TYPE disk_usage_percent gauge")
-            lines.append(f"disk_usage_percent {metrics_data['disk_usage_percent']}")
-            
-        if 'disk_io_percent' in metrics_data:
-            lines.append("# HELP disk_io_percent Disk I/O usage percentage")
-            lines.append("# TYPE disk_io_percent gauge")
-            lines.append(f"disk_io_percent {metrics_data['disk_io_percent']}")
-        
-        # Swap metrics
-        if 'swap_usage_percent' in metrics_data:
-            lines.append("# HELP swap_usage_percent Swap usage percentage")
-            lines.append("# TYPE swap_usage_percent gauge")
-            lines.append(f"swap_usage_percent {metrics_data['swap_usage_percent']}")
-        
-        # Uptime metrics
-        if 'system_uptime_seconds' in metrics_data:
-            lines.append("# HELP system_uptime_seconds System uptime in seconds")
-            lines.append("# TYPE system_uptime_seconds gauge")
-            lines.append(f"system_uptime_seconds {metrics_data['system_uptime_seconds']}")
-            
-        if 'monitor_uptime_seconds' in metrics_data:
-            lines.append("# HELP monitor_uptime_seconds Monitor uptime in seconds")
-            lines.append("# TYPE monitor_uptime_seconds gauge")
-            lines.append(f"monitor_uptime_seconds {metrics_data['monitor_uptime_seconds']}")
-        
-        return '\n'.join(lines)
-    
-    async def start(self):
-        """Start the HTTP server"""
-        runner = web.AppRunner(self.app)
-        await runner.setup()
-        
-        site = web.TCPSite(runner, self.host, self.port)
-        await site.start()
-        
-        logger.info(f"Prometheus server started at http://{self.host}:{self.port}")
-        
-        return runner
-    
-    async def stop(self, runner):
-        """Stop the HTTP server"""
-        await runner.cleanup()
-        logger.info("Prometheus server stopped")
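
Note on the removed `_format_prometheus_metrics`: every gauge repeats the same three-line HELP/TYPE/value pattern, and the joined output has no trailing newline, although the Prometheus text exposition format expects the last line to end with a line feed. The sketch below shows one table-driven way to express the same pattern. The names `GAUGES` and `format_gauges` are hypothetical, and only three of the removed module's metrics are listed for brevity.

```python
from typing import Mapping

# (metric name, HELP text) pairs in output order.
# The names and texts mirror the removed formatter; the table itself is illustrative.
GAUGES = [
    ("cpu_usage_percent", "CPU usage percentage"),
    ("load_average_1m", "1 minute load average"),
    ("ram_usage_percent", "RAM usage percentage"),
]


def format_gauges(metrics: Mapping[str, float]) -> str:
    """Render the known gauges present in `metrics` as exposition-format text."""
    lines = []
    for name, help_text in GAUGES:
        if name not in metrics:
            continue  # skip metrics the collector did not report
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {metrics[name]}")
    # End the payload with a newline when there is anything to emit.
    return "\n".join(lines) + "\n" if lines else ""
```

Keys that are not listed in the table are ignored, which matches the removed code's behaviour of emitting only metrics it knows about.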
--- a/infra/monitoring/server_monitor.py
+++ b/infra/monitoring/server_monitor.py
@@ -1,62 +0,0 @@
-import asyncio
-import logging
-try:
-    from .metrics_collector import MetricsCollector
-    from .message_sender import MessageSender
-    from .prometheus_server import PrometheusServer
-except ImportError:
-    from metrics_collector import MetricsCollector
-    from message_sender import MessageSender
-    from prometheus_server import PrometheusServer
-
-logger = logging.getLogger(__name__)
-
-
-class ServerMonitor:
-    def __init__(self):
-        # Create the module instances
-        self.metrics_collector = MetricsCollector()
-        self.message_sender = MessageSender()
-        self.prometheus_server = PrometheusServer()
-        
-        logger.info(f"Server monitoring module started on {self.metrics_collector.os_type.upper()}")
-    
-    async def monitor_loop(self):
-        """Main monitoring loop"""
-        logger.info(f"Server monitoring module started on {self.metrics_collector.os_type.upper()}")
-        
-        # Start the Prometheus server
-        prometheus_runner = await self.prometheus_server.start()
-        
-        try:
-            while True:
-                try:
-                    # Process alerts and recoveries
-                    await self.message_sender.process_alerts_and_recoveries()
-                    
-                    # Check whether a status message is due
-                    if self.message_sender.should_send_status():
-                        await self.message_sender.send_status_message()
-                    
-                    # Pause between checks (30 seconds)
-                    await asyncio.sleep(30)
-                    
-                except Exception as e:
-                    logger.error(f"Error in the monitoring loop: {e}")
-                    await asyncio.sleep(30)
-        finally:
-            # Stop the Prometheus server on shutdown
-            await self.prometheus_server.stop(prometheus_runner)
-    
-    async def send_startup_status(self):
-        """Send the status message at startup"""
-        if self.message_sender.should_send_startup_status():
-            await self.message_sender.send_status_message()
-    
-    def get_system_info(self):
-        """Get system information (kept for backward compatibility)"""
-        return self.metrics_collector.get_system_info()
-    
-    def get_metrics_data(self):
-        """Get the data for Prometheus metrics (kept for backward compatibility)"""
-        return self.metrics_collector.get_metrics_data()
--- a/infra/monitoring/test_monitor.py
+++ b/infra/monitoring/test_monitor.py
@@ -1,98 +0,0 @@
-#!/usr/bin/env python3
-"""
-Test script for checking the monitoring module
-"""
-
-import sys
-import os
-import logging
-
-# Add the current directory to the import path
-sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
-
-from server_monitor import ServerMonitor
-
-# Configure logging
-logging.basicConfig(
-    level=logging.INFO,
-    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
-)
-
-def main():
-    """Main test function"""
-    print("🚀 Testing the server monitoring module")
-    print("=" * 50)
-    
-    try:
-        # Create the monitor instance
-        monitor = ServerMonitor()
-        
-        # Get system information
-        print("📊 Getting system information...")
-        system_info = monitor.get_system_info()
-        
-        if system_info:
-            print("✅ System information retrieved successfully")
-            print(f"   CPU: {system_info.get('cpu_percent', 'N/A')}%")
-            print(f"   RAM: {system_info.get('ram_percent', 'N/A')}%")
-            print(f"   Disk: {system_info.get('disk_percent', 'N/A')}%")
-            print(f"   Host: {system_info.get('server_hostname', 'N/A')}")
-            print(f"   OS: {monitor.os_type}")
-        else:
-            print("❌ Failed to get system information")
-            return 1
-        
-        # Check process status
-        print("\n🤖 Checking process status...")
-        helper_status, helper_uptime = monitor.check_process_status('helper_bot')
-        
-        print(f"   Helper Bot: {helper_status} - {helper_uptime}")
-        
-        # Get metrics for Prometheus
-        print("\n📈 Getting metrics for Prometheus...")
-        metrics = monitor.get_metrics_data()
-        
-        if metrics:
-            print("✅ Metrics retrieved successfully")
-            for key, value in metrics.items():
-                print(f"   {key}: {value}")
-        else:
-            print("❌ Failed to get metrics")
-        
-        # Check alerts
-        print("\n🚨 Checking alerts...")
-        alerts, recoveries = monitor.check_alerts(system_info)
-        
-        if alerts:
-            print(f"   Alerts found: {len(alerts)}")
-            for alert_type, value, details in alerts:
-                print(f"     {alert_type}: {value}% - {details}")
-        else:
-            print("   No alerts found")
-        
-        if recoveries:
-            print(f"   Recoveries found: {len(recoveries)}")
-            for recovery_type, value in recoveries:
-                print(f"     {recovery_type}: {value}%")
-        
-        # Build the status message
-        print("\n💬 Building the status message...")
-        status_message = monitor.get_status_message(system_info)
-        if status_message:
-            print("✅ Status message built")
-            print("   First 200 characters:")
-            print(f"   {status_message[:200]}...")
-        else:
-            print("❌ Failed to build the status message")
-        
-        print("\n🎉 Testing completed successfully!")
-        
-    except Exception as e:
-        print(f"❌ Error during testing: {e}")
-        logging.error(f"Error during testing: {e}", exc_info=True)
-        return 1
-    
-    return 0
-
-if __name__ == "__main__":
-    exit(main())
--- a/infra/nginx/AUTH_SETUP.md
+++ b/infra/nginx/AUTH_SETUP.md
@@ -0,0 +1,104 @@
+# Setting up authentication for monitoring
+
+## Overview
+
+HTTP Basic Authentication has been added for the following services:
+- **Prometheus** (`/prometheus/`) - metrics and monitoring
+- **Alertmanager** (`/alerts/` and `/api/v1/`) - alert management
+
+## Managing passwords
+
+### Automatic setup via Ansible
+
+When deploying via Ansible, passwords are configured automatically:
+
+```bash
+# Use the default passwords
+ansible-playbook -i inventory.ini playbook.yml
+
+# Set your own passwords
+ansible-playbook -i inventory.ini playbook.yml \
+  -e monitoring_username=myuser \
+  -e monitoring_password=mypassword
+```
+
+### Manual setup
+
+1. **Create the password file:**
+```bash
+sudo mkdir -p /etc/nginx/passwords
+sudo htpasswd -c /etc/nginx/passwords/monitoring.htpasswd admin
+```
+
+2. **Add more users:**
+```bash
+sudo htpasswd /etc/nginx/passwords/monitoring.htpasswd username
+```
+
+3. **Set the correct ownership and permissions:**
+```bash
+sudo chown root:www-data /etc/nginx/passwords/monitoring.htpasswd
+sudo chmod 640 /etc/nginx/passwords/monitoring.htpasswd
+```
+
+4. **Reload nginx:**
+```bash
+sudo systemctl reload nginx
+```
+
+### Using the generation script
+
+```bash
+# Generate a password for the admin user
+sudo /usr/local/bin/generate_auth_passwords.sh admin
+
+# Generate a password for another user
+sudo /usr/local/bin/generate_auth_passwords.sh myuser
+```
+
+## Accessing the services
+
+Once authentication is configured, the services are available at:
+
+- **Prometheus**: `https://your-server/prometheus/`
+- **Alertmanager**: `https://your-server/alerts/`
+- **Alertmanager API**: `https://your-server/api/v1/`
+
+On the first request, the browser prompts for a username and password.
+
+## Health Check endpoints
+
+The following endpoints remain available without authentication so that external monitoring can reach them:
+
+- `https://your-server/prometheus/-/healthy` - Prometheus health check
+- `https://your-server/nginx-health` - nginx health check
+
+## Security
+
+- Passwords are stored as hashes in `/etc/nginx/passwords/monitoring.htpasswd`
+- The file is writable only by root and readable only by root and the www-data group
+- HTTPS is used for all connections
+- Brute-force protection is configured through fail2ban
+
+## Troubleshooting
+
+### Checking the nginx configuration
+```bash
+sudo nginx -t
+```
+
+### Checking the password file
+```bash
+sudo cat /etc/nginx/passwords/monitoring.htpasswd
+```
+
+### Checking the nginx logs
+```bash
+sudo tail -f /var/log/nginx/error.log
+sudo tail -f /var/log/nginx/access.log
+```
+
+### Resetting a password
+```bash
+sudo htpasswd /etc/nginx/passwords/monitoring.htpasswd admin
+```
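
Note on AUTH_SETUP.md: the document describes the browser prompt, but scripts and health probes that call the protected locations have to send the credentials themselves. With HTTP Basic Authentication this is an `Authorization` header carrying the Base64 encoding of `username:password`. A minimal sketch follows; `basic_auth_header` is a hypothetical helper, and the credentials in the usage note are placeholders.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header for HTTP Basic Authentication.

    Hypothetical helper for illustration; not part of this commit.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The returned dictionary can be passed as the `headers` argument of an HTTP client when requesting `https://your-server/prometheus/`. Base64 is an encoding, not encryption, so the header should only be sent over HTTPS.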
--- a/infra/nginx/README.md
+++ b/infra/nginx/README.md
@@ -0,0 +1,106 @@
+# Nginx Reverse Proxy Configuration
+
+## Overview
+
+This nginx configuration provides secure access to the monitoring services over HTTPS with self-signed SSL certificates.
+
+## Architecture
+
+```
+Internet → Nginx (443) → 
+    ├→ /grafana → Grafana (3000)
+    ├→ /prometheus → Prometheus (9090)  
+    ├→ /status → Status page (with Basic Auth)
+    └→ / → Redirect to /grafana
+```
+
+## File structure
+
+```
+infra/nginx/
+├── nginx.conf                 # Main nginx configuration
+├── ssl/                       # SSL certificates (created automatically)
+│   ├── cert.pem              # SSL certificate
+│   └── key.pem               # Private key
+├── conf.d/                    # Location configurations
+│   ├── grafana.conf          # Config for Grafana
+│   ├── prometheus.conf       # Config for Prometheus
+│   └── status.conf           # Config for the status page
+└── .htpasswd                 # Basic Auth for the status page
+```
+
+## Accessing the services
+
+### Grafana
+- **URL**: `https://your-server-ip/grafana/`
+- **Authentication**: Grafana admin credentials
+- **Notes**: Configured to work behind a sub-path
+
+### Prometheus
+- **URL**: `https://your-server-ip/prometheus/`
+- **Notes**: Full access to the Prometheus UI
+
+### Status Page
+- **URL**: `https://your-server-ip/status`
+- **Authentication**: Basic Auth (admin/admin123 by default)
+- **Notes**: Shows the nginx status (a placeholder for Uptime Kuma)
+
+## Environment variables
+
+Add the following to your `.env` file:
+
+```bash
+# Server Configuration
+SERVER_IP=your_server_ip_here
+
+# Status Page Configuration  
+STATUS_PAGE_PASSWORD=admin123
+```
+
+## Security
+
+- **SSL/TLS**: Self-signed certificates (valid for 365 days)
+- **Rate Limiting**: 10 req/s for the API, 1 req/s for the status page
+- **Security Headers**: X-Frame-Options, X-Content-Type-Options, CSP
+- **Basic Auth**: For the status page
+- **Fail2ban**: Integrated with the nginx logs
+
+## Monitoring
+
+- **Health Check**: `https://your-server-ip/nginx-health`
+- **Nginx Status**: `https://your-server-ip/nginx_status` (local networks only)
+- **Logs**: `/var/log/nginx/access.log`, `/var/log/nginx/error.log`
+
+## Deployment
+
+The configuration is deployed automatically by the Ansible playbook:
+
+```bash
+ansible-playbook -i inventory.ini playbook.yml
+```
+
+## Troubleshooting
+
+### Checking the nginx configuration
+```bash
+nginx -t
+```
+
+### Checking the SSL certificates
+```bash
+openssl x509 -in /etc/nginx/ssl/cert.pem -text -noout
+```
+
+### Checking service availability
+```bash
+curl -k https://your-server-ip/grafana/api/health
+curl -k https://your-server-ip/prometheus/-/healthy
+curl -k https://your-server-ip/nginx-health
+```
+
+## Future improvements
+
+- Uptime Kuma integration for the status page
+- Let's Encrypt certificates instead of self-signed ones
+- Additional security headers
+- Monitoring of nginx metrics in Prometheus
--- a/infra/nginx/nginx.conf
+++ b/infra/nginx/nginx.conf
@@ -0,0 +1,403 @@
+user www-data;
+worker_processes auto;
+error_log /var/log/nginx/error.log warn;
+pid /var/run/nginx.pid;
+
+events {
+    worker_connections 1024;
+    use epoll;
+    multi_accept on;
+}
+
+http {
+    include /etc/nginx/mime.types;
+    default_type application/octet-stream;
+
+    # Logging format
+    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
+                    '$status $body_bytes_sent "$http_referer" '
+                    '"$http_user_agent" "$http_x_forwarded_for"';
+
+    access_log /var/log/nginx/access.log main;
+
+    # Basic settings
+    sendfile on;
+    tcp_nopush on;
+    tcp_nodelay on;
+    keepalive_timeout 65;
+    types_hash_max_size 2048;
+    client_max_body_size 16M;
+
+    # Gzip compression
+    gzip on;
+    gzip_vary on;
+    gzip_min_length 1024;
+    gzip_proxied any;
+    gzip_comp_level 6;
+    gzip_types
+        text/plain
+        text/css
+        text/xml
+        text/javascript
+        application/json
+        application/javascript
+        application/xml+rss
+        application/atom+xml
+        image/svg+xml;
+
+    # Rate limiting
+    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
+    limit_req_zone $binary_remote_addr zone=status:10m rate=1r/s;
+
+    # Security headers
+    add_header X-Frame-Options "SAMEORIGIN" always;
+    add_header X-Content-Type-Options "nosniff" always;
+    add_header X-XSS-Protection "1; mode=block" always;
+    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
+    add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; font-src 'self' data:; connect-src 'self' wss: https:;" always;
+
+    # SSL configuration
+    ssl_protocols TLSv1.2 TLSv1.3;
+    ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA384;
+    ssl_prefer_server_ciphers off;
+    ssl_session_cache shared:SSL:10m;
+    ssl_session_timeout 10m;
+
+    # Upstream configurations
+    upstream grafana_backend {
+        server localhost:3000;
+        keepalive 32;
+    }
+
+    upstream prometheus_backend {
+        server localhost:9090;
+        keepalive 32;
+    }
+
+    upstream uptime_kuma_backend {
+        server localhost:3001;
+        keepalive 32;
+    }
+
+    upstream alertmanager_backend {
+        server localhost:9093;
+        keepalive 32;
+    }
+
+    # Main server block
+    # Redirect HTTP to HTTPS
+    server {
+        listen 80;
+        server_name _;
+        return 301 https://$host$request_uri;
+    }
+
+    server {
+        listen 443 ssl http2;
+        server_name _;
+
+        # SSL configuration (self-signed certificate)
+        ssl_certificate /etc/nginx/ssl/fullchain.pem;
+        ssl_certificate_key /etc/nginx/ssl/privkey.pem;
+        ssl_protocols TLSv1.2 TLSv1.3;
+        ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA384;
+        ssl_prefer_server_ciphers off;
+        ssl_session_cache shared:SSL:10m;
+        ssl_session_timeout 10m;
+
+        # Security headers
+        add_header X-Frame-Options "SAMEORIGIN" always;
+        add_header X-Content-Type-Options "nosniff" always;
+
+        # Root page - show simple status
+        location = / {
+            return 200 "Bot Infrastructure Status\n\nServices:\n- Grafana: /grafana/\n- Prometheus: /prometheus/\n- Uptime Kuma: /status/\n- Alertmanager: /alerts/\n";
+            default_type text/plain;
+        }
+
+        # Health check endpoint
+        location /nginx-health {
+            access_log off;
+            return 200 "healthy\n";
+            default_type text/plain;
+        }
+
+        # Uptime Kuma status page
+        location /status {
+            # Rate limiting
+            limit_req zone=status burst=5 nodelay;
+            
+            # Proxy to Uptime Kuma
+            proxy_pass http://127.0.0.1:3001/;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            
+            # WebSocket support
+            proxy_http_version 1.1;
+            proxy_set_header Upgrade $http_upgrade;
+            proxy_set_header Connection "upgrade";
+            
+            # Timeouts
+            proxy_connect_timeout 30s;
+            proxy_send_timeout 30s;
+            proxy_read_timeout 30s;
+            
+            # Buffer settings
+            proxy_buffering on;
+            proxy_buffer_size 4k;
+            proxy_buffers 8 4k;
+        }
+
+        # Uptime Kuma dashboard
+        location /dashboard {
+            # Rate limiting
+            limit_req zone=status burst=5 nodelay;
+            
+            # Proxy to Uptime Kuma
+            proxy_pass http://127.0.0.1:3001/dashboard;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            
+            # WebSocket support
+            proxy_http_version 1.1;
+            proxy_set_header Upgrade $http_upgrade;
+            proxy_set_header Connection "upgrade";
+            
+            # Timeouts
+            proxy_connect_timeout 30s;
+            proxy_send_timeout 30s;
+            proxy_read_timeout 30s;
+            
+            # Buffer settings
+            proxy_buffering on;
+            proxy_buffer_size 4k;
+            proxy_buffers 8 4k;
+        }
+
+        # Uptime Kuma static assets
+        location /assets/ {
+            # Rate limiting
+            limit_req zone=api burst=20 nodelay;
+            
+            # Proxy to Uptime Kuma
+            proxy_pass http://127.0.0.1:3001;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            
+            # Cache static assets
+            expires 1y;
+            add_header Cache-Control "public, immutable";
+        }
+
+        # Uptime Kuma icons and manifest
+        location ~ ^/(icon.*\.(png|svg)|apple-touch-icon.*\.png|manifest\.json)$ {
+            # Rate limiting
+            limit_req zone=api burst=20 nodelay;
+            
+            # Proxy to Uptime Kuma
+            proxy_pass http://127.0.0.1:3001;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            
+            # Cache static assets
+            expires 1y;
+            add_header Cache-Control "public, immutable";
+        }
+
+        # Uptime Kuma WebSocket (Socket.IO)
+        location /socket.io/ {
+            # Rate limiting
+            limit_req zone=api burst=20 nodelay;
+            
+            # Proxy to Uptime Kuma
+            proxy_pass http://127.0.0.1:3001;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            
+            # WebSocket support
+            proxy_http_version 1.1;
+            proxy_set_header Upgrade $http_upgrade;
+            proxy_set_header Connection "upgrade";
+            
+            # Timeouts
+            proxy_connect_timeout 30s;
+            proxy_send_timeout 30s;
+            proxy_read_timeout 30s;
+        }
+
+        # Uptime Kuma API endpoints
+        location /api/ {
+            # Rate limiting
+            limit_req zone=api burst=10 nodelay;
+            
+            # Proxy to Uptime Kuma
+            proxy_pass http://127.0.0.1:3001;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            
+            # CORS headers
+            add_header Access-Control-Allow-Origin "*" always;
+            add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, OPTIONS" always;
+            add_header Access-Control-Allow-Headers "DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Authorization" always;
+            
+            # Handle preflight requests
+            if ($request_method = 'OPTIONS') {
+                add_header Access-Control-Allow-Origin "*";
+                add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, OPTIONS";
+                add_header Access-Control-Allow-Headers "DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Authorization";
+                add_header Access-Control-Max-Age 1728000;
+                add_header Content-Type "text/plain; charset=utf-8";
+                add_header Content-Length 0;
+                return 204;
+            }
+        }
+
+        # Grafana proxy configuration
+        location /grafana/ {
+            proxy_pass http://grafana_backend;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            proxy_set_header X-Forwarded-Host $host;
+            proxy_set_header X-Forwarded-Port $server_port;
+
+            # WebSocket support for Grafana
+            proxy_http_version 1.1;
+            proxy_set_header Upgrade $http_upgrade;
+            proxy_set_header Connection "upgrade";
+
+            # Timeouts
+            proxy_connect_timeout 60s;
+            proxy_send_timeout 60s;
+            proxy_read_timeout 60s;
+
+            # Buffer settings
+            proxy_buffering on;
+            proxy_buffer_size 4k;
+            proxy_buffers 8 4k;
+            proxy_busy_buffers_size 8k;
+        }
+
+        # Prometheus proxy configuration with authentication
+        location /prometheus/ {
+            # HTTP Basic Authentication
+            auth_basic "Prometheus Monitoring";
+            auth_basic_user_file /etc/nginx/passwords/monitoring.htpasswd;
+            
+            # Rate limiting
+            limit_req zone=api burst=10 nodelay;
+            
+            proxy_pass http://prometheus_backend/prometheus/;
+            proxy_redirect /prometheus/ /prometheus/;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            proxy_set_header X-Forwarded-Host $host;
+            proxy_set_header X-Forwarded-Port $server_port;
+
+            # Timeouts
+            proxy_connect_timeout 30s;
+            proxy_send_timeout 30s;
+            proxy_read_timeout 30s;
+
+            # Buffer settings
+            proxy_buffering on;
+            proxy_buffer_size 4k;
+            proxy_buffers 8 4k;
+            proxy_busy_buffers_size 8k;
+        }
+
+        # Prometheus health check endpoint
+        location /prometheus/-/healthy {
+            proxy_pass http://prometheus_backend/prometheus/-/healthy;
+            proxy_set_header Host $host;
+            access_log off;
+        }
+
+        # Alertmanager proxy configuration with authentication
+        location /alerts/ {
+            # HTTP Basic Authentication
+            auth_basic "Alertmanager Monitoring";
+            auth_basic_user_file /etc/nginx/passwords/monitoring.htpasswd;
+            
+            # Rate limiting
+            limit_req zone=api burst=10 nodelay;
+            
+            # Strip the /alerts/ prefix before proxying
+            rewrite ^/alerts/(.*)$ /$1 break;
+            
+            # Proxy to Alertmanager
+            proxy_pass http://alertmanager_backend;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            proxy_set_header X-Forwarded-Host $host;
+            proxy_set_header X-Forwarded-Port $server_port;
+
+            # Timeouts
+            proxy_connect_timeout 30s;
+            proxy_send_timeout 30s;
+            proxy_read_timeout 30s;
+
+            # Buffer settings
+            proxy_buffering on;
+            proxy_buffer_size 4k;
+            proxy_buffers 8 4k;
+            proxy_busy_buffers_size 8k;
+
+            # Security headers
+            add_header X-Frame-Options "SAMEORIGIN" always;
+            add_header X-Content-Type-Options "nosniff" always;
+        }
+
+        # Alertmanager API with authentication
+        location /api/v1/ {
+            # HTTP Basic Authentication
+            auth_basic "Alertmanager API";
+            auth_basic_user_file /etc/nginx/passwords/monitoring.htpasswd;
+            
+            # Rate limiting
+            limit_req zone=api burst=20 nodelay;
+            
+            # Proxy to Alertmanager
+            proxy_pass http://alertmanager_backend;
+            proxy_set_header Host $host;
+            proxy_set_header X-Real-IP $remote_addr;
+            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+            proxy_set_header X-Forwarded-Proto $scheme;
+            
+            # CORS headers
+            add_header Access-Control-Allow-Origin "*" always;
+            add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, OPTIONS" always;
+            add_header Access-Control-Allow-Headers "DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Authorization" always;
+            
+            # Handle preflight requests
+            if ($request_method = 'OPTIONS') {
+                add_header Access-Control-Allow-Origin "*";
+                add_header Access-Control-Allow-Methods "GET, POST, PUT, DELETE, OPTIONS";
+                add_header Access-Control-Allow-Headers "DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Authorization";
+                add_header Access-Control-Max-Age 1728000;
+                add_header Content-Type "text/plain; charset=utf-8";
+                add_header Content-Length 0;
+                return 204;
+            }
+        }
+
+        # All location configurations are now integrated into this file
+    }
+}
--- a/infra/nginx/ssl/letsencrypt.conf
+++ b/infra/nginx/ssl/letsencrypt.conf
@@ -0,0 +1,27 @@
+# Let's Encrypt SSL Configuration
+# This file contains the SSL configuration for Let's Encrypt certificates
+
+# SSL certificate paths (Let's Encrypt)
+ssl_certificate /etc/letsencrypt/live/{{DOMAIN}}/fullchain.pem;
+ssl_certificate_key /etc/letsencrypt/live/{{DOMAIN}}/privkey.pem;
+
+# SSL Security Configuration
+ssl_protocols TLSv1.2 TLSv1.3;
+ssl_ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-CHACHA20-POLY1305;
+ssl_prefer_server_ciphers off;
+ssl_session_cache shared:SSL:10m;
+ssl_session_timeout 10m;
+ssl_session_tickets off;
+
+# OCSP Stapling
+ssl_stapling on;
+ssl_stapling_verify on;
+ssl_trusted_certificate /etc/letsencrypt/live/{{DOMAIN}}/chain.pem;
+
+# Security Headers
+add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
+add_header X-Frame-Options "SAMEORIGIN" always;
+add_header X-Content-Type-Options "nosniff" always;
+add_header X-XSS-Protection "1; mode=block" always;
+add_header Referrer-Policy "strict-origin-when-cross-origin" always;
+add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; font-src 'self' data:; connect-src 'self' wss: https:;" always;
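Note on the `{{DOMAIN}}` placeholder: it is not nginx syntax, so the file has to be rendered before nginx loads it (scripts/setup-ssl.sh in this change does that with `sed`). The sketch below shows the same substitution in Python with a check that no placeholder is left without a value. The helper name `render_template` and the `{{NAME}}` pattern it accepts are assumptions made for illustration, not part of the repository.

```python
import re

# Matches placeholders such as {{DOMAIN}}; the upper-case naming rule is an
# assumption based on the single placeholder used in letsencrypt.conf.
PLACEHOLDER_RE = re.compile(r"\{\{([A-Z_]+)\}\}")


def render_template(template: str, values: dict[str, str]) -> str:
    """Replace {{NAME}} placeholders and fail if any placeholder has no value."""
    missing = {name for name in PLACEHOLDER_RE.findall(template) if name not in values}
    if missing:
        raise KeyError(f"no value for placeholders: {sorted(missing)}")
    return PLACEHOLDER_RE.sub(lambda match: values[match.group(1)], template)


if __name__ == "__main__":
    line = "ssl_certificate /etc/letsencrypt/live/{{DOMAIN}}/fullchain.pem;"
    print(render_template(line, {"DOMAIN": "example.com"}))
```

Failing on an unknown placeholder is a deliberate choice here: a half-rendered file would otherwise only surface as an nginx start-up error.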
--- a/infra/prometheus/alert_rules.yml
+++ b/infra/prometheus/alert_rules.yml
@@ -0,0 +1,253 @@
+# Prometheus Alert Rules
+# This file defines alerting rules for monitoring the bot infrastructure
+
+groups:
+  # Bot Health Monitoring
+  - name: bot_health
+    rules:
+      # Telegram Bot Health
+      - alert: TelegramBotDown
+        expr: up{job="telegram-helper-bot"} == 0
+        for: 1m
+        labels:
+          severity: critical
+          service: telegram-bot
+        annotations:
+          summary: "Telegram Bot is down"
+          description: "Telegram Bot has been down for more than 1 minute"
+          runbook_url: "https://docs.example.com/runbooks/telegram-bot-down"
+
+      - alert: TelegramBotHighErrorRate
+        expr: rate(http_requests_total{job="telegram-helper-bot",status=~"5.."}[5m]) > 0.1
+        for: 2m
+        labels:
+          severity: warning
+          service: telegram-bot
+        annotations:
+          summary: "Telegram Bot high error rate"
+          description: "Telegram Bot error rate is {{ $value }} errors per second"
+
+      - alert: TelegramBotHighResponseTime
+        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{job="telegram-helper-bot"}[5m])) > 2
+        for: 5m
+        labels:
+          severity: warning
+          service: telegram-bot
+        annotations:
+          summary: "Telegram Bot high response time"
+          description: "95th percentile response time is {{ $value }} seconds"
+
+      # AnonBot Health
+      - alert: AnonBotDown
+        expr: up{job="anon-bot"} == 0
+        for: 1m
+        labels:
+          severity: critical
+          service: anon-bot
+        annotations:
+          summary: "AnonBot is down"
+          description: "AnonBot has been down for more than 1 minute"
+          runbook_url: "https://docs.example.com/runbooks/anon-bot-down"
+
+      - alert: AnonBotHighErrorRate
+        expr: rate(http_requests_total{job="anon-bot",status=~"5.."}[5m]) > 0.1
+        for: 2m
+        labels:
+          severity: warning
+          service: anon-bot
+        annotations:
+          summary: "AnonBot high error rate"
+          description: "AnonBot error rate is {{ $value }} errors per second"
+
+      - alert: AnonBotHighResponseTime
+        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{job="anon-bot"}[5m])) > 2
+        for: 5m
+        labels:
+          severity: warning
+          service: anon-bot
+        annotations:
+          summary: "AnonBot high response time"
+          description: "95th percentile response time is {{ $value }} seconds"
+
+  # Infrastructure Health Monitoring
+  - name: infrastructure_health
+    rules:
+      # Prometheus Health
+      - alert: PrometheusDown
+        expr: up{job="prometheus"} == 0
+        for: 1m
+        labels:
+          severity: critical
+          service: prometheus
+        annotations:
+          summary: "Prometheus is down"
+          description: "Prometheus has been down for more than 1 minute"
+
+      - alert: PrometheusHighMemoryUsage
+        expr: (prometheus_tsdb_head_series / prometheus_tsdb_head_series_limit) > 0.8
+        for: 5m
+        labels:
+          severity: warning
+          service: prometheus
+        annotations:
+          summary: "Prometheus high memory usage"
+          description: "Prometheus memory usage is {{ $value | humanizePercentage }} of limit"
+
+      # Grafana Health
+      - alert: GrafanaDown
+        expr: up{job="grafana"} == 0
+        for: 1m
+        labels:
+          severity: critical
+          service: grafana
+        annotations:
+          summary: "Grafana is down"
+          description: "Grafana has been down for more than 1 minute"
+
+      # Nginx Health
+      - alert: NginxDown
+        expr: up{job="nginx"} == 0
+        for: 1m
+        labels:
+          severity: critical
+          service: nginx
+        annotations:
+          summary: "Nginx is down"
+          description: "Nginx has been down for more than 1 minute"
+
+      - alert: NginxHighErrorRate
+        expr: rate(nginx_http_requests_total{status=~"5.."}[5m]) > 0.1
+        for: 2m
+        labels:
+          severity: warning
+          service: nginx
+        annotations:
+          summary: "Nginx high error rate"
+          description: "Nginx error rate is {{ $value }} errors per second"
+
+  # System Resource Monitoring
+  - name: system_resources
+    rules:
+      # High CPU Usage
+      - alert: HighCPUUsage
+        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
+        for: 5m
+        labels:
+          severity: warning
+          service: system
+        annotations:
+          summary: "High CPU usage"
+          description: "CPU usage is {{ $value }}% on {{ $labels.instance }}"
+
+      - alert: VeryHighCPUUsage
+        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 95
+        for: 2m
+        labels:
+          severity: critical
+          service: system
+        annotations:
+          summary: "Very high CPU usage"
+          description: "CPU usage is {{ $value }}% on {{ $labels.instance }}"
+
+      # High Memory Usage
+      - alert: HighMemoryUsage
+        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 80
+        for: 5m
+        labels:
+          severity: warning
+          service: system
+        annotations:
+          summary: "High memory usage"
+          description: "Memory usage is {{ $value }}% on {{ $labels.instance }}"
+
+      - alert: VeryHighMemoryUsage
+        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 95
+        for: 2m
+        labels:
+          severity: critical
+          service: system
+        annotations:
+          summary: "Very high memory usage"
+          description: "Memory usage is {{ $value }}% on {{ $labels.instance }}"
+
+      # Disk Space
+      - alert: LowDiskSpace
+        expr: (1 - (node_filesystem_avail_bytes / node_filesystem_size_bytes)) * 100 > 80
+        for: 5m
+        labels:
+          severity: warning
+          service: system
+        annotations:
+          summary: "Low disk space"
+          description: "Disk usage is {{ $value }}% on {{ $labels.instance }} ({{ $labels.mountpoint }})"
+
+      - alert: VeryLowDiskSpace
+        expr: (1 - (node_filesystem_avail_bytes / node_filesystem_size_bytes)) * 100 > 95
+        for: 2m
+        labels:
+          severity: critical
+          service: system
+        annotations:
+          summary: "Very low disk space"
+          description: "Disk usage is {{ $value }}% on {{ $labels.instance }} ({{ $labels.mountpoint }})"
+
+  # Docker Container Monitoring
+  - name: docker_containers
+    rules:
+      # Container Restart
+      - alert: ContainerRestarting
+        expr: changes(container_start_time_seconds[10m]) > 0
+        for: 0m
+        labels:
+          severity: warning
+          service: docker
+        annotations:
+          summary: "Container restarting"
+          description: "Container {{ $labels.name }} is restarting frequently"
+
+      # Container High Memory Usage
+      - alert: ContainerHighMemoryUsage
+        expr: (container_memory_usage_bytes / container_spec_memory_limit_bytes) * 100 > 80
+        for: 5m
+        labels:
+          severity: warning
+          service: docker
+        annotations:
+          summary: "Container high memory usage"
+          description: "Container {{ $labels.name }} memory usage is {{ $value }}%"
+
+      # Container High CPU Usage
+      - alert: ContainerHighCPUUsage
+        expr: (sum by (name) (rate(container_cpu_usage_seconds_total[5m])) / sum by (name) (container_spec_cpu_quota / container_spec_cpu_period) * 100) > 80
+        for: 5m
+        labels:
+          severity: warning
+          service: docker
+        annotations:
+          summary: "Container high CPU usage"
+          description: "Container {{ $labels.name }} CPU usage is {{ $value }}%"
+
+  # Database Monitoring
+  - name: database_health
+    rules:
+      # Database Connection Issues
+      - alert: DatabaseConnectionFailed
+        expr: increase(database_connection_errors_total[5m]) > 5
+        for: 1m
+        labels:
+          severity: critical
+          service: database
+        annotations:
+          summary: "Database connection failures"
+          description: "{{ $value }} database connection failures in the last 5 minutes"
+
+      # Database High Query Time
+      - alert: DatabaseHighQueryTime
+        expr: histogram_quantile(0.95, rate(database_query_duration_seconds_bucket[5m])) > 1
+        for: 5m
+        labels:
+          severity: warning
+          service: database
+        annotations:
+          summary: "Database high query time"
+          description: "95th percentile database query time is {{ $value }} seconds"
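Note on job selectors: the `up{job="..."}` expressions above only match if a scrape job with exactly that `job_name` exists in prometheus.yml, so a mismatch makes an alert silently never fire. The following is a minimal sketch of a consistency check. The helper `find_unknown_jobs` is hypothetical, and it scans the raw text with regular expressions rather than parsing YAML, so it only understands the simple `job_name: '...'` and `job="..."` forms.

```python
import re

# Hypothetical helper patterns, not part of the repository.
JOB_NAME_RE = re.compile(r"job_name:\s*['\"]([^'\"]+)['\"]")
JOB_MATCHER_RE = re.compile(r"job=\"([^\"]+)\"")


def find_unknown_jobs(prometheus_yml: str, alert_rules_yml: str) -> set[str]:
    """Return job names referenced by alert rules but absent from scrape_configs."""
    scraped = set(JOB_NAME_RE.findall(prometheus_yml))
    referenced = set(JOB_MATCHER_RE.findall(alert_rules_yml))
    return referenced - scraped


if __name__ == "__main__":
    # Illustrative inputs only; real use would read the two files from disk.
    scrape_config = """
scrape_configs:
  - job_name: 'prometheus'
  - job_name: 'example-bot'
"""
    rules = """
      - alert: ExampleBotDown
        expr: up{job="example-bot"} == 0
      - alert: OtherDown
        expr: up{job="other-service"} == 0
"""
    print(sorted(find_unknown_jobs(scrape_config, rules)))
```

With the illustrative inputs, only `other-service` is reported, because it is referenced by a rule but never scraped.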
--- a/infra/prometheus/prometheus.yml
+++ b/infra/prometheus/prometheus.yml
@@ -3,26 +3,23 @@ global:
  evaluation_interval: 15s

 rule_files:
-  # - "first_rules.yml"
-  # - "second_rules.yml"
+  - "alert_rules.yml"

 scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

-  # Job for infrastructure monitoring
-  - job_name: 'infrastructure'
+  # Job for Node Exporter monitoring
+  - job_name: 'node'
    static_configs:
-      - targets: ['host.docker.internal:9091']  # Metrics port of the monitoring server
-    metrics_path: '/metrics'
-    scrape_interval: 30s
-    scrape_timeout: 10s
-    honor_labels: true
-  
+      - targets: ['172.20.0.1:9100']  # Node Exporter on the host, reached via the Docker gateway
+        labels:
+          instance: 'main-server'
+
  - job_name: 'telegram-helper-bot'
    static_configs:
-      - targets: ['host.docker.internal:8080']  # Local bot on port 8080
+      - targets: ['bots_telegram_bot:8080']  # Bot on port 8080
        labels:
          bot_name: 'telegram-helper-bot'
          environment: 'production'
@@ -32,8 +29,20 @@ scrape_configs:
    scrape_timeout: 10s
    honor_labels: true

+  - job_name: 'anon-bot'
+    static_configs:
+      - targets: ['bots_anon_bot:8081']  # AnonBot on port 8081
+        labels:
+          bot_name: 'anon-bot'
+          environment: 'production'
+          service: 'anon-bot'
+    metrics_path: '/metrics'
+    scrape_interval: 15s
+    scrape_timeout: 10s
+    honor_labels: true
+
 alerting:
  alertmanagers:
    - static_configs:
        - targets:
-          # - alertmanager:9093
+          - alertmanager:9093
--- a/infra/uptime-kuma/README.md
+++ b/infra/uptime-kuma/README.md
@@ -0,0 +1,77 @@
+# Uptime Kuma Configuration
+
+Uptime Kuma is a status page for monitoring service availability.
+
+## Access
+
+- **Web interface**: `https://your-domain/status/`
+- **Direct access**: `http://localhost:3001` (local only)
+
+## Setup
+
+### Initial setup
+
+1. Start the services:
+   ```bash
+   make up
+   ```
+
+2. Open `https://your-domain/status/`
+
+3. Create an administrator account:
+   - Username: `admin`
+   - Password: `admin` (change it after the first login)
+
+### Service monitoring
+
+Uptime Kuma will automatically set up monitoring for the following services:
+
+- **Telegram Bot**: `http://telegram-bot:8080/health`
+- **AnonBot**: `http://anon-bot:8081/health`
+- **Prometheus**: `http://prometheus:9090/-/healthy`
+- **Grafana**: `http://grafana:3000/api/health`
+- **AlertManager**: `http://alertmanager:9093/-/healthy`
+- **Nginx**: `http://nginx:80/nginx-health`
+
+### Notifications
+
+Configure notifications in the web interface:
+- Telegram Bot
+- Email
+- Webhook
+- Discord
+- Slack
+
+## Configuration files
+
+- `monitors.json` - export of the configured monitors
+- `settings.json` - application settings
+- `backup/` - configuration backups
+
+## Management commands
+
+```bash
+# Show logs
+make logs-uptime-kuma
+
+# Restart
+make restart-uptime-kuma
+
+# Check status
+make status
+```
+
+## Backup
+
+The configuration is stored in the Docker volume `uptime_kuma_data`.
+To back it up:
+
+```bash
+# Create a backup
+make backup
+
+# Restore
+make restore FILE=backup.tar.gz
+```
+
+
--- a/infra/uptime-kuma/backup/README.md
+++ b/infra/uptime-kuma/backup/README.md
@@ -0,0 +1,36 @@
+# Uptime Kuma Backup
+
+This directory contains backups of the Uptime Kuma configuration.
+
+## Automatic backup
+
+Create a script for automatic backups:
+
+```bash
+#!/bin/bash
+# backup-uptime-kuma.sh
+
+DATE=$(date +%Y%m%d-%H%M%S)
+BACKUP_DIR="/path/to/backups"
+CONTAINER_NAME="bots_uptime_kuma"
+
+# Create the backup inside the container
+docker exec "$CONTAINER_NAME" tar -czf "/tmp/uptime-kuma-backup-$DATE.tar.gz" /app/data
+
+# Copy the backup to the host
+docker cp "$CONTAINER_NAME:/tmp/uptime-kuma-backup-$DATE.tar.gz" "$BACKUP_DIR/"
+
+# Remove the temporary file
+docker exec "$CONTAINER_NAME" rm "/tmp/uptime-kuma-backup-$DATE.tar.gz"
+
+echo "Backup created: $BACKUP_DIR/uptime-kuma-backup-$DATE.tar.gz"
+```
+
+## Restore
+
+```bash
+# Restore from a backup (set CONTAINER_NAME as in the script above)
+docker cp backup-file.tar.gz $CONTAINER_NAME:/tmp/
+docker exec $CONTAINER_NAME tar -xzf /tmp/backup-file.tar.gz -C /
+docker restart $CONTAINER_NAME
+```
--- a/infra/uptime-kuma/monitors.json
+++ b/infra/uptime-kuma/monitors.json
@@ -0,0 +1,147 @@
+{
+  "monitors": [
+    {
+      "id": 1,
+      "name": "Telegram Bot Health",
+      "url": "http://telegram-bot:8080/health",
+      "type": "http",
+      "method": "GET",
+      "interval": 60,
+      "retries": 3,
+      "timeout": 10,
+      "keyword": null,
+      "maxredirects": 10,
+      "ignoreTls": false,
+      "upsideDown": false,
+      "tags": ["bot", "telegram", "health"],
+      "description": "Мониторинг состояния Telegram Helper Bot",
+      "active": true
+    },
+    {
+      "id": 2,
+      "name": "AnonBot Health",
+      "url": "http://anon-bot:8081/health",
+      "type": "http",
+      "method": "GET",
+      "interval": 60,
+      "retries": 3,
+      "timeout": 10,
+      "keyword": null,
+      "maxredirects": 10,
+      "ignoreTls": false,
+      "upsideDown": false,
+      "tags": ["bot", "anon", "health"],
+      "description": "Мониторинг состояния AnonBot",
+      "active": true
+    },
+    {
+      "id": 3,
+      "name": "Prometheus Health",
+      "url": "http://prometheus:9090/-/healthy",
+      "type": "http",
+      "method": "GET",
+      "interval": 60,
+      "retries": 3,
+      "timeout": 10,
+      "keyword": null,
+      "maxredirects": 10,
+      "ignoreTls": false,
+      "upsideDown": false,
+      "tags": ["monitoring", "prometheus", "health"],
+      "description": "Мониторинг состояния Prometheus",
+      "active": true
+    },
+    {
+      "id": 4,
+      "name": "Grafana Health",
+      "url": "http://grafana:3000/api/health",
+      "type": "http",
+      "method": "GET",
+      "interval": 60,
+      "retries": 3,
+      "timeout": 10,
+      "keyword": null,
+      "maxredirects": 10,
+      "ignoreTls": false,
+      "upsideDown": false,
+      "tags": ["monitoring", "grafana", "health"],
+      "description": "Мониторинг состояния Grafana",
+      "active": true
+    },
+    {
+      "id": 5,
+      "name": "AlertManager Health",
+      "url": "http://alertmanager:9093/-/healthy",
+      "type": "http",
+      "method": "GET",
+      "interval": 60,
+      "retries": 3,
+      "timeout": 10,
+      "keyword": null,
+      "maxredirects": 10,
+      "ignoreTls": false,
+      "upsideDown": false,
+      "tags": ["monitoring", "alertmanager", "health"],
+      "description": "Мониторинг состояния AlertManager",
+      "active": true
+    },
+    {
+      "id": 6,
+      "name": "Nginx Health",
+      "url": "http://nginx:80/nginx-health",
+      "type": "http",
+      "method": "GET",
+      "interval": 60,
+      "retries": 3,
+      "timeout": 10,
+      "keyword": "healthy",
+      "maxredirects": 10,
+      "ignoreTls": false,
+      "upsideDown": false,
+      "tags": ["infrastructure", "nginx", "health"],
+      "description": "Мониторинг состояния Nginx",
+      "active": true
+    },
+    {
+      "id": 7,
+      "name": "External Bot Status",
+      "url": "https://your-domain/status/",
+      "type": "http",
+      "method": "GET",
+      "interval": 300,
+      "retries": 2,
+      "timeout": 15,
+      "keyword": null,
+      "maxredirects": 10,
+      "ignoreTls": false,
+      "upsideDown": false,
+      "tags": ["external", "status-page"],
+      "description": "Мониторинг внешней доступности статусной страницы",
+      "active": false
+    }
+  ],
+  "tags": [
+    {
+      "name": "bot",
+      "color": "#3498db"
+    },
+    {
+      "name": "monitoring",
+      "color": "#e74c3c"
+    },
+    {
+      "name": "infrastructure",
+      "color": "#f39c12"
+    },
+    {
+      "name": "health",
+      "color": "#27ae60"
+    },
+    {
+      "name": "external",
+      "color": "#9b59b6"
+    }
+  ]
+}
+
+
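Note on monitors.json: the export lists per-monitor `tags` and a separate top-level `tags` array, so the two can drift apart. Below is a minimal sketch of a consistency check for duplicate ids and for tags used without a top-level definition. The helper `check_monitors` is hypothetical, and whether Uptime Kuma itself requires every tag to be defined is an assumption not confirmed by the source.

```python
import json


def check_monitors(config: dict) -> list[str]:
    """Return human-readable problems found in a monitors export."""
    problems = []
    defined_tags = {tag["name"] for tag in config.get("tags", [])}
    seen_ids = set()
    for monitor in config.get("monitors", []):
        name = monitor.get("name", "<unnamed>")
        monitor_id = monitor.get("id")
        # Ids are expected to be unique across the export.
        if monitor_id in seen_ids:
            problems.append(f"{name}: duplicate id {monitor_id}")
        seen_ids.add(monitor_id)
        # Every tag used by a monitor should appear in the top-level list.
        for tag in monitor.get("tags", []):
            if tag not in defined_tags:
                problems.append(f"{name}: tag '{tag}' is not defined")
    return problems


if __name__ == "__main__":
    # Illustrative data only; real use would json.load() the exported file.
    sample = json.loads("""
    {
      "monitors": [
        {"id": 1, "name": "Example A", "tags": ["bot", "health"]},
        {"id": 1, "name": "Example B", "tags": ["bot"]}
      ],
      "tags": [{"name": "bot", "color": "#3498db"}]
    }
    """)
    for problem in check_monitors(sample):
        print(problem)
```

For the illustrative sample this reports the undefined `health` tag on the first monitor and the duplicate id on the second.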
--- a/infra/uptime-kuma/settings.json
+++ b/infra/uptime-kuma/settings.json
@@ -0,0 +1,24 @@
+{
+  "language": "ru",
+  "theme": "light",
+  "timezone": "Europe/Moscow",
+  "dateLocale": "ru",
+  "dateFormat": "YYYY-MM-DD HH:mm:ss",
+  "timeFormat": "24",
+  "weekStart": 1,
+  "searchEngineIndex": true,
+  "primaryBaseURL": "https://your-domain/status/",
+  "public": true,
+  "publicGroupList": true,
+  "showTags": true,
+  "showPoweredBy": false,
+  "keepDataPeriodDays": 365,
+  "retentionCheckInterval": 3600,
+  "maxmindLicenseKey": "",
+  "dnsCache": true,
+  "dnsCacheTtl": 300,
+  "trustProxy": true,
+  "disableAuth": false,
+  "defaultTimezone": "Europe/Moscow",
+  "defaultLanguage": "ru"
+}
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +0,0 @@
-psutil>=5.9.0
-asyncio
-aiohttp>=3.8.0
-python-dotenv>=1.0.0
-requests>=2.28.0
--- a/scripts/deploy-from-github.sh
+++ b/scripts/deploy-from-github.sh
@@ -0,0 +1,109 @@
+#!/bin/bash
+# Deployment script invoked from GitHub Actions
+# Runs on the server to update the project safely
+
+set -e
+
+PROJECT_DIR="/home/prod"
+BACKUP_DIR="/home/prod/backups"
+LOG_FILE="/home/prod/logs/deploy.log"
+
+# Create the directories if they do not exist
+mkdir -p "$BACKUP_DIR"
+mkdir -p "$(dirname "$LOG_FILE")"
+
+# Logging helper
+log() {
+    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
+}
+
+log "🚀 Starting deployment..."
+
+# Change to the project directory
+cd "$PROJECT_DIR" || exit 1
+
+# Remember the current commit for a possible rollback
+CURRENT_COMMIT=$(git rev-parse HEAD)
+log "Current commit: $CURRENT_COMMIT"
+
+# Back up the configuration before updating
+log "💾 Creating backup..."
+BACKUP_FILE="$BACKUP_DIR/backup-$(date +%Y%m%d-%H%M%S).tar.gz"
+tar -czf "$BACKUP_FILE" \
+    infra/prometheus/prometheus.yml \
+    infra/grafana/provisioning/ \
+    docker-compose.yml \
+    2>/dev/null || true
+log "Backup created: $BACKUP_FILE"
+
+# Update the code
+log "📥 Pulling latest changes..."
+git fetch origin main
+git reset --hard origin/main
+
+# Check whether anything changed
+NEW_COMMIT=$(git rev-parse HEAD)
+if [ "$CURRENT_COMMIT" = "$NEW_COMMIT" ]; then
+    log "ℹ️  No new changes to deploy"
+    exit 0
+fi
+
+log "✅ Code updated: $CURRENT_COMMIT → $NEW_COMMIT"
+
+# Validate the docker-compose syntax
+log "🔍 Validating docker-compose.yml..."
+if ! docker-compose config > /dev/null 2>&1; then
+    log "❌ docker-compose.yml validation failed!"
+    log "🔄 Rolling back..."
+    git reset --hard "$CURRENT_COMMIT"
+    exit 1
+fi
+
+# Restart the services
+log "🔄 Restarting services..."
+if command -v make &> /dev/null; then
+    make restart
+else
+    docker-compose down
+    docker-compose up -d --build
+fi
+
+# Wait for the services to start
+log "⏳ Waiting for services to start..."
+sleep 20
+
+# Health checks
+log "🏥 Running health checks..."
+
+HEALTH_CHECK_FAILED=0
+
+# Prometheus
+if curl -f http://localhost:9090/-/healthy > /dev/null 2>&1; then
+    log "✅ Prometheus is healthy"
+else
+    log "❌ Prometheus health check failed"
+    HEALTH_CHECK_FAILED=1
+fi
+
+# Grafana
+if curl -f http://localhost:3000/api/health > /dev/null 2>&1; then
+    log "✅ Grafana is healthy"
+else
+    log "❌ Grafana health check failed"
+    HEALTH_CHECK_FAILED=1
+fi
+
+# Roll back if a health check failed
+if [ $HEALTH_CHECK_FAILED -eq 1 ]; then
+    log "❌ Health checks failed! Rolling back..."
+    git reset --hard "$CURRENT_COMMIT"
+    make restart || docker-compose restart
+    log "🔄 Rollback completed"
+    exit 1
+fi
+
+log "✅ Deployment completed successfully!"
+log "📊 Container status:"
+docker-compose ps || docker ps --filter "name=bots_"
+
+exit 0
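Note on the health checks: the script waits a fixed `sleep 20` and then probes each service once, so a service that needs a little longer to start triggers a rollback. A polling loop is one possible alternative. The sketch below is an illustration under stated assumptions, not the repository's implementation: the names `http_ok` and `wait_until_healthy` and the default retry values are hypothetical.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP 4xx/5xx and malformed replies
        # all count as "not healthy yet".
        return False


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 10,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A call such as `wait_until_healthy(lambda: http_ok("http://localhost:9090/-/healthy"))` would poll the same Prometheus endpoint the script checks with `curl -f`. The `sleep` parameter exists only so the loop can be exercised without real waiting.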
--- a/scripts/generate_auth_passwords.sh
+++ b/scripts/generate_auth_passwords.sh
@@ -0,0 +1,31 @@
+#!/bin/bash
+
+# Script to generate HTTP Basic Auth passwords for monitoring services
+# Usage: ./generate_auth_passwords.sh [username]
+
+set -e
+
+# Default username if not provided
+USERNAME=${1:-"admin"}
+
+# Create passwords directory if it doesn't exist
+PASSWORDS_DIR="/etc/nginx/passwords"
+mkdir -p "$PASSWORDS_DIR"
+
+# Generate random password
+PASSWORD=$(openssl rand -base64 32 | tr -d "=+/" | cut -c1-25)
+
+# Create htpasswd file
+echo "Creating password file for user: $USERNAME"
+htpasswd -cb "$PASSWORDS_DIR/monitoring.htpasswd" "$USERNAME" "$PASSWORD"
+
+# Set proper permissions
+chown root:www-data "$PASSWORDS_DIR/monitoring.htpasswd"
+chmod 640 "$PASSWORDS_DIR/monitoring.htpasswd"
+
+echo "Password file created: $PASSWORDS_DIR/monitoring.htpasswd"
+echo "Username: $USERNAME"
+echo "Password: $PASSWORD"
+echo ""
+echo "Save this password securely!"
+echo "You can add more users with: htpasswd $PASSWORDS_DIR/monitoring.htpasswd <username>"
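Note on the password: the script derives it from `openssl rand -base64 32`, strips `=`, `+` and `/`, and keeps the first 25 characters, so the result is alphanumeric. The same idea can be sketched with Python's `secrets` module. The function name `generate_password` is hypothetical, and this draws characters directly from an alphanumeric alphabet rather than filtering base64 output.

```python
import secrets
import string

# Letters and digits only, matching what remains after the script's `tr -d`.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 25) -> str:
    """Return a random alphanumeric password from a cryptographically secure source."""
    if length < 1:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_password())
```

Drawing each character independently keeps the length exact, whereas filtering base64 output relies on enough characters surviving the filter.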
--- a/scripts/setup-ssl.sh
+++ b/scripts/setup-ssl.sh
@@ -0,0 +1,163 @@
+#!/bin/bash
+
+# SSL Setup Script for Let's Encrypt
+# This script sets up SSL certificates using Let's Encrypt
+
+set -e
+
+# Configuration
+DOMAIN="${DOMAIN:-localhost}"
+EMAIL="${EMAIL:-admin@${DOMAIN}}"
+NGINX_CONTAINER="bots_nginx"
+CERTBOT_IMAGE="certbot/certbot:latest"
+
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+NC='\033[0m' # No Color
+
+# Logging function
+log() {
+    echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')] $1${NC}"
+}
+
+warn() {
+    echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] WARNING: $1${NC}"
+}
+
+error() {
+    echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ERROR: $1${NC}"
+    exit 1
+}
+
+# Check if running as root
+if [[ $EUID -eq 0 ]]; then
+   error "This script should not be run as root for security reasons"
+fi
+
+# Check if domain is localhost
+if [[ "$DOMAIN" == "localhost" ]]; then
+    warn "Domain is set to localhost. Let's Encrypt certificates cannot be issued for localhost."
+    warn "Please set the DOMAIN environment variable to your actual domain name."
+    warn "Example: DOMAIN=example.com ./scripts/setup-ssl.sh"
+    exit 1
+fi
+
+# Check if Docker is running
+if ! docker info > /dev/null 2>&1; then
+    error "Docker is not running. Please start Docker and try again."
+fi
+
+# Check if nginx container is running
+if ! docker ps | grep -q "$NGINX_CONTAINER"; then
+    error "Nginx container ($NGINX_CONTAINER) is not running. Please start it first with 'docker-compose up -d nginx'"
+fi
+
+log "Setting up SSL certificates for domain: $DOMAIN"
+log "Email for Let's Encrypt: $EMAIL"
+
+# Create necessary directories
+log "Creating Let's Encrypt directories..."
+sudo mkdir -p /etc/letsencrypt/live
+sudo mkdir -p /etc/letsencrypt/archive
+sudo mkdir -p /etc/letsencrypt/renewal
+sudo chmod 755 /etc/letsencrypt
+
+# Stop nginx temporarily for certificate generation
+log "Stopping nginx container for certificate generation..."
+docker stop "$NGINX_CONTAINER" || true
+
+# Generate certificate using certbot
+log "Generating SSL certificate using Let's Encrypt..."
+docker run --rm \
+    -v /etc/letsencrypt:/etc/letsencrypt \
+    -v /var/lib/letsencrypt:/var/lib/letsencrypt \
+    -p 80:80 \
+    -p 443:443 \
+    "$CERTBOT_IMAGE" certonly \
+    --standalone \
+    --non-interactive \
+    --agree-tos \
+    --email "$EMAIL" \
+    --domains "$DOMAIN" \
+    --expand
+
+# Check if certificate was generated successfully
+if [[ ! -f "/etc/letsencrypt/live/$DOMAIN/fullchain.pem" ]]; then
+    error "Failed to generate SSL certificate for $DOMAIN"
+fi
+
+log "SSL certificate generated successfully!"
+
+# Set proper permissions
+log "Setting proper permissions for SSL certificates..."
+sudo chmod 755 /etc/letsencrypt/live
+sudo chmod 755 /etc/letsencrypt/archive
+sudo chmod 644 /etc/letsencrypt/live/"$DOMAIN"/*.pem
+sudo chmod 600 /etc/letsencrypt/live/"$DOMAIN"/privkey.pem
+
+# Update nginx configuration to use Let's Encrypt certificates
+log "Updating nginx configuration..."
+if [[ -f "infra/nginx/ssl/letsencrypt.conf" ]]; then
+    # Replace domain placeholder in letsencrypt.conf
+    sed "s/{{DOMAIN}}/$DOMAIN/g" infra/nginx/ssl/letsencrypt.conf > /tmp/letsencrypt.conf
+    sudo cp /tmp/letsencrypt.conf /etc/letsencrypt/live/"$DOMAIN"/letsencrypt.conf
+    rm /tmp/letsencrypt.conf
+fi
+
+# Start nginx container
+log "Starting nginx container..."
+docker start "$NGINX_CONTAINER"
+
+# Wait for nginx to start
+log "Waiting for nginx to start..."
+sleep 10
+
+# Test SSL certificate
+log "Testing SSL certificate..."
+if curl -s "https://$DOMAIN" > /dev/null; then
+    log "SSL certificate is working correctly!"
+else
+    warn "SSL certificate test failed. Please check nginx configuration."
+fi
+
+# Set up automatic renewal
+log "Setting up automatic certificate renewal..."
+cat > /tmp/ssl-renewal.sh << EOF
+#!/bin/bash
+# SSL Certificate Renewal Script
+
+set -e
+
+DOMAIN="$DOMAIN"
+NGINX_CONTAINER="$NGINX_CONTAINER"
+CERTBOT_IMAGE="$CERTBOT_IMAGE"
+
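+# Renew certificates (the certificate was issued with --standalone, so renewal may fail unless certbot can answer on port 80)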
+docker run --rm \\
+    -v /etc/letsencrypt:/etc/letsencrypt \\
+    -v /var/lib/letsencrypt:/var/lib/letsencrypt \\
+    "$CERTBOT_IMAGE" renew --quiet
+
+# Reload nginx
+docker exec "\$NGINX_CONTAINER" nginx -s reload
+
+echo "\$(date): SSL certificates renewed successfully" >> /var/log/ssl-renewal.log
+EOF
+
+sudo mv /tmp/ssl-renewal.sh /usr/local/bin/ssl-renewal.sh
+sudo chmod +x /usr/local/bin/ssl-renewal.sh
+
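+# Add cron job for automatic renewal (every Monday at 2 AM); drop any existing entry first so reruns do not duplicate it
+log "Adding cron job for automatic renewal..."
+(crontab -l 2>/dev/null | grep -vF "/usr/local/bin/ssl-renewal.sh" || true; echo "0 2 * * 1 /usr/local/bin/ssl-renewal.sh") | crontab -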
+
+log "SSL setup completed successfully!"
+log "Certificate location: /etc/letsencrypt/live/$DOMAIN/"
+log "Automatic renewal is configured to run every Monday at 2 AM"
+log "You can test the renewal manually with: sudo /usr/local/bin/ssl-renewal.sh"
+
+# Display certificate information
+log "Certificate information:"
+openssl x509 -in "/etc/letsencrypt/live/$DOMAIN/fullchain.pem" -text -noout | grep -E "(Subject:|Not Before|Not After|DNS:)"
--- a/tests/infra/conftest.py
+++ b/tests/infra/conftest.py
@@ -1,318 +0,0 @@
-#!/usr/bin/env python3
-"""
-Shared fixtures for infrastructure tests
-"""
-
-import pytest
-import asyncio
-import sys
-import os
-from unittest.mock import Mock, AsyncMock, patch
-from pathlib import Path
-
-# Add the monitoring modules directory to the import path
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../infra/monitoring'))
-
-# pytest-asyncio configuration
-pytest_plugins = ('pytest_asyncio',)
-
-
-@pytest.fixture(scope="session")
-def event_loop():
-    """Create an event loop for async tests"""
-    loop = asyncio.get_event_loop_policy().new_event_loop()
-    yield loop
-    loop.close()
-
-
-@pytest.fixture
-def mock_metrics_data():
-    """Create mock metrics data for tests"""
-    return {
-        'cpu_usage_percent': 25.5,
-        'ram_usage_percent': 60.2,
-        'disk_usage_percent': 45.8,
-        'load_average_1m': 1.2,
-        'load_average_5m': 1.1,
-        'load_average_15m': 1.0,
-        'swap_usage_percent': 10.5,
-        'disk_io_percent': 15.3,
-        'system_uptime_seconds': 86400.0,
-        'monitor_uptime_seconds': 3600.0
-    }
-
-
-@pytest.fixture
-def mock_system_info():
-    """Create mock system information for tests"""
-    return {
-        'cpu_percent': 25.5,
-        'load_avg_1m': 1.2,
-        'load_avg_5m': 1.1,
-        'load_avg_15m': 1.0,
-        'cpu_count': 8,
-        'io_wait_percent': 2.5,
-        'ram_used': 8.0,
-        'ram_total': 16.0,
-        'ram_percent': 50.0,
-        'swap_used': 1.0,
-        'swap_total': 2.0,
-        'swap_percent': 50.0,
-        'disk_used': 100.0,
-        'disk_total': 500.0,
-        'disk_percent': 20.0,
-        'disk_free': 400.0,
-        'disk_read_speed': '1.0 MB/s',
-        'disk_write_speed': '512.0 KB/s',
-        'disk_io_percent': 15,
-        'system_uptime': '1д 0ч 0м',
-        'monitor_uptime': '1ч 0м',
-        'server_hostname': 'test-host',
-        'current_time': '2025-01-01 12:00:00'
-    }
-
-
-@pytest.fixture
-def mock_psutil():
-    """Create a mock for psutil"""
-    mock_psutil = Mock()
-    
-    # Mock CPU
-    mock_psutil.cpu_percent.return_value = 25.5
-    mock_psutil.getloadavg.return_value = (1.2, 1.1, 1.0)
-    mock_psutil.cpu_count.return_value = 8
-    
-    # Mock memory
-    mock_memory = Mock()
-    mock_memory.used = 8 * (1024**3)  # 8 GB
-    mock_memory.total = 16 * (1024**3)  # 16 GB
-    mock_psutil.virtual_memory.return_value = mock_memory
-    
-    mock_swap = Mock()
-    mock_swap.used = 1 * (1024**3)  # 1 GB
-    mock_swap.total = 2 * (1024**3)  # 2 GB
-    mock_swap.percent = 50.0
-    mock_psutil.swap_memory.return_value = mock_swap
-    
-    # Mock disk
-    mock_disk = Mock()
-    mock_disk.used = 100 * (1024**3)  # 100 GB
-    mock_disk.total = 500 * (1024**3)  # 500 GB
-    mock_disk.free = 400 * (1024**3)  # 400 GB
-    mock_psutil.disk_usage.return_value = mock_disk
-    
-    # Mock disk I/O
-    mock_disk_io = Mock()
-    mock_disk_io.read_count = 1000
-    mock_disk_io.write_count = 500
-    mock_disk_io.read_bytes = 1024 * (1024**2)  # 1 GB
-    mock_disk_io.write_bytes = 512 * (1024**2)  # 512 MB
-    mock_psutil.disk_io_counters.return_value = mock_disk_io
-    
-    # Mock boot time
-    import time
-    mock_psutil.boot_time.return_value = time.time() - 86400  # 1 day ago
-    
-    return mock_psutil
-
-
-@pytest.fixture
-def mock_platform():
-    """Create a mock for platform"""
-    mock_platform = Mock()
-    mock_platform.system.return_value = 'Linux'
-    return mock_platform
-
-
-@pytest.fixture
-def mock_subprocess():
-    """Create a mock for subprocess"""
-    mock_subprocess = Mock()
-    
-    # Mock a successful diskutil result
-    mock_result = Mock()
-    mock_result.returncode = 0
-    mock_result.stdout = """
-    Container Total Space: 500.0 GB
-    Container Free Space: 400.0 GB
-    """
-    mock_subprocess.run.return_value = mock_result
-    
-    return mock_subprocess
-
-
-@pytest.fixture
-def mock_os():
-    """Create a mock for os"""
-    mock_os = Mock()
-    mock_os.getenv.side_effect = lambda key, default=None: {
-        'THRESHOLD': '80.0',
-        'RECOVERY_THRESHOLD': '75.0'
-    }.get(key, default)
-    
-    # Mock uname
-    mock_uname = Mock()
-    mock_uname.nodename = "test-host"
-    mock_os.uname.return_value = mock_uname
-    
-    return mock_os
-
-
-@pytest.fixture
-def prometheus_config_sample():
-    """Create a sample Prometheus configuration for tests"""
-    return {
-        'global': {
-            'scrape_interval': '15s',
-            'evaluation_interval': '15s'
-        },
-        'rule_files': [
-            '# - "first_rules.yml"',
-            '# - "second_rules.yml"'
-        ],
-        'scrape_configs': [
-            {
-                'job_name': 'prometheus',
-                'static_configs': [
-                    {
-                        'targets': ['localhost:9090']
-                    }
-                ]
-            },
-            {
-                'job_name': 'infrastructure',
-                'static_configs': [
-                    {
-                        'targets': ['host.docker.internal:9091']
-                    }
-                ],
-                'metrics_path': '/metrics',
-                'scrape_interval': '30s',
-                'scrape_timeout': '10s',
-                'honor_labels': True
-            },
-            {
-                'job_name': 'telegram-helper-bot',
-                'static_configs': [
-                    {
-                        'targets': ['bots_telegram_bot:8080'],
-                        'labels': {
-                            'bot_name': 'telegram-helper-bot',
-                            'environment': 'production',
-                            'service': 'telegram-bot'
-                        }
-                    }
-                ],
-                'metrics_path': '/metrics',
-                'scrape_interval': '15s',
-                'scrape_timeout': '10s',
-                'honor_labels': True
-            }
-        ],
-        'alerting': {
-            'alertmanagers': [
-                {
-                    'static_configs': [
-                        {
-                            'targets': [
-                                '# - alertmanager:9093'
-                            ]
-                        }
-                    ]
-                }
-            ]
-        }
-    }
-
-
-@pytest.fixture
-def mock_aiohttp():
-    """Create a mock for aiohttp"""
-    mock_aiohttp = Mock()
-    
-    # Mock web.Application
-    mock_app = Mock()
-    mock_aiohttp.web.Application.return_value = mock_app
-    
-    # Mock web.Response
-    mock_response = Mock()
-    mock_response.status = 200
-    mock_response.content_type = 'text/plain'
-    mock_response.text = 'Test response'
-    mock_aiohttp.web.Response.return_value = mock_response
-    
-    return mock_aiohttp
-
-
-@pytest.fixture
-def mock_request():
-    """Create a mock HTTP request"""
-    request = Mock()
-    request.method = 'GET'
-    request.path = '/metrics'
-    request.headers = {}
-    return request
-
-
-@pytest.fixture
-def test_environment():
-    """Create the test environment"""
-    return {
-        'os_type': 'ubuntu',
-        'threshold': 80.0,
-        'recovery_threshold': 75.0,
-        'host': '127.0.0.1',
-        'port': 9091
-    }
-
-
-# Markers for categorizing tests
-def pytest_configure(config):
-    """Configure pytest markers"""
-    config.addinivalue_line(
-        "markers", "asyncio: mark test as async"
-    )
-    config.addinivalue_line(
-        "markers", "slow: mark test as slow"
-    )
-    config.addinivalue_line(
-        "markers", "integration: mark test as integration test"
-    )
-    config.addinivalue_line(
-        "markers", "unit: mark test as unit test"
-    )
-    config.addinivalue_line(
-        "markers", "prometheus: mark test as prometheus related"
-    )
-    config.addinivalue_line(
-        "markers", "metrics: mark test as metrics related"
-    )
-
-
-# Automatic test marking
-def pytest_collection_modifyitems(config, items):
-    """Automatically mark tests based on their names and class names"""
-    for item in items:
-        # Mark async tests
-        if "async" in item.name or "Async" in item.name:
-            item.add_marker(pytest.mark.asyncio)
-        
-        # Mark integration tests
-        if "integration" in item.name.lower() or "Integration" in str(item.cls):
-            item.add_marker(pytest.mark.integration)
-        
-        # Mark unit tests
-        if "unit" in item.name.lower() or "Unit" in str(item.cls):
-            item.add_marker(pytest.mark.unit)
-        
-        # Mark slow tests
-        if "slow" in item.name.lower() or "Slow" in str(item.cls):
-            item.add_marker(pytest.mark.slow)
-        
-        # Mark Prometheus tests
-        if "prometheus" in item.name.lower() or "Prometheus" in str(item.cls):
-            item.add_marker(pytest.mark.prometheus)
-        
-        # Mark metrics tests
-        if "metrics" in item.name.lower() or "Metrics" in str(item.cls):
-            item.add_marker(pytest.mark.metrics)
--- a/tests/infra/test_alert_delays.py
+++ b/tests/infra/test_alert_delays.py
@@ -1,230 +0,0 @@
-import pytest
-import time
-from unittest.mock import Mock, patch
-import sys
-import os
-
-# Add the module path so it can be imported
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', '..', 'infra', 'monitoring'))
-
-from metrics_collector import MetricsCollector
-
-
-class TestAlertDelays:
-    """Tests for the alert delay mechanism"""
-    
-    def setup_method(self):
-        """Set up before each test"""
-        # Mock environment variables
-        with patch.dict(os.environ, {
-            'CPU_ALERT_DELAY': '5',    # 5 seconds for fast testing
-            'RAM_ALERT_DELAY': '7',    # 7 seconds for fast testing
-            'DISK_ALERT_DELAY': '10'   # 10 seconds for fast testing
-        }):
-            self.collector = MetricsCollector()
-    
-    def test_alert_delays_initialization(self):
-        """Test alert delay initialization"""
-        assert self.collector.alert_delays['cpu'] == 5
-        assert self.collector.alert_delays['ram'] == 7
-        assert self.collector.alert_delays['disk'] == 10
-        
-        # Check that the threshold-exceeded start times are initialized to None
-        assert self.collector.alert_start_times['cpu'] is None
-        assert self.collector.alert_start_times['ram'] is None
-        assert self.collector.alert_start_times['disk'] is None
-    
-    def test_cpu_alert_delay_logic(self):
-        """Test the CPU alert delay logic"""
-        # Simulate CPU usage exceeding the threshold
-        system_info = {
-            'cpu_percent': 85.0,  # Above the 80% threshold
-            'ram_percent': 70.0,  # Normal
-            'disk_percent': 75.0, # Normal
-            'load_avg_1m': 2.5,
-            'ram_used': 8.0,
-            'ram_total': 16.0,
-            'disk_free': 25.0
-        }
-        
-        # First check: the delay countdown should start
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 0  # Alert not sent yet
-        assert self.collector.alert_start_times['cpu'] is not None  # Start time is set
-        
-        # Check that the alert state has not changed
-        assert not self.collector.alert_states['cpu']
-        
-        # Simulate time elapsed since the threshold was first exceeded
-        # Move the start time into the past (by more than the delay)
-        self.collector.alert_start_times['cpu'] = time.time() - 6  # 6 seconds ago
-        
-        # Now the alert should fire
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 1  # Alert sent
-        assert alerts[0][0] == 'cpu'  # Alert type
-        assert alerts[0][1] == 85.0   # CPU value
-        assert self.collector.alert_states['cpu']  # Alert state is set
-    
-    def test_alert_reset_on_recovery(self):
-        """Test that the alert is reset on recovery"""
-        # First exceed the threshold and wait out the delay
-        system_info_high = {
-            'cpu_percent': 85.0,
-            'ram_percent': 70.0,
-            'disk_percent': 75.0,
-            'load_avg_1m': 2.5,
-            'ram_used': 8.0,
-            'ram_total': 16.0,
-            'disk_free': 25.0
-        }
-        
-        # Move the threshold-exceeded start time into the past
-        self.collector.alert_start_times['cpu'] = time.time() - 6
-        
-        # Check: the alert should fire
-        alerts, recoveries = self.collector.check_alerts(system_info_high)
-        assert len(alerts) == 1  # Alert sent
-        assert self.collector.alert_states['cpu']  # State is set
-        
-        # Now simulate recovery
-        system_info_low = {
-            'cpu_percent': 70.0,  # Below the 75% recovery threshold
-            'ram_percent': 70.0,
-            'disk_percent': 75.0,
-            'load_avg_1m': 1.2,
-            'ram_used': 8.0,
-            'ram_total': 16.0,
-            'disk_free': 25.0
-        }
-        
-        alerts, recoveries = self.collector.check_alerts(system_info_low)
-        assert len(recoveries) == 1  # Recovery message
-        assert recoveries[0][0] == 'cpu'  # Recovery type
-        assert not self.collector.alert_states['cpu']  # State is reset
-        assert self.collector.alert_start_times['cpu'] is None  # Start time is reset
-    
-    def test_multiple_metrics_alert(self):
-        """Test alerts on several metrics at once"""
-        system_info = {
-            'cpu_percent': 85.0,  # Above the threshold
-            'ram_percent': 85.0,  # Above the threshold
-            'disk_percent': 75.0, # Normal
-            'load_avg_1m': 2.5,
-            'ram_used': 13.0,
-            'ram_total': 16.0,
-            'disk_free': 25.0
-        }
-        
-        # Move the threshold-exceeded start times for CPU and RAM into the past
-        self.collector.alert_start_times['cpu'] = time.time() - 6  # More than CPU_ALERT_DELAY (5 s)
-        self.collector.alert_start_times['ram'] = time.time() - 8  # More than RAM_ALERT_DELAY (7 s)
-        
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 2  # Two alerts: CPU and RAM
-        
-        # Check the alert types
-        alert_types = [alert[0] for alert in alerts]
-        assert 'cpu' in alert_types
-        assert 'ram' in alert_types
-        
-        # Check the states
-        assert self.collector.alert_states['cpu']
-        assert self.collector.alert_states['ram']
-        assert not self.collector.alert_states['disk']
-    
-    def test_alert_delay_customization(self):
-        """Test configuring custom delays"""
-        # Test with different delay values
-        with patch.dict(os.environ, {
-            'CPU_ALERT_DELAY': '2',
-            'RAM_ALERT_DELAY': '3',
-            'DISK_ALERT_DELAY': '4'
-        }):
-            collector = MetricsCollector()
-            
-            assert collector.alert_delays['cpu'] == 2
-            assert collector.alert_delays['ram'] == 3
-            assert collector.alert_delays['disk'] == 4
-    
-    def test_no_false_alerts(self):
-        """Test that short-lived spikes do not trigger false alerts"""
-        system_info = {
-            'cpu_percent': 85.0,
-            'ram_percent': 70.0,
-            'disk_percent': 75.0,
-            'load_avg_1m': 2.5,
-            'ram_used': 8.0,
-            'ram_total': 16.0,
-            'disk_free': 25.0
-        }
-        
-        # Check immediately after the threshold is exceeded
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 0  # Alert must not fire immediately
-        
-        # Check that the start time is set
-        assert self.collector.alert_start_times['cpu'] is not None
-        
-        # Check again after a short time (before the delay expires)
-        # Move the start time into the past, but by less than the delay
-        self.collector.alert_start_times['cpu'] = time.time() - 2  # 2 seconds ago
-        
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 0  # Alert still must not fire
-    
-    def test_alert_state_persistence(self):
-        """Test that the alert state persists between checks"""
-        system_info = {
-            'cpu_percent': 85.0,
-            'ram_percent': 70.0,
-            'disk_percent': 75.0,
-            'load_avg_1m': 2.5,
-            'ram_used': 8.0,
-            'ram_total': 16.0,
-            'disk_free': 25.0
-        }
-        
-        # First check: start the countdown
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 0
-        initial_time = self.collector.alert_start_times['cpu']
-        assert initial_time is not None
-        
-        # Check again: the start time should be preserved
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 0
-        assert self.collector.alert_start_times['cpu'] == initial_time  # Start time unchanged
-    
-    def test_disk_alert_delay(self):
-        """Test the disk alert delay"""
-        system_info = {
-            'cpu_percent': 70.0,
-            'ram_percent': 70.0,
-            'disk_percent': 85.0,  # Above the threshold
-            'load_avg_1m': 1.2,
-            'ram_used': 8.0,
-            'ram_total': 16.0,
-            'disk_free': 15.0
-        }
-        
-        # First check
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 0
-        assert self.collector.alert_start_times['disk'] is not None
-        
-        # Move the threshold-exceeded start time into the past, but by less than the delay
-        self.collector.alert_start_times['disk'] = time.time() - 5  # 5 seconds ago (less than DISK_ALERT_DELAY)
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 0  # Alert must not fire
-        
-        # Move the threshold-exceeded start time into the past, by more than the delay
-        self.collector.alert_start_times['disk'] = time.time() - 11  # 11 seconds ago (more than DISK_ALERT_DELAY)
-        alerts, recoveries = self.collector.check_alerts(system_info)
-        assert len(alerts) == 1  # Alert should fire
-        assert alerts[0][0] == 'disk'
-
-
-if __name__ == '__main__':
-    pytest.main([__file__])
-
--- a/tests/infra/test_infra.py
+++ b/tests/infra/test_infra.py
@@ -1,102 +0,0 @@
-#!/usr/bin/env python3
-"""
-Tests for the monitoring infrastructure
-"""
-
-import pytest
-import sys
-import os
-
-# Add the monitoring modules directory to the import path
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../infra/monitoring'))
-
-def test_imports():
-    """Test importing the main modules"""
-    try:
-        from metrics_collector import MetricsCollector
-        from message_sender import MessageSender
-        from prometheus_server import PrometheusServer
-        from server_monitor import ServerMonitor
-        assert True
-    except ImportError as e:
-        pytest.fail(f"Failed to import modules: {e}")
-
-def test_metrics_collector_creation():
-    """Test creating a MetricsCollector"""
-    try:
-        from metrics_collector import MetricsCollector
-        collector = MetricsCollector()
-        assert collector is not None
-        assert hasattr(collector, 'get_system_info')
-        assert hasattr(collector, 'get_metrics_data')
-    except Exception as e:
-        pytest.fail(f"Failed to create MetricsCollector: {e}")
-
-def test_message_sender_creation():
-    """Test creating a MessageSender"""
-    try:
-        from message_sender import MessageSender
-        sender = MessageSender()
-        assert sender is not None
-    except Exception as e:
-        pytest.fail(f"Failed to create MessageSender: {e}")
-
-def test_prometheus_server_creation():
-    """Test creating a PrometheusServer"""
-    try:
-        from prometheus_server import PrometheusServer
-        server = PrometheusServer()
-        assert server is not None
-        assert hasattr(server, 'host')
-        assert hasattr(server, 'port')
-    except Exception as e:
-        pytest.fail(f"Failed to create PrometheusServer: {e}")
-
-def test_server_monitor_creation():
-    """Test creating a ServerMonitor"""
-    try:
-        from server_monitor import ServerMonitor
-        monitor = ServerMonitor()
-        assert monitor is not None
-        assert hasattr(monitor, 'metrics_collector')
-        assert hasattr(monitor, 'message_sender')
-        assert hasattr(monitor, 'prometheus_server')
-    except Exception as e:
-        pytest.fail(f"Failed to create ServerMonitor: {e}")
-
-def test_system_info_structure():
-    """Test the structure of the system information"""
-    try:
-        from metrics_collector import MetricsCollector
-        collector = MetricsCollector()
-        system_info = collector.get_system_info()
-        
-        # Check that system_info is a dict
-        assert isinstance(system_info, dict)
-        
-        # Check that the main keys are present
-        expected_keys = ['cpu_percent', 'ram_percent', 'disk_percent', 'server_hostname']
-        for key in expected_keys:
-            assert key in system_info, f"Missing key: {key}"
-            
-    except Exception as e:
-        pytest.fail(f"Failed to get system info: {e}")
-
-def test_metrics_data_structure():
-    """Test the structure of the metrics"""
-    try:
-        from metrics_collector import MetricsCollector
-        collector = MetricsCollector()
-        metrics = collector.get_metrics_data()
-        
-        # Check that metrics is a dict
-        assert isinstance(metrics, dict)
-        
-        # Check that there is at least one metric
-        assert len(metrics) > 0, "Metrics should not be empty"
-        
-    except Exception as e:
-        pytest.fail(f"Failed to get metrics data: {e}")
-
-if __name__ == "__main__":
-    pytest.main([__file__, "-v"])
--- a/tests/infra/test_message_sender.py
+++ b/tests/infra/test_message_sender.py
@@ -1,92 +0,0 @@
-#!/usr/bin/env python3
-"""
-Tests for MessageSender
-"""
-
-import pytest
-import sys
-import os
-
-# Add the monitoring modules directory to the import path
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../infra/monitoring'))
-
-from infra.monitoring.message_sender import MessageSender
-
-
-class TestMessageSender:
-    """Tests for the MessageSender class"""
-    
-    @pytest.fixture
-    def message_sender(self):
-        """Create a MessageSender instance for tests"""
-        return MessageSender()
-    
-    def test_get_cpu_emoji(self, message_sender):
-        """Test getting the emoji for CPU"""
-        # Test the green level (normal load)
-        assert message_sender._get_cpu_emoji(25.0) == "🟢"
-        assert message_sender._get_cpu_emoji(49.9) == "🟢"
-        
-        # Test the yellow level (moderate load)
-        assert message_sender._get_cpu_emoji(50.0) == "⚠️"
-        assert message_sender._get_cpu_emoji(79.9) == "⚠️"
-        
-        # Test the red level (high load)
-        assert message_sender._get_cpu_emoji(80.0) == "🚨"
-        assert message_sender._get_cpu_emoji(95.0) == "🚨"
-    
-    def test_get_memory_emoji(self, message_sender):
-        """Test getting the emoji for memory"""
-        # Test the green level (normal usage)
-        assert message_sender._get_memory_emoji(30.0) == "🟢"
-        assert message_sender._get_memory_emoji(59.9) == "🟢"
-        
-        # Test the yellow level (moderate usage)
-        assert message_sender._get_memory_emoji(60.0) == "⚠️"
-        assert message_sender._get_memory_emoji(84.9) == "⚠️"
-        
-        # Test the red level (high usage)
-        assert message_sender._get_memory_emoji(85.0) == "🚨"
-        assert message_sender._get_memory_emoji(95.0) == "🚨"
-    
-    def test_get_load_average_emoji(self, message_sender):
-        """Test getting the emoji for Load Average"""
-        # Test the green level (normal load)
-        assert message_sender._get_load_average_emoji(4.0, 8) == "🟢"  # 0.5 per core
-        assert message_sender._get_load_average_emoji(7.9, 8) == "🟢"  # 0.9875 per core
-        
-        # Test the yellow level (moderate load)
-        assert message_sender._get_load_average_emoji(8.0, 8) == "⚠️"  # 1.0 per core
-        assert message_sender._get_load_average_emoji(15.9, 8) == "⚠️"  # 1.9875 per core
-        
-        # Test the red level (high load)
-        assert message_sender._get_load_average_emoji(16.0, 8) == "🚨"  # 2.0 per core
-        assert message_sender._get_load_average_emoji(24.0, 8) == "🚨"  # 3.0 per core
-    
-    def test_get_io_wait_emoji(self, message_sender):
-        """Test getting the emoji for IO Wait"""
-        # Test the green level (normal IO Wait)
-        assert message_sender._get_io_wait_emoji(2.0) == "🟢"
-        assert message_sender._get_io_wait_emoji(4.9) == "🟢"
-        
-        # Test the yellow level (moderate IO Wait)
-        assert message_sender._get_io_wait_emoji(5.0) == "⚠️"
-        assert message_sender._get_io_wait_emoji(19.9) == "⚠️"
-        
-        # Test the red level (high IO Wait)
-        assert message_sender._get_io_wait_emoji(20.0) == "🚨"
-        assert message_sender._get_io_wait_emoji(35.0) == "🚨"
-    
-    def test_get_disk_space_emoji(self, message_sender):
-        """Test getting the emoji for disk space"""
-        # Test the green level (normal usage)
-        assert message_sender._get_disk_space_emoji(30.0) == "🟢"
-        assert message_sender._get_disk_space_emoji(59.9) == "🟢"
-        
-        # Test the yellow level (moderate usage)
-        assert message_sender._get_disk_space_emoji(60.0) == "⚠️"
-        assert message_sender._get_disk_space_emoji(89.9) == "⚠️"
-        
-        # Test the red level (high usage)
-        assert message_sender._get_disk_space_emoji(90.0) == "🚨"
-        assert message_sender._get_disk_space_emoji(95.0) == "🚨"
--- a/tests/infra/test_metrics_collector.py
+++ b/tests/infra/test_metrics_collector.py
@@ -1,464 +0,0 @@
-#!/usr/bin/env python3
-"""
-Tests for MetricsCollector
-"""
-
-import pytest
-import sys
-import os
-import time
-import platform
-from unittest.mock import Mock, patch, MagicMock
-from datetime import datetime
-
-# Add the monitoring modules directory to the import path
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../infra/monitoring'))
-
-from infra.monitoring.metrics_collector import MetricsCollector
-
-
-class TestMetricsCollector:
-    """Tests for the MetricsCollector class"""
-    
-    @pytest.fixture
-    def metrics_collector(self):
-        """Create a MetricsCollector instance for tests"""
-        return MetricsCollector()
-    
-    @pytest.fixture
-    def mock_psutil(self):
-        """Mock for psutil"""
-        mock_psutil = Mock()
-        
-        # Mock CPU
-        mock_psutil.cpu_percent.return_value = 25.5
-        mock_psutil.getloadavg.return_value = (1.2, 1.1, 1.0)
-        mock_psutil.cpu_count.return_value = 8
-        
-        # Mock memory
-        mock_memory = Mock()
-        mock_memory.used = 8 * (1024**3)  # 8 GB
-        mock_memory.total = 16 * (1024**3)  # 16 GB
-        mock_psutil.virtual_memory.return_value = mock_memory
-        
-        mock_swap = Mock()
-        mock_swap.used = 1 * (1024**3)  # 1 GB
-        mock_swap.total = 2 * (1024**3)  # 2 GB
-        mock_swap.percent = 50.0
-        mock_psutil.swap_memory.return_value = mock_swap
-        
-        # Mock disk
-        mock_disk = Mock()
-        mock_disk.used = 100 * (1024**3)  # 100 GB
-        mock_disk.total = 500 * (1024**3)  # 500 GB
-        mock_disk.free = 400 * (1024**3)  # 400 GB
-        mock_psutil.disk_usage.return_value = mock_disk
-        
-        # Mock disk I/O
-        mock_disk_io = Mock()
-        mock_disk_io.read_count = 1000
-        mock_disk_io.write_count = 500
-        mock_disk_io.read_bytes = 1024 * (1024**2)  # 1 GB
-        mock_disk_io.write_bytes = 512 * (1024**2)  # 512 MB
-        mock_psutil.disk_io_counters.return_value = mock_disk_io
-        
-        # Mock boot time
-        mock_psutil.boot_time.return_value = time.time() - 86400  # 1 day ago
-        
-        return mock_psutil
-    
-    def test_init(self, metrics_collector):
-        """Test MetricsCollector initialization"""
-        assert metrics_collector.threshold == 80.0
-        assert metrics_collector.recovery_threshold == 75.0
-        assert isinstance(metrics_collector.alert_states, dict)
-        assert 'cpu' in metrics_collector.alert_states
-        assert 'ram' in metrics_collector.alert_states
-        assert 'disk' in metrics_collector.alert_states
-        assert metrics_collector.monitor_start_time > 0
-    
-    def test_detect_os_macos(self):
-        """Test detecting macOS"""
-        with patch('platform.system', return_value='Darwin'):
-            collector = MetricsCollector()
-            assert collector.os_type == "macos"
-    
-    def test_detect_os_linux(self):
-        """Test detecting Linux"""
-        with patch('platform.system', return_value='Linux'):
-            collector = MetricsCollector()
-            assert collector.os_type == "ubuntu"
-    
-    def test_detect_os_unknown(self):
-        """Test detecting an unknown OS"""
-        with patch('platform.system', return_value='Windows'):
-            collector = MetricsCollector()
-            assert collector.os_type == "unknown"
-    
-    def test_get_disk_path(self, metrics_collector):
-        """Test getting the disk path"""
-        # "/" should be returned for every OS
-        assert metrics_collector._get_disk_path() == "/"
-    
-    @patch('subprocess.run')
-    def test_get_macos_disk_usage_success(self, mock_subprocess, metrics_collector):
-        """Test getting macOS disk information via diskutil"""
-        # Set up the mock for macOS
-        metrics_collector.os_type = "macos"
-        
-        # Mock successful diskutil output
-        mock_result = Mock()
-        mock_result.returncode = 0
-        mock_result.stdout = """
-        Container Total Space: 500.0 GB
-        Container Free Space: 400.0 GB
-        """
-        mock_subprocess.return_value = mock_result
-        
-        disk_info = metrics_collector._get_macos_disk_usage()
-        
-        assert disk_info is not None
-        assert disk_info.total == 500.0 * (1024**3)  # In bytes
-        assert disk_info.free == 400.0 * (1024**3)
-        assert disk_info.used == 100.0 * (1024**3)
-    
-    @patch('subprocess.run')
-    def test_get_macos_disk_usage_fallback(self, mock_subprocess, metrics_collector):
-        """Test fallback to psutil when diskutil fails"""
-        metrics_collector.os_type = "macos"
-        
-        # Mock failed diskutil output
-        mock_result = Mock()
-        mock_result.returncode = 1
-        mock_subprocess.return_value = mock_result
-        
-        with patch('metrics_collector.psutil.disk_usage') as mock_psutil_disk:
-            mock_disk = Mock()
-            mock_disk.used = 100 * (1024**3)
-            mock_disk.total = 500 * (1024**3)
-            mock_disk.free = 400 * (1024**3)
-            mock_psutil_disk.return_value = mock_disk
-            
-            disk_info = metrics_collector._get_macos_disk_usage()
-            assert disk_info == mock_disk
-    
-    def test_get_system_uptime(self, metrics_collector):
-        """Test getting system uptime"""
-        with patch('metrics_collector.psutil.boot_time') as mock_boot_time:
-            mock_boot_time.return_value = time.time() - 3600  # 1 hour ago
-            
-            uptime = metrics_collector._get_system_uptime()
-            assert uptime > 0
-            assert uptime <= 3600.1  # No more than an hour (with a small tolerance)
-    
-    def test_get_monitor_uptime(self, metrics_collector):
-        """Test getting monitor uptime"""
-        # Wait briefly so the uptime changes
-        time.sleep(0.1)
-        
-        uptime = metrics_collector.get_monitor_uptime()
-        assert isinstance(uptime, str)
-        assert 'м' in uptime or 'ч' in uptime or 'д' in uptime
-    
-    def test_get_system_info_success(self, metrics_collector):
-        """Test getting system information"""
-        # Mock all required psutil functions
-        with patch('metrics_collector.psutil.cpu_percent', return_value=25.5) as mock_cpu, \
-             patch('metrics_collector.psutil.getloadavg', return_value=(1.2, 1.1, 1.0)) as mock_load, \
-             patch('metrics_collector.psutil.cpu_count', return_value=8) as mock_cpu_count, \
-             patch('metrics_collector.psutil.cpu_times_percent') as mock_cpu_times, \
-             patch('metrics_collector.psutil.virtual_memory') as mock_virtual_memory, \
-             patch('metrics_collector.psutil.swap_memory') as mock_swap_memory, \
-             patch('metrics_collector.psutil.disk_usage') as mock_disk_usage, \
-             patch('metrics_collector.psutil.disk_io_counters') as mock_disk_io, \
-             patch('metrics_collector.psutil.boot_time', return_value=time.time() - 86400) as mock_boot_time, \
-             patch('os.uname') as mock_uname:
-            
-            # Configure CPU mocks
-            mock_cpu_times_obj = Mock()
-            mock_cpu_times_obj.iowait = 2.5
-            mock_cpu_times.return_value = mock_cpu_times_obj
-            
-            # Configure memory mocks
-            mock_memory = Mock()
-            mock_memory.used = 8 * (1024**3)
-            mock_memory.total = 16 * (1024**3)
-            mock_virtual_memory.return_value = mock_memory
-            
-            # Configure swap mocks
-            mock_swap = Mock()
-            mock_swap.used = 1 * (1024**3)
-            mock_swap.total = 2 * (1024**3)
-            mock_swap.percent = 50.0
-            mock_swap_memory.return_value = mock_swap
-            
-            # Configure disk mocks
-            mock_disk = Mock()
-            mock_disk.used = 100 * (1024**3)
-            mock_disk.total = 500 * (1024**3)
-            mock_disk.free = 400 * (1024**3)
-            mock_disk_usage.return_value = mock_disk
-            
-            # Configure disk I/O mocks
-            mock_disk_io_obj = Mock()
-            mock_disk_io_obj.read_count = 1000
-            mock_disk_io_obj.write_count = 500
-            mock_disk_io_obj.read_bytes = 1024 * (1024**2)
-            mock_disk_io_obj.write_bytes = 512 * (1024**2)
-            mock_disk_io.return_value = mock_disk_io_obj
-            
-            # Configure the hostname mock
-            mock_uname.return_value.nodename = "test-host"
-            
-            # Mock _get_disk_usage so that it returns our mock
-            with patch.object(metrics_collector, '_get_disk_usage', return_value=mock_disk):
-                system_info = metrics_collector.get_system_info()
-                
-                assert isinstance(system_info, dict)
-                assert 'cpu_percent' in system_info
-                assert 'ram_percent' in system_info
-                assert 'disk_percent' in system_info
-                assert 'io_wait_percent' in system_info
-                assert 'server_hostname' in system_info
-                
-                # Check the calculations
-                assert system_info['cpu_percent'] == 25.5
-                assert system_info['ram_percent'] == 50.0  # 8/16 * 100
-                assert system_info['disk_percent'] == 20.0  # 100/500 * 100
-                assert system_info['io_wait_percent'] == 2.5
-                assert system_info['server_hostname'] == "test-host"
-    
-    def test_get_system_info_error(self, metrics_collector):
-        """Test getting system information when an error occurs"""
-        with patch('metrics_collector.psutil.cpu_percent', side_effect=Exception("Test error")):
-            system_info = metrics_collector.get_system_info()
-            assert system_info == {}
-    
-    def test_format_bytes(self, metrics_collector):
-        """Test byte formatting"""
-        assert metrics_collector._format_bytes(0) == "0 B"
-        assert metrics_collector._format_bytes(1024) == "1.0 KB"
-        assert metrics_collector._format_bytes(1024**2) == "1.0 MB"
-        assert metrics_collector._format_bytes(1024**3) == "1.0 GB"
-        assert metrics_collector._format_bytes(1024**4) == "1.0 TB"
-    
-    def test_format_uptime(self, metrics_collector):
-        """Test uptime formatting"""
-        assert metrics_collector._format_uptime(60) == "1м"
-        assert metrics_collector._format_uptime(3600) == "1ч 0м"
-        assert metrics_collector._format_uptime(86400) == "1д 0ч 0м"
-        assert metrics_collector._format_uptime(90000) == "1д 1ч 0м"
-    
-    def test_check_process_status_pid_file(self, metrics_collector, tmp_path):
-        """Test checking process status via a PID file"""
-        # Create a temporary PID file
-        pid_file = tmp_path / "helper_bot.pid"
-        pid_file.write_text("12345")
-        
-        # Temporarily replace the PID file path
-        original_pid_files = metrics_collector.pid_files.copy()
-        metrics_collector.pid_files['helper_bot'] = str(pid_file)
-        
-        with patch('metrics_collector.psutil.pid_exists', return_value=True), \
-             patch('metrics_collector.psutil.Process') as mock_process:
-            
-            mock_proc = Mock()
-            mock_proc.create_time.return_value = time.time() - 3600
-            mock_process.return_value = mock_proc
-            
-            status, uptime = metrics_collector.check_process_status('helper_bot')
-            
-            assert status == "✅"
-            assert "Uptime" in uptime
-        
-        # Restore the original PID files
-        metrics_collector.pid_files = original_pid_files
-    
-    def test_check_process_status_not_running(self, metrics_collector):
-        """Test checking the status of a process that is not running"""
-        with patch('metrics_collector.psutil.process_iter', return_value=[]):
-            status, message = metrics_collector.check_process_status('nonexistent_bot')
-            assert status == "❌"
-            assert message == "Выключен"
-    
-    def test_calculate_disk_speed(self, metrics_collector):
-        """Test disk speed calculation"""
-        # Initialize baseline values
-        metrics_collector._initialize_disk_io()
-        
-        # Build the current disk statistics
-        current_disk_io = Mock()
-        current_disk_io.read_bytes = 2048 * (1024**2)  # 2 GB
-        current_disk_io.write_bytes = 1024 * (1024**2)  # 1 GB
-        
-        # Wait briefly so a speed can be computed
-        time.sleep(0.1)
-        
-        read_speed, write_speed = metrics_collector._calculate_disk_speed(current_disk_io)
-        
-        assert isinstance(read_speed, str)
-        assert isinstance(write_speed, str)
-        assert "/s" in read_speed
-        assert "/s" in write_speed
-    
-    def test_calculate_disk_io_percent(self, metrics_collector):
-        """Test disk load percentage calculation"""
-        # Initialize baseline values
-        metrics_collector._initialize_disk_io()
-        
-        # Build the current disk statistics
-        current_disk_io = Mock()
-        current_disk_io.read_count = 2000
-        current_disk_io.write_count = 1000
-        current_disk_io.read_bytes = 2048 * (1024**2)
-        current_disk_io.write_bytes = 1024 * (1024**2)
-        
-        # Wait briefly for the calculation
-        time.sleep(0.1)
-        
-        io_percent = metrics_collector._calculate_disk_io_percent()
-        
-        assert isinstance(io_percent, int)
-        assert 0 <= io_percent <= 100
-    
-    def test_get_metrics_data(self, metrics_collector):
-        """Test getting data for Prometheus metrics"""
-        with patch.object(metrics_collector, 'get_system_info') as mock_get_system_info:
-            mock_get_system_info.return_value = {
-                'cpu_percent': 25.5,
-                'ram_percent': 60.2,
-                'disk_percent': 45.8,
-                'load_avg_1m': 1.2,
-                'load_avg_5m': 1.1,
-                'load_avg_15m': 1.0,
-                'swap_percent': 10.5
-            }
-            
-            with patch.object(metrics_collector, '_get_system_uptime', return_value=86400.0):
-                metrics_data = metrics_collector.get_metrics_data()
-                
-                assert isinstance(metrics_data, dict)
-                assert 'cpu_usage_percent' in metrics_data
-                assert 'ram_usage_percent' in metrics_data
-                assert 'disk_usage_percent' in metrics_data
-                assert 'load_average_1m' in metrics_data
-                assert 'system_uptime_seconds' in metrics_data
-                assert 'monitor_uptime_seconds' in metrics_data
-    
-    def test_check_alerts(self, metrics_collector):
-        """Test alert checking"""
-        # Reset alert states for a clean test
-        metrics_collector.alert_states = {'cpu': False, 'ram': False, 'disk': False}
-        metrics_collector.alert_start_times = {'cpu': None, 'ram': None, 'disk': None}
-        
-        # Set minimal delays for the tests
-        metrics_collector.alert_delays = {'cpu': 0, 'ram': 0, 'disk': 0}
-        
-        # Test exceeding the CPU threshold
-        system_info = {
-            'cpu_percent': 85.0,  # Above the 80.0 threshold
-            'ram_percent': 60.0,  # Below the threshold
-            'disk_percent': 70.0,  # Below the threshold
-            'load_avg_1m': 2.5,
-            'ram_used': 8.0,
-            'ram_total': 16.0,
-            'disk_free': 300.0
-        }
-        
-        alerts, recoveries = metrics_collector.check_alerts(system_info)
-        
-        assert len(alerts) == 1
-        assert alerts[0][0] == 'cpu'  # Alert type
-        assert alerts[0][1] == 85.0   # Value
-        assert len(recoveries) == 0
-        
-        # Check that the alert state changed
-        assert metrics_collector.alert_states['cpu'] is True
-        
-        # Test recovery
-        system_info['cpu_percent'] = 70.0  # Below the 75.0 recovery threshold
-        
-        alerts, recoveries = metrics_collector.check_alerts(system_info)
-        
-        assert len(alerts) == 0
-        assert len(recoveries) == 1
-        assert recoveries[0][0] == 'cpu'
-        assert metrics_collector.alert_states['cpu'] is False
-    
-    def test_environment_variables(self):
-        """Test handling of environment variables"""
-        with patch.dict(os.environ, {'THRESHOLD': '90.0', 'RECOVERY_THRESHOLD': '85.0'}):
-            collector = MetricsCollector()
-            assert collector.threshold == 90.0
-            assert collector.recovery_threshold == 85.0
-    
-    def test_metrics_collector_integration(self, metrics_collector):
-        """Integration test for MetricsCollector"""
-        # Check that we can get system information
-        system_info = metrics_collector.get_system_info()
-        
-        # Even if some metrics are unavailable, we should still get a dict
-        assert isinstance(system_info, dict)
-        
-        # Check that we can get metrics for Prometheus
-        metrics_data = metrics_collector.get_metrics_data()
-        assert isinstance(metrics_data, dict)
-        
-        # Check that we can evaluate alerts
-        alerts, recoveries = metrics_collector.check_alerts(system_info)
-        assert isinstance(alerts, list)
-        assert isinstance(recoveries, list)
-
-
-class TestMetricsCollectorEdgeCases:
-    """Edge-case tests for MetricsCollector"""
-    
-    def test_empty_system_info(self):
-        """Test handling of empty system information"""
-        with patch('metrics_collector.psutil.cpu_percent', side_effect=Exception("Test error")):
-            collector = MetricsCollector()
-            system_info = collector.get_system_info()
-            assert system_info == {}
-    
-    def test_missing_disk_info(self):
-        """Test behaviour when disk information is missing"""
-        collector = MetricsCollector()
-        
-        with patch.object(collector, '_get_disk_usage', return_value=None), \
-             patch('metrics_collector.psutil.cpu_percent', side_effect=Exception("Test error")):
-            system_info = collector.get_system_info()
-            assert system_info == {}
-    
-    def test_disk_io_calculation_without_previous_data(self):
-        """Test disk I/O calculation without previous data"""
-        collector = MetricsCollector()
-        
-        # Reset the previous data
-        collector.last_disk_io = None
-        collector.last_disk_io_time = None
-        
-        current_disk_io = Mock()
-        current_disk_io.read_bytes = 1024
-        current_disk_io.write_bytes = 512
-        
-        read_speed, write_speed = collector._calculate_disk_speed(current_disk_io)
-        
-        assert read_speed == "0 B/s"
-        assert write_speed == "0 B/s"
-    
-    def test_uptime_calculation_edge_cases(self):
-        """Test uptime calculation for edge cases"""
-        collector = MetricsCollector()
-        
-        # Test a very short duration
-        assert collector._format_uptime(0) == "0м"
-        assert collector._format_uptime(30) == "0м"
-        
-        # Test a very long duration
-        large_uptime = 365 * 24 * 3600  # 1 year
-        uptime_str = collector._format_uptime(large_uptime)
-        assert "д" in uptime_str
-
-
-if __name__ == "__main__":
-    pytest.main([__file__, "-v"])
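The removed `test_check_alerts` above exercises alerting with hysteresis: an alert fires once a metric rises above `threshold` (80.0) and clears only after it falls below `recovery_threshold` (75.0). A minimal sketch of that behaviour follows. `HysteresisAlerts` and its `check` method are hypothetical stand-ins, not the project's `MetricsCollector`, and the strict comparisons are an assumption: the test only pins 85.0 as alerting and 70.0 as recovered.

```python
from dataclasses import dataclass, field


@dataclass
class HysteresisAlerts:
    """Hypothetical stand-in for the alert state tracking that the test checks."""

    threshold: float = 80.0           # alert when a metric rises above this
    recovery_threshold: float = 75.0  # recover when it falls below this
    states: dict = field(
        default_factory=lambda: {"cpu": False, "ram": False, "disk": False}
    )

    def check(self, system_info: dict) -> tuple:
        """Return (alerts, recoveries) as lists of (metric_name, value)."""
        alerts, recoveries = [], []
        for name in self.states:
            value = system_info.get(f"{name}_percent")
            if value is None:
                continue  # metric missing from this sample
            if not self.states[name] and value > self.threshold:
                self.states[name] = True
                alerts.append((name, value))
            elif self.states[name] and value < self.recovery_threshold:
                self.states[name] = False
                recoveries.append((name, value))
        return alerts, recoveries


checker = HysteresisAlerts()
first = checker.check({"cpu_percent": 85.0, "ram_percent": 60.0, "disk_percent": 70.0})
between = checker.check({"cpu_percent": 78.0})  # inside the band: no state change
second = checker.check({"cpu_percent": 70.0})
```

The gap between the two thresholds keeps a metric that hovers around 80% from producing a stream of alternating alert and recovery messages.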
--- a/tests/infra/test_prometheus_config.py
+++ b/tests/infra/test_prometheus_config.py
@@ -3,340 +3,351 @@
 Tests for the Prometheus configuration
 """

-import pytest
-import yaml
-import sys
 import os
+import sys
 from pathlib import Path

-# Add the path to the monitoring modules
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../infra/monitoring'))
+import pytest
+import yaml


 class TestPrometheusConfig:
    """Tests for the Prometheus configuration"""
-    
+
    @pytest.fixture
    def prometheus_config_path(self):
        """Path to the Prometheus configuration file"""
-        return Path(__file__).parent.parent.parent / 'infra' / 'prometheus' / 'prometheus.yml'
-    
+        return (
+            Path(__file__).parent.parent.parent
+            / "infra"
+            / "prometheus"
+            / "prometheus.yml"
+        )
+
    @pytest.fixture
    def prometheus_config(self, prometheus_config_path):
        """Loaded Prometheus configuration"""
        if not prometheus_config_path.exists():
            pytest.skip(f"Prometheus config file not found: {prometheus_config_path}")
-        
-        with open(prometheus_config_path, 'r', encoding='utf-8') as f:
+
+        with open(prometheus_config_path, "r", encoding="utf-8") as f:
            return yaml.safe_load(f)
-    
+
    def test_config_file_exists(self, prometheus_config_path):
        """Test that the configuration file exists"""
-        assert prometheus_config_path.exists(), f"Prometheus config file not found: {prometheus_config_path}"
-    
+        assert (
+            prometheus_config_path.exists()
+        ), f"Prometheus config file not found: {prometheus_config_path}"
+
    def test_config_is_valid_yaml(self, prometheus_config):
        """Test that the configuration is valid YAML"""
-        assert isinstance(prometheus_config, dict), "Config should be a valid YAML dictionary"
-    
+        assert isinstance(
+            prometheus_config, dict
+        ), "Config should be a valid YAML dictionary"
+
    def test_global_section(self, prometheus_config):
        """Test the global section of the configuration"""
-        assert 'global' in prometheus_config, "Config should have global section"
-        
-        global_config = prometheus_config['global']
-        assert 'scrape_interval' in global_config, "Global section should have scrape_interval"
-        assert 'evaluation_interval' in global_config, "Global section should have evaluation_interval"
-        
+        assert "global" in prometheus_config, "Config should have global section"
+
+        global_config = prometheus_config["global"]
+        assert (
+            "scrape_interval" in global_config
+        ), "Global section should have scrape_interval"
+        assert (
+            "evaluation_interval" in global_config
+        ), "Global section should have evaluation_interval"
+
        # Check the interval values
-        assert global_config['scrape_interval'] == '15s', "Default scrape_interval should be 15s"
-        assert global_config['evaluation_interval'] == '15s', "Default evaluation_interval should be 15s"
-    
+        assert (
+            global_config["scrape_interval"] == "15s"
+        ), "Default scrape_interval should be 15s"
+        assert (
+            global_config["evaluation_interval"] == "15s"
+        ), "Default evaluation_interval should be 15s"
+
    def test_scrape_configs_section(self, prometheus_config):
        """Test the scrape_configs section"""
-        assert 'scrape_configs' in prometheus_config, "Config should have scrape_configs section"
-        
-        scrape_configs = prometheus_config['scrape_configs']
+        assert (
+            "scrape_configs" in prometheus_config
+        ), "Config should have scrape_configs section"
+
+        scrape_configs = prometheus_config["scrape_configs"]
        assert isinstance(scrape_configs, list), "scrape_configs should be a list"
        assert len(scrape_configs) >= 1, "Should have at least one scrape config"
-    
+
    def test_prometheus_job(self, prometheus_config):
        """Test the job for Prometheus itself"""
-        scrape_configs = prometheus_config['scrape_configs']
-        
+        scrape_configs = prometheus_config["scrape_configs"]
+
        # Find the prometheus job
        prometheus_job = None
        for job in scrape_configs:
-            if job.get('job_name') == 'prometheus':
+            if job.get("job_name") == "prometheus":
                prometheus_job = job
                break
-        
+
        assert prometheus_job is not None, "Should have prometheus job"
-        assert 'static_configs' in prometheus_job, "Prometheus job should have static_configs"
-        
-        static_configs = prometheus_job['static_configs']
+        assert (
+            "static_configs" in prometheus_job
+        ), "Prometheus job should have static_configs"
+
+        static_configs = prometheus_job["static_configs"]
        assert isinstance(static_configs, list), "static_configs should be a list"
        assert len(static_configs) > 0, "Should have at least one static config"
-        
+
        # Check targets
-        targets = static_configs[0].get('targets', [])
-        assert 'localhost:9090' in targets, "Prometheus should scrape localhost:9090"
-    
-    def test_infrastructure_job(self, prometheus_config):
-        """Test the infrastructure job"""
-        scrape_configs = prometheus_config['scrape_configs']
-        
-        # Find the infrastructure job
-        infra_job = None
-        for job in scrape_configs:
-            if job.get('job_name') == 'infrastructure':
-                infra_job = job
-                break
-        
-        assert infra_job is not None, "Should have infrastructure job"
-        
-        # Check the main parameters
-        assert 'static_configs' in infra_job, "Infrastructure job should have static_configs"
-        assert 'metrics_path' in infra_job, "Infrastructure job should have metrics_path"
-        assert 'scrape_interval' in infra_job, "Infrastructure job should have scrape_interval"
-        assert 'scrape_timeout' in infra_job, "Infrastructure job should have scrape_timeout"
-        assert 'honor_labels' in infra_job, "Infrastructure job should have honor_labels"
-        
-        # Check the values
-        assert infra_job['metrics_path'] == '/metrics', "Metrics path should be /metrics"
-        assert infra_job['scrape_interval'] == '30s', "Scrape interval should be 30s"
-        assert infra_job['scrape_timeout'] == '10s', "Scrape timeout should be 10s"
-        assert infra_job['honor_labels'] is True, "honor_labels should be True"
-        
-        # Check targets
-        static_configs = infra_job['static_configs']
-        assert len(static_configs) > 0, "Should have at least one static config"
-        
-        targets = static_configs[0].get('targets', [])
-        assert 'host.docker.internal:9091' in targets, "Should scrape host.docker.internal:9091"
-    
+        targets = static_configs[0].get("targets", [])
+        assert "localhost:9090" in targets, "Prometheus should scrape localhost:9090"
+
    def test_telegram_bot_job(self, prometheus_config):
        """Test the telegram-helper-bot job"""
-        scrape_configs = prometheus_config['scrape_configs']
-        
+        scrape_configs = prometheus_config["scrape_configs"]
+
        # Find the telegram-helper-bot job
        bot_job = None
        for job in scrape_configs:
-            if job.get('job_name') == 'telegram-helper-bot':
+            if job.get("job_name") == "telegram-helper-bot":
                bot_job = job
                break
-        
+
        assert bot_job is not None, "Should have telegram-helper-bot job"
-        
+
        # Check the main parameters
-        assert 'static_configs' in bot_job, "Bot job should have static_configs"
-        assert 'metrics_path' in bot_job, "Bot job should have metrics_path"
-        assert 'scrape_interval' in bot_job, "Bot job should have scrape_interval"
-        assert 'scrape_timeout' in bot_job, "Bot job should have scrape_timeout"
-        assert 'honor_labels' in bot_job, "Bot job should have honor_labels"
-        
+        assert "static_configs" in bot_job, "Bot job should have static_configs"
+        assert "metrics_path" in bot_job, "Bot job should have metrics_path"
+        assert "scrape_interval" in bot_job, "Bot job should have scrape_interval"
+        assert "scrape_timeout" in bot_job, "Bot job should have scrape_timeout"
+        assert "honor_labels" in bot_job, "Bot job should have honor_labels"
+
        # Check the values
-        assert bot_job['metrics_path'] == '/metrics', "Metrics path should be /metrics"
-        assert bot_job['scrape_interval'] == '15s', "Scrape interval should be 15s"
-        assert bot_job['scrape_timeout'] == '10s', "Scrape timeout should be 10s"
-        assert bot_job['honor_labels'] is True, "honor_labels should be True"
-        
+        assert bot_job["metrics_path"] == "/metrics", "Metrics path should be /metrics"
+        assert bot_job["scrape_interval"] == "15s", "Scrape interval should be 15s"
+        assert bot_job["scrape_timeout"] == "10s", "Scrape timeout should be 10s"
+        assert bot_job["honor_labels"] is True, "honor_labels should be True"
+
        # Check static_configs
-        static_configs = bot_job['static_configs']
+        static_configs = bot_job["static_configs"]
        assert len(static_configs) > 0, "Should have at least one static config"
-        
+
        # Check targets
-        targets = static_configs[0].get('targets', [])
-        assert 'host.docker.internal:8080' in targets, "Should scrape host.docker.internal:8080"
-        
+        targets = static_configs[0].get("targets", [])
+        assert (
+            "bots_telegram_bot:8080" in targets
+        ), "Should scrape bots_telegram_bot:8080"
+
        # Check labels
-        labels = static_configs[0].get('labels', {})
+        labels = static_configs[0].get("labels", {})
        expected_labels = {
-            'bot_name': 'telegram-helper-bot',
-            'environment': 'production',
-            'service': 'telegram-bot'
+            "bot_name": "telegram-helper-bot",
+            "environment": "production",
+            "service": "telegram-bot",
        }
-        
+
        for key, value in expected_labels.items():
            assert key in labels, f"Should have label {key}"
            assert labels[key] == value, f"Label {key} should be {value}"
-    
+
    def test_alerting_section(self, prometheus_config):
        """Test the alerting section"""
-        assert 'alerting' in prometheus_config, "Config should have alerting section"
-        
-        alerting_config = prometheus_config['alerting']
-        assert 'alertmanagers' in alerting_config, "Alerting section should have alertmanagers"
-        
-        alertmanagers = alerting_config['alertmanagers']
+        assert "alerting" in prometheus_config, "Config should have alerting section"
+
+        alerting_config = prometheus_config["alerting"]
+        assert (
+            "alertmanagers" in alerting_config
+        ), "Alerting section should have alertmanagers"
+
+        alertmanagers = alerting_config["alertmanagers"]
        assert isinstance(alertmanagers, list), "alertmanagers should be a list"
-        
-        # Check that alertmanager is commented out (inactive)
-        # This is fine for a test environment
+
+        # Check that alertmanager is configured correctly
        if len(alertmanagers) > 0:
            for am in alertmanagers:
-                if 'static_configs' in am:
-                    static_configs = am['static_configs']
+                if "static_configs" in am:
+                    static_configs = am["static_configs"]
+                    assert isinstance(
+                        static_configs, list
+                    ), "static_configs should be a list"
                    for sc in static_configs:
-                        if 'targets' in sc:
-                            targets = sc['targets']
+                        if "targets" in sc:
+                            targets = sc["targets"]
                            # targets may be None if all lines are commented out
                            if targets is not None:
-                                # Check that all targets are commented out
+                                assert isinstance(
+                                    targets, list
+                                ), "targets should be a list"
+                                # Check that each target is a string in the expected format
                                for target in targets:
-                                    assert target.startswith('#'), f"Alertmanager target should be commented: {target}"
-    
+                                    assert isinstance(
+                                        target, str
+                                    ), f"Target should be a string: {target}"
+                                    # If the target is not commented out, check its format
+                                    if not target.startswith("#"):
+                                        assert (
+                                            ":" in target
+                                        ), f"Target should have port: {target}"
+
    def test_rule_files_section(self, prometheus_config):
        """Test the rule_files section"""
-        assert 'rule_files' in prometheus_config, "Config should have rule_files section"
-        
-        rule_files = prometheus_config['rule_files']
+        assert (
+            "rule_files" in prometheus_config
+        ), "Config should have rule_files section"
+
+        rule_files = prometheus_config["rule_files"]
        # rule_files may be None if all lines are commented out
        if rule_files is not None:
            assert isinstance(rule_files, list), "rule_files should be a list"
-            
-            # Check that all rule files are commented out
+
+            # Check that rule files are correctly formatted
            for rule_file in rule_files:
-                assert rule_file.startswith('#'), f"Rule file should be commented: {rule_file}"
-    
+                assert isinstance(
+                    rule_file, str
+                ), f"Rule file should be a string: {rule_file}"
+                # If the rule file is not commented out, check that its path has a YAML extension
+                if not rule_file.startswith("#"):
+                    assert rule_file.endswith(".yml") or rule_file.endswith(
+                        ".yaml"
+                    ), f"Rule file should have .yml or .yaml extension: {rule_file}"
+
    def test_config_structure_consistency(self, prometheus_config):
        """Test the consistency of the configuration structure"""
        # Check that all jobs share the same structure
-        scrape_configs = prometheus_config['scrape_configs']
-        
-        required_fields = ['job_name', 'static_configs']
-        optional_fields = ['metrics_path', 'scrape_interval', 'scrape_timeout', 'honor_labels']
-        
+        scrape_configs = prometheus_config["scrape_configs"]
+
+        required_fields = ["job_name", "static_configs"]
+        optional_fields = [
+            "metrics_path",
+            "scrape_interval",
+            "scrape_timeout",
+            "honor_labels",
+        ]
+
        for job in scrape_configs:
            # Check required fields
            for field in required_fields:
-                assert field in job, f"Job {job.get('job_name', 'unknown')} missing required field: {field}"
-            
+                assert (
+                    field in job
+                ), f"Job {job.get('job_name', 'unknown')} missing required field: {field}"
+
            # Check that static_configs contains targets
-            static_configs = job['static_configs']
-            assert isinstance(static_configs, list), f"Job {job.get('job_name', 'unknown')} static_configs should be list"
-            
+            static_configs = job["static_configs"]
+            assert isinstance(
+                static_configs, list
+            ), f"Job {job.get('job_name', 'unknown')} static_configs should be list"
+
            for static_config in static_configs:
-                assert 'targets' in static_config, f"Static config should have targets"
-                targets = static_config['targets']
+                assert "targets" in static_config, "Static config should have targets"
+                targets = static_config["targets"]
                assert isinstance(targets, list), "Targets should be a list"
                assert len(targets) > 0, "Targets should not be empty"
-    
+
    def test_port_configurations(self, prometheus_config):
        """Test port configuration"""
-        scrape_configs = prometheus_config['scrape_configs']
-        
+        scrape_configs = prometheus_config["scrape_configs"]
+
        # Check that ports are configured correctly
        for job in scrape_configs:
-            static_configs = job['static_configs']
+            static_configs = job["static_configs"]
            for static_config in static_configs:
-                targets = static_config['targets']
+                targets = static_config["targets"]
                for target in targets:
-                    if ':' in target:
-                        host, port = target.split(':', 1)
+                    if ":" in target:
+                        host, port = target.split(":", 1)
                        # Check that the port is a number
                        try:
                            port_num = int(port)
-                            assert 1 <= port_num <= 65535, f"Port {port_num} out of range"
+                            assert (
+                                1 <= port_num <= 65535
+                            ), f"Port {port_num} out of range"
                        except ValueError:
                            # This may be a Docker service name without a port
                            pass
-    
+
    def test_environment_labels(self, prometheus_config):
        """Test environment labels"""
-        scrape_configs = prometheus_config['scrape_configs']
-        
+        scrape_configs = prometheus_config["scrape_configs"]
+
        # Check that the production environment is labeled correctly
        for job in scrape_configs:
-            if job.get('job_name') == 'telegram-helper-bot':
-                static_configs = job['static_configs']
+            if job.get("job_name") == "telegram-helper-bot":
+                static_configs = job["static_configs"]
                for static_config in static_configs:
-                    labels = static_config.get('labels', {})
-                    if 'environment' in labels:
-                        assert labels['environment'] == 'production', "Environment should be production"
-    
+                    labels = static_config.get("labels", {})
+                    if "environment" in labels:
+                        assert (
+                            labels["environment"] == "production"
+                        ), "Environment should be production"
+
    def test_metrics_path_consistency(self, prometheus_config):
        """Test that metrics paths are consistent"""
-        scrape_configs = prometheus_config['scrape_configs']
-        
+        scrape_configs = prometheus_config["scrape_configs"]
+
        # Check that every job with an explicit metrics_path uses /metrics
        for job in scrape_configs:
-            if 'metrics_path' in job:
-                assert job['metrics_path'] == '/metrics', f"Job {job.get('job_name', 'unknown')} should use /metrics path"
+            if "metrics_path" in job:
+                assert (
+                    job["metrics_path"] == "/metrics"
+                ), f"Job {job.get('job_name', 'unknown')} should use /metrics path"


 class TestPrometheusConfigValidation:
    """Prometheus configuration validation tests"""
-    
+
    @pytest.fixture
    def sample_valid_config(self):
        """Sample valid configuration"""
        return {
-            'global': {
-                'scrape_interval': '15s',
-                'evaluation_interval': '15s'
-            },
-            'scrape_configs': [
+            "global": {"scrape_interval": "15s", "evaluation_interval": "15s"},
+            "scrape_configs": [
                {
-                    'job_name': 'test',
-                    'static_configs': [
-                        {
-                            'targets': ['localhost:9090']
-                        }
-                    ]
+                    "job_name": "test",
+                    "static_configs": [{"targets": ["localhost:9090"]}],
                }
-            ]
+            ],
        }
-    
+
    def test_minimal_valid_config(self, sample_valid_config):
        """Test a minimal valid configuration"""
        # Check that the configuration contains all required fields
-        assert 'global' in sample_valid_config
-        assert 'scrape_configs' in sample_valid_config
-        
-        global_config = sample_valid_config['global']
-        assert 'scrape_interval' in global_config
-        assert 'evaluation_interval' in global_config
-        
-        scrape_configs = sample_valid_config['scrape_configs']
+        assert "global" in sample_valid_config
+        assert "scrape_configs" in sample_valid_config
+
+        global_config = sample_valid_config["global"]
+        assert "scrape_interval" in global_config
+        assert "evaluation_interval" in global_config
+
+        scrape_configs = sample_valid_config["scrape_configs"]
        assert len(scrape_configs) > 0
-        
+
        for job in scrape_configs:
-            assert 'job_name' in job
-            assert 'static_configs' in job
-            
-            static_configs = job['static_configs']
+            assert "job_name" in job
+            assert "static_configs" in job
+
+            static_configs = job["static_configs"]
            assert len(static_configs) > 0
-            
+
            for static_config in static_configs:
-                assert 'targets' in static_config
-                targets = static_config['targets']
+                assert "targets" in static_config
+                targets = static_config["targets"]
                assert len(targets) > 0
-    
+
    def test_config_without_required_fields(self):
        """Test configurations missing required fields"""
        # Configuration without a global section
-        config_without_global = {
-            'scrape_configs': []
-        }
-        
+        config_without_global = {"scrape_configs": []}
+
        # Configuration without scrape_configs
-        config_without_scrape = {
-            'global': {
-                'scrape_interval': '15s'
-            }
-        }
-        
+        config_without_scrape = {"global": {"scrape_interval": "15s"}}
+
        # Configuration with empty scrape_configs
        config_empty_scrape = {
-            'global': {
-                'scrape_interval': '15s'
-            },
-            'scrape_configs': []
+            "global": {"scrape_interval": "15s"},
+            "scrape_configs": [],
        }
-        
+
        # All of these configurations are invalid: each one lacks a required part
-        assert 'global' not in config_without_global
-        assert 'scrape_configs' not in config_without_scrape
-        assert len(config_empty_scrape['scrape_configs']) == 0
+        assert "global" not in config_without_global
+        assert "scrape_configs" not in config_without_scrape
+        assert len(config_empty_scrape["scrape_configs"]) == 0


 if __name__ == "__main__":
--- a/tests/infra/test_prometheus_integration.py
+++ b/tests/infra/test_prometheus_integration.py
@@ -1,437 +0,0 @@
-#!/usr/bin/env python3
-"""
-Integration tests for Prometheus and related components
-"""
-
-import pytest
-import pytest_asyncio
-import asyncio
-import sys
-import os
-import tempfile
-import yaml
-from unittest.mock import Mock, AsyncMock, patch, MagicMock
-from pathlib import Path
-
-# Add the monitoring modules directory to the import path
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../infra/monitoring'))
-
-from prometheus_server import PrometheusServer
-from metrics_collector import MetricsCollector
-
-
-class TestPrometheusIntegration:
-    """Integration tests for Prometheus"""
-    
-    @pytest_asyncio.fixture
-    async def prometheus_server(self):
-        """Create a PrometheusServer instance for integration tests"""
-        server = PrometheusServer(host='127.0.0.1', port=0)
-        return server
-    
-    @pytest.fixture
-    def metrics_collector(self):
-        """Create a MetricsCollector instance for integration tests"""
-        return MetricsCollector()
-    
-    @pytest.fixture
-    def sample_prometheus_config(self):
-        """Create a sample Prometheus configuration for tests"""
-        return {
-            'global': {
-                'scrape_interval': '15s',
-                'evaluation_interval': '15s'
-            },
-            'scrape_configs': [
-                {
-                    'job_name': 'test-infrastructure',
-                    'static_configs': [
-                        {
-                            'targets': ['127.0.0.1:9091'],
-                            'labels': {
-                                'environment': 'test',
-                                'service': 'test-monitoring'
-                            }
-                        }
-                    ],
-                    'metrics_path': '/metrics',
-                    'scrape_interval': '30s',
-                    'scrape_timeout': '10s',
-                    'honor_labels': True
-                }
-            ]
-        }
-    
-    @pytest.mark.integration
-    @pytest.mark.asyncio
-    async def test_prometheus_server_with_real_metrics_collector(self, prometheus_server):
-        """Test PrometheusServer integration with a real MetricsCollector"""
-        # Collect real metrics
-        metrics_data = prometheus_server.metrics_collector.get_metrics_data()
-        
-        # Check that metrics can be retrieved
-        assert isinstance(metrics_data, dict)
-        
-        # Format the metrics in the Prometheus format
-        prometheus_metrics = prometheus_server._format_prometheus_metrics(metrics_data)
-        
-        # Check that the metrics include system information
-        assert '# HELP system_info System information' in prometheus_metrics
-        assert '# TYPE system_info gauge' in prometheus_metrics
-        
-        # Check that at least one metric is present
-        lines = prometheus_metrics.split('\n')
-        assert len(lines) >= 3  # system_info help, type, value
-    
-    @pytest.mark.integration
-    def test_metrics_collector_system_integration(self, metrics_collector):
-        """Test MetricsCollector integration with the system"""
-        # Collect system information
-        system_info = metrics_collector.get_system_info()
-        
-        # Check that a dict was returned
-        assert isinstance(system_info, dict)
-        
-        # Check that metrics for Prometheus can be retrieved
-        metrics_data = metrics_collector.get_metrics_data()
-        assert isinstance(metrics_data, dict)
-        
-        # Check that alerts can be evaluated
-        alerts, recoveries = metrics_collector.check_alerts(system_info)
-        assert isinstance(alerts, list)
-        assert isinstance(recoveries, list)
-    
-    @pytest.mark.integration
-    def test_prometheus_metrics_format_integration(self, prometheus_server, metrics_collector):
-        """Test Prometheus metrics formatting integration"""
-        # Collect real metrics
-        metrics_data = metrics_collector.get_metrics_data()
-        
-        # Format in the Prometheus format
-        prometheus_metrics = prometheus_server._format_prometheus_metrics(metrics_data)
-        
-        # Check the structure of the metrics
-        lines = prometheus_metrics.split('\n')
-        
-        # System information must be present
-        system_info_lines = [line for line in lines if 'system_info' in line]
-        assert len(system_info_lines) >= 3  # help, type, value
-        
-        # Check that the metrics declare types
-        type_lines = [line for line in lines if '# TYPE' in line]
-        assert len(type_lines) > 0
-        
-        # Check that every metric line is well-formed
-        metric_lines = [line for line in lines if line and not line.startswith('#')]
-        for line in metric_lines:
-            # Check that the metric line contains a name and a value
-            assert ' ' in line
-            parts = line.split(' ')
-            assert len(parts) >= 2
-    
-    @pytest.mark.integration
-    def test_os_detection_integration(self):
-        """Test OS detection integration"""
-        # Create a collector that performs real OS detection
-        collector = MetricsCollector()
-        
-        # Check that the OS was detected
-        assert collector.os_type in ["macos", "ubuntu", "unknown"]
-        
-        # Check that disk information can be retrieved
-        disk_info = collector._get_disk_usage()
-        if disk_info is not None:
-            assert hasattr(disk_info, 'total')
-            assert hasattr(disk_info, 'used')
-            assert hasattr(disk_info, 'free')
-    
-    @pytest.mark.integration
-    def test_disk_io_calculation_integration(self, metrics_collector):
-        """Test disk I/O calculation integration"""
-        # Initialize baseline values
-        metrics_collector._initialize_disk_io()
-        
-        # Get the current disk statistics
-        current_disk_io = metrics_collector._get_disk_io_counters()
-        
-        if current_disk_io is not None:
-            # Calculate the speed
-            read_speed, write_speed = metrics_collector._calculate_disk_speed(current_disk_io)
-            
-            # Check that the results are strings with units
-            assert isinstance(read_speed, str)
-            assert isinstance(write_speed, str)
-            assert "/s" in read_speed
-            assert "/s" in write_speed
-            
-            # Calculate the utilization percentage
-            io_percent = metrics_collector._calculate_disk_io_percent()
-            assert isinstance(io_percent, int)
-            assert 0 <= io_percent <= 100
-    
-    @pytest.mark.integration
-    def test_process_monitoring_integration(self, metrics_collector):
-        """Test process monitoring integration"""
-        # Check process status
-        for process_name in ['helper_bot']:
-            status, message = metrics_collector.check_process_status(process_name)
-            
-            # The status must be either ✅ or ❌
-            assert status in ["✅", "❌"]
-            
-            # The message must be a string
-            assert isinstance(message, str)
-    
-    @pytest.mark.integration
-    def test_alert_system_integration(self, metrics_collector):
-        """Test alert system integration"""
-        # Reset alert states for a clean test
-        metrics_collector.alert_states = {'cpu': False, 'ram': False, 'disk': False}
-        metrics_collector.alert_start_times = {'cpu': None, 'ram': None, 'disk': None}
-        
-        # Set minimal delays for the tests
-        metrics_collector.alert_delays = {'cpu': 0, 'ram': 0, 'disk': 0}
-        
-        # Build test data
-        test_system_info = {
-            'cpu_percent': 85.0,  # Above the threshold
-            'ram_percent': 60.0,  # Below the threshold
-            'disk_percent': 70.0,  # Below the threshold
-            'load_avg_1m': 2.5,
-            'ram_used': 8.0,
-            'ram_total': 16.0,
-            'disk_free': 300.0
-        }
-        
-        # Check alerts
-        alerts, recoveries = metrics_collector.check_alerts(test_system_info)
-        
-        # There must be at least one CPU alert
-        assert len(alerts) >= 1
-        assert any(alert[0] == 'cpu' for alert in alerts)
-        
-        # Check that the alert state changed
-        assert metrics_collector.alert_states['cpu'] is True
-        
-        # Test recovery
-        test_system_info['cpu_percent'] = 70.0  # Below the recovery threshold
-        
-        alerts, recoveries = metrics_collector.check_alerts(test_system_info)
-        
-        # A recovery must be reported
-        assert len(recoveries) >= 1
-        assert any(recovery[0] == 'cpu' for recovery in recoveries)
-        assert metrics_collector.alert_states['cpu'] is False
-    
-    @pytest.mark.integration
-    def test_uptime_calculation_integration(self, metrics_collector):
-        """Test uptime calculation integration"""
-        # Get the system uptime
-        system_uptime = metrics_collector._get_system_uptime()
-        assert system_uptime > 0
-        
-        # Get the monitor uptime
-        monitor_uptime = metrics_collector.get_monitor_uptime()
-        assert isinstance(monitor_uptime, str)
-        assert len(monitor_uptime) > 0
-        
-        # Format the uptime
-        formatted_uptime = metrics_collector._format_uptime(system_uptime)
-        assert isinstance(formatted_uptime, str)
-        assert len(formatted_uptime) > 0
-    
-    @pytest.mark.integration
-    def test_environment_variables_integration(self):
-        """Test integration with environment variables"""
-        # Test with custom values
-        test_threshold = '90.0'
-        test_recovery_threshold = '85.0'
-        
-        with patch.dict(os.environ, {
-            'THRESHOLD': test_threshold,
-            'RECOVERY_THRESHOLD': test_recovery_threshold
-        }):
-            collector = MetricsCollector()
-            
-            # Check that the values were applied
-            assert collector.threshold == float(test_threshold)
-            assert collector.recovery_threshold == float(test_recovery_threshold)
-    
-    @pytest.mark.integration
-    def test_prometheus_config_validation_integration(self, sample_prometheus_config):
-        """Test Prometheus configuration validation integration"""
-        # Check the configuration structure
-        assert 'global' in sample_prometheus_config
-        assert 'scrape_configs' in sample_prometheus_config
-        
-        global_config = sample_prometheus_config['global']
-        assert 'scrape_interval' in global_config
-        assert 'evaluation_interval' in global_config
-        
-        scrape_configs = sample_prometheus_config['scrape_configs']
-        assert len(scrape_configs) > 0
-        
-        # Check each job
-        for job in scrape_configs:
-            assert 'job_name' in job
-            assert 'static_configs' in job
-            
-            static_configs = job['static_configs']
-            assert len(static_configs) > 0
-            
-            for static_config in static_configs:
-                assert 'targets' in static_config
-                targets = static_config['targets']
-                assert len(targets) > 0
-    
-    @pytest.mark.integration
-    def test_metrics_data_consistency_integration(self, prometheus_server, metrics_collector):
-        """Test metrics data consistency integration"""
-        # Collect metrics in two different ways
-        system_info = metrics_collector.get_system_info()
-        metrics_data = metrics_collector.get_metrics_data()
-        
-        # Check consistency between system_info and metrics_data
-        # Real metrics can differ significantly because of the time between calls
-        # and system load, so wider tolerances are used
-        
-        if 'cpu_percent' in system_info and 'cpu_usage_percent' in metrics_data:
-            # CPU metrics can fluctuate widely, so a 50% tolerance is used
-            # This is because CPU usage is sampled at different moments in time
-            cpu_diff = abs(system_info['cpu_percent'] - metrics_data['cpu_usage_percent'])
-            assert cpu_diff < 50.0, f"CPU metrics difference too large: {cpu_diff}% (system: {system_info['cpu_percent']}%, metrics: {metrics_data['cpu_usage_percent']}%)"
-        
-        if 'ram_percent' in system_info and 'ram_usage_percent' in metrics_data:
-            # RAM metrics are more stable, but a 15% tolerance is still used
-            ram_diff = abs(system_info['ram_percent'] - metrics_data['ram_usage_percent'])
-            assert ram_diff < 15.0, f"RAM metrics difference too large: {ram_diff}% (system: {system_info['ram_percent']}%, metrics: {metrics_data['ram_usage_percent']}%)"
-        
-        if 'disk_percent' in system_info and 'disk_usage_percent' in metrics_data:
-            # Disk metrics should be very stable; the tolerance is 10%
-            disk_diff = abs(system_info['disk_percent'] - metrics_data['disk_usage_percent'])
-            assert disk_diff < 10.0, f"Disk metrics difference too large: {disk_diff}% (system: {system_info['disk_percent']}%, metrics: {metrics_data['disk_usage_percent']}%)"
-        
-        # Check that all metrics have sane values
-        for metric_name, value in system_info.items():
-            if isinstance(value, (int, float)):
-                assert value >= 0, f"Metric {metric_name} should be non-negative: {value}"
-        
-        for metric_name, value in metrics_data.items():
-            if isinstance(value, (int, float)):
-                assert value >= 0, f"Metric {metric_name} should be non-negative: {value}"
-    
-    @pytest.mark.integration
-    def test_error_handling_integration(self, prometheus_server, metrics_collector):
-        """Test error handling integration"""
-        # Test error handling in PrometheusServer
-        with patch.object(metrics_collector, 'get_metrics_data', side_effect=Exception("Test error")):
-            prometheus_server.metrics_collector = metrics_collector
-            
-            # Create a mock request
-            request = Mock()
-            
-            # Handle the metrics request
-            response = asyncio.run(prometheus_server.metrics_handler(request))
-            
-            # An error response must be returned
-            assert response.status == 500
-            assert 'Error: Test error' in response.text
-    
-    @pytest.mark.integration
-    def test_performance_integration(self, prometheus_server, metrics_collector):
-        """Test performance integration"""
-        import time
-        
-        # Measure how long collecting system information takes
-        start_time = time.time()
-        system_info = metrics_collector.get_system_info()
-        system_info_time = time.time() - start_time
-        
-        # Measure how long collecting metrics takes
-        start_time = time.time()
-        metrics_data = metrics_collector.get_metrics_data()
-        metrics_time = time.time() - start_time
-        
-        # Measure how long formatting Prometheus metrics takes
-        start_time = time.time()
-        prometheus_metrics = prometheus_server._format_prometheus_metrics(metrics_data)
-        formatting_time = time.time() - start_time
-        
-        # Check that the operations complete in reasonable time
-        assert system_info_time < 5.0, f"System info collection took too long: {system_info_time}s"
-        assert metrics_time < 3.0, f"Metrics collection took too long: {metrics_time}s"
-        assert formatting_time < 0.1, f"Metrics formatting took too long: {formatting_time}s"
-        
-        # Check that data was returned
-        assert isinstance(system_info, dict)
-        assert isinstance(metrics_data, dict)
-        assert isinstance(prometheus_metrics, str)
-        assert len(prometheus_metrics) > 0
-
-
-class TestPrometheusEndToEnd:
-    """End-to-end tests for Prometheus"""
-    
-    @pytest.mark.integration
-    @pytest.mark.slow
-    def test_full_metrics_pipeline(self):
-        """Test the full metrics pipeline"""
-        # Create all components
-        metrics_collector = MetricsCollector()
-        prometheus_server = PrometheusServer()
-        
-        # 1. Collect system information
-        system_info = metrics_collector.get_system_info()
-        assert isinstance(system_info, dict)
-        
-        # 2. Get metrics for Prometheus
-        metrics_data = metrics_collector.get_metrics_data()
-        assert isinstance(metrics_data, dict)
-        
-        # 3. Format the metrics in the Prometheus format
-        prometheus_metrics = prometheus_server._format_prometheus_metrics(metrics_data)
-        assert isinstance(prometheus_metrics, str)
-        
-        # 4. Check that the metrics contain the required information
-        lines = prometheus_metrics.split('\n')
-        
-        # System information must be present
-        assert any('system_info' in line for line in lines)
-        
-        # System metrics must be present
-        assert any('cpu_usage_percent' in line for line in lines) or any('ram_usage_percent' in line for line in lines)
-        
-        # 5. Check alerts
-        alerts, recoveries = metrics_collector.check_alerts(system_info)
-        assert isinstance(alerts, list)
-        assert isinstance(recoveries, list)
-    
-    @pytest.mark.integration
-    @pytest.mark.slow
-    def test_metrics_stability(self):
-        """Test metrics stability"""
-        import time
-        metrics_collector = MetricsCollector()
-        
-        # Collect metrics several times in a row
-        metrics_list = []
-        for _ in range(3):
-            metrics = metrics_collector.get_metrics_data()
-            metrics_list.append(metrics)
-            time.sleep(0.1)  # Short pause
-        
-        # Check that the structure of the metrics did not change
-        for metrics in metrics_list:
-            assert isinstance(metrics, dict)
-            assert len(metrics) > 0
-        
-        # Check that the metric keys did not change
-        first_keys = set(metrics_list[0].keys())
-        for metrics in metrics_list[1:]:
-            current_keys = set(metrics.keys())
-            # Some metrics may be missing, but the structure should be similar
-            assert len(current_keys.intersection(first_keys)) > 0
-
-
-if __name__ == "__main__":
-    pytest.main([__file__, "-v", "-m", "integration"])
--- a/tests/infra/test_prometheus_server.py
+++ b/tests/infra/test_prometheus_server.py
@@ -1,309 +0,0 @@
-#!/usr/bin/env python3
-"""
-Tests for PrometheusServer
-"""
-
-import pytest
-import asyncio
-import sys
-import os
-from unittest.mock import Mock, AsyncMock, patch, MagicMock
-from aiohttp import web
-from aiohttp.test_utils import TestClient
-
-# Add the monitoring modules directory to the import path
-sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../infra/monitoring'))
-
-from prometheus_server import PrometheusServer
-
-
-class TestPrometheusServer:
-    """Tests for the PrometheusServer class"""
-    
-    @pytest.fixture
-    def prometheus_server(self):
-        """Create a PrometheusServer instance for tests"""
-        return PrometheusServer(host='127.0.0.1', port=9091)
-    
-    @pytest.fixture
-    def mock_metrics_collector(self):
-        """Create a mock MetricsCollector"""
-        mock_collector = Mock()
-        mock_collector.os_type = "ubuntu"
-        mock_collector.get_metrics_data.return_value = {
-            'cpu_usage_percent': 25.5,
-            'ram_usage_percent': 60.2,
-            'disk_usage_percent': 45.8,
-            'load_average_1m': 1.2,
-            'load_average_5m': 1.1,
-            'load_average_15m': 1.0,
-            'swap_usage_percent': 10.5,
-            'disk_io_percent': 15.3,
-            'system_uptime_seconds': 86400.0,
-            'monitor_uptime_seconds': 3600.0
-        }
-        return mock_collector
-    
-    def test_init(self, prometheus_server):
-        """Test PrometheusServer initialization"""
-        assert prometheus_server.host == '127.0.0.1'
-        assert prometheus_server.port == 9091
-        assert prometheus_server.metrics_collector is not None
-        assert isinstance(prometheus_server.app, web.Application)
-    
-    def test_setup_routes(self, prometheus_server):
-        """Test route setup"""
-        routes = list(prometheus_server.app.router.routes())
-        # aiohttp creates 2 routes for each endpoint (GET and HEAD)
-        assert len(routes) == 6
-        
-        # Check that all routes are present
-        route_paths = [route.resource.canonical for route in routes]
-        assert '/' in route_paths
-        assert '/metrics' in route_paths
-        assert '/health' in route_paths
-    
-    @pytest.mark.asyncio
-    async def test_root_handler(self, prometheus_server):
-        """Test the root handler"""
-        request = Mock()
-        response = await prometheus_server.root_handler(request)
-        
-        assert isinstance(response, web.Response)
-        assert response.status == 200
-        assert response.content_type == 'text/plain'
-        assert 'Prometheus Metrics Server' in response.text
-        assert '/metrics' in response.text
-        assert '/health' in response.text
-    
-    @pytest.mark.asyncio
-    async def test_health_handler(self, prometheus_server):
-        """Test the health check handler"""
-        request = Mock()
-        response = await prometheus_server.health_handler(request)
-        
-        assert isinstance(response, web.Response)
-        assert response.status == 200
-        assert response.content_type == 'text/plain'
-        assert response.text == 'OK'
-    
-    @pytest.mark.asyncio
-    async def test_metrics_handler_success(self, prometheus_server, mock_metrics_collector):
-        """Test the metrics handler when data is retrieved successfully"""
-        # Replace metrics_collector with the mock
-        prometheus_server.metrics_collector = mock_metrics_collector
-        
-        request = Mock()
-        response = await prometheus_server.metrics_handler(request)
-        
-        assert isinstance(response, web.Response)
-        assert response.status == 200
-        assert response.content_type == 'text/plain'
-        
-        # Check that the metrics contain the expected data
-        metrics_text = response.text
-        assert '# HELP system_info System information' in metrics_text
-        assert '# TYPE system_info gauge' in metrics_text
-        assert 'system_info{os="ubuntu"}' in metrics_text
-        assert '# HELP cpu_usage_percent CPU usage percentage' in metrics_text
-        assert 'cpu_usage_percent 25.5' in metrics_text
-    
-    @pytest.mark.asyncio
-    async def test_metrics_handler_error(self, prometheus_server, mock_metrics_collector):
-        """Test the metrics handler on error"""
-        # Configure the mock to raise an exception
-        mock_metrics_collector.get_metrics_data.side_effect = Exception("Test error")
-        prometheus_server.metrics_collector = mock_metrics_collector
-        
-        request = Mock()
-        response = await prometheus_server.metrics_handler(request)
-        
-        assert isinstance(response, web.Response)
-        assert response.status == 500
-        assert response.content_type == 'text/plain'
-        assert 'Error: Test error' in response.text
-    
-    def test_format_prometheus_metrics(self, prometheus_server, mock_metrics_collector):
-        """Test formatting metrics in the Prometheus format"""
-        prometheus_server.metrics_collector = mock_metrics_collector
-        
-        metrics_data = mock_metrics_collector.get_metrics_data()
-        formatted_metrics = prometheus_server._format_prometheus_metrics(metrics_data)
-        
-        # Check the structure of the metrics
-        lines = formatted_metrics.split('\n')
-        
-        # Check that system information is present
-        assert any('system_info' in line for line in lines)
-        assert any('os="ubuntu"' in line for line in lines)
-        
-        # Check that CPU metrics are present
-        assert any('cpu_usage_percent' in line for line in lines)
-        assert any('25.5' in line for line in lines)
-        
-        # Check that RAM metrics are present
-        assert any('ram_usage_percent' in line for line in lines)
-        assert any('60.2' in line for line in lines)
-        
-        # Check that disk metrics are present
-        assert any('disk_usage_percent' in line for line in lines)
-        assert any('45.8' in line for line in lines)
-        
-        # Check that load average metrics are present
-        assert any('load_average_1m' in line for line in lines)
-        assert any('1.2' in line for line in lines)
-    
-    def test_format_prometheus_metrics_empty_data(self, prometheus_server):
-        """Test formatting metrics with empty data"""
-        empty_metrics = {}
-        formatted_metrics = prometheus_server._format_prometheus_metrics(empty_metrics)
-        
-        # Only system information should be present
-        lines = formatted_metrics.split('\n')
-        assert len(lines) == 3  # system_info help, type, value
-        assert any('system_info' in line for line in lines)
-    
-    def test_format_prometheus_metrics_partial_data(self, prometheus_server, mock_metrics_collector):
-        """Test formatting metrics with partial data"""
-        prometheus_server.metrics_collector = mock_metrics_collector
-        
-        # CPU metrics only
-        partial_metrics = {
-            'cpu_usage_percent': 50.0,
-            'load_average_1m': 2.5
-        }
-        
-        formatted_metrics = prometheus_server._format_prometheus_metrics(partial_metrics)
-        lines = formatted_metrics.split('\n')
-        
-        # Check that system information, CPU, and load average are present
-        assert any('system_info' in line for line in lines)
-        assert any('cpu_usage_percent' in line for line in lines)
-        assert any('load_average_1m' in line for line in lines)
-        assert any('50.0' in line for line in lines)
-        assert any('2.5' in line for line in lines)
-        
-        # Check that no RAM metrics are present
-        assert not any('ram_usage_percent' in line for line in lines)
-    
-    @pytest.mark.asyncio
-    async def test_start_and_stop(self, prometheus_server):
-        """Test starting and stopping the server"""
-        # Mock web.AppRunner and TCPSite
-        with patch('prometheus_server.web.AppRunner') as mock_runner_class, \
-             patch('prometheus_server.web.TCPSite') as mock_site_class:
-            
-            mock_runner = Mock()
-            mock_runner.setup = AsyncMock()
-            mock_runner.cleanup = AsyncMock()
-            mock_runner_class.return_value = mock_runner
-            
-            mock_site = Mock()
-            mock_site.start = AsyncMock()
-            mock_site_class.return_value = mock_site
-            
-            # Start the server
-            runner = await prometheus_server.start()
-            
-            # Check that the methods were called
-            mock_runner.setup.assert_called_once()
-            mock_site.start.assert_called_once()
-            assert runner == mock_runner
-            
-            # Stop the server
-            await prometheus_server.stop(runner)
-            mock_runner.cleanup.assert_called_once()
-    
-    def test_different_os_types(self):
-        """Test handling of different OS types"""
-        # Test macOS
-        with patch('platform.system', return_value='Darwin'):
-            server_macos = PrometheusServer()
-            assert server_macos.metrics_collector.os_type == "macos"
-        
-        # Test Linux
-        with patch('platform.system', return_value='Linux'):
-            server_linux = PrometheusServer()
-            assert server_linux.metrics_collector.os_type == "ubuntu"
-        
-        # Test an unknown OS
-        with patch('platform.system', return_value='Windows'):
-            server_unknown = PrometheusServer()
-            assert server_unknown.metrics_collector.os_type == "unknown"
-    
-    def test_custom_host_port(self):
-        """Test creating a server with custom parameters"""
-        server = PrometheusServer(host='192.168.1.100', port=9092)
-        assert server.host == '192.168.1.100'
-        assert server.port == 9092
-    
-    def test_metrics_collector_integration(self, prometheus_server):
-        """Test integration with MetricsCollector"""
-        # Check that metrics_collector has the required methods
-        collector = prometheus_server.metrics_collector
-        assert hasattr(collector, 'get_metrics_data')
-        assert hasattr(collector, 'os_type')
-        
-        # Check that the data can be retrieved
-        metrics_data = collector.get_metrics_data()
-        assert isinstance(metrics_data, dict)
-
-
-class TestPrometheusServerIntegration:
-    """Integration tests for PrometheusServer"""
-    
-    @pytest.mark.asyncio
-    async def test_server_creation_integration(self):
-        """Integration test for server creation"""
-        server = PrometheusServer(host='127.0.0.1', port=0)
-        
-        # Check that the server was created
-        assert server is not None
-        assert server.host == '127.0.0.1'
-        assert server.port == 0
-        
-        # Check that the application was created
-        assert server.app is not None
-        
-        # Check that the routes are configured
-        routes = list(server.app.router.routes())
-        assert len(routes) > 0
-    
-    @pytest.mark.asyncio
-    async def test_metrics_collector_integration(self):
-        """Integration test with MetricsCollector"""
-        server = PrometheusServer(host='127.0.0.1', port=0)
-        
-        # Check that the metrics can be retrieved
-        metrics_data = server.metrics_collector.get_metrics_data()
-        assert isinstance(metrics_data, dict)
-        
-        # Check that the metrics can be formatted
-        prometheus_metrics = server._format_prometheus_metrics(metrics_data)
-        assert isinstance(prometheus_metrics, str)
-        assert len(prometheus_metrics) > 0
-    
-    @pytest.mark.asyncio
-    async def test_endpoint_handlers_integration(self):
-        """Integration test for the endpoint handlers"""
-        server = PrometheusServer(host='127.0.0.1', port=0)
-        
-        # Test the root handler
-        request = Mock()
-        response = await server.root_handler(request)
-        assert response.status == 200
-        assert 'Prometheus Metrics Server' in response.text
-        
-        # Test the health handler
-        response = await server.health_handler(request)
-        assert response.status == 200
-        assert response.text == 'OK'
-        
-        # Test the metrics handler
-        response = await server.metrics_handler(request)
-        assert response.status == 200
-        assert '# HELP system_info' in response.text
-
-
-if __name__ == "__main__":
-    pytest.main([__file__, "-v"])
--- a/tests/test_pytest_config.py
+++ b/tests/test_pytest_config.py
@@ -3,46 +3,36 @@
 Pytest configuration test
 """

-import pytest
 import os
 import sys

+import pytest
+
+
 def test_pytest_config_loaded():
    """Check that the pytest configuration is loaded"""
    # Check that we are in the project root directory
-    assert os.path.exists('pytest.ini'), "pytest.ini должен существовать в корне проекта"
-    
-    # Check that the tests directory exists
-    assert os.path.exists('tests'), "Директория tests должна существовать"
-    assert os.path.exists('tests/infra'), "Директория tests/infra должна существовать"
-    assert os.path.exists('tests/bot'), "Директория tests/bot должна существовать"
+    assert os.path.exists(
+        "pytest.ini"
+    ), "pytest.ini должен существовать в корне проекта"
+
+    # Check that the tests directory exists
+    assert os.path.exists("tests"), "Директория tests должна существовать"
+    assert os.path.exists("tests/infra"), "Директория tests/infra должна существовать"
+    assert os.path.exists("tests/bot"), "Директория tests/bot должна существовать"

-def test_import_paths():
-    """Check that import paths are configured correctly"""
-    # Check that the monitoring modules can be imported
-    sys.path.insert(0, 'infra/monitoring')
-    try:
-        import metrics_collector
-        import message_sender
-        import prometheus_server
-        import server_monitor
-        assert True
-    except ImportError as e:
-        pytest.fail(f"Failed to import monitoring modules: {e}")
-    finally:
-        # Remove the path we added
-        if 'infra/monitoring' in sys.path:
-            sys.path.remove('infra/monitoring')

 def test_test_structure():
    """Check the test structure"""
    # Check that the __init__.py files exist
-    assert os.path.exists('tests/__init__.py'), "tests/__init__.py должен существовать"
-    assert os.path.exists('tests/infra/__init__.py'), "tests/infra/__init__.py должен существовать"
-    assert os.path.exists('tests/bot/__init__.py'), "tests/bot/__init__.py должен существовать"
-    
-    # Check that the infrastructure tests exist
-    assert os.path.exists('tests/infra/test_infra.py'), "tests/infra/test_infra.py должен существовать"
+    assert os.path.exists("tests/__init__.py"), "tests/__init__.py должен существовать"
+    assert os.path.exists(
+        "tests/infra/__init__.py"
+    ), "tests/infra/__init__.py должен существовать"
+    assert os.path.exists(
+        "tests/bot/__init__.py"
+    ), "tests/bot/__init__.py должен существовать"
+

 if __name__ == "__main__":
    pytest.main([__file__, "-v"])
Author	SHA1	Message	Date
Andrey	e35415d3d1	Merge branch 'main' of https://github.com/KerradKerridi/prod	2026-02-01 22:31:08 +03:00
Andrey	25dd64fc01	feat: add coverage test targets for Telegram bot and AnonBot in Makefile	2026-02-01 22:31:03 +03:00
ANDREY KATYKHIN	51c2a562fa	Merge pull request #5 from KerradKerridi/dev-5 refactor: simplified the deploy.yml and ci.yml scripts	2026-01-25 22:26:16 +03:00
Andrey	4d328444bd	refactor: simplified the deploy.yml and ci.yml scripts	2026-01-25 22:24:12 +03:00
Andrey	804ecd6107	remove: delete health check step from deploy workflow	2026-01-25 20:51:27 +03:00
Andrey	d736688c62	fix: increase container wait time, fix status variable name, fix delays array for zsh	2026-01-25 20:43:12 +03:00
Andrey	1bfe772a0d	fix: use flock directly with file instead of file descriptor for zsh compatibility	2026-01-25 20:36:07 +03:00
Andrey	e360e5e215	fix: replace exec 200 with flock -x 9 for zsh compatibility	2026-01-25 20:33:08 +03:00
Andrey	76cb533851	fix: use exec for flock file descriptors to work with zsh	2026-01-25 20:27:55 +03:00
Andrey	30465e0bea	debug: add more verbose logging for secrets in deploy steps	2026-01-25 20:24:06 +03:00
Andrey	0a73f9844e	fix: pass secrets directly to SSH scripts instead of using env	2026-01-25 20:14:49 +03:00
Andrey	2ee1977956	feat: add workflow_dispatch to deploy.yml and debug secrets	2026-01-25 20:09:18 +03:00
Andrey	220b24e867	Merge branch 'dev-4'	2026-01-25 20:08:34 +03:00
Andrey	fb33da172a	debug: add secrets availability check in deploy workflow	2026-01-25 20:08:25 +03:00
ANDREY KATYKHIN	9baee2ceb7	Merge pull request #4 from KerradKerridi/dev-4 Merge dev-4 into main	2026-01-25 19:58:22 +03:00
Andrey	60487b5488	some fix agaaain	2026-01-25 19:51:23 +03:00
Andrey	07982ee0f2	some fix 3	2026-01-25 19:24:55 +03:00
Andrey	6c51a82dce	some fix 2	2026-01-25 19:14:07 +03:00
Andrey	5e57e5214c	some fix CI	2026-01-25 19:08:24 +03:00
Andrey	8e595bf7f2	chore: remove outdated monitoring documentation files - Deleted FIX_PROMLEMS.md and MONITORING_AUTH.md as they contained obsolete information regarding Prometheus and Alertmanager configurations. - This cleanup helps streamline the documentation and focuses on current setup practices.	2026-01-25 19:02:46 +03:00
Andrey	34b0345983	some fix	2026-01-25 18:50:18 +03:00
Andrey	1dceab6479	chore: Update Docker Compose and the CI/CD pipeline - Docker Compose now uses GitHub Secrets for bot tokens (taking priority over .env) - Added manual rollback to a specified commit - Implemented health checks with exponential backoff - Improved rollback notifications	2026-01-25 18:33:58 +03:00
Andrey	0cdc40cd21	chore: enhance deployment workflow with improved health checks and manual trigger - Updated the deployment job to allow manual triggering via workflow_dispatch. - Implemented a retry mechanism for health checks on Prometheus and Grafana to improve reliability. - Increased wait time for services to start before health checks are performed. - Modified health check messages for better clarity and added logging for failed checks.	2026-01-25 16:58:16 +03:00
Andrey	fde1f14708	chore: update CI/CD pipeline configuration for improved branch handling - Renamed the CI/CD pipeline for clarity and consistency. - Updated the branch triggers to include 'dev-*' for better integration of development branches. - Removed the URL setting for the production environment to streamline the deployment process.	2026-01-25 15:52:02 +03:00
Andrey	5a0c2d6942	chore: remove CI and deployment workflows to streamline processes - Deleted outdated CI workflow file to simplify the continuous integration process. - Removed deployment workflow file to eliminate redundancy and focus on a more efficient deployment strategy.	2026-01-25 15:46:58 +03:00
Andrey	153a7d4807	chore: refine CI and deployment workflows with enhanced notifications and checks - Improved CI workflow notifications for better clarity on test results. - Added a status check job in the deployment workflow to ensure only successful builds are deployed. - Updated deployment notification messages for improved context and clarity.	2026-01-25 15:44:21 +03:00
Andrey	0944175807	chore: enhance CI and deployment workflows with status checks and notifications - Updated CI workflow to provide clearer notifications on test results and deployment readiness. - Added a new job in the deployment workflow to check the status of the last CI run before proceeding with deployment, ensuring that only successful builds are deployed.	2026-01-25 15:39:19 +03:00
Andrey	3ee72ec48a	chore: update CI and deployment workflows for improved notifications and permissions - Upgraded the upload-artifact action from v3 to v4 in CI workflow for better performance. - Added a notification step in the CI workflow to send test results via Telegram, including job status and repository details. - Modified the deployment workflow to ensure correct file permissions before and after code updates. - Renamed the deployment notification step for clarity and included a link to the action run details in the message.	2026-01-25 15:35:56 +03:00
Andrey	dd8b1c02a4	chore: update Python version in Dockerfile and improve test commands in Makefile - Upgraded Python version in Dockerfile from 3.9 to 3.11.9 for enhanced performance and security. - Adjusted paths in Dockerfile to reflect the new Python version. - Modified test commands in Makefile to activate the virtual environment before running tests, ensuring proper dependency management.	2026-01-25 15:27:57 +03:00
Andrey	9e03c1f6f2	chore: optimize resource allocation and memory settings in Docker Compose - Added memory and CPU limits and reservations for Prometheus, Grafana, and Uptime Kuma services to enhance performance and resource management. - Updated Prometheus and Grafana configurations with new storage block duration settings for improved memory optimization. - Revised README to include additional commands for running specific services and restarting containers.	2026-01-23 21:38:48 +03:00
Andrey	75cd722cc4	fix: update htpasswd generation for monitoring and status page - Modified the htpasswd command to limit the password length to 72 characters for security compliance. - Added a new task to generate an htpasswd hash specifically for the status page. - Updated the task that creates the htpasswd file to use the output from the new hash generation.	2026-01-22 22:38:01 +03:00
Andrey	95fabdc0d1	refactor: consolidate Nginx configurations into a single file - Merged individual Nginx configuration files for Grafana, Prometheus, and Alertmanager into a unified nginx.conf. - Added location blocks for Grafana, Prometheus, and Alertmanager with appropriate proxy settings, authentication, and rate limiting. - Removed obsolete configuration files to streamline the Nginx setup and improve maintainability.	2025-09-20 01:14:10 +03:00
Andrey	8be219778c	chore: update configuration files for improved logging and service management - Enhanced .dockerignore to exclude bot logs, Docker volumes, and temporary files. - Updated .gitignore to include Ansible vars files for better environment management. - Modified docker-compose.yml health checks to use curl for service verification. - Refined Ansible playbook by adding tasks for creating default Zsh configuration files and cleaning up temporary files. - Improved Nginx configuration to support Uptime Kuma with specific location blocks for status and dashboard, including rate limiting and WebSocket support.	2025-09-19 16:40:40 +03:00
Andrey	a075ef6772	chore: remove specific version reference for telegram-helper-bot in Ansible playbook - Eliminated the hardcoded version 'dev-9' for the telegram-helper-bot repository in the Ansible playbook to allow for more flexible updates.	2025-09-19 13:03:25 +03:00
Andrey	8595fc5886	refactor: streamline Ansible playbook and logrotate configurations - Removed environment variable lookups for logrotate settings in logrotate configuration files, replacing them with hardcoded values. - Updated the Ansible playbook to simplify project root, deploy user, and old server configurations by removing environment variable dependencies. - Added tasks to copy Zsh configuration files from an old server to the new server, ensuring proper permissions and cleanup of temporary files. - Enhanced logrotate configurations for bots and system logs to ensure consistent management of log files.	2025-09-19 13:00:19 +03:00
Andrey	f7b08ae9e8	feat: enhance Ansible playbook and Nginx configuration with authentication and logrotate setup - Added environment variables for project configuration in env.template. - Updated Ansible playbook to use environment variables for project settings and added tasks for monitoring authentication setup. - Enhanced Nginx configuration for Alertmanager and Prometheus with HTTP Basic Authentication. - Introduced logrotate configuration for managing log files and set up cron for daily execution. - Removed obsolete Uptime Kuma docker-compose file.	2025-09-19 12:09:05 +03:00
Andrey	1eb11e454d	chore: remove Nginx service from docker-compose and update Ansible inventory with new server IP - Deleted the Nginx service configuration from docker-compose.yml. - Updated the Ansible inventory file to reflect a new server IP address.	2025-09-19 02:21:57 +03:00
Andrey	14b19699c5	feat: enhance Ansible playbook with project directory permissions and service checks - Add tasks to set directory permissions for the project before and after cloning. - Introduce a task to reload the SSH service to apply new configurations. - Implement a check for Node Exporter metrics availability. - Update Prometheus configuration comment for clarity on Node Exporter target.	2025-09-19 01:56:12 +03:00
Andrey	1db579797d	refactor: update Nginx configuration and Docker setup - Change user directive in Nginx configuration from 'nginx' to 'www-data'. - Update upstream server configurations in Nginx to use 'localhost' instead of service names. - Modify Nginx server block to redirect HTTP to a status page instead of Grafana. - Rename Alertmanager location from '/alertmanager/' to '/alerts/' for consistency. - Remove deprecated status page configuration and related files. - Adjust Prometheus configuration to reflect the new Docker network settings.	2025-09-18 21:21:23 +03:00
Andrey	9ec3f02767	feat: integrate Uptime Kuma and Alertmanager into Docker setup - Add Uptime Kuma service for status monitoring with health checks. - Introduce Alertmanager service for alert management and notifications. - Update docker-compose.yml to include new services and their configurations. - Enhance Makefile with commands for managing Uptime Kuma and Alertmanager logs. - Modify Ansible playbook to install necessary packages and configure SSL for new services. - Update Nginx configuration to route traffic to Uptime Kuma and Alertmanager. - Adjust Prometheus configuration to include alert rules and external URLs.	2025-09-16 21:50:56 +03:00
Andrey	5e10204137	Merge branch 'main' of https://github.com/KerradKerridi/prod	2025-09-16 18:52:53 +03:00
Andrey	5b8833a67f	Merge branch 'main' of https://github.com/KerradKerridi/prod	2025-09-16 18:52:24 +03:00
Andrey	2661b3865e	fix: update Dockerfile reference in docker-compose and add versioning to Ansible playbook - Change Dockerfile reference in docker-compose.yml from Dockerfile.bot to Dockerfile - Add versioning comment for the telegram-helper-bot repository in playbook.yml	2025-09-16 18:51:05 +03:00
ANDREY KATYKHIN	539c074e9f	Merge pull request #3 from KerradKerridi/dev-3 Dev 3	2025-09-16 18:32:23 +03:00
Andrey	f8d6b92fd2	feat: add Nginx reverse proxy and SSL configuration - Introduce Nginx service in docker-compose for handling HTTP/HTTPS traffic. - Configure Nginx with SSL support and health checks for Grafana and Prometheus. - Update env.template to include SERVER_IP and STATUS_PAGE_PASSWORD variables. - Enhance Ansible playbook with tasks for Nginx installation, SSL certificate generation, and configuration management.	2025-09-16 18:31:51 +03:00
Andrey	30830c5bd9	refactor: update Docker setup and remove deprecated monitoring components - Replace curl with wget in healthcheck commands for better reliability. - Remove server_monitor service and related configurations from docker-compose. - Update Dockerfile to use a multi-stage build for optimized image size. - Delete obsolete Dockerfile.optimized and related monitoring scripts. - Clean up Makefile by removing commands related to the server_monitor service. - Update README to reflect changes in monitoring services and commands.	2025-09-16 17:49:42 +03:00
Andrey	8673cb4f55	feat: enhance Ansible playbook with security and timezone configurations - Add fail2ban installation and configuration for SSH, Nginx, and Docker - Implement kernel security parameter adjustments to mitigate DDoS and spoofing attacks - Set timezone to Europe/Moscow - Update SSH configuration to use port 15722 and close the default port 22 - Enhance UFW rules to allow new SSH port and restrict access to essential services - Include checks for fail2ban status and debug output for verification	2025-09-16 16:41:54 +03:00
Andrey	a1586e78b3	feat: enhance Ansible playbook with swap file management - Update inventory to use root user with SSH options for security - Add tasks for creating, configuring, and enabling a swap file - Set swappiness parameter temporarily and permanently - Ensure swap file is added to /etc/fstab for automatic mounting - Include checks and debug information for swap status	2025-09-16 15:29:40 +03:00
Andrey	0d5dc67eb9	feat: add Node Exporter Full dashboard and auto-installation - Add Node Exporter Full dashboard (ID: 1860) from Grafana.com - Configure automatic dashboard installation in playbook.yml - Add prometheus-node-exporter service installation and configuration - Add port 9100 to UFW firewall rules - Add dashboard verification tasks in playbook - Configure Grafana variables for admin credentials	2025-09-16 12:19:48 +03:00
Andrey	4eb21a7dbc	Add node_exporter configuration to prometheus.yml	2025-09-16 12:09:34 +03:00
Andrey	81a4069623	Refactor Ansible playbook for improved server setup and monitoring - Update SSH user configuration for enhanced security - Add tasks for UFW setup and Docker service management - Optimize data migration processes for bots - Implement checks for database permissions and sizes - Clean up temporary files post-migration	2025-09-16 00:43:45 +03:00
Andrey	136469793c	Update Ansible playbook for server migration and configuration - Change SSH user to root for initial setup - Add tasks for updating SSH host keys and configuring UFW - Implement Docker Compose installation and service management - Enhance data migration process for telegram-helper-bot and AnonBot - Include checks for database sizes and permissions adjustments for voice_users - Clean up temporary files after migration	2025-09-11 00:09:19 +03:00
Andrey	bb91e139bc	Update Ansible configuration and enhance playbook - Add UFW configuration to secure server ports - Install additional packages including vim, zsh, and monitoring tools - Change default shell for 'deploy' user to zsh - Update .gitignore to include Ansible inventory files	2025-09-09 23:00:15 +03:00
Andrey	4981ae8877	Add Ansible playbook for bot migration to new server - Add inventory.ini with server configuration - Add playbook.yml with complete migration process - Configure user 'deploy' with UID/GID 1001:1001 - Add SSH key setup for GitHub access - Add Docker group membership for deploy user - Include data migration from old server - Add port validation for all services	2025-09-09 22:22:31 +03:00
Andrey	b34da5015d	Implement AnonBot integration and monitoring enhancements - Added AnonBot service to docker-compose with resource limits and environment variables. - Updated Makefile to include commands for AnonBot logs, restart, and dependency checks. - Enhanced Grafana dashboards with AnonBot health metrics and database connection statistics. - Implemented AnonBot status retrieval in the message sender for improved monitoring. - Updated Prometheus configuration to scrape metrics from AnonBot service.	2025-09-08 23:17:24 +03:00
Andrey	40968dd075	WIP: Development changes moved from master - Modified Grafana dashboards - Updated message sender and metrics collector - Added new rate limiting dashboard - Removed count_tests.py	2025-09-05 01:29:28 +03:00
Andrey	7d08575512	Update Prometheus configuration to use container name for telegram-helper-bot target	2025-09-04 09:00:22 +03:00
Andrey	d72b870173	Update Prometheus configuration to use Docker container names - Change host.docker.internal to bots_server_monitor:9091 for the infrastructure job - Change host.docker.internal to bots_telegram_bot:8080 for the telegram-helper-bot job - Update comments to match the new configuration	2025-09-04 08:59:11 +03:00
ANDREY KATYKHIN	f7d11abf69	Merge pull request #1 from KerradKerridi/dev-1 Dev 1	2025-09-04 01:02:24 +03:00