Files
The-Ouroboros/docs/testing.md
agentson 6b34367656
Some checks failed
Gitea CI / test (push) Failing after 3s
Gitea CI / test (pull_request) Failing after 3s
docs: v2/v3 구현 감사 문서 피드백 전체 반영 (#349)
11회 리뷰 사이클에서 남긴 [코멘트]를 모두 본문에 반영하고 블록을 제거한다.

변경 문서:
- docs/architecture.md: SmartScanner 동작 모드(both), 대시보드 10 API,
  DB 스키마(session_id/fx_pnl/mode), config 변수 갱신
- docs/commands.md: /api/pnl/history, /api/positions 엔드포인트 추가
- docs/testing.md: 테스트 수 고정값 제거, SmartScanner fallback 최신화,
  Dashboard 10 API routes 반영
- README.md: 고정 수치 제거, Gitea CI 명시, 파일별 수치 'CI 기준 변동' 표기
- CLAUDE.md: SmartScanner 섹션명 변경, 고정 수치 제거
- docs/requirements-log.md: #318~#331 구현 항목 추가
- docs/ouroboros/80_implementation_audit.md: ROOT-5/6/7 분리,
  REQ-V3-008 함수명 병기, v3 ~85% / 거버넌스 ~60%로 갱신
- docs/ouroboros/85_loss_recovery_action_plan.md: ACT-07 함수명 병기,
  테스트 수 갱신, 6.1/6.2 정확도 개선
- docs/ouroboros/60_repo_enforcement_checklist.md: CI job/step 구분 표 추가
- docs/ouroboros/README.md: 50_* 문서 (A)/(B) 보조 표기

Closes #349
2026-03-01 17:06:56 +09:00

313 lines
9.1 KiB
Markdown

# Testing Guidelines
## Test Structure
**998 tests** across **41 files**. `asyncio_mode = "auto"` in pyproject.toml — async tests need no special decorator.
The `settings` fixture in `conftest.py` provides safe defaults with test credentials and in-memory DB.
### Test Files
#### Core Components
##### `tests/test_risk.py` (14 tests)
- Circuit breaker boundaries and exact threshold triggers
- Fat-finger edge cases and percentage validation
- P&L calculation edge cases
- Order validation logic
##### `tests/test_broker.py` (11 tests)
- OAuth token lifecycle
- Rate limiting enforcement
- Hash key generation
- Network error handling
- SSL context configuration
> **Note**: 아래 파일별 테스트 수는 릴리즈 시점 스냅샷이며 실제 수치와 다를 수 있습니다. 현재 정확한 수치는 `pytest --collect-only -q`로 확인하세요.
##### `tests/test_brain.py` (24 tests)
- Valid JSON parsing and markdown-wrapped JSON handling
- Malformed JSON fallback
- Missing fields handling
- Invalid action validation
- Confidence threshold enforcement
- Empty response handling
- Prompt construction for different markets
##### `tests/test_market_schedule.py` (24 tests)
- Market open/close logic
- Timezone handling (UTC, Asia/Seoul, America/New_York, etc.)
- DST (Daylight Saving Time) transitions
- Weekend handling and lunch break logic
- Multiple market filtering
- Next market open calculation
##### `tests/test_db.py` (3 tests)
- Database initialization and table creation
- Trade logging with all fields (market, exchange_code, decision_id)
- Query and retrieval operations
##### `tests/test_main.py` (37 tests)
- Trading loop orchestration
- Market iteration and stock processing
- Dashboard integration (`--dashboard` flag)
- Telegram command handler wiring
- Error handling and graceful shutdown
#### Strategy & Playbook (v2)
##### `tests/test_pre_market_planner.py` (37 tests)
- Pre-market playbook generation
- Gemini API integration for scenario creation
- Timeout handling and defensive playbook fallback
- Multi-market playbook generation
##### `tests/test_scenario_engine.py` (44 tests)
- Scenario matching against live market data
- Confidence scoring and threshold filtering
- Multiple scenario type handling
- Edge cases (no match, partial match, expired scenarios)
##### `tests/test_playbook_store.py` (23 tests)
- Playbook persistence to SQLite
- Date-based retrieval and market filtering
- Playbook status management (generated, active, expired)
- JSON serialization/deserialization
##### `tests/test_strategy_models.py` (33 tests)
- Pydantic model validation for scenarios, playbooks, decisions
- Field constraints and default values
- Serialization round-trips
#### Analysis & Scanning
##### `tests/test_volatility.py` (24 tests)
- ATR and RSI calculation accuracy
- Volume surge ratio computation
- Momentum scoring
- Breakout/breakdown pattern detection
- Market scanner watchlist management
##### `tests/test_smart_scanner.py` (13 tests)
- Python-first filtering pipeline
- RSI and volume ratio filter logic
- Candidate scoring and ranking
- Fallback to static watchlist (domestic) or dynamic universe (overseas)
#### Context & Memory
##### `tests/test_context.py` (18 tests)
- L1-L7 layer storage and retrieval
- Context key-value CRUD operations
- Timeframe-based queries
- Layer metadata management
##### `tests/test_context_scheduler.py` (5 tests)
- Periodic context aggregation scheduling
- Layer summarization triggers
#### Evolution & Review
##### `tests/test_evolution.py` (24 tests)
- Strategy optimization loop
- High-confidence losing trade analysis
- Generated strategy validation
##### `tests/test_daily_review.py` (10 tests)
- End-of-day review generation
- Trade performance summarization
- Context layer (L6_DAILY) integration
##### `tests/test_scorecard.py` (3 tests)
- Daily scorecard metrics calculation
- Win rate, P&L, confidence tracking
#### Notifications & Commands
##### `tests/test_telegram.py` (25 tests)
- Message sending and formatting
- Rate limiting (leaky bucket)
- Error handling (network timeout, invalid token)
- Auto-disable on missing credentials
- Notification types (trade, circuit breaker, fat-finger, market events)
##### `tests/test_telegram_commands.py` (31 tests)
- 9 command handlers (/help, /status, /positions, /report, /scenarios, /review, /dashboard, /stop, /resume)
- Long polling and command dispatch
- Authorization filtering by chat_id
- Command response formatting
#### Dashboard
##### `tests/test_dashboard.py` (14 tests)
- FastAPI endpoint responses (10 API routes)
- Status, playbook, scorecard, performance, context, decisions, scenarios, pnl/history, positions
- Query parameter handling (market, date, limit)
#### Performance & Quality
##### `tests/test_token_efficiency.py` (34 tests)
- Gemini token usage optimization
- Prompt size reduction verification
- Cache effectiveness
##### `tests/test_latency_control.py` (30 tests)
- API call latency measurement
- Rate limiter timing accuracy
- Async operation overhead
##### `tests/test_decision_logger.py` (9 tests)
- Decision audit trail completeness
- Context snapshot capture
- Outcome tracking (P&L, accuracy)
##### `tests/test_data_integration.py` (38 tests)
- External data source integration
- News API, market data, economic calendar
- Error handling for API failures
##### `tests/test_backup.py` (23 tests)
- Backup scheduler and execution
- Cloud storage (S3) upload
- Health monitoring
- Data export functionality
## Coverage Requirements
**Minimum coverage: 80%**
Check coverage:
```bash
pytest -v --cov=src --cov-report=term-missing
```
**Note:** `main.py` has lower coverage as it contains the main loop which is tested via integration/manual testing.
## Backtest Automation Gate
백테스트 관련 검증은 `scripts/backtest_gate.sh``.github/workflows/backtest-gate.yml`로 자동 실행된다.
- PR: 변경 파일 기준 `auto` 모드
- `feature/**` push: 변경 파일 기준 `auto` 모드
- Daily schedule: `full` 강제 실행
- Manual dispatch: `mode`(`auto|smoke|full`) 지정 가능
실행 기준:
- `src/analysis/`, `src/strategy/`, `src/strategies/`, `src/main.py`, `src/markets/`, `src/broker/`
- 백테스트 핵심 테스트 파일 변경
- `docs/ouroboros/` 변경
`auto` 모드에서 백테스트 민감 영역 변경이 없으면 게이트는 `skip` 처리되며 실패로 간주하지 않는다.
로컬 수동 실행:
```bash
bash scripts/backtest_gate.sh
BACKTEST_MODE=full bash scripts/backtest_gate.sh
BASE_REF=origin/feature/v3-session-policy-stream BACKTEST_MODE=auto bash scripts/backtest_gate.sh
```
## Test Configuration
### `pyproject.toml`
```toml
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
python_files = ["test_*.py"]
```
### `tests/conftest.py`
```python
@pytest.fixture
def settings() -> Settings:
"""Provide test settings with safe defaults."""
return Settings(
KIS_APP_KEY="test_key",
KIS_APP_SECRET="test_secret",
KIS_ACCOUNT_NO="12345678-01",
GEMINI_API_KEY="test_gemini_key",
MODE="paper",
DB_PATH=":memory:", # In-memory SQLite
CONFIDENCE_THRESHOLD=80,
ENABLED_MARKETS="KR",
)
```
## Writing New Tests
### Naming Convention
- Test files: `test_<module>.py`
- Test functions: `test_<feature>_<scenario>()`
- Use descriptive names that explain what is being tested
### Good Test Example
```python
async def test_send_order_with_market_price(broker, settings):
"""Market orders should use price=0 and ORD_DVSN='01'."""
# Arrange
stock_code = "005930"
order_type = "BUY"
quantity = 10
# Act
with patch.object(broker._session, 'post') as mock_post:
mock_post.return_value.__aenter__.return_value.status = 200
mock_post.return_value.__aenter__.return_value.json = AsyncMock(
return_value={"rt_cd": "0", "msg1": "OK"}
)
await broker.send_order(stock_code, order_type, quantity, price=0)
# Assert
call_args = mock_post.call_args
body = call_args.kwargs['json']
assert body['ORD_DVSN'] == '01' # Market order
assert body['ORD_UNPR'] == '0' # Price 0
```
### Test Checklist
- [ ] Test passes in isolation (`pytest tests/test_foo.py::test_bar -v`)
- [ ] Test has clear docstring explaining what it tests
- [ ] Arrange-Act-Assert structure
- [ ] Uses appropriate fixtures from conftest.py
- [ ] Mocks external dependencies (API calls, network)
- [ ] Tests edge cases and error conditions
- [ ] Doesn't rely on test execution order
## Running Tests
```bash
# All tests
pytest -v
# Specific file
pytest tests/test_risk.py -v
# Specific test
pytest tests/test_brain.py::test_parse_valid_json -v
# With coverage
pytest -v --cov=src --cov-report=term-missing
# Stop on first failure
pytest -x
# Verbose output with print statements
pytest -v -s
```
## CI/CD Integration
Tests run automatically on:
- Every commit to feature branches
- Every PR to main
- Scheduled daily runs
**Blocking conditions:**
- Test failures → PR blocked
- Coverage < 80% → PR blocked (warning only for main.py)
**Non-blocking:**
- `mypy --strict` errors (type hints encouraged but not enforced)
- `ruff check` warnings (must be acknowledged)