feat: implement Token Efficiency - Context optimization (issue #24) #28

Merged

jihoson merged 2 commits from feature/issue-24-token-efficiency into main

2026-02-04 18:39:20 +09:00

Author	SHA1	Message	Date

agentson

61f5aaf4a3

fix: resolve linting issues in token efficiency implementation

CI / test (pull_request) Has been cancelled

- Fix ambiguous variable names (l → layer)
- Remove unused imports and variables
- Organize import statements

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-04 18:35:55 +09:00

agentson

4f61d5af8e

feat: implement token efficiency optimization for issue #24

Implement comprehensive token efficiency system to reduce LLM costs:

- Add prompt_optimizer.py: Token counting, compression, abbreviations
- Add context_selector.py: Smart L1-L7 context layer selection
- Add summarizer.py: Historical data aggregation and summarization
- Add cache.py: TTL-based response caching with hit rate tracking
- Enhance gemini_client.py: Integrate optimization, caching, metrics
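
For reviewers who want a concrete picture of the "token counting, compression, abbreviations" step, here is a minimal sketch. The actual contents of prompt_optimizer.py are not shown on this page, so the abbreviation table, the function names, and the 4-characters-per-token heuristic below are all assumptions made for illustration.

```python
import re

# Hypothetical abbreviation table; the real mapping is not shown in this PR page.
ABBREVIATIONS = {
    "current price": "px",
    "volume": "vol",
    "confidence": "conf",
}


def estimate_tokens(text: str) -> int:
    """Rough estimate assuming ~4 characters per token (a heuristic, not a tokenizer)."""
    return max(1, len(text) // 4) if text else 0


def compress_prompt(prompt: str) -> str:
    """Swap known phrases for abbreviations, then collapse runs of whitespace."""
    out = prompt
    for phrase, abbr in ABBREVIATIONS.items():
        out = re.sub(rf"\b{re.escape(phrase)}\b", abbr, out, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", out).strip()


if __name__ == "__main__":
    raw = "Current price   is 100, volume is high"
    small = compress_prompt(raw)
    print(small, estimate_tokens(raw), estimate_tokens(small))
```

The model has to be told what the abbreviations mean (for example in a short legend in the system prompt), otherwise the savings can come at the cost of answer quality.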

Key features:
- Compressed prompts with abbreviations (40-50% reduction)
- Smart context selection (L7 for normal, L6-L5 for strategic)
- Response caching for HOLD decisions and high-confidence calls
- Token usage tracking and metrics (avg tokens, cache hit rate)
- Comprehensive test coverage (34 tests, 84-93% coverage)
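
The response caching feature listed above can be sketched roughly as follows. This is not the code in cache.py; the class name, the `should_cache` helper, and the 0.9 confidence threshold are hypothetical, chosen only to show how a TTL cache with hit rate tracking might gate on HOLD decisions and high-confidence calls.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds (illustrative only)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._store = {}     # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is not None:
            expires_at, value = entry
            if self._clock() < expires_at:
                self.hits += 1
                return value
            del self._store[key]  # expired entry: drop it and count a miss
        self.misses += 1
        return None

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock() + self._ttl, value)

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def should_cache(action: str, confidence: float, threshold: float = 0.9) -> bool:
    """Assumed policy: cache HOLD decisions and any call at or above the threshold."""
    return action == "HOLD" or confidence >= threshold
```

Passing the clock as a parameter keeps expiry testable without sleeping; a production cache would probably also bound its size.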

Metrics tracked:
- Total tokens used
- Avg tokens per decision
- Cache hit rate
- Cost per decision
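
The four metrics above can be derived from a handful of counters. The sketch below is an assumption about how they might relate, not the tracking code in gemini_client.py; the `TokenMetrics` name, its fields, and the per-1k-token price are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TokenMetrics:
    """Running counters for token usage (illustrative, not the PR's implementation)."""

    total_tokens: int = 0
    decisions: int = 0
    cache_hits: int = 0
    cost_per_1k_tokens: float = 0.0  # hypothetical price; set from your provider's rates

    def record_decision(self, tokens_used: int, cache_hit: bool) -> None:
        self.decisions += 1
        if cache_hit:
            self.cache_hits += 1  # assumed: a cached answer spends no new tokens
        else:
            self.total_tokens += tokens_used

    @property
    def avg_tokens_per_decision(self) -> float:
        return self.total_tokens / self.decisions if self.decisions else 0.0

    @property
    def cache_hit_rate(self) -> float:
        return self.cache_hits / self.decisions if self.decisions else 0.0

    @property
    def cost_per_decision(self) -> float:
        if not self.decisions:
            return 0.0
        return (self.total_tokens / 1000 * self.cost_per_1k_tokens) / self.decisions
```

Averaging over all decisions, including cache hits, makes the effect of caching visible in the average; averaging only over LLM calls would be an equally defensible choice.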

All tests passing (191 total, 76% overall coverage).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2026-02-04 18:09:51 +09:00

2 Commits