| 1 | +# Claude Parser API Layering Audit |
| 2 | + |
| 3 | +## Current State Analysis |
| 4 | + |
| 5 | +### 🔴 Problems Identified |
| 6 | + |
| 7 | +1. **Low-level utilities exposed as public API:** |
| 8 | + - `find_message_by_uuid` - Implementation detail |
| 9 | + - `restore_file_content`, `backup_file` - Too low-level |
| 10 | + - `compare_files`, `generate_file_diff` - Not useful without surrounding session context |
| 11 | + |
| 12 | +2. **High-level utilities NOT exposed:** |
| 13 | + - `get_message_content()` - Needed for safe extraction |
| 14 | + - `filter_pure_conversation()` - Useful for UIs |
| 15 | + - `get_latest_assistant_message()` - Common UI need |
| 16 | + |
| 17 | +3. **Mixed abstraction levels:** |
| 18 | + - Some domains expose everything (tokens) |
| 19 | + - Some expose nothing useful (messages) |
| 20 | + |
| 21 | +## Proposed API Structure |
| 22 | + |
| 23 | +### Core Layer (`claude_parser.core`) |
| 24 | +Low-level building blocks for advanced users and library builders. |
| 25 | + |
| 26 | +#### Core Messages |
| 27 | +```python |
| 28 | +from claude_parser.core.messages import ( |
| 29 | + get_message_content, # Safe content extraction |
| 30 | + get_text, # Text from any message format |
| 31 | + get_token_usage, # Extract usage metadata |
| 32 | + is_tool_operation, # Check for tool use |
| 33 | +) |
| 34 | +``` |
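As a rough illustration of what "safe content extraction" could mean, the sketch below handles the two message shapes that appear in the boilerplate later in this document: a top-level `content` key, or `content` nested under `message`. The function body is an assumption, not the library's actual implementation, and the list-of-blocks format (`{"type": "text", "text": ...}`) is hypothetical.

```python
from typing import Any


def get_message_content(msg: dict[str, Any]) -> str:
    """Return message text without raising on missing or oddly shaped fields.

    Hypothetical sketch: the real function may support more formats.
    """
    content = msg.get("content")
    if content is None:
        nested = msg.get("message")
        content = nested.get("content") if isinstance(nested, dict) else None

    if isinstance(content, str):
        return content
    if isinstance(content, list):
        # Assumed block format: [{"type": "text", "text": "..."}, ...]
        parts = [
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        ]
        return "\n".join(parts)
    # Unknown or missing content: fall back to an empty string.
    return ""
```

Returning an empty string instead of raising keeps callers free of `try`/`except` blocks, at the cost of hiding malformed records; a stricter variant could log them.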
| 35 | + |
| 36 | +#### Core Filtering |
| 37 | +```python |
| 38 | +from claude_parser.core.filtering import ( |
| 39 | + filter_messages_by_type, |
| 40 | + filter_messages_by_tool, |
| 41 | + filter_pure_conversation, |
| 42 | + exclude_tool_operations, |
| 43 | +) |
| 44 | +``` |
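One possible reading of `filter_pure_conversation` is "keep user and assistant messages that carry no tool blocks". The sketch below follows that reading. The `type` values, the `tool_use`/`tool_result` block types, and the helper `_content_blocks` are assumptions made for illustration, not confirmed library behavior.

```python
from typing import Any

# Assumed values of msg["type"] that count as conversation turns.
CONVERSATION_TYPES = {"user", "assistant"}
# Assumed block types that mark a tool call or its result.
TOOL_BLOCK_TYPES = {"tool_use", "tool_result"}


def _content_blocks(msg: dict[str, Any]) -> list[Any]:
    """Hypothetical helper: return the content block list, or [] if absent."""
    nested = msg.get("message")
    content = nested.get("content") if isinstance(nested, dict) else msg.get("content")
    return content if isinstance(content, list) else []


def is_tool_operation(msg: dict[str, Any]) -> bool:
    """True if any content block is a tool call or tool result."""
    return any(
        isinstance(block, dict) and block.get("type") in TOOL_BLOCK_TYPES
        for block in _content_blocks(msg)
    )


def filter_pure_conversation(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only user/assistant messages that contain no tool blocks."""
    return [
        msg
        for msg in messages
        if msg.get("type") in CONVERSATION_TYPES and not is_tool_operation(msg)
    ]
```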
| 45 | + |
| 46 | +#### Core Navigation |
| 47 | +```python |
| 48 | +from claude_parser.core.navigation import ( |
| 49 | + find_message_by_uuid, |
| 50 | + get_message_sequence, |
| 51 | + find_current_checkpoint, |
| 52 | +) |
| 53 | +``` |
| 54 | + |
| 55 | +#### Core Storage |
| 56 | +```python |
| 57 | +from claude_parser.core.storage import ( |
| 58 | + query_jsonl, # Raw DuckDB access |
| 59 | + get_engine, # Direct engine access |
| 60 | +) |
| 61 | +``` |
| 62 | + |
| 63 | +### Feature Layer (`claude_parser`) |
| 64 | +High-level, UI-ready functions that use core internally. |
| 65 | + |
| 66 | +#### Conversation Display |
| 67 | +```python |
| 68 | +from claude_parser import ( |
| 69 | + get_conversation_for_display, # Returns UI-ready messages |
| 70 | + get_conversation_summary, # Quick stats |
| 71 | + format_message_for_ui, # Single message formatting |
| 72 | +) |
| 73 | + |
| 74 | +# One-liner for UI: |
| 75 | +messages = get_conversation_for_display(session) |
| 76 | +# Returns: [ |
| 77 | +# {'text': '...', 'role': 'user', 'timestamp': '...', 'has_tools': False}, |
| 78 | +# {'text': '...', 'role': 'assistant', 'timestamp': '...', 'tools_used': ['Bash']} |
| 79 | +# ] |
| 80 | +``` |
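To make the intended shape of a feature-layer function more concrete, here is a minimal sketch of how `get_conversation_for_display` might be assembled. It assumes `session` is a dict with a `messages` list (as in the boilerplate example in this document). The helper `_extract`, the block format, and the choice to emit both `has_tools` and `tools_used` on every message are assumptions for illustration only.

```python
from typing import Any


def _extract(msg: dict[str, Any]) -> tuple[str, list[str]]:
    """Hypothetical helper: return (text, tool names) for one raw message."""
    content = msg.get("content")
    nested = msg.get("message")
    if content is None and isinstance(nested, dict):
        content = nested.get("content")
    if isinstance(content, str):
        return content, []
    raw = content if isinstance(content, list) else []
    blocks = [b for b in raw if isinstance(b, dict)]
    text = "\n".join(b.get("text", "") for b in blocks if b.get("type") == "text")
    tools = [b.get("name", "unknown") for b in blocks if b.get("type") == "tool_use"]
    return text, tools


def get_conversation_for_display(session: dict[str, Any]) -> list[dict[str, Any]]:
    """Return UI-ready dicts for user/assistant messages that have text."""
    display = []
    for msg in session.get("messages", []):
        if msg.get("type") not in {"user", "assistant"}:
            continue
        text, tools = _extract(msg)
        if not text:
            continue  # skip tool-only or empty messages
        display.append({
            "text": text,
            "role": msg["type"],
            "timestamp": msg.get("timestamp", ""),
            "has_tools": bool(tools),
            "tools_used": tools,
        })
    return display
```

In the layered design, `_extract` would be replaced by calls into `claude_parser.core.messages`, which is what would keep the feature function itself short.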
| 81 | + |
| 82 | +#### File Operations Display |
| 83 | +```python |
| 84 | +from claude_parser import ( |
| 85 | + get_file_changes_at_point, # Get diff at specific message |
| 86 | + get_modified_files_list, # List all changed files |
| 87 | + get_file_history, # Timeline of file changes |
| 88 | +) |
| 89 | + |
| 90 | +# One-liner for diff UI: |
| 91 | +changes = get_file_changes_at_point(session, uuid) |
| 92 | +# Returns: { |
| 93 | +# 'file': 'main.py', |
| 94 | +# 'before': '...', |
| 95 | +# 'after': '...', |
| 96 | +# 'diff_html': '...', # Pre-formatted for display |
| 97 | +# } |
| 98 | +``` |
| 99 | + |
| 100 | +#### Analytics Dashboard |
| 101 | +```python |
| 102 | +from claude_parser import ( |
| 103 | + get_session_analytics, # Complete dashboard data |
| 104 | + get_token_usage_summary, # Token counts with costs |
| 105 | + get_tool_usage_report, # Tool frequency analysis |
| 106 | +) |
| 107 | + |
| 108 | +# One-liner for dashboard: |
| 109 | +analytics = get_session_analytics(session) |
| 110 | +# Returns: { |
| 111 | +# 'messages': {'total': 45, 'user': 20, 'assistant': 25}, |
| 112 | +# 'tokens': {'used': 6600, 'cost': 0.05, 'remaining': 193400}, |
| 113 | +# 'tools': {'Bash': 10, 'Write': 5}, |
| 114 | +# 'duration': '25 minutes' |
| 115 | +# } |
| 116 | +``` |
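The `tools` entry in the analytics result could be produced by a simple frequency count. The sketch below shows one way to do that with `collections.Counter`. The function name `count_tool_usage` is a hypothetical helper that something like `get_tool_usage_report` might delegate to, and the `tool_use` block format with a `name` field is an assumption.

```python
from collections import Counter
from typing import Any


def count_tool_usage(messages: list[dict[str, Any]]) -> dict[str, int]:
    """Count tool calls by tool name across all messages (assumed block format)."""
    counts: Counter[str] = Counter()
    for msg in messages:
        nested = msg.get("message")
        content = nested.get("content") if isinstance(nested, dict) else None
        if not isinstance(content, list):
            continue  # plain-text or malformed message: no tool blocks
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                counts[block.get("name", "unknown")] += 1
    return dict(counts)
```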
| 117 | + |
| 118 | +## Migration Plan |
| 119 | + |
| 120 | +### Phase 1: Create Core Package |
| 121 | +- Move internal functions to `claude_parser/core/` |
| 122 | +- Keep backward compatibility with deprecation warnings |
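
One way the backward-compatibility step could work is to leave the old top-level names in place as thin wrappers that warn and then forward to the relocated function. The sketch below uses only the standard library. `deprecated_alias` and `_find_message_by_uuid` are hypothetical names, and the stand-in function body is a placeholder for whatever the real implementation does.

```python
import functools
import warnings


def deprecated_alias(func, old_path: str, new_path: str):
    """Wrap func so that calling it via the old name emits a DeprecationWarning."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_path} is deprecated; import it from {new_path} instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this wrapper
        )
        return func(*args, **kwargs)

    return wrapper


# Stand-in for the function after it moves to claude_parser.core.navigation.
def _find_message_by_uuid(messages, uuid):
    return next((m for m in messages if m.get("uuid") == uuid), None)


# What the old top-level export could point to during the deprecation window.
find_message_by_uuid = deprecated_alias(
    _find_message_by_uuid,
    "claude_parser.find_message_by_uuid",
    "claude_parser.core.navigation.find_message_by_uuid",
)
```

An alternative is a module-level `__getattr__` in `__init__.py`, which warns at import time instead of call time; the wrapper approach is shown here because it is easier to test in isolation.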
| 123 | + |
| 124 | +### Phase 2: Build Feature Layer |
| 125 | +- Implement high-level functions using core |
| 126 | +- Each feature function should be <20 LOC |
| 127 | +- Delegate all logic to core functions and existing libraries instead of reimplementing it |
| 128 | + |
| 129 | +### Phase 3: Update Public API |
| 130 | +- Update `__init__.py` to expose both layers |
| 131 | +- Documentation with clear examples |
| 132 | +- Deprecate old mixed-level exports |
| 133 | + |
| 134 | +## UI Projects Benefit |
| 135 | + |
| 136 | +With this structure, UI projects like claude-explorer can replace hand-written extraction boilerplate with a single call: |
| 137 | + |
| 138 | +```python |
| 139 | +# Instead of this (current): |
| 140 | +messages = session.get('messages', []) |
| 141 | +filtered = filter_pure_conversation(messages) |
| 142 | +formatted = [] |
| 143 | +for msg in filtered: |
| 144 | + content = msg.get('content', msg.get('message', {}).get('content', '')) |
| 145 | + if content: |
| 146 | + formatted.append({ |
| 147 | + 'text': content, |
| 148 | + 'type': msg.get('type'), |
| 149 | + # ... more boilerplate |
| 150 | + }) |
| 151 | + |
| 152 | +# They get this (new): |
| 153 | +messages = get_conversation_for_display(session) # Done! |
| 154 | +``` |
| 155 | + |
| 156 | +## Next Steps |
| 157 | + |
| 158 | +1. ✅ Audit complete - we know what goes where |
| 159 | +2. ⏳ Create `core/` package structure |
| 160 | +3. ⏳ Implement feature layer functions |
| 161 | +4. ⏳ Update documentation |
| 162 | +5. ⏳ Release as v3.0.0 (breaking change) |