Convert entire codebase from dataclasses to Pydantic models #102

@tunahorse

Summary

Convert all dataclasses throughout the TunaCode codebase to Pydantic models for better validation, maintainability, and error handling. This is a systematic improvement to eliminate anti-patterns and improve code quality across the entire project.

Current Anti-Patterns Found

The codebase has several anti-patterns related to data validation:

1. Manual Type Checking

# Found in models_registry.py - manual validation instead of proper framework
cost = ModelCost(
    input=cost_data.get("input") if isinstance(cost_data, dict) else None,
    output=cost_data.get("output") if isinstance(cost_data, dict) else None,
)

2. Silent Error Handling

# Silent failures that hide problems
except (URLError, json.JSONDecodeError, OSError):
    return False  # No logging or error context

3. Complex Manual Parsing Logic

  • Fragile parsing that's hard to maintain
  • No centralized validation rules
  • Silent data loss on malformed input

Benefits of Pydantic Conversion

1. Automatic Validation

  • Type checking with clear error messages
  • Data coercion (strings to numbers, etc.)
  • Custom validation rules
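
As a rough sketch of what this could look like for the cost model shown under "Manual Type Checking": the field names `input` and `output` come from that snippet, but the `Optional[float]` types and the non-negative rule are assumptions made for illustration, not the actual definitions in models_registry.py.

```python
from typing import Optional

from pydantic import BaseModel, field_validator


class ModelCost(BaseModel):
    # Field names follow the snippet in this issue; the types are assumed.
    input: Optional[float] = None
    output: Optional[float] = None

    @field_validator("input", "output")
    @classmethod
    def must_be_non_negative(cls, value: Optional[float]) -> Optional[float]:
        # Hypothetical custom rule: a cost should never be negative.
        if value is not None and value < 0:
            raise ValueError("cost must be non-negative")
        return value


# In Pydantic's default (lax) mode, numeric strings and ints are coerced to float.
cost = ModelCost.model_validate({"input": "0.5", "output": 1})
print(cost.input, cost.output)
```

With a model like this, the `isinstance(cost_data, dict)` checks at the call site become unnecessary, because `model_validate` rejects non-dict input with a `ValidationError`.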

2. Better Error Handling

  • Detailed ValidationError messages
  • Clear indication of what went wrong
  • Fail-fast approach instead of silent failures
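
A minimal sketch of the fail-fast idea, assuming the same hypothetical `ModelCost` shape as above. `load_cost` is an illustrative helper name, not an existing function in the codebase.

```python
import logging
from typing import Any, Optional

from pydantic import BaseModel, ValidationError

logger = logging.getLogger(__name__)


class ModelCost(BaseModel):
    # Field names come from the snippet in this issue; the types are assumed.
    input: Optional[float] = None
    output: Optional[float] = None


def load_cost(cost_data: Any) -> ModelCost:
    """Hypothetical helper: validate raw cost data, logging before re-raising."""
    try:
        return ModelCost.model_validate(cost_data)
    except ValidationError as exc:
        # Each entry names the failing field ("loc") and the reason ("msg").
        for err in exc.errors():
            logger.error("invalid cost data at %s: %s", err["loc"], err["msg"])
        raise  # fail fast instead of returning a silent default
```

Compared with `return False`, the caller gets an exception that says which field failed and why, and the log keeps that context.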

3. Serialization/Deserialization

  • Automatic JSON conversion
  • Schema generation for documentation
  • Better API integration

4. Maintainability

  • Less boilerplate validation code
  • Single source of truth for data models
  • Easier to extend and modify

Files to Convert (Based on Initial Analysis)

High Priority

  • src/tunacode/utils/models_registry.py - Model and provider data structures
  • src/tunacode/types.py - Core type definitions
  • src/tunacode/core/agents/ - Agent configuration and state models
  • src/tunacode/tools/ - Tool request/response models

Medium Priority

  • src/tunacode/cli/ - CLI configuration models
  • src/tunacode/services/ - Service data models
  • Test fixtures and mock data structures

Implementation Approach

Phase 1: Core Models

  1. Convert models_registry.py dataclasses (already identified)
  2. Update core type definitions in types.py
  3. Convert agent configuration models

Phase 2: Tool and Service Models

  1. Convert tool request/response models
  2. Update service data structures
  3. Convert CLI configuration models

Phase 3: Testing and Validation

  1. Update all test fixtures
  2. Add validation tests
  3. Ensure backward compatibility

Technical Considerations

Dependencies

  • Pydantic 2.11.7 is already installed as a dependency of pydantic-ai
  • No additional dependencies required

Migration Strategy

  • Maintain backward compatibility where possible
  • Update imports systematically
  • Add comprehensive tests for validation scenarios
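
One compatibility point worth checking during migration: `BaseModel` only accepts keyword arguments, so any call site that builds a dataclass positionally would break. Where that matters, `pydantic.dataclasses.dataclass` may work as an intermediate step, since it keeps the dataclass constructor while adding validation. A sketch, with `ModelCost` fields assumed as before:

```python
import dataclasses
from typing import Optional

from pydantic.dataclasses import dataclass


@dataclass
class ModelCost:
    # Assumed fields, mirroring the snippet in this issue.
    input: Optional[float] = None
    output: Optional[float] = None


# Positional construction still works, and the values are validated and coerced.
cost = ModelCost("0.5", 2)

# It is still a standard dataclass, so stdlib helpers such as asdict keep working.
print(dataclasses.asdict(cost))
```

Whether this is preferable to a full `BaseModel` conversion depends on how many call sites rely on positional construction or `dataclasses` helpers.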

Performance Impact

  • Validation overhead should be minimal for a CLI use case
  • Validation and error-reporting benefits are expected to outweigh the performance cost
  • Can optimize specific cases if needed
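
If a hot path ever shows up in profiling, one option Pydantic v2 provides is `model_construct`, which builds an instance without running validation. This is only a sketch with an assumed `ModelCost` shape, and it is only appropriate for data that has already been validated or is otherwise trusted.

```python
from typing import Optional

from pydantic import BaseModel


class ModelCost(BaseModel):
    # Assumed fields, mirroring the snippet in this issue.
    input: Optional[float] = None
    output: Optional[float] = None


# Normal path: input is validated and coerced.
validated = ModelCost.model_validate({"input": "0.5"})

# Fast path: no validation or coercion, values are stored exactly as given.
trusted = ModelCost.model_construct(input=0.5, output=1.5)
unchecked = ModelCost.model_construct(input="0.5")  # stays a str, by design
```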

Success Criteria

  • All dataclasses converted to Pydantic models
  • Manual validation code eliminated
  • Better error messages throughout
  • All tests passing
  • No breaking changes to public API
  • Documentation updated where needed

Additional Context

This conversion addresses several technical debt items and anti-patterns identified in the codebase. The systematic approach will improve code quality, maintainability, and developer experience across the entire project.

Pydantic is already present in the project as a dependency of pydantic-ai, so this conversion builds on existing infrastructure and follows established patterns.
