---
description: Comprehensive best practices and coding standards for utilizing Pydantic effectively in Python projects, covering code organization, performance, security, and testing.
globs: **/*.py
---

- **Model Definition:**
  - Use `BaseModel` to define data schemas with type annotations for clarity and automatic validation.
  - Prefer simple models that encapsulate a single concept to maintain readability and manageability.
  - Use nested models for complex data structures while ensuring each model has clear validation rules.
  - Set `model_config = ConfigDict(...)` on a model when you need to change its behavior; the nested `Config` class is the Pydantic v1 style and is deprecated in v2.
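
The points above can be sketched as follows, assuming Pydantic v2, where model-level settings are given through `model_config`. The `Address` and `Customer` models and their fields are illustrative names, not part of any real schema.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class Address(BaseModel):
    """One concept per model: a postal address."""

    street: str
    city: str
    zip_code: Optional[str] = None


class Customer(BaseModel):
    # Model-level behavior is configured through model_config in Pydantic v2.
    model_config = ConfigDict(str_strip_whitespace=True, extra="forbid")

    id: int = Field(gt=0)
    name: str = Field(min_length=1)
    address: Address  # nested model, validated with its own rules


customer = Customer(
    id="7",  # lax mode coerces the numeric string to an int
    name="  Ada  ",
    address={"street": "1 Main St", "city": "Anytown"},
)
print(customer.name)           # surrounding whitespace has been stripped
print(type(customer.address))  # the dict was validated into an Address
```

Keeping `Address` separate means its rules are defined once and reused wherever an address appears.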

- **Validation and Error Handling:**
  - Implement built-in and custom validators to enforce data integrity.
  - Use `@field_validator` for field-specific rules and `@model_validator` for cross-field validation (`@root_validator` is the deprecated v1 equivalent).
  - Ensure that validation errors are user-friendly and logged for debugging purposes.
  - Custom error messages should be informative and guide the user on how to correct the data.
  - Catch `ValidationError` at the point where you can report or recover from invalid data.
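
A minimal sketch of these rules, assuming Pydantic v2, where cross-field checks are written with `@model_validator`. The `Signup` model and the `describe_errors` helper are hypothetical names used only for illustration.

```python
from pydantic import BaseModel, ValidationError, field_validator, model_validator


class Signup(BaseModel):
    username: str
    password: str
    password_confirm: str

    @field_validator("username")
    @classmethod
    def username_is_alphanumeric(cls, value: str) -> str:
        # Field-specific rule: raise ValueError and let Pydantic wrap it.
        if not value.isalnum():
            raise ValueError("username may contain only letters and digits")
        return value

    @model_validator(mode="after")
    def passwords_match(self) -> "Signup":
        # Cross-field rule: runs once all fields have validated successfully.
        if self.password != self.password_confirm:
            raise ValueError("password and password_confirm must be identical")
        return self


def describe_errors(exc: ValidationError) -> list[str]:
    """Turn Pydantic's error list into short, user-facing lines."""
    lines = []
    for err in exc.errors():
        # Model-level errors have an empty location; label them "form".
        location = ".".join(str(part) for part in err["loc"]) or "form"
        lines.append(f"{location}: {err['msg']}")
    return lines


try:
    Signup(username="bad name!", password="a", password_confirm="b")
    messages = []
except ValidationError as exc:
    messages = describe_errors(exc)

print(messages)
```

In a real service the same list would typically be logged as well as returned to the caller.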

- **Performance Optimization:**
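  - Consider lazy initialization, and avoid re-validating data that is already trusted; `model_construct()` builds an instance without running validation.
  - Use configuration options such as `validate_assignment` and `validate_default` to control when validation runs. Skipping unnecessary validation passes can matter in high-throughput applications.
  - Call `model_rebuild()` when a model refers to types that were not yet defined when the class was created (forward references); it rebuilds the model's schema once those types exist.
  - Model configuration has no `__slots__` option. If per-instance memory matters, consider a Pydantic dataclass declared with `slots=True` (Python 3.10+).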
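
As a rough illustration of skipping validation for trusted data, the sketch below compares `model_validate` with `model_construct` in Pydantic v2. The `Reading` model is a made-up example, and which data counts as "trusted" is an assumption each application has to make for itself.

```python
from pydantic import BaseModel


class Reading(BaseModel):
    sensor_id: int
    value: float


# Untrusted input: full validation (and lax-mode coercion) runs.
validated = Reading.model_validate({"sensor_id": "3", "value": "21.5"})

# Trusted input (for example, rows this service wrote earlier):
# model_construct builds the instance without running any validation.
trusted = Reading.model_construct(sensor_id=3, value=21.5)

# The trade-off: model_construct will not catch bad data.
unchecked = Reading.model_construct(sensor_id="not-an-int", value=None)

print(validated.sensor_id, trusted.sensor_id)
print(repr(unchecked.sensor_id))  # the wrong type passes through unchanged
```

Because nothing is checked, `model_construct` is only appropriate where the data has already passed validation somewhere upstream.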

- **Code Organization and Structure:**
  - **Directory Structure:**
    - Adopt a modular structure: `src/`, `tests/`, `docs/`.
    - Models can reside in `src/models/`.
    - Validators in `src/validators/`.
    - Example:
      ```
      project_root/
      ├── src/
      │   ├── __init__.py
      │   ├── models/
      │   │   ├── __init__.py
      │   │   ├── user.py
      │   │   ├── item.py
      │   ├── validators/
      │   │   ├── __init__.py
      │   │   ├── user_validators.py
      │   ├── main.py  # Application entry point
      ├── tests/
      │   ├── __init__.py
      │   ├── test_user.py
      │   ├── test_item.py
      ├── docs/
      │   ├── ...
      ├── .env
      ├── pyproject.toml
      ├── README.md
      ```
  - **File Naming:**
    - Use snake_case for file names (e.g., `user_validators.py`).
    - Name model files after the primary model they define (e.g., `user.py` for `User`).
  - **Module Organization:**
    - Group related models and validators into separate modules.
    - Utilize `__init__.py` to make modules importable.
  - **Component Architecture:**
    - Employ a layered architecture (e.g., data access, business logic, presentation).
    - Pydantic models are most useful at layer boundaries, where data enters or leaves the application (data access, request and response payloads, configuration).
  - **Code Splitting:**
    - Split large models into smaller, manageable components using composition.
    - Leverage inheritance judiciously.
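
One way to read the code-splitting advice is sketched below: a single shallow base class carries shared configuration, and larger models are assembled from smaller ones. All class names are hypothetical; in the layout above each would live in its own module under `src/models/`.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class AppModel(BaseModel):
    """Hypothetical shared base: one level of inheritance, used only for config."""

    model_config = ConfigDict(extra="forbid")


class ContactInfo(AppModel):
    email: str
    phone: Optional[str] = None


class ShippingPreferences(AppModel):
    express: bool = False
    gift_wrap: bool = False


class Account(AppModel):
    # Composition: Account is built from smaller models instead of
    # declaring every field inline or inheriting from several parents.
    username: str
    contact: ContactInfo
    shipping: ShippingPreferences = Field(default_factory=ShippingPreferences)


account = Account(username="ada", contact={"email": "ada@example.com"})
print(account.model_dump())
```

Each component can then be validated and tested on its own, and `Account` stays short.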

- **Common Patterns and Anti-patterns:**
  - **Design Patterns:**
    - **Data Transfer Object (DTO):** Pydantic models serve as DTOs.
    - **Factory Pattern:** Create model instances using factory functions for complex initialization.
    - **Repository Pattern:** Use repositories to abstract data access and validation logic.
  - **Recommended Approaches:**
    - Centralize validation logic in dedicated validator functions.
    - Utilize nested models for complex data structures.
    - Use `BaseSettings` for managing application settings; in Pydantic v2 it lives in the separate `pydantic-settings` package.
  - **Anti-patterns:**
    - Embedding business logic directly into models.
    - Overly complex inheritance hierarchies.
    - Ignoring validation errors.
    - Performing I/O operations within validator functions.
  - **State Management:**
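    - Prefer immutable models (`model_config = ConfigDict(frozen=True)`) where possible to simplify state management.
    - `attrs` and the standard-library `dataclasses` are plain data-class tools rather than state management libraries; they suit internal state that needs no validation, alongside Pydantic models at the points where data must be validated.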
  - **Error Handling:**
    - Inside validators, raise `ValueError` (or use `assert`) when a check fails; Pydantic collects these and raises a single `ValidationError` to the caller.
    - Provide informative error messages.
    - Log validation errors for debugging.
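
The sketch below combines two of the recommendations above: a validation rule kept in one dedicated function and attached to a reusable type, and an immutable model. It assumes Pydantic v2; `normalize_sku`, `Sku` and `Item` are hypothetical names.

```python
from typing import Annotated

from pydantic import AfterValidator, BaseModel, ConfigDict, ValidationError


def normalize_sku(value: str) -> str:
    """Hypothetical shared rule: SKUs are upper-case and contain a dash."""
    cleaned = value.strip().upper()
    if "-" not in cleaned:
        raise ValueError("SKU must look like 'ABC-123'")
    return cleaned


# Attach the rule to a type once; any model can then reuse it.
Sku = Annotated[str, AfterValidator(normalize_sku)]


class Item(BaseModel):
    model_config = ConfigDict(frozen=True)  # instances are immutable

    sku: Sku
    quantity: int = 1


item = Item(sku=" abc-123 ")

# model_copy returns a new object and leaves the original untouched.
# Note that values passed through update= are not validated.
restocked = item.model_copy(update={"quantity": 5})

try:
    item.quantity = 10  # frozen models reject attribute assignment
    assignment_rejected = False
except ValidationError:
    assignment_rejected = True

print(item.sku, item.quantity, restocked.quantity, assignment_rejected)
```

Because the rule lives in `normalize_sku`, it can be unit-tested without constructing any model.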

- **Performance Considerations:**
  - **Optimization Techniques:**
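    - Call `model_rebuild()` to rebuild a model's schema once previously undefined (forward-referenced) types become available.
    - Model configuration has no `__slots__` option; if per-instance memory matters, consider a Pydantic dataclass declared with `slots=True` (Python 3.10+).
    - Use `functools.cached_property` to cache expensive derived values on a model instance.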
  - **Memory Management:**
    - Be mindful of large lists or dictionaries within models, as they can consume significant memory.
    - Use generators or iterators for processing large datasets.
  - **Efficient Data Parsing:**
    - Use `model_validate_json` to parse and validate JSON text in one step, and `model_validate` for data that is already a Python object.
  - **Controlling Validation:**
    - Use the `validate_default` and `validate_assignment` options in `model_config` to control when validation occurs.
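
The following sketch puts several of these options together, assuming Pydantic v2. The `Order` model and the JSON payload are invented for illustration, and no timing claims are made here; measure in your own workload.

```python
from functools import cached_property

from pydantic import BaseModel, ConfigDict, TypeAdapter, ValidationError


class Order(BaseModel):
    # validate_assignment re-runs field validation on attribute assignment.
    model_config = ConfigDict(validate_assignment=True)

    order_id: int
    prices: list[float]

    @cached_property
    def total(self) -> float:
        # Computed on first access, then cached on the instance.
        return sum(self.prices)


raw = '{"order_id": 1, "prices": [9.5, 0.5]}'

# Parse and validate the JSON text in a single step.
order = Order.model_validate_json(raw)
print(order.total)

try:
    order.order_id = "not a number"
    assignment_rejected = False
except ValidationError:
    assignment_rejected = True
print(assignment_rejected)

# TypeAdapter validates a whole JSON array of models in one call.
orders = TypeAdapter(list[Order]).validate_json(f"[{raw}, {raw}]")
print(len(orders))
```

`validate_assignment` adds work to every attribute write, so it is a trade-off between safety and speed rather than a pure optimization.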

- **Security Best Practices:**
  - **Common Vulnerabilities:**
    - Injection attacks (e.g., SQL injection) if model data is used directly in database queries.
    - Cross-site scripting (XSS) if model data is displayed in web pages without proper escaping.
    - Deserialization vulnerabilities if models are deserialized from untrusted sources.
  - **Input Validation:**
    - Always validate all incoming data using Pydantic models.
    - Use appropriate type annotations and constraints to restrict input values.
    - Sanitize input data to remove potentially harmful characters or sequences.
  - **Authentication and Authorization:**
    - Use authentication and authorization mechanisms to restrict access to sensitive data.
    - Implement role-based access control (RBAC) to grant different levels of access to different users.
  - **Data Protection:**
    - Encrypt sensitive data at rest and in transit.
    - Use secure storage mechanisms for storing API keys and other secrets.
    - Mask sensitive data in logs and error messages.
  - **Secure API Communication:**
    - Use HTTPS for all API communication.
    - Implement API rate limiting to prevent denial-of-service attacks.
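
A small sketch of input validation and data protection at the model level, assuming Pydantic v2. The `LoginRequest` model, its limits and the in-memory SQLite table are illustrative assumptions; validation complements parameterized queries and output escaping but does not replace them.

```python
import sqlite3

from pydantic import BaseModel, ConfigDict, Field, SecretStr


class LoginRequest(BaseModel):
    # Reject unexpected keys instead of silently ignoring them.
    model_config = ConfigDict(extra="forbid")

    # Constrain length and allowed characters rather than accepting any string.
    username: str = Field(min_length=3, max_length=32, pattern=r"^[a-zA-Z0-9_]+$")
    # SecretStr masks the value in repr/str output, e.g. in logs.
    password: SecretStr


request = LoginRequest(username="ada_l", password="correct horse")
print(request)  # the password is shown masked

# The raw secret is only available through an explicit call.
secret = request.password.get_secret_value()

# Even validated values go into SQL as bound parameters, never via string formatting.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", (request.username,))
stored = conn.execute("SELECT name FROM users").fetchall()
conn.close()
```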

- **Testing Approaches:**
  - **Unit Testing:**
    - Test individual models and validators in isolation.
    - Use parameterized tests to cover different input values and scenarios.
    - Verify that validation errors are raised correctly.
  - **Integration Testing:**
    - Test the interaction between models and other components of the application.
    - Use mock objects to simulate external dependencies.
  - **End-to-End Testing:**
    - Test the entire application flow from end to end.
    - Use automated testing tools to simulate user interactions.
  - **Test Organization:**
    - Organize tests into separate modules based on the component being tested.
    - Use descriptive test names to indicate the purpose of each test.
  - **Mocking and Stubbing:**
    - Use mock objects to simulate external dependencies such as databases or APIs.
    - Use stub objects to provide predefined responses for certain functions or methods.
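
As one possible shape for such unit tests, the sketch below uses the standard-library `unittest` module with `subTest` to cover several inputs per test; `pytest.mark.parametrize` would express the same idea. The `Product` model is hypothetical.

```python
import unittest

from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: str = Field(min_length=1)
    price: float = Field(gt=0)


class ProductModelTests(unittest.TestCase):
    def test_valid_inputs_are_accepted(self):
        cases = [
            {"name": "pen", "price": 1.5},
            {"name": "desk", "price": "120"},  # numeric string, coerced in lax mode
        ]
        for data in cases:
            with self.subTest(data=data):
                self.assertGreater(Product(**data).price, 0)

    def test_invalid_inputs_report_the_failing_field(self):
        cases = [
            ({"name": "", "price": 1.0}, "name"),
            ({"name": "pen", "price": -1}, "price"),
            ({"name": "pen"}, "price"),  # missing required field
        ]
        for data, bad_field in cases:
            with self.subTest(data=data):
                with self.assertRaises(ValidationError) as ctx:
                    Product(**data)
                # Check which field failed, not just that something failed.
                failing = {err["loc"][0] for err in ctx.exception.errors()}
                self.assertIn(bad_field, failing)
```

Saved as a test module (for example `tests/test_product.py`, a hypothetical path), this runs with `python -m unittest`.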

- **Common Pitfalls and Gotchas:**
  - **Frequent Mistakes:**
    - Misusing Union Types: A broad `Union` can match a member you did not intend. Where the members share a tag field, a discriminated union (`Field(discriminator=...)`) makes the choice explicit.
    - Optional Fields without Default Values: In Pydantic v2, `Optional[X]` only means the value may be `None`; the field is still required unless it has a default such as `= None`.
    - Incorrect Type Annotations: Assigning incorrect types to fields can cause validation to fail. For example, using `str` for a field that should be an `int`.
  - **Edge Cases:**
    - Handling complex validation logic with dependencies between fields.
    - Dealing with large or deeply nested data structures.
    - Handling different input formats (e.g., JSON, CSV).
  - **Version-Specific Issues:**
    - Be aware of breaking changes between Pydantic versions.
    - Consult the Pydantic documentation for migration guides.
  - **Compatibility Concerns:**
    - Ensure compatibility between Pydantic and other libraries used in your project.
    - Be mindful of potential conflicts with other validation libraries.
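
Two of the frequent mistakes above are easy to see in a few lines. This sketch assumes Pydantic v2 behavior; the `Profile`, `Cat`, `Dog` and `Owner` models are invented for the example.

```python
from typing import Literal, Optional, Union

from pydantic import BaseModel, Field, ValidationError


class Profile(BaseModel):
    nickname: Optional[str]    # may be None, but is still required in v2
    bio: Optional[str] = None  # truly optional: it has a default


try:
    Profile()
    missing = []
except ValidationError as exc:
    missing = [err["loc"][0] for err in exc.errors()]
print(missing)  # only the field without a default is reported


class Cat(BaseModel):
    kind: Literal["cat"]
    lives: int = 9


class Dog(BaseModel):
    kind: Literal["dog"]
    good_boy: bool = True


class Owner(BaseModel):
    # The discriminator tells Pydantic which Union member to validate against.
    pet: Union[Cat, Dog] = Field(discriminator="kind")


owner = Owner(pet={"kind": "dog"})
print(type(owner.pet).__name__)
```

With the discriminator in place, an unknown `kind` produces one clear error instead of one error per union member.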

- **Tooling and Environment:**
  - **Development Tools:**
    - Use a code editor or IDE with Pydantic support (e.g., VS Code with the Pylance extension).
    - Use a static type checker such as mypy to catch type errors; Pydantic ships a mypy plugin (`pydantic.mypy`).
    - Use a linter like Flake8 or Pylint to enforce code style.
  - **Build Configuration:**
    - Use a build tool like Poetry or Pipenv to manage dependencies.
    - Specify Pydantic as a dependency in your project's configuration file.
  - **Linting and Formatting:**
    - Configure a linter and formatter to enforce consistent code style.
    - Use pre-commit hooks to automatically run linters and formatters before committing code.
  - **Deployment:**
    - Use a deployment platform that supports Python applications (e.g., Heroku, AWS Elastic Beanstalk, Docker).
    - Configure your deployment environment to install Pydantic and its dependencies.
  - **CI/CD:**
    - Run your model and validator tests as part of the CI/CD pipeline.
    - Automatically run tests and linters on every commit.

- **Getting Started with Pydantic:**
  - Install Pydantic with `pip install pydantic`.
  - Define your data models using `BaseModel` and type hints.
  - Validate your data by instantiating the models or calling `model_validate`.
  - Handle validation errors using `try...except ValidationError`.

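- **Example:**
  ```python
  from typing import List, Optional

  from pydantic import BaseModel, ValidationError, field_validator


  class Address(BaseModel):
      street: str
      city: str
      zip_code: Optional[str] = None


  class User(BaseModel):
      id: int
      name: str
      email: str
      addresses: List[Address]

      @field_validator("email")
      @classmethod
      def email_has_at_sign(cls, value: str) -> str:
          # Deliberately minimal check. A plain `str` field accepts any text;
          # `EmailStr` (requires the email-validator package) is stricter.
          if "@" not in value:
              raise ValueError("email must contain '@'")
          return value


  user_data = {
      "id": 1,
      "name": "John Doe",
      "email": "invalid-email",
      "addresses": [{
          "street": "123 Main St",
          "city": "Anytown"
      }]
  }

  try:
      user = User(**user_data)
      print(user)
  except ValidationError as e:
      # The invalid email above ends up here.
      print(e.json())
  ```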