Validation

Validate data quality, schema compliance, and business rules.

Hats: 2 (Review, Ask)
Unit Types: Validation
Inputs: Transformation
Dependencies: Transformation (modeled-data)

Hat Sequence

1. Data Quality Reviewer

Focus: Review the validation suite for coverage completeness and assertion quality. Verify that tests cover all critical data paths, that thresholds are appropriately tight, and that failure modes produce actionable diagnostics rather than opaque errors.

Produces: Coverage assessment identifying gaps in the validation suite, threshold recommendations, and a verdict on whether the pipeline is safe to deploy.

Reads: Validator's test suite, transformation logic, data model documentation, SLA requirements from discovery.

Anti-patterns:

  • Rubber-stamping a validation suite without tracing coverage back to requirements
  • Accepting row count checks as sufficient without uniqueness and referential integrity tests
  • Not verifying that validation failures produce enough context to diagnose the root cause
  • Ignoring SLA-related validations (freshness, completeness percentages)
  • Treating validation as a gate to pass rather than a safety net to maintain
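The first anti-pattern above, rubber-stamping without tracing coverage back to requirements, can be made mechanical. A minimal sketch, with hypothetical requirement IDs and test names, that maps each requirement to the assertions covering it and flags gaps:

```python
# Hypothetical sketch: trace each requirement to the assertions that cover
# it, so the review flags gaps instead of rubber-stamping. All IDs and test
# names below are illustrative, not part of any real suite.

requirements = {
    "REQ-UNIQ":  "order_id is unique per target table",
    "REQ-FK":    "every order references an existing customer",
    "REQ-FRESH": "target data is no more than 24h stale (SLA)",
}

# Which assertions the validation suite actually declares, keyed by requirement.
suite_coverage = {
    "REQ-UNIQ": ["test_order_id_unique"],
    "REQ-FK":   ["test_order_customer_fk"],
    # REQ-FRESH has no assertion yet -> coverage gap
}

gaps = [req for req in requirements if not suite_coverage.get(req)]
verdict = "blocked" if gaps else "safe to deploy"
print(f"verdict: {verdict}; uncovered requirements: {gaps}")
```

A gap list like this is exactly the "coverage assessment" artifact the reviewer produces: it names what is missing rather than issuing an opaque pass/fail.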
2. Validator

Focus: Build and run data quality checks that verify schema compliance, referential integrity, uniqueness, accepted value ranges, row count reconciliation, and business rule correctness. Every assertion should be specific, automated, and produce a clear pass/fail/warning result.

Produces: Validation suite with per-table and per-column assertions, row count reconciliation between source and target, and business rule edge case tests.

Reads: Modeled data from transformation, source catalog from discovery, business rules from the intent.
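The check categories named in the Focus can be sketched as a small runner where every assertion returns an explicit pass/fail/warn result. This is a minimal stdlib-only illustration with made-up data, not a real framework; in practice a tool such as dbt tests or Great Expectations would play this role:

```python
# Minimal sketch of a check runner: each check returns a status of
# "pass", "fail" (blocking), or "warn" (non-blocking). Data and names
# are illustrative.

def check_not_null(rows, column):
    """Fail if any row has a null in the given column."""
    bad = [i for i, r in enumerate(rows) if r.get(column) is None]
    return ("fail", f"{column}: {len(bad)} null rows") if bad else ("pass", f"{column}: no nulls")

def check_unique(rows, column):
    """Fail on duplicate values (row counts alone would miss these)."""
    values = [r[column] for r in rows]
    dupes = len(values) - len(set(values))
    return ("fail", f"{column}: {dupes} duplicates") if dupes else ("pass", f"{column}: unique")

def check_accepted_values(rows, column, allowed):
    """Fail when actual data values fall outside the accepted set."""
    bad = {r[column] for r in rows} - set(allowed)
    return ("fail", f"{column}: unexpected values {sorted(bad)}") if bad else ("pass", f"{column}: values OK")

def check_row_count(source_count, target_count, tolerance=0.001):
    """Pass within tolerance, warn within 10x tolerance, fail beyond."""
    variance = abs(source_count - target_count) / max(source_count, 1)
    if variance <= tolerance:
        return ("pass", f"variance {variance:.4%}")
    return ("warn" if variance <= 10 * tolerance else "fail", f"variance {variance:.4%}")

orders = [
    {"order_id": 1, "status": "shipped"},
    {"order_id": 2, "status": "pending"},
    {"order_id": 2, "status": "unknown"},   # duplicate id + unexpected status
]
results = [
    check_not_null(orders, "order_id"),
    check_unique(orders, "order_id"),
    check_accepted_values(orders, "status", {"shipped", "pending", "cancelled"}),
    check_row_count(source_count=1000, target_count=999),
]
for status, detail in results:
    print(f"{status:5} {detail}")
```

Note that each message carries enough context (column, counts, offending values) to diagnose the failure, per the reviewer's requirement for actionable diagnostics.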

Anti-patterns:

  • Writing only "happy path" tests without edge case coverage
  • Checking row counts without also checking for duplicates
  • Validating schema structure but not actual data values
  • Using overly loose thresholds that mask real quality issues
  • Not distinguishing between blocking failures and non-blocking warnings
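The last anti-pattern, failing to separate blocking failures from non-blocking warnings, comes down to a deployment gate. A minimal sketch with illustrative statuses:

```python
# Sketch: gate deployment on blocking failures only; warnings are logged
# but do not stop the pipeline. Check results below are illustrative.
results = [
    ("pass", "orders.order_id unique"),
    ("warn", "orders row count variance 0.05%"),
    ("fail", "payments.currency has unexpected value 'XXX'"),
]
blocking = [msg for status, msg in results if status == "fail"]
warnings = [msg for status, msg in results if status == "warn"]
print(f"{len(blocking)} blocking, {len(warnings)} warnings")
deploy_ok = not blocking   # warnings alone never block deployment
```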

Validation Criteria Guidance

Good criteria examples:

  • "Data quality checks cover uniqueness, not-null constraints, referential integrity, and accepted value ranges for every target table"
  • "Row count reconciliation between source and target is within the agreed tolerance (e.g., < 0.1% variance)"
  • "Business rule tests verify at least 3 known edge cases per critical transformation (e.g., timezone handling, currency conversion, null propagation)"

Bad criteria examples:

  • "Data quality is validated"
  • "Tests pass"
  • "Business rules are checked"
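The edge-case criterion above (timezone handling, currency conversion, null propagation) is best shown as a concrete test. A sketch of one such test, assuming a hypothetical rule that timestamps are bucketed by UTC calendar date:

```python
# Sketch of a business-rule edge case test: a local timestamp near a day
# boundary that lands on the next UTC date. The to_utc_date rule is a
# hypothetical example of a transformation under test.
from datetime import datetime, timezone, timedelta

def to_utc_date(ts: datetime) -> str:
    """Bucket a timestamp into its UTC calendar date (the assumed rule)."""
    return ts.astimezone(timezone.utc).date().isoformat()

# Edge case: 23:30 local in UTC-5 is already the next day in UTC.
local = datetime(2024, 3, 1, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
assert to_utc_date(local) == "2024-03-02"
print("timezone boundary case passes")
```

A "happy path" test at noon would pass even if the conversion were dropped entirely; only the boundary case catches that bug, which is why the criteria demand named edge cases rather than "tests pass".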

Completion Signal

Validation suite covers schema compliance, uniqueness, referential integrity, accepted value ranges, and row count reconciliation. Business rule tests verify edge cases. Data quality reviewer has confirmed test coverage is sufficient and all critical paths have assertions. Validation results are logged with pass/fail/warning status per check.