Validation
Validate data quality, schema compliance, and business rules
Dependencies
Hat Sequence
Data Quality Reviewer
Focus: Review the validation suite for coverage completeness and assertion quality. Verify that tests cover all critical data paths, that thresholds are appropriately tight, and that failure modes produce actionable diagnostics rather than opaque errors.
Produces: Coverage assessment identifying gaps in the validation suite, threshold recommendations, and a verdict on whether the pipeline is safe to deploy.
Reads: Validator's test suite, transformation logic, data model documentation, SLA requirements from discovery.
Anti-patterns:
- Rubber-stamping a validation suite without tracing coverage back to requirements
- Accepting row count checks as sufficient without uniqueness and referential integrity tests
- Not verifying that validation failures produce enough context to diagnose the root cause
- Ignoring SLA-related validations (freshness, completeness percentages)
- Treating validation as a gate to pass rather than a safety net to maintain
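Tracing coverage back to requirements (the first anti-pattern above) can be mechanized rather than eyeballed. A minimal sketch, assuming the validation suite tags each assertion with the requirement IDs it covers; the `Requirement` and `Assertion` structures here are illustrative, not part of any specific framework:

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    """A data quality requirement from discovery, e.g. an SLA clause."""
    id: str
    description: str


@dataclass
class Assertion:
    """One check in the validation suite, tagged with the requirements it covers."""
    name: str
    covers: set[str] = field(default_factory=set)


def coverage_gaps(requirements: list[Requirement],
                  suite: list[Assertion]) -> list[Requirement]:
    """Return every requirement that no assertion in the suite claims to cover."""
    covered: set[str] = set().union(*(a.covers for a in suite)) if suite else set()
    return [r for r in requirements if r.id not in covered]


# Example: an SLA freshness requirement with no matching assertion is a gap.
reqs = [Requirement("FRESH-1", "orders table refreshed within 6 hours"),
        Requirement("UNIQ-1", "order_id unique in fact_orders")]
suite = [Assertion("unique_order_id", covers={"UNIQ-1"})]
gaps = coverage_gaps(reqs, suite)
print([r.id for r in gaps])  # → ['FRESH-1']
```

A gap list like this is exactly the kind of actionable output the reviewer's coverage assessment should contain, rather than a blanket "looks sufficient".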
Validator
Focus: Build and run data quality checks that verify schema compliance, referential integrity, uniqueness, accepted value ranges, row count reconciliation, and business rule correctness. Every assertion should be specific, automated, and produce a clear pass/fail/warning result.
Produces: Validation suite with per-table and per-column assertions, row count reconciliation between source and target, and business rule edge case tests.
Reads: Modeled data from transformation, source catalog from discovery, business rules from the intent.
Anti-patterns:
- Writing only "happy path" tests without edge case coverage
- Checking row counts without also checking for duplicates
- Validating schema structure but not actual data values
- Using overly loose thresholds that mask real quality issues
- Not distinguishing between blocking failures and non-blocking warnings
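The checks the Validator's focus describes can be expressed as small assertion functions, each returning an explicit status plus a diagnostic message so failures are traceable. A hedged sketch in plain Python (table names and sample data are illustrative; a real suite would likely sit on a framework such as dbt tests or Great Expectations):

```python
from enum import Enum


class Status(Enum):
    PASS = "pass"
    WARN = "warning"
    FAIL = "fail"


def check_unique(rows: list[dict], column: str) -> tuple[Status, str]:
    """Uniqueness: every value in `column` appears exactly once."""
    values = [r[column] for r in rows]
    dupes = {v for v in values if values.count(v) > 1}
    if dupes:
        return Status.FAIL, f"{column}: duplicate values {sorted(dupes)}"
    return Status.PASS, f"{column}: all {len(values)} values unique"


def check_not_null(rows: list[dict], column: str) -> tuple[Status, str]:
    """Not-null: no missing values in `column`."""
    nulls = sum(1 for r in rows if r[column] is None)
    if nulls:
        return Status.FAIL, f"{column}: {nulls} null values"
    return Status.PASS, f"{column}: no nulls"


def check_referential(rows: list[dict], fk: str,
                      parent_keys: list) -> tuple[Status, str]:
    """Referential integrity: every foreign key resolves in the parent table."""
    orphans = {r[fk] for r in rows} - set(parent_keys)
    if orphans:
        return Status.FAIL, f"{fk}: orphan keys {sorted(orphans)}"
    return Status.PASS, f"{fk}: all keys resolve"


# Illustrative data: order 2 references a customer that does not exist.
orders = [{"order_id": 1, "customer_id": 10},
          {"order_id": 2, "customer_id": 99}]
customers = [10, 11]
status, detail = check_referential(orders, "customer_id", customers)
print(status.value, "-", detail)  # → fail - customer_id: orphan keys [99]
```

Note the failure message names the column and the offending keys: that is the "actionable diagnostics rather than opaque errors" the reviewer is asked to verify.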
Validation
Criteria Guidance
Good criteria examples:
- "Data quality checks cover uniqueness, not-null constraints, referential integrity, and accepted value ranges for every target table"
- "Row count reconciliation between source and target is within the agreed tolerance (e.g., < 0.1% variance)"
- "Business rule tests verify at least 3 known edge cases per critical transformation (e.g., timezone handling, currency conversion, null propagation)"
Bad criteria examples:
- "Data quality is validated"
- "Tests pass"
- "Business rules are checked"
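The difference between the good and bad criteria above is measurability. The row count criterion, for instance, reduces to a one-line computation; a minimal sketch, with the 0.1% default taken from the example tolerance in the criterion:

```python
def reconcile_row_counts(source: int, target: int,
                         tolerance: float = 0.001) -> tuple[bool, float]:
    """Check that source and target row counts agree within a relative tolerance.

    `tolerance` is a fraction: the default 0.001 corresponds to the
    "< 0.1% variance" example in the criteria. Returns (ok, observed_variance).
    """
    if source == 0:
        # Degenerate case: an empty source only reconciles with an empty target.
        return (target == 0), (0.0 if target == 0 else 1.0)
    variance = abs(source - target) / source
    return variance < tolerance, variance


ok, variance = reconcile_row_counts(source=1_000_000, target=999_500)
print(ok, f"{variance:.4%}")  # → True 0.0500%
```

"Tests pass", by contrast, gives a reviewer nothing to recompute or dispute, which is what makes it a bad criterion.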
Completion Signal
Validation suite covers schema compliance, uniqueness, referential integrity, accepted value ranges, and row count reconciliation. Business rule tests verify edge cases. Data quality reviewer has confirmed test coverage is sufficient and all critical paths have assertions. Validation results are logged with pass/fail/warning status per check.
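The completion signal implies a runner that executes every check and logs a per-check pass/fail/warning status, while distinguishing blocking failures from non-blocking warnings. A minimal sketch, assuming each check is a named callable returning one of those three status strings (the check names are illustrative):

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("validation")


def run_suite(checks: dict) -> bool:
    """Run every check, log its status, and return True only if no
    blocking failure occurred. Warnings are recorded but do not block."""
    blocked = False
    for name, check in checks.items():
        status = check()  # expected: "pass", "warning", or "fail"
        if status == "fail":
            log.error("FAIL %s", name)
            blocked = True
        elif status == "warning":
            log.warning("WARN %s", name)
        else:
            log.info("PASS %s", name)
    return not blocked


# Illustrative suite: one passing check, one non-blocking warning.
ok = run_suite({
    "fact_orders.order_id unique": lambda: "pass",
    "fact_orders freshness < 6h": lambda: "warning",
})
print(ok)  # → True
```

Keeping the blocking decision in one place means a warning-level drift can surface in logs for monitoring without halting deployment, while any `fail` always does.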