The data validation stage has three main components:
- the data analyzer computes statistics over the new data batch
- the data validator checks properties of the data against a schema
- the model unit tester looks for errors in the training code by running it on synthetic data generated from the schema (schema-led fuzzing)
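
To make the division of labour concrete, here is a minimal sketch of the three components in plain Python. Everything in it is an assumption made for illustration: the `SCHEMA` dictionary format, the function names (`analyze`, `validate`, `synthesize`, `unit_test_training_code`), and the deliberately buggy `buggy_train` function are hypothetical and are not the API of any real validation library. The validator here only checks missing values, numeric ranges, and categorical domains.

```python
import math
import random
import statistics

# Hypothetical schema format (an assumption for this sketch):
# feature name -> constraints on its values.
SCHEMA = {
    "age": {"min": 0.0, "max": 120.0},
    "country": {"domain": {"US", "DE", "IN"}},
}


def analyze(batch, schema=SCHEMA):
    """Data analyzer: compute per-feature summary statistics over a batch."""
    stats = {}
    for name in schema:
        values = [row[name] for row in batch if row.get(name) is not None]
        entry = {"count": len(values), "missing": len(batch) - len(values)}
        if values and all(isinstance(v, (int, float)) for v in values):
            entry.update(min=min(values), max=max(values),
                         mean=statistics.fmean(values))
        else:
            entry["distinct"] = set(values)
        stats[name] = entry
    return stats


def validate(stats, schema=SCHEMA):
    """Data validator: compare batch statistics against the schema."""
    anomalies = []
    for name, spec in schema.items():
        s = stats[name]
        if s["missing"]:
            anomalies.append(f"{name}: {s['missing']} missing value(s)")
        if "min" in spec and "min" in s and s["min"] < spec["min"]:
            anomalies.append(f"{name}: min {s['min']} below {spec['min']}")
        if "max" in spec and "max" in s and s["max"] > spec["max"]:
            anomalies.append(f"{name}: max {s['max']} above {spec['max']}")
        if "domain" in spec:
            unexpected = s.get("distinct", set()) - spec["domain"]
            if unexpected:
                anomalies.append(f"{name}: unexpected values {sorted(unexpected)}")
    return anomalies


def synthesize(schema=SCHEMA, n=100, seed=0):
    """Generate random rows that satisfy the schema (schema-led fuzzing)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for name, spec in schema.items():
            if "domain" in spec:
                row[name] = rng.choice(sorted(spec["domain"]))
            else:
                row[name] = rng.uniform(spec["min"], spec["max"])
        rows.append(row)
    return rows


def unit_test_training_code(train_fn, schema=SCHEMA):
    """Model unit tester: run the training code on synthetic data.

    Returns a description of the error, or None if training ran cleanly.
    """
    try:
        train_fn(synthesize(schema))
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def buggy_train(rows):
    """Hypothetical training step with a hidden assumption: age >= 18."""
    return [math.sqrt(row["age"] - 18) for row in rows]


if __name__ == "__main__":
    batch = [
        {"age": 30.0, "country": "US"},
        {"age": 150.0, "country": "FR"},
        {"age": None, "country": "DE"},
    ]
    print(validate(analyze(batch)))
    print(unit_test_training_code(buggy_train))
```

The point of the last function is that the schema permits ages below 18, so schema-conforming synthetic data exposes the hidden assumption in `buggy_train` before any real data does.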
detecting skew