Data validation for machine learning

The data validation stage has three main components:

  • the data analyzer computes statistics over the new data batch
  • the data validator checks properties of the data against a schema
  • the model unit tester looks for errors in the training code using synthetic data (schema-led fuzzing)
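The three components above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual implementation: the schema format (`SCHEMA`), the function names (`analyze`, `validate`, `synthesize`, `unit_test_training_code`), and the `train_step` callback are all hypothetical, and a batch is assumed to be a list of dict rows with numeric features.

```python
import random
import statistics

# Hypothetical schema: feature name -> expected type and allowed range.
SCHEMA = {
    "age": {"type": int, "min": 0, "max": 120},
    "income": {"type": float, "min": 0.0, "max": 1e6},
}


def analyze(batch):
    """Data analyzer: compute per-feature summary statistics over a batch."""
    stats = {}
    for name in SCHEMA:
        # Only numeric values contribute to the statistics.
        values = [row[name] for row in batch
                  if isinstance(row.get(name), (int, float))]
        stats[name] = {
            "count": len(values),
            "missing": len(batch) - len(values),
            "min": min(values) if values else None,
            "max": max(values) if values else None,
            "mean": statistics.fmean(values) if values else None,
        }
    return stats


def validate(batch):
    """Data validator: check each row against the schema, return anomalies."""
    anomalies = []
    for i, row in enumerate(batch):
        for name, spec in SCHEMA.items():
            value = row.get(name)
            if value is None:
                anomalies.append((i, name, "missing"))
            elif not isinstance(value, spec["type"]):
                anomalies.append((i, name, "wrong type"))
            elif not spec["min"] <= value <= spec["max"]:
                anomalies.append((i, name, "out of range"))
    return anomalies


def synthesize(n, seed=0):
    """Generate n random rows that satisfy the schema (schema-led fuzzing)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for name, spec in SCHEMA.items():
            if spec["type"] is int:
                row[name] = rng.randint(spec["min"], spec["max"])
            else:
                row[name] = rng.uniform(spec["min"], spec["max"])
        rows.append(row)
    return rows


def unit_test_training_code(train_step, n=100):
    """Model unit tester: run training code on synthetic rows, collect errors."""
    failures = []
    for row in synthesize(n):
        try:
            train_step(row)
        except Exception as exc:
            failures.append((row, repr(exc)))
    return failures
```

The point of the last function is that any row the schema allows should be safe to train on. A `train_step` that, for example, takes the logarithm of `age` would fail whenever the fuzzer produces `age = 0`, which reveals an assumption in the training code that the schema does not guarantee.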

Detecting skew