Chapter 8
Project 2.5: Schema and Metadata
It helps to keep the data schema separate from the various applications that share the schema. One way to do this is to have a separate module with class definitions that all of the applications in a suite can share. While this is helpful for a simple project, it can be awkward when sharing a data schema more widely. A Python module is particularly difficult for sharing data outside the Python environment.
This project will define a schema in JSON Schema notation, first by building pydantic class definitions, then by extracting the JSON from the class definitions. This will allow you to publish a formal definition of the data being created. The schema can be used by a variety of tools to validate data files and ensure that the data is suitable for further analytical use.
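As a minimal sketch of this two-step approach, the following example builds a pydantic class definition and then extracts the JSON Schema from it. It assumes pydantic version 2, where the extraction method is named model_json_schema(). The SamplePoint class and its x and y fields are hypothetical, chosen only for illustration; they are not the data model this project defines.

```python
import json

from pydantic import BaseModel, Field


class SamplePoint(BaseModel):
    """A hypothetical sample; the class and field names are illustrative only."""

    x: float = Field(description="Independent variable")
    y: float = Field(description="Dependent variable")


# Step 2: extract the JSON Schema as a plain Python dictionary.
# Assumes pydantic v2; model_json_schema() is a class method.
schema = SamplePoint.model_json_schema()

# Serialize the dictionary so it can be published as a JSON document.
print(json.dumps(schema, indent=2))
```

The resulting dictionary is ordinary JSON-compatible data, so it can be written to a file and used by tools that have no knowledge of the Python class it came from.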
The schema is also useful for diagnosing problems with data sources. Validator tools like jsonschema can provide detailed error reports that can help identify changes...