
National Wastewater Surveillance System Schema

Project website
Schema validation, process improvement
California Department of Public Health
October 2021

A structured schema to help validate submissions of COVID test results in wastewater samples at scale.

More about this project

The National Wastewater Surveillance System (NWSS) Schema is an open-source implementation of the data standard used by the Centers for Disease Control and Prevention (CDC) to ingest and analyze submissions of COVID test results from wastewater samples. Read more about NWSS here.

To accept samples from across the country, the CDC defined a data standard with dozens of fields and necessarily complex validation rules. In partnership with the California Department of Public Health, DataMade translated the CDC's standard into a marshmallow schema so that programmatic validation can be incorporated into any Python-based data flow. The package, available on PyPI, also includes a JSON Schema representation of the standard for data flows based in JavaScript.

The goal of automated, upfront validation is to give submitters immediate feedback when data is invalid, reducing the need for intermediaries at state health departments and water regulators to process submissions themselves and ferry remaining errors from the CDC back to submitters. With less overhead, submission programs can scale more easily, ensuring that the benefits of wastewater surveillance reach communities of all shapes, sizes, and resource levels.