Fatima Gomez
Published on Nov 03, 2022
Data can be a powerful tool for illuminating our world, but it can also reproduce bias, perpetuate harm, and subvert truth. A significant amount of our work at DataMade involves making data available, accessible, and intelligible; we therefore must take steps to minimize the biases, harm, and fallacies that can come from utilizing data irresponsibly.
This living document summarizes DataMade’s approach to a less harmful data practice with regard to the kinds of sensitive data we work with most often: crime report data; race, gender, and ethnicity data; individual-level public record data; and user data collection practices.
These recommendations guide our work with our current and future partners. Where possible, we are taking steps to revise or add to past work that does not align with these principles.
Many state-, county-, and city-level crime data reports exist alongside the national Uniform Crime Report (UCR) and National Crime Victimization Survey (NCVS) datasets. Our understanding of the role of crime report data has grown: we now recognize that it can be used irresponsibly to justify further policing and surveillance of Black, Indigenous, and People of Color communities.
With this in mind, we commit to the following with regard to crime report data:
Journalists like Pascal Sabino at Block Club Chicago use policing data appropriately to surface how Chicago police unjustly search Black Chicagoans under the pretense of traffic stops.
Race, gender, and ethnicity data are both common and useful. When these labels are self-reported, they help us understand people’s relationship to their society. However, when these classifications are imposed, they tell us more about the harmful effects of the systems and algorithms that apply them than about the people the data describes.
With this in mind, we commit to the following with regard to race, gender, and ethnicity data:
Journalism from The Circuit project, in which DataMade is a partner, compellingly reports on defendants’ disparate treatment based on the demographic classifications imposed on them by the Cook County Circuit Courts. The US Census offers an imperfect but instructive model for collecting self-reported demographic data while still using defined categories for analysis.
Individual-level public record information about public servants and occasionally high-profile private citizens can be used to make the actions of people with significant power transparent and to hold these people accountable for their actions.
While accountability is important, it must be balanced against individuals’ need for privacy. To that end, we commit to the following:
Though US courts do not currently recognize such a right to privacy, we look to the European Union’s General Data Protection Regulation (GDPR) and its “right to be forgotten” for guidance on when to avoid publishing personally identifiable information (PII) or to remove it from our sites.
User data collection can allow a website to enhance the user experience. In the US today, however, it is commonplace to collect and sell user data without any transparency. This prevents users from consenting to having their data collected or sold, and precludes collecting user data ethically.
A number of European companies employ clear, concise, and transparent ways of requesting user data; this blog post lists a few. Google’s data practices guide explains the company’s practices in a concise and intelligible way.
In order to provide users with a positive user experience on our sites while allowing them control over their own data, we commit to the following:
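The opt-in pattern described above can be sketched in code. This is a minimal, hypothetical illustration, not DataMade’s actual implementation: the names (`ConsentState`, `shouldLoadAnalytics`, the `"analytics-consent"` key) are invented for the example, and a real site would persist the choice in `localStorage` or a cookie rather than the in-memory store used here.

```typescript
// Hypothetical consent-gated analytics: a sketch of the opt-in pattern,
// not an actual DataMade implementation.

type ConsentState = "granted" | "denied" | "unset";

// Analytics should load only after an explicit opt-in; "unset"
// (no choice made yet) is treated the same as "denied".
function shouldLoadAnalytics(consent: ConsentState): boolean {
  return consent === "granted";
}

// Persist the user's explicit choice. A Map stands in here for
// localStorage or a cookie so the example is self-contained.
function recordConsent(
  store: Map<string, string>,
  choice: "granted" | "denied"
): void {
  store.set("analytics-consent", choice);
}

// Read a stored choice back, defaulting to "unset" when the user
// has never answered (or the stored value is unrecognized).
function readConsent(store: Map<string, string>): ConsentState {
  const value = store.get("analytics-consent");
  return value === "granted" || value === "denied" ? value : "unset";
}
```

The key design point is the default: absence of a recorded choice never counts as consent, so no tracking runs before the user acts.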
As conversations around using, collecting, and distributing data continue and evolve, we will revisit these commitments to ensure we handle the data we work with responsibly.
For questions about these data practices or to learn more about them, contact us at info@datamade.us.