Brand new tools for analyzing community-level data

Published on Nov 21, 2016

Census tracts in D.C. Image by Michael Rodriguez, published under a Creative Commons license.

This week, DataMade is excited to announce three new additions to our open-source toolkit for developers. We’ve added libraries for dealing with data on crime, schools, and demographics in Illinois and across the United States.

At DataMade, we aspire to write code that is effective and easy to use. If you use these tools, make sure to let us know on Twitter or on GitHub – we’d love to signal-boost your accomplishments, or incorporate your contributions to help make these tools even better.

Here’s what we’ve got in store for you:


1. Query all kinds of census geographies with census_area

Map the density of old homes in Chicago by census tract using the census_area library.

Getting the right information from the U.S. Census Bureau through its developer APIs can be a headache. In general, you have to stick to census-designated geographies, and it can be tricky to filter fine-grained data down to the area you’re interested in.

Our census_area Python library extends the Sunlight Foundation’s Census API Wrapper in order to open up the Census API to a wider range of geographies. With census_area, you can query every tract, block, or block group in a given census place – say, to count the homes built before 1939 in each part of the City of Chicago – or you can bring in a custom shape of your own to use as a filter.
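To make the idea of using a custom shape as a filter concrete, here is a minimal, standalone sketch. It is not how census_area is implemented and it does not call the Census API: it simply keeps the tracts whose centroid falls inside a polygon, using a plain ray-casting test. The tract records, the centroid and old_homes fields, and the coordinates are all made up for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's y-coordinate can be crossed
        # by a horizontal ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_tracts(tracts: List[Dict], shape: List[Point]) -> List[Dict]:
    """Keep only the tracts whose centroid falls inside the custom shape."""
    return [t for t in tracts if point_in_polygon(t["centroid"], shape)]


# Hypothetical tract records; real Census data would carry real geometries.
tracts = [
    {"tract": "A", "centroid": (1.0, 1.0), "old_homes": 120},
    {"tract": "B", "centroid": (5.0, 5.0), "old_homes": 40},
    {"tract": "C", "centroid": (2.5, 0.5), "old_homes": 75},
]

# Hypothetical custom shape: a square with corners at (0, 0) and (3, 3).
custom_shape = [(0.0, 0.0), (3.0, 0.0), (3.0, 3.0), (0.0, 3.0)]

selected = filter_tracts(tracts, custom_shape)
total_old_homes = sum(t["old_homes"] for t in selected)
```

A centroid test is the simplest possible rule. A real tool would more likely work with full tract geometries and decide what to do with tracts that only partly overlap the shape.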


2. Compare crime stats across Illinois with illinois-ucr

Working with crime stats that look like this can be frustrating. We’re here to help.

Every year, the Illinois State Police releases Uniform Crime Reports (UCRs) with detailed crime statistics for agencies across the state. But since the State Police doesn’t provide a unique ID for each police agency, it can be a pain to match UCR stats to the specific list of places, agencies, or jurisdictions that you’re interested in.

Enter illinois-ucr, a database that makes querying crime stats easier. Once you build the database, you’ll have access to the latest UCR numbers as well as a few crosswalks that help you match records across tables.

Wondering how we made our crosswalks? We used Dedupe, of course – our tool for fuzzy matching rows across datasets! Dedupe is available to the public as a subscription service, or as a set of open-source tools for Python and the command line.
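For a rough feel of what a crosswalk does, here is a small sketch that matches messy agency names to a clean list of places. It uses difflib from the Python standard library as a stand-in, not Dedupe, and it is not the method used to build the illinois-ucr crosswalks. The agency names, the place list, the stopword set, and the 0.8 cutoff are all assumptions chosen for illustration.

```python
import difflib
from typing import Dict, List, Optional

# Words that describe the agency type rather than the place (an assumption).
STOPWORDS = {"police", "pd", "dept", "department"}


def normalize(name: str) -> str:
    """Lowercase, drop periods, and remove agency-type words."""
    tokens = name.lower().replace(".", "").split()
    return " ".join(t for t in tokens if t not in STOPWORDS)


def build_crosswalk(
    agencies: List[str], places: List[str], cutoff: float = 0.8
) -> Dict[str, Optional[str]]:
    """Map each agency name to its closest place name, or None if no match."""
    normalized_places = {normalize(p): p for p in places}
    crosswalk: Dict[str, Optional[str]] = {}
    for agency in agencies:
        matches = difflib.get_close_matches(
            normalize(agency), list(normalized_places), n=1, cutoff=cutoff
        )
        crosswalk[agency] = normalized_places[matches[0]] if matches else None
    return crosswalk


# Hypothetical inputs, including a misspelling and an agency with no match.
agencies = ["Evanston PD", "Oak Park Police Dept.", "Springfeild Police", "Quincy PD"]
places = ["Evanston", "Oak Park", "Springfield"]

crosswalk = build_crosswalk(agencies, places)
```

Simple string similarity like this breaks down quickly on real data, for example when two towns share a name. That is the kind of problem a trained matcher such as Dedupe is meant to handle.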


3. Get the rundown on Illinois schools with school-report-cards

A detailed database chock-full of school report cards – right at your fingertips.

There’s a wealth of information about Illinois schools floating around the Internet, but it’s hosted in disparate places and in a variety of formats. Like illinois-ucr, school-report-cards helps to centralize this information in one database, making it easier to compare statistics across schools, districts, and counties.

Use GNU Make and PostgreSQL to build the comprehensive school-report-cards database and get easy access to twenty years’ worth of statistics about school performance in Illinois.


It’s all yours

All of these tools are free and open-source under the MIT License, which means that you’re welcome to use them for whatever projects you can dream up – commercial or otherwise. Go forth, and bring better data into the world!

Interested in using or helping to maintain these tools? Make sure to fork us on GitHub and submit an issue or a pull request.