The Marshall Project is a non-profit, Pulitzer Prize-winning newsroom that seeks to create a sense of national urgency around reform of the US criminal justice system. The team, composed of data scientists and data journalists, works with widely federated data sources, ranging from archival records to data sets collected in real time.
Interview with David Eads and Ilica Mahajan
Data editor David Eads first encountered Hasura while working at ProPublica, where he needed a way to query Postgres tables in a graph-like way and was introduced to Hasura as a fit for that problem. Since then, the assignments and newsrooms have changed, but Hasura has remained a mainstay in his data journalism toolkit.
[Creating engineers from domain experts] is something that’s really important to us technically, in this environment, you’ve got a bunch of reporters [who are] really smart … but they’re not engineers. How do you bridge that gap?
David Eads, Data Editor at The Marshall Project
When working with a volunteer base of citizen reporters and crowdsourced data sets, the project needed to interoperate with tooling that would be familiar to a non-technical audience. The team used Airtable as an additional data source, joined to the primary data sources through a federated API, which gave the final story a more complete picture.
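The federation idea described above can be pictured as enriching rows from a primary source with rows from a crowdsourced table that share a key. The following is only a minimal sketch of that idea, not The Marshall Project's implementation: the record fields (`case_id`, `county`, `volunteer_note`) and the `federate` helper are hypothetical, and in practice the join would be handled by the API layer rather than by hand.

```python
# Hypothetical records. In a real setup the first list would come from the
# primary database and the second from a crowdsourced table such as Airtable.
primary_records = [
    {"case_id": "A-1", "county": "Example County"},
    {"case_id": "A-2", "county": "Sample County"},
]
crowdsourced_records = [
    {"case_id": "A-1", "volunteer_note": "Confirmed by phone"},
]


def federate(primary, secondary, key):
    """Left-join secondary rows onto primary rows using a shared key."""
    by_key = {row[key]: row for row in secondary}
    merged = []
    for row in primary:
        extra = by_key.get(row[key], {})
        # Copy the primary row and add any extra fields from the match.
        merged.append({**row, **{k: v for k, v in extra.items() if k != key}})
    return merged


combined = federate(primary_records, crowdsourced_records, "case_id")
```

Rows without a crowdsourced match are kept unchanged, so the primary data stays complete even when volunteers have not yet contributed.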
[On generated queries] the fact that you can kind of abstract that away without really understanding how these joins are working in the backend is really handy.
Ilica Mahajan, Computational Journalist
Being able to expose relational data in a graph-like way opened up an entirely new way to explore data with stakeholders, fellow journalists, and interested citizens.
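To make "graph-like" concrete, the sketch below contrasts the nested query a reader might write against a GraphQL layer with the SQL join that has to happen underneath. It is an illustration under stated assumptions: the `facilities` and `incidents` tables, their columns, and the sample rows are all hypothetical, and SQLite stands in for Postgres so the example is self-contained. The join-and-nest step is written by hand here purely to show what a generated query abstracts away.

```python
import sqlite3

# Hypothetical schema and rows, used only for illustration.
SCHEMA = """
CREATE TABLE facilities (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE incidents (
    id INTEGER PRIMARY KEY,
    facility_id INTEGER NOT NULL REFERENCES facilities(id),
    summary TEXT NOT NULL
);
INSERT INTO facilities VALUES (1, 'Facility A'), (2, 'Facility B');
INSERT INTO incidents VALUES
    (1, 1, 'First report'),
    (2, 1, 'Second report'),
    (3, 2, 'Third report');
"""

# The nested shape a non-engineer would ask for (not executed here).
GRAPH_STYLE_QUERY = """
query {
  facilities {
    name
    incidents { summary }
  }
}
"""


def facilities_with_incidents(conn):
    """Run the join by hand and nest the rows into the graph-like shape."""
    rows = conn.execute(
        """
        SELECT f.id, f.name, i.summary
        FROM facilities AS f
        LEFT JOIN incidents AS i ON i.facility_id = f.id
        ORDER BY f.id, i.id
        """
    ).fetchall()
    nested = {}
    for facility_id, name, summary in rows:
        entry = nested.setdefault(facility_id, {"name": name, "incidents": []})
        if summary is not None:  # facilities with no incidents keep an empty list
            entry["incidents"].append({"summary": summary})
    return list(nested.values())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
result = facilities_with_incidents(conn)
```

The person writing the nested query only needs to know that facilities have incidents; the join condition and table layout stay on the engineering side.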
Having data engineers on the team allowed The Marshall Project to create highly efficient table designs for optimized queries, without needing to expose or even explain that complexity to the end users of the data.