For more, see Noveck and Goroff, Information for Impact: Liberating Nonprofit Sector Data.
Every year in the United States approximately 1.5 million registered tax-exempt organizations file a version of the “Form 990” with the IRS and state tax authorities. Although the questions vary among versions, such as those for private foundations or small nonprofits, the 990 collects details on the finances, governance, and organizational structure of America’s universities, hospitals, foundations, and charities in order to ensure that they deserve tax-exempt status. These organizations, which together pay $670 billion in wages and benefits annually, sustain America’s education, culture, art, religion, and science, and provide many of the social services upon which our communities depend.
With a national movement in the U.S. to shrink the role of government, nonprofits may be expected to expand their programs as they step in to fill essential needs. The role of nonprofits may now become even greater, and deserving of greater scrutiny.
The data that the IRS collects about nonprofit organizations presents a great opportunity to learn about the sector and make it more effective. Yet this data could be made far more useful than it is today. It’s time to “liberate” 990 data and make it easier to gain insight into the workings of America’s nonprofits.
The IRS does make nonprofits’ Form 990 returns available, but only on DVDs and for a high fee. A single year’s worth of 990s costs over $2,500, arguably to recoup the costs of pressing and mailing all these discs. But there is no reason to charge for the Form 990 data at all. Just as most people have grown accustomed to sharing large files via a service like Dropbox, it would be simple for the IRS to publish the returns online for anyone to download in bulk for free. This week two groups committed to government transparency, Public Resource and the Internet Archive, used their own resources to post 12 years of returns online, demonstrating that it can be done.
As President Obama declared on his first day in office, “Information maintained by the Federal government is a national asset,” and IRS data on nonprofits is important and valuable information that should be available to everyone.
The DVDs are only part of the problem. Even if you can afford to buy the DVDs with Form 990 data, as some organizations and news media do, the data on them is contained in image files, which are created by scanning the printed Form 990s rather than putting their data into a searchable database. Image files are useful only for reading about one nonprofit organization at a time. The sector deserves comprehensive and computable data that can be openly aggregated, searched, checked, and analyzed.
In the long run, as a condition of being a nonprofit, organizations should be required to file the Form 990 electronically, rather than on paper, and the IRS should publish those returns in formats that lend themselves to doing aggregate analytics, creating visualizations and building analytic tools.
The IRS need not wait until all returns are filed electronically: it can start releasing, in a timely fashion and in computable form, the data from returns that are already filed electronically. There’s some debate about how much authority the IRS has to make changes like this on its own, and whether they would require Congressional action. Others argue that under the Freedom of Information Act, the IRS must release the data. But we don’t need to wait for a legal battle, the IRS, or Congress: The groups that now independently analyze IRS data can and should take the lead.
Today, the Foundation Center, GuideStar, the Urban Institute, Johns Hopkins’ Center for Civil Society Studies, and Indiana University’s Center on Philanthropy spend millions each year on converting the IRS images of the Form 990 into clean data that a computer can ingest and use to perform analysis and develop visualizations. They’ve had to do this conversion because there has been no comprehensive set of open data about the nonprofit sector available to them or the many others who would take advantage of it. But rather than replicating each other’s efforts and then charging for access to the results, these groups could follow a more collaborative, open model. (Some of these groups are beginning to explore a collaboration.)
At least for the short term, incumbent organizations whose goal is to provide data about the nonprofit sector, and who raise philanthropic dollars to do so, can stand in the place of government and make a data resource on nonprofits available. These organizations and those who fund them should take their cue from Public Resource and the Internet Archive by pooling their resources and collaborating to develop a single, open, and comprehensive 990 database that is available and free to all.
A shared, open database will reduce the costs of data management for these incumbents and make the task of converting IRS data more efficient. And it need not threaten their revenue models: What they lose on the sale of bulk data, they can more than make up for by providing new tools and analytic services.
More important, free, open, analyzable data on nonprofits will enable more innovators, researchers, and entrepreneurs to use the data to benefit the sector. There are now many examples of public benefits that have come from “opening up” government data. When the U.S. Department of Health and Human Services published its database of hospital infection rates online in a computable format, Microsoft and Google were able to mash it up with mapping data to create an application that shows infection rates for local hospitals across the country. This tool readily allows anyone, from the investigative journalist to the parent of a sick child, to see which hospitals are safest. The National Oceanic and Atmospheric Administration freely and publicly provides weather and forecast data online, and that data provides the backbone for such services as the Weather Channel. The GPS data we use to get from work to home were made available for civilian use by President Ronald Reagan, who saw the impact these data could have as a public good. Cities have unlocked the data on when and where public transit runs, making buses and subways easier to catch than ever before.
A comprehensive source of high-quality data on nonprofits, structured to allow comparisons and analyses across different organizations in the sector, would greatly enhance and accelerate research about the sector and make it possible to:
- Do more extensive, in-depth empirical research on the sector as a whole, including sector-wide issues such as the impact of the economic downturn on nonprofits, the geographic distribution of nonprofit services, and the efficiency of the nonprofit sector in delivering services;
- Combine the 990 data with other datasets, such as those on government spending, to better understand the relationship between public and private dollars in providing social services;
- Query the data to address issues relating to specific nonprofits, such as gaining greater insight into 501(c)(4) organizations that engage in lobbying or finding trends and outliers in executive compensation;
- Recognize fraud early, anticipate abuses, and target enforcement more efficiently and effectively; and
- Enable more people and organizations to analyze, visualize, and mash up the data, creating a large public community that is interested in the nonprofit sector and can collaborate to find ways to improve it.
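As a rough illustration of the kind of query described in the list above, the following Python sketch flags filings whose executive compensation is unusually high relative to revenue. The sample rows, the column names, and the organization names are invented for illustration and do not reflect real IRS field names or real filings; the median-based outlier rule is just one simple choice among many.

```python
import csv
import io
import statistics

# Hypothetical sample data. Column names are assumptions, not real Form 990 fields.
SAMPLE_CSV = """ein,name,total_revenue,exec_compensation
00-0000001,Example Food Bank,1200000,95000
00-0000002,Example Clinic,8500000,410000
00-0000003,Example Arts Council,640000,72000
00-0000004,Example Shelter,2100000,1300000
00-0000005,Example Literacy Fund,950000,88000
"""


def load_filings(text):
    """Parse CSV text into dicts with numeric revenue and compensation."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["total_revenue"] = float(row["total_revenue"])
        row["exec_compensation"] = float(row["exec_compensation"])
        rows.append(row)
    return rows


def compensation_outliers(filings, threshold=3.0):
    """Return filings whose compensation-to-revenue ratio is far above the median.

    A filing is flagged when its ratio exceeds the median ratio by more than
    `threshold` times the median absolute deviation (MAD).
    """
    # Skip filings with no revenue to avoid dividing by zero.
    scored = [
        (f, f["exec_compensation"] / f["total_revenue"])
        for f in filings
        if f["total_revenue"] > 0
    ]
    if not scored:
        return []
    ratios = [ratio for _, ratio in scored]
    median = statistics.median(ratios)
    mad = statistics.median([abs(ratio - median) for ratio in ratios])
    if mad == 0:
        return []  # No spread in the data, so nothing stands out.
    return [f for f, ratio in scored if ratio - median > threshold * mad]


if __name__ == "__main__":
    for filing in compensation_outliers(load_filings(SAMPLE_CSV)):
        print(filing["ein"], filing["name"])
```

With scanned image files, a comparison like this would require transcribing each return by hand; with computable data it takes a few lines, and the same approach extends to joining filings with other datasets by a shared identifier.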
Above all, opening up 990 data would attract many new and innovative people who would bring energy, enthusiasm, and creativity to developing tools that help the neediest among us access better services, help nonprofit providers become more effective and efficient, and help everyone better understand the role of the nonprofit sector in our economy. Instead of being limited to the work that GuideStar’s and Indiana University’s employees have the time to do, the field would see many more people creating apps, developing visualizations, and doing research than are able to today.
With open Form 990 data, we can expect to see again what we are now seeing in many sectors: When experts of all kinds have access to open data, it becomes a catalyst for creative problem solving and community innovation.