Updated April 19
Clarification: The following are notes from the March 30th workshop at the Sloan Foundation. They reflect the views expressed by workshop participants, not my own opinions.
For more on ORGPedia, see the new ORGPedia project page at http://dotank.nyls.edu/orgpedia/.
On March 30th twenty economists, technologists, and government officials (Download Participant List) convened in person and by telephone at the Sloan Foundation in New York to discuss creating an open numbering scheme and platform to facilitate the comparison of data about organizations across levels of government and agencies in order to:
- Promote greater accountability and compliance;
- Enhance economic growth and innovation; and
- Enable research on the evolution of companies and organizations.
This ORGPedia project is convening a wide range of experts to inform the design and scope of:
- An open legal identifier system to enable datasets about companies to be compared. Currently, different agencies use different numbering schemes. An open ID will enable taxonomies to “talk” to one another.
- An online platform to mash up and visualize authenticated government datasets already collected about firms and organizations pursuant to statute or regulation.
- An Application Programming Interface (API) and supporting software libraries to make it easy for third parties to incorporate ORGPedia into their own systems.
- A community to encourage public participation in reviewing, annotating and contributing to collected government data whether by companies and organizations or by third parties.
ORGPedia is an experiment in designing an information system that effectively combines authenticated government data with user-contributed information – a hybrid wiki – to enhance public understanding about organizations and firms.
During the March 30th discussion, participants provided their thoughts on the opportunities, challenges, and strategies for implementation, including ideas for how to prototype and pilot a first phase of the system, from the perspective of government and research communities.
This is the first in a series of five planned workshops. The Sunlight Foundation will host a second meeting on April 8th to focus on issues of corporate accountability and compliance. There will be subsequent meetings focused on the needs of those businesses who consume business intelligence; the technology design; and the international opportunities and implications.
For a longer description of ORGPedia see this backgrounder (HTML, Download PDF).
The following are notes summarizing the discussion among participants at the March 30th Meeting:
Opportunities
There are 18 million registered legal entities in the United States. Having the ability to compare and track data about them would make it possible to:
- Compare datasets about legal entities across regulatory regimes and states
- Track changes in control and ownership
Doing so would make information more transparent to the public, facilitate information sharing across agencies and states, and streamline regulatory compliance by pre-populating information requests with information about entities.
Imagine if, as with the Encyclopedia of Life, which creates a page for every organism on earth, we had a system with a page for every legal entity on earth. Imagine if we had an “ISBN number” for every entity. It would enable all kinds of new services and research. This has become possible in the last few years as a result of advances in web technology and policies for opening up access to public data. The challenge is that firms evolve faster than fish and firms can morph into new firms with different names and owners through changes in control.
At root, we must address the fundamental microeconomic problem of identifying the boundaries of the firm. What if Adam Smith’s pin factory had a financing arm? Or an exclusive steel supplier? We now have the technology to represent these relationships and make them transparent.
Benefits to Government:
Having a stable, unique identifier system, by means of a single number or a data dictionary to translate across numbering schemes (or both – a single entity identifier plus a way to translate other common fields across schemes), would enable comparison of corporate activity across levels of government, across states, and across agencies. Right now we don’t know whether a company doing business in one state is the same as, or related to, a company doing business in another state. So when malfeasance is committed in one place, we are missing an opportunity to be on the lookout before it happens in another state. It would be incredibly valuable to have a way to generate early warning signals.
Having a unique identifier or the ability to pull data from a common and authenticated collection of data about an entity would reduce the transaction costs to entities wishing to comply with requirements across multiple states.
The federal government alone spends $3.5 trillion. The public should be able to slice and dice that spending. To make information about how government spends accessible to people, we need to be able to trace this money even when companies change ownership and name. For example, after Boeing's acquisition of McDonnell Douglas, a search today does not connect the two entities to provide an accurate picture.
Even though we track to the subcontractor level, we have none of the history to connect affiliates and see relationships.
This makes having a unique identifier a priority. If we had the ability to trace changes such as mergers, we could better understand the connection, if any, between government grants/contracts and campaign contributions; we could spot fraud and remove offending companies from the rolls across agencies.
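As a rough illustration of what tracing changes such as mergers could look like in practice, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `ORG-...` identifiers, the `ControlChange` record, the function names, and the award amounts are hypothetical and are not part of any existing system or dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlChange:
    """One recorded change in control (hypothetical record layout)."""
    predecessor: str  # identifier of the entity that was absorbed or renamed
    successor: str    # identifier of the entity that continues afterwards
    kind: str         # e.g. "merger", "acquisition", "rename"


def current_entity(entity_id, changes):
    """Follow the chain of control changes to the entity's current identifier."""
    successor_of = {c.predecessor: c.successor for c in changes}
    seen = set()
    # Stop when there is no further successor, or if the data contains a cycle.
    while entity_id in successor_of and entity_id not in seen:
        seen.add(entity_id)
        entity_id = successor_of[entity_id]
    return entity_id


def totals_by_current_entity(awards, changes):
    """Sum award amounts under the entity that holds them today."""
    totals = {}
    for entity_id, amount in awards:
        key = current_entity(entity_id, changes)
        totals[key] = totals.get(key, 0) + amount
    return totals


# Illustrative data only: identifiers and amounts are made up.
changes = [ControlChange("ORG-MCDONNELL-DOUGLAS", "ORG-BOEING", "merger")]
awards = [("ORG-BOEING", 100), ("ORG-MCDONNELL-DOUGLAS", 50)]
totals = totals_by_current_entity(awards, changes)
print(totals)
```

With a table of control changes like this, a search for the successor would also surface contracts awarded to its predecessors. A real system would also need dates on each change and a way to handle spin-offs, which this sketch leaves out.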
There was some discussion about needing a level of private information, especially about the individuals involved, even as we maintain public information at the entity level.
Benefits For Researchers:
Think about scholars working with the firm as the unit of analysis. They all incur the same redundant transaction costs, which cries out for a public data set.
There are huge transaction costs associated with doing work about firms. Data sets tend to be proprietary and limited in scope, and the information is at best outdated and at worst just terrible.
Accounting, business strategy, information technology management, finance, political science scholars are all engaging in the same socially wasteful redundant activity of trying to clean and match this data. If we could free up some of the time spent on cleaning data, we would free up researcher capacity.
For example, the New York Times did a Pulitzer Prize-winning piece on worker deaths at a manufacturing firm. It was tremendously labor intensive, and next to impossible, to investigate the environmental compliance record of the same entity, though preliminary analysis showed the firm was turning in the same toxic release statements to regulators each year rather than developing new figures.
If we wanted to “mash up” OSHA compliance data with EPA compliance data, we could not do it today. Researchers have the interest, but the incompleteness of the data makes it very hard.
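To make the idea of a mashup concrete, the following Python sketch shows how a shared identifier could let records from two agencies be lined up. It is only a sketch under stated assumptions: the crosswalk table, the `ORG-A` style identifiers, the agency-local IDs, and the record fields are all hypothetical and do not reflect actual OSHA or EPA data formats.

```python
# Hypothetical crosswalk: (numbering scheme, agency-local ID) -> shared entity ID.
CROSSWALK = {
    ("osha", "OSHA-001"): "ORG-A",
    ("epa", "EPA-77"): "ORG-A",
    ("epa", "EPA-78"): "ORG-B",
}


def mash_up(crosswalk, **datasets):
    """Group records from several schemes under one shared entity ID.

    Each keyword argument is one dataset, keyed by its agency-local ID.
    Returns the combined view plus the records that could not be matched.
    """
    combined = {}
    unmatched = []
    for scheme, records in datasets.items():
        for local_id, record in records.items():
            org_id = crosswalk.get((scheme, local_id))
            if org_id is None:
                # No known translation: report it rather than guess.
                unmatched.append((scheme, local_id))
                continue
            combined.setdefault(org_id, {}).setdefault(scheme, []).append(record)
    return combined, unmatched


# Illustrative records only: fields and values are made up.
osha_records = {"OSHA-001": {"violations": 3}, "OSHA-002": {"violations": 1}}
epa_records = {"EPA-77": {"filed_release_report": True}}

combined, unmatched = mash_up(CROSSWALK, osha=osha_records, epa=epa_records)
print(combined)
print(unmatched)
```

Keeping the unmatched records visible matters here, since the incompleteness of the underlying data is exactly the problem researchers described.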
Over 50% of the business outputs in the United States are coming from intangibles. But there is no way to match up firms with IP output because we can’t connect patent registrations to the registrations of the entities that hold IP. At a time when innovation is becoming more important as a driver of the economy, this work is more important, not less.
The field of business history is dying off because of the difficulty of doing empirical research.
Technology:
Technologically, this problem is not unlike the naming issues we face today in trying to create websites (or banking codes) to identify entities, e.g., sloan.org, where we are now trying to make sense of secondary pages such as the About page or the address page, which search engines know how to do.
We have the ability to map when a firm is taken over, complex interdependencies, who owns what.
Visualizations will help make this data more usable. We can show where data came from, whether it is authenticated government data, or contributed by the public.
The technology platforms for building this kind of site exist. There are no show stoppers. Some work will be needed at the applied research level to transition technology from research to practice, but there are existing models.
The Encyclopedia of Life (eol.org), funded by Sloan, provides some important organizational lessons learned about running a system of this type and complexity with a mix of authoritative and open information.
Challenges
Adding a single field to existing identifier systems (i.e., a universal identifier) might not be hard. Adding several fields to track changes in control could be costly, although there are Web technologies that can mitigate most of this cost if properly deployed.
What is the right role of the government? Should the government own such a system or should it be a stand-alone non-profit? What is the right governance structure to ensure legitimacy?
Pilot and Partners
Three areas of focus for potential pilot/prototype came up:
- Mashing up Environment and Labor enforcement databases
- Mashing up SEC’s XBRL data about public companies with state registrations to track and display changes in ownership
- Mashing up patent office applications with state corporate registrations to see who is patenting what
The National Organization of Secretaries of State would be a natural partner for implementing the necessary changes.
Also check out B-Lab at http://www.bcorporation.net/, a younger, more entrepreneurial set of companies committed to social benefit who might be willing to test contributing more of their data to be used in a pilot.
Check out: Bottega and Powell, Creating a Linchpin for Financial Data: Toward a Universal Legal Entity Identifier (http://www.federalreserve.gov/pubs/feds/2011/201107/index.html)
Check out: UK Companies House, which does impose an LEI but would benefit from the win/win of gains to companies and transparency of getting companies to share their data through such a platform. There will be a June/July paper on corporate reporting.
Check out the book: The Demography of Corporations
See also http://OpenCorporates.com which is already solving this problem with the open data community in many countries, including the US ;-)
Posted by: Chris Taggart | April 03, 2011 at 04:30 PM
It is very encouraging to see this work going on within multiple sectors and in various fora. It needs to be broader than just companies, whether publicly traded or privately held, the identifier scheme also needs to take into consideration other types of regulated or other types of entities, such as municipal authorities, state/federal/local agencies, and others. Currently there is a complete inability to perform any kind of meaningful analysis on that entity level, whether corporate or otherwise, as there is no open, nonproprietary means of identifying these regulated entities. In my work at EPA, I have begun to look at some of our major reporters, and begin taking initial steps toward consistency, whether defining and implementing a consistent means of dealing with such things as "Corp." vs. "Corporation", and putting in place web services to perform AJAX autocomplete functionality for those major reporters, where for example, the user would start entering information, and they could either select an appropriate choice from the ones that appear on a dropdown, or continue typing if it is not a match. Ultimately, a truly universal scheme of unique identifiers can replace this approach - and facilitate cross agency analysis of a number of things, whether issues of societal benefit, enforcement, assessing regulatory burden, and others.
Posted by: David G. Smith | April 03, 2011 at 05:06 PM
David, Thanks for your very helpful comments. EPA has really been out in front on this issue with people participating both in the Sloan and the upcoming Sunlight workshops. Are you in touch with your open gov team there? If you ping me via email or skype, I'll be happy to connect you, if you aren't already.
Posted by: Beth Simone Noveck | April 03, 2011 at 08:09 PM
Chris,
Glad to hear from you. Have been following your work. Was planning to reach out in connection with the upcoming corporate planning meeting and the one we are planning for London. Would like to understand better the work you are doing and how it compares to other projects under way. If we can pack up our toys, I'll be thrilled. Shoot me a line via email or skype so we can arrange an opportunity to talk.
Posted by: Beth Simone Noveck | April 03, 2011 at 08:09 PM
Beth
Will do. On the road for a few days (currently in San Francisco), but will email when I get back to London.
Posted by: Chris Taggart | April 04, 2011 at 01:49 PM
I would like to help. I am saving the data - see
http://semanticommunity.info/Build_Recovery.gov_in_the_Cloud and my comments at http://dallemang.typepad.com/my_weblog/2011/04/datagov-shutdown-is-my-fault.html
Posted by: Bniemannsr | April 04, 2011 at 05:00 PM
This is great, and having an interoperable database of legal entities will go a long way towards staking out the territory. However, I am not sure that this is going to address the root problem of the Coasian "nature of the firm" you mention at the end of the Opportunity section. An approach based on legal entity, in a sense, inherently assumes the problem away: a firm is what the books say it is.
Having done research on industries myself, I am aware of the limitations of such an approach – but I am also aware of the great value of having such a database, limitations and all.
In Europe, there would be an obvious policy implication: you could track and make transparent subsidies to firms, which are legion, and therefore very opaque as well as expensive to the taxpayer. We could build a sort of Farmsubsidy generalized to all subsidies! Pure rapture. ;-)
Posted by: Alberto Cottica | April 06, 2011 at 01:50 PM
Alberto, To be clear the goal is not to build a new database. But, instead, to look for ways to make the different numbering schemes talk to one another so that we can create mashups of existing databases. I agree with you, however, that the "firm" is a slippery thing, which is why we're better off trying to trace the evolution of its relationships over time. You are exactly right about the kind of pilot project that we're hoping to do and inspire others to do. Let me emphasize the latter!
Posted by: Beth Noveck | April 06, 2011 at 04:28 PM
Beth, I am doing some work on network math, and maybe this biases my thinking. But in principle research on the nature of the firm could be done using networks; there already is fascinating research on networks of board members (the same person participates in more than one board, thereby linking different companies). The trick is to associate legal entities to other legal entities via some kind of links: this helps perceive constellations of companies rather than individual companies, ecosystems rather than single species.
Posted by: Alberto Cottica | April 16, 2011 at 02:05 AM