US Federal RORs
The U.S. Federal Government is a major participant in the global research community, and keeping track of the research it supports is an important task. The Wikipedia List of United States research and development agencies includes five independent agencies, thirteen cabinet level agencies, and four multi-agency initiatives. Most of these include multiple layers of centers, divisions, services, institutes, directorates, and other organizations, each of which carries out, funds, and oversees research in almost every conceivable discipline.
The Research Organization Registry (ROR) is a community-led project that aims to develop an open, sustainable, usable, and unique identifier for every research organization in the world. It is supported by several large identifier infrastructure operators (Crossref and DataCite) as well as the University of California systemwide California Digital Library, and it recently released the first version of a registry based on GRID data donated to the effort by Digital Science. Until the ROR community takes over curation and management of the Registry, its contents will continue to be the GRID data. The GRID database is updated several times a year according to open policies and is publicly available at no cost. Dumps of the ROR data are also available.
Adopting any new identifier system, even when the benefits are well known, is a significant challenge. The ROR community roadmap identifies a number of product, policy, and community development tasks, including raising awareness among community members. The goal of this blog post is to raise ROR awareness among researchers affiliated with the U.S. Federal Government. How can these researchers use ROR, and what future work might help facilitate adoption in the U.S. Federal research community?
Finding U.S. Federal RORs
Characterizing the granularity of ROR identifiers is an important step towards using these identifiers effectively in the U.S. Federal context. In the academic setting, the goal of ROR is generally to identify organizations at the University or College level, i.e. not individual academic departments. It is not immediately clear how this translates to the public sector. Given the deep hierarchy of Federal agencies and departments, finding Federal organizations can be a challenge. As mentioned above, there are roughly 20 “department level” entities in the U.S. Government that do research and development. Does this mean that 20 RORs will cover the whole U.S. Government? That seems like a very small number, perhaps too small to be useful. Rather than try to answer this question theoretically, we start with an empirical characterization of the current ROR data as a baseline.
The ROR data includes two fields that provide a starting point: organization country and type. The United States makes up 31% of the data (29,686 RORs), and 1,247 of those are identified as Government types. Many of these organizations are state or regional entities, which are out of scope for this work, so a better search strategy is needed. The ROR data also includes links to organization homepages, and many U.S. Federal entities are in the .gov or .mil domains. The second step was therefore to examine the homepage domains and identify specific organizations with .gov or .mil addresses. The .gov domain still includes many state and regional entities, so some manual selection is required. In addition, many U.S. National Labs have their own domain names that do not include agency abbreviations, e.g. Lawrence Berkeley National Laboratory (http://www.lbl.gov/), which is part of the Department of Energy Office of Science.
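The domain-based search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the script used for this post: the record layout (the `name`, `country_code`, and `links` keys) is an assumption made for the example, and two of the three sample records are invented.

```python
from urllib.parse import urlparse

# Sample records. The field names ("name", "country_code", "links") are
# assumptions for this sketch, not a confirmed registry schema. Only the
# Lawrence Berkeley entry comes from the text; the others are invented.
records = [
    {"name": "Lawrence Berkeley National Laboratory",
     "country_code": "US", "links": ["http://www.lbl.gov/"]},
    {"name": "Example State Agency",
     "country_code": "US", "links": ["https://agency.example-state.gov/"]},
    {"name": "Example University",
     "country_code": "US", "links": ["https://www.example.edu/"]},
]

FEDERAL_SUFFIXES = (".gov", ".mil")


def homepage_host(record):
    """Return the lower-cased host of the first homepage link, or ''."""
    links = record.get("links") or []
    if not links:
        return ""
    return (urlparse(links[0]).hostname or "").lower()


def is_candidate(record):
    """True for U.S. records whose homepage is in a .gov or .mil domain."""
    if record.get("country_code") != "US":
        return False
    return homepage_host(record).endswith(FEDERAL_SUFFIXES)


candidates = [r for r in records if is_candidate(r)]
for r in candidates:
    print(r["name"], homepage_host(r))
```

Note that the invented state agency passes the filter as well, which is exactly why a manual selection step is still needed after the automated one.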
Table 1 shows the number of RORs for U.S. Federal organizations discovered using domains in the homepage links for the organizations and manual exploration. The largest number of RORs identify regional and state Veterans Affairs medical centers (e.g. VA Eastern Colorado Health Care System). Others include five cabinet level departments, ~10 National Labs, over 100 Centers, ~20 Services, and many other organizations. The table also includes the highest-level ROR for each organization in parentheses.
|Organization Name||Prefix||Count|
|United States Department of Veterans Affairs (https://ror.org/05rsv9s98)||va||157|
|National Science Foundation (https://ror.org/021nxhr62)||nsf||60|
|United States Department of Defense (https://ror.org/0447fe631)||*.mil||54|
|United States Department of Energy (https://ror.org/01bj3aw27)||energy and others||47|
|National Institutes of Health (https://ror.org/01cwqze88)||nih||35|
|National Oceanic and Atmospheric Administration (https://ror.org/02z5nhe81)||noaa||35|
|United States Department of Agriculture (https://ror.org/01na82s61)||usda||25|
|Centers for Disease Control and Prevention (https://ror.org/042twtr12)||cdc||22|
|National Aeronautics and Space Administration (https://ror.org/027ka1x80)||nasa||21|
|United States Department of Health and Human Services (https://ror.org/033jnv181)||hhs||8|
|United States Department of the Interior (https://ror.org/03v0pmy70)||doi||8|
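Counts like those in Table 1 can be approximated by grouping homepage hosts by the label just before .gov and treating all .mil hosts as one group. The sketch below assumes that simple rule. The host names are illustrative inputs rather than registry data, and `agency_prefix` is a hypothetical helper.

```python
from collections import Counter


def agency_prefix(host):
    """Map a homepage host to a prefix such as 'noaa' or '*.mil'.

    Returns None for hosts outside the .gov and .mil domains.
    """
    labels = host.lower().rstrip(".").split(".")
    if len(labels) < 2:
        return None
    if labels[-1] == "mil":
        return "*.mil"
    if labels[-1] == "gov":
        return labels[-2]
    return None


# Illustrative host names only; not taken from the registry.
hosts = [
    "www.noaa.gov",
    "research.noaa.gov",
    "www.nasa.gov",
    "www.army.mil",
    "www.lbl.gov",
    "www.example.edu",
]

counts = Counter(p for p in map(agency_prefix, hosts) if p)
for prefix, n in counts.most_common():
    print(prefix, n)
```

A host such as www.lbl.gov ends up under its own prefix rather than under the Department of Energy, so National Labs still have to be assigned to their parent department by hand (the "energy and others" row in Table 1).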
The U.S. Federal Departments listed in Table 1 are at the top of deep hierarchies with many levels. Parent-child relationships between organizations in these hierarchies are not yet included in ROR, although they are available in GRID. All of the RORs identified in this search are included in the data, along with those identified through parent-child relationships in GRID. Researchers searching for RORs can pick the organizations closest to them from the list. For example, thirty-five organizations within NOAA that currently have RORs are listed in Table 2 along with their RORs. The high-level Offices are shown in bold text with the organizations they include indented below. Researchers in the Earth System Research Laboratory can pick the appropriate ROR from this list (https://ror.org/033tt8e33), and the National Centers for Environmental Information can use the appropriate ROR (https://ror.org/04r0wrp59) for datasets they manage. Researchers who cannot find the appropriate organization in the list can use the ROR for the parent organization, i.e. https://ror.org/02z5nhe81 for NOAA, shown in Table 1. The initial list of Federal RORs is also available. It represents an effort that necessarily includes some manual searching, so some RORs may have been missed. If you cannot find a ROR for an organization that should be on the list, you can suggest that it be added to GRID. Also please let me know at firstname.lastname@example.org.
|Climate Program Office||https://ror.org/00mmmy130|
|Office of Education||https://ror.org/032h87485|
|Office of Marine and Aviation Operations||https://ror.org/04ggd2r74|
|Office for Coastal Management||https://ror.org/05v14bq57|
|Office of Ocean Exploration and Research||https://ror.org/05xqpda80|
|National Ice Center||https://ror.org/0235zh559|
|National Environmental Satellite Data and Information Service||https://ror.org/007qwym43|
|National Marine Fisheries Service||https://ror.org/033mqx355|
|National Ocean Service||https://ror.org/02k4h0334|
|National Weather Service||https://ror.org/00tgqzw13|
|Office of Oceanic and Atmospheric Research||https://ror.org/02kgve346|
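The "pick the closest organization, otherwise use the parent" rule is easy to express in code. The sketch below uses a few of the NOAA RORs listed above. The function names are hypothetical, and the identifier pattern (a leading 0, six lower-case alphanumeric characters, and two digits) is an assumption about the ROR ID layout that happens to fit every ROR in this post. It should not be read as an official validation rule.

```python
import re

# Parent ROR for NOAA (Table 1) and a subset of its offices (Table 2).
NOAA_ROR = "https://ror.org/02z5nhe81"
NOAA_CHILD_RORS = {
    "Climate Program Office": "https://ror.org/00mmmy130",
    "National Marine Fisheries Service": "https://ror.org/033mqx355",
    "National Ocean Service": "https://ror.org/02k4h0334",
    "National Weather Service": "https://ror.org/00tgqzw13",
}

# Assumed identifier layout; not an official validation rule.
ROR_ID = re.compile(r"^0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")


def normalize_ror(value):
    """Remove stray whitespace and return a canonical ROR URL, or None."""
    compact = "".join(value.split()).lower()
    ror_id = compact.rsplit("/", 1)[-1]
    if not ROR_ID.match(ror_id):
        return None
    return f"https://ror.org/{ror_id}"


def ror_for(org_name, children=NOAA_CHILD_RORS, parent=NOAA_ROR):
    """Return the ROR of the named office, or the parent ROR if unlisted."""
    return children.get(org_name, parent)


print(ror_for("National Weather Service"))
print(ror_for("An office without its own ROR"))
print(normalize_ror("https://ror.org/ 033tt8e33"))
```

The normalization step is worth having because RORs copied from web pages or documents can pick up stray spaces, which break the link.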
The currently available RORs include persistent identifiers for many U.S. Federal research and development agencies. In many cases, these RORs have sufficient granularity to reach down the Federal hierarchy at least a level or two. If the optimum granularity is not currently available, RORs for higher-level departments or agencies can be used until RORs for more Federal organizations are added to the registry.
The dataset described here is available.
A short list of related resources.
- Who is Who and What is What? The Need for Universal Entity Identification in the United States
- Codes for the Identification of Federal and Federally-Assisted Organizations
- Organisation identifiers: current provider survey
- A-Z Index of U.S. Government Departments and Agencies
- List of United States research and development agencies