Difference between revisions of "Help us Classify Research Articles for Usage-Based Discovery"

From Earth Science Information Partners (ESIP)
(added note about the question mark.)
 

Latest revision as of 12:10, September 29, 2021

Usage-based Discovery is powered by a database of Dataset-Usage relationships, where "usage" can be an Application or a Research Article.

These relationships can be difficult and tedious to locate. Fortunately, several of NASA's Distributed Active Archive Centers have maintained curated collections of dataset usage in research articles, giving us something to start with: more than 7000 articles(!).

However, we need to classify these articles into broad research topics so that end users of Usage-based Discovery have a way to browse the many articles. We have provided a Google Sheet to crowd-source classifications from the community, which will:

  1. help populate the database
  2. serve as labels for an automated machine learning classifier

Instructions are simple:

  1. Email chris.lynnes@nasa.gov or slafia@umich.edu to get the URL of the Classification Worksheet, and click on the ClassifyThis tab.
  2. Click on the Article DOI URL to go to the article to be classified.
  3. Read the title and the abstract and use the pulldowns in the topic- columns to classify it into as many as five topics. (You don't need to fill all five.) Note: If you are uncertain about whether a topic applies, you can end the topic with a question mark '?'. We will likely filter those out in the User Interface (for now), but will try them in our machine learning task to see if they help.
  4. In some cases, you may want to dive into the body of the paper, which is fine, though experience shows that the most important topics are usually found in the title and abstract.
  5. If you hit a paywall: try clicking the Google Scholar link and look on the right side of the page for possible locations where authors have self-archived copies at their institutions or places like ResearchGate.
  6. If you want to participate in the leaderboard (coming soon), add your ORCID in the ORCID column.
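
The question-mark convention in step 3 could be handled downstream roughly as follows. This is only a minimal sketch, not the project's actual processing code: it assumes each worksheet row has been exported as a dictionary whose topic columns are named topic-1 through topic-5 (hypothetical names), and the topic values shown are made up for illustration.

```python
def split_topics(row, max_topics=5):
    """Separate confident topics from uncertain ones (those ending in '?').

    `row` is assumed to be a dict keyed by hypothetical column names
    'topic-1' ... 'topic-5'; the real worksheet layout may differ.
    """
    confident, uncertain = [], []
    for i in range(1, max_topics + 1):
        value = (row.get(f"topic-{i}") or "").strip()
        if not value:
            continue  # unused topic columns are allowed
        if value.endswith("?"):
            # Drop the trailing marker but remember the topic was uncertain.
            uncertain.append(value.rstrip("?").strip())
        else:
            confident.append(value)
    return confident, uncertain


# Illustrative row; the topic names are invented for this example.
row = {"topic-1": "Hydrology", "topic-2": "Cryosphere?", "topic-3": ""}
confident, uncertain = split_topics(row)
```

Keeping the two lists separate would let a user interface show only the confident topics while a machine learning experiment tries training with and without the uncertain ones.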

Go ahead, try a couple, it's actually kind of...fun. No, really! Some of the papers you run across are quite interesting...

Pro Tip: The Valids tab has some hints on other keywords to keep an eye out for in the various categories.