Difference between revisions of "Best practices"

From Earth Science Information Partners (ESIP)
 
(21 intermediate revisions by the same user not shown)
 
Below is a set of best practice (BP) recommendations for the IMCR.
 
  
 
+
==Search==
 
 
==Discovering software==
 
Recommendations for discovering software in the IMCR.
 
===IMCR===
 
 
To find software, use the [http://vocab.lternet.edu/vocab/registry/index.php IMCR Controlled Vocabulary] in combination with the "Filter Software List" located in the [http://imcr.ontosoft.org/#list IMCR Portal]. Terms in the controlled vocabulary are organized around [https://www.dataone.org/data-life-cycle the data life cycle], which is an intuitive way to think about the different categories of information management. Browse the controlled vocabulary to identify terms you'd like to search on, then add these terms to the "Filter Software List" search tool to find what you're looking for. Some notes on the search tool:
 
 
* '''Search''' only supports search across software names. Free-text searching is not supported.
 
 
* '''Author''' supports search across software authors, including individuals, organizations and initiatives. Separate multiple terms with commas. An auto-complete drop-down field lists software authors found in the IMCR.
 
 
* '''Keywords''' supports search across keywords tagged to each software item of the IMCR. Browse the [http://vocab.lternet.edu/vocab/registry/index.php IMCR Controlled Vocabulary] for a list of keywords and definitions.
 
* '''Language'''
+
* '''Language''' supports search by implementation language.
* '''License'''  
+
* '''License''' supports search by the license under which the software was released.
* '''Operating System'''  
+
* '''Operating System''' supports search by operating system the software can run on.
 
* '''Publisher''' supports search across software publishers. The content of "Publisher" is often equivalent to the content listed under "Author".
 
  
==Developing software==
+
==Register==
No software development standards are enforced in the IMCR; however, community-recommended BPs are listed below for anyone interested in learning and applying them.
+
BPs for registering software in the IMCR
===ESIP software assessment guidelines===
+
 
* [https://esipfed.github.io/Software-Assessment-Guidelines/guidelines.html ESIP Software Guidelines]
+
===Create user account===
 +
Create a user account and begin publishing your software. The metadata wizard guides you through the process of describing important attributes of your software; however, some additional guidance is offered below to help optimize discovery and understanding of your software.
 +
 
 +
===Keywording===
 +
Keywords from the IMCR Vocabulary are the single most important piece of metadata, as they are the mechanism by which software is categorized into information management tasks. Below is a guide for addressing common keywording challenges.
 +
* [https://docs.google.com/spreadsheets/d/1PklZRyOEelkK-65gr2-JnW7GqaYH9ogbZ7mje_rx2Ik/edit?usp=sharing If the term doesn't exist, suggest it.]
 +
* If the software isn't neatly described by one term, use multiple terms.
 +
* Keyword the software's IM function, not the internal functionality it performs (e.g. software may search an external resource for data in the process of providing the final actionable output to the user, but the search isn't the salient IM function provided by the software).
 +
* Limit keywording to a few terms describing the primary focus of the software (too many terms dilute search accuracy).
  
 +
===R Package in CRAN===
 +
To register a CRAN package:
 +
* Click '''Publish your software'''
 +
* Add the software name under '''Identify > What is the software called?'''
 +
* Add keywords from the IMCR Vocabulary under '''Identify > What are general categories (keywords, labels) for this software?'''
 +
* Add the URL for the package source code copied from the CRAN Package landing page (e.g. https://cran.r-project.org/web/packages/antiword/index.html) under '''Execute > What is the URL for the code?'''
 +
IMCR bots automate completion and maintenance of all the other metadata fields by using the supplied URL to extract metadata from the CRAN package DESCRIPTION file and any associated GitHub repository. Maintenance of keywords is conducted by IMCR admin humans.
  
 +
===GitHub Repository===
 +
Don't use this option if the software is archived in an official repository (e.g. CRAN, PyPI). To register a GitHub repository:
 +
* Click '''Publish your software'''
 +
* Add the software name under '''Identify > What is the software called?'''
 +
* Add keywords from the IMCR Vocabulary under '''Identify > What are general categories (keywords, labels) for this software?'''
 +
* Add the URL of the GitHub repository (e.g. https://github.com/EDIorg/EMLassemblyline) under '''Update > How is the software being developed or maintained?'''
 +
IMCR bots automate completion and maintenance of all the other metadata fields by using the supplied URL to extract metadata from the CRAN package DESCRIPTION file and any associated GitHub repository. Maintenance of keywords is conducted by IMCR admin humans.
  
==Publishing software==
+
===Other===
BPs for publishing software in the IMCR and software repositories (e.g. GitHub).
+
Software not in CRAN or GitHub must be registered manually following the guidelines below.
===IMCR===
 
Publishing in the IMCR is easy but some guidance on how to best supply information about your software goes a long way to helping others discover and reuse it. Below are recommendations for completing the form fields for your software registration in the IMCR.
 
  
 
====Identify====
 
* '''What are general categories (keywords, labels) for this software?''' - Use the [http://vocab.lternet.edu/vocab/registry/index.php IMCR Controlled Vocabulary] to keyword your software. Keywords are the primary mechanism by which users find your software. Add broad terms and narrower terms to improve search and discovery. Be careful to spell the keywords correctly! If you can't find a suitable keyword in the controlled vocabulary, please suggest it and its corresponding definition to the IMCR chairs.
+
* '''What are general categories (keywords, labels) for this software?''' - Use the [http://vocab.lternet.edu/vocab/registry/index.php IMCR Controlled Vocabulary] to select keywords for your software. Keywords are the primary mechanism by which users find software in the IMCR. Add both broad terms and narrower terms and be careful to spell the keywords correctly! If you can't find a suitable keyword in the controlled vocabulary, please suggest it and its corresponding definition to the IMCR chairs.
  
 
====Understand====
 
* '''Who created this software?''' - Add the project, organization, person, or initiative that helped create this software. You are encouraged to add multiple entities (i.e. organization and all individuals). For individuals, use given name and surname. Be careful to spell names correctly! Note that a drop-down auto-complete field helps with this. The content of this field is searchable by the "Filter Software List" "Author" field.
+
* '''Who created this software?''' - Add the project, organization, person, and/or initiative that helped create this software. Be careful to spell names correctly! NOTE: This field auto-completes to software creators already listed in the IMCR. This field is searchable by the "Filter Software List" "Author" field.
* '''Are there any additional contributors of note for this software?''' - Add major contributors to the software project. These entities can be of the same types listed under "Who created this software?" (e.g. project, organization, person). The content of this field is searchable by the "Filter Software List" "Author" field.
+
* '''Are there any additional contributors of note for this software?''' - Add major contributors to the software. These entities can be of the same types listed under "Who created this software?" (e.g. project, organization, person). The content of this field is searchable by the "Filter Software List" "Author" field.
 
* '''(Optional) Who is the publisher of this software if not the author?''' - Add the project, organization, person, or initiative who published this software. Duplicate entries are encouraged from the "Who created this software?" field. The content of this field is searchable by the "Filter Software List" "Publisher" field. Separate multiple entries with commas.
 
* '''What are domain specific keywords for this software? (e.g. hydrology, climate)''' - Add science-domain-specific keywords to this field and to the field "What are general categories (keywords, labels) for this software?" located at /Locate/Important. Use the [http://vocab.lternet.edu/vocab/registry/index.php IMCR Controlled Vocabulary] to keyword your software with science-domain-specific keywords (see the "science domain" category of the IMCR CV). Keywords are the primary mechanism by which users find your software. Add broad terms and narrower terms to improve search and discovery. Be careful to spell the keywords correctly! If you can't find a suitable keyword in the controlled vocabulary, please suggest it and its corresponding definition to the IMCR chairs.
+
* '''What are domain specific keywords for this software? (e.g. hydrology, climate)''' - Add science-domain-specific keywords to this field and to the field "What are general categories (keywords, labels) for this software?" located at /Locate/Important. Use the [https://vocab.lternet.edu/vocab/vocab/index.php LTER Controlled Vocabulary "disciplines" category] to keyword your software. Add broad terms and narrower terms to improve search and discovery. Be careful to spell the keywords correctly! If you can't find a suitable keyword in the controlled vocabulary, please suggest it and its corresponding definition to the IMCR chairs.
  
 
====Execute====
 
* '''What license is the code released under?''' - A list of common licenses is recognized. Use the auto-complete feature of this field to select the recognized license spelling.
+
* '''What license is the code released under?''' - A list of common licenses is recognized and available through the auto-complete feature of this field.
* '''What language(s) is the software written in? ''' - A list of languages used in the IMCR portal is provided by the auto-complete feature of this field. If the language is not available in the drop-down list, then add it.
+
* '''What language(s) is the software written in? ''' - Languages already listed in the IMCR are provided by the auto-complete feature of this field. If the language is not available in the drop-down list, then add it.
* '''What operating system can the software run on?''' - A list of operating systems used in the IMCR portal is provided by the auto-complete feature of this field. Add an operating system if it's not available in the drop-down list.
+
* '''What operating system can the software run on?''' - Use [https://bioportal.bioontology.org/ontologies/SWO/?p=classes&conceptid=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FIAO_0000025&jump_to_nav=true the Software Ontology "programming language" category] to keyword this field.
* '''What other software does the software require to be installed?''' - Do not use this field!!! OntoSoft automatically adds entries in this field to the master list of registered software in the IMCR. We don't want the IMCR populated with these items.
+
* '''What other software does the software require to be installed?''' - ''DO NOT USE THIS FIELD!'' OntoSoft automatically adds entries in this field to the master list of registered software in the IMCR. We don't want the IMCR populated with these items.
 +
 
 +
====Do Research====
 +
* '''What input files does the software require?''' - Use [https://bioportal.bioontology.org/ontologies/SWO/?p=classes&conceptid=http%3A%2F%2Fedamontology.org%2Fformat_1915&jump_to_nav=true the Software Ontology "Format" category] to keyword this field.
 +
* '''What output files does the software produce?''' - Use [https://bioportal.bioontology.org/ontologies/SWO/?p=classes&conceptid=http%3A%2F%2Fedamontology.org%2Fformat_1915&jump_to_nav=true the Software Ontology "Format" category] to keyword this field.
 +
 
 +
====Misc.====
 +
* '''Input data type?''' - Use [https://bioportal.bioontology.org/ontologies/SWO/?p=classes&conceptid=http%3A%2F%2Fedamontology.org%2Fformat_2350 the Software Ontology "Format (by type of data)" category] to fill in this field.
  
====Update====
+
===GitHub Organization===
=====(Optional) How is the software being developed or maintained?=====
+
The best practices for registering a repository, not a software library (e.g. https://github.com/bd-R), have yet to be determined.
=====(Optional) Are there any on-line resources for accessing the developer community for this software?=====
 
=====What versions does the software have?=====
 
  
===Project repository===
+
===Single Script===
Considerations for features to include in the project repository.
+
The best practices for registering a single script have yet to be determined.
====README====
 
READMEs are essential for orienting the user to what the project addresses.
 
====Vignette====
 
Vignettes are useful for demonstrating the project's functionality.
 
====DOI====
 
Periodic releases of the project code can be accompanied by the minting of a DOI. GitHub supports archiving releases in Zenodo with generation of a DOI, thereby making the project citable.
 
* [https://zenodo.org Zenodo]
 
* [https://guides.github.com/activities/citable-code/ Interfacing GitHub and Zenodo]
 
====Test data====
 
Test data facilitates experimentation and understanding of the project's functionality.
 
====Tagging====
 
Tagging the project repo with good keywords facilitates discovery. See the IMCR Controlled Vocabulary.
 

Latest revision as of 13:31, October 7, 2019



Overview

Below is a set of best practice (BP) recommendations for the IMCR.

Search

To find software, use the IMCR Controlled Vocabulary in combination with the "Filter Software List" located in the IMCR Portal. Terms in the controlled vocabulary are organized around the data life cycle, which is an intuitive way to think about the different categories of information management. Browse the controlled vocabulary to identify terms you'd like to search on, then add these terms to the "Filter Software List" search tool to find what you're looking for. Some notes on the search tool:

  • Search only supports search across software names. Free-text searching is not supported.
  • Author supports search across software authors, including individuals, organizations and initiatives. Separate multiple terms with commas. An auto-complete drop-down field lists software authors found in the IMCR.
  • Keywords supports search across keywords tagged to each software item of the IMCR. Browse the IMCR Controlled Vocabulary for a list of keywords and definitions.
  • Language supports search by implementation language.
  • License supports search by the license under which the software was released.
  • Operating System supports search by operating system the software can run on.
  • Publisher supports search across software publishers. The content of "Publisher" is often equivalent to the content listed under "Author".
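
The filter behaviour described in these notes can be pictured with a short Python sketch. It is only an illustration: the SoftwareEntry record, its field names, and the example author and keyword values are hypothetical, and the way multiple comma-separated terms combine (OR within a field, AND across fields) is an assumption, since the notes do not say how the portal combines them.

```python
from dataclasses import dataclass, field


@dataclass
class SoftwareEntry:
    """Hypothetical record; field names are illustrative, not the IMCR schema."""
    name: str
    authors: list = field(default_factory=list)
    keywords: list = field(default_factory=list)


def split_terms(raw):
    """Split a comma-separated filter string into lowercase, trimmed terms."""
    return [t.strip().lower() for t in raw.split(",") if t.strip()]


def filter_software(entries, search="", author="", keywords=""):
    """Return the entries that satisfy every non-empty filter.

    Assumption: terms within one field are OR-ed, separate fields are AND-ed.
    """
    needle = search.strip().lower()
    author_terms = split_terms(author)
    keyword_terms = split_terms(keywords)

    results = []
    for entry in entries:
        # "Search" matches the software name only; there is no free-text search.
        if needle and needle not in entry.name.lower():
            continue
        if author_terms and not any(a.lower() in author_terms for a in entry.authors):
            continue
        if keyword_terms and not any(k.lower() in keyword_terms for k in entry.keywords):
            continue
        results.append(entry)
    return results


# Made-up entries; the author and keyword values are placeholders.
entries = [
    SoftwareEntry("EMLassemblyline", ["EDI"], ["metadata creation"]),
    SoftwareEntry("antiword", ["Example Author"], ["data conversion"]),
]
matches = filter_software(entries, keywords="metadata creation, quality control")
print([e.name for e in matches])
```

Because "Search" looks at names only, a call such as filter_software(entries, search="metadata") returns nothing here even though one entry carries a keyword containing that word.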

Register

BPs for registering software in the IMCR

Create user account

Create a user account and begin publishing your software. The metadata wizard guides you through the process of describing important attributes of your software; however, some additional guidance is offered below to help optimize discovery and understanding of your software.

Keywording

Keywords from the IMCR Vocabulary are the single most important piece of metadata, as they are the mechanism by which software is categorized into information management tasks. Below is a guide for addressing common keywording challenges.

  • If the term doesn't exist, suggest it.
  • If the software isn't neatly described by one term, use multiple terms.
  • Keyword the software's IM function, not the internal functionality it performs (e.g. software may search an external resource for data in the process of providing the final actionable output to the user, but the search isn't the salient IM function provided by the software).
  • Limit keywording to a few terms describing the primary focus of the software (too many terms dilute search accuracy).
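
As a rough illustration of this guide, the Python sketch below checks a list of candidate keywords before registration. The vocabulary excerpt is hypothetical (the real terms live in the IMCR Controlled Vocabulary), matching is assumed to be case-insensitive, and the cutoff of five terms is an assumed stand-in for "a few", since no number is given.

```python
# Hypothetical excerpt; the real terms live in the IMCR Controlled Vocabulary.
VOCABULARY = {"data discovery", "metadata creation", "quality control"}
MAX_KEYWORDS = 5  # assumed cutoff for "a few terms"; no number is specified


def review_keywords(candidates, vocabulary=VOCABULARY, limit=MAX_KEYWORDS):
    """Sort candidate keywords into accepted terms and terms to suggest."""
    accepted, to_suggest = [], []
    for raw in candidates:
        term = raw.strip().lower()
        if not term or term in accepted or term in to_suggest:
            continue  # skip blanks and duplicates
        if term in vocabulary:
            accepted.append(term)
        else:
            # Not in the vocabulary: a candidate to suggest to the IMCR chairs.
            to_suggest.append(term)
    return {
        "accepted": accepted,
        "to_suggest": to_suggest,
        "too_many": len(accepted) > limit,
    }


report = review_keywords(["Metadata creation", "quality control", "data munging"])
print(report)
```

Terms that land in to_suggest are the ones to propose through the suggestion process rather than to enter as free text.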

R Package in CRAN

To register a CRAN package:

  • Click Publish your software
  • Add the software name under Identify > What is the software called?
  • Add keywords from the IMCR Vocabulary under Identify > What are general categories (keywords, labels) for this software?
  • Add the URL for the package source code copied from the CRAN Package landing page (e.g. https://cran.r-project.org/web/packages/antiword/index.html) under Execute > What is the URL for the code?

IMCR bots automate completion and maintenance of all the other metadata fields by using the supplied URL to extract metadata from the CRAN package DESCRIPTION file and any associated GitHub repository. Maintenance of keywords is conducted by IMCR admin humans.
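
To give a sense of what a DESCRIPTION file can supply, here is a minimal Python sketch that reads "Field: value" text of the kind used by R package DESCRIPTION files. It is not the IMCR bots' code, which is not described here; the parser is a simplified assumption and the sample package is made up.

```python
def parse_description(text):
    """Parse 'Field: value' text into a dict.

    Lines that start with whitespace are treated as continuations of the
    previous field, which is how long Description entries are wrapped.
    """
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and current is not None:
            fields[current] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = value.strip()
    return fields


# Made-up DESCRIPTION content for a package that does not exist.
SAMPLE = """\
Package: examplepkg
Title: An Example Package
Version: 0.1.0
License: MIT + file LICENSE
URL: https://github.com/example-org/examplepkg
Description: A made-up package used only to
    illustrate the file layout.
"""

meta = parse_description(SAMPLE)
print(meta["Package"], meta["Version"], meta["License"])
```

Fields such as Version, License and URL are the sort of metadata that can be filled in automatically, which is why only the name, keywords and code URL need to be entered by hand.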

GitHub Repository

Don't use this option if the software is archived in an official repository (e.g. CRAN, PyPI). To register a GitHub repository:

  • Click Publish your software
  • Add the software name under Identify > What is the software called?
  • Add keywords from the IMCR Vocabulary under Identify > What are general categories (keywords, labels) for this software?
  • Add the URL of the GitHub repository (e.g. https://github.com/EDIorg/EMLassemblyline) under Update > How is the software being developed or maintained?

IMCR bots automate completion and maintenance of all the other metadata fields by using the supplied URL to extract metadata from the CRAN package DESCRIPTION file and any associated GitHub repository. Maintenance of keywords is conducted by IMCR admin humans.
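
Before pasting a URL into the form, it can help to confirm that it points at a single repository rather than at an account page. The Python sketch below does this from the URL path alone; the function name and return values are hypothetical, and it makes no network request, so it cannot tell whether the repository actually exists.

```python
from urllib.parse import urlparse


def classify_github_url(url):
    """Classify a URL as a GitHub repository, an owner page, or something else.

    Returns (kind, owner, repo). The decision uses only the URL path.
    """
    parsed = urlparse(url)
    if parsed.netloc.lower() not in {"github.com", "www.github.com"}:
        return ("other", None, None)
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 2:
        # Drop a trailing ".git" so clone URLs are handled as well.
        repo = parts[1][:-4] if parts[1].endswith(".git") else parts[1]
        return ("repository", parts[0], repo)
    if len(parts) == 1:
        # One path segment is a user or organization page, not a repository.
        return ("owner", parts[0], None)
    return ("other", None, None)


print(classify_github_url("https://github.com/EDIorg/EMLassemblyline"))
print(classify_github_url("https://github.com/bd-R"))
```

Only URLs classified as "repository" fit the steps in this section.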

Other

Software not in CRAN or GitHub must be registered manually following the guidelines below.

Identify

  • What are general categories (keywords, labels) for this software? - Use the IMCR Controlled Vocabulary to select keywords for your software. Keywords are the primary mechanism by which users find software in the IMCR. Add both broad terms and narrower terms and be careful to spell the keywords correctly! If you can't find a suitable keyword in the controlled vocabulary, please suggest it and its corresponding definition to the IMCR chairs.

Understand

  • Who created this software? - Add the project, organization, person, and/or initiative that helped create this software. Be careful to spell names correctly! NOTE: This field auto-completes to software creators already listed in the IMCR. This field is searchable by the "Filter Software List" "Author" field.
  • Are there any additional contributors of note for this software? - Add major contributors to the software. These entities can be of the same types listed under "Who created this software?" (e.g. project, organization, person). The content of this field is searchable by the "Filter Software List" "Author" field.
  • (Optional) Who is the publisher of this software if not the author? - Add the project, organization, person, or initiative who published this software. Duplicate entries are encouraged from the "Who created this software?" field. The content of this field is searchable by the "Filter Software List" "Publisher" field. Separate multiple entries with commas.
  • What are domain specific keywords for this software? (e.g. hydrology, climate) - Add science-domain-specific keywords to this field and to the field "What are general categories (keywords, labels) for this software?" located at /Locate/Important. Use the LTER Controlled Vocabulary "disciplines" category to keyword your software. Add broad terms and narrower terms to improve search and discovery. Be careful to spell the keywords correctly! If you can't find a suitable keyword in the controlled vocabulary, please suggest it and its corresponding definition to the IMCR chairs.

Execute

  • What license is the code released under? - A list of common licenses is recognized and available through the auto-complete feature of this field.
  • What language(s) is the software written in? - Languages already listed in the IMCR are provided by the auto-complete feature of this field. If the language is not available in the drop-down list, then add it.
  • What operating system can the software run on? - Use the Software Ontology "programming language" category to keyword this field.
  • What other software does the software require to be installed? - DO NOT USE THIS FIELD! OntoSoft automatically adds entries in this field to the master list of registered software in the IMCR. We don't want the IMCR populated with these items.

Do Research

  • What input files does the software require? - Use the Software Ontology "Format" category to keyword this field.
  • What output files does the software produce? - Use the Software Ontology "Format" category to keyword this field.

Misc.

  • Input data type? - Use the Software Ontology "Format (by type of data)" category to fill in this field.

GitHub Organization

The best practices for registering a repository, not a software library (e.g. https://github.com/bd-R), have yet to be determined.

Single Script

The best practices for registering a single script have yet to be determined.