UsabilityCluster/MonthlyMeeting/2018-06-06 MeetingNotes

From Federation of Earth Science Information Partners

Meeting Agenda - Usability Cluster - 2018-06-06 1PM EDT

http://wiki.esipfed.org/index.php/UsabilityCluster/MonthlyMeeting/2018-06-06


Attendees: Bob, Megan, Tamar, Annie, Madison, Connor, Sophie


Agenda:

  1. Presentation on "Measuring User Experience"
  2. Discussion of the Framework's "Post-Test Reflection" section
  3. Usability pilot testing - IEDA usability tasks

Notes:

  • Sophie goes over the agenda for today
  1. Presentation on “Measuring User Experience”
    • The following presentation was taken from a “Measuring User Experience” course. Contact Sophie with questions; she will forward them to the original presenter.
    • Understanding the difference between quantitative and qualitative approaches to usability
    • Qualitative (Formative)
      1. We use information from users to iterate over designs
      2. Heuristic evaluation: do we meet a criterion? How do we perform against a principle?
    • Quantitative (Summative)
      1. Usually measured at the end of an SDLC (software development life cycle)
      2. Like NCAR data archive
      3. We need a calculable metric; larger sample sizes for statistical soundness
      4. For Qualitative, 5 people can be sufficient - Quantitative should be 30-40 up to 100 people
      5. We have clear tests (Likert scales with number values)
    • Design Considerations
      1. Mitigate errors
      2. Internal vs. External Validity
        1. Internal: Randomizing the order of questions
        2. External: Ensuring that the right kind of user is tested (e.g., nurse vs. educator with an operating room app)
      3. Within-subjects design: Asking a user to go between multiple designs and evaluate the designs comparatively
        1. Can require fewer users
        2. Problem: experience carries over from one design to the next
        3. Mitigation: randomize the order of designs
      4. Self-Reported vs. Performance
        1. Self-reported: How did you feel you did?
        2. Performance: measure variables (number of clicks, time, etc.)
      5. In person vs. remote moderated vs. remote unmoderated
        1. In person: most accurate, due to live feedback and presence of moderator
        2. Remote Moderated: Not ideal, but not necessarily worse; preferred to remote unmoderated
        3. Remote Unmoderated: Worst. We don’t know what people are doing between tasks. Uncontrolled environment
      6. Rating scale
        1. Likert - 1 to 5
        2. Semantic Differential - Completely Satisfied vs. Completely Outraged
      7. Well written tasks
        1. Controlled, Clear Singular Solution, internally valid
      8. Guidelines for writing tasks
        1. Clear success criterion
        2. It’s important that people not be distracted by other items
        3. Make the task neutral - but also mindful of the persona
    • Statistics
      1. We won’t go into significant detail
      2. It’s important to properly analyze results
    • Applications
      1. Evaluating Task Success   
    • Connor asks about the Likert scale and the Semantic Differential
      1. Sophie explains that they should not be combined
      2. Consistency is key
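
Illustration (not part of the presentation): the within-subjects advice to randomize the order of designs could be sketched in Python as below. The function name, participant labels, and design names are hypothetical placeholders, and a fixed seed is assumed only so the assignment can be reproduced.

```python
import random


def assign_design_orders(participants, designs, seed=0):
    """Return a dict mapping each participant to a shuffled design order.

    Shuffling per participant spreads carry-over effects across designs,
    so no single design is always seen first or last.
    """
    rng = random.Random(seed)  # seeded so the same assignment can be regenerated
    orders = {}
    for participant in participants:
        order = list(designs)  # copy, so the caller's list is left untouched
        rng.shuffle(order)
        orders[participant] = order
    return orders


if __name__ == "__main__":
    # Hypothetical participants and designs, for illustration only.
    plan = assign_design_orders(["P1", "P2", "P3"], ["Design A", "Design B", "Design C"])
    for participant, order in plan.items():
        print(participant, "->", ", ".join(order))
```

With very small groups, a plain shuffle may still leave one design in first position more often than the others; a fully counterbalanced scheme (each order used equally often) is an alternative.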
  2. Discussion of the Framework's "Post-Test Reflection" section
    • Recommendations on how to “interview” users after the test
      1. Madison: Quantitative questions could apply
      2. Bob: Important to understand if the experience was a “waste of their time”. Suggested phrasing: “This was an efficient use of my time”
      3. System usability testing questions can be applied here from
        1. Problem: are 10 questions too many?
      4. Bob: It’s important to clearly express what 1 and 5 mean (i.e., worst to best or vice versa)
      5. We can ask about specific tasks in a more granular fashion to understand specific functional aspects of the system vs. the system as a whole
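
Illustration (not discussed in the meeting): the notes do not name the 10-question instrument. If it is the System Usability Scale (SUS), which is an assumption here, the standard scoring could be sketched as below. The function name is hypothetical. Responses are assumed to be on a 1 to 5 scale where 1 is "strongly disagree" and 5 is "strongly agree", which also makes the direction of the scale explicit, as Bob suggests.

```python
def sus_score(responses):
    """Compute a 0-100 SUS score from ten answers on a 1-5 agreement scale.

    Assumes standard SUS item order: odd-numbered items are positively
    worded, even-numbered items are negatively worded.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # positive item: higher agreement is better
        else:
            total += 5 - response   # negative item: lower agreement is better
    return total * 2.5              # scale the 0-40 sum to 0-100


if __name__ == "__main__":
    # Hypothetical answers from one participant, for illustration only.
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))
```

The result is a single score per participant, so it describes the system as a whole rather than individual tasks; task-level questions would need to be asked separately.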
  3. Usability pilot testing - IEDA usability tasks
    • Tamar will do the pilot testing
    • Megan asks whether we would give the test participant a copy of the tasks
      1. Madison: It could be ok if they give out tasks one at a time
    • Megan asks if it’s appropriate to prompt people about their general opinions and feeling about the site
      1. Sophie says no problem; it’s very appropriate
    • Tamar
      1. First thoughts: simple interface; very text heavy; attention is directed to titles of search results; search facet list is very straightforward and not too cluttered; overall nice
      2. First task: “If you want data from Juan de Fuca ridge, what would you do?”
        1. Used the search function
        2. Was successful
      3. Next: Narrow the results down. How would you do that?
        1. Tamar: Started with another search term
        2. Moved to Search Facets - Not certain how to re-search or apply changes to original search
        3. Does new search with settings selected; doesn’t seem to work
        4. Tamar sees that she needs to click on small text to apply the filter
          1. Decides to move to author
      4. Sophie interjects at the next task to explain its expectations (essentially, how would you view a dataset?)
        1. Tamar clicks the DOI to the research site
        2. Megan asks Tamar to access the data set specifically
          1. Sophie clarifies that the task is to “obtain the files”
        3. Tamar clicks on download data - is now seeking a download link
      5. Next task - Perform another search?
        1. Easily done
      6. If you need help what would you do?
        1. Tooltips were cool
      7. New link - How would you find the IEDA integrated catalog?
        1. Via search data button on IEDA main page
    • Sophie expresses approval at Megan’s distancing herself from guiding the tasks
      1. Sophie encourages her to ask more questions when the user is quiet, e.g. “What are you thinking? What are you doing?”
      2. It’s ok to add more contextual guidance (assume a certain motivation as a user)
      3. Megan expresses personal frustrations with one of the search facets (considers removing it entirely)
      4. Sophie relates with Megan’s experiences
    • Madison - There was a point where Tamar had no datasets and was asked to look into a dataset
      1. It’s ok to restart them from that point
    • Megan does not feel like she needs to adjust the tasks according to this pilot
      1. It was interesting to see how much the search box was the fixation
    • Time Considerations - 20-25 mins (this is a good baseline)
    • Tamar - Was unsure if she had succeeded at a task