Developers Perception of Peer Code Review in Research Software Development

Citation: Nasir U. Eisty, Jeffrey C. Carver (2021/09/22) Developers Perception of Peer Code Review in Research Software Development. arXiv
arXiv (preprint): arXiv:2109.10971
Download: https://arxiv.org/abs/2109.10971
Tagged: Computer Science, software engineering


Background

  • Software is important for research.
  • Research software engineers should follow standard software practices.
  • However, these practices differ from industry.
    • Risks due to exploration.
    • Constantly changing requirements.
    • Complex communication or I/O patterns.
    • Need for highly specialized knowledge.
    • Larger scale of single executions.
    • Complex software due to modeling complex phenomena.
    • Different goals, knowledge, and skills than industry developers.
  • Testing is hard because there is no test oracle, there is a large number of parameters, and there is legacy code; a tolerance-based testing sketch follows below.
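  • A minimal sketch (not from the paper) of one common workaround for the missing oracle: compare a
    numerical result against an analytic or previously recorded reference value within a tolerance
    instead of checking exact equality. The simulate() routine and its parameters are hypothetical
    placeholders.

        import math

        def simulate(step_size: float, n_steps: int) -> float:
            """Hypothetical numerical routine: explicit Euler integration of dy/dt = -y, y(0) = 1."""
            y = 1.0
            for _ in range(n_steps):
                y += step_size * (-y)  # one explicit Euler step
            return y

        def test_against_reference():
            # No exact oracle: the numerical result only approximates exp(-1),
            # so the test asserts agreement within a relative tolerance.
            result = simulate(step_size=0.001, n_steps=1000)
            assert math.isclose(result, math.exp(-1.0), rel_tol=1e-2)

        if __name__ == "__main__":
            test_against_reference()
            print("tolerance-based regression test passed")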

Solution

  • On the other hand, peer code review could work
    • Numerous benefits
      • Reviewers suggest comments that improve code quality
      • Authors more likely to make code readable
      • Spreads out knowledge of the change
      • Community building
    • Google developers expect four key themes from peer code review: education, maintaining norms, gatekeeping, and accident prevention.
    • A Microsoft developer spends 15-25% of their time reviewing.
    • 25% of review comments are about improving core functionality.

Study

  • RQ1: How do research software developers perform peer code review?
  • RQ2: What effect does peer code review have on research software?
  • RQ3: What difficulties do research software developers face with peer code review?
  • RQ4: What improvements to the peer code review process do research software developers need?

Methodology

Survey Design

  • Questions from prior literature on peer code review.

Pilot Interviews

  • Pilot interviews suggested ways of revising the questions and of developing multiple-choice answers.
  • Interview audience: 13 from NCSA, 9 from Einstein Toolkit Workshop.
  • "Convenience sampling"

Survey

Data Analysis

  • Valid response := answered all quantitative questions and at least one qualitative question
    • What motivates requiring one qualitative answer?
  • The authors coded the qualitative answers individually, then merged their codes, resolving differences case by case.

Results

  • Most respondents are financially compensated for their participation, have been on both sides of code review, and have more than five years of experience.

RQ1: Code review details

  • Most respondents spend less than 5 hours per review; half spend 1 to 5 hours.
  • Most requests get a response within 3 days, 40% within 1 day.
  • Most commits go through review.
  • Most reviews are resolved within a month, half within a week.
  • Number of LoC and number of reviewers varies widely.
  • Common criteria when deciding on reviews: coding standards and domain knowledge are roughly tied, followed by functionality, correctness, time, tests, documentation, and always-accept.
  • Common mistakes corrected during review: code mistakes, design, style, testing, documentation, performance, readability, maintainability.

Positive experiences about code review

  • Knowledge sharing, improved code quality, helpful feedback, positive feeling, problems identified
  • "In a big project it is rare that anyone understands the whole picture... It [code review] can lead to more complete understanding of the task."
  • "It [code review] leads to design discussions happening that would not have happened otherwise."
  • "It makes the team more knowledgeable about what work is."
  • "People found mistakes in code that I wrote, that I would have missed and only found out about much further on the validation process."
  • Code review results in "much better code and a better understanding of different parts of the code."

Negative experiences about code review

  • Takes too long, requestors misunderstand criticism, disagreements, bottleneck, hard to find reviewers, difficult task, unresponsive author
  • "it [peer code review] can be long and time consuming for very small changes, as the process must be followed for even a single character change if it affects results."
  • There are also problems when the "review process gets stalled while nit-picking irrelevant details."
  • "Sometimes people get annoyed when they get feedback especially if they think they are experts"

RQ2: Impact of code review on research software

  • By a large margin, respondents strongly agreed code review is important for their project.
    • This could be due to selection bias.
  • Impacts: improved code quality, followed by knowledge sharing.
  • Why does code review improve code quality? Correctness, followed by a tie among improved readability, more eyes, and better maintainability.
  • On correctness: "If you’ve written code yourself, it’s hard to see the assumptions you’ve made. Others can spot these and ask you to clarify, also spot your mistakes"
  • On readability: "mak[ing] the codebase more uniform and improves the quality of the code"

RQ3: Difficulties research software developers face with code review

  • Difficulties: understanding the code, understanding the system, administrative issues.
  • Barriers: finding time, followed distantly by phrasing comments, finding the right people, participation, developer egos, and reviews taking too long.

RQ4: What improvements do research software developers need?

  • Formalizing the process, followed by tooling, more people, better incentives, more training, and more time.
  • Formalizing process: "a more formal structure of at least one science review followed by one technical review. It’s currently a bit of a free-for-all"
  • Tooling: branching VCS and automatic analysis (a minimal sketch of such automation follows below).
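  • A minimal sketch (not from the paper) of what "automatic analysis" in a branching workflow might
    look like: lint only the Python files changed on the current branch relative to the base branch
    before requesting review. Assumes git and flake8 are available and the base branch is named
    "main"; all names here are illustrative.

        import subprocess
        import sys

        def changed_python_files(base: str = "main") -> list[str]:
            # List files modified on the current branch relative to the base branch.
            diff = subprocess.run(
                ["git", "diff", "--name-only", f"{base}...HEAD"],
                capture_output=True, text=True, check=True,
            ).stdout
            return [path for path in diff.splitlines() if path.endswith(".py")]

        def main() -> int:
            files = changed_python_files()
            if not files:
                print("no Python files changed; nothing to analyze")
                return 0
            # Run the linter on the changed files only; a nonzero exit code signals findings.
            return subprocess.run(["flake8", *files]).returncode

        if __name__ == "__main__":
            sys.exit(main())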

Threats to validity

  • Participants might not know what certain terms mean, but the authors believe they do.
  • Human perception can be wrong, but there is no better source of truth.
  • Perhaps the sample is not representative of the population.
    • Those willing to answer a survey on code review are more likely to be aware of it.
  • Participants may have misunderstood questions, but authors tried to be clear.

Conclusion

  • Similar results to commercial software engineering, despite differences in research context.
  • Code review largely beneficial, but could benefit from explicit process.
  • The authors plan to raise awareness of code review, its flaws, and its benefits within the community.
