Specifying the research question(s)
Specifying the research questions is the most important part of any systematic review.
The review questions drive the entire systematic review methodology:
- The type of question affects which type of systematic review is appropriate, and the method that should be used for aggregating the evidence. A general question such as “What is the current status of mutation testing research?” would be addressed by a mapping study. A question such as “Is pair-programming more effective than code inspection at reducing code defects?” would be addressed by a quantitative systematic review. A question such as “What are the benefits and risks of adopting DevOps?” would be addressed by a qualitative systematic review.
- The eligibility criteria must be designed to exclude studies that do not address the research questions and include those studies that do.
- The search process must be designed to identify the primary studies that address the research questions.
- The data extraction process must extract the data items needed to answer the questions.
- The data analysis process must synthesize the data in such a way that the review questions can be answered.
The more specific the question, the easier it is to find evidence that can answer it. For example, if our question is “Are cost estimates based on software estimation tools more accurate than cost estimates based on expert opinion?” we know that we need to look for software engineering studies that compare cost estimates obtained from tools with estimates obtained from experts[1]. This allows us to target our evidence better; for example, we can eliminate studies that compare the accuracy of estimates from different tools. However, if our question is “What is the best method for cost estimation?”, we may find many studies that compare many different types of cost estimation methods, each of which may use a different basis for comparison (e.g., most accurate, least risky, or easiest to use), so it may be difficult to know how to aggregate the evidence into a single answer.
However, if we ask a very specific question, it is extremely unlikely that we would find any existing evidence. For example, a question such as “Is expert opinion better than a cost model to estimate the elapsed time a team of two developers using pair programming will take to design, code and test a major enhancement to a Java application?[2]” might define exactly what we would like to know, but it may be far too specific to find any relevant empirical evidence.
Generally, software engineering questions are rather broad, and need some additional information about context if the findings are to be relevant to software practitioners. In addition, because we seldom have multiple empirical studies addressing exactly the same research question, we often need to adopt qualitative rather than quantitative methods to aggregate results.
A specific problem that arises when formulating research questions is that software engineering terminology is not always well-defined, and is subject to change. For example, the term technical debt can be traced back to an article written in 1992; however, the concept was discussed from the 1980s as one of the laws of system evolution, while in agile methods the term refactoring is used. Thus, using only the term technical debt to define the topic risks missing relevant studies.
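One way to guard against this terminology problem is to record the alternative terms for each concept explicitly and derive the search string from them. The following is a minimal sketch, not a validated search strategy: the function name `build_search_string`, the term lists, and the Boolean syntax are all assumptions made for illustration, and each digital library has its own query syntax that would need to be checked.

```python
def build_search_string(concept_groups):
    """Join alternative terms with OR inside a group, and groups with AND.

    concept_groups is a list of lists; each inner list holds the
    alternative terms (synonyms, older or related names) for one concept.
    """
    clauses = []
    for terms in concept_groups:
        # Quote multi-word terms so they are searched as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Illustrative terms taken from the example above (hypothetical, not exhaustive).
topic_terms = ["technical debt", "refactoring", "system evolution"]
context_terms = ["software"]

query = build_search_string([topic_terms, context_terms])
print(query)
```

Keeping the alternative terms in a list, rather than writing the search string by hand, makes it easier to document which terms were considered and to add new ones as terminology changes.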
[1] We should point out that, compared with other disciplines, even this question is quite broad. We have not specified what type of software product we want to estimate (e.g., a new application or an enhancement), nor what languages we are interested in (e.g., C, Java or COBOL), nor whether we are concerned with point estimates or ranges, nor whether we are actually concerned with monetary cost, staff effort or elapsed time.
[2] This is similar to the sort of detail that is used for health care questions!