Planning - EBSE

Review Preparation & Planning

Goal

The goal of the systematic review planning stage is to define the review research questions and the methods that will be used to answer those questions.

Scope

If you are planning a review from scratch, or as part of a joint project with a software company, planning will involve:

Justifying the need for the review.
Specifying the research question(s).
Preparing the review protocol.
Evaluating the review protocol.

If you are undertaking a commissioned review, the need for the review and the research questions will have been agreed with the commissioning body. You will need to:

Prepare a formal review protocol.
Evaluate the review protocol.

Outputs

The outputs of the planning stage are:

The review research questions as specified in the review protocol.
The review protocol itself together with its publication date and how it can be accessed.
The knowledge gained by the review team from trialling the specified review methods. This point is not always emphasized, but such knowledge is always useful, particularly for systematic reviews performed by large review teams.

This page was updated in July 2023.

Developing a review protocol

A review protocol specifies all the methods that will be used to conduct a specific systematic review. In some disciplines, review protocols must be formally registered and published before the review starts; however, this is not common practice in software engineering.

A pre-defined protocol is central to the ‘systematic’ aspect of a review and is necessary to reduce the possibility of researcher bias. For example, without a protocol, the selection of individual studies or the analysis may be driven by researcher expectations and prejudices.

The components of a protocol include descriptions of the plan for all the elements of the review plus some additional planning information, including:

Background, including an introduction to the review topic and a justification for why a review is needed.
Research Questions.
Eligibility Criteria.
Search Process.
Selection Process.
Data Definition and Data Extraction.
Data Synthesis.
Critical Appraisal of Primary Studies (Risk of Bias/Quality Assessment).
Assessing the strength of evidence (Certainty Assessment).
Project timetable and staffing information.
The process used to evaluate the review protocol and the results of the review.
Dissemination and reporting strategy.
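The component list above can double as a completeness checklist while the protocol is being drafted. The following is a minimal sketch, assuming the draft is held as a mapping from section title to text; the section titles are shortened from the list above, and the function name and draft contents are hypothetical.

```python
# Section titles are shortened from the component list above; the checking
# logic and the sample draft are illustrative assumptions.
REQUIRED_SECTIONS = [
    "Background",
    "Research Questions",
    "Eligibility Criteria",
    "Search Process",
    "Selection Process",
    "Data Definition and Data Extraction",
    "Data Synthesis",
    "Critical Appraisal of Primary Studies",
    "Assessing the Strength of Evidence",
    "Project Timetable and Staffing",
    "Protocol and Review Evaluation Process",
    "Dissemination and Reporting Strategy",
]


def missing_sections(draft):
    """Return the required sections that are absent or blank in a draft.

    `draft` maps section titles to their text.
    """
    return [name for name in REQUIRED_SECTIONS if not draft.get(name, "").strip()]


draft_protocol = {
    "Background": "Why a review of mutation testing research is needed.",
    "Research Questions": "What is the current status of Mutation Testing research?",
    "Search Process": "   ",  # whitespace only, so it is reported as missing
}
todo = missing_sections(draft_protocol)
print(f"{len(todo)} sections still to write")
```

A check like this only confirms that each section exists; it says nothing about whether the content is adequate, which is what the protocol evaluation step is for.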

In addition, the protocol should identify any issues that require adapting the standard review process.

If review authors have, themselves, contributed candidate empirical studies, the selection, risk of bias assessment and data extraction processes may need to be organized to ensure reviewers do not make critical assessments of their own empirical studies.
If there are related systematic reviews and mapping studies, the protocol should specify how information from these studies will be used.
The protocol should define how individual articles that report more than one empirical study are to be identified and processed. For systematic reviews, the usual procedure is to assess each different empirical study as a separate primary study. However, this is not usually required for mapping studies.
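The first issue above, keeping reviewers away from their own studies, can be handled mechanically when tasks are allocated. The sketch below is one possible approach, not a prescribed procedure: it rotates through the review team and skips any reviewer who appears among a study's authors. All names and study IDs are hypothetical.

```python
from itertools import cycle


def assign_assessors(study_authors, reviewers):
    """Assign each study to one reviewer who is not among its authors.

    `study_authors` maps a study ID to the set of its author names.
    Reviewers are taken in rotation to spread the workload.
    """
    assignment = {}
    rotation = cycle(reviewers)
    for study_id, authors in study_authors.items():
        # Try each reviewer at most once before giving up on this study.
        for _ in range(len(reviewers)):
            candidate = next(rotation)
            if candidate not in authors:
                assignment[study_id] = candidate
                break
        else:
            raise ValueError(f"no independent reviewer available for {study_id}")
    return assignment


review_team = ["Ana", "Ben", "Chen"]  # hypothetical review team
candidate_studies = {
    "S1": {"Ana", "External Author"},  # Ana co-authored S1
    "S2": {"Another Author"},
    "S3": {"Ben", "Chen"},
}
allocation = assign_assessors(candidate_studies, review_team)
print(allocation)
```

If every team member authored a given study, the function raises an error rather than silently assigning a conflicted reviewer, which signals that an external assessor is needed.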

Constructing a protocol involves not only defining the methods that will be used to conduct the review but also trying out different aspects of the methodology. To try out the processes, you need to identify a set of known primary studies. These can come from your own knowledge, from preliminary Google Scholar searches for potential primary studies, or from related secondary studies (which can provide multiple candidate primary studies). Known studies can be used to help validate the eligibility criteria and to prototype the search process, the selection process, the data definition and data extraction processes, and the critical appraisal process. In particular, the known studies can be used to:

try out the eligibility criteria
help identify appropriate key words for search strings
assess the coverage achieved by different search strings and digital libraries
identify an appropriate critical appraisal instrument based on the known primary study type(s), and assess its reliability by measuring the agreement achieved by different members of the review team (not usually required for mapping studies).
try out the data collection process and assess its reliability.
help to define the strength of evidence instrument and assess the process used to assess strength of evidence (not required for mapping studies).
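Two of the checks above lend themselves to simple calculations: the coverage of a search string against the known studies, and the agreement between two reviewers applying the same instrument. The sketch below assumes coverage is measured as the fraction of known studies retrieved and agreement as Cohen's kappa; both are common choices rather than ones mandated here, and the study IDs and ratings are invented for illustration.

```python
from collections import Counter


def search_recall(known_ids, retrieved_ids):
    """Fraction of the known primary studies that a search retrieved."""
    known = set(known_ids)
    if not known:
        raise ValueError("need at least one known study")
    return len(known & set(retrieved_ids)) / len(known)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two reviewers rating the same studies in order."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of studies on which the two reviewers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each reviewer's category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviewers used one identical category throughout
    return (observed - expected) / (1 - expected)


known_studies = {"S1", "S2", "S3", "S4"}      # hypothetical known set
search_results = ["S1", "S3", "S4", "S9"]     # hypothetical search output
recall = search_recall(known_studies, search_results)

reviewer_1 = ["include", "include", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"recall={recall:.2f} kappa={kappa:.2f}")
```

A low recall suggests the search string or the choice of digital library needs revising; a low kappa suggests the instrument or its guidance is ambiguous and should be refined before the full review begins.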

Specifying the research question(s)

Specifying the research questions is the most important part of any systematic review.

The review questions drive the entire systematic review methodology:

The type of question affects which type of systematic review is appropriate, and the method that should be used for aggregating the evidence. A general question such as “What is the current status of Mutation Testing research?” would be addressed by a mapping study. A question such as “Is pair-programming more effective than code inspection at reducing code defects?” would be addressed by a quantitative systematic review. A question such as “What are the benefits and risks of adopting DevOps?” would be addressed by a qualitative systematic review.
The eligibility criteria must be designed to exclude studies that do not address the research questions and include those studies that do.
The search process must be designed to identify the primary studies that address the research questions.
The data extraction process must extract the data items needed to answer the questions.
The data analysis process must synthesize the data in such a way that the review questions can be answered.

The more specific the question, the easier it is to find evidence that can answer the question. For example, if our question is “Are cost estimates based on software estimation tools more accurate than cost estimates based on expert opinion?” we know that we need to look for software engineering studies that compare cost estimates obtained from tools with estimates obtained from experts[1]. This allows us to target our evidence better; for example, we can eliminate studies that compare the accuracy of estimates from different tools. However, if our question is “What is the best method for cost estimation?”, we may find many studies that compare many different types of cost estimation methods, each of which may use a different basis for comparison (e.g., most accurate, least risky, or easiest to use), so it may be difficult to know how to aggregate the evidence into a single answer.

However, if we ask a very specific question, it is extremely unlikely that we would find any existing evidence. For example, a question such as “Is expert opinion better than a cost model to estimate the elapsed time a team of two developers using pair programming will take to design, code and test a major enhancement to a Java application?[2]” might define exactly what you would like to know, but it may be far too specific to find any relevant empirical evidence.

Generally, software engineering questions are rather broad, and need some additional information about context if the findings are to be relevant to software practitioners. In addition, because we seldom have multiple empirical studies addressing exactly the same research question, we often need to adopt qualitative rather than quantitative methods to aggregate results.

A specific problem that arises when formulating research questions is that software engineering terminology is not always well-defined and is subject to change. For example, the term technical debt can be traced back to an article written in 1992; however, the concept was discussed from the 1980s as one of the laws of system evolution, while in agile methods the term refactoring is used. Thus, using only the term technical debt to define the topic risks missing relevant studies.
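One way to guard against this terminology problem is to build search strings from groups of alternative terms rather than from a single term. The sketch below assumes a generic Boolean syntax (quoted phrases joined with OR inside a group and AND between groups); real digital libraries differ in their query syntax, and the context terms shown are hypothetical additions, not terms taken from this guidance.

```python
def build_search_string(term_groups):
    """OR the alternative terms within each group, then AND the groups."""
    clauses = []
    for group in term_groups:
        quoted = [f'"{term}"' for term in group]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Alternative names for the topic, as discussed above.
topic_terms = ["technical debt", "refactoring"]
# Hypothetical context terms, added only to illustrate a second group.
context_terms = ["software maintenance", "software evolution"]

query = build_search_string([topic_terms, context_terms])
print(query)
```

Each candidate string can then be trialled against the set of known primary studies to see whether the added alternatives actually retrieve studies that the single term misses.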

[1] We should point out that compared with other disciplines even this question is quite broad. We have not specified what type of software product we want to estimate, e.g., a new application or an enhancement, nor what languages we are interested in e.g., C or Java or Cobol, nor whether we are concerned about point estimates or ranges, nor whether we are actually concerned with monetary cost, or staff effort or elapsed time.

[2] This is similar to the sort of detail that is used for health care questions!
