Evidence-Based Software Engineering
Welcome to the web site for Evidence-Based Software Engineering (EBSE). This web site provides an introduction for researchers, practitioners and teachers in computing, software engineering and IT who have heard about EBSE and want to know more about it. It explains the basic concepts and processes, and provides some resources, together with links to sources of other, more detailed, material.
Whether you are interested in EBSE as a potential user or gatherer of evidence, or may even be considering commissioning others to undertake the task, we hope that you will find our description of EBSE sufficiently interesting and useful to read further!
So, what is EBSE?
EBSE is concerned with using findings from empirical research to determine what software engineering practices, tools and standards work, and the situations when and where they work.
Originating in 2004, EBSE was inspired by the success of evidence-based medicine (EBM), as employed in clinical medicine and healthcare. EBM arose because recommendations from experts were not always supported by research, and there was evidence that some recommended clinical practices could actually endanger patients. In software engineering, the 1980s and 1990s saw a plethora of methods and tools recommended by consultants and experts. These recommendations were largely a matter of opinion, usually expressed as experience papers, and were not supported by any reliable evidence. EBSE was seen as a means of addressing that problem. It involved adapting evidence-based practices to meet the rather different characteristics of software engineering, and the consequences that these characteristics have for empirical studies performed on software engineering topics.
EBSE aims to provide the means by which current best evidence from research can be integrated with practical experience and human values in the decision-making processes involved in the development and maintenance of software.
How does EBSE work?
EBSE is defined as a five-stage process involving the following steps:
1. Converting the need for information (such as about development and maintenance methods, management procedures etc.) into an answerable question.
2. Tracking down the best evidence with which to answer that question.
3. Critically appraising that evidence for its validity (closeness to the truth), impact (size of the effect), and applicability (usefulness in software development practice).
4. Integrating the critical appraisal with our software engineering expertise and with our stakeholders’ values and circumstances.
5. Evaluating our effectiveness and efficiency in executing steps 1-4 and seeking ways to improve them both for next time.
A core tool of the evidence-based approach is the systematic review, which addresses steps 2-4 above.
What’s a Systematic Review?
A systematic review is a specific form of secondary study, and the material provided by this web site is concerned with how to conduct, report and use the various forms of secondary study used in software engineering.
A systematic review aims to identify and summarize all relevant high-quality empirical studies that are related to the question being addressed (referred to as the primary studies), in a way that ensures that its conclusions are fair and trustworthy.
The systematic review process is intended to avoid human errors due to fatigue, such as misunderstanding or misinterpreting an empirical study report or wrongly transcribing information from it, as well as errors arising from personal bias, such as giving greater weight to results that agree with our own opinions.
What questions do Systematic Reviews answer?
There are three broad categories of systematic review, each of which addresses a different sort of question.
- Mapping Studies. These are relatively ‘lightweight’ reviews which aim to answer questions of the form “what studies have investigated…?”. A good mapping study will not only aim to identify relevant studies, but also to provide some form of analysis, perhaps based upon categories (types of empirical study, types of participant, size of task etc.). They can provide a useful overview of a topic area and can help to determine if a more detailed study would be worthwhile.
- Qualitative Systematic Reviews. A qualitative systematic review provides a deeper level of knowledge than a mapping study, and typically is used to aggregate information from studies that report the benefits, risks, and stakeholder attitudes involved in using SE technologies and to prioritise the most important issues these identify. They can be based on aggregating results from thematic analysis of case studies and/or studies reporting industry opinion surveys. Such a review can provide useful guidance on issues to consider when adopting new technologies, including identifying potential barriers.
- Quantitative Systematic Reviews. As might be expected, a quantitative systematic review aggregates quantitative findings, typically reported as statistical values. Where possible, it combines them numerically using statistical techniques such as meta-analysis. An example outcome of such a review might be a ranking of a set of techniques in terms of their effectiveness.
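To make the idea of meta-analysis a little more concrete, the following is a minimal sketch of one common approach, fixed-effect inverse-variance pooling, in which each primary study's effect size is weighted by the reciprocal of its variance. The function name `fixed_effect_pool` and the three effect sizes and variances are hypothetical, chosen only for illustration. A real review would also need to assess heterogeneity and might well prefer a random-effects model.

```python
import math


def fixed_effect_pool(effects, variances):
    """Pool per-study effect sizes using inverse-variance (fixed-effect) weights.

    Returns the pooled effect, its standard error, and an approximate
    95% confidence interval based on the normal distribution.
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("need at least one study and one variance per effect")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # More precise studies (smaller variance) receive larger weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)

    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    interval = (pooled - 1.96 * standard_error, pooled + 1.96 * standard_error)
    return pooled, standard_error, interval


# Hypothetical standardised effect sizes and variances from three primary studies.
effects = [0.30, 0.50, 0.10]
variances = [0.04, 0.09, 0.02]

pooled, se, ci = fixed_effect_pool(effects, variances)
print(f"pooled effect = {pooled:.3f}, SE = {se:.3f}, "
      f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Repeating this calculation for each technique under comparison would give one pooled estimate per technique, which is the kind of information a ranking could be based on.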
Which forms of systematic review are most commonly used in EBSE?
The findings of many systematic reviews covering a range of software engineering issues have been published since 2004. Most published systematic reviews are either mapping studies or qualitative systematic reviews (sometimes something of a mix of both). This is because the human-centric nature of SE, with its dependence upon individual skills, makes it difficult to conduct the sort of studies that can contribute to a quantitative systematic review.
The EBSE web site
This web site is maintained by researchers who are interested in EBSE (see ‘About Us’) and its purpose is to act as a resource for members of the wider Software Engineering community. Everyone is welcome to use the material it provides. The site is intended to provide material that will be useful for practitioners, researchers, teachers, students and policy-makers, particularly those who might wish to make evidence-informed decisions related to software engineering activities.
Much of our material is concerned with providing ‘how-to’ guidance about performing and reporting the findings from systematic reviews. More detailed information relevant to different stakeholders is presented in the pages below.
This page was updated in March 2023.
What is Important for Researchers?
In the context of software engineering research, the most important element of EBSE is the well-defined systematic review method. The systematic review process provides a method for rigorously and fairly analysing all of the available information about a given phenomenon. This provides a less biased and more complete perspective than would be obtained from a single study or an informal literature review. Taking a broad view of available evidence should make it possible to produce more reliable conclusions and to minimise the risk of bias.
Systematic reviews originated in clinical medicine, where it was recognised that the outcomes from individual experiments were not a safe or sufficient basis for decision-making. They were originally designed to take advantage of the rigorously defined experiments used in medical studies (referred to as randomised controlled trials, or RCTs). Subsequently their use has spread into other domains where such experimental rigour is either difficult or impossible to achieve, and where many different experimental forms may well be employed. Indeed, for that reason, many people prefer to use the term evidence-informed to indicate that the findings from a review may need to be adapted to a particular context.
Systematic reviews are now a standard research method used in all research-based disciplines. As such they are an important method for all software engineering researchers.
What about Mapping Studies?
A mapping study (also termed a scoping review) provides a more ‘open’ form of review. This reviews a specific software engineering topic and classifies the primary research papers for that specific domain. The research questions for such a study are quite high-level, and can include issues such as which sub-topics have been investigated, what empirical forms have been used, and which sub-topics have been addressed by enough studies to merit a more detailed systematic review. The table below shows the differences between the two forms, but the main distinction is that where an SR seeks to aggregate the outcomes of primary studies, a mapping study aims only to classify literature and aggregate studies within the categories.
| | Mapping Study | Systematic Literature Review |
|---|---|---|
| Goals | Classification of available literature | Identifying best practice, seeking consensus when empirical study findings disagree, thematic analysis and conceptual modelling of the literature. |
| Research question | General: related to research trends (which researchers, how much activity, what type of studies, etc.) | Specific: related to the outcomes from empirical studies. Of the form: “is technology / method A better or not than B”. Or of the form: “80% of high quality studies confirm the importance of factor X”. |
| Search process | Defined by topic area | Defined by research question |
| Scope | Broad: all papers related to a topic area are included, but only classification data about those are collected | Focused: only empirical papers related to a specific research question are included and detailed information about individual research outcomes is extracted from each paper |
| Search strategy requirements | Less stringent if only research trends are of interest | Extremely stringent: all relevant studies must be found |
| Quality evaluation | | Important to ensure that results are based upon best quality evidence |
| Results | Set of papers related to a topic area, categorised in a variety of dimensions and counts of the number of papers in various categories | Answer to specific research question, possibly with qualifiers (e.g. results apply only to novices). Prioritised lists of important factors, qualitative models of the relationship among factors. |
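The typical output of a mapping study, papers categorised along several dimensions with counts per category, can be sketched in a few lines. This is only an illustration: the paper records, the dimension names (`study_type`, `participants`) and the helper functions `count_by` and `cross_tabulate` are all hypothetical and are not taken from any particular mapping study.

```python
from collections import Counter

# Hypothetical classification data extracted from five primary studies.
papers = [
    {"id": "P1", "study_type": "experiment", "participants": "students"},
    {"id": "P2", "study_type": "case study", "participants": "practitioners"},
    {"id": "P3", "study_type": "experiment", "participants": "practitioners"},
    {"id": "P4", "study_type": "survey", "participants": "practitioners"},
    {"id": "P5", "study_type": "experiment", "participants": "students"},
]


def count_by(records, dimension):
    """Count how many papers fall into each category of one dimension."""
    return Counter(record[dimension] for record in records)


def cross_tabulate(records, row_dimension, column_dimension):
    """Count papers for each pair of categories across two dimensions."""
    return Counter(
        (record[row_dimension], record[column_dimension]) for record in records
    )


by_type = count_by(papers, "study_type")
by_type_and_participants = cross_tabulate(papers, "study_type", "participants")

for category, count in by_type.most_common():
    print(f"{category}: {count}")
```

Sparse cells in the cross-tabulation are one simple way of spotting sub-topics where new primary studies might be needed, while well-populated cells suggest where a more detailed systematic review could be worthwhile.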
Are there Other Types of Systematic Review?
Two other types of systematic review that are of use in software engineering are:
- Tertiary studies. These are systematic reviews where the empirical studies are themselves secondary studies. See (Petersen, Vakkalanka & Kuzniarz, 2015), Guidelines for conducting systematic mapping studies in software engineering: An update. This form of review is often used to investigate how SE researchers are performing the systematic review process. However, they can also be used to compare and contrast different systematic reviews performed on the same topic.
- Rapid Reviews (RRs). Formally, a rapid review is a “form of knowledge synthesis that accelerates the process of conducting a traditional systematic review through streamlining or omitting various methods to produce evidence for stakeholders in a resource-efficient manner”. The first example of an RR in software engineering was reported by Cartaxo and his colleagues in 2018, see (Cartaxo, Pinto & Soares, 2018) The role of rapid reviews in supporting decision-making in software engineering practice.
What Research Opportunities does this provide?
We encourage researchers to undertake both mapping studies and systematic literature reviews as part of the task of mapping out our empirical domain knowledge. Systematic reviews can help summarize areas where a large number of empirical studies have been performed, for example cost estimation and fault estimation. In addition, mapping studies, whether undertaken by others or not, can provide the basis for identifying where new primary studies are needed, perhaps because no-one has studied a particular aspect, or because a replication would provide valuable confidence about the findings.
Updated in March 2023
Is EBSE relevant to me?
As a Software Developer
If you work in a software development organisation as a software coder, tester or maintainer, EBSE is probably not of direct relevance to you. Your employer probably has a set of development standards and tools that staff are mandated to use, and it is not usually the job of development staff to make decisions about how to improve those standards.
Also, if you have a specific coding problem, you are more likely to find an answer on a Q&A site such as StackOverflow than in the empirical software engineering literature. However, you might still benefit from adopting a cautious approach to answers based solely on the expert opinion of a single person. Some empirical studies have been critical of the quality of code snippets available online, particularly from the viewpoint of security. (See Felix Fischer et al. (2017) Stack Overflow Considered Harmful? The Impact of Copy & Paste on Android Application Security. 2017 IEEE Symposium on Security and Privacy; and Sarah Meldrum et al. (2020) Understanding stack overflow code quality: A recommendation of caution. Science of Computer Programming, 199, 102516.)
As a Software Manager
If you are a project manager, a quality manager or a process manager, EBSE may well be useful. If you have process or quality problems and want to assess the options available to address them, you might try looking for an existing systematic review that addresses the problem area (usually accessible from a Google Scholar query).
However, if there is nothing suitable available, it might be better to collaborate with a local university than to undertake a systematic review from scratch. The main practical problem with undertaking a systematic review is that you need experience of reading scientific articles, which usually requires the skills gained from a post-graduate degree. Most university computing departments are experienced in dealing with the scientific literature and are usually happy to forge collaborations with industry, as long as they are allowed to publish some of their results.
We do Process Improvement, do we Need EBSE?
EBSE is not intended to be a replacement for standard software process improvement models and methods. A high level model such as the Capability Maturity Model identifies different levels of process maturity and the capabilities that each level should exhibit, but does not specify exactly how the capability should be implemented. A lower level process model would use steps such as:
1. Identify a problem.
2. Propose a technology or procedure to address that problem.
3. Evaluate the proposed technology in a pilot project.
4. If the technology is appropriate, adopt and implement it.
5. Monitor the organization after implementing the new technology.
6. Return to step 1.
In both cases, EBSE proposes that the selection of an appropriate process change should be informed by empirical evidence regarding the available options and should consider the specific context (i.e., the type of company, type of software application, languages used, current processes) in which the process change is planned.
A useful resource addressing the practical use of evidence-based knowledge for a number of software engineering topics is the Voice of Evidence column published in IEEE Software magazine between 2007 and 2017. The aim of this was to “bridge the gap between research and practice by extracting actionable lessons from the body of research”. The team of editors provide a brief review and evaluation of the column in this paper: https://ieeexplore.ieee.org/document/7974677.
Students & Novice Researchers
Software engineering and computer science courses for undergraduate students concentrate on computer science theories, methods, techniques and tools, and do not usually include content related to empirical research or secondary studies. This page is intended for postgraduate students or advanced undergraduates who have little previous training in empirical research methods. The linked short notes provide a set of brief introductions to specific empirical methods and secondary studies. Most of the notes were originally written to support lectures on these topics, so they have been deliberately kept to a limit of two sides of A4. There is no restriction on how you can use the notes but, if you use them in your own documents, please include an acknowledgment of our web site.
Students seeking to learn more about evidence-based software engineering or to apply it for their own studies may also find the following resources useful:
- Guidelines for conducting systematic reviews and other evidence-based studies.
- The glossary, a comprehensive set of EBSE terms and definitions.
- A set of templates to assist in creating protocols for empirical studies.
The material on this page provides both information about the role of EBSE in studying the teaching of software engineering, and also material intended to help with teaching software engineering.
Teaching Support Materials
The following downloadable documents provide lecture notes on relevant topics.
The following pages may be particularly useful to those teaching about empirical software engineering:
Systematic reviews published up to the end of 2015 that have findings useful for teaching and practice are summarised in the paper by Budgen, Brereton, Williams & Drummond (2020).
Studies of Software Engineering Education
The following papers are particularly concerned with studying how we teach software engineering:
- Engaging the net generation with evidence-based software engineering through a community-driven web database. (Janzen and Ryoo, 2009).
- Pair programming as a teaching tool: a student review of empirical studies. (Brereton, Turner and Kaur, 2009).
- Using systematic reviews and evidence-based software engineering with masters students. (Oates and Capper, 2009).