Instructions:
Perform a database search on urinary tract infections. Articles vary in their level of evidence; select the article with the best level of evidence. (A sample search strategy appears at the end of these instructions.)
Please write a 150-word discussion, with a 50-word reply to a colleague, answering the following questions:
1. Describe why this article was selected and how it meets the criteria to be considered the best level of evidence.
2. Describe whether this is a peer-reviewed article. Why is it significant to find peer-reviewed versus non–peer-reviewed articles for research?
3. Describe the nurses’ role in this type of research.
***Include citations (formatted according to the 7th edition of the APA manual) in the body of the discussion post and a reference in APA format at the bottom of the initial discussion.***
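For illustration only, here is a hedged sketch of APA 7th-edition formatting, using the required textbook as the example source (publication details taken from its copyright page below); your chosen database article would be formatted analogously:

In-text citation: (Polit & Beck, 2018)
Reference entry: Polit, D. F., & Beck, C. T. (2018). Essentials of nursing research: Appraising evidence for nursing practice (9th ed.). Wolters Kluwer.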
*** UPLOADING RUBRIC W/ REQUIRED RULES AND STRICT GUIDELINES***
*** UPLOADING 1ST REFERENCE REQUIRED FROM REQUIRED TEXTBOOK; TEXTBOOK LINK***
*** INCLUDING LINK TO THE ARTICLE I CHOSE FROM THE DATABASE BELOW***
–
978-0-387-72659-5_15 (springer.com)
–
Urinary Tract Infection | SpringerLink
*** PLEASE USE CORRECT APA FORMAT***
**50-WORD REPLY IS OPTIONAL**
ANY QUESTIONS PLEASE LET ME KNOW.
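For the database search requested above, the following is a sample search strategy, offered only as a sketch: it assumes PubMed is the database being used (the syntax would need adapting for CINAHL or another database), and it filters for systematic reviews and meta-analyses because those designs sit at the top of most levels-of-evidence hierarchies:

"urinary tract infections"[MeSH Terms] AND ("systematic review"[Publication Type] OR "meta-analysis"[Publication Type])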
Acquisitions Editor: Christina C. Burns
Product Director: Jennifer K. Forestieri
Development Editor: Meredith L. Brittain
Production Product Manager: Marian Bellus
Design Coordinator: Joan Wendt
Illustration Coordinator: Jennifer Clements
Manufacturing Coordinator: Karin Duffield
Prepress Vendor: Absolute Service, Inc.
9th edition
Copyright © 2018 Wolters Kluwer
Fifth Edition © 2011 by LIPPINCOTT WILLIAMS & WILKINS, a WOLTERS
KLUWER business
Fourth Edition © 2006 by LIPPINCOTT WILLIAMS & WILKINS
Third Edition © 2001 by LIPPINCOTT WILLIAMS & WILKINS
All rights reserved. This book is protected by copyright. No part of this book may be
reproduced or transmitted in any form or by any means, including as photocopies or
scanned-in or other electronic copies, or utilized by any information storage and
retrieval system without written permission from the copyright owner, except for brief
quotations embodied in critical articles and reviews. Materials appearing in this book
prepared by individuals as part of their official duties as U.S. government employees are
not covered by the above-mentioned copyright. To request permission, please contact
Wolters Kluwer at Two Commerce Square, 2001 Market Street, Philadelphia, PA
19103, via email at permissions@lww.com, or via our website at lww.com (products
and services).
9 8 7 6 5 4 3 2 1
Library of Congress Cataloging-in-Publication Data
Names: Polit, Denise F., author. | Beck, Cheryl Tatano, author.
Title: Essentials of nursing research : appraising evidence for nursing practice / Denise
F. Polit, Cheryl Tatano Beck.
Description: Ninth edition. | Philadelphia : Wolters Kluwer Health, [2018] | Includes
bibliographical references and index.
Identifiers: LCCN 2016043994 | ISBN 9781496351296
Subjects: | MESH: Nursing Research | Evidence-Based Nursing
Classification: LCC RT81.5 | NLM WY 20.5 | DDC 610.73072—dc23 LC record
available at https://lccn.loc.gov/2016043994
This work is provided “as is,” and the publisher disclaims any and all warranties,
express or implied, including any warranties as to accuracy, comprehensiveness, or
currency of the content of this work.
This work is no substitute for individual patient assessment based upon healthcare
professionals’ examination of each patient and consideration of, among other things,
age, weight, gender, current or prior medical conditions, medication history, laboratory
data, and other factors unique to the patient. The publisher does not provide medical
advice or guidance, and this work is merely a reference tool. Healthcare professionals,
and not the publisher, are solely responsible for the use of this work including all
medical judgments and for any resulting diagnosis and treatments.
Given continuous, rapid advances in medical science and health information,
independent professional verification of medical diagnoses, indications, appropriate
pharmaceutical selections and dosages, and treatment options should be made and
healthcare professionals should consult a variety of sources. When prescribing
medication, healthcare professionals are advised to consult the product information
sheet (the manufacturer’s package insert) accompanying each drug to verify, among
other things, conditions of use, warnings and side effects and identify any changes in
dosage schedule or contraindications, particularly if the medication to be administered is
new, infrequently used or has a narrow therapeutic range. To the maximum extent
permitted under applicable law, no responsibility is assumed by the publisher for any
injury and/or damage to persons or property, as a matter of products liability, negligence
law or otherwise, or from any reference to or use by any person of this work.
LWW.com
TO
Our Families—Our Husbands, Our Children (and Their
Spouses/Fiancés), and Our Grandchildren
Husbands: Alan and Chuck
Children: Alex (Maryanna), Alaine (Jeff), Lauren (Vadim), Norah
(Chris), Curt, and Lisa
Grandchildren: Maren, Julia, Cormac, Ronan, and Cullen
ABOUT THE AUTHORS
Denise F. Polit, PhD, FAAN, is an American health care researcher who is recognized
internationally as an authority on research methods, statistics, and measurement. She
received her Bachelor’s degree from Wellesley College and her Ph.D. from Boston
College. She is the president of a research consulting company, Humanalysis, Inc., in
Saratoga Springs, New York, and professor at Griffith University, Brisbane, Australia.
She has published in numerous journals and has written several award-winning
textbooks. She has recently written a ground-breaking book on measurement in health,
Measurement and the Measurement of Change: A Primer for the Health Professions.
Her research methods books with Dr. Cheryl Beck have been translated into French,
Spanish, Portuguese, German, Chinese, and Japanese. She has been invited to give
lectures and presentations in many countries, including Australia, India, Ireland,
Denmark, Norway, South Africa, Turkey, Sweden, and the Philippines. Denise has lived
in Saratoga Springs for 29 years and is active in the community. She has assisted
numerous nonprofit organizations in designing surveys and analyzing survey data.
Currently, she serves on the board of directors of the YMCA, Opera Saratoga, and the
Saratoga Foundation.
Cheryl Tatano Beck, DNSc, CNM, FAAN, is a distinguished professor at the
University of Connecticut, School of Nursing, with a joint appointment in the
Department of Obstetrics and Gynecology at the School of Medicine. She received her
master’s degree in maternal–newborn nursing from Yale University and her doctor of
nursing science degree from Boston University. She has received numerous awards such
as the Association of Women’s Health, Obstetric and Neonatal Nursing’s Distinguished
Professional Service Award, Eastern Nursing Research Society’s Distinguished
Researcher Award, the Distinguished Alumna Award from Yale University School of
Nursing, and the Connecticut Nurses’ Association’s Diamond Jubilee Award for her
contribution to nursing research. Over the past 30 years, Cheryl has focused her
research efforts on developing a research program on postpartum mood and anxiety
disorders. Based on the findings from her series of qualitative studies, Cheryl developed
the Postpartum Depression Screening Scale (PDSS), which is published by Western
Psychological Services. She is a prolific writer who has published over 150 journal
articles. In addition to co-authoring award-winning research methods books with Denise
Polit, Cheryl coauthored with Dr. Jeanne Driscoll Postpartum Mood and Anxiety
Disorders: A Clinician’s Guide, which received the 2006 American Journal of Nursing
Book of the Year Award. In addition, Cheryl has published two other books: Traumatic
Childbirth and Routledge International Handbook of Qualitative Nursing Research. Her
most recent book is Developing a Program of Research in Nursing.
PREFACE
Essentials of Nursing Research, ninth edition, helps students learn how to read and
critique research reports and to develop an appreciation of research as a path to
enhancing nursing practice.
We continue to enjoy updating this book with important innovations in research
methods and with nurse researchers’ use of emerging methods. Feedback from our loyal
adopters has inspired several important changes to the content and organization. We are
convinced that these revisions introduce important improvements—while retaining
many features that have made this book a classic best-selling textbook throughout the
world. The ninth edition of this book, its study guide, and its online resources will make
it easier and more satisfying for nurses to pursue a professional pathway that
incorporates thoughtful appraisals of evidence.
LEGACY OF ESSENTIALS OF NURSING RESEARCH
This edition, like its predecessors, is focused on the art—and science—of research
critique. The textbook offers guidance to students who are learning to appraise research
reports and use research findings in practice.
The basic principles that helped to shape this and earlier editions of this book include the following:
1. An assumption that competence in doing and appraising research is critical to the
nursing profession
2. A conviction that research inquiry is intellectually and professionally rewarding to
nurses
3. An unswerving belief that learning about research methods need be neither
intimidating nor dull
Consistent with these principles, we have tried to present research fundamentals in a
way that both facilitates understanding and arouses curiosity and interest.
NEW TO THIS EDITION
New Organization
In the previous edition, we separated chapters on quantitative and qualitative designs
and methods into two separate parts. In this edition, we organized the parts by
methodologic content. So, for example, Part 3 in this edition covers designs and
methods for quantitative, qualitative, and mixed methods research, and Part 4 is devoted
to analysis and interpretation in quantitative and qualitative studies. (Please see “The
Text” later in this preface for more information.) We think this new organization offers
greater continuity of methodologic concepts and will facilitate better understanding of
key methodologic differences between quantitative and qualitative research. We are
confident that this new organization will better meet the needs of students and faculty.
Manageable Text for One-Semester Course
We have streamlined the text to make it more manageable for use in a one-semester
course. We reduced the length by organizing content differently and by keeping
essential information in the text while moving background/advanced content online,
making this an 18-chapter book rather than the previous 19 chapters in the eighth
edition.
Enhanced Accessibility
To make this edition even more user-friendly than in the past, we have made a
concerted effort to simplify the presentation of complex topics. Most importantly, we
have reduced and simplified the coverage of statistical information. We eliminated the
chapter on measurement, opting to present a shorter, more digestible section on this
topic in our chapter on quantitative data collection, which is supplemented by
information in the chapter on statistical analysis. In addition, throughout the book we
have used more straightforward, concise language.
New Content
In addition to updating the book with new information on conventional research
methods, we have added content on the following topics:
Quality improvement projects, describing how they are distinct from research studies
and evidence-based practice (EBP) projects. This new content is found in Chapter 13.
Clinical significance, a seldom-mentioned but important topic that has gained
prominence among researchers in other health care fields but has only recently
gained traction among nurse researchers. This new content is found in Chapter 15.
THE TEXT
The content of this edition is as follows:
Part 1, Overview of Nursing Research and Its Role in Evidence-Based Practice,
introduces fundamental concepts in nursing research. Chapter 1 summarizes the
background of nursing research, discusses the philosophical underpinnings of
qualitative research versus quantitative research, and describes major purposes of
nursing research. Chapter 2 offers guidance on using research to build an evidence-
based practice. Chapter 3 introduces readers to key research terms and presents an
overview of steps in the research process for both quantitative and qualitative studies.
Chapter 4 focuses on research journal articles, explaining what they are and how to
read them. Chapter 5 discusses ethics in nursing studies.
Part 2, Preliminary Steps in Quantitative and Qualitative Research, further sets
the stage for learning about the research process by considering aspects of a study’s
conceptualization. Chapter 6 focuses on the development of research questions and
the formulation of research hypotheses. Chapter 7 discusses how to retrieve research
evidence (especially in electronic bibliographic databases) and the role of research
literature reviews. Chapter 8 presents information about theoretical and conceptual
frameworks.
Part 3, Designs and Methods for Quantitative and Qualitative Nursing Research,
presents material on the design and conduct of all types of nursing studies. Chapter 9
describes fundamental design principles and discusses many specific aspects of
quantitative research design, including efforts to enhance rigor. Chapter 10
introduces the topics of sampling and data collection in quantitative studies.
Concepts relating to quality in measurements—reliability and validity—are
introduced in this chapter. Chapter 11 describes the various qualitative research
traditions that have contributed to the growth of constructivist inquiry and presents
the basics of qualitative design. Chapter 12 covers sampling and data collection
methods used in qualitative research, describing how these differ from approaches
used in quantitative studies. Chapter 13 emphasizes mixed methods research, but the
chapter also discusses other special types of research such as surveys, outcomes
research, and quality improvement projects.
Part 4, Analysis and Interpretation in Quantitative and Qualitative Research,
presents tools for making sense of research data. Chapter 14 reviews methods of
statistical analysis. The chapter assumes no prior instruction in statistics and focuses
primarily on helping readers to understand why statistics are useful, what test might
be appropriate in a given situation, and what statistical information in a research
article means. Chapter 15 discusses approaches to interpreting statistical results,
including interpretations linked to assessments of clinical significance. Chapter 16
discusses qualitative analysis, with an emphasis on ethnographic, phenomenologic,
and grounded theory studies. Chapter 17 elaborates on criteria for appraising
trustworthiness and integrity in qualitative studies. Finally, Chapter 18 describes
systematic reviews, including how to understand and appraise both meta-analyses
and metasyntheses.
At the end of the book, we offer students additional critiquing support. In the
appendices, we provide four full-length research articles—two quantitative, one
qualitative, and one mixed methods—that students can read, analyze, and critique.
Full critiques of two of those studies are also provided, so students can model
their own critiques on them or compare their work against them. A glossary at the
end of the book provides additional support for those needing to look up the
meaning of a methodologic term.
FEATURES OF THE TEXT
We have retained many of the classic features that were successfully used in previous
editions to assist those learning to read and apply evidence from nursing research:
Clear, User-Friendly Style. Our writing style is easily digestible and nonintimidating
—and we have worked even harder in this edition to write clearly and simply.
Concepts are introduced carefully, difficult ideas are presented thoughtfully, and
readers are assumed to have no prior knowledge of technical terms.
Critiquing Guidelines. Each chapter includes guidelines for conducting a critique of
various aspects of a research report. The guidelines sections provide a list of
questions that walk students through a study, drawing attention to aspects of the
study that are amenable to appraisal by research consumers.
Research Examples and Critical Thinking Exercises. Each chapter concludes with
one or two actual research examples designed to highlight critical points made in the
chapter and to sharpen the reader’s critical thinking skills. In addition, many research
examples are used to illustrate key points in the text and to stimulate students’
thinking about areas of research inquiry. We have chosen many international
examples to communicate to students that nursing research is growing in importance
worldwide. Some of the Critical Thinking Exercises focus on the full-length articles
in Appendix A (a quantitative study) and Appendix B (a qualitative study).
Tips for Students. The textbook is filled with practical guidance and tips on how to
translate the abstract notions of research methods into more concrete applications. In
these tips, we have paid special attention to helping students read research reports,
which are often daunting to those without specialized research training.
Graphics. Colorful graphics—in the form of supportive tables, figures, and examples
—reinforce the text and offer visual stimulation.
Chapter Objectives. Learning objectives are identified in the chapter opener to focus
students’ attention on critical content.
Key Terms. Each chapter opener includes a list of new terms, and we have made the
list more focused and less daunting by including only key new terms. In the text, new
terms are defined in context (and bolded) when used for the first time; terms of lesser
11
importance are italicized. Key terms are also defined in our glossary.
Bulleted Summary Points. A succinct list of summary points that focus on salient
chapter content is provided at the end of each chapter.
Essentials of Nursing Research: Appraising Evidence for Nursing Practice, ninth
edition, has ancillary resources designed with both students and instructors in mind,
available on the book’s companion website.
Student Resources Available on the Companion Website
Supplements for Each Chapter further students’ exploration of specific topics. A full
list of the Supplements appears on page xxii. These supplements can be assigned to
provide additional background or to offer advanced material to meet students’
specific needs.
Interactive Critical Thinking Activity brings the Critical Thinking Exercises from
the textbook (except those focused on studies in the appendices) to an easy-to-use
interactive tool that enables students to apply new skills that they learn in each
chapter. Students are guided through appraisals of real research examples and then
ushered through a series of questions that challenge them to think about the quality of
evidence from the study. Responses can be printed or e-mailed directly to instructors
for homework or testing.
Hundreds of Student Review Questions help students to identify areas of strength
and areas needing further study.
Answers to Critical Thinking Exercises are provided for questions related to the
studies in Appendices A and B of the textbook.
Journal Articles—18 full articles from Wolters Kluwer journals (one corresponding
to each chapter)—are provided for additional critiquing opportunities. Many of these
are the full journal articles for studies used as the end-of-chapter Research Examples.
All journal articles that appear on the companion website are identified in the text with a special icon and are
called out in the References lists for appropriate chapters with a double asterisk (**).
Internet Resources with relevant and useful websites related to chapter content can
be clicked on directly without having to retype the URL and risk a typographical
error. This edition also includes links to all open-access articles cited in the
textbook; these articles are called out in the References lists for appropriate
chapters with a single asterisk (*).
Critiquing Guidelines and Learning Objectives from the textbook are available in
Microsoft Word for your convenience.
Nursing Professionals’ Roles and Responsibilities.
Instructor’s Resources Available on the Companion Website
NEW! Test Generator Questions are completely new and written by the book’s
authors for the ninth edition. Hundreds of multiple-choice questions aid instructors in
assessing their students’ understanding of the chapter content.
An Instructor’s Manual includes a preface that offers guidance to improve the
teaching experience. We have recognized the need for strong support for instructors
in teaching a course that can be quite challenging. Part of the difficulty stems from
students’ anxiety about the course content and their concern that research methods
might not be relevant to their nursing practice. We offer numerous suggestions on
how to make learning about—and teaching—research methods more rewarding. The
contents of the Instructor’s Manual include the following for each chapter:
Statement of Intent. Discover the authors’ goals for each chapter.
Special Class Projects. Find numerous ideas for interesting and meaningful class
projects. Check out the icebreakers and activities relating to the Great Cookie
Experiment with accompanying SPSS files.
Test Questions and Answers. True/false questions, plus important application
questions, test students’ comprehension and their ability to put their new
critiquing skills to use. The application questions focus on a brief summary of a
study and include several short-answer questions (with our answers), plus essay
questions. These application questions are intended to assess students’ knowledge
about methodologic concepts and their critiquing skills.
Answers to the Interactive Critical Thinking Activity. Suggested answers to the
questions in the Interactive Critical Thinking Activity are available to instructors.
Students can either print or e-mail their responses directly to the instructor for testing
or as a homework assignment.
Two sets of PowerPoint Slides:
“Test Yourself!” PowerPoint Slides. For each chapter, a slide set of five
multiple-choice “Test Yourself!” questions relating to key concepts in the chapter
is followed by answers to the questions. The aim of these slides is not to evaluate
student performance. We recommend these slides be given to students for self-
testing, or they can be used in the classroom with i>clickers to assess students’
grasp of important concepts. To enhance the likelihood that students will see the
relevance of the concepts to clinical practice, all the questions are application-type
questions. We hope instructors will use the slides to clarify any misunderstandings
and, just as importantly, to reward students with immediate positive feedback
about newly acquired skills.
PowerPoint Presentations offer traditional summaries of key points in each
chapter for use in class presentations. These slides are available in a format that
permits easy adaptation and also include audience response questions that can be
used on their own or are compatible with i>clicker and other audience response
programs and devices.
An Image Bank includes figures from the text.
QSEN Map shows how the book content integrates QSEN competencies.
BSN Essentials Competencies Map shows how the book content integrates American
Association of Colleges of Nursing (AACN) Essentials of Baccalaureate Education
for Professional Nursing Practice competencies.
Strategies for Effective Teaching offer creative approaches for engaging students.
Learning Management System Course Cartridges.
Access to all student resources previously discussed.
STUDY GUIDE
The accompanying Study Guide for Essentials of Nursing Research, ninth edition, is
available for purchase and augments the text, providing students with opportunities to
apply their learning.
Critiquing opportunities abound in the Study Guide, which includes eight research
articles in their entirety. The studies represent a range of nursing topics and
research approaches, including a randomized controlled trial, a correlational/mixed
methods study, an EBP project, three qualitative studies (ethnographic,
phenomenologic, and grounded theory), a meta-analysis, and a metasynthesis. The
Application Exercises in each chapter guide students in reading, understanding, and
critiquing these eight studies.
Answers to the “Questions of Fact” section in the Application Exercises in each
chapter are presented in Appendix I of the Study Guide so that students can get
immediate feedback about their responses.
Although critiquing skills are emphasized in the Study Guide, other included activities
support students’ learning of fundamental research terms and principles, such as fill-
in-the-blank exercises, matching exercises, and focused Study Questions. Answers to
those questions that have an objective answer are provided in Appendix I.
COMPREHENSIVE, INTEGRATED DIGITAL
LEARNING SOLUTIONS
We are delighted to introduce an expanded suite of digital solutions to support
instructors and students using Essentials of Nursing Research, ninth edition. Now for
the first time, our textbook is embedded into two integrated digital learning solutions—
one specific for prelicensure programs and the other for postlicensure—that build on the
features of the text with proven instructional design strategies. To learn more about
these solutions, visit http://www.nursingeducationsuccess.com/ or contact your local
Wolters Kluwer representative.
Our prelicensure solution, Lippincott CoursePoint, is a rich learning environment that
drives course and curriculum success to prepare students for practice. Lippincott
CoursePoint is designed for the way students learn. The solution connects learning to
real-life application by integrating content from Essentials of Nursing Research with
video cases, interactive modules, and evidence-based journal articles. Ideal for active,
case-based learning, this powerful solution helps students develop higher level cognitive
skills and asks them to make decisions related to simple-to-complex scenarios.
Lippincott CoursePoint for Nursing Research features the following:
Leading Content in Context. Digital content from Essentials of Nursing Research,
ninth edition, is embedded in our Powerful Tools, engaging students and encouraging
interaction and learning on a deeper level.
The complete interactive eBook features annual content updates with the latest
EBPs and provides students with anytime, anywhere access on multiple devices.
Full online access to Stedman’s Medical Dictionary for the Health Professions
and Nursing ensures students work with the best medical dictionary available.
Powerful Tools to Maximize Class Performance. Additional course-specific tools
provide case-based learning for every student:
Video Cases show how nursing research and evidence-based practice relate to
real-life nursing practice. By watching the videos and completing related
activities, students will flex their evidence-based practice skills and build a spirit
of inquiry.
Interactive Modules help students quickly identify what they do and do not
understand, so they can study smartly. With exceptional instructional design that
prompts students to discover, reflect, synthesize, and apply, students actively
learn. Remediation links to the digital textbook are integrated throughout.
Curated Collections of Journal Articles are provided via Lippincott
NursingCenter, Wolters Kluwer’s premier destination for peer-reviewed nursing
journals. Through integration of CoursePoint and NursingCenter, students will
engage in how nursing research influences practice.
Data to Measure Students’ Progress. Student performance data provided in an
intuitive display lets instructors quickly assess whether students have viewed
interactive modules and video cases outside of class as well as see students’
performance on related NCLEX-style quizzes, ensuring students are coming to the
classroom ready and prepared to learn.
To learn more about Lippincott CoursePoint, please visit
www.nursingeducationsuccess.com/coursepoint.
Lippincott RN to BSN Online: Nursing Research is a postlicensure solution for
online and hybrid courses, marrying experiential learning with the trusted content in
Essentials of Nursing Research, ninth edition.
Built around learning objectives that are aligned to the BSN Essentials and QSEN
nursing curriculum standards, every aspect of Lippincott RN to BSN Online is designed
to engage, challenge, and cultivate postlicensure students.
Self-Paced Interactive Modules employ key instructional design strategies—
including storytelling, modeling, and case-based and problem-based scenarios—to
actively involve students in learning new material and focus students’ learning
outcomes on real-life application.
Pre- and Postmodule Assessments activate students’ existing knowledge prior to
engaging with the module, then assess their competency after completing the
module.
Discussion Board Questions create an ongoing dialogue to foster social learning.
Writing and Group Work Assignments hone students’ competence in writing and
communication, instilling the skills needed to advance their nursing careers.
Collated Journal Articles acquaint students with the body of ongoing nursing research
in the recent literature.
Case Study Assignments, including unfolding cases that evolve from cases in the
interactive modules, aid students in applying theory to real-life situations.
Best Practices in Scholarly Writing Guide covers APA formatting and style
guidelines.
Used alone or in conjunction with other instructor-created resources, Lippincott RN
to BSN Online adds interactivity to courses. It also saves instructors time by keeping
both textbook and course resources current and accurate through regular updates to the
content.
To learn more about Lippincott RN to BSN Online, please visit
http://www.nursingeducationsuccess.com/nursing-education-solutions/lippincott-
rn-bsn-online/.
CLOSING NOTE
It is our hope and expectation that the content, style, and organization of this ninth
edition of Essentials of Nursing Research will be helpful to those students who want to
become skillful, thoughtful readers of nursing studies and to those wishing to enhance
their clinical performance based on research findings. We also hope that this textbook
will help to develop an enthusiasm for the kinds of discoveries and knowledge that
research can produce.
Denise F. Polit, PhD, FAAN
Cheryl Tatano Beck, DNSc, CNM, FAAN
USER’S GUIDE
Learning Objectives focus students’ attention on critical content
Key Terms alert students to important terminology
Examples help students apply content to real-life research
Tip boxes describe what is found in actual research articles
How-to-Tell Tip boxes explain confusing issues in actual research articles
Critiquing Guidelines boxes lead students through key issues in a research article
Research Examples highlight critical points made in the chapter and sharpen critical
thinking skills
Critical Thinking Exercises provide opportunities to practice critiquing actual research
articles
Summary Points review chapter content to ensure success
Special icons alert students to important content found on the companion website and in the
accompanying Study Guide
REVIEWERS
Lisa Aiello-Laws, RN, MSN, AOCNS, APN-C
Assistant Clinical Professor
College of Nursing and Health Professions
Drexel University
Philadelphia, Pennsylvania
Elizabeth W. Black, MSN, CSN
Assistant Professor
Gwynedd Mercy University
Gwynedd Valley, Pennsylvania
Lynn P. Blanchette, RN, PhD
Program Director
Rhode Island College
Providence, Rhode Island
Anne Watson Bongiorno, PhD, APHN-BC, CNE
Associate Professor
State University of New York at Plattsburgh
Plattsburgh, New York
Katherine Bowman, PhD, RN
Assistant Teaching Professor
Sinclair School of Nursing
University of Missouri
Columbia, Missouri
Barb Braband, EdD, RN, CNE
Master’s Program Director
University of Portland
Portland, Oregon
Vera Brancato, EdD, MSN, RN, CNE
Professor of Nursing
Alvernia University
Reading, Pennsylvania
Jennifer Bryer, PhD, RN, CNE
Chairperson and Associate Professor
Department of Nursing
Farmingdale State College
Farmingdale, New York
Wendy Budin, PhD, RN-BC, FACCE, FAAN
Adjunct Professor
New York University
New York, New York
Carol Caico, PhD, CS, NP
Associate Professor
New York Institute of Technology
New York, New York
Mary Ann Cantrell, PhD, RN, CNE, FAAN
Assistant Professor
Villanova University
Villanova, Pennsylvania
Ruth Chaplen, RN, MSN, DNP, ACNS-BC, AOCN
Associate Professor of Nursing
Rochester College
Rochester Hills, Michigan
Lori Ciafardoni, RN, MSN/ED
Assistant Professor
State University of New York at Delhi
Delhi, New York
Leah Cleveland, EdD, RN, CNS, PHN, CDE
Lecturer
California State University, Fullerton
Fullerton, California
Susan Davidson, EdD, APN, NP-C
Professor
School of Nursing
Coordinator
Gateway RN-BSN Program
School of Nursing
University of Tennessee at Chattanooga
Chattanooga, Tennessee
Pamela de Cordova, PhD, RN-BC
Assistant Professor
Rutgers University
New Brunswick, New Jersey
Josephine DeVito, PhD, RN
Undergraduate Chair and Associate Professor
College of Nursing
Seton Hall University
South Orange, New Jersey
Nancy Ann C. Falvo, BSN, MSN, PhD
Assistant Professor
Clarion University of Pennsylvania
Clarion, Pennsylvania
Jeanie Flood, PhD, RN-C, IBCLC
RN to BSN Faculty Advisor
University of Hawaii at Hilo
Hilo, Hawaii
Deborah Hunt, PhD, RN
Associate Professor
College of New Rochelle
New Rochelle, New York
Linda Johanson, EdD, RN
Associate Professor
Appalachian State University
Boone, North Carolina
Lucina Kimpel, PhD, RN
Associate Professor
Mercy College of Health Sciences
Des Moines, Iowa
Pamela Kohlbry, PhD, RN, CNL
Associate Professor
Med/Surg Lead and CNL Program Coordinator
California State University San Marcos
San Marcos, California
Leann Laubach, PhD, RN
Professor
Career Advancement Coordinator
University of Central Oklahoma
Edmond, Oklahoma
Hayley Mark, PhD, MSN, MPH, RN
Chairperson
Department of Nursing
Towson University
Towson, Maryland
Donna Martin, DNP, MSN, RN-BC, CDE
Assistant Professor
Lewis University
Romeoville, Illinois
Ditsapelo McFarland, PhD, MSN, EdD
Associate Professor
Adelphi University
Garden City, New York
Kristina S. Miller, DNP, RN, PCNS-BC
Instructor of Maternal Child Nursing
College of Nursing
University of South Alabama
Mobile, Alabama
Kathy T. Morris, EdD, MSN, RN
Assistant Professor
Armstrong State University
Savannah, Georgia
Elizabeth Murray, PhD, RN, CNE
Assistant Professor
Florida Gulf Coast University
Fort Myers, Florida
Sarah Newton, PhD, RN
Associate Professor
School of Nursing
Oakland University
Rochester, Michigan
Mae Ann Pasquale, RN, BSN, MSN
Assistant Professor of Nursing
Cedar Crest College
Allentown, Pennsylvania
Kim L. Paxton, DNP, APRN, ANP-BC, LIHT-C
Assistant Professor
Cardinal Stritch University
Milwaukee, Wisconsin
Janet Reagor, PhD, RN
Interim Dean and Assistant Professor of Nursing
Director
RN-BSN Program
Avila University
Kansas City, Missouri
Elizabeth A. Roe, PhD, RN
Acting Assistant Dean
College of Human and Health Sciences
Saginaw Valley State University
Saginaw, Michigan
Cathy Rozmus, PhD, RN
Professor
Associate Dean for Academic Affairs
University of Texas Health Science Center at Houston
Houston, Texas
Milena P. Staykova, EdD, FNC-BC
Director
Post-Licensure Bachelor of Science in Nursing
Jefferson College of Health Sciences
Roanoke, Virginia
Amy Stimpfel, PhD, RN
Assistant Professor
College of Nursing
New York University
New York, New York
Yiyuan Sun, DNSc
Associate Professor
Adelphi University
Garden City, New York
Annie Thomas, PhD, RN
Assistant Professor
Loyola University Chicago
Chicago, Illinois
Elizabeth VandeWaa, PhD
Professor of Adult Health Nursing
University of South Alabama
Mobile, Alabama
Adrienne Wald, BSN, MBA, EdD
Assistant Professor
College of New Rochelle
New Rochelle, New York
Camille Wendekier, PhD, CRRN, CSN, RN
Assistant Professor
Saint Francis University
Loretto, Pennsylvania
Kathleen Williamson, RN, PhD
Chair
Wilson School of Nursing
Midwestern State University
Wichita Falls, Texas
Roxanne Wilson, PhD, RN
Assistant Professor
St. Cloud State University
St. Cloud, Minnesota
Paige Wimberley, RN, CNS, CNE
Assistant Professor of Nursing
Arkansas State University
Jonesboro, Arkansas
Charlotte A. Wisnewski, PhD, RN, CDE, CNE
Associate Professor
University of Texas Medical Branch
Galveston, Texas
ACKNOWLEDGMENTS
This ninth edition, like the previous eight editions, depended on the contribution of
many generous people. To all of the many faculty and students who used the text and
have made invaluable suggestions for its improvement, we are very grateful.
Suggestions were made to us both directly in personal interactions (mostly at the
University of Connecticut and Griffith University in Australia) and via e-mail
correspondence. We would like in particular to thank Valori Banfi, nursing librarian at
the University of Connecticut, and John McNulty, a faculty member at the University of
Connecticut. We would also like to acknowledge the reviewers of the ninth edition of
Essentials.
Other individuals made specific contributions. Although it would be impossible to
mention all, we note with thanks the nurse researchers who shared their work with us as
we developed examples, including work that in some cases was not yet published. We
also extend our warm thanks to those who helped to turn the manuscript into a finished
product. The staff at Wolters Kluwer has been of tremendous assistance in the support
they have given us over the years. We are indebted to Christina C. Burns, Emily
Lupash, Meredith L. Brittain, Marian Bellus, and all the others behind the scenes for
their fine contributions. Thanks also to Rodel Fariñas for his patience and good humor
in turning our manuscript into this textbook.
Finally, we thank our families, our loved ones, and our friends, who provided
ongoing support and encouragement throughout this endeavor and who were tolerant
when we worked long into the night, over weekends, and during holidays to get this
ninth edition finished.
CONTENTS
Part 1 Overview of Nursing Research and Its Role in Evidence-Based
Practice
1 Introduction to Nursing Research in an Evidence-Based Practice Environment
2 Fundamentals of Evidence-Based Nursing Practice
3 Key Concepts and Steps in Quantitative and Qualitative Research
4 Reading and Critiquing Research Articles
5 Ethics in Research
Part 2 Preliminary Steps in Quantitative and Qualitative Research
6 Research Problems, Research Questions, and Hypotheses
7 Finding and Reviewing Research Evidence in the Literature
8 Theoretical and Conceptual Frameworks
Part 3 Designs and Methods for Quantitative and Qualitative Nursing
Research
9 Quantitative Research Design
10 Sampling and Data Collection in Quantitative Studies
11 Qualitative Designs and Approaches
12 Sampling and Data Collection in Qualitative Studies
13 Mixed Methods and Other Special Types of Research
Part 4 Analysis and Interpretation in Quantitative and Qualitative
Research
14 Statistical Analysis of Quantitative Data
15 Interpretation and Clinical Significance in Quantitative Research
16 Analysis of Qualitative Data
17 Trustworthiness and Integrity in Qualitative Research
18 Systematic Reviews: Meta-Analysis and Metasynthesis
Appendix A Swenson et al.’s (2016) Study: Parents’ Use of Praise and Criticism in
a Sample of Young Children Seeking Mental Health Services
Appendix B Beck and Watson’s (2010) Study: Subsequent Childbirth After a
Previous Traumatic Birth
Appendix C Wilson et al.’s (2016) Study: A Randomized Controlled Trial of an
Individualized Preoperative Education Intervention for Symptom
Management After Total Knee Arthroplasty
Critique of Wilson and Colleagues’ Study
Appendix D Sawyer et al.’s (2010) Study: Differences in Perceptions of the
Diagnosis and Treatment of Obstructive Sleep Apnea and Continuous
Positive Airway Pressure Therapy Among Adherers and Nonadherers
Critique of Sawyer and Colleagues’ Study
Glossary
Index
CHAPTER SUPPLEMENTS AVAILABLE ON THE COMPANION WEBSITE
Supplement for Chapter 1 The History of Nursing Research
Supplement for Chapter 2 Evaluating Clinical Practice Guidelines—AGREE II
Supplement for Chapter 3 Deductive and Inductive Reasoning
Supplement for Chapter 4 Guide to an Overall Critique of a Quantitative
Research Report and Guide to an Overall Critique of a
Qualitative Research Report
Supplement for Chapter 5 Informed Consent
Supplement for Chapter 6 Simple and Complex Hypotheses
Supplement for Chapter 7 Finding Evidence for an EBP Inquiry in PubMed
Supplement for Chapter 8 Prominent Conceptual Models of Nursing Used by
Nurse Researchers
Supplement for Chapter 9 Selected Experimental and Quasi-Experimental
Designs: Diagrams, Uses, and Drawbacks
Supplement for Chapter 10 Vignettes and Q-Sorts
Supplement for Chapter 11 Qualitative Descriptive Studies
Supplement for Chapter 12 Transferability and Generalizability
Supplement for Chapter 13 Other Specific Types of Research
Supplement for Chapter 14 Multivariate Statistics
Supplement for Chapter 15 Research Biases
Supplement for Chapter 16 A Glaserian Grounded Theory Study: Illustrative
Materials
Supplement for Chapter 17 Whittemore and Colleagues’ Framework of Quality
Criteria in Qualitative Research
Supplement for Chapter 18 Publication Bias in Meta-Analyses
Part 1 Overview of Nursing Research and Its
Role in Evidence-Based Practice
1 Introduction to Nursing Research in
an Evidence-Based Practice
Environment
Learning Objectives
On completing this chapter, you will be able to:
Understand why research is important in nursing
Discuss the need for evidence-based practice
Describe broad historical trends and future directions in nursing research
Identify alternative sources of evidence for nursing practice
Describe major characteristics of the positivist and constructivist paradigms
Compare the traditional scientific method (quantitative research) with constructivist
methods (qualitative research)
Identify several purposes of quantitative and qualitative research
Define new terms in the chapter
Key Terms
Assumption
Cause-probing research
Clinical nursing research
Clinical significance
Constructivist paradigm
Empirical evidence
Evidence-based practice (EBP)
Generalizability
Journal club
Nursing research
Paradigm
Positivist paradigm
Qualitative research
Quantitative research
Research
Research methods
Scientific method
Systematic review
NURSING RESEARCH IN PERSPECTIVE
We know that many of you readers are not taking this course because you plan to
become nurse researchers. Yet, we are also confident that many of you will participate
in research-related activities during your careers, and virtually all of you will be
expected to be research-savvy at a basic level. Although you may not yet grasp the
relevance of research in your career as a nurse, we hope that you will come to see the
value of nursing research during this course and will be inspired by the efforts of the
thousands of nurse researchers now working worldwide to improve patient care. You
are embarking on a lifelong journey in which research will play a role. We hope to
prepare you to enjoy the voyage.
What Is Nursing Research?
Whether you know it or not, you have already done a lot of research. When you use the
Internet to find the “best deal” on a laptop or an airfare, you start with a question (e.g.,
Who has the best deal for what I want?), collect the information by searching different
websites, and then come to a conclusion. This “everyday research” has much in
common with formal research—but, of course, there are important differences, too.
As a formal enterprise, research is systematic inquiry that uses disciplined methods
to answer questions and solve problems. The ultimate goal of formal research is to gain
knowledge that would be useful for many people. Nursing research is systematic
inquiry designed to develop trustworthy evidence about issues of importance to nurses
and their clients. In this book, we emphasize clinical nursing research, which is
research designed to guide nursing practice. Clinical nursing research typically begins
with questions stemming from practice problems—problems you may have already
encountered.
Examples of nursing research questions
Does a text message notification process help to reduce follow-up time for women
with abnormal mammograms? (Oakley-Girvan et al., 2016)
What are the daily experiences of patients receiving hemodialysis treatment for
end-stage renal disease? (Chiaranai, 2016)
TIP You may have the impression that research is abstract and irrelevant to
practicing nurses. But nursing research is about real people with real
problems, and studying those problems offers opportunities to solve or
address them through improvements to nursing care.
The Importance of Research to Evidence-Based Nursing
Nursing has experienced profound changes in the past few decades. Nurses are
increasingly expected to understand and undertake research and to base their practice on
evidence from research—that is, to adopt an evidence-based practice (EBP). EBP,
broadly defined, is the use of the best evidence in making patient care decisions. Such
evidence typically comes from research conducted by nurses and other health care
professionals. Nurse leaders recognize the need to base specific nursing decisions on
evidence indicating that the decisions are clinically appropriate and cost-effective and
result in positive client outcomes.
In some countries, research plays an important role in nursing credentialing and
status. For example, the American Nurses Credentialing Center—an arm of the
American Nurses Association—has developed a Magnet Recognition Program to
recognize health care organizations that provide high-quality nursing care. To achieve
Magnet status, practice environments must demonstrate a sustained commitment to EBP
and nursing research. Changes to nursing practice are happening every day because of
EBP efforts.
Example of evidence-based practice
Many clinical practice changes reflect the impact of research. For example,
“kangaroo care,” the holding of diaper-clad preterm infants skin-to-skin, chest-to-
chest by parents, is now widely practiced in neonatal intensive care units (NICUs),
but in the early 1990s, only a minority of NICUs offered kangaroo care options. The
adoption of this practice reflects good evidence that early skin-to-skin contact has
clinical benefits and no negative side effects (Ludington-Hoe, 2011; Moore et al.,
2012). Some of this evidence comes from rigorous studies by nurse researchers (e.g.,
Campbell-Yeo et al., 2013; Cong et al., 2009; Cong et al., 2011; Holditch-Davis et
al., 2014; Lowson et al., 2015).
Roles of Nurses in Research
In the current EBP environment, every nurse is likely to engage in one or more activities
along a continuum of research participation. At one end of the continuum are users or
consumers of nursing research—nurses who read research reports to keep up-to-date on
findings that may affect their practice. EBP depends on well-informed nursing research
consumers.
At the other end of the continuum are the producers of nursing research: nurses
who actively design and undertake studies. At one time, most nurse researchers were
academics who taught in schools of nursing, but research is increasingly being
conducted by practicing nurses who want to find what works best for their clients.
Between these two end points on the continuum lie a variety of research activities in
which nurses engage. Even if you never conduct a study, you may do one of the
following:
1. Contribute an idea for a clinical inquiry
2. Assist in collecting research information
3. Offer advice to clients about participating in a study
4. Search for research evidence
5. Discuss the implications of a study in a journal club (a group in your practice
setting that meets to discuss research articles)
In all research-related activities, nurses who have some research skills are better
able than those without them to contribute to nursing and to EBP. Thus, with the
research skills you gain from this book, you will be prepared to contribute to the
advancement of nursing.
Nursing Research: Past and Present
Most people agree that research in nursing began with Florence Nightingale in the mid-
19th century. Based on her skillful analysis of factors affecting soldier mortality and
morbidity during the Crimean War, she was successful in bringing about changes in
nursing care and in public health. For many years after Nightingale’s work, however,
research was absent from the nursing literature. Studies began to appear in the early
1900s but most concerned nurses’ education.
In the 1950s, nursing research began to flourish. An increase in the number of
nurses with advanced skills and degrees, an increase in the availability of research
funding, and the establishment of the journal Nursing Research helped to propel nursing
research. During the 1960s, practice-oriented research began to emerge, and research-
oriented journals started publication in several countries. During the 1970s, there was a
change in research emphasis from areas such as teaching and nurses’ characteristics to
improvements in client care. Nurses also began to pay attention to the utilization of
research findings in nursing practice.
The 1980s brought nursing research to a new level of development. Of particular
importance in the United States was the establishment in 1986 of the National Center
for Nursing Research (NCNR) at the National Institutes of Health (NIH). The purpose
of NCNR was to promote and financially support research projects and training relating
to patient care. Nursing research was strengthened and given more visibility when
NCNR was promoted to full institute status within the NIH: In 1993, the National
Institute of Nursing Research (NINR) was established. The birth and expansion of
NINR helped put nursing research more into the mainstream of research activities
enjoyed by other health disciplines. Funding opportunities expanded in other countries
as well.
The 1990s witnessed the birth of several more journals for nurse researchers, and
specialty journals increasingly came to publish research articles. International
cooperation in integrating EBP into nursing also began to develop in the 1990s. For
example, Sigma Theta Tau International sponsored the first international research
utilization conference, in cooperation with the faculty of the University of Toronto, in
1998.
TIP For those interested in learning more about the history of nursing
research, we offer an expanded summary in the Supplement to this chapter
on the book’s companion website.
Future Directions for Nursing Research
Nursing research continues to develop at a rapid pace and will undoubtedly flourish in
the 21st century. In 1986, NCNR had a budget of $16 million, whereas NINR funding
in fiscal year 2016 was just under $150 million. Among the trends we foresee for the
near future are the following:
Continued focus on EBP. Encouragement for nurses to use research findings in
practice is sure to continue. This means that improvements will be needed in the
quality of nursing studies and in nurses’ skills in locating, understanding, critiquing,
and using relevant study results. Relatedly, there is an emerging interest in
translational research—research on how findings from studies can best be translated
into practice.
Stronger evidence through confirmatory strategies. Practicing nurses rarely adopt an
innovation on the basis of poorly designed or isolated studies. Strong research
designs are essential, and confirmation is usually needed through replication (i.e.,
repeating) of studies in different clinical settings to ensure that the findings are
robust.
Continued emphasis on systematic reviews. Systematic reviews are a cornerstone of
EBP and have assumed increasing importance in all health disciplines. Systematic
reviews rigorously integrate research information on a topic so that conclusions
about the state of evidence can be reached.
Expanded local research in health care settings. Small studies designed to solve local
problems will likely increase. This trend will be reinforced as more hospitals apply
for (and are recertified for) Magnet status in the United States and in other countries.
Expanded dissemination of research findings. The Internet and other technological
advances have had a big impact on the dissemination of research information, which
in turn helps to promote EBP.
Increased focus on cultural issues and health disparities. The issue of health
disparities has emerged as a central concern, and this in turn has raised consciousness
about the cultural sensitivity of health interventions. Research must be sensitive to
the beliefs, behaviors, epidemiology, and values of culturally and linguistically
diverse populations.
Clinical significance and patient input. Research findings increasingly must meet the
test of being clinically significant, and patients have taken center stage in efforts to
define clinical significance. A major challenge in the years ahead will involve
incorporating both research evidence and patient preferences into clinical decisions.
What are nurse researchers likely to be studying in the future? Although there is
tremendous diversity in research interests, research priorities have been articulated by
NINR, Sigma Theta Tau International, and other nursing organizations. For example,
NINR’s Strategic Plan, launched in 2011 and updated in 2013, described five areas of
focus: promoting health and preventing disease, symptom management and self-
management, end-of-life and palliative care, innovation, and the development of nurse
scientists (http://www.ninr.nih.gov).
TIP All websites cited in this chapter, plus additional websites with useful
content relating to the foundations of nursing research, are in the Internet
Resources on the book’s companion website. This will allow you to simply use the
“Control/Click” feature to go directly to the website, without having to type
in the URL and risk a typographical error. Websites corresponding to the
content of all chapters of the book are also available there.
SOURCES OF EVIDENCE FOR NURSING PRACTICE
Nurses make clinical decisions based on a large repertoire of knowledge. As a nursing
student, you are gaining skills on how to practice nursing from your instructors,
textbooks, and clinical placements. When you become a registered nurse (RN), you will
continue to learn from other nurses and health care professionals. Because evidence is
constantly evolving, learning about best-practice nursing will carry on throughout your
career.
Some of what you have learned thus far is based on systematic research, but much
of it is not. What are the sources of evidence for nursing practice? Where does
knowledge for practice come from? Until fairly recently, knowledge primarily was
handed down from one generation to the next based on clinical experience, trial and
error, tradition, and expert opinion. These alternative sources of knowledge are different
from research-based information.
Tradition and Authority
Some nursing interventions are based on untested traditions, customs, and “unit culture”
rather than on sound evidence. Indeed, a recent analysis suggests that some “sacred
cows” (ineffective traditional habits) persist even in a health care center recognized as a
leader in EBP (Hanrahan et al., 2015). Another common source of knowledge is an
authority, a person with specialized expertise. Reliance on authorities (such as nursing
faculty or textbook authors) is unavoidable. Authorities, however, are not infallible—
particularly if their expertise is based primarily on personal experience; yet, their
knowledge is often unchallenged.
Example of “myths” in nursing textbooks
One study suggests that nursing textbooks may contain many “myths.” In their
analysis of 23 widely used undergraduate psychiatric nursing textbooks, Holman and
colleagues (2010) found that all books contained at least one unsupported assumption
(myth) about loss and grief—i.e., assumptions not supported by current research
evidence. Moreover, many evidence-based findings about grief and loss were not
included in the textbooks.
TIP The consequences of not using research-based evidence can be
devastating. For example, from 1956 through the 1980s, Dr. Benjamin Spock
published several editions of Baby and Child Care, a parental guide that sold
over 19 million copies worldwide. As an authority figure, he wrote the
following advice: “I think it is preferable to accustom a baby to sleeping on
his stomach from the beginning if he is willing” (Spock, 1979, p. 164).
Research has clearly demonstrated that this sleeping position is associated
with heightened risk of sudden infant death syndrome (SIDS). In their
systematic review of evidence, Gilbert and colleagues (2005) wrote, “Advice
to put infants to sleep on the front for nearly half a century was contrary to
evidence from 1970 that this was likely to be harmful” (p. 874). They
estimated that if medical advice had been guided by research evidence, over
60,000 infant deaths might have been prevented.
Clinical Experience and Trial and Error
Clinical experience is a functional source of knowledge. Yet, personal experience has
limitations as a source of evidence for practice because each nurse’s experience is too
narrow to be generally useful, and personal experiences are often colored by biases.
Trial and error involves trying alternatives successively until a solution to a problem is
found. Trial and error can be practical, but the method tends to be haphazard, and
solutions may be idiosyncratic.
Assembled Information
In making clinical decisions, health care professionals also rely on information that has
been assembled for various purposes. For example, local, national, and international
benchmarking data provide information on such issues as the rates of using various
procedures (e.g., rates of cesarean deliveries) or rates of clinical problems (e.g.,
nosocomial infections). Quality improvement and risk data, such as medication error
reports, can be used to assess practices and determine the need for practice changes.
Such sources offer useful information but provide no mechanism to actually guide
improvements.
Disciplined Research
Disciplined research is considered the best method of acquiring reliable knowledge that
humans have developed. Evidence-based health care compels nurses to base their
clinical practice, to the extent possible, on rigorous research-based findings rather than
on tradition, authority, or personal experience. However, nursing will always be a rich
blend of art and science.
PARADIGMS AND METHODS FOR NURSING
RESEARCH
The questions that nurse researchers ask and the methods they use to answer their
questions spring from a researcher’s view of how the world “works.” A paradigm is a
worldview, a general perspective on the world’s complexities. Disciplined inquiry in
nursing has been conducted mainly within two broad paradigms. This section describes
the two paradigms and outlines the research methods associated with them.
The Positivist Paradigm
The paradigm that dominated nursing research for decades is called positivism.
Positivism is rooted in 19th-century thought, building on the work of earlier thinkers
such as Newton and Locke. Positivism is a reflection of a broad cultural movement (modernism) that
emphasizes the rational and scientific.
As shown in Table 1.1, a fundamental assumption of positivists is that there is a
reality out there that can be studied and known. An assumption is a principle that is
believed to be true without verification. Adherents of positivism assume that nature is
ordered and regular and that a reality exists independent of human observation. In other
words, the world is assumed not to be merely a creation of the human mind. The
assumption of determinism refers to the positivists’ belief that phenomena are not
haphazard but rather have antecedent causes. If a person has a stroke, a scientist in a
positivist tradition assumes that there must be one or more reasons that can be
potentially identified. Within the positivist paradigm, research activity is often aimed
at understanding the underlying causes of natural phenomena.
TIP What do we mean by phenomena? In a research context, phenomena
are those things in which researchers are interested—such as a health event
(e.g., a patient fall), a health outcome (e.g., pain), or a health experience (e.g.,
living with chronic pain).
Because of their belief in objective reality, positivists prize objectivity. Their
approach involves the use of orderly, disciplined procedures with tight controls over the
research situation to test hunches about the nature of phenomena being studied and
relationships among them.
Strict positivist thinking has been challenged, and few researchers adhere to the
tenets of pure positivism. Postpositivists still believe in reality and seek to understand it,
but they recognize the impossibility of total objectivity. Yet, they see objectivity as a
goal and strive to be as unbiased as possible. Postpositivists also appreciate the barriers
to knowing reality with certainty and therefore seek probabilistic evidence—i.e.,
learning what the true state of a phenomenon probably is. This modified positivist
position remains a dominant force in nursing research. For the sake of simplicity, we
refer to it as positivism.
The Constructivist Paradigm
The constructivist paradigm (sometimes called the naturalistic paradigm) began as a
countermovement to positivism with writers such as Weber and Kant. The constructivist
paradigm is a major alternative system for conducting research in nursing. Table 1.1
compares four major assumptions of the positivist and constructivist paradigms.
For the naturalistic inquirer, reality is not a fixed entity but rather a construction of
the people participating in the research; reality exists within a context, and many
constructions are possible. Naturalists take the position of relativism: If there are
multiple interpretations of reality that exist in people’s minds, then there is no process
by which the ultimate truth or falsity of the constructions can be determined.
The constructivist paradigm assumes that knowledge is maximized when the
distance between the inquirer and participants in the study is minimized. The voices and
interpretations of those under study are crucial to understanding the phenomenon of
interest, and subjective interactions are the best way to access them. Findings from a
constructivist inquiry are the product of the interaction between the inquirer and the
participants.
Paradigms and Methods: Quantitative and Qualitative
Research
Research methods are the techniques researchers use to structure a study and to gather
and analyze relevant information. The two paradigms correspond to different methods
of developing evidence. A key methodologic distinction is between quantitative
research, which is most closely allied with positivism, and qualitative research, which
is associated with constructivist inquiry—although positivists sometimes undertake
qualitative studies, and constructivist researchers sometimes collect quantitative
information. This section gives an overview of the methods linked to the two alternative
paradigms.
The Scientific Method and Quantitative Research
The traditional, positivist scientific method involves using a set of orderly procedures
to gather information. Quantitative researchers typically move in a systematic fashion
from the definition of a problem to a solution. By systematic, we mean that investigators
progress through a series of steps, according to a prespecified plan. Quantitative
researchers use objective methods designed to control the research situation with the
goal of minimizing bias and maximizing validity.
Quantitative researchers gather empirical evidence—evidence that is rooted in
objective reality and gathered directly or indirectly through the senses rather than
through personal beliefs or hunches. Evidence for a quantitative study is gathered
systematically, using formal instruments to collect needed information. Usually (but not
always) the information is quantitative—that is, numeric information that results from
some type of formal measurement and that is analyzed statistically. Quantitative
researchers strive to go beyond the specifics of a research situation; the ability to
generalize research findings to individuals other than those who took part in the study
(referred to as generalizability) is an important goal.
The traditional scientific method has been used productively by nurse researchers
studying a wide range of questions. Yet, there are important limitations. For example,
quantitative researchers must deal with problems of measurement. To study a
phenomenon, scientists must measure it, that is, attach numeric values that express
quantity. For example, if the phenomenon of interest were patient stress, researchers
would want to assess whether stress is high or low, or higher under certain conditions or
for some people. Physiologic phenomena such as blood pressure and temperature can be
measured with accuracy and precision, but the same cannot be said of most
psychological phenomena, such as stress or resilience.
Another issue is that nursing research focuses on human beings, who are inherently
complicated and diverse. The traditional scientific method typically focuses on a
relatively small set of phenomena (e.g., weight gain, depression) in a study.
Complexities tend to be controlled and, if possible, eliminated rather than studied
directly, and this narrowness of focus can sometimes obscure insights. Relatedly,
quantitative research within the positivist paradigm has sometimes been accused of a
narrowness of vision that does not capture the full breadth of human experience.
TIP Students often find quantitative studies more intimidating and difficult
than qualitative ones. Try not to worry too much about the jargon at first—
remember that each study has a story to tell, and grasping the main point of
the story is what is initially important.
Constructivist Methods and Qualitative Research
Researchers in constructivist traditions emphasize the inherent complexity of humans,
their ability to shape and create their own experiences, and the idea that truth is a
composite of realities. Consequently, constructivist studies are heavily focused on
understanding the human experience as it is lived, through the careful collection and
analysis of qualitative materials that are narrative and subjective.
Researchers who reject the traditional scientific method believe that a major
limitation is that it is reductionist—that is, it reduces human experience to only the few
concepts under investigation, and those concepts are defined in advance by researchers
rather than emerging from the experiences of those under study. Constructivist
researchers tend to emphasize the dynamic, holistic, and individual aspects of human
life and try to capture those aspects in their entirety, within the context of those who are
experiencing them.
Flexible, evolving procedures are used to capitalize on findings that emerge during
the study, which typically is undertaken in naturalistic settings. The collection of
information and its analysis usually progress concurrently. As researchers sift through
information, insights are gained, new questions emerge, and further evidence is sought
to confirm the insights. Through an inductive process (going from specifics to the
general), researchers integrate information to develop a theory or description that
illuminates the phenomena under observation.
Constructivist studies yield rich, in-depth information that can potentially clarify the
varied dimensions (or themes) of a complicated phenomenon. Findings from qualitative
research are typically grounded in the real-life experiences of people with firsthand
knowledge of a phenomenon. Nevertheless, the approach has several limitations.
Human beings are used directly as the instrument through which information is
gathered, and humans are highly intelligent—but fallible—tools.
Another potential limitation involves the subjectivity of constructivist inquiry,
which sometimes raises concerns about the idiosyncratic nature of the conclusions.
Would two constructivist researchers studying the same phenomenon in similar settings
arrive at similar conclusions? The situation is magnified by the fact that most
constructivist studies involve a small group of participants. Thus, the generalizability of
findings from constructivist inquiries is an issue of potential concern.
TIP Researchers usually do not discuss or even mention the underlying
paradigm of their studies in their reports. The paradigm provides context,
without being explicitly referenced.
Multiple Paradigms and Nursing Research
Paradigms are lenses that help to sharpen researchers’ focus on phenomena of interest,
not blinders that limit curiosity. We think that the emergence of alternative paradigms
for studying nursing problems is a desirable trend that can maximize the breadth of new
evidence for practice. Nursing knowledge would be thin if it were not for a rich array of
methods—methods that are often complementary in their strengths and limitations.
We have emphasized differences between the two paradigms and associated
methods so that distinctions would be easy to understand. It is equally important,
however, to note that the two paradigms have many features in common, some of which
are mentioned here:
Ultimate goals. The ultimate aim of disciplined research, regardless of paradigm, is to
answer questions and solve problems. Both quantitative and qualitative researchers
seek to capture the truth with regard to the phenomena in which they are interested.
External evidence. The word empiricism is often associated with the scientific method,
but researchers in both traditions collect and analyze evidence gathered empirically,
that is, through their senses.
Reliance on human cooperation. Human cooperation is essential in both quantitative
and qualitative research. To understand people’s characteristics and experiences,
researchers must encourage people to participate in the study and to speak candidly.
Ethical constraints. Research with human beings is guided by ethical principles that
sometimes interfere with research goals. Ethical dilemmas often confront
researchers, regardless of paradigms or methods.
Fallibility. Virtually all studies have limitations. Every research question can be
addressed in different ways, and inevitably, there are tradeoffs. Financial constraints
are often an issue, but limitations exist even in well-funded research. This means that
no single study can ever definitively answer a research question. The fallibility of
any single study makes it important to understand and critique researchers’ methods
when evaluating evidence quality.
Thus, despite philosophic and methodologic differences, researchers using the
traditional scientific method or constructivist methods share basic goals and face many
similar challenges. The selection of an appropriate method depends not only on
researchers’ philosophy and worldview but also on the research question. If a researcher
asks, “What are the effects of cryotherapy on nausea and oral mucositis in patients
undergoing chemotherapy?” the researcher needs to examine effects through the careful
quantitative assessment of patients. On the other hand, if a researcher asks, “What is the
process by which parents learn to cope with the death of a child?” the researcher would
be hard pressed to quantify such a process. Personal worldviews of researchers help to
shape their questions.
In reading about the alternative paradigms, you likely were more attracted to one of
the two paradigms—the one that corresponds most closely to your view of the world. It
is important, however, to learn about and value both approaches to disciplined inquiry
and to recognize their respective strengths and limitations.
HOW-TO-TELL TIP How can you tell if a study is quantitative or
qualitative? As you progress through this book, you should be able to
identify most studies as quantitative versus qualitative based simply on the
study’s title or on terms in the summary at the beginning of an article. At
this point, though, it may be easiest to distinguish the two types of studies
based on how many numbers appear in the article, especially in tables.
Quantitative studies typically have several tables with numbers and
statistical information. Qualitative studies may have no tables with
quantitative information, or only one numeric table describing participants’
characteristics (e.g., the percentage who were male or female). Qualitative
studies often have “word tables” or diagrams and figures illustrating
processes inferred from the narrative information gathered.
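For readers who like to think algorithmically, the rule of thumb in this tip can be expressed as a small Python sketch. It is deliberately crude—a teaching device rather than a validated classifier—and the table counts are the only signals it considers:

```python
def likely_study_type(numeric_tables: int, word_tables: int) -> str:
    """Crude heuristic from the tip above: quantitative reports usually
    contain several tables of numbers and statistics, whereas qualitative
    reports have few or no numeric tables and often use "word tables"
    or thematic diagrams instead."""
    if numeric_tables >= 2:
        return "probably quantitative"
    if word_tables >= 1 and numeric_tables <= 1:
        return "probably qualitative"
    return "unclear -- skim the abstract and methods section"

# Example: an article with three statistical tables and no word tables
print(likely_study_type(numeric_tables=3, word_tables=0))  # probably quantitative
```

As the tip notes, the study's title and abstract remain the more reliable clues once you are familiar with the vocabulary of each tradition.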
THE PURPOSES OF NURSING RESEARCH
Why do nurses do research? Several systems have been devised to classify
research goals. We describe two such classification systems—not because it is
important for you to categorize a study as having one purpose or the other but rather
because this will help us to illustrate the broad range of questions that have intrigued
nurses and to further show differences between quantitative and qualitative inquiry.
TIP Sometimes a distinction is made between basic and applied research.
Basic research is appropriate for discovering general principles of human
behavior and biophysiologic processes. Applied research is designed to
examine how these principles can be used to solve problems in nursing
practice.
Research to Achieve Varying Levels of Explanation
One way to classify research purposes is by the extent to which studies are designed to
provide explanations. A fundamental distinction that is especially relevant in
quantitative research is between studies whose primary goal is to describe phenomena
and those that are cause-probing—that is, studies designed to illuminate the underlying
causes of phenomena.
Using a descriptive/explanatory framework, the specific purposes of nursing
research include identification, description, exploration, explanation, and
prediction/control. When researchers state their study purpose, they often use these
terms (e.g., The purpose of this study was to explore . . . ). For each purpose, various
types of questions are addressed—some more amenable to quantitative than to
qualitative inquiry, and vice versa.
Identification and Description
In quantitative research, researchers begin with a phenomenon that has been previously
studied or defined. Qualitative researchers, by contrast, sometimes study phenomena
about which little is known. In some cases, so little is known that the phenomenon has
yet to be clearly identified or named or has been inadequately defined. The in-depth,
probing nature of qualitative research is well suited to answering such questions as
“What is this phenomenon?” and “What is its name?” (Table 1.2).
Quantitative example of description
Palese and colleagues (2015) conducted a study to describe the average healing time
of stage II pressure ulcers. They found that it took approximately 23 days to achieve
complete reepithelialization.
Qualitative example of identification
Stapleton and Pattison (2015) studied the experience of men with advanced cancer in
relation to their perceptions of masculinity. Through in-depth interviews, the
researchers identified a new aspect of masculinity, which they called thwarted
ambition.
Description of phenomena is an important purpose of research. In descriptive
studies, researchers count, delineate, and classify. Nurse researchers have described a
wide variety of phenomena, such as patients’ stress, health beliefs, and so on.
Quantitative description focuses on the prevalence, size, and measurable aspects of
phenomena. Qualitative researchers describe the nature, dimensions, and salience of
phenomena, as shown in Table 1.2.
Exploration
Exploratory research begins with a phenomenon of interest; but rather than simply
describing it, exploratory researchers examine the nature of the phenomenon, the
manner in which it is manifested, and other factors to which it is related—including
factors that might be causing it. For example, a descriptive quantitative study of
patients’ preoperative stress might document how much stress patients experience. An
exploratory study might ask: What factors increase or lower a patient’s stress?
Qualitative methods can be used to explore the nature of little understood phenomena
and to shed light on the ways in which a phenomenon is expressed.
Qualitative example of exploration
Wazneh and colleagues (2016) used in-depth interviews to explore the extent to
which the contents of a special backpack called the “Venturing Out Pack” met the
practical, psychosocial, and information needs of young adults being treated for
cancer.
Explanation
Explanatory research seeks to understand the underlying causes or full nature of a
phenomenon. In quantitative research, theories or prior findings are used deductively to
generate hypothesized explanations that are tested statistically. Qualitative researchers
search for explanations about how or why a phenomenon exists or what a phenomenon
means as a basis for developing a theory that is grounded in rich, in-depth, experiential
evidence.
Quantitative example of explanation
Golfenshtein and Drach-Zahavy (2015) tested a theoretical model to explain the role
of patients’ attributions in nurses’ regulation of emotions in pediatric hospital wards.
Prediction and Control
Many phenomena defy explanation, yet it is often possible to predict or control them
based on research evidence. For example, research has shown that the incidence of
Down syndrome in infants increases with maternal age. We can predict that a woman
aged 40 years is at higher risk of bearing a child with Down syndrome than a woman
aged 25 years. We can attempt to influence the outcome by educating women about the
risks and offering amniocentesis to women older than 35 years of age. The ability to
predict and control in this example does not rely on an explanation of what causes older
women to be at a higher risk. In many quantitative studies, prediction and control are
key goals. Although explanatory studies are powerful, studies whose purpose is
prediction and control are also critical to EBP.
Quantitative example of prediction
Jain and colleagues (2016) conducted a study to assess whether scores on a measure
of neurological impairment at hospital arrival, among patients who had a transient
ischemic attack or a stroke, predicted their functional outcomes, such as ambulatory
status at hospital discharge.
Research Purposes Linked to Evidence-Based Practice
Another system for classifying studies has emerged in efforts to communicate EBP-
related purposes (e.g., DiCenso et al., 2005; Guyatt et al., 2008; Melnyk & Fineout-
Overholt, 2015). Table 1.3 identifies some of the questions relevant for each EBP
purpose and offers an actual nursing research example. In this classification scheme, the
various purposes can best be addressed with quantitative research, with the exception of
the last category (meaning/process), which requires qualitative research.
Therapy, Treatment, or Intervention
Studies with a therapy purpose seek to identify effective treatments for improving or
preventing health problems. Such studies range from evaluations of highly specific
treatments (e.g., comparing two types of cooling blankets for febrile patients) to
complex multicomponent interventions designed to effect behavioral changes (e.g.,
nurse-led smoking cessation interventions). Intervention research plays a critical role in
EBP.
Diagnosis and Assessment
Many nursing studies concern the rigorous development and testing of formal
instruments to screen, diagnose, and assess patients and to measure clinical outcomes.
High-quality instruments with documented accuracy are essential both for clinical
practice and for research.
Prognosis
Studies of prognosis examine the consequences of a disease or health problem, explore
factors that can modify the prognosis, and examine when (and for which types of
people) the consequences are most likely. Such studies facilitate the development of
long-term care plans for patients. They also provide valuable information for guiding
patients to make beneficial lifestyle choices or to be vigilant for key symptoms.
Etiology (Causation) and Harm
It is difficult to prevent harm or treat health problems if we do not know what causes
them. For example, there would be no smoking cessation programs if research had not
provided firm evidence that smoking cigarettes causes or contributes to many health
problems. Thus, determining the factors and exposures that affect or cause illness,
mortality, or morbidity is an important purpose of many studies.
Meaning and Processes
Many health care activities (e.g., motivating people to comply with treatments,
providing sensitive advice to patients, designing appealing interventions) can greatly
benefit from understanding the clients’ perspectives. Research that offers evidence
about what health and illness mean to clients, what barriers they face to positive health
practices, and what processes they experience in a transition through a health care crisis
is important to evidence-based nursing practice.
TIP Most EBP-related purposes (except diagnosis and meaning) involve
cause-probing research. For example, research on interventions focuses on
whether an intervention causes improvements in key outcomes. Prognosis
research examines whether a disease or health condition causes subsequent
adverse consequences. Etiology research seeks explanations about the
underlying causes of health problems.
ASSISTANCE FOR CONSUMERS OF NURSING
RESEARCH
We hope that this book will help you develop skills that will allow you to read,
appraise, and use nursing studies and to appreciate nursing research. In each chapter, we
present information relating to methods used by nurse researchers and provide guidance
in several ways. First, we offer tips on what you can expect to find in actual research
articles, each identified by a distinctive icon. There are also special "how-to-tell" tips
(also identified with an icon) that help with some potentially confusing issues in research articles.
Second, we include guidelines for critiquing various aspects of a study. The guiding
questions in Box 1.1 are designed to assist you in using the information in this chapter
in a preliminary assessment of a research article. And third, we offer opportunities to
apply your new skills. The critical thinking exercises at the end of each chapter guide
you through appraisals of real research examples of both quantitative and qualitative
studies. These activities also challenge you to think about how the findings from these
studies could be used in nursing practice. Answers to many of these questions are on
the companion website. Some of these examples are featured in our interactive Critical
Thinking Activity on the companion website. Some of the journal articles are found in the
appendices. The full journal articles for studies identified with ** in the reference list of
each chapter are available on the companion website.
Box 1.1 Questions for a Preliminary Overview of a Research Report
1. How relevant is the research problem to the actual practice of nursing?
2. Was the study quantitative or qualitative?
3. What was the underlying purpose (or purposes) of the study—identification,
description, exploration, explanation, or prediction/control? Does the purpose
correspond to an EBP focus such as therapy/treatment, diagnosis, prognosis,
etiology/harm, or meaning?
4. What might be some clinical implications of this research? To what type of people
and settings is the research most relevant? If the findings were accurate, how
might I use the results of this study in my clinical work?
This section presents examples of studies with different purposes. Read the
research summaries for Examples 1 and 2 and then answer the critical
thinking questions that follow, referring to the full research reports if
necessary. The critical thinking questions for Examples 3 and 4 are based on
the studies that appear in their entirety in Appendices A and B of this book.
TIP Examples 1 and 2 are also featured in our interactive Critical
Thinking Activity on the companion website, where you can record, print, and e-
mail your responses to your instructor. Our comments for the questions in
Examples 3 and 4 are in the Student Resources section on the companion website.
EXAMPLE 1: QUANTITATIVE RESEARCH
Study: Psychological outcomes after a sexual assault video intervention: A
randomized trial (Miller et al., 2015)
Study Purpose: The purpose of the study was to test whether a brief video-
based intervention had positive effects on the mental health of victims of a
sexual assault. The intervention provided psychoeducation and information
about coping strategies to survivors at the time of a sexual assault nurse
examination.
Study Methods: Female sexual assault victims who received forensic
examinations within 72 hours of their victimization were assigned to one of
two groups: (1) those receiving standard care plus the video intervention and
(2) those receiving care as usual, without the video. A total of 164 women
participated in the study. They completed mental health assessments 2 weeks
and 2 months after the forensic examination.
Key Findings: The researchers found that women in both groups had lower
anxiety at the follow-up assessments. However, women in the special
intervention group had significantly lower levels of anxiety symptoms than
those in the usual care group at both follow-ups.
Conclusions: Miller and colleagues (2015) concluded that forensic nurses
have an opportunity to intervene immediately after a sexual assault with an
effective and inexpensive intervention.
Critical Thinking Exercises
1. Answer the relevant questions from Box 1.1 regarding this study.
2. Also consider the following targeted questions, which may assist you in
assessing aspects of the study’s merit:
a. Why do you think levels of anxiety improved over time in both the
intervention and standard care groups?
b. Could this study have been undertaken as a qualitative study? Why or
why not?
EXAMPLE 2: QUALITATIVE RESEARCH
Study: The pain experience of patients hospitalized with inflammatory bowel
disease: A phenomenological study (Bernhofer et al., 2015)
Study Purpose: The purpose of this study was to understand the unique
experience of pain in hospitalized patients with an admitting diagnosis of
inflammatory bowel disease (IBD).
Study Methods: Sixteen men and women with diverse backgrounds (e.g.,
age, length of IBD diagnosis) were recruited from two colorectal units of a
large academic medical center. Patients participated in interviews that lasted
about a half hour. The interviews, which were audiotaped and then
transcribed, focused on what the patients’ pain experiences were like in the
hospital.
Key Findings: Five recurring themes emerged in the analysis of the
interview data: (1) feeling discredited and misunderstood, (2) a desire to
dispel the stigma, (3) frustration with constant pain, (4) a need for caregiver
knowledge and understanding, and (5) nurses as the connector between the
patient and physicians. Here is an excerpt from an interview that illustrates
the second theme on stigma: “I’ve been judged on numerous amounts of
occasions in regards to them thinking that I’m just simply seeking out some
kind of pain medication when in reality, I’m seeking out to feel better, to
make the pain go away” (p. 5).
Conclusions: The researchers concluded that nurses caring for hospitalized
patients with IBD could provide better pain management if they understand
the issues highlighted in these themes.
Critical Thinking Exercises
1. Answer the relevant questions from Box 1.1 regarding this study.
2. Also consider the following targeted questions, which may assist you in
assessing aspects of the study’s merit:
a. Why do you think that the researchers audiotaped and transcribed their
in-depth interviews with study participants?
b. Do you think it would have been appropriate for the researchers to
conduct this study using quantitative research methods? Why or why
not?
EXAMPLE 3: QUANTITATIVE RESEARCH IN APPENDIX A
• Read the abstract and the introduction of Swenson and colleagues’ (2016)
study (“Parents’ use of praise and criticism in a sample of young children
seeking mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 1.1 regarding this study.
2. Also consider the following targeted questions:
a. Could this study have been undertaken as a qualitative study? Why or
why not?
b. Who provided some financial support for this research? (This
information appears on the first page of the report.)
EXAMPLE 4: QUALITATIVE RESEARCH IN APPENDIX B
• Read the abstract and the introduction of Beck and Watson’s (2010) study
(“Subsequent childbirth after a previous traumatic birth”) in Appendix B
of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 1.1 regarding this study.
2. Also consider the following targeted questions:
a. What gap in the existing research was the study designed to fill?
b. Was Beck and Watson’s study conducted within the positivist
paradigm or the constructivist paradigm? Provide a rationale for your
choice.
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the companion website.
• Interactive Critical Thinking Activity
• Chapter Supplement on The History of Nursing Research
• Answers to the Critical Thinking Exercises for Examples 3 and 4
• Internet Resources with useful websites for Chapter 1
• A Wolters Kluwer journal article in its entirety—the study described as
Example 1 on pp. 15-16.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
• Nursing research is systematic inquiry undertaken to develop evidence on problems of importance to nurses.
• Nurses in various settings are adopting an evidence-based practice (EBP) that incorporates research findings into their decisions and interactions with clients.
• Knowledge of nursing research enhances the professional practice of all nurses—including both consumers of research (who read and evaluate studies) and producers of research (who design and undertake studies).
• Nursing research began with Florence Nightingale but developed slowly until its rapid acceleration in the 1950s. Since the 1980s, the focus has been on clinical nursing research—that is, on problems relating to clinical practice.
• The National Institute of Nursing Research (NINR), established at the U.S. National Institutes of Health in 1993, affirms the stature of nursing research in the United States.
• Future emphases of nursing research are likely to include EBP projects, replications of research, research integration through systematic reviews, expanded dissemination efforts, increased focus on health disparities, and a focus on the clinical significance of research results.
• Disciplined research stands in contrast to other knowledge sources for nursing practice, such as tradition, authority, personal experience, and trial and error.
• Disciplined inquiry in nursing is conducted mainly within two broad paradigms—worldviews with underlying assumptions about reality: the positivist paradigm and the constructivist paradigm.
• In the positivist paradigm, it is assumed that there is an objective reality and that natural phenomena are regular and orderly. The related assumption of determinism refers to the belief that phenomena result from prior causes and are not haphazard.
• In the constructivist paradigm, it is assumed that reality is not a fixed entity but is rather a construction of human minds—and thus, "truth" is a composite of multiple constructions of reality.
• Quantitative research (associated with positivism) involves the collection and analysis of numeric information. Quantitative research is typically conducted within the traditional scientific method, which is systematic and controlled. Quantitative researchers base their findings on empirical evidence (evidence collected by way of the human senses) and strive for generalizability beyond a single setting or situation.
• Constructivist researchers emphasize understanding human experience as it is lived through the collection and analysis of subjective, narrative materials using flexible procedures; this paradigm is associated with qualitative research.
• A fundamental distinction that is especially relevant in quantitative research is between studies whose primary intent is to describe phenomena and those that are cause-probing—i.e., designed to illuminate underlying causes of phenomena. Specific purposes on the description/explanation continuum include identification, description, exploration, explanation, and prediction/control.
• Many nursing studies can also be classified in terms of an EBP-related aim: therapy/treatment/intervention, diagnosis and assessment, prognosis, etiology and harm, and meaning and process.
REFERENCES FOR CHAPTER 1
Bernhofer, E., Masina, V., Sorrell, J., & Modic, M. (2015). The pain experience of patients hospitalized with
inflammatory bowel disease: A phenomenological study. Gastroenterology Nursing. Advance online
publication.
*Campbell-Yeo, M., Johnston, C., Benoit, B., Latimer, M., Vincer, M., Walker, C., . . . Caddell, K. (2013). Trial of
repeated analgesia with kangaroo mother care (TRAKC trial). BMC Pediatrics, 13, 182.
Chiaranai, C. (2016). The lived experience of patients receiving hemodialysis treatment for end-stage renal disease:
A qualitative study. The Journal of Nursing Research, 24, 101–108.
*Cong, X., Ludington-Hoe, S., McCain, G., & Fu, P. (2009). Kangaroo care modifies preterm infant heart rate
variability in response to heel stick pain: Pilot study. Early Human Development, 85, 561–567.
Cong, X., Ludington-Hoe, S., & Walsh, S. (2011). Randomized crossover trial of kangaroo care to reduce
biobehavioral pain responses in preterm infants: A pilot study. Biological Research for Nursing, 13, 204–216.
DiCenso, A., Guyatt, G., & Ciliska, D. (2005). Evidence-based nursing: A guide to clinical practice. St. Louis,
MO: Elsevier Mosby.
*Gilbert, R., Salanti, G., Harden, M., & See, S. (2005). Infant sleeping position and the sudden infant death
syndrome: Systematic review of observational studies and historical review of recommendations from 1940 to
2002. International Journal of Epidemiology, 34, 874–887.
Golfenshtein, N., & Drach-Zahavy, A. (2015). An attribution theory perspective on emotional labour in nurse-
patient encounters: A nested cross-sectional study in paediatric settings. Journal of Advanced Nursing, 71,
1123–1134.
Guyatt, G., Rennie, D., Meade, M., & Cook, D. (2008). Users’ guides to the medical literature: Essentials of
evidence-based clinical practice (2nd ed.). New York, NY: McGraw Hill.
Hagerty, T., Kertesz, L., Schmidt, J., Agarwal, S., Claassen, J., Mayer, S., . . . Shang, J. (2015). Risk factors for
catheter-associated urinary tract infections in critically ill patients with subarachnoid hemorrhage. Journal of
Neuroscience Nursing, 47, 51–54.
Hanrahan, K., Wagner, M., Matthews, G., Stewart, S., Dawson, C., Greiner, J., . . . Williamson, A. (2015). Sacred
cow gone to pasture: A systematic evaluation and integration of evidence-based practice. Worldviews on
Evidence-Based Nursing, 12, 3–11.
*Holditch-Davis, D., White-Traut, R., Levy, J., O’Shea, T., Geraldo, V., & David, R. (2014). Maternally
administered interventions for preterm infants in the NICU: Effects on maternal psychological distress and
mother-infant relationship. Infant Behavior & Development, 37, 695–710.
Holman, E., Perisho, J., Edwards, A., & Mlakar, N. (2010). The myths of coping with loss in undergraduate
psychiatric nursing books. Research in Nursing & Health, 33, 486–499.
Jain, A., van Houten, D., & Sheikh, L. (2016). Retrospective study on National Institutes of Health Stroke Scale as
a predictor of patient recovery after stroke. Journal of Cardiovascular Nursing, 31, 69–72.
Kwon, J. H., Shin, Y., & Juon, H. (2016). Effects of Nei-Guan (P6) acupressure wristband: On nausea, vomiting,
and retching in women after thyroidectomy. Cancer Nursing, 39, 61–66.
*Lowson, K., Offer, C., Watson, J., McGuire, B., & Renfrew, M. (2015). The economic benefits of increasing
kangaroo skin-to-skin care and breastfeeding in neonatal units: Analysis of a pragmatic intervention in clinical
practice. International Breastfeeding Journal, 10, 11.
Ludington-Hoe, S. M. (2011). Thirty years of Kangaroo Care science and practice. Neonatal Network, 30, 357–
362.
Melnyk, B. M., & Fineout-Overholt, E. (2015). Evidence-based practice in nursing & healthcare: A guide to best
practice (3rd ed.). Philadelphia, PA: Lippincott Williams & Wilkins.
**Miller, K., Cranston, C., Davis, J., Newman, E., & Resnick, H. (2015). Psychological outcomes after a sexual
assault video intervention: A randomized trial. Journal of Forensic Nursing, 11, 129–136.
*Moore, E., Anderson, G., Bergman, N., & Dowswell, T. (2012). Early skin-to-skin contact for mothers and their
healthy newborn infants. Cochrane Database of Systematic Reviews, (5), CD003519.
Oakley-Girvan, I., Londono, C., Canchola, A., & Watkins Davis, S. (2016). Text messaging may improve
abnormal mammogram follow-up in Latinas. Oncology Nursing Forum, 43, 36–43.
Palese, A., Luisa, S., Ilenia, P., Laquintana, D., Stinco, G., & Di Giulio, P. (2015). What is the healing time of
stage II pressure ulcers? Findings from a secondary analysis. Advances in Skin & Wound Care, 28, 69–75.
Pieters, H. C. (2016). “I’m still here”: Resilience among older survivors of breast cancer. Cancer Nursing, 39,
E20–E28.
Sitzer, V. (2016). Development of an automated self-assessment of Fall Risk Questionnaire for hospitalized
patients. Journal of Nursing Care Quality, 31, 46–53.
Spock, B. (1979). Baby and child care. New York, NY: Dutton.
Stapleton, S., & Pattison, N. (2015). The lived experience of men with advanced cancer in relation to their
perceptions of masculinity: A qualitative phenomenological study. Journal of Clinical Nursing, 24, 1069–1078.
Storey, S., & Von Ah, D. (2015). Prevalence and impact of hyperglycemia on hospitalized leukemia patients.
European Journal of Oncology Nursing, 19, 13–17.
Wazneh, L., Tsimicalis, A., & Loiselle, C. (2016). Young adults’ perceptions of the Venturing Out Pack program
as a tangible cancer support service. Oncology Nursing Forum, 43, E34–E42.
*A link to this open-access article is provided in the Internet Resources section on the companion website.
**This journal article is available on the companion website for this chapter.
2 Fundamentals of Evidence-Based
Nursing Practice
Learning Objectives
On completing this chapter, you will be able to:
Distinguish research utilization and evidence-based practice (EBP) and discuss their
current status within nursing
Identify several resources available to facilitate EBP in nursing practice
List several models for implementing EBP
Discuss the five major steps in undertaking an EBP effort for individual nurses
Identify the components of a well-worded clinical question and be able to frame such a
question
Discuss broad strategies for undertaking an EBP project at the organizational level
Distinguish EBP and quality improvement (QI) efforts
Define new terms in the chapter
Key Terms
Clinical practice guideline
Cochrane Collaboration
Evidence hierarchy
Evidence-based practice
Implementation potential
Meta-analysis
Metasynthesis
Pilot test
Quality improvement (QI)
Research utilization (RU)
Systematic review
Learning about research methods provides a foundation for evidence-based practice
(EBP) in nursing. This book will help you to develop methodologic skills for reading
research articles and evaluating research evidence. Before we elaborate on
methodologic techniques, we discuss key aspects of EBP to further help you understand
the key role that research now plays in nursing.
BACKGROUND OF EVIDENCE-BASED NURSING
PRACTICE
This section provides a context for understanding evidence-based nursing practice and
two closely related concepts: research utilization and knowledge translation.
Definition of Evidence-Based Practice
EBP pioneer David Sackett and his colleagues (2000) defined evidence-based practice as "the
integration of best research evidence with clinical expertise and patient values” (p. 1).
The definition proposed by Sigma Theta Tau International (2008) is as follows: “The
process of shared decision-making between practitioner, patient, and others significant
to them based on research evidence, the patient’s experiences and preferences, clinical
expertise or know-how, and other available robust sources of information” (p. 57). A
key ingredient in EBP is the effort to personalize “best evidence” to a specific patient’s
needs within a particular clinical context.
A basic feature of EBP as a clinical problem-solving strategy is that it de-
emphasizes decisions based on custom, authority, or ritual. A core aspect of EBP is
identifying the best available research evidence and integrating it with other factors in
making clinical decisions. Advocates of EBP do not minimize the importance of clinical
expertise. Rather, they argue that evidence-based decision making should integrate best
research evidence with clinical expertise, patient preferences, and local circumstances.
Because research evidence can provide valuable insights about human health and
illness, nurses must be lifelong learners who have the skills to search for, understand,
and evaluate new information about patient care and the capacity to adapt to change.
Research Utilization
Research utilization (RU) is the use of findings from studies in a practical application
that is unrelated to the original research. In RU, the emphasis is on translating new
knowledge into real-world applications. EBP is a broader concept than RU because it
integrates research findings with other factors, as just noted. Also, whereas RU begins
with the research itself (e.g., How can I put this new knowledge to good use in my
clinical setting?), the starting point in EBP is usually a clinical question (e.g., What does
the evidence say is the best approach to solving this clinical problem?).
During the 1980s, RU emerged as an important topic. In education, nursing schools
began to include courses on research methods so that students would become skillful
research consumers. In research, there was a shift in focus toward clinical nursing
problems. Yet, concerns about the limited use of research evidence in the delivery of
nursing care continued to mount.
The need to reduce the gap between research and practice led to formal RU projects,
including the groundbreaking Conduct and Utilization of Research in Nursing (CURN)
Project, a 5-year project undertaken by the Michigan Nurses Association in the 1970s.
CURN’s objectives were to increase the use of research findings in nurses’ daily
practice by disseminating current findings and facilitating organizational changes
needed to implement innovations (Horsley et al., 1978). The CURN Project team
concluded that RU by practicing nurses was feasible but only if the research is relevant
to practice and if the results are broadly disseminated.
During the 1980s and 1990s, RU projects were undertaken by numerous hospitals
and organizations. During the 1990s, however, the call for RU began to be superseded
by the push for EBP.
The Evidence-Based Practice Movement
One keystone of the EBP movement is the Cochrane Collaboration, which was founded
in the United Kingdom based on the work of British epidemiologist Archie Cochrane.
Cochrane published a book in the 1970s that drew attention to the shortage of solid
evidence about the effects of health care. He called for efforts to make research
summaries about interventions available to health care providers. This led to the
development of the Cochrane Center in Oxford in 1993 and the international Cochrane
Collaboration, with centers now established in locations throughout the world. Its aim
is to help providers make good decisions by preparing and disseminating systematic
reviews of the effects of health care interventions.
At about the same time that the Cochrane Collaboration was started, a group from
McMaster Medical School in Canada developed a learning strategy they called
evidence-based medicine. The evidence-based medicine movement, pioneered by Dr.
David Sackett, has broadened to the use of best evidence by all health care practitioners.
EBP has been considered a major paradigm shift in health care education and practice.
With EBP, skillful clinicians can no longer rely on a repository of memorized
information but rather must be adept in accessing, evaluating, and using new research
evidence.
The EBP movement has advocates and critics. Supporters argue that EBP is a
rational approach to providing the best possible care with the most cost-effective use of
resources. Advocates also note that EBP provides a framework for self-directed lifelong
learning that is essential in an era of rapid clinical advances and the information
explosion. Critics worry that the advantages of EBP are exaggerated and that individual
clinical judgments and patient inputs are being devalued. They are also concerned that
insufficient attention is being paid to the role of qualitative research. Although there is a
need for close scrutiny of how the EBP journey unfolds, an EBP path is the one that
health care professions will almost surely follow in the years ahead.
TIP A debate has emerged concerning whether the term evidence-based
practice should be replaced with evidence-informed practice (EIP). Those
who advocate for a different term have argued that the word “based” suggests
a stance in which patient values and preferences are not sufficiently
considered in EBP clinical decisions (e.g., Glasziou, 2005). Yet, as noted by
Melnyk (2014), all current models of EBP incorporate clinicians’ expertise
and patients’ preferences. She argued that “changing terms now . . . will only
create confusion at a critical time where progress is being made in
accelerating EBP” (p. 348). We concur and we use EBP throughout this
book.
Knowledge Translation
RU and EBP involve activities that can be undertaken at the level of individual nurses
or at a higher organizational level (e.g., by nurse administrators), as we describe later in
this chapter. A related movement emerged that mainly concerns system-level efforts to
bridge the gap between knowledge generation and use. Knowledge translation (KT) is a
term that is often associated with efforts to enhance systematic change in clinical
practice. The World Health Organization (WHO) (2005) has defined KT as “the
synthesis, exchange, and application of knowledge by relevant stakeholders to
accelerate the benefits of global and local innovation in strengthening health systems
and improving people’s health.”
TIP Translation science (or implementation science) is a new discipline
devoted to promoting KT. In nursing, the need for translational research was
an important stimulus for the development of the Doctor of Nursing Practice
degree. Several journals have emerged that are devoted to this field (e.g., the
journal Implementation Science).
EVIDENCE-BASED PRACTICE IN NURSING
Before describing procedures relating to EBP in nursing, we briefly discuss some
important issues, including the nature of “evidence,” challenges to pursuing EBP, and
resources available to address some of those challenges.
Types of Evidence and Evidence Hierarchies
There is no consensus about what constitutes usable evidence for EBP, but there is
general agreement that findings from rigorous research are paramount. Yet, there is
some debate about what constitutes “rigorous” research and what qualifies as “best”
evidence.
Early in the EBP movement, there was a strong bias favoring evidence from a type
of study called a randomized controlled trial (RCT). This bias reflected the Cochrane
Collaboration’s initial focus on evidence about the effectiveness of therapies rather than
about broader health care questions. RCTs are especially well-suited for drawing
conclusions about the effects of health care interventions (see Chapter 9). The bias in
ranking research approaches in terms of questions about effective therapies led to some
resistance to EBP by nurses who felt that evidence from qualitative and non-RCT
studies would be ignored.
Positions about the contribution of various types of evidence are less rigid than
they once were. Nevertheless, many published evidence hierarchies rank evidence sources
according to the strength of the evidence they provide, and in most cases, RCTs are near
the top of these hierarchies. We offer a modified evidence hierarchy that looks similar
to others but is unique in illustrating that the ranking of evidence-producing strategies
depends on the type of question being asked.
Figure 2.1 shows that systematic reviews are at the pinnacle of the hierarchy (Level
I) because the strongest evidence comes from careful syntheses of multiple studies. The
next highest level (Level II) depends on the nature of inquiry. For Therapy questions
regarding the efficacy of a therapy or intervention (What works best for improving
health outcomes?), individual RCTs constitute Level II evidence (systematic reviews of
multiple RCTs are Level I). Going down the “rungs” of the evidence hierarchy for
Therapy questions results in less reliable evidence. For example, Level III evidence
comes from a type of study called quasi-experimental. In-depth qualitative studies are
near the bottom, in terms of evidence regarding intervention effectiveness. (Terms in
Fig. 2.1 will be discussed in later chapters.)
For a Prognosis question, by contrast, Level II evidence comes from a single
prospective cohort study, and Level III evidence is from a type of study called case-
control (Level I evidence is from a systematic review of cohort studies). Thus, contrary
to what is often implied in discussions of evidence hierarchies, there really are multiple
hierarchies. If one is interested in best evidence for questions about meaning, an RCT
would be a poor source of evidence, for example. Figure 2.1 illustrates these multiple
hierarchies, with information on the right indicating the type of individual study that
would offer the best evidence (Level II) for different questions. In all cases, appropriate
systematic reviews are at the pinnacle.
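The idea that the "best" individual study design depends on the question type can be summarized in a small lookup, sketched below in Python. This is a simplified paraphrase of the discussion of Figure 2.1, not a reproduction of it; the etiology/harm entry reflects conventional evidence hierarchies rather than anything stated in this chapter.

```python
# Simplified sketch of question-specific evidence hierarchies.
# Level I is always a systematic review of the designs listed here;
# the designs below give the strongest single-study (Level II) evidence.
BEST_LEVEL_II_DESIGN = {
    "therapy/intervention": "individual randomized controlled trial (RCT)",
    "prognosis": "individual prospective cohort study",
    "etiology/harm": "individual prospective cohort study",
    "meaning/process": "in-depth qualitative study",
}

def best_single_study(question_type: str) -> str:
    """Return the design offering the strongest single-study evidence."""
    if question_type not in BEST_LEVEL_II_DESIGN:
        raise ValueError(f"Unrecognized question type: {question_type!r}")
    return BEST_LEVEL_II_DESIGN[question_type]

# An RCT is strong evidence for Therapy questions but a poor source
# for Meaning questions, where qualitative studies rank highest.
print(best_single_study("therapy/intervention"))
print(best_single_study("meaning/process"))
```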
Of course, within any level in an evidence hierarchy, evidence quality can vary
considerably. For example, an individual RCT could be well designed, yielding strong
Level II evidence for Therapy questions, or it could be so flawed that the evidence
would be weak.
Thus, in nursing, best evidence refers to research findings that are methodologically
appropriate, rigorous, and clinically relevant for answering pressing questions. These
questions cover not only the efficacy, safety, and cost effectiveness of nursing
interventions but also the reliability of nursing assessment tests, the causes and
consequences of health problems, and the meaning and nature of patients’ experiences.
Confidence in the evidence is enhanced when the research methods are compelling,
when there have been multiple confirmatory studies, and when the evidence has been
carefully evaluated and synthesized.
Evidence-Based Practice Challenges
Studies that have explored barriers to evidence-based nursing have yielded similar
results in many countries. Most barriers fall into one of three categories: (1) quality and
nature of the research, (2) characteristics of nurses, and (3) organizational factors.
With regard to the research itself, one problem is the limited availability of strong
research evidence for some practice areas. The need for research that directly addresses
pressing clinical problems and for replicating studies in a range of settings remains a
challenge. Also, nurse researchers need to improve their ability to communicate
evidence to practicing nurses. In non-English-speaking countries, another impediment is
that most studies are reported in English.
Nurses’ attitudes and education are also potential barriers to EBP. Studies have
found that some nurses do not value or understand research, and others simply resist
change. And, among the nurses who do appreciate research, many do not have the skills
for accessing research evidence or for evaluating it for possible use in clinical decision
making.
Finally, many challenges to using research in practice are organizational. “Unit
culture” can undermine research use, and administrative or organizational barriers also
play a major role. Although many organizations support the idea of EBP in theory, they
do not always provide the necessary supports in terms of staff release time and
provision of resources. Strong leadership in health care organizations is essential to
making EBP happen.
RESOURCES FOR EVIDENCE-BASED PRACTICE
In this section, we describe some of the resources that are available to support evidence-
based nursing practice and to address some of the challenges.
Pre-Appraised Evidence
Research evidence comes in various forms, the most basic of which is from individual
studies. Primary studies published in journals are not pre-appraised for quality and use
in practice.
Preprocessed (pre-appraised) evidence is evidence that has been selected from
primary studies and evaluated for use by clinicians. DiCenso and colleagues (2005)
have described a hierarchy of preprocessed evidence. On the first rung above primary
studies are synopses of single studies, followed by systematic reviews, and then
synopses of systematic reviews. Clinical practice guidelines are at the top of the
hierarchy. At each successive step in the hierarchy, there is greater ease in applying the
evidence to clinical practice. We describe several types of pre-appraised evidence
sources in this section.
Systematic Reviews
EBP relies on meticulous integration of all key evidence on a topic so that well-
grounded conclusions can be drawn about EBP questions. A systematic review is not
just a literature review. A systematic review is in itself a methodical, scholarly inquiry
that follows many of the same steps as those for other studies.
Systematic reviews can take various forms. One form is a narrative (qualitative)
integration that merges and synthesizes findings, much like a rigorous literature review.
For integrating evidence from quantitative studies, narrative reviews increasingly are
being replaced by a type of systematic review known as a meta-analysis.
Meta-analysis is a technique for integrating quantitative research findings
statistically. In essence, meta-analysis treats the findings from a study as one piece of
information. The findings from multiple studies on the same topic are combined and
then all of the information is analyzed statistically in a manner similar to that in a usual
study. Thus, instead of study participants being the unit of analysis (the most basic
entity on which the analysis focuses), individual studies are the unit of analysis in a
meta-analysis. Meta-analysis provides an objective method of integrating a body of
findings and of observing patterns that might not have been detected.
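To make the pooling arithmetic concrete, the following is a minimal sketch (not drawn from any study cited here) of the inverse-variance, fixed-effect pooling that underlies many meta-analyses. The effect sizes and standard errors are hypothetical, and real meta-analyses involve further steps, such as heterogeneity assessment:

```python
import math

def fixed_effect_pool(effects, std_errors):
    """Pool study-level effect estimates using inverse-variance weights.

    effects: effect estimates (e.g., log odds ratios), one per study
    std_errors: standard error of each estimate
    Returns the pooled estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errors]  # more precise studies get more weight
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical log odds ratios from four studies of the same intervention
pooled, se = fixed_effect_pool([-0.45, -0.30, -0.60, -0.25],
                               [0.20, 0.15, 0.30, 0.25])
print(f"Pooled log OR: {pooled:.2f} (SE = {se:.2f})")
```

Note that each study, not each participant, contributes one weighted data point, which is what it means for the study to be the unit of analysis.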
Example of a meta-analysis
Shah and colleagues (2016) conducted a meta-analysis of the evidence on the effect
of bathing intensive care unit (ICU) patients with 2% chlorhexidine gluconate (CHG)
on central line–associated bloodstream infection (CLABSI). Integrating results from
four intervention studies, the researchers concluded that 2% CHG is effective in
reducing infections. They noted that “nursing provides significant influence for the
prevention of CLABSIs in critical care via evidence-based best practices” (p. 42).
For qualitative studies, integration may take the form of a metasynthesis. A
metasynthesis, however, is distinct from a quantitative meta-analysis: A metasynthesis
is less about reducing information and more about interpreting it.
Example of a metasynthesis
Magid and colleagues (2016) undertook a metasynthesis of studies exploring the
perceptions of key elements of caregiving among patients using a left ventricular
assist device. Their metasynthesis of eight qualitative studies resulted in the
identification of eight important themes.
Systematic reviews are increasingly available. Such reviews are published in
professional journals that can be accessed using standard literature search procedures
(see Chapter 7) and are also available in databases that are dedicated to such reviews. In
particular, the Cochrane Database of Systematic Reviews (CDSR) contains thousands of
systematic reviews relating to health care interventions.
TIP Websites with useful content relating to EBP, including ones for
locating systematic reviews, are in the Internet Resources for Chapter 2 on the book's
website, for you to access simply by using the “Control/Click” feature.
Clinical Practice Guidelines and Care Bundles
Evidence-based clinical practice guidelines distill a body of evidence into a usable
form. Unlike systematic reviews, clinical practice guidelines (which often are based on
systematic reviews) give specific recommendations for evidence-based decision
making. Guideline development typically involves the consensus of a group of
researchers, experts, and clinicians. The implementation or adaptation of a clinical
practice guideline is often an ideal focus for an organizational EBP project.
Also, organizations are developing and adopting care bundles—a concept
developed by the Institute for Healthcare Improvement—that encompass a set of
interventions to treat or prevent a specific cluster of symptoms (www.ihi.org). There is
growing evidence that a combination or bundle of strategies produces better outcomes
than a single intervention.
Example of a care bundle project
Tayyib et al. (2015) studied the effectiveness of a pressure ulcer prevention care
bundle in reducing the incidence of pressure ulcers in critically ill patients. Patients
who received the bundled interventions had a significantly lower incidence of
pressure ulcers than patients who did not.
Finding care bundles and clinical practice guidelines can be challenging because
there is no single guideline repository. A standard search in a bibliographic database
such as MEDLINE (see Chapter 7) will yield many references; however, the results are
likely to include not only the actual guidelines but also commentaries, implementation
studies, and so on.
A recommended approach is to search in guideline databases or through specialty
organizations that have sponsored guideline development. A few of the many possible
sources deserve mention. In the United States, nursing and health care guidelines are
maintained by the National Guideline Clearinghouse (www.guideline.gov). In Canada,
the Registered Nurses’ Association of Ontario (RNAO) (www.rnao.org/bestpractices)
maintains information about clinical practice guidelines. Two sources in the United
Kingdom are the Turning Research Into Practice (TRIP) database and the National
Institute for Health and Care Excellence (NICE).
There are many topics for which practice guidelines have not yet been developed,
but the opposite problem is also true: Sometimes there are multiple guidelines on the
same topic. Worse yet, because of differences in the rigor of guideline development and
interpretation of evidence, different guidelines sometimes offer different or even
conflicting recommendations (Lewis, 2001). Thus, those who wish to adopt clinical
practice guidelines should appraise them to identify ones that are based on the strongest
evidence, have been meticulously developed, are user-friendly, and are appropriate for
local use or adaptation.
Several appraisal instruments are available to evaluate clinical practice guidelines.
One with broad support is the Appraisal of Guidelines for Research and Evaluation
(AGREE) Instrument, now in its second version (Brouwers et al., 2010). The AGREE II
instrument has ratings for 23 dimensions within six domains (e.g., scope and purpose,
rigor of development, presentation). As examples, a dimension in the scope and purpose
domain is “The population (patients, public, etc.) to whom the guideline is meant to
apply is specifically described,” and one in the rigor of development domain is “The
guideline has been externally reviewed by experts prior to its publication.” The AGREE
tool should be applied to a guideline by a team of two to four appraisers.
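For readers curious about the scoring, the AGREE II manual describes scaling each domain's summed item ratings (each item rated on a 1-to-7 scale) as a percentage of the possible range. The sketch below illustrates that calculation with hypothetical ratings; consult the published manual for authoritative details:

```python
def agree_domain_score(ratings):
    """Scale one domain's summed AGREE II item ratings to a percentage.

    ratings: one list of 1-7 item ratings per appraiser, e.g., ratings[0]
    holds appraiser 1's ratings for each item in the domain.
    """
    n_appraisers, n_items = len(ratings), len(ratings[0])
    obtained = sum(sum(appraiser) for appraiser in ratings)
    min_possible = 1 * n_items * n_appraisers
    max_possible = 7 * n_items * n_appraisers
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)

# Hypothetical ratings: three appraisers scoring a three-item domain
ratings = [[5, 6, 6], [6, 6, 7], [4, 5, 6]]
print(f"Domain score: {agree_domain_score(ratings):.0f}%")  # about 78%
```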
Example of using AGREE II
Homer and colleagues (2014) evaluated English-language guidelines on the screening
and management of group B Streptococcus (GBS) colonization in pregnant women
and the prevention of early-onset GBS disease in newborns. Four guidelines were
appraised using the AGREE II instrument.
TIP For those interested in learning more about the AGREE II instrument,
we offer more information in the chapter supplement on the book's website.
Models of the Evidence-Based Practice Process
EBP models offer frameworks for designing and implementing EBP projects in practice
settings. Some models focus on the use of research by individual clinicians (e.g., the
Stetler Model, one of the oldest models that originated as an RU model), but most focus
on institutional EBP efforts (e.g., the Iowa Model). The many worthy EBP models are
too numerous to list comprehensively but include the following:
• Advancing Research and Clinical Practice Through Close Collaboration (ARCC) Model (Melnyk & Fineout-Overholt, 2015)
• Diffusion of Innovations Model (Rogers, 1995)
• Iowa Model of Evidence-Based Practice to Promote Quality Care (Titler, 2010)
• Johns Hopkins Nursing Evidence-Based Practice Model (Dearholt & Dang, 2012)
• Promoting Action on Research Implementation in Health Services (PARiHS) Model (Rycroft-Malone, 2010; Rycroft-Malone et al., 2013)
• Stetler Model of Research Utilization (Stetler, 2010)
For those wishing to follow a formal EBP model, the cited references should be
consulted. Several are also nicely synthesized by Melnyk and Fineout-Overholt (2015).
Each model offers different perspectives on how to translate research findings into
practice, but several steps and procedures are similar across the models. We provide an
overview of key activities and processes in EBP efforts, based on a distillation of
common elements from the various models, in a subsequent section of this chapter. We
rely heavily on the Iowa Model, shown in Figure 2.2.
TIP Gawlinski and Rutledge (2008) offer suggestions for selecting an EBP
model.
EVIDENCE-BASED PRACTICE IN INDIVIDUAL
NURSING PRACTICE
This and the following section provide an overview of how research can be put to use in
clinical settings. We first discuss strategies and steps for individual clinicians and then
describe activities used by organizations or teams of nurses.
Clinical Scenarios and the Need for Evidence
Individual nurses make many decisions and are called upon to provide health care
advice, and so they have ample opportunity to put research into practice. Here are four
clinical scenarios that provide examples of such opportunities:
Clinical Scenario 1. You work on an ICU and notice that Clostridium difficile infection
has become more prevalent among surgical patients in your hospital. You want to
know whether there is a reliable screening tool for assessing the risk of infection so
that preventive measures can be initiated in a more timely and effective manner.
Clinical Scenario 2. You work in an allergy clinic and notice how difficult it is for
many children to undergo allergy scratch tests. You wonder whether an interactive
distraction intervention would help reduce children’s pain when they are being tested
for allergens.
Clinical Scenario 3. You work in a rehabilitation hospital, and one of your elderly
patients, who had total hip replacement, tells you she is planning a long airplane trip.
You know that a long plane ride will increase her risk of deep vein thrombosis and
wonder whether compression stockings are an effective in-flight treatment. You
decide to look for the best possible evidence to answer this question.
Clinical Scenario 4. You are caring for a hospitalized cardiac patient who tells you that
he has sleep apnea. He confides in you that he is reluctant to undergo continuous
positive airway pressure (CPAP) treatment because he worries it will hinder intimacy
with his wife. You wonder if there is any evidence about what it is like to undergo
CPAP treatment so that you can better understand how to address your patient’s
concerns.
In these and thousands of other clinical situations, research evidence can be put to
good use to improve nursing care. Some situations might lead to unit-wide or
institution-wide scrutiny of current practices, but in other situations, individual nurses
can personally examine evidence to help address specific problems.
For individual EBP efforts, the major steps in EBP include the following:
1. Asking clinical questions that can be answered with research evidence
2. Searching for and retrieving relevant evidence
3. Appraising and synthesizing the evidence
4. Integrating the evidence with your own clinical expertise, patient preferences, and
local context
5. Assessing the effectiveness of the decision, intervention, or advice
Asking Well-Worded Clinical Questions: PIO and PICO
A crucial first step in EBP involves asking relevant clinical questions that reflect
uncertainties in clinical practice. Some EBP writers distinguish between background
and foreground questions. Background questions are foundational questions about a
clinical issue, such as What is cancer cachexia (progressive body wasting), and what is
its pathophysiology? Answers to such questions are typically found in textbooks.
Foreground questions, by contrast, are those that can be answered based on current best
research evidence on diagnosing, assessing, or treating patients or on understanding the
meaning or prognosis of their health problems. For example, we may wonder, is a fish
oil–enhanced nutritional supplement effective in stabilizing weight in patients with
advanced cancer? The answer to such a question may provide guidance on how best to
address the needs of patients with cachexia.
Most guidelines for EBP use the acronyms PIO or PICO to help practitioners
develop well-worded questions that facilitate a search for evidence. In the most basic
PIO form, the clinical question is worded to identify three components:
1. P: the population or patients (What are the characteristics of the patients or people?)
2. I: the intervention, influence, or exposure (What are the interventions or therapies of
interest? or What are the potentially harmful influences/exposures of concern?)
3. O: the outcomes (What are the outcomes or consequences in which we are
interested?)
Applying this scheme to our question about cachexia, our population (P) is cancer
patients with cachexia, the intervention (I) is fish oil–enhanced nutritional supplements,
and the outcome (O) is weight stabilization. As another example, in the second clinical
scenario about scratch tests cited earlier, the population is children being tested for
allergies, the intervention is interactive distraction, and the outcome is pain.
For questions that can best be answered with qualitative information (e.g., about the
meaning of an experience or health problem), two components are most relevant:
1. The population (What are the characteristics of the patients or clients?)
2. The situation (What conditions, experiences, or circumstances are we interested in
understanding?)
For example, suppose our question was, What is it like to suffer from cachexia? In
this case, the question calls for rich qualitative information; the population is patients
with advanced cancer and the situation is the experience of cachexia.
In addition to the basic PIO components, other components are sometimes important
in an evidence search. In particular, a comparison (C) component may be needed, when
the intervention or influence of interest is contrasted with a specific alternative. For
example, we might be interested in learning whether fish oil–enhanced supplements (I)
are better than melatonin (C) in stabilizing weight (O) in cancer patients (P). When a
specific comparison is of interest, a PICO question is required, but if we were interested
in uncovering evidence about all alternatives to an intervention of primary interest, then
PIO components are sufficient. (By contrast, when asking questions to undertake an
actual study, the “C” must always be specified.)
TIP Other components may be relevant, such as a time frame in which an
intervention might be appropriate (adding a “T” for PICOT questions) or a
setting (adding an “S” for PICOS questions).
Table 2.1 offers templates for asking well-worded clinical questions for different
types of foreground questions. The far right column includes questions with an explicit
comparison (PICO), whereas the middle column does not have a comparison (PIO). The
questions are categorized in a manner similar to that discussed in Chapter 1 (EBP
Purpose), as featured in Table 1.3. One exception is that we have added description as a
category. Note that although there are some differences in components across question
types, there is always a P component.
TIP It is crucial to practice asking clinical questions—it is the starting
point for evidence-based nursing. Take some time to fill in the blanks in
Table 2.1 for each question category. Do not be too self-critical at this point.
Your comfort in developing questions will increase over time. Chapter 2 of
the study guide that accompanies this book offers additional opportunities for
you to practice asking well-worded questions.
Finding Research Evidence
By wording clinical queries as PIO or PICO questions, you should be able to search the
research literature for the information you need. Using the templates in Table 2.1, the
terms you insert into the blanks serve as keywords that can be used in an electronic
search.
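As a simple illustration of how the completed blanks become a search, the sketch below assembles PICO components into a boolean query string of the general kind accepted by bibliographic databases. The terms are from the cachexia example discussed earlier; actual query syntax varies by database:

```python
def build_search_query(population, intervention, outcome, comparison=None):
    """Join PICO components into a boolean search string."""
    parts = [population, intervention, outcome]
    if comparison:  # include the "C" only when a specific comparison is of interest
        parts.append(comparison)
    return " AND ".join(f'"{part}"' for part in parts)

query = build_search_query(
    population="cancer cachexia",
    intervention="fish oil supplement",
    outcome="weight stabilization",
)
print(query)  # "cancer cachexia" AND "fish oil supplement" AND "weight stabilization"
```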
For an individual EBP endeavor, the best place to begin is by searching for evidence
in a systematic review, clinical practice guideline, or other preprocessed source because
this approach leads to a quicker answer and, if your methodologic skills are limited,
potentially a superior answer as well. Researchers who prepare reviews and clinical
guidelines typically are well trained in research methods and use rigorous standards in
evaluating the evidence. Moreover, preprocessed evidence is often prepared by a team,
which means that the conclusions are cross-checked and fairly objective. Thus, when
preprocessed evidence is available to answer a clinical question, you may not need to
look any further unless the review is outdated. When preprocessed evidence cannot be
located or is old, you will need to look for best evidence in primary studies, using
strategies we describe in Chapter 7.
TIP Searching for evidence for an EBP project has been greatly simplified
in recent years. Guidance on doing a search for evidence on clinical
questions is available in the supplement for Chapter 7 (the chapter on
literature reviews) on the book's website.
Appraising the Evidence for Evidence-Based Practice
Evidence should be appraised before clinical action is taken. The critical appraisal of
evidence for the purposes of EBP may involve several types of assessments (Box 2.1)
but often focuses primarily on evidence quality.
Box 2.1 Questions for Appraising the Evidence
1. What is the quality of the evidence—i.e., how rigorous and reliable is it?
2. What is the evidence—what is the magnitude of effects?
3. How precise is the estimate of effects?
4. What evidence is there of any side effects/side benefits?
5. What is the financial cost of applying (and not applying) the evidence?
6. Is the evidence relevant to my particular clinical situation?
Evidence Quality
The overriding appraisal issue is the extent to which the findings are valid. That is, were
the study methods sufficiently rigorous that the evidence can be trusted? Ideally, you
would find pre-appraised evidence, but a goal of this book is to help you evaluate
research evidence yourself. If there are several primary studies and no existing
systematic review, you would need to draw conclusions about the body of evidence
taken as a whole. Clearly, you would want to put most weight on the most rigorous
studies.
Magnitude of Effects
You would also need to assess whether study findings are clinically important. This
criterion considers not whether the results are “real” but how powerful the effects are.
For example, consider clinical scenario 3 cited earlier, which suggests this question:
Does the use of compression stockings lower the risk of flight-related deep vein
thrombosis for high-risk patients? In our search, we found a relevant systematic review
in the nursing literature—a meta-analysis of nine RCTs (Hsieh & Lee, 2005)—and
others in the Cochrane database (Clarke et al., 2006; O’Meara et al., 2012). The
conclusion of these reviews, based on reliable evidence, was that compression stockings
are effective and the magnitude of the risk-reducing effect is fairly substantial. Thus,
advice about using compression stockings may be appropriate, pending an appraisal of
other factors. The magnitude of effects can be quantified, and several methods are
described later in this book. The magnitude of effects also has a bearing on clinical
significance, which we also discuss in a later chapter.
Precision of Estimates
When the evidence is quantitative, another consideration is how precise the estimate of
effect is. This type of appraisal requires some statistical knowledge, and so we postpone
our discussion of confidence intervals to Chapter 14. Suffice it to say that research
results provide only an estimate of effects, and it is useful to understand not only the
exact estimate but also the range within which the actual effect probably lies.
Peripheral Effects
Even if the evidence is judged to be valid and the magnitude of effects is sizeable,
peripheral benefits and costs may be important in guiding decisions. In framing your
clinical question, you would have identified the outcomes (O) in which you were
interested—for example, weight stabilization for an intervention to address cancer
cachexia. Research on this topic, however, would likely have considered other outcomes
that need to be taken into account—for example, effects on quality of life.
Financial Costs
Another issue concerns the costs of applying the evidence. Costs may be small or
nonexistent. For example, in clinical scenario 4 concerning the experience of CPAP
treatment, nursing action would be cost-neutral because the evidence would be used to
reassure and inform patients. When interventions and assessments are costly, however,
the resources needed to put best evidence into practice need to be factored into any
decision. Of course, although the cost of a clinical decision needs to be considered, the
cost of not taking action is equally important.
Clinical Relevance
Finally, it is important to appraise the evidence in terms of its relevance for the clinical
situation at hand—that is, for your patient in a specific clinical setting. Best practice
evidence can most readily be applied to an individual patient in your care if he or she is
sufficiently similar to people in the study or studies under review. Would your patient
have qualified for participation in the study—or would some factor (e.g., age, illness
severity, comorbidities) have disqualified him or her? DiCenso and colleagues (2005),
who advised clinicians to ask whether there is a compelling reason to conclude that
results may not be applicable in their clinical situation, have written some useful tips on
applying evidence to individual patients.
Actions Based on Evidence Appraisals
Appraisals of the evidence may lead you to different courses of action. You may reach
this point and conclude that the evidence base is not sufficiently sound, or that the likely
effect is too small, or that the cost of applying the evidence is too high. The evidence
appraisal may suggest that “usual care” is the best strategy. If, however, the initial
appraisal of evidence suggests a promising clinical action, then you can proceed to the
next step.
Integrating Evidence in Evidence-Based Practice
Research evidence needs to be integrated with other types of information, including
your own clinical expertise and knowledge of your clinical setting. You may be aware
of factors that would make implementation of the evidence, no matter how sound and
how promising, inadvisable. Patient preferences and values are also important. A
discussion with the patient may reveal negative attitudes toward a potentially beneficial
course of action, contraindications (e.g., comorbidities), or possible impediments (e.g.,
lack of health insurance).
One final issue is the desirability of integrating evidence from qualitative research.
Qualitative research can provide rich insights about how patients experience a problem
or about barriers to complying with a treatment. A potentially beneficial intervention
may fail to achieve desired outcomes if it is not implemented with sensitivity to the
patients’ perspectives. As Morse (2005) so aptly noted, evidence from an RCT may tell
us whether a pill is effective, but qualitative research can help us understand why
patients may not swallow the pill.
Implementing the Evidence and Evaluating Outcomes
After the first four steps of the EBP process have been completed, you can use the
resulting information to make an evidence-based decision or to provide evidence-based
advice. Although the steps in the process, as just described, may seem complicated, in
reality, the process can be quite efficient—if there is adequate evidence, and especially
if it has been skillfully preprocessed. EBP is most challenging when findings from
research are contradictory, inconclusive, or “thin”—that is, when better quality evidence
is needed.
One last step in an individual EBP effort concerns evaluation. Part of the evaluation
process involves following up to determine whether your actions achieved the desired
outcome. Another part, however, concerns an evaluation of how well you are
performing EBP. Sackett and colleagues (2000) offer self-evaluation questions that
relate to the previous EBP steps, such as asking answerable questions (Am I asking any
clinical questions at all? Am I asking well-worded questions?) and finding external
evidence (Do I know the best sources of current evidence? Am I efficient in my
searching?). A self-appraisal may lead to the conclusion that at least some of the clinical
questions of interest to you are best addressed as a group effort.
EVIDENCE-BASED PRACTICE IN AN
ORGANIZATIONAL CONTEXT
For some clinical scenarios, individual nurses may be able to implement EBP strategies
on their own (e.g., giving advice about compression stockings). Many situations,
however, require decision making by an organization or by a team of nurses working to
solve a recurrent problem. This section describes some issues that are relevant to
institutional efforts at EBP—efforts designed to result in a formal policy or protocol
affecting the practice of many nurses.
Many steps in organizational EBP projects are similar to the ones described in the
previous section. For example, gathering and appraising evidence are key activities in
both, as shown in the Iowa Model in Figure 2.2 (assemble relevant research; critique
and synthesize research). Additional issues are relevant at the organizational level,
however, including selecting a problem; assessing whether the topic is an
organizational priority; deciding whether to test an EBP innovation on a trial basis; and
deciding, based on the trial, whether the innovation should be adopted. We briefly discuss
some of these topics.
Selecting a Problem for an Institutional Evidence-Based
Practice Project
Some EBP projects originate in deliberations among clinicians who have encountered a
recurrent problem and seek a resolution. Others, however, are “top-down” efforts in
which administrators take steps to stimulate the use of research evidence among
clinicians. This latter approach is increasingly likely to occur in U.S. hospitals as part of
the Magnet recognition process.
Several models of EBP, such as the Iowa Model, distinguish two types of stimulus
(“triggers”) for an EBP endeavor: (1) problem-focused triggers—the identification of a
clinical practice problem in need of a solution, and (2) knowledge-focused triggers
—readings in the research literature. The problem identification approach is likely to be
clinically relevant and to have staff support if the problem is one that numerous nurses
have encountered.
A second catalyst for an EBP project is a knowledge-focused trigger, which is akin
to RU. The catalyst might be a new clinical guideline or a research article discussed in a
journal club. With knowledge-focused triggers, the clinical relevance of the research
might need to be assessed. The central issue is whether a problem of significance to
nurses in a particular setting will be solved by introducing an innovation.
Appraising Implementation Potential
With either type of trigger, the feasibility of undertaking an organizational EBP project
needs to be assessed. In the Iowa Model (Fig. 2.2), the first major decision point
involves determining whether the topic is a priority for the organization considering
practice changes. Titler and colleagues (2001) advised considering the following issues
before finalizing a topic for EBP: the topic’s fit with the organization’s strategic plan,
the magnitude of the problem, the number of people invested in the problem, support of
nurse leaders and of those in other disciplines, costs and availability of resources, and
possible barriers to change.
Some EBP models involve a formal assessment of organizational “fit,” often called
implementation potential (or environmental readiness). In assessing the
implementation potential of an innovation, several issues should be considered,
particularly the transferability of the innovation (i.e., the extent to which the innovation
might be appropriate in new settings), the feasibility of implementing it, and its cost–
benefit ratio. If the implementation assessment suggests that there might be problems in
testing the innovation in a particular practice setting, then the team can either identify a
new problem and begin the process anew or develop a plan to improve the
implementation potential (e.g., seeking external resources if costs are prohibitive).
Evidence Appraisals and Subsequent Actions
In the Iowa Model, the second major decision relies on the synthesis and appraisal of
research evidence. The crux of the decision concerns whether the research base is
sufficient to justify an evidence-based change—for example, whether a new clinical
practice guideline is of sufficient quality that it can be used or adapted, or whether the
research evidence is rigorous enough to recommend a practice innovation.
Assessments about the adequacy of the evidence can lead to different action paths. If
the research evidence is weak, the team could assemble nonresearch evidence (e.g.,
through consultation with experts or client surveys) to determine the benefit of a
practice change. Another option is to conduct an original study to address the practice
question, thereby gathering new evidence. This course of action may be impractical and
would result in years of delay.
If, on the other hand, there is a solid research base or a high-quality clinical practice
guideline, then the team would develop plans to implement a practice innovation. A key
activity usually involves developing or adapting a local evidence-based clinical practice
protocol or guideline. Strategies for developing clinical practice guidelines are
suggested in DiCenso et al. (2005) and Melnyk and Fineout-Overholt (2015).
Implementing and Evaluating the Innovation
Once the EBP product has been developed, the next step is to pilot test it (give it a trial
run) and evaluate the outcome. Building on the Iowa Model, this phase of the project
likely would involve the following activities:
1. Developing an evaluation plan (e.g., identifying outcomes to be achieved,
determining how many clients to include, deciding when and how often to measure
outcomes)
2. Measuring client outcomes prior to implementing the innovation so that there is a
comparison against which the outcomes of the innovation can be assessed
3. Training relevant staff in the use of the new guideline and, if necessary, “marketing”
the innovation to users
4. Trying the guideline out on one or more units or with a group of clients
5. Evaluating the pilot project, in terms of both process (e.g., How was the innovation
received? What problems were encountered?) and outcomes (e.g., How were client
outcomes affected? What were the costs?); a minimal sketch of a pre-post outcome
comparison appears after this list
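Here is the promised sketch of a pre-post outcome comparison (steps 2 and 5 above), using hypothetical length-of-stay values; a real evaluation would apply appropriate statistical tests and examine multiple outcomes:

```python
from statistics import mean

# Hypothetical ICU length-of-stay data (days), before and after the innovation
pre_los = [6.5, 8.0, 5.0, 9.5, 7.0, 6.0]
post_los = [5.5, 6.0, 4.5, 8.0, 6.5, 5.0]

print(f"Mean LOS pre: {mean(pre_los):.1f} days, post: {mean(post_los):.1f} days")
print(f"Average change after implementation: {mean(post_los) - mean(pre_los):+.1f} days")
```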
A fairly informal evaluation may be adequate, but formal efforts are often
appropriate and provide opportunities for dissemination to others at conferences or in
professional journals.
TIP Every nurse can play a role in using research evidence. Here are some
strategies:
• Read widely and critically.
• Attend professional conferences.
• Become involved in a journal club.
• Pursue and participate in EBP projects.
QUALITY IMPROVEMENT
We conclude this chapter with a brief discussion of quality improvement (QI)
projects, which are efforts ongoing in many health care settings and which sometimes
involve nurses. In recent years, there has been a lot of discussion in health journals
about the differences and similarities between QI projects and research. And in nursing,
efforts have been made to distinguish QI, research, and EBP projects (Shirey et al.,
2011). All three have much in common, notably the use of systematic methods of
solving health problems with an overall aim of fostering improvements in health care.
Often, the research methods used overlap: Patient data are used in all three, and
statistical analysis—sometimes combined with analysis of qualitative data—is also
used in all three.
The definitions of QI, research, and EBP activities are distinct, and yet it is not
always easy to distinguish them in real-world projects, resulting in confusion. QI has
been defined by the U.S. Centers for Medicare & Medicaid Services (CMS) as “an
assessment, conducted by or for a QI organization, of a patient care problem for the
purpose of improving patient care through peer analysis, intervention, resolution of the
problem, and follow-up” (CMS, 2003). Under the Code of Federal Regulations in the
United States, research is defined as a “systematic investigation, including research
development, testing and evaluation, designed to develop or contribute to generalizable
knowledge” (U.S. Code of Federal Regulations, 2009). And EBP projects, as we have
seen, are efforts to translate “best evidence” into protocols to guide the actions of health
care staff to maximize good outcomes for clients. Shirey and colleagues (2011)
summarize the differences between the three as follows: “All three have an important,
but different, relationship with knowledge: research generates it, EBP translates it, and
QI incorporates it” (p. 60).
QI projects are discussed briefly in Chapter 13. Here, we note a few characteristics
of QI:
• In QI efforts, the intervention or protocol can change as it is being evaluated to incorporate new ideas or insights.
• The purpose of a QI project is often to effect immediate improvement in health care delivery.
• QI is designed with the intent of sustaining an improvement.
• QI is a necessary, integral activity for a health care institution; research is not.
• A literature review may not be undertaken in a QI project.
• QI projects are not externally funded.
Example of a nurse-led quality improvement project
McMullen and colleagues (2016) undertook a QI project in a Magnet hospital to
promote safe sleep guidelines for hospitalized infants based on recommendations
from the American Academy of Pediatrics. The project involved an educational
initiative for parents and hospital staff.
Hundreds of projects to translate research evidence into nursing practice are
underway worldwide. Those that have been described in the nursing literature
offer good information about planning and implementing such an endeavor. In
this section, we summarize one such project.
Read the research summary for Example 1 and then answer the critical
thinking questions that follow, referring to the full research report if
necessary (this example is featured on the interactive Critical Thinking
Activity on the book's website). The critical thinking questions for Examples 2
and 3 are based on the studies that appear in their entirety in Appendices A
and B of this book. Our comments for these exercises are in the Student
Resources section on the book's website.
EXAMPLE 1: EVIDENCE-BASED PRACTICE PROJECT
Study: Implementation of the ABCDE bundle to improve patient outcomes in
the intensive care unit in a rural community hospital (Kram et al., 2015)
Purpose: A team of nurses undertook an EBP project to implement an existing care
bundle designed to manage delirium—the ABCDE bundle—in a rural
community ICU. The bundle incorporates awakening, breathing, coordination
(or choice of sedative), delirium monitoring and management, and early
mobility on a daily basis. The question for this EBP project was: Does the
implementation of the ABCDE bundle, versus usual care (absence of
the ABCDE bundle components), reduce the incidence of delirium, decrease
patient length of stay (LOS) in the ICU, decrease patient total hospital LOS,
and decrease length of mechanical ventilation of patients, thus decreasing
costs in the ICU?
Framework: The project used the Johns Hopkins Nursing Evidence-Based
Practice Model as its guiding framework.
Approach: The team began by reviewing the current body of evidence on the
ABCDE bundle. They also undertook an organizational assessment and
identified which practice changes were required. Key stakeholder support was
sought. Approval was obtained from the nurse executive committee, the chief
medical officer, and from physicians with ICU admitting privileges.
Educational sessions, using various instructional methods, were conducted
with staff from nursing, respiratory therapy, and rehabilitation services. The
ABCDE bundle was implemented for all adult patients admitted to the ICU
starting in October 2014.
Evaluation: To assess the effects of the ABCDE bundle, the team collected
and organized relevant information for two periods: from October 2013 to
January 2014 (pre-bundle) and from October 2014 to January 2015 (post-
bundle). The outcomes of interest included rate of compliance with bundle
elements by direct care providers, changes in hospital and ICU length of stay
between the two periods, changes in the number of ventilator days from pre-
bundle to post-bundle, and prevalence of post-bundle delirium. Information
was obtained for 47 patients in the pre-bundle group and 36 patients in the
post-bundle group.
Findings and Conclusions: The team found that compliance with the bundle
protocols was high. The average hospital stay was 1.8 days lower after the
implementation of the bundle. Mechanical ventilation was lower by an
average of 1 day in the post-bundle group. A delirium prevalence rate of 19%
was established as a baseline after the bundle was implemented. The EBP
team concluded that the ABCDE bundle “can be implemented in rural,
community-based hospitals and provides a safe, cost-effective method for
enhancing ICU patient outcomes” (p. 250).
Critical Thinking Exercises
1. Of the EBP-focused research purposes (Table 1.3), which purpose was the
central focus of this project?
2. What is the clinical question that the EBP team asked in this project?
Identify the components of the question using the PICO framework.
3. Discuss how this project could have been based on either a knowledge-
focused or problem-focused trigger.
EXAMPLE 2: QUANTITATIVE RESEARCH IN APPENDIX A
• Read the abstract and the introduction of Swenson and colleagues’ (2016)
study (“Parents’ use of praise and criticism in a sample of young children
seeking mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Identify one or more clinical foreground questions that, if posed, would be
addressed by this study. Which PIO or PICO components does your
question capture?
2. How, if at all, might evidence from this study be used in an EBP project
(individual or organizational)?
EXAMPLE 3: QUALITATIVE RESEARCH IN APPENDIX B
• Read the abstract and the introduction of Beck and Watson’s (2010) study
(“Subsequent childbirth after a previous traumatic birth”) in Appendix B
of this book.
Critical Thinking Exercises
1. Identify one or more clinical foreground questions that, if posed, would be
addressed by this study. Which PIO or PICO components does your
question capture?
2. How, if at all, might evidence from this study be used in an EBP project
(individual or organizational)?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the book's website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Evaluating Clinical Practice Guidelines—AGREE
II
• Answers to the Critical Thinking Exercises for Examples 2 and 3
• Internet Resources with useful websites for Chapter 2
• A Wolters Kluwer journal article in its entirety—the EBP project
described as Example 1 on p. 37.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
• Evidence-based practice (EBP) is the conscientious use of current best evidence in making clinical decisions about patient care; it is a clinical problem-solving strategy that de-emphasizes decision making based on custom and emphasizes the integration of research evidence with clinical expertise and patient preferences.
• Research utilization (RU) and EBP are overlapping concepts that concern efforts to use research as a basis for clinical decisions, but RU starts with a research-based innovation that gets evaluated for possible use in practice. Knowledge translation (KT) is a term used primarily about system-wide efforts to effect systematic change in clinical practice or policies.
• Two underpinnings of the EBP movement are the Cochrane Collaboration (which is based on the work of British epidemiologist Archie Cochrane) and the clinical learning strategy developed at the McMaster Medical School called evidence-based medicine.
• EBP involves evaluating evidence to determine best evidence. Often an evidence hierarchy is used to rank study findings according to the strength of evidence provided, but different hierarchies are appropriate for different types of questions. In all evidence hierarchies, however, systematic reviews are at the pinnacle.
• Systematic reviews are rigorous integrations of research evidence from multiple studies on a topic. Systematic reviews can involve either quantitative methods (meta-analysis) that integrate findings statistically or narrative approaches to integration (including metasynthesis of qualitative studies).
• Evidence-based clinical practice guidelines combine an appraisal of research evidence with specific recommendations for clinical decisions.
• Many models of EBP have been developed, including models that provide a framework for individual clinicians (e.g., the Stetler Model) and others for organizations or teams of clinicians (e.g., the Iowa Model).
• Individual nurses have opportunities to put research into practice. The five basic steps for individual EBP are (1) asking an answerable clinical question, (2) searching for relevant research-based evidence, (3) appraising and synthesizing the evidence, (4) integrating evidence with other factors, and (5) assessing effectiveness of actions.
• One scheme for asking well-worded clinical questions involves four primary components, an acronym for which is PICO: population (P), intervention or influence (I), comparison (C), and outcome (O). When there is no explicit comparison, the acronym is PIO.
• An appraisal of the evidence involves such considerations as the validity of study findings, their clinical importance, the magnitude and precision of effects, associated costs and risks, and utility in a particular clinical situation.
• EBP in an organizational context involves many of the same steps as individual EBP efforts but is more formalized and must take organizational factors into account.
• Triggers for an organizational project include both pressing clinical problems (problem-focused) and existing knowledge (knowledge-focused).
• Before an EBP-based guideline or protocol can be tested, there should be an assessment of its implementation potential, which includes the issues of transferability, feasibility, and the cost–benefit ratio of implementing a new practice in a clinical setting.
• Once an evidence-based protocol or guideline has been developed and deemed worthy of implementation, the EBP team can move forward with a pilot test of the innovation and an assessment of the outcomes prior to widespread adoption.
• The purpose of quality improvement (QI) is to improve practices and processes within a specific organization—not to generate new knowledge that can be generalized. QI does not typically involve translating “best evidence” into a protocol.
REFERENCES FOR CHAPTER 2
*Brouwers, M., Kho, M., Browman, G., Burgers, J., Cluzeau, F., Feder, G., . . . Zitzelsberger, L. (2010). AGREE
II: Advancing guideline development, reporting and evaluation in health care. Canadian Medical Association
Journal, 182, E839–E842.
Centers for Medicare & Medicaid Services. (2003). Quality improvement organization manual. Retrieved from
http://cms.gov/Regulations-and-Guidance/Guidance/Manuals/Internet-Only-Manuals-IOMs-
Items/CMS019035.html
Clarke, M., Hopewell, S., Juszczak, E., Eisinga, A., & Kjeldstrøm, M. (2006). Compression stockings for
preventing deep vein thrombosis in airline passengers. Cochrane Database of Systematic Reviews, (2),
CD004002.
Dearholt, D., & Dang, D. (Eds.). (2012). Johns Hopkins nursing evidence-based practice: Model and guidelines
(2nd ed.). Indianapolis, IN: Sigma Theta Tau International.
DiCenso, A., Guyatt, G., & Ciliska, D. (2005). Evidence-based nursing: A guide to clinical practice. St. Louis,
MO: Elsevier Mosby.
Gawlinski, A., & Rutledge, D. (2008). Selecting a model for evidence-based practice changes. AACN Advanced
Critical Care, 19, 291–300.
Glasziou, P. (2005). Evidence-based medicine: Does it make a difference? Make it evidence informed with a little
wisdom. BMJ, 330(7482), 92.
Homer, C. S., Scarf, V., Catling, C., & Davis, D. (2014). Culture-based versus risk-based screening for the
prevention of group B streptococcal disease in newborns: A review of national guidelines. Women and Birth,
27(1), 46–51.
Horsley, J. A., Crane, J., & Bingle, J. D. (1978). Research utilization as an organizational process. Journal of
Nursing Administration, 8, 4–6.
Hsieh, H. F., & Lee, F. P. (2005). Graduated compression stockings as prophylaxis for flight-related venous
thrombosis: Systematic literature review. Journal of Advanced Nursing, 51, 83–98.
**Kram, S., DiBartolo, M., Hinderer, K., & Jones, R. (2015). Implementation of the ABCDE bundle to improve
patient outcomes in the intensive care unit in a rural community hospital. Dimensions of Critical Care Nursing,
34, 250–258.
Lewis, S. (2001). Further disquiet on the guidelines front. Canadian Medical Association Journal, 165, 180–181.
Magid, M., Jones, J., Allen, L., McIlvennan, C., Magid, K., Thompson, J., & Matlock, D. (2016). The perceptions
of important elements of caregiving for a left ventricular assist device patient: A qualitative meta-synthesis.
Journal of Cardiovascular Nursing, 31, 215–225.
McMullen, S., Fioravanti, I., Brown, K., & Carey, M. (2016). Safe sleep for hospitalized infants. MCN: American
Journal of Maternal Child Nursing, 41, 43–50.
Melnyk, B. M. (2014). Evidence-based practice versus evidence-informed practice: A debate that could stall
forward momentum in improving healthcare quality, safety, patient outcomes, and costs. Worldviews on
Evidence-Based Nursing, 11, 347–349.
Melnyk, B. M., & Fineout-Overholt, E. (2015). Evidence-based practice in nursing and healthcare (3rd ed.).
Philadelphia, PA: Lippincott Williams & Wilkins.
Morse, J. (2005). Beyond the clinical trial: Expanding criteria for evidence. Qualitative Health Research, 15, 3–4.
O’Meara, S., Cullum, N., Nelson, E., & Dumville, J. (2012). Compression for venous ulcers. Cochrane Database
of Systematic Reviews, (1), CD000265.
Rogers, E. M. (1995). Diffusion of innovations (4th ed.). New York, NY: Free Press.
Rycroft-Malone, J. (2010). Promoting Action on Research Implementation in Health Services (PARiHS). In J.
Rycroft-Malone & T. Bucknall (Eds.), Models and frameworks for implementing evidence-based practice:
Linking evidence to action (pp. 109–133). Malden, MA: Wiley-Blackwell.
*Rycroft-Malone, J., Seers, K., Chandler, J., Hawkes, C., Crichton, N., Allen, C., . . . Strunin, L. (2013). The role
of evidence, context, and facilitation in an implementation trial: Implications for the development of the
PARIHS framework. Implementation Science, 8, 28.
Sackett, D. L., Straus, S. E., Richardson, W. S., Rosenberg, W., & Haynes, R. B. (2000). Evidence-based medicine:
How to practice and teach EBM (2nd ed.). Edinburgh, United Kingdom: Churchill Livingstone.
Shah, H., Schwartz, J., Luna, G., & Cullen, D. (2016). Bathing with 2% chlorhexidine gluconate: Evidence and
costs associated with central line-associated bloodstream infections. Critical Care Nursing Quarterly, 39, 42–
50.
Shirey, M., Hauck, S., Embree, J., Kinner, T., Schaar, G., Phillips, L., . . . McCool, I. (2011). Showcasing
differences between quality improvement, evidence-based practice, and research. Journal of Continuing
Education in Nursing, 42, 57–68.
Sigma Theta Tau International. (2008). Sigma Theta Tau International position statement on evidence-based
practice, February 2007 summary. Worldviews on Evidence-Based Nursing, 5, 57–59.
Stetler, C. B. (2010). Stetler model. In J. Rycroft-Malone & T. Bucknall (Eds.), Models and frameworks for
implementing evidence-based practice: Linking evidence to action (pp. 51–77). Malden, MA: Wiley-Blackwell.
Tayyib, N., Coyer, F., & Lewis, P. (2015). A two-arm cluster randomized control trial to determine the
effectiveness of a pressure ulcer prevention bundle for critically ill patients. Journal of Nursing Scholarship, 47,
237–247.
Titler, M. (2010). Iowa model of evidence-based practice. In J. Rycroft-Malone & T. Bucknall (Eds.), Models and
frameworks for implementing evidence-based practice: Linking evidence to action (pp. 137–144). Malden, MA:
Wiley-Blackwell.
Titler, M. G., Kleiber, C., Steelman, V., Rakel, B., Budreau, G., Everett, L., . . . Goode, C. (2001). The Iowa model
of evidence-based practice to promote quality care. Critical Care Nursing Clinics of North America, 13, 497–
509.
U.S. Code of Federal Regulations, 45 C.F.R. 46.102 (2009). Retrieved from
http://www.hhs.gov/ohrp/sites/default/files/ohrp/policy/ohrpregulations .
*World Health Organization. (2005). Bridging the “Know-Do” gap: Meeting on knowledge translation in global
health. Retrieved from http://www.who.int/kms/WHO_EIP_KMS_2006_2
*A link to this open-access article is provided in the Internet Resources section on the book's website.
**This journal article is available on the book's website for this chapter.
3 Key Concepts and Steps in
Quantitative and Qualitative
Research
Learning Objectives
On completing this chapter, you will be able to:
• Define new terms presented in the chapter and distinguish terms associated with quantitative and qualitative research
• Distinguish experimental and nonexperimental research
• Identify the three main disciplinary traditions for qualitative nursing research
• Describe the flow and sequence of activities in quantitative and qualitative research and discuss why they differ
Key Terms
Cause-and-effect (causal) relationship
Clinical trial
Concept
Conceptual definition
Construct
Data
Dependent variable
Emergent design
Ethnography
Experimental research
Gaining entrée
Grounded theory
Hypothesis
Independent variable
Informant
Intervention protocol
Literature review
Nonexperimental research
Observational study
Operational definition
Outcome variable
Phenomenology
Population
Qualitative data
Quantitative data
Relationship
Research design
Sample
Saturation
Statistical analysis
Study participant
Subject
Theme
Theory
Variable
THE BUILDING BLOCKS OF RESEARCH
Research, like any discipline, has its own language—its own jargon—and that jargon
can sometimes be intimidating. We readily admit that the jargon is abundant and can be
confusing. Some research jargon used in nursing research has its roots in the social
sciences but, sometimes, different terms are used in medical research. Also, some terms
are used by both quantitative and qualitative researchers, but others are used mainly by
one or the other group. Please bear with us as we cover key terms that you will likely
encounter in the research literature.
The Faces and Places of Research
When researchers answer a question through disciplined research, they are doing a study
(or an investigation). Studies with humans involve two sets of people: those who do the
research and those who provide the information. In a quantitative study, the people
being studied are called subjects or study participants, as shown in Table 3.1. In a
qualitative study, the people cooperating in the study are called study participants or
informants. The person who conducts the research is the researcher or investigator.
Studies are often undertaken by a research team rather than by a single researcher.
HOW-TO-TELL TIP How can you tell if an article appearing in a
nursing journal is a study? In journals that specialize in research (e.g., the
journal Nursing Research), most articles are original research reports, but in
specialty journals, there is usually a mix of research and nonresearch
articles. Sometimes you can tell by the title, but sometimes you cannot.
You can tell, however, by looking at the major headings of an article. If
there is no heading called “Method” or “Research Design” (the section that
describes what a researcher did) and no heading called “Findings” or
“Results” (the section that describes what a researcher learned), then it is
probably not a study.
Research can be undertaken in a variety of settings (the types of places where
information is gathered), such as hospitals, homes, or other community settings. A site is
the specific location for the research—it could be an entire community (e.g., a Haitian
neighborhood in Miami) or an institution (e.g., a clinic in Seattle). Researchers
sometimes do multisite studies because the use of multiple sites offers a larger and often
more diverse group of participants.
Concepts, Constructs, and Theories
Research involves real-world problems, but studies are conceptualized in abstract terms.
For example, pain, fatigue, and obesity are abstractions of human characteristics. These
abstractions are called phenomena (especially in qualitative studies) or concepts.
Researchers sometimes use the term construct, which also refers to an abstraction,
but often one that is deliberately invented (or constructed). For example, self-care in
Orem’s model of health maintenance is a construct. The terms construct and concept are
sometimes used interchangeably, but a construct often refers to a more complex
abstraction than a concept.
A theory is an explanation of some aspect of reality. In a theory, concepts are
knitted together into a coherent system to describe or explain some aspect of the world.
Theories play a role in both quantitative and qualitative research. In a quantitative study,
researchers often start with a theory and, using deductive reasoning, make predictions
about how phenomena would behave in the real world if the theory were valid. The
specific predictions are then tested. In qualitative studies, theory often is the product of
the research: The investigators use information from study participants inductively to
develop a theory rooted in the participants’ experiences.
TIP The reasoning process of deduction is associated with quantitative
research, and induction is associated with qualitative research. The
supplement for Chapter 3 on the book's website explains and illustrates the
distinction.
Variables
In quantitative studies, concepts are usually called variables. A variable, as the name
implies, is something that varies. Weight, anxiety, and fatigue are all variables—they
vary from one person to another. Most human characteristics are variables. If everyone
weighed 150 pounds, weight would not be a variable; it would be a constant. But it is
precisely because people and conditions do vary that most research is conducted.
Quantitative researchers seek to understand how or why things vary and to learn how
differences in one variable relate to differences in another. For example, in lung cancer
research, lung cancer is a variable because not everybody has this disease. Researchers
have studied factors that might be linked to lung cancer, such as cigarette smoking.
Smoking is also a variable because not everyone smokes. A variable, then, is any
quality of a person, group, or situation that varies or takes on different values. Variables
are the central building blocks of quantitative studies.
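To make the idea of a variable concrete, here is a minimal sketch in Python (our illustration, not the authors'; all values are invented). Weight and anxiety qualify as variables because their values differ from one fictitious person to the next.

```python
# A minimal sketch (not from the textbook): each record holds one fictitious
# participant's values on two variables, weight and anxiety.
participants = [
    {"id": 1, "weight_lb": 150, "anxiety_0_to_10": 3},
    {"id": 2, "weight_lb": 182, "anxiety_0_to_10": 7},
    {"id": 3, "weight_lb": 134, "anxiety_0_to_10": 5},
]

# Weight is a variable here because its values differ across people; if every
# value were 150, it would be a constant instead.
distinct_weights = {p["weight_lb"] for p in participants}
print("weight is a variable" if len(distinct_weights) > 1 else "weight is a constant")
```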
TIP Every study focuses on one or more phenomena, concepts, or
variables, but these terms per se are not necessarily used in research reports.
For example, a report might say, “The purpose of this study is to examine the
effect of nurses’ workload on hand hygiene compliance.” Although the
researcher did not explicitly label anything a variable, the variables under
study are workload and hand hygiene compliance. Key concepts or variables
are often indicated in the study title.
Characteristics of Variables
Variables are often inherent human traits, such as age or weight, but sometimes
researchers create a variable. For example, if a researcher tests the effectiveness of
patient-controlled analgesia compared to intramuscular analgesia in relieving pain after
surgery, some patients would be given one type of analgesia, and some would receive
the other. In the context of this study, method of pain management is a variable because
different patients are given different analgesic methods.
Some variables take on a wide range of values that can be represented on a
continuum (e.g., a person’s age or weight). Other variables take on only a few values;
some such variables convey quantitative information (e.g., number of children),
but others simply involve placing people into categories (e.g., male, female, other; or
blood type A, B, AB, or O).
Dependent and Independent Variables
As noted in Chapter 1, many studies seek to understand causes of phenomena. Does a
nursing intervention cause improvements in patient outcomes? Does smoking cause
lung cancer? The presumed cause is the independent variable, and the presumed effect
is the dependent or outcome variable. The dependent variable is the outcome that
researchers want to understand, explain, or predict. In terms of the PICO scheme
discussed in Chapter 2, the dependent variable corresponds to the “O” (outcome). The
independent variable corresponds to the “I” (the intervention, influence, or exposure),
plus the “C” (the comparison).
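As a hedged illustration of this mapping (our own sketch; the question and all names are hypothetical, echoing the analgesia example from earlier in this chapter):

```python
# Hypothetical PICO breakdown for a therapy question (illustrative only):
# "Among postsurgical patients (P), does patient-controlled analgesia (I),
# compared with intramuscular analgesia (C), reduce pain (O)?"
pico = {
    "P": "postsurgical patients",
    "I": "patient-controlled analgesia",
    "C": "intramuscular analgesia",
    "O": "postoperative pain level",
}

# The independent variable combines the I and the C; the dependent variable is the O.
independent_variable = f"{pico['I']} versus {pico['C']}"
dependent_variable = pico["O"]
print(f"independent variable: {independent_variable}")
print(f"dependent variable:   {dependent_variable}")
```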
TIP In searching for evidence, a nurse might want to learn about the effects
of an intervention or influence (I), compared to any alternative, on a
designated outcome. In a cause-probing study, however, researchers always
specify what the comparative intervention or influence (the “C”) is.
The terms independent variable and dependent variable also can be used to indicate
direction of influence rather than cause and effect. For example, suppose we compared
levels of depression among men and women diagnosed with pancreatic cancer and
found men to be more depressed. We could not conclude that depression was caused by
gender. Yet the direction of influence clearly runs from gender to depression: It makes
no sense to suggest that patients’ depression influenced their gender. In this situation, it
is appropriate to consider depression as the outcome variable and gender as the
independent variable.
TIP Few research reports explicitly label variables as dependent and
independent. Moreover, variables (especially independent variables) are
sometimes not fully spelled out. Take the following research question: What
is the effect of exercise on heart rate? In this example, heart rate is the
dependent variable. Exercise, however, is not in itself a variable. Rather,
exercise versus something else (e.g., no exercise) is a variable; “something
else” is implied rather than stated in the research question.
Many outcomes have multiple causes or influences. If we were studying factors that
influence people’s body mass index, the independent variables might be height, physical
activity, and diet. And, two or more outcome variables may be of interest. For example,
a researcher may compare two alternative dietary interventions in terms of participants’
weight, lipid profile, and self-esteem. It is common to design studies with multiple
independent and dependent variables.
Variables are not inherently dependent or independent. A dependent variable in one
study could be an independent variable in another. For example, a study might examine
the effect of an exercise intervention (the independent variable) on osteoporosis (the
dependent variable) to answer a therapy question. Another study might investigate the
effect of osteoporosis (the independent variable) on bone fracture incidence (the
dependent variable) to address a prognosis question. In short, whether a variable is
independent or dependent is a function of the role that it plays in a particular study.
Example of independent and dependent variables
Research question (Etiology/Harm question): Among heart failure patients, is
reduced gray matter volume (as measured through magnetic resonance imaging)
associated with poorer performance in instrumental activities of daily living? (Alosco
et al., 2016).
Independent variable: Volume of gray matter in the brain
Dependent variable: Performance in instrumental activities of daily living
Conceptual and Operational Definitions
The concepts of interest to researchers are abstractions, and researchers’ worldviews
shape how those concepts are defined. A conceptual definition is the theoretical
meaning of a concept. Researchers need to conceptually define even seemingly
straightforward terms. A classic example is the concept of caring. Morse and colleagues
(1990) examined how researchers and theorists defined caring and identified five
categories of conceptual definitions: as a human trait, a moral imperative, an affect, an
interpersonal relationship, and a therapeutic intervention. Researchers undertaking
studies of caring need to clarify how they conceptualized it.
In qualitative studies, conceptual definitions of key phenomena may be a major end
product, reflecting an intent to have the meaning of concepts defined by those being
studied. In quantitative studies, however, researchers must define concepts at the outset
because they must decide how the variables will be measured. An operational
definition indicates what the researchers specifically must do to measure the concept
and collect needed information.
Readers of research articles may not agree with how researchers conceptualized and
operationalized variables. However, definitional precision is important in
communicating what concepts mean within the context of the study.
Example of conceptual and operational definitions
Stoddard and colleagues (2015) studied the relationship between young adolescents’
hopeful future expectations on the one hand and bullying on the other. The
researchers defined bullying conceptually as “intentional aggressive behaviors that
are repetitive and impose a power imbalance between students who bully and
students who are victimized” (p. 422). They operationalized bullying behavior by
asking a set of 12 questions. One question asked participants how often in the past
month they would “say things about another student to make others laugh?” (p. 426).
Participants were asked to respond on a scale from 0 (never) to 5 (five or more times).
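The excerpt does not state how the 12 item responses were combined into a score; summing them is a common convention for such multi-item measures. The sketch below assumes that convention, and the function name and response values are ours, purely for illustration.

```python
# Hypothetical scoring sketch: 12 bullying items, each answered on a scale
# from 0 (never) to 5 (five or more times). Summing the items into a total
# score is an assumption, not the authors' documented scoring rule.
def bullying_score(item_responses):
    assert len(item_responses) == 12, "the instrument has 12 items"
    assert all(0 <= r <= 5 for r in item_responses), "responses range from 0 to 5"
    return sum(item_responses)  # possible totals: 0 through 60

example_responses = [0, 1, 0, 2, 5, 0, 0, 1, 3, 0, 0, 2]
print(bullying_score(example_responses))  # -> 14
```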
Data
Research data (singular, datum) are the pieces of information gathered in a study. In
quantitative studies, researchers identify and define their variables and then collect
relevant data from subjects. The actual values of the study variables constitute the data.
Quantitative researchers collect primarily quantitative data—information in numeric
form. For example, if we conducted a quantitative study in which a key variable was
depression, we would need to measure how depressed participants were. We might ask,
“Thinking about the past week, how depressed would you say you have been on a scale
from 0 to 10, where 0 means ‘not at all’ and 10 means ‘the most possible’?” Box 3.1
presents quantitative data for three fictitious people. The subjects provided a number
along the 0 to 10 continuum corresponding to their degree of depression—9 for subject
1 (a high level of depression), 0 for subject 2 (no depression), and 4 for subject 3 (little
depression).
Box 3.1 Example of Quantitative Data
Question: Thinking about the past week, how depressed would you say you have
been on a scale from 0 to 10, where 0 means “not at all” and 10 means “the most
possible”?
Data: 9 (Subject 1)
0 (Subject 2)
4 (Subject 3)
In qualitative studies, researchers collect primarily qualitative data, that is,
narrative descriptions. Narrative data can be obtained by conversing with participants,
by making notes about their behavior in naturalistic settings, or by obtaining narrative
records, such as diaries. Suppose we were studying depression qualitatively. Box 3.2
presents qualitative data for three participants responding conversationally to the
question “Tell me about how you’ve been feeling lately—have you felt sad or depressed
at all, or have you generally been in good spirits?” Here, the data consist of rich
narrative descriptions of participants’ emotional state. In reports on qualitative studies,
researchers include excerpts from their narrative data to support their interpretations.
Box 3.2 Example of Qualitative Data
Question: Tell me about how you’ve been feeling lately—have you felt sad or
depressed at all, or have you generally been in good spirits?
Data: “Well, actually, I’ve been pretty depressed lately, to tell you the truth. I
wake up each morning and I can’t seem to think of anything to look
forward to. I mope around the house all day, kind of in despair. I just can’t
seem to shake the blues and I’ve begun to think I need to go see a shrink.”
(Participant 1)
“I can’t remember ever feeling better in my life. I just got promoted to a
new job that makes me feel like I can really get ahead in my company.
And I’ve just gotten engaged to a really great guy who is very special.”
(Participant 2)
“I’ve had a few ups and downs the past week but basically things are on a
pretty even keel. I don’t have too many complaints.” (Participant 3)
Relationships
Researchers usually study phenomena in relation to other phenomena—they examine
relationships. A relationship is a connection between phenomena; for example,
researchers repeatedly have found that there is a relationship between frequency of
turning bedridden patients and the incidence of pressure ulcers. Quantitative and
qualitative studies examine relationships in different ways.
In quantitative studies, researchers are interested in the relationship between
independent variables and outcomes. Relationships are often explicitly expressed in
quantitative terms, such as more than or less than. For example, consider a person’s
weight as our outcome variable. What variables are related to (associated with) a
person’s weight? Some possibilities include height, caloric intake, and exercise. For
each independent variable, we can make a prediction about its relationship to the
outcome:
Height: Tall people will weigh more than short people.
Caloric intake: People with high caloric intake will be heavier than those with low
caloric intake.
Exercise: The lower the amount of exercise, the greater will be the person’s weight.
Each statement expresses a predicted relationship between weight (the outcome) and
a measurable independent variable. Most quantitative research is conducted to assess
whether relationships exist among variables and to measure how strong the relationship
is.
TIP Relationships are expressed in two basic forms. First, relationships can
be expressed as “if more of Variable X, then more of (or less of) Variable
Y.” For example, there is a relationship between height and weight: With
greater height, there tends to be greater weight, i.e., tall people tend to weigh
more than short people. The second form involves relationships expressed as
group differences. For example, there is a relationship between gender and
height: Men tend to be taller than women.
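A brief sketch of the first form of relationship, using invented height and weight values (note that `statistics.correlation` requires Python 3.10 or later):

```python
from statistics import correlation  # available in Python 3.10+

# Invented data: height (inches) and weight (pounds) for six people.
heights = [60, 63, 66, 68, 71, 74]
weights = [115, 130, 148, 160, 178, 195]

# A Pearson correlation near +1 captures the "with greater height, greater
# weight" form of relationship described in the tip above.
print(round(correlation(heights, weights), 2))
```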
Variables can be related to one another in different ways, including cause-and-
effect (or causal) relationships. Within the positivist paradigm, natural phenomena are
assumed to have antecedent causes that are discoverable. For example, we might
speculate that there is a causal relationship between caloric intake and weight: All else
being equal, eating more calories causes greater weight. As noted in Chapter 1, many
quantitative studies are cause-probing—they seek to illuminate the causes of
phenomena.
Example of a study of causal relationships
Bench and colleagues (2015) studied whether a critical care discharge information
pack for patients and their families would result in improved psychological well-
being (anxiety and depression) 5 days and 28 days after discharge.
Not all relationships can be interpreted as causal. There is a relationship, for
example, between a person’s pulmonary artery and tympanic temperatures: People with
high readings on one tend to have high readings on the other. We cannot say, however,
that pulmonary artery temperature caused tympanic temperature or vice versa. This type
of relationship is sometimes referred to as an associative (or functional) relationship
rather than a causal one.
Example of a study of associative relationships
Goh and colleagues (2016) studied factors associated with patients’ degree of
satisfaction with nursing care. They found significant differences in satisfaction in
different ethnic subgroups.
Qualitative researchers are not concerned with quantifying relationships or with testing
and confirming causal relationships. Rather, qualitative researchers may seek patterns of
association as a way of illuminating the underlying meaning and dimensionality of
phenomena of interest. Patterns of interconnected concepts are identified as a means of
understanding the whole.
Example of a qualitative study of patterns
Brooten and colleagues (2016) studied rituals of White, Black, and Hispanic parents
after the intensive care unit (ICU) death of an infant or child. They reported that the
grieving parents’ experiences differed on two important factors: (1) whether or not
the parents were recent immigrants to the United States with language barriers and
(2) level of family support systems.
MAJOR CLASSES OF QUANTITATIVE AND
QUALITATIVE RESEARCH
Researchers usually work within a paradigm that is consistent with their worldview and
that gives rise to the types of questions that excite their curiosity. In this section, we
briefly describe broad categories of quantitative and qualitative research.
Quantitative Research: Experimental and
Nonexperimental Studies
A basic distinction in quantitative studies is between experimental and nonexperimental
research. In experimental research, researchers actively introduce an intervention or
treatment—most often, to address therapy questions. In nonexperimental research, on
the other hand, researchers are bystanders—they collect data without introducing
treatments (most often, to address etiology, prognosis, or diagnosis questions). For
example, if a researcher gave bran flakes to one group of subjects and prune juice to
another to evaluate which method facilitated elimination more effectively, the study
would be experimental because the researcher intervened. If, on the other hand, a
researcher compared elimination patterns of two groups whose regular eating patterns
differed, the study would be nonexperimental because there is no intervention. In
medical and epidemiological research, experimental studies usually are called clinical
trials, and nonexperimental inquiries are called observational studies.
Experimental studies are explicitly designed to test causal relationships—to test
whether an intervention caused changes in the outcome. Sometimes, nonexperimental
studies also explore causal relationships, but causal inferences in nonexperimental
research are tricky and less conclusive, for reasons we explain in a later chapter.
Example of experimental research
In their experimental study, Demirel and Guler (2015) tested the efficacy of uterine
stimulation and nipple stimulation on birth duration and the incidence of synthetic
oxytocin induction among women having a vaginal birth. Some study participants
received nipple stimulation, others received uterine stimulation, and some received
neither.
In this example, the researchers intervened by designating that some women would
receive one of two interventions and that others would receive no special intervention.
In other words, the researcher controlled the independent variable, which in this case
was the stimulation interventions.
Example of nonexperimental research
Lai and colleagues (2015) compared women who had vaginal births and those who
had cesarean births in terms of postpartum fatigue and maternal–infant attachment.
Women with a cesarean delivery had higher fatigue, which in turn was associated
with weaker maternal–infant attachment.
In this nonexperimental study to address a prognosis question, the researchers did
not intervene in any way. They were interested in a similar population as in the previous
example (women giving birth), but their intent was to explore relationships among
existing conditions rather than to test a potential solution to a problem.
Qualitative Research: Disciplinary Traditions
Many qualitative nursing studies are rooted in research traditions that originated in
anthropology, sociology, and psychology. Three such traditions are briefly described
here. Chapter 11 provides a fuller discussion of these and other traditions and the
methods associated with them.
The grounded theory tradition seeks to describe and understand key social
psychological processes. Grounded theory was developed in the 1960s by two
sociologists, Glaser and Strauss (1967). The focus of most grounded theory studies is on
a developing social experience—the social and psychological phases that characterize a
particular event or episode. A major component of grounded theory is the discovery of a
core variable that is central in explaining what is going on in that social scene.
Grounded theory researchers strive to generate explanations of phenomena that are
grounded in reality.
Example of a grounded theory study
Keogh and colleagues (2015) used grounded theory methods to understand how
mental health service users transitioned home from a hospital stay. The researchers
found that the core variable was the patients’ management of preconceived
expectations.
Phenomenology is concerned with the lived experiences of humans.
Phenomenology is an approach to thinking about what life experiences of people are
like and what they mean. The phenomenological researcher asks the questions: What is
the essence of this phenomenon as experienced by these people? or What is the meaning
of the phenomenon to those who experience it?
Example of a phenomenological study
Tornøe and colleagues (2015) used a phenomenological approach in their study of
nurses’ experiences with spiritual and existential care for dying patients in a general
hospital.
Ethnography, the primary research tradition in anthropology, provides a framework
for studying the patterns and lifeways of a defined cultural group in a holistic fashion.
Ethnographers typically engage in extensive fieldwork, often participating to the extent
possible in the life of the culture under study. Ethnographers strive to learn from
members of a cultural group, to understand their worldview, and to describe their
customs and norms.
Example of an ethnographic study
Sandvoll and colleagues (2015) used ethnographic methods to explore how nursing
home staff members managed unpleasant resident behaviors in two public nursing
homes in Norway.
MAJOR STEPS IN A QUANTITATIVE STUDY
In quantitative studies, researchers move from the beginning point of a study (posing a
question) to the end point (obtaining an answer) in a reasonably linear sequence of steps
that is broadly similar across studies (Fig. 3.1). This section describes that flow, and the
next section describes how qualitative studies differ.
Phase 1: The Conceptual Phase
The early steps in a quantitative study typically involve activities with a strong
conceptual element. During this phase, researchers call on such skills as creativity,
deductive reasoning, and a grounding in research evidence on the topic of interest.
Step 1: Formulating and Delimiting the Problem
Quantitative researchers begin by identifying an interesting research problem and
formulating research questions. The research questions identify what the study
variables are. In developing questions, nurse researchers must attend to substantive
issues (Is this problem important?), theoretical issues (Is there a conceptual framework
for this problem?), clinical issues (Will findings be useful in clinical practice?),
methodologic issues (How can this question be answered to yield high-quality
evidence?), and ethical issues (Can this question be addressed in an ethical manner?).
Step 2: Reviewing the Related Literature
Quantitative research is conducted within the context of previous knowledge.
Quantitative researchers typically strive to understand what is already known about a
topic by undertaking a thorough literature review before any data are collected.
Step 3: Undertaking Clinical Fieldwork
Researchers embarking on a clinical study often benefit from spending time in relevant
clinical settings (in the field), discussing the topic with clinicians and observing current
practices. Such clinical fieldwork can provide perspectives on clinicians’ and clients’
viewpoints.
Step 4: Defining the Framework and Developing Conceptual Definitions
When quantitative research is performed within the context of a theoretical framework,
the findings may have broader significance and utility. Even when the research question
is not embedded in a theory, researchers should have a conceptual rationale and a clear
vision of the concepts under study.
Step 5: Formulating Hypotheses
Hypotheses state researchers’ expectations about relationships between study variables.
Hypotheses are predictions of the relationships researchers expect to observe in the
study data. The research question identifies the concepts of interest and asks how the
concepts might be related; a hypothesis is the predicted answer. Most quantitative
studies are designed to test hypotheses through statistical analysis.
Phase 2: The Design and Planning Phase
In the second major phase of a quantitative study, researchers decide on the methods
they will use to address the research question. Researchers make many methodologic
decisions that have crucial implications for the quality of the study evidence.
Step 6: Selecting a Research Design
The research design is the overall plan for obtaining answers to the research questions.
Quantitative designs tend to be structured and controlled, with the goal of minimizing
bias. Research designs also indicate how often data will be collected and what types of
comparisons will be made. The research design is the architectural backbone of the
study.
Step 7: Developing Protocols for the Intervention
In experimental research, researchers create the independent variable, which means that
participants are exposed to different treatments. An intervention protocol for the study
must be developed, specifying exactly what the intervention will entail (e.g., who will
administer it, over how long a period the treatment will last, and so on) and what the
alternative condition will be. In nonexperimental research, this step is not necessary.
Step 8: Identifying the Population
Quantitative researchers need to specify what characteristics study participants should
possess—that is, they must identify the population to be studied. A population is all the
individuals or objects with common, defining characteristics (the “P” component in
PICO questions).
Step 9: Designing the Sampling Plan
Researchers typically collect data from a sample, which is a subset of the population.
The researcher’s sampling plan specifies how the sample will be selected and how
many subjects there will be. The goal is to have a sample that adequately reflects the
population’s traits.
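As one minimal illustration of a sampling plan, the sketch below draws a simple random sample from a hypothetical population of patient identifiers. Simple random sampling is just one of many approaches researchers use, and the population and sample size here are invented.

```python
import random

# Illustrative only: draw a simple random sample of 5 participants from a
# hypothetical population of 20 patient identifiers.
population = [f"patient_{i:02d}" for i in range(1, 21)]

random.seed(42)                    # fixed seed so the draw is reproducible
sample = random.sample(population, k=5)
print(sample)
```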
Step 10: Specifying Methods to Measure Variables
Quantitative researchers must find methods to measure the research variables
accurately. A variety of quantitative data collection approaches exist; the primary
methods are self-reports (e.g., interviews and questionnaires), observations (e.g.,
watching and recording people’s behavior), and biophysiologic measurements. The task
of measuring research variables and developing a data collection plan is complex and
challenging.
Step 11: Developing Methods to Safeguard Human/Animal Rights
Most nursing research involves human subjects, although some studies involve animals. In
either case, procedures need to be developed to ensure that the study adheres to ethical
principles.
Step 12: Reviewing and Finalizing the Research Plan
Before collecting data, researchers often undertake assessments to ensure that
procedures will work smoothly. For example, they may evaluate the readability of
written materials to see if participants with low reading skills can comprehend them.
Researchers usually have their research plan critiqued by reviewers to obtain clinical or
methodologic feedback. Researchers seeking financial support submit a proposal to a
funding source, and reviewers usually suggest improvements.
Phase 3: The Empirical Phase
The third phase of quantitative studies involves collecting the research data. This phase
is often the most time-consuming part of the study. Data collection may require months
of work.
Step 13: Collecting the Data
The actual collection of data in a quantitative study often proceeds according to a
preestablished plan. The plan typically spells out procedures for training data collection
staff, for actually collecting data (e.g., where and when the data will be gathered), and
for recording information.
Step 14: Preparing the Data for Analysis
Data collected in a quantitative study must be prepared for analysis. For example, one
preliminary step is coding, which involves translating verbal data into numeric form
(e.g., coding gender information as “1” for females, “2” for males, and “3” for other).
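A small sketch of this coding step, using the code values from the example above (the list of raw responses is invented):

```python
# Translating verbal gender information into numeric form before analysis.
# The code values mirror the example in the text (1 = female, 2 = male,
# 3 = other); the response list is fabricated for illustration.
GENDER_CODES = {"female": 1, "male": 2, "other": 3}

raw_responses = ["female", "male", "female", "other"]
coded = [GENDER_CODES[response] for response in raw_responses]
print(coded)  # -> [1, 2, 1, 3]
```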
Phase 4: The Analytic Phase
Quantitative data must be subjected to analysis and interpretation, which occur in the
fourth major phase of a project.
Step 15: Analyzing the Data
To answer research questions and test hypotheses, researchers analyze their data in a
systematic fashion. Quantitative data are analyzed through statistical analyses, which
include some simple procedures (e.g., computing an average) as well as more complex,
sophisticated methods.
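For instance, computing an average, one of the simple procedures mentioned above, takes a single call in Python's standard library (the scores below are fabricated):

```python
from statistics import mean, stdev

# Fabricated depression scores (0-10 scale) for a small sample.
scores = [9, 0, 4, 6, 3, 5, 7, 2]
print(f"mean = {mean(scores):.2f}, sd = {stdev(scores):.2f}")
```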
Step 16: Interpreting the Results
Interpretation involves making sense of study results and examining their implications.
Researchers attempt to explain the findings in light of prior evidence, theory, and
clinical experience and in light of the adequacy of the methods they used in the study.
Interpretation also involves coming to conclusions about the clinical significance of the
new evidence.
Phase 5: The Dissemination Phase
In the analytic phase, researchers come full circle: The questions posed at the outset are
answered. The researchers’ job is not completed, however, until study results are
disseminated.
Step 17: Communicating the Findings
A study cannot contribute evidence to nursing practice if the results are not
communicated. Another—and often final—task of a research project is the preparation
of a research report that can be shared with others. We discuss research reports in the
next chapter.
Step 18: Putting the Evidence Into Practice
Ideally, the concluding step of a high-quality study is to plan for its use in practice
settings. Although nurse researchers may not implement a plan for using research
findings, they can contribute to the process by developing recommendations on how the
evidence could be used in practice, by ensuring that adequate information has been
provided for a meta-analysis, and by pursuing opportunities to disseminate the findings
to practicing nurses.
ACTIVITIES IN A QUALITATIVE STUDY
Quantitative research involves a fairly linear progression of tasks—researchers plan
what steps to take and then follow those steps. In qualitative studies, by contrast, the
progression is closer to a circle than to a straight line. Qualitative researchers
continually examine and interpret data and make decisions about how to proceed based
on what has been discovered (Fig. 3.2).
Because qualitative researchers have a flexible approach, we cannot show the flow
of activities precisely—the flow varies from one study to another, and researchers
themselves may not know in advance how the study will unfold. We provide a general
sense of qualitative studies by describing major activities and indicating when they
might be performed.
Conceptualizing and Planning a Qualitative Study
Identifying the Research Problem
Qualitative researchers usually begin with a broad topic, often focusing on an aspect
about which little is known. Qualitative researchers often proceed with a fairly broad
initial question that allows the focus to be sharpened and delineated more clearly once
the study is underway.
Doing a Literature Review
Some qualitative researchers avoid consulting the literature before collecting data. They
worry that prior studies might influence the conceptualization of the phenomenon under
study, which they believe should be based on participants’ viewpoints rather than on
prior findings. Others believe that researchers should conduct at least a brief literature
review at the outset. In any case, qualitative researchers typically find a relatively small
body of relevant previous work because of the type of questions they ask.
Selecting and Gaining Entrée Into Research Sites
Before going into the field, qualitative researchers must identify an appropriate site. For
example, if the topic is the health beliefs of the urban poor, an inner-city neighborhood
with a concentration of low-income residents must be identified. In some cases,
researchers may have access to the selected site, but in others, they need to gain entrée
into it. Gaining entrée typically involves negotiations with gatekeepers who have the
authority to permit entry into their world.
TIP The process of gaining entrée is usually associated with doing
fieldwork in qualitative studies, but quantitative researchers often need to
gain entrée into sites for collecting data as well.
Developing an Overall Approach
Quantitative researchers do not collect data before finalizing their research design.
Qualitative researchers, by contrast, use an emergent design that materializes during
data collection. Certain design features are guided by the study’s qualitative tradition,
but qualitative studies rarely have rigid designs that prohibit changes while in the field.
Addressing Ethical Issues
Qualitative researchers must also develop plans for addressing ethical issues—and,
indeed, there are special concerns in qualitative studies because of the more intimate
nature of the relationship that typically develops between researchers and participants.
Conducting a Qualitative Study
In qualitative studies, the tasks of sampling, data collection, data analysis, and
interpretation typically take place iteratively. Qualitative researchers begin by talking
with people who have firsthand experience with the phenomenon under study. The
discussions and observations are loosely structured, allowing participants to express a
full range of beliefs, feelings, and behaviors. Analysis and interpretation are ongoing
activities that guide choices about “next steps.”
The process of data analysis involves clustering together related narrative
information into a coherent scheme. Through inductive reasoning, researchers identify
themes and categories, which are used to build a rich description or theory of the
phenomenon. Data gathering becomes increasingly purposeful: As conceptualizations
develop, researchers seek participants who can confirm and enrich theoretical
understandings as well as participants who can potentially challenge them.
Quantitative researchers decide in advance how many subjects to include in the
study, but qualitative researchers’ sampling decisions are guided by the data. Many
qualitative researchers use the principle of saturation, which occurs when participants’
accounts about their experiences become redundant, such that no new information can
be gleaned by further data collection.
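The logic of saturation can be caricatured in code. The sketch below is purely illustrative (real saturation judgments are interpretive, not mechanical), and every code and interview in it is invented: it simply flags the point at which successive interviews stop contributing new codes.

```python
# Track how many *new* codes each successive interview contributes; when an
# interview adds none, the information has become redundant (saturation).
interview_codes = [
    {"isolation", "stigma"},   # interview 1
    {"stigma", "hope"},        # interview 2
    {"isolation", "hope"},     # interview 3: nothing new
]

seen = set()
for i, codes in enumerate(interview_codes, start=1):
    new = codes - seen
    seen |= codes
    print(f"interview {i}: {len(new)} new code(s)")
    if not new:
        print("no new information; saturation reached")
        break
```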
Quantitative researchers seek to collect high-quality data by measuring their
variables with instruments that have been demonstrated to be accurate and valid.
Qualitative researchers, by contrast, are the main data collection instrument and must
take steps to demonstrate the trustworthiness of the data. The central feature of these
efforts is to confirm that the findings accurately reflect the viewpoints of participants
rather than researchers’ perceptions. One confirmatory activity, for example, involves
going back to participants, sharing preliminary interpretations with them, and asking
them to evaluate whether the researcher’s thematic analysis is consistent with their
experiences.
Qualitative nursing researchers also strive to share their findings at conferences and
in journal articles. Qualitative studies help to shape nurses’ perceptions of a problem,
their conceptualizations of potential solutions, and their understanding of patients’
concerns and experiences.
TIP An emerging trend is for researchers to design mixed methods (MM)
studies that involve the collection, analysis, and integration of quantitative
and qualitative data. Mixed methods research is discussed in Chapter 13.
GENERAL QUESTIONS IN REVIEWING A STUDY
Box 3.3 presents some further suggestions for performing a preliminary overview of a
research report, drawing on concepts explained in this chapter. These guidelines
supplement those presented in Box 1.1 (see Chapter 1).
Box 3.3 Additional Questions for a Preliminary Review of a Study
1. What was the study all about? What were the main phenomena, concepts, or
constructs under investigation?
2. If the study was quantitative, what were the independent and dependent variables?
3. Did the researcher examine relationships or patterns of association among
variables or concepts? Did the report imply the possibility of a causal relationship?
4. Were key concepts defined, both conceptually and operationally?
5. What type of study does it appear to be, in terms of types described in this chapter
—experimental or nonexperimental/observational? Grounded theory,
phenomenologic, or ethnographic?
6. Did the report provide information to suggest how long the study took to
complete?
In this section, we illustrate the progression of activities and discuss the time
schedule of a study conducted by the second author of this book. Read the
research summary and then answer the critical thinking questions that follow,
referring to the full research report if necessary. Example 1 is featured in our
interactive Critical Thinking Activity on the accompanying website. The critical
thinking questions for Examples 2 and 3 are based on the studies that appear
in their entirety in Appendices A and B of this book. Our comments for these
exercises are in the Student Resources section on the accompanying website.
EXAMPLE 1: PROJECT SCHEDULE FOR A
QUANTITATIVE STUDY
Study: Postpartum depressive symptomatology: Results from a two-stage
U.S. national survey (Beck et al., 2011)
Study Purpose: Beck and colleagues (2011) undertook a study to estimate
the prevalence of mothers with elevated postpartum depressive (PPD)
symptom levels in the United States and factors associated with differences in
symptom levels.
Study Methods: This study took a little less than 3 years to complete. Key
activities and methodological decisions included the following:
Phase 1. Conceptual Phase: 1 Month. Beck had been a member of the
Listening to Mothers II National Advisory Council. The data for their
national survey (the Childbirth Connection: Listening to Mothers II U.S.
National Survey) had already been collected when Beck was approached to
analyze the variables in the survey relating to PPD symptoms. The first phase
took only 1 month because data collection was already completed, and Beck,
a world expert on PPD, just needed to update a review of the literature.
Phase 2. Design and Planning Phase: 3 Months. The design phase entailed
identifying which of the hundreds of variables on the national survey the
researchers would focus on in their analysis. Also, their research questions
were formalized during this phase. Approval from a human subjects
committee also was obtained during this phase.
Phase 3. Empirical Phase: 0 Months. In this study, the data from nearly
1,000 postpartum women had already been collected.
Phase 4. Analytic Phase: 12 Months. Statistical analyses were performed to
(1) estimate the percentage of new mothers experiencing elevated PPD
symptom levels and (2) identify which demographic, antepartum,
intrapartum, and postpartum variables were significantly related to elevated
symptom levels.
Phase 5. Dissemination Phase: 18 Months. The researchers prepared and
submitted their report to the Journal of Midwifery & Women’s Health for
possible publication. It was accepted within 5 months and was “in press”
(awaiting publication) another 4 months before being published. The article
received the Journal of Midwifery & Women’s Health 2012 Best Research
Article Award.
Critical Thinking Exercises
1. Answer the relevant questions from Box 3.3 regarding this study.
2. Also consider the following targeted questions:
a. Could the data for this study have been collected anonymously?
b. Comment on the appropriateness of the participant stipend in this study.
c. Do you think an appropriate amount of time was allocated to the various
phases and steps in this study?
d. Would it have been appropriate for the researchers to address the
research question using qualitative research methods? Why or why not?
EXAMPLE 2: QUANTITATIVE RESEARCH IN APPENDIX A
• Read the abstract and introduction of Swenson and colleagues’ (2016)
study (“Parents’ use of praise and criticism in a sample of young children
seeking mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 3.3 regarding this study.
2. Also consider the following targeted questions:
a. Comment on the composition of the research team for this study.
b. Did this report present any actual data from the study participants?
c. Would it have been possible for the researchers to use an experimental
design for this study?
EXAMPLE 3: QUALITATIVE RESEARCH IN APPENDIX B
• Read the abstract and the introduction of Beck and Watson’s (2010) study
(“Subsequent childbirth after a previous traumatic birth”) in Appendix B
of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 3.3 regarding this study.
2. Also consider the following targeted questions:
a. Find an example of actual data in this study. (You will need to look at
the “Results” section of this study.)
b. How long did it take Beck and Watson to collect the data for this study?
(You will find this information in the “Procedure” section.)
c. How much time elapsed between when the paper was accepted for
publication and when it was actually published? (You will find relevant
information at the end of the paper.)
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the accompanying website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Deductive and Inductive Reasoning
• Answers to the Critical Thinking Exercises for Examples 2 and 3
• Internet Resources with useful websites for Chapter 3
• A Wolters Kluwer journal article in its entirety—the study by Alosco et
al., described on p. 44.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary
Points
The people who provide information to the researchers in a study are called
subjects or study participants in quantitative research or study participants or
informants in qualitative research; collectively, they comprise the sample.
The site is the location for the research; researchers sometimes engage in multisite
studies.
Researchers investigate concepts and phenomena (or constructs), which are
abstractions inferred from people’s behavior or characteristics.
Concepts are the building blocks of theories, which are systematic explanations of
some aspect of the real world.
In quantitative studies, concepts are called variables. A variable is a characteristic
or quality that takes on different values (i.e., varies from one person or object to
another).
The dependent (or outcome) variable is the behavior, characteristic, or outcome
the researcher is interested in explaining, predicting, or affecting (the “O” in the
PICO scheme). The independent variable is the presumed cause of or influence
on the dependent variable. The independent variable corresponds to the “I” and the
“C” components in the PICO scheme.
A conceptual definition describes the abstract meaning of a concept being
studied. An operational definition specifies how the variable will be measured.
Data—the information collected during the course of a study—may take the form
of numeric values (quantitative data) or narrative information (qualitative data).
A relationship is a connection (or pattern of association) between variables.
Quantitative researchers study the relationship between independent variables and
outcome variables.
When the independent variable causes or affects the outcome, the relationship is a
cause-and-effect (or causal) relationship. In an associative (or functional)
relationship, variables are related in a noncausal way.
A key distinction in quantitative studies is between experimental research, in
which researchers actively intervene to test an intervention or therapy, and
nonexperimental (or observational) research, in which researchers collect data
about existing phenomena without intervening.
Qualitative research often is rooted in research traditions that originate in other
disciplines. Three such traditions are grounded theory, phenomenology, and
ethnography.
Grounded theory seeks to describe and understand key social psychological
processes that occur in a social setting.
Phenomenology focuses on the lived experiences of humans and is an approach to
gaining insight into what the life experiences of people are like and what they
mean.
Ethnography provides a framework for studying the meanings, patterns, and
lifeways of a culture in a holistic fashion.
In a quantitative study, researchers usually progress in a linear fashion from asking
research questions to answering them. The main phases in a quantitative study are
the conceptual, planning, empirical, analytic, and dissemination phases.
The conceptual phase involves (1) defining the problem to be studied, (2) doing a
literature review, (3) engaging in clinical fieldwork for clinical studies, (4)
developing a framework and conceptual definitions, and (5) formulating
hypotheses to be tested.
The planning phase entails (6) selecting a research design, (7) developing
intervention protocols if the study is experimental, (8) specifying the population,
(9) developing a plan to select a sample, (10) specifying a data collection plan
and methods to measure variables, (11) developing strategies to safeguard
subjects’ rights, and (12) finalizing the research plan.
The empirical phase involves (13) collecting data and (14) preparing data for
analysis (e.g., coding data).
The analytic phase involves (15) performing statistical analyses and (16)
interpreting the results.
The dissemination phase entails (17) communicating the findings and (18)
promoting the use of the study evidence in nursing practice.
The flow of activities in a qualitative study is more flexible and less linear.
Qualitative studies typically involve an emergent design that evolves during data
collection.
Qualitative researchers begin with a broad question regarding a phenomenon of
interest, often focusing on a little-studied aspect. In the early phase of a qualitative
study, researchers select a site and seek to gain entrée into it, which typically
involves enlisting the cooperation of gatekeepers within the site.
Once in the field, researchers select informants, collect data, and then analyze and
interpret them in an iterative fashion; experiences during data collection help in an
ongoing fashion to shape the design of the study.
Early analysis in qualitative research leads to refinements in sampling and data
collection, until saturation (redundancy of information) is achieved. Analysis
typically involves a search for critical themes or categories.
Both quantitative and qualitative researchers disseminate their findings, most often
by publishing their research reports in professional journals.
REFERENCES FOR CHAPTER 3
**Alosco, M., Brickman, A., Spitznagel, M., Narkhede, A., Griffith, E., Cohen, R., . . . Gunstad, J. (2016).
Reduced gray matter volume is associated with poorer instrumental activities of daily living performance in
heart failure. Journal of Cardiovascular Nursing, 31, 31–41.
Beck, C. T., Gable, R. K., Sakala, C., & Declercq, E. R. (2011). Postpartum depressive symptomatology: Results
from a two-stage U.S. national survey. Journal of Midwifery & Women’s Health, 56, 427–435.
*Bench, S., Day, T., Heelas, K., Hopkins, P., White, C., & Griffiths, P. (2015). Evaluating the feasibility and
effectiveness of a critical care discharge information pack for patients and their families: A pilot cluster
randomised controlled trial. BMJ Open, 5(11), e006852.
Brooten, D., Youngblut, J. M., Charles, D., Roche, R., Hidalgo, I., & Malkawi, F. (2016). Death rituals reported by
White, Black, and Hispanic parents following the ICU death of an infant or child. Journal of Pediatric Nursing,
31, 132–140.
Demirel, G., & Guler, H. (2015). The effect of uterine and nipple stimulation on induction with oxytocin and the
labor process. Worldviews on Evidence-Based Nursing, 12, 273–280.
Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative research.
Piscataway, NJ: Aldine.
Goh, M. L., Ang, E. N., Chan, Y., He, H. G., & Vehviläinen-Julkunen, K. (2016). A descriptive quantitative study
on multi-ethnic patient satisfaction with nursing care as measured by the Revised Humane Caring Scale. Applied
Nursing Research, 31, 126–131.
Keogh, B., Callaghan, P., & Higgins, A. (2015). Managing preconceived expectations: Mental health service users’
experiences of going home from hospital: A grounded theory study. Journal of Psychiatric and Mental Health
Nursing, 22, 715–723.
Lai, Y., Hung, C., Stocker, J., Chan, T., & Liu, Y. (2015). Postpartum fatigue, baby-care activities, and maternal-
infant attachment of vaginal and cesarean births following rooming-in. Applied Nursing Research, 28, 116–120.
Morse, J. M., Solberg, S. M., Neander, W. L., Bottorff, J. L., & Johnson, J. L. (1990). Concepts of caring and
caring as a concept. Advances in Nursing Science, 13, 1–14.
*Sandvoll, A., Grov, E., Kristoffersen, K., & Hauge, S. (2015). When care situations evoke difficult emotions in
nursing staff members: An ethnographic study in two Norwegian nursing homes. BMC Nursing, 14, 40.
Stoddard, S., Varela, J., & Zimmerman, M. (2015). Future expectations, attitude toward violence, and bullying
perpetration during early adolescence: A mediation evaluation. Nursing Research, 64, 422–433.
*Tornøe, K., Danbolt, L., Kvigne, K., & Sørlie, V. (2015). The challenge of consolation: Nurses’ experiences with
spiritual and existential care for the dying—a phenomenological hermeneutical study. BMC Nursing, 14, 62.
*A link to this open-access article is provided in the Internet Resources section on the accompanying website.
**This journal article is available on the accompanying website for this chapter.
4 Reading and Critiquing Research Articles
Learning Objectives
On completing this chapter, you will be able to:
Identify and describe the major sections in a research journal article
Characterize the style used in quantitative and qualitative research reports
Read a research article and broadly grasp its “story”
Describe aspects of a research critique
Understand the many challenges researchers face and identify some tools for
addressing methodologic challenges
Define new terms in the chapter
Key Terms
Abstract
Bias
Blinding
Confounding variable
Credibility
Critique
Findings
IMRAD format
Inference
Journal article
Level of significance
p
Placebo
Randomness
Reflexivity
Reliability
Research control
Scientific merit
Statistical significance
Statistical test
Transferability
Triangulation
Trustworthiness
Validity
Evidence from nursing studies is communicated through research reports that describe
what was studied, how it was studied, and what was found. Research reports are often
daunting to readers without research training. This chapter aims to make research
reports more accessible and also provides some guidance regarding critiques of research
reports.
TYPES OF RESEARCH REPORTS
Nurses are most likely to encounter research evidence in journals or at professional
conferences. Research journal articles are descriptions of studies published in
professional journals. Competition for journal space is keen, so research articles are
brief—generally only 10 to 20 double-spaced pages. This means that researchers must
condense a lot of information about the study into a short report.
Usually, manuscripts are reviewed by two or more peer reviewers (other
researchers) who make recommendations about acceptance of or revisions to the
manuscript. Reviews are usually blind—reviewers are not told researchers’ names, and
authors are not told reviewers’ names. Consumers thus have some assurance that journal
articles have been vetted by other impartial nurse researchers. Nevertheless, publication
does not mean that the findings can be uncritically accepted. Research method courses
help nurses to evaluate the quality of evidence reported in journal articles.
At conferences, research findings are presented as oral presentations or poster
sessions. In an oral presentation, researchers are typically allotted 10 to 20 minutes to
describe key features of their study to an audience. In poster sessions, many researchers
simultaneously present visual displays summarizing their studies, and conference
attendees walk around the room looking at the displays. Conferences offer an
opportunity for dialogue: Attendees can ask questions to help them better understand
what the findings mean; moreover, they can offer the researchers suggestions relating to
clinical implications of the study. Thus, professional conferences are a valuable forum
for clinical audiences.
THE CONTENT OF RESEARCH JOURNAL
ARTICLES
Many research articles follow an organization called the IMRAD format. This format
organizes content into four main sections—Introduction, Method, Results, and
Discussion. The paper is preceded by a title and an abstract and concludes with
references.
The Title and Abstract
Research reports have titles that succinctly convey key information. In qualitative
studies, the title normally includes the central phenomenon and group under
investigation. In quantitative studies, the title communicates key variables and the
population (in other words, PICO components).
The abstract is a brief description of the study placed at the beginning of the article.
The abstract answers questions like the following: What were the research questions?
What methods were used to address those questions? What were the findings? and What
are the implications for nursing practice? Readers can review an abstract to judge
whether to read the full report.
The Introduction
The introduction to a research article acquaints readers with the research problem and
its context. This section usually describes the following:
The central phenomena, concepts, or variables under study
The study purpose and research questions or hypotheses
A review of the related literature
The theoretical or conceptual framework
The significance of and need for the study
Thus, the introduction lets readers know the problem the researcher sought to
address.
Example of introductory material
“Little is known about how the back-to-school transition following cancer treatment
influences adolescents’ developing self-identity and social relationships.” Data from
the adolescent’s perspective are particularly limited . . . The purpose of this study was
to describe how the return to school affects adolescents’ beliefs about themselves,
their self-identity, and their social relationships (Choquette et al., 2015).
In this paragraph, the researchers described the central concept of interest
(experiences of adolescents returning to school after cancer treatment), the need for the
study (the fact that little is known about the experience directly from adolescents), and
the study purpose.
TIP The introduction section of most reports is not specifically labeled
“Introduction.” The report’s introduction immediately follows the abstract.
The Method Section
The method section describes the methods used to answer the research questions. In a
quantitative study, the method section usually describes the following, which may be
presented in labeled subsections:
The research design
The sampling plan
Methods of measuring variables and collecting data
Study procedures, including procedures to protect human rights
Data analysis methods
Qualitative researchers discuss many of the same issues but with different emphases.
For example, a qualitative study often provides more information about the research
setting and the context of the study. Reports of qualitative studies also describe the
researchers’ efforts to enhance the integrity of the study.
The Results Section
The results section presents the findings that were obtained by analyzing the study data.
The text presents a narrative summary of key findings, often accompanied by more
detailed tables. Virtually all results sections contain descriptive information, including a
description of the participants (e.g., average age and the percentages of male, female, and other participants).
In quantitative studies, the results section also reports the following information
relating to statistical tests performed:
The names of statistical tests used. Researchers use statistical tests to test their
hypotheses and to assess the probability that the results are accurate. For example, if the
researcher finds that the average birth weight of drug-exposed infants in the sample
is lower than the birth weight of infants not exposed to drugs, how probable is it that
the same would be true for other infants not in the sample? A statistical test helps
answer the question, Is the relationship between prenatal drug exposure and infant
birth weight real, and would it likely be observed with a new sample from the same
population? Statistical tests are based on common principles; you do not have to
know the names of all statistical tests to comprehend the findings.
The value of the calculated statistic. Computers are used to calculate a numeric value
for the particular statistical test used. The value allows researchers to reach
conclusions about their hypotheses. The actual value of the statistic, however, is not
inherently meaningful and need not concern you.
Statistical significance. A critical piece of information is whether the statistical tests
were significant (not to be confused with clinically important). If a researcher reports
that the results are statistically significant, it means the findings are probably true
and replicable with a new sample. Research reports also indicate the level of
significance, which is an index of how probable it is that the findings are reliable.
For example, if a report indicates that a finding was significant at the .05 probability
level (symbolized as p), this means that only 5 times out of 100 (5 ÷ 100 = .05)
would the obtained result be spurious. In other words, 95 times out of 100, similar
results would be obtained with a new sample. Readers can thus have a high degree of
confidence—but not total assurance—that the results are accurate.
Example from the results section of a quantitative study
Park and coresearchers (2015) tested the effects of a 16-session Patient-Centered
Environment Program (PCEP) on a variety of outcomes for home-dwelling patients
with dementia. Here is a sentence adapted from the reported results: “Findings
showed that agitation (t = 2.91, p < .02) and pain (t = 4.51, p < .002) improved after
receiving the PCEP” (p. 40).
In this example, the researchers indicated that both agitation and pain were
significantly improved following receipt of the PCEP intervention. The changes in
agitation and pain were not likely to have been haphazard and probably would be
replicated with a new sample. These findings are very reliable. For example, with regard
to pain reduction, it was found that an improvement of the magnitude obtained would
occur just as a “fluke” less than 2 times in 1,000 (p < .002). Note that to comprehend
this finding, you do not need to understand what a t statistic is, nor do you need to
concern yourself with the actual value of the t statistic, 4.51.
TIP Results are more reliable if the p value is smaller. For example, there
is a higher probability that the results are accurate when p = .01 (1 in 100
chance of a spurious result) than when p = .05 (5 in 100 chances of a
spurious result). Researchers sometimes report an exact probability (e.g., p =
.03) or a probability below conventional thresholds (e.g., p < .05—less than 5
in 100).
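To make these ideas concrete, here is a minimal Python sketch (not drawn from any study discussed in this chapter; the pain scores are simulated and the numbers invented) showing how a t statistic and its p value are computed and read:

import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Hypothetical pain scores (0-10) for 30 patients before and after an
# intervention; a modest average improvement is built into the simulation.
before = rng.normal(loc=6.0, scale=1.5, size=30)
after = before - rng.normal(loc=1.0, scale=1.0, size=30)

# A paired t-test asks whether the before-after change is larger than
# chance fluctuation alone would plausibly produce.
t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

As the chapter emphasizes, only the p value is needed to judge statistical significance (here, whether it falls below .05); the t value itself need not concern you.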
In qualitative reports, researchers often organize findings according to the major
themes, processes, or categories that were identified in the data. The results section of
qualitative reports sometimes has several subsections, the headings of which correspond
to the researcher’s labels for the themes. Excerpts from the raw data (the actual words
of participants) are presented to support and provide a rich description of the thematic
analysis. The results section of qualitative studies may also present the researcher’s
emerging theory about the phenomenon under study.
Example from the results section of a qualitative study
Larimer and colleagues (2015) studied the experiences, challenges, and coping
behaviors of young adults with pacemakers or implantable cardioverter defibrillators.
Participants described four categories of challenges, one of which was labeled
“Limited support.” Here is an excerpt illustrating that category: “If I go to pediatric
doctors, their waiting rooms have blocks and pink elephants. But in cardiopulmonary
rehab, I’m the youngest by 60 years. It feels like I’m in a no man’s land, stuck in the
middle” (p. 3).
The Discussion Section
In the discussion, the researcher presents conclusions about the meaning and
implications of the findings, i.e., what the results mean, why things turned out the way
they did, how the findings fit with other evidence, and how the results can be used in
practice. The discussion in both quantitative and qualitative reports may include the
following elements:
An interpretation of the results
Clinical and research implications
Study limitations and ramifications for the believability of the results
Researchers are in the best position to point out deficiencies in their studies. A
discussion section that presents the researcher’s grasp of study limitations demonstrates
to readers that the authors were aware of the limitations and probably took them into
account in interpreting the findings.
References
Research articles conclude with a list of the books and articles that were referenced. If
you are interested in additional reading on a topic, the reference list of a recent study is
a good place to begin.
THE STYLE OF RESEARCH JOURNAL ARTICLES
Research reports tell a story. However, the style in which many research journal articles
are written—especially for quantitative studies—makes it difficult for some readers to
understand or become interested in the story.
Why Are Research Articles So Hard to Read?
To unaccustomed audiences, research reports may seem bewildering. Four factors
contribute to this impression:
1. Compactness. Journal space is limited, so authors compress a lot of information into
a small space. Interesting, personalized aspects of the investigation cannot be
reported, and, in qualitative studies, only a handful of supporting quotes can be
included.
2. Jargon. The authors of research articles use research terms that may seem esoteric.
3. Objectivity. Quantitative researchers tend to avoid any impression of subjectivity, so
they tell their research stories in a way that makes them sound impersonal. Most
quantitative research articles are written in the passive voice, which tends to make
the articles less inviting and lively. Qualitative reports, by contrast, are often written
in a more conversational style.
4. Statistical information. In quantitative reports, numbers and statistical symbols may
intimidate readers who do not have statistical training.
A goal of this textbook is to assist you in understanding the content of research
reports and in overcoming anxieties about jargon and statistical information.
HOW-TO-TELL TIP How can you tell if the voice is active or
passive? In the active voice, the article would say what the researchers did
(e.g., “We used a mercury sphygmomanometer to measure blood
pressure”). In the passive voice, the article indicates what was done,
without indicating who did it, although it is implied that the researchers
were the agents (e.g., “A mercury sphygmomanometer was used to measure
blood pressure”).
Tips on Reading Research Articles
As you progress through this book, you will acquire skills for evaluating research
articles, but the skills involved in critical appraisal take time to develop. The first step is
to comprehend research articles. Here are some hints on digesting research reports.
Grow accustomed to the style of research articles by reading them frequently, even
though you may not yet understand the technical points.
Read journal articles slowly. It may be useful to skim the article first to get the major
points and then read the article more carefully a second time.
On the second reading, train yourself to become an active reader. Reading actively
means that you constantly monitor yourself to verify that you understand what you
are reading. If you have difficulty, you can ask someone for help. In most cases, that
“someone” will be your instructor, but also consider contacting the researchers
themselves.
Keep this textbook with you as a reference when you read articles so that you can look
up unfamiliar terms in the glossary or index.
Try not to get bogged down in (or scared away by) statistical information. Try to grasp
the gist of the story without letting symbols and numbers frustrate you.
CRITIQUING RESEARCH REPORTS
A critical reading of a research article involves a careful appraisal of the researcher’s
major conceptual and methodologic decisions. It will be difficult to criticize these
decisions at this point, but your skills will improve as you progress through this book.
What Is a Research Critique?
A research critique is an objective assessment of a study’s strengths and limitations.
Critiques usually conclude with the reviewer’s summary of the study’s merits,
recommendations regarding the value of the evidence, and suggestions about improving
the study or the report.
Research critiques of individual studies are prepared for various reasons, and they
vary in scope. Peer reviewers who are asked to prepare a written critique for a journal
considering publication of a manuscript may evaluate the strengths and weaknesses in
terms of substantive issues (Was the research problem significant to nursing?),
theoretical issues (Were the conceptual underpinnings sound?), methodologic decisions
(Were the methods rigorous, yielding believable evidence?), interpretation (Did the
researcher reach defensible conclusions?), ethics (Were participants’ rights protected?),
and style (Is the report clear, grammatical, and well organized?). In short, peer
reviewers do a comprehensive review to provide feedback to the researchers and to
journal editors about the merit of both the study and the report and typically offer
suggestions for revisions.
Critiques designed to inform evidence-based nursing practice are seldom
comprehensive. For example, it is of little consequence to evidence-based practice
(EBP) that an article is ungrammatical. A critique of the clinical utility of a study
focuses on whether the evidence is accurate, believable, and clinically relevant. These
narrower critiques focus more squarely on appraising the research methods and the
findings themselves.
Students taking a research methods course also may be asked to critique a study.
Such critiques are often intended to cultivate critical thinking and to induce students to
apply newly acquired skills in research methods.
Critiquing Support in This Textbook
We provide several types of support for research critiques. First, detailed critiquing
suggestions relating to chapter content are included at the end of most chapters. Second,
it is always illuminating to have a good model, so we prepared critiques of two studies.
The two studies in their entirety and the critiques are in Appendices C and D.
Third, we offer a set of key critiquing guidelines for quantitative and qualitative
reports in this chapter, in Tables 4.1 and 4.2, respectively. The questions in the
guidelines concern the rigor with which the researchers dealt with critical research
challenges, some of which we outline in the next section.
TIP For those undertaking a comprehensive critique, we offer more
inclusive critiquing guidelines in the Supplement to this chapter on the companion
website.
The second columns of Tables 4.1 and 4.2 list some key critiquing questions, and
the third column cross-references the more detailed guidelines in the various chapters of
the book. We know that most of the critiquing questions are too difficult for you to
answer at this point, but your methodologic and critiquing skills will develop as you
progress through this book.
The question wording in these guidelines calls for a yes or no answer (although it
may well be that the answer sometimes will be “Yes, but . . . ”). In all cases, the
desirable answer is yes; that is, a no suggests a possible limitation and a yes suggests a
strength. Therefore, the more yeses a study gets, the stronger it is likely to be.
Cumulatively, then, these guidelines can suggest a global assessment: A report with 10
yeses is likely to be superior to one with only two. However, these guidelines are not
intended to yield a formal quality “score.”
We acknowledge that our critiquing guidelines have shortcomings. In particular,
they are generic even though critiquing cannot use a one-size-fits-all list of questions.
Important critiquing questions that are relevant to certain studies (e.g., those that have a
Therapy purpose) do not fit into a set of general questions for all quantitative studies.
Thus, you need to use some judgment about whether the guidelines are sufficiently
comprehensive for the type of study you are critiquing. We also note that there are
questions in these guidelines for which there are no totally objective answers. Even
experts sometimes disagree about methodological strategies.
TIP Just as a careful clinician seeks research evidence that certain practices
are or are not effective, you as a reader should demand evidence that the
researchers’ methodological decisions were sound.
Critiquing With Key Research Challenges in Mind
In critiquing a study, it is useful to be aware of the challenges that confront researchers.
For example, they face ethical challenges (e.g., Can the study achieve its goals without
infringing on human rights?), practical challenges (Will I be able to recruit enough
participants?), and methodologic challenges (Will the methods I use yield results that
can be trusted?). Most of this book provides guidance relating to the last question, and
this section highlights key methodologic challenges. This section offers us an
opportunity to introduce important terms and concepts that are relevant in a critique.
The worth of a study’s evidence for nursing practice often relies on how well
researchers deal with these challenges.
Inference
Inference is an integral part of doing and critiquing research. An inference is a
conclusion drawn from the study evidence using logical reasoning and taking into
account the methods used to generate that evidence.
Inference is necessary because researchers use proxies that “stand in” for things that
are fundamentally of interest. A sample of participants is a proxy for an entire
population. A control group that does not receive an intervention is a proxy for what
would happen to the same people if they simultaneously received and did not receive an
intervention.
Researchers face the challenge of using methods that yield good and persuasive
evidence in support of inferences that they wish to make. Readers must draw their own
inferences based on a critique of methodological decisions.
Reliability, Validity, and Trustworthiness
Researchers want their inferences to correspond to the truth. Research cannot contribute
evidence to guide clinical practice if the findings are inaccurate, biased, or fail to
represent the experiences of the target group.
Quantitative researchers use several criteria to assess the quality of a study,
sometimes referred to as its scientific merit. Two especially important criteria are
reliability and validity. Reliability refers to the accuracy and consistency of information
obtained in a study. The term is most often associated with the methods used to measure
variables. For example, if a thermometer measured Alan’s temperature as 98.1°F 1
minute and as 102.5°F the next minute, the thermometer would be unreliable.
Validity is a more complex concept that broadly concerns the soundness of the
study’s evidence. Like reliability, validity is an important criterion for evaluating
methods to measure variables. In this context, the validity question is whether the
methods are really measuring the concepts that they purport to measure. Is a paper-and-
pencil measure of depression really measuring depression? Or is it measuring
something else, such as loneliness or stress? Researchers strive for solid conceptual
definitions of research variables and valid methods to operationalize them.
Another aspect of validity concerns the quality of evidence about the relationship
between the independent variable and the dependent variable. Did a nursing intervention
really bring about improvements in patients’ outcomes—or were other factors
responsible for patients’ progress? Researchers make numerous methodologic decisions
that can influence this type of study validity.
Qualitative researchers use different criteria and terminology in evaluating a study’s
integrity. In general, qualitative researchers discuss methods of enhancing the
trustworthiness of the study’s data and findings (Lincoln & Guba, 1985).
Trustworthiness encompasses several different dimensions—credibility, transferability,
confirmability, dependability, and authenticity—which are described in Chapter 17.
Credibility is an especially important aspect of trustworthiness. Credibility is
achieved to the extent that the research methods inspire confidence that the results are
truthful and accurate. Credibility in a qualitative study can be enhanced in several ways,
but one strategy merits early discussion because it has implications for the design of all
studies, including quantitative ones. Triangulation is the use of multiple sources or
referents to draw conclusions about what constitutes the truth. In a quantitative study,
this might mean having two ways to measure an outcome, to assess whether results are
consistent. In a qualitative study, triangulation might involve efforts to understand the
complexity of a phenomenon by using multiple data collection methods to converge on
the truth (e.g., having in-depth discussions with participants as well as watching their
behavior in natural settings). Nurse researchers are also beginning to triangulate across
paradigms—that is, to integrate both quantitative and qualitative data in a single study
to enhance the validity of the conclusions. We discuss such mixed methods research in
Chapter 13.
Example of triangulation
Montreuil and colleagues (2015) explored helpful nursing care from the perspective
of children with suicide risk factors and their parents. The researchers triangulated
data from observations of the children, debriefing sessions with the children, and
interviews with their parents.
Nurse researchers need to design their studies in such a way that threats to the
reliability, validity, and trustworthiness of their studies are minimized, and users of
research must evaluate the extent to which they were successful.
TIP In reading and critiquing research articles, it is appropriate to have a
“show me” attitude—that is, to expect researchers to build and present a solid
case for the merit of their inferences. They do this by providing evidence that
the findings are reliable and valid or trustworthy.
Bias
Bias can threaten a study’s validity and trustworthiness. A bias is a distortion or
influence that results in an error in inference. Bias can be caused by various factors,
including study participants’ lack of candor, researchers’ preconceptions, or faulty
methods of collecting data.
Some bias is haphazard and affects only small segments of the data. As an example,
a few study participants might provide inaccurate information because they were tired at
the time of data collection. Systematic bias results when the bias is consistent or
uniform. For example, if a scale consistently measured people’s weight as being 2
pounds heavier than their true weight, there would be systematic bias in the data on
weight. Rigorous research methods aim to eliminate or minimize bias.
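The distinction can be illustrated with a brief Python sketch (ours, with invented weights, not an example from the literature): haphazard error averages out across many measurements, whereas the scale’s systematic 2-pound error does not:

import numpy as np

rng = np.random.default_rng(seed=0)
true_weight = 150.0  # a person's actual weight in pounds

# Haphazard error: readings scatter randomly around the true value.
haphazard = true_weight + rng.normal(loc=0.0, scale=2.0, size=1000)

# Systematic bias: every reading is shifted 2 pounds too heavy.
systematic = true_weight + 2.0 + rng.normal(loc=0.0, scale=2.0, size=1000)

print(f"Mean reading, haphazard error:  {haphazard.mean():.1f}")  # about 150.0
print(f"Mean reading, systematic bias: {systematic.mean():.1f}")  # about 152.0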
Researchers adopt a variety of strategies to address bias. Triangulation is one such
approach, the idea being that multiple sources of information or points of view offer
avenues to identify biases. In quantitative research, methods to combat bias often entail
research control.
Research Control
A central feature of most quantitative studies is that they involve efforts to control
aspects of the research. Research control usually involves holding constant influences
on the outcome variable so that the true relationship between the independent and
outcome variables can be understood. In other words, research control attempts to
eliminate contaminating factors that might cloud the relationship between the variables
that are of central interest.
Contaminating factors, often called confounding (or extraneous) variables, can
best be illustrated with an example. Suppose we were studying whether urinary
incontinence (UI) leads to depression. Prior evidence suggests that this is the case, but
previous studies have not clarified whether it is UI per se or other factors that contribute
to risk of depression. The question is whether UI itself (the independent variable)
contributes to higher levels of depression or whether there are other factors that can
account for the relationship between UI and depression. We need to design a study to
control other determinants of the outcome—determinants that are also related to the
independent variable, UI.
One confounding variable here is age. Levels of depression tend to be higher in
older people, and people with UI tend to be older than those without this problem. In
other words, perhaps age is the real cause of higher depression in people with UI. If age
is not controlled, then any observed relationship between UI and depression could be
caused by UI, or by age.
Three possible explanations might be portrayed schematically as follows:
1. UI→depression
2. Age→UI→depression
3. Age→UI, Age→depression, and UI→depression (rendered as a diagram in the original)
The arrows symbolize a causal mechanism or influence. In model 1, UI directly
affects depression, independently of other factors. In model 2, UI is a mediating
variable—the effect of age on depression is mediated by UI. According to this
representation, age affects depression through the effect that age has on UI. In model 3,
both age and UI have separate effects on depression, and age also increases the risk of
UI. Some research is specifically designed to test paths of mediation and multiple
causations, but in the present example, age is extraneous to the research question. We
want to design a study that tests the first explanation. Age must be controlled if our goal
is to explore the validity of model 1, which posits that, no matter what a person’s age,
having UI makes a person more vulnerable to depression.
How can we impose such control? There are a number of ways, as we discuss in
Chapter 9, but the general principle underlying each alternative is that the confounding
variable must be held constant. The confounding variable must somehow be handled so
that, in the context of the study, it is not related to the independent variable or the
outcome. As an example, let us say we wanted to compare the average scores on a
depression scale for those with and without UI. We would want to design a study in
such a way that the ages of those in the UI and non-UI groups are comparable, even
though, in general, the groups are not comparable in terms of age.
By exercising control over age, we would be taking a step toward understanding the
relationship between UI and depression. The world is complex, and many variables are
interrelated in complicated ways. The value of evidence in quantitative studies is often
related to how well researchers control confounding influences.
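The logic of holding a confounder constant can be demonstrated with a small simulation, sketched below in Python (the numbers are invented and the model deliberately simple; this is not an analysis from any cited study). Age drives both UI and depression in the simulated data, so the two appear related even though UI has no direct effect; comparing people within a narrow age band, so that age is roughly constant, shrinks the spurious gap toward zero:

import numpy as np

rng = np.random.default_rng(seed=42)
n = 5000

age = rng.uniform(40, 90, size=n)
# UI becomes more likely with advancing age; depression is driven by age alone.
ui = rng.random(n) < (age - 40) / 100
depression = 0.5 * age + rng.normal(0, 10, size=n)

# Naive comparison: the UI group looks considerably more depressed...
print("Unadjusted difference:",
      round(depression[ui].mean() - depression[~ui].mean(), 2))

# ...but with age held roughly constant, the gap largely disappears.
band = (age >= 60) & (age < 65)
print("Within ages 60-64:",
      round(depression[ui & band].mean() - depression[~ui & band].mean(), 2))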
Research rooted in the constructivist paradigm does not impose controls. With their
emphasis on holism and individual human experience, qualitative researchers typically
believe that imposing controls removes some of the meaning of reality.
Bias Reduction: Randomness and Blinding
For quantitative researchers, a powerful tool for eliminating bias involves randomness
—having certain features of the study established by chance rather than by researcher
preference. When people are selected at random to participate in a study, for example,
each person in the initial pool has an equal chance of being selected. This in turn means
that there are no systematic biases in the makeup of the sample. Men and women have
an equal chance of being selected, for example. Similarly, if participants are allocated at
random to groups that will be compared (e.g., a special intervention and “usual care”
group), then there are no systematic biases in the groups’ composition. Randomness is a
compelling method of controlling confounding variables and reducing bias.
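A minimal sketch of random assignment (the participant identifiers are hypothetical) shows the principle at work: chance alone determines group membership, so neither researcher preference nor participant characteristics can systematically shape the groups’ composition:

import random

random.seed(7)  # fixed seed only so that the illustration is reproducible

participants = [f"participant_{i:02d}" for i in range(1, 21)]
random.shuffle(participants)  # ordering is now determined purely by chance

half = len(participants) // 2
intervention_group = participants[:half]  # e.g., receives the new intervention
usual_care_group = participants[half:]    # e.g., receives usual care

print("Intervention:", intervention_group)
print("Usual care:  ", usual_care_group)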
Another bias-reducing strategy is called blinding (or masking), which is used in
some quantitative studies to prevent biases stemming from people’s awareness.
Blinding involves concealing information from participants, data collectors, or care
providers to enhance objectivity. For example, if study participants are aware of
whether they are getting an experimental drug or a sham drug (a placebo), then their
outcomes could be influenced by their expectations of the new drug’s efficacy. Blinding
involves disguising or withholding information about participants’ status in the study
(e.g., whether they are in a certain group) or about the study hypotheses.
Example of randomness and blinding
Da Silva and colleagues (2015) studied the effect of foot reflexology on tissue
integrity and impairment of the feet among people with type 2 diabetes mellitus.
Their sample of 45 people with diabetes was randomly assigned to one of two groups
—one group received guidelines on foot care plus 12 sessions of foot reflexology,
and the other group received the guidelines only. The person who assessed foot
impairment was blinded to which group the participants were in.
Qualitative researchers do not consider randomness or blinding desirable tools for
understanding phenomena. A researcher’s judgment is viewed as an indispensable
vehicle for uncovering the complexities of the phenomena of interest.
Reflexivity
Qualitative researchers are also interested in discovering the truth about human
experience. Qualitative researchers often rely on reflexivity to guard against personal
bias. Reflexivity is the process of reflecting critically on the self and of analyzing and
noting personal values that could affect data collection and interpretation. Qualitative
researchers are trained to explore these issues, to be reflective about decisions made
during the inquiry, and to record their thoughts in personal diaries and memos.
Example of reflexivity
Sanon and colleagues (2016) examined the role of transnationalism (maintenance of
relationships and activities that transcend borders across countries) among Haitian
immigrants in terms of hypertension self-management. By means of reflexivity, the
primary researcher “considered her historical, social, and political context and
position as they influenced her reflections, and the meanings she ascribed to the
participants’ accounts” (p. 150). The researcher also reflected on the inequality in
power relationship between the participants and herself.
TIP Reflexivity can be a useful tool in quantitative as well as qualitative
research—self-awareness and introspection can enhance the quality of any
study.
Generalizability and Transferability
Nurses increasingly rely on evidence from disciplined research as a guide in their
clinical practice. EBP is based on the assumption that study findings are not unique to
the people, places, or circumstances of the original research.
As noted in Chapter 1, generalizability is the criterion used in quantitative studies to
assess the extent to which the findings can be applied to other groups and settings. How
do researchers enhance the generalizability of a study? First and foremost, they must
design studies strong in reliability and validity. There is little point in wondering
whether results are generalizable if they are not accurate or valid. In selecting
participants, researchers must also give thought to the types of people to whom results
might be generalized—and then select subjects accordingly. If a study is intended to
have implications for male and female patients, then men and women should be
included as participants.
Qualitative researchers do not specifically aim for generalizability, but they do want
to generate knowledge that might be useful in other situations. Lincoln and Guba
(1985), in their influential book on naturalistic inquiry, discuss the concept of
transferability, the extent to which qualitative findings can be transferred to other
settings, as another aspect of trustworthiness. An important mechanism for promoting
transferability is the amount of rich descriptive information qualitative researchers
provide about study contexts.
Abstracts for a quantitative and a qualitative nursing study are presented in the
following sections. Read the abstracts for Examples 1 and 2 and then answer
the critical thinking questions that follow. Examples 1 and 2 are featured on
the interactive Critical Thinking Activity on the companion website. The critical
thinking questions for Examples 3 and 4 are based on the studies that appear
in their entirety in Appendices A and B of this book. Our comments for these
exercises are in the Student Resources section on the companion website.
EXAMPLE 1: QUANTITATIVE RESEARCH
Study: Relationships among daytime napping and fatigue, sleep quality, and
quality of life in cancer patients (Sun & Lin, 2016)
Background: The relationships among napping and sleep quality, fatigue,
and quality of life (QOL) in cancer patients are not clearly understood.
Objective: The aim of the study was to determine whether daytime napping
is associated with nighttime sleep, fatigue, and QOL in cancer patients.
Methods: In total, 187 cancer patients were recruited. Daytime napping,
nighttime self-reported sleep, fatigue, and QOL were assessed using a
questionnaire. Objective sleep parameters were collected using a wrist
actigraph.
Results: According to waking-after-sleep-onset measurements, patients who
napped during the day experienced poorer nighttime sleep than did patients
who did not (t = −2.44, p = .02). Daytime napping duration was significantly
negatively correlated with QOL. Patients who napped after 4 pm had poorer
sleep quality (t = −1.93, p = .05) and a poorer Short-Form Health Survey
mental component score (t = 2.06, p = .04) than did patients who did not.
Fatigue, daytime napping duration, and sleep quality were significant
predictors of the mental component score and physical component score,
accounting for 45.7% and 39.3% of the variance, respectively.
Conclusions: Daytime napping duration was negatively associated with
QOL. Napping should be avoided after 4 pm.
Implications for Practice: Daytime napping affects the QOL of cancer
patients. Future research can determine the role of napping in the sleep
hygiene of cancer patients.
Critical Thinking Exercises
1. Consider the following targeted questions:
a. What were the independent and dependent variables in this study? What
are the PICO components?
b. Is this study experimental or nonexperimental?
c. How, if at all, was randomness used in this study?
d. How, if at all, was blinding used in this study?
e. Did the researchers use any statistical tests? If yes, were any of the
results statistically significant?
2. If the results of this study are valid and generalizable, what might be some
of the uses to which the findings could be put in clinical practice?
EXAMPLE 2: QUALITATIVE RESEARCH
Study: Adolescents’ lived experiences while hospitalized after surgery for
ulcerative colitis (Olsen et al., 2016)
Abstract: Adolescents are in a transitional phase of life characterized by
major physical, emotional, and psychological challenges. Living with
ulcerative colitis is experienced as a reduction of their life quality. Initial
treatment of ulcerative colitis is medical, but surgery may be necessary when
medical treatment ceases to have an effect. No research-based studies of
adolescents’ experience of the hospital period after surgery for ulcerative
colitis exist. The objective of the study was to identify and describe
adolescents’ lived experiences while hospitalized after surgery for ulcerative
colitis. This qualitative study was based on interviews with eight adolescents.
Analysis and interpretation were based on a hermeneutic interpretation of
meaning. Three themes were identified: Body: Out of order, Seen and
understood, and Where are all the others? The adolescents experience a
postoperative period characterized by physical and mental impairment. Being
mentally unprepared for such challenges, they shun communication and
interaction. The findings demonstrate the importance of individualized
nursing care on the basis of the adolescent’s age, maturity, and individual
needs. Further study of adolescent patients’ hospital stay, focusing on the
implications of being young and ill at the same time, is needed.
Critical Thinking Exercises
1. Consider the following targeted questions:
a. On which qualitative research tradition, if any, was this study based?
b. Is this study experimental or nonexperimental?
c. How, if at all, was randomness used in this study?
d. Is there any indication in the abstract that triangulation was used?
Reflexivity ?
2. If the results of this study are trustworthy and transferable, what might be
some of the uses to which the findings could be put in clinical practice?
3. Compare the two abstracts in Examples 1 and 2. The first is structured,
with specific headings, whereas the second is a more “traditional” format
consisting of a single paragraph. Which do you prefer? Why?
EXAMPLE 3: QUANTITATIVE RESEARCH IN APPENDIX A
• Read the introduction and methods section of Swenson and colleagues’
(2016) study (“Parents’ use of praise and criticism in a sample of young
children seeking mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the following targeted questions:
a. Did this article follow a traditional IMRAD format? Where does the
introduction to this article begin and end?
b. How, if at all, was randomness used in this study?
c. How, if at all, was blinding used?
d. Comment on the possible generalizability of the study findings.
EXAMPLE 4: QUALITATIVE RESEARCH IN APPENDIX B
• Read the abstract of Beck and Watson’s (2010) study (“Subsequent
childbirth after a previous traumatic birth”) in Appendix B of this book.
Critical Thinking Exercises
1. Answer the following targeted questions, which may assist you in
assessing aspects of the study’s merit:
a. Where does the introduction to this article begin and end?
b. How, if at all, was randomness used in this study?
c. Is there any indication in the abstract that triangulation was used?
Reflexivity ?
d. Comment on the possible transferability of the study findings.
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the companion website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Guides to Overall Critiques of Research Reports
• Answers to the Critical Thinking Exercises for Examples 3 and 4
• Internet Resources with useful websites for Chapter 4
• A Wolters Kluwer journal article in its entirety—the study described as
Example 1 on p. 73.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
Both quantitative and qualitative researchers disseminate their findings, most often
by publishing reports of their research as journal articles, which concisely
describe what researchers did and what they found.
Journal articles often consist of an abstract (a synopsis of the study) and four
major sections that often follow the IMRAD format: an Introduction (the
research problem and its context); Method section (the strategies used to answer
research questions); Results (study findings); and Discussion (interpretation and
implications of the findings).
Research reports are often difficult to read because they are dense, concise, and
contain jargon. Quantitative research reports may be intimidating at first because,
compared to qualitative reports, they are more impersonal and report on statistical
tests.
Statistical tests are used to test hypotheses and to evaluate the reliability of the
findings. Findings that are statistically significant have a high probability of
being “real.”
A goal of this book is to help students to prepare a research critique, which is a
critical appraisal of the strengths and limitations of a study, often to assess the
worth of the evidence for nursing practice.
Researchers face numerous challenges, the solutions to which must be considered
in critiquing a study because they affect the inferences that can be made.
An inference is a conclusion drawn from the study evidence, taking into account
the methods used to generate that evidence. Researchers strive to have their
inferences correspond to the truth.
Reliability (a key challenge in quantitative research) refers to the accuracy of
information obtained in a study. Validity broadly concerns the soundness of the
study’s evidence—that is, whether the findings are convincing and well grounded.
Trustworthiness in qualitative research encompasses several different
dimensions, including credibility, dependability, confirmability, transferability,
and authenticity.
Credibility is achieved to the extent that the methods engender confidence in the
truth of the data and in the researchers’ interpretations. Triangulation, the use of
multiple sources to draw conclusions about the truth, is one approach to enhancing
credibility.
A bias is an influence that produces a distortion in the study results. In quantitative
studies, research control is an approach to addressing bias. Research control is
used to hold constant outside influences on the dependent variable so that the
relationship between the independent and dependent variables can be better
understood.
Researchers seek to control confounding (or extraneous) variables—variables
that are extraneous to the purpose of a specific study.
For quantitative researchers, randomness—having certain features of the study
established by chance—is a powerful tool to eliminate bias.
Blinding (or masking) is sometimes used to avoid biases stemming from
participants’ or research agents’ awareness of study hypotheses or research status.
Reflexivity, the process of reflecting critically on the self and of scrutinizing
personal values that could affect data collection and interpretation, is an important
tool in qualitative research.
Generalizability in a quantitative study concerns the extent to which the findings
can be applied to other groups and settings.
A similar concept in qualitative studies is transferability, the extent to which
qualitative findings can be transferred to other settings. One mechanism for
promoting transferability is a rich and thorough description of the research context
so that others can make inferences about contextual similarities.
REFERENCES FOR CHAPTER 4
Choquette, A., Rennick, J., & Lee, V. (2015). Back to school after cancer treatment: Making sense of the
adolescent experience. Cancer Nursing. Advance online publication.
*da Silva, N., Chaves, É., de Carvalho, E., Carvalho, L., & Iunes, D. (2015). Foot reflexology in feet impairment of
people with type 2 diabetes mellitus: Randomized trial. Revista Latino-Americana de Enfermagem, 23, 603–
610.
Larimer, K., Durmus, J., & Florez, E. (2015). Experiences of young adults with pacemakers and/or implantable
cardioverter defibrillators. Journal of Cardiovascular Nursing. Advance online publication.
Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park, CA: Sage.
Montreuil, M., Butler, K., Stachura, M., & Pugnaire-Gros, C. (2015). Exploring helpful nursing care in pediatric
mental health settings: The perceptions of children with suicide risk factors and their parents. Issues in Mental
Health Nursing, 36, 849–859.
Olsen, I., Jensen, S., Larsen, L., & Sørensen, E. (2016). Adolescents’ lived experiences while hospitalized after
surgery for ulcerative colitis. Gastroenterology Nursing, 39, 287–296.
Park, H., Chun, Y., & Gang, M. (2015). Effects of the Patient-Centered Environment Program on behavioral and
emotional problems in home-dwelling patients with dementia. Journal of Gerontological Nursing, 41, 40–48.
Sanon, M. A., Spigner, C., & McCullagh, M. C. (2016). Transnationalism and hypertension self-management
among Haitian immigrants. Journal of Transcultural Nursing, 27, 147–156.
**Sun, J. L., & Lin, C. C. (2016). Relationships among daytime napping and fatigue, sleep quality, and quality of
life in cancer patients. Cancer Nursing. Advance online publication.
*A link to this open-access article is provided in the Internet Resources section on the companion website.
**This journal article is available on the companion website for this chapter.
5 Ethics in Research
Learning Objectives
On completing this chapter, you will be able to:
Discuss the historical background that led to the creation of various codes of ethics
Understand the potential for ethical dilemmas stemming from conflicts between ethics
and research demands
Identify the three primary ethical principles articulated in the Belmont Report and the
important dimensions encompassed by each
Identify procedures for adhering to ethical principles and protecting study participants
Given sufficient information, evaluate the ethical dimensions of a research report
Define new terms in the chapter
Key Terms
Anonymity
Assent
Belmont Report
Beneficence
Certificate of Confidentiality
Code of ethics
Confidentiality
Consent form
Debriefing
Ethical dilemma
Full disclosure
Informed consent
Institutional Review Board (IRB)
Minimal risk
Risk/benefit assessment
Stipend
Vulnerable group
ETHICS AND RESEARCH
In any research with human beings or animals, researchers must address ethical issues.
Ethical concerns are especially prominent in nursing research because the line between
what constitutes the expected practice of nursing and the collection of research data
sometimes gets blurred. This chapter discusses ethical principles that should be kept in
mind when reading a study.
Historical Background
We might like to think that violations of moral principles among researchers occurred
centuries ago rather than recently, but this is not the case. The Nazi medical experiments
of the 1930s and 1940s are the most famous example of recent disregard for ethical
conduct. The Nazi program of research involved using prisoners of war and “racial
enemies” in medical experiments. The studies were unethical not only because they
exposed people to harm but also because subjects could not refuse participation.
There are more recent examples. For instance, between 1932 and 1972, the
Tuskegee Syphilis Study, sponsored by the U.S. Public Health Service, investigated the
effects of syphilis among 400 poor African American men. Medical treatment was
deliberately withheld to study the course of the untreated disease. It was revealed in
1993 that U.S. federal agencies had sponsored radiation experiments since the 1940s on
hundreds of people, many of them prisoners or elderly hospital patients. And, in 2010, it
was revealed that a U.S. doctor who worked on the Tuskegee study inoculated prisoners
in Guatemala with syphilis in the 1940s. Other examples of studies with ethical
transgressions have emerged to give ethical concerns the high visibility they have today.
Codes of Ethics
In response to human rights violations, various codes of ethics have been developed.
The ethical standards known as the Nuremberg Code were developed in 1949 in
response to the Nazi atrocities. Several other international standards have been
developed, including the Declaration of Helsinki, which was adopted in 1964 by the
World Medical Association and was most recently revised in 2013.
Most disciplines, such as medicine and nursing, have established their own code of
ethics. In the United States, the American Nurses Association (ANA) issued Ethical
Guidelines in the Conduct, Dissemination, and Implementation of Nursing Research in
1995 (Silva, 1995). The ANA, which declared 2015 the Year of Ethics, published a
revised Code of Ethics for Nurses with Interpretive Statements, a document that
primarily covers ethical issues for practicing nurses but also includes principles
that apply to nurse researchers. In Canada, the Canadian Nurses Association published
the third edition of its Ethical Research Guidelines for Registered Nurses in 2002. And,
the International Council of Nurses (ICN) developed the ICN Code of Ethics for Nurses,
which was updated in 2012.
TIP Many useful websites are devoted to ethics and research, links to some
of which are listed in the Internet Resources for this chapter on the companion
website.
Government Regulations for Protecting Study
Participants
Governments throughout the world fund research and establish rules for adhering to
ethical principles. In the United States, an important code of ethics was adopted by the
National Commission for the Protection of Human Subjects of Biomedical and
Behavioral Research. The commission issued a report in 1978, known as the Belmont
Report, which provided a model for many guidelines adopted by disciplinary
organizations in the United States. The Belmont Report also served as the basis for
regulations affecting research sponsored by the U.S. government, including studies
supported by the National Institute of Nursing Research (NINR). The U.S. ethical
regulations have been codified at Title 45 Part 46 of the Code of Federal Regulations
and were revised most recently in 2005.
Ethical Dilemmas in Conducting Research
Research that violates ethical principles typically occurs because a researcher believes
that knowledge is potentially beneficial in the long run. For some research problems,
participants’ rights and study quality are put in direct conflict, posing ethical dilemmas
for researchers. Here are examples of research problems in which the desire for rigor
conflicts with ethical considerations:
1. Research question: Does a new medication prolong life in patients with AIDS?
Ethical dilemma: The best way to test the effectiveness of an intervention is to
administer the intervention to some participants but withhold it from others to see if
the groups have different outcomes. However, if the intervention is untested (e.g., a
new drug), the group receiving the intervention may be exposed to potentially
hazardous side effects. On the other hand, the group not receiving the drug may be
denied a beneficial treatment.
2. Research question: Are nurses equally empathic in their treatment of male and
female patients in the intensive care unit (ICU)?
Ethical dilemma: Ethics require that participants be aware of their role in a study.
Yet, if the researcher informs nurse participants that their empathy in treating male
and female ICU patients will be scrutinized, will their behavior be “normal”? If the
nurses’ usual behavior is altered because of the known presence of research
observers, then the findings will be inaccurate.
3. Research question: How do parents cope when their children have a terminal illness?
Ethical dilemma: To answer this question, the researcher may need to probe into
parents’ psychological state at a vulnerable time, yet knowledge of the parents’
coping mechanisms might help to design effective ways of addressing parents’ grief
and stress.
4. Research question: What is the process by which adult children adapt to the day-to-
day stress of caring for a parent with Alzheimer’s disease?
Ethical dilemma: Sometimes, especially in qualitative studies, a researcher may get
so close to participants that they become willing to share “secrets” and privileged
information. Interviews can become confessions—sometimes of unseemly or illegal
behavior. In this example, suppose a woman admitted to physically abusing her
mother—how does the researcher respond to that information without undermining a
pledge of confidentiality? And, if the researcher divulges the information to
authorities, how can a pledge of confidentiality be given in good faith to other
participants?
As these examples suggest, researchers are sometimes in a bind. Their goal is to
develop high-quality evidence for practice, but they must also adhere to rules for
protecting human rights. Another type of dilemma may arise if nurse researchers face
conflict-of-interest situations, in which their expected behavior as nurses conflicts with
standard research behavior (e.g., deviating from a research protocol to assist a patient).
It is precisely because of such dilemmas that codes of ethics are needed to guide
researchers’ efforts.
ETHICAL PRINCIPLES FOR PROTECTING STUDY
PARTICIPANTS
The Belmont Report articulated three primary ethical principles on which standards of
ethical research conduct are based: beneficence, respect for human dignity, and justice.
We briefly discuss these principles and then describe methods researchers use to comply
with them.
Beneficence
Beneficence imposes a duty on researchers to minimize harm and maximize benefits.
Human research should be intended to produce benefits for participants or, more
typically, for others. This principle covers multiple aspects.
The Right to Freedom From Harm and Discomfort
Researchers have an obligation to prevent or minimize harm in studies with humans.
Participants must not be subjected to unnecessary risks of harm or discomfort, and their
participation in research must be necessary for achieving societally important aims. In
research with humans, harm and discomfort can be physical (e.g., injury), emotional
(e.g., stress), social (e.g., loss of social support), or financial (e.g., loss of wages).
Ethical researchers must use strategies to minimize all types of harms and discomforts,
even ones that are temporary.
Protecting human beings from physical harm is often straightforward, but it may be
more difficult to address psychological issues. For example, participants may be asked
questions about their personal lives. Such queries might lead people to reveal deeply
personal information. The need for sensitivity may be greater in qualitative studies,
which often involve in-depth exploration into highly personal areas. Researchers need to
be aware of the nature of the intrusion on people’s psyches.
The Right to Protection From Exploitation
Involvement in a study should not place participants at a disadvantage. Participants need
to be assured that their participation, or information they provide, will not be used
against them. For example, people describing their economic situation should not risk
loss of public health benefits; people reporting drug abuse should not fear being
reported for a crime.
Study participants enter into a special relationship with researchers, and this
relationship should not be exploited. Because nurse researchers may have a nurse–
patient (in addition to a researcher–participant) relationship, special care may be needed
to avoid exploiting that bond. Patients’ consent to participate in a study may result from
their understanding of the researcher’s role as nurse, not as researcher.
In qualitative research, psychological distance between researchers and participants
often declines as the study progresses. The emergence of a pseudotherapeutic
relationship is not uncommon, and it could heighten the risk that exploitation will
inadvertently occur. On the other hand, qualitative researchers often are in a better
position than quantitative researchers to do good, rather than just to avoid doing harm,
because of the close relationships they develop with participants.
Example of therapeutic research experiences
Beck et al. (2015) found that some participants in their study on secondary traumatic
stress among certified nurse-midwives told the researchers that writing about the
traumatic births they had attended was therapeutic for them. One participant wrote, “I
think it’s fascinating how little respect our patients and coworkers give to the
traumatic experiences we suffer. It is healing to be able to write out my experiences
in this study and actually have researchers interested in studying this topic.”
Respect for Human Dignity
Respect for human dignity is the second ethical principle in the Belmont Report. This
principle includes the right to self-determination and the right to full disclosure.
The Right to Self-Determination
The principle of self-determination means that prospective participants have the right to
decide voluntarily whether to participate in a study, without risking prejudicial
treatment. It also means that people have the right to ask questions, to refuse to answer
questions, and to drop out of the study.
A person’s right to self-determination includes freedom from coercion. Coercion
involves explicit or implicit threats of penalty for failing to participate in a study or
excessive rewards for agreeing to participate. The issue of coercion requires careful
thought when researchers are in a position of authority or influence over potential
participants, as might be the case in a nurse–patient relationship. Coercion can be subtle.
For example, a generous monetary incentive (or stipend) to encourage the participation
of a low-income group (e.g., the homeless) might be considered mildly coercive
because such incentives may be seen as a form of pressure.
The Right to Full Disclosure
Respect for human dignity encompasses people’s right to make informed decisions
about study participation, which requires full disclosure. Full disclosure means that the
researcher has fully described the study, the person’s right to refuse participation, and
potential risks and benefits. The right to self-determination and the right to full
disclosure are the two elements on which informed consent (discussed later in this
chapter) is based.
Full disclosure is not always straightforward because it can create biases and sample
recruitment problems. Suppose we were testing the hypothesis that high school students
with a high absentee rate are more likely to be substance abusers than students with
good attendance. If we approached potential participants and fully explained the study’s
purpose, some students might refuse to participate, and nonparticipation would be
selective; students who are substance abusers—the group of primary interest—might be
least likely to participate. Moreover, by knowing the study purpose, those who
participate might not give candid responses. In such a situation, full disclosure could
undermine the study.
In such situations, researchers sometimes use covert data collection (concealment),
which is collecting data without participants’ knowledge and thus without their consent.
This might happen if a researcher wanted to observe people’s behavior and was worried
that doing so openly would change the behavior of interest. Researchers might choose
to obtain needed information through concealed methods, such as observing while
pretending to be engaged in other activities.
A more controversial technique is the use of deception, which can involve
deliberately withholding information about the study or providing participants with
false information. For example, in studying high school students’ use of drugs, we
might describe the research as a study of students’ health practices, which is a mild form
of misinformation.
Deception and concealment are problematic ethically because they interfere with
people’s right to make truly informed decisions about personal costs and benefits of
participation. Some people think that deception is never justified, but others believe that
if the study involves minimal risk yet offers benefits to society, then deception may be
acceptable.
Full disclosure has emerged as a concern in connection with data collected over the
Internet (e.g., analyzing the content of messages posted to blogs or social media sites).
The issue is whether such messages can be used as data without the authors’ consent.
Some researchers believe that anything posted electronically is in the public domain, but
others feel that the same ethical standards must apply in cyberspace research and that
researchers must carefully protect the rights of individuals who are participants in
“virtual” communities.
Justice
The third principle articulated in the Belmont Report concerns justice, which includes
participants’ right to fair treatment and their right to privacy.
The Right to Fair Treatment
One aspect of justice concerns the equitable distribution of benefits and burdens of
research. The selection of participants should be based on research requirements and not
on people’s vulnerabilities. For example, groups with lower social standing (e.g.,
prisoners) have sometimes been selected as study participants, raising ethical concerns.
Potential discrimination is another aspect of distributive justice. During the 1990s, it
was found that women and minorities were being excluded from many clinical studies.
In the United States, this led to regulations requiring that researchers who seek funding
from the National Institutes of Health (including NINR) include women and minorities
as study participants.
The right to fair treatment encompasses other obligations. For example, researchers
must treat people who decline to participate in a study in a nonprejudicial manner, they
must honor all agreements made with participants, they must show respect for the
beliefs of people from different backgrounds, and they must treat participants
courteously and tactfully at all times.
The Right to Privacy
Research with humans involves intrusions into people’s lives. Researchers should
ensure that their research is not more intrusive than it needs to be and that privacy is
maintained. Participants have the right to expect that any data they provide will be kept
in strict confidence.
Privacy issues have become even more salient in the U.S. health care community
since the passage of the Health Insurance Portability and Accountability Act of 1996
(HIPAA), which articulates federal standards to protect patients’ medical records and
health information. For health care providers who transmit health information
electronically, compliance with HIPAA regulations (the Privacy Rule) has been
required since 2003.
PROCEDURES FOR PROTECTING STUDY
PARTICIPANTS
Now that you are familiar with ethical principles for conducting research, you need to
understand the procedures researchers use to adhere to them. It is these procedures that
should be evaluated in critiquing the ethical aspects of a study.
TIP Information about ethical considerations is usually presented in the
method section of a research report, often in a subsection labeled
“Procedures.”
Risk/Benefit Assessments
One strategy that researchers use to protect participants is to conduct a risk/benefit
assessment. Such an assessment is designed to evaluate whether the benefits of
participating in a study are in line with the costs—i.e., whether the risk/benefit ratio is
acceptable. Box 5.1 summarizes major costs and benefits of research participation to
study participants. Benefits to society and to nursing should also be taken into account.
The selection of a significant topic that has the potential to improve patient care is the
first step in ensuring that research is ethical.
Box 5.1 Potential Benefits and Risks of Research to Participants
Major Potential Benefits to Participants
Access to a potentially beneficial intervention that might otherwise be unavailable
Reassurance in being able to discuss their situation or problem with a friendly,
objective person
Increased knowledge about themselves or their conditions
Escape from normal routine
Satisfaction that information they provide may help others with similar problems
Direct gains through stipends or other incentives
Major Potential Risks to Participants
Physical harm, including unanticipated side effects
Physical discomfort, fatigue, or boredom
Emotional distress from self-disclosure, discomfort with strangers, embarrassment
relating to questions being asked
Social risks, such as the risk of stigma, negative effects on personal relationships
Loss of privacy
Loss of time
Monetary costs (e.g., for transportation, child care, time lost from work)
TIP In evaluating the risk/benefit ratio of a study, you might want to
consider how comfortable you would have felt about being a study
participant.
In some cases, risks may be negligible. Minimal risk is a risk expected to be no greater than the risks ordinarily encountered in daily life or during routine procedures.
When the risks are not minimal, researchers must proceed with caution, taking every
step possible to reduce risks and maximize benefits.
Informed Consent
An important procedure for safeguarding participants involves obtaining their informed
consent. Informed consent means that participants have adequate information about the
study, comprehend the information, and have the power of free choice, enabling them to
consent to or decline participation voluntarily.
Researchers usually document informed consent by having participants sign a
consent form. This form includes information about the study purpose, specific
expectations regarding participation (e.g., how much time will be required), the
voluntary nature of participation, and potential costs and benefits.
TIP The chapter supplement on the book's companion website provides additional information about the content of informed consent forms as well as an actual example from a study by one of the book's authors (Beck).
Example of informed consent
Kelley and coresearchers (2015) studied the evolution of case management services
for U.S. service members injured in Iraq and Afghanistan. A total of 235 nurses were interviewed about their patient care experiences. Written informed consent was obtained from
study participants. The consent form outlined information pertaining to the
divulgence of illegal activities. Prior to each interview, investigators reminded
participants not to divulge information that might be interpreted as sensitive or
classified.
Researchers often do not obtain written informed consent when data collection is through self-administered questionnaires. Instead, they typically assume implied consent (i.e., the return of a completed questionnaire implies the person's consent to participate).
In qualitative studies that involve repeated data collection, it may be difficult to
obtain meaningful consent at the outset. Because the design emerges during the study,
researchers may not know what the risks and benefits will be. In such situations, consent
may be an ongoing process, called process consent, in which consent is continuously
renegotiated.
Confidentiality Procedures
Study participants have the right to expect that the data they provide will be kept in
strict confidence. Participants’ right to privacy is protected through confidentiality
procedures.
Anonymity
Anonymity, the most secure means of protecting confidentiality, occurs when the
researcher cannot link participants to their data. For example, if questionnaires were
distributed to a group of nursing home residents and were returned without any
identifying information, responses would be anonymous.
Example of anonymity
Melnyk and colleagues (2016) conducted a study to identify key factors that
influenced healthy lifestyle behaviors in 3,959 faculty and staff at one large
university. Participants completed an anonymous online survey that asked questions
about participants’ healthy lifestyle beliefs and behaviors and perceptions about the
wellness culture.
Confidentiality in the Absence of Anonymity
When anonymity is not possible, other confidentiality procedures need to be
implemented. A promise of confidentiality is a pledge that any information participants
provide will not be publicly reported in a manner that identifies them and will not be
made accessible to others.
Researchers can take a number of steps to ensure that a breach of confidentiality
does not occur. These include maintaining identifying information in locked files,
substituting identification (ID) numbers for participants’ names on records, and
reporting only aggregate data for groups of participants.
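To make these procedures concrete, the following is a minimal sketch in Python using entirely hypothetical records and invented names (the field names and ID format are illustrative assumptions, not part of any actual study protocol). It shows two of the steps just described: substituting ID numbers for participants' names, with the name-to-ID key kept separate from the data, and reporting only aggregate results.

# Illustrative sketch only: hypothetical participant records, not real data.
records = [
    {"name": "Ann Smith", "age": 34, "score": 12},
    {"name": "Bea Jones", "age": 41, "score": 15},
]

# Substitute ID numbers for names; the name-to-ID key would be stored
# separately from the data (e.g., in a locked or encrypted file).
id_key = {rec["name"]: f"P{i + 1:03d}" for i, rec in enumerate(records)}
deidentified = [
    {"id": id_key[rec["name"]], "age": rec["age"], "score": rec["score"]}
    for rec in records
]

# Report only aggregate statistics for the group, never individual rows.
mean_score = sum(r["score"] for r in deidentified) / len(deidentified)
print(f"N = {len(deidentified)}, mean score = {mean_score:.1f}")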
Confidentiality is especially salient in qualitative studies because of their in-depth
nature, yet anonymity is rarely possible. Qualitative researchers also face the challenge
of adequately disguising participants in their reports. Because the number of
respondents is small and because rich descriptive information is presented, qualitative
researchers must be especially vigilant in safeguarding participants’ identity.
TIP As a means of enhancing individual and institutional privacy, research
articles frequently avoid giving information about the locale of the study. For
example, a report might say that data were collected in a 200-bed, private
nursing home, without mentioning its name or location.
Confidentiality sometimes creates tension between researchers and legal authorities,
especially if participants engage in criminal activity like substance abuse. To avoid the
forced disclosure of information (e.g., through a court order), researchers in the United
States can apply for a Certificate of Confidentiality from the National Institutes of
Health. The certificate allows researchers to refuse to disclose information on study
participants in any legal proceeding.
Example of confidentiality procedures
Hayes (2015) studied the life patterns of incarcerated women. The 18 women who
participated selected pseudonyms for themselves. The interviews were conducted in
private rooms in the prison. The researcher made certain that the rooms did not have
cameras or microphones in them and that no correctional staff were nearby.
Debriefings and Referrals
Researchers should show respect for participants during the interactions they have with
them. For example, researchers should be polite and should make evident their tolerance
of cultural, linguistic, and lifestyle diversity.
Formal strategies for communicating respect for participants’ well-being are also
available. For example, it is sometimes advisable to offer debriefing sessions following
data collection so that participants can ask questions or share concerns. Researchers can
also demonstrate their interest in participants by offering to share study findings with
them after the data have been analyzed. Finally, researchers may need to assist
participants by making referrals to appropriate health, social, or psychological services.
Example of referrals
Holmes and colleagues (2015) studied the experience of seclusion in a forensic
psychiatric setting. The study involved in-depth interviews with 13 psychiatric
inpatients who had experienced a period of seclusion in the 6 months before the
interview. The researchers, aware of the sensitive nature of the research, made
arrangements to refer any distressed patients to the head nurse on the unit.
Treatment of Vulnerable Groups
Adherence to ethical standards is often straightforward. The rights of special vulnerable
groups, however, may need extra protections. Vulnerable populations may be incapable
of giving fully informed consent (e.g., cognitively impaired people) or may be at high
risk for unintended side effects (e.g., pregnant women). You should pay particular
attention to the ethical dimensions of a study when people who are vulnerable are
involved. Among the groups that should be considered as being vulnerable are the
following:
Children. Legally and ethically, children do not have the competence to give informed
consent, and so the consent of children’s parents or guardians should be obtained.
However, it is appropriate—especially if the child is at least 7 years of age—to
obtain the child’s assent as well. Assent refers to the child’s affirmative agreement to
participate.
Mentally or emotionally disabled people. Individuals whose disability makes it
impossible for them to make informed decisions (e.g., people in a coma) also cannot
legally provide informed consent. In such cases, researchers should obtain the
consent of a legal guardian.
Severely ill or physically disabled people. For patients who are very ill or undergoing
certain treatments (e.g., mechanical ventilation), it might be necessary to assess their
ability to make reasoned decisions about study participation.
The terminally ill. Terminally ill people can seldom expect to benefit personally from
research, and thus the risk/benefit ratio needs to be carefully assessed.
Institutionalized people. Nurses often conduct studies with hospitalized or
institutionalized people (e.g., prisoners) who might feel that their care would be
jeopardized by failure to cooperate. Researchers studying institutionalized groups
need to emphasize the voluntary nature of participation.
Pregnant women. The U.S. government has issued additional requirements governing
research with pregnant women and fetuses. These requirements reflect a desire to
safeguard both the pregnant woman, who may be at heightened physical or
psychological risk, and the fetus, who cannot give informed consent.
Example of research with a vulnerable group
Knutsson and Bergbom (2016) studied 28 children’s thoughts and feelings related to
visiting critically ill relatives in an adult ICU. The custodians of the children signed
an informed consent form. In addition, prior to the start of the interviews with the
children, the researcher asked the children if they wanted to participate.
External Reviews and the Protection of Human Rights
Researchers may not be objective in developing procedures to protect participants’
rights. Biases may arise from their commitment to an area of knowledge and their desire
to conduct a rigorous study. Because of the risk of a biased evaluation, the ethical
dimensions of a study are usually subjected to external review.
Most hospitals, universities, and other institutions where research is conducted have
established formal committees for reviewing research plans. These committees are
sometimes called human subjects committees or (in Canada) Research Ethics Boards. In
the United States, the committee is often called an Institutional Review Board (IRB).
Before undertaking a study, researchers must submit research plans to the IRB and must
also undergo formal IRB training. An IRB can approve the proposed plans, require
modifications, or disapprove them.
Example of IRB approval
Fishering and colleagues (2016) studied the experience of neonatal intensive care
(NICU) nurses who themselves became NICU mothers. The procedures and protocols
for the study were approved by the Washington University Medical School’s IRB.
Ethical Issues in Using Animals in Research
Some nurse researchers who focus on biophysiologic phenomena use animals as their
subjects. Ethical considerations are clearly different for animals and humans; for
example, informed consent is not relevant for animals. In the United States, the Public
Health Service has issued a policy statement on the humane care and use of animals.
The guidelines articulate principles for the proper care and treatment of animals used in
research, covering such issues as the transport of research animals, pain and distress in
animal subjects, the use of appropriate anesthesia, and euthanizing animals under
certain conditions during or after the study.
Example of research with animals
Moes and Holden (2014) studied changes in spontaneous activity and skeletal muscle
mass with rats that had received chronic constriction injury surgery. The University
of Michigan’s Committee for the Use and Care of Animals approved all procedures,
and the study adhered to guidelines of the Association for Assessment and
Accreditation of Laboratory Animal Care.
CRITIQUING THE ETHICAL ASPECTS OF A STUDY
Guidelines for critiquing the ethical aspects of a study are presented in Box 5.2.
Members of an IRB or human subjects committee are provided with sufficient
information to answer all these questions, but research articles do not always include
detailed information about ethics because of space constraints in journals. Thus, it may
be difficult to critique researchers’ adherence to ethical guidelines. Nevertheless, we
offer a few suggestions for considering ethical issues.
Box 5.2 Guidelines for Critiquing the Ethical Aspects of a Study
1. Was the study approved and monitored by an Institutional Review Board,
Research Ethics Board, or other similar ethics review committee?
2. Were study participants subjected to any physical harm, discomfort, or
psychological distress? Did the researchers take appropriate steps to remove or
prevent harm?
3. Did the benefits to participants outweigh any potential risks or actual discomfort
they experienced? Did the benefits to society outweigh the costs to participants?
4. Was any type of coercion or undue influence used to recruit participants? Did they
have the right to refuse to participate or to withdraw without penalty?
5. Were participants deceived in any way? Were they fully aware of participating in a
study, and did they understand the purpose and nature of the research?
6. Were appropriate informed consent procedures used with all participants? If not,
were the reasons valid and justifiable?
7. Were adequate steps taken to safeguard participants’ privacy? How was
confidentiality maintained? Was a Certificate of Confidentiality obtained—and, if
not, should one have been obtained?
8. Were vulnerable groups involved in the research? If yes, were special precautions
instituted because of their vulnerable status?
9. Were groups omitted from the inquiry without a justifiable rationale, such as
women (or men) or minorities?
Many research reports do acknowledge that the study procedures were reviewed by
an IRB or human subjects committee. When a report mentions a formal review, it is
usually safe to assume that a panel of concerned people thoroughly reviewed ethical
issues raised by the study.
You can also come to some conclusions based on a description of the study
methods. There may be sufficient information to judge, for example, whether study
participants were subjected to harm or discomfort. Reports do not always state whether
informed consent was secured, but you should be alert to situations in which the data
could not have been gathered as described if participation were purely voluntary (e.g., if
data were gathered unobtrusively).
In thinking about the ethical aspects of a study, you should also consider who the
study participants were. For example, if the study involves vulnerable groups, there
should be more information about protective procedures. You might also need to attend
to who the study participants were not. For example, there has been considerable
concern about the omission of certain groups (e.g., minorities) from clinical research.
Brief summaries of a quantitative and a qualitative nursing study are presented in the following sections. Read the research summaries and then answer the critical thinking questions about the ethical aspects of the studies that follow, referring to the full research report if necessary. Examples 1 and 2 are featured on the interactive Critical Thinking Activity on the book's companion website. The critical thinking questions for Examples 3 and 4 are based on the studies that appear in their entirety in Appendices A and B of this book. Our comments for these exercises are in the Student Resources section of the website.
EXAMPLE 1: QUANTITATIVE RESEARCH
Study: Family typology and appraisal of preschoolers’ behavior by female
caregivers (Coke & Moore, 2015)
Study Purpose: The purpose of the study was to explore family factors
associated with appraisal of a child’s behavior by a primary female caregiver,
the extent to which the caregiver’s appraisal is distorted, and the child’s risk
of having a behavior problem.
Research Methods: Data were collected by means of a questionnaire
completed by female family caregivers of 117 preschoolers who attended a
rural Head Start preschool program for low-income families. The
questionnaires, which took about 30 minutes to complete, included questions
about caregiver stress, appraisal and ratings of children’s behaviors, and
social support. No participant needed assistance in completing the
questionnaire due to reading or language problems. The researchers decided
to focus on female caregivers “because recruitment of male caregivers of
young children is problematic” (p. 446). The sample of caretakers included
African American (83%), White (15%), Hispanic (2%), and Native American
(1%) women.
Ethics-Related Procedures: The caretakers were recruited during a parent–
child field day and a parent–teacher orientation at the Head Start program.
The lead researcher met with all volunteering caretakers. Each participant was
assigned a unique ID number to protect her identity, and the listing that linked
the participant to the ID number was kept separate from the questionnaires
under lock and key. After completing the questionnaire, each participant was
given a gift bag with a $5 gift card to a local store and health-related
education materials for the children. The study was approved by the County
Board of Education and the IRB of the researchers’ university prior to
recruitment.
Key Findings: Distortion of the caregiver’s rating of her child’s behavior
was associated with a higher risk of having a child with behavioral problems.
Vulnerable families were significantly more likely to have a child with high
risk of behavior problems than families classified as secure.
Critical Thinking Exercises
1. Answer the relevant questions from Box 5.2 regarding this study.
2. Also consider the following targeted questions:
a. Could the data for this study have been collected anonymously?
b. Comment on the appropriateness of the participant stipend in this study.
3. If the results of this study are valid and generalizable, what might be some
of the uses to which the findings could be put in clinical practice?
EXAMPLE 2: QUALITATIVE RESEARCH
Study: Grief interrupted: The experience of loss among incarcerated women
(Harner et al., 2011)
Study Purpose: The purpose of the study was to explore the experiences of
grief among incarcerated women following the loss of a loved one.
Study Methods: The researchers used phenomenological methods in this
study. They recruited 15 incarcerated women who had experienced the loss of
a loved one during their confinement. In-depth interviews about the women’s
experience of loss lasted 1 to 2 hours.
Ethics-Related Procedures: The researchers recruited women by posting
flyers in the prison’s dayroom. The flyers were written at the 4.5 grade level.
Because the first author was a nurse practitioner at the prison, the researchers
used several strategies to “diffuse any perceived coercion” (p. 457), such as
not posting flyers near the health services unit and not offering any monetary
or work-release incentives to participate. Written informed consent was
obtained, but because of high rates of illiteracy, the informed consent
document was read aloud to all potential participants. During the consent
process, and during the interviews, the women were given opportunities to
ask questions. They were informed that participation would have no effect on
sentence length, sentence structure, parole, or access to health services. They
were also told they could end the interview at any time without fear of
reprisals. Furthermore, they were told that the researcher was a mandated
reporter and would report any indication of suicidal or homicidal ideation.
Participants were not required to give their names to the research team.
During the interview, efforts were made to create a welcoming and
nonthreatening environment. The research team received approval for their
study from a university IRB and from the Department of Corrections
Research Division.
Key Findings: The researchers revealed four themes, which they referred to
as existential lifeworlds: Temporality: frozen in time; Spatiality: no place, no
space to grieve; Corporeality: buried emotions; and Relationality: never alone
yet feeling so lonely.
Critical Thinking Exercises
1. Answer the relevant questions from Box 5.2 regarding this study.
2. Also consider the following targeted questions:
a. The researchers did not offer any stipend—was this ethically
appropriate?
b. Might the researchers have benefited from obtaining a Certificate of
Confidentiality for this research?
3. If the results of this study are trustworthy and transferable, what might be
some of the uses to which the findings could be put in clinical practice?
EXAMPLE 3: QUANTITATIVE RESEARCH IN APPENDIX A
• Read the methods section of Swenson and colleagues’ (2016) study
(“Parents’ use of praise and criticism in a sample of young children
seeking mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 5.2 regarding this study.
2. Also consider the following targeted questions:
a. Where was information about ethical issues located in this report?
b. What additional information regarding the ethical aspects of their study
could the researchers have included in this article?
EXAMPLE 4: QUALITATIVE RESEARCH IN APPENDIX B
• Read the methods section of Beck and Watson’s (2010) study
(“Subsequent childbirth after a previous traumatic birth”) in Appendix B
of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 5.2 regarding this study.
2. Also consider the following targeted questions:
a. Where was information about the ethical aspects of this study located in
the report?
b. What additional information regarding the ethical aspects of Beck and
Watson’s study could the researchers have included in this article?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of this chapter are available on the book's companion website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Informed Consent
• Answers to the Critical Thinking Exercises for Examples 3 and 4
• Internet Resources with useful websites for Chapter 5
• A Wolters Kluwer journal article in its entirety—the study described as
Example 1 on pp. 87–88.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
Because research has not always been conducted ethically and because of genuine
ethical dilemmas that researchers face in designing studies that are both ethical
and rigorous, codes of ethics have been developed to guide researchers.
Three major ethical principles from the Belmont Report are incorporated into many
guidelines: beneficence, respect for human dignity, and justice.
Beneficence involves the performance of some good and the protection of
participants from physical and psychological harm and exploitation.
Respect for human dignity involves the participants’ right to self-determination,
which includes participants’ right to participate in a study voluntarily.
Full disclosure means that researchers have fully described to prospective
participants their rights and the costs and benefits of the study. When full
disclosure poses the risk of biased results, researchers sometimes use concealment
(the collection of information without participants’ knowledge) or deception
(withholding information or providing false information).
Justice includes the right to fair treatment and the right to privacy. In the United
States, privacy has become a major issue because of the Privacy Rule regulations
that resulted from the Health Insurance Portability and Accountability Act
(HIPAA).
Procedures have been developed to safeguard study participants’ rights, including
the performance of a risk/benefit assessment, the implementation of informed
consent procedures, and taking steps to safeguard participants’ confidentiality.
In a risk/benefit assessment, the potential benefits of the study to individual
participants and to society are weighed against the costs to individuals.
Informed consent procedures, which provide prospective participants with
information needed to make a reasoned decision about participation, normally
involve signing a consent form to document voluntary and informed participation.
Privacy can be maintained through anonymity (wherein not even researchers
know participants’ identities) or through formal confidentiality procedures that
safeguard the participants’ data.
Some U.S. researchers obtain a Certificate of Confidentiality that protects them
against the forced disclosure of confidential information through a court order.
Researchers sometimes offer debriefing sessions after data collection to provide
participants with more information or an opportunity to air complaints.
Vulnerable groups require additional protection. These people may be vulnerable
because they are not able to make an informed decision about study participation
(e.g., children), because of diminished autonomy (e.g., prisoners), or because their
circumstances heighten the risk of harm (e.g., pregnant women, the terminally ill).
External review of the ethical aspects of a study by a human subjects committee or
Institutional Review Board (IRB) is highly desirable and is often required by
universities and organizations from which participants are recruited.
REFERENCES FOR CHAPTER 5
American Nurses Association. (2015). Code of ethics for nurses with interpretive statements (2nd ed.). Silver
Spring, MD: Author.
Beck, C. T., LoGiudice, J., & Gable, R. K. (2015). A mixed-methods study of secondary traumatic stress in
certified nurse-midwives: Shaken belief in the birth process. Journal of Midwifery & Women’s Health, 60, 16–
23.
Canadian Nurses Association. (2002). Ethical research guidelines for registered nurses (3rd ed.). Ottawa, Canada:
Author.
**Coke, S. P., & Moore, L. (2015). Family typology and appraisal of preschoolers’ behavior by female caregivers.
Nursing Research, 64, 444–451.
Fishering, R., Broeder, J., & Donze, A. (2016). A qualitative study: NICU nurses as NICU parents. Advances in
Neonatal Care, 16, 74–86.
*Harner, H., Hentz, P., & Evangelista, M. (2011). Grief interrupted: The experience of loss among incarcerated
women. Qualitative Health Research, 21, 454–464.
Hayes, M. O. (2015). The life pattern of incarcerated women: The complex and interwoven lives of trauma, mental
illness, and substance abuse. Journal of Forensic Nursing, 11, 214–222.
Holmes, D., Murray, S., & Knack, N. (2015). Experiencing seclusion in a forensic psychiatric setting: A
phenomenological study. Journal of Forensic Nursing, 11, 200–213.
Kelley, P. W., Kenny, D., Gordon, D., & Benner, P. (2015). The evolution of case management for service
members injured in Iraq and Afghanistan. Qualitative Health Research, 25, 426–439.
Knutsson, S., & Bergbom, I. (2016). Children’s thoughts and feelings related to visiting critically ill relatives in an
adult ICU: A qualitative study. Intensive and Critical Care Nursing, 32, 33–41.
Melnyk, B. M., Amaya, M., Szalacha, L. A., & Hoying, J. (2016). Relationships among perceived wellness culture,
healthy lifestyle beliefs, and healthy behaviors in university faculty and staff: Implications for practice and
future research. Western Journal of Nursing Research, 38, 308–324.
Moes, J., & Holden, J. (2014). Characterizing activity and muscle atrophy changes in rats with neuropathic pain: A
pilot study. Biological Research for Nursing, 16, 16–22.
Silva, M. C. (1995). Ethical guidelines in the conduct, dissemination, and implementation of nursing research.
Washington, DC: American Nurses Association.
*A link to this open-access article is provided in the Internet Resources section of the book's companion website.
**This journal article is available on the book's companion website for this chapter.
Part 2 Preliminary Steps in Quantitative and
Qualitative Research
6 Research Problems, Research
Questions, and Hypotheses
Learning Objectives
On completing this chapter, you will be able to:
Describe the process of developing and refining a research problem
Distinguish the functions and forms of statements of purpose and research questions
for quantitative and qualitative studies
Describe the function and characteristics of research hypotheses
Critique statements of purpose, research questions, and hypotheses in research reports
with respect to their placement, clarity, wording, and significance
Define new terms in the chapter
Key Terms
Directional hypothesis
Hypothesis
Nondirectional hypothesis
Null hypothesis
Problem statement
Research hypothesis
Research problem
Research question
Statement of purpose
OVERVIEW OF RESEARCH PROBLEMS
Studies begin in much the same fashion as an evidence-based practice (EBP) effort—as
problems that need to be solved or questions that need to be answered. This chapter
discusses research problems and research questions. We begin by clarifying some terms.
Basic Terminology
Researchers begin with a topic on which to focus. Examples of research topics are
claustrophobia during magnetic resonance imaging (MRI) tests and pain management
for sickle cell disease. Within broad topic areas are many possible research problems. In
this section, we illustrate various terms using the topic side effects of chemotherapy.
A research problem is an enigmatic or troubling condition. The purpose of
research is to “solve” the problem—or to contribute to its solution—by gathering
relevant data. A problem statement articulates the problem and an argument that
explains the need for a study. Table 6.1 presents a simplified problem statement related
to the topic of side effects of chemotherapy.
Many reports provide a statement of purpose (or purpose statement), which is a
summary of an overall goal. Sometimes the words aim or objective are used in lieu of
purpose. Research questions are the specific queries researchers want to answer.
Researchers who make specific predictions about answers to research questions pose
hypotheses that are then tested.
These terms are not always consistently defined in research textbooks. Table 6.1
illustrates the interrelationships among terms as we define them.
Research Problems and Paradigms
Some research problems are better suited to qualitative inquiry, and others to quantitative inquiry.
Quantitative studies usually involve concepts that are well developed and for which
methods of measurement have been (or can be) developed. For example, a quantitative
study might be undertaken to assess whether people with chronic illness are more
depressed than people without a chronic illness. There are relatively good measures of
depression that would yield quantitative data about the level of depression in those with
and without a chronic illness.
Qualitative studies are undertaken because a researcher wants to develop a rich,
context-bound understanding of a poorly understood phenomenon. Qualitative methods
would not be well suited to comparing levels of depression among those with and
without chronic illness, but they would be ideal for exploring the meaning of depression
among chronically ill people. In evaluating a research report, one consideration is
whether the research problem is suitable for the chosen paradigm.
Sources of Research Problems
Where do ideas for research problems come from? At the most basic level, research
topics originate with researchers’ interests. Because research is a time-consuming
enterprise, curiosity about and interest in a topic are essential to a project’s success.
Research reports rarely indicate the source of researchers’ inspiration for a study,
but a variety of explicit sources can fuel their curiosity, such as nurses’ clinical
experience and readings in the nursing literature. Also, topics are sometimes suggested
by global social or political issues of relevance to the health care community (e.g.,
health disparities). Theories from nursing and other disciplines sometimes suggest a
research problem. Additionally, researchers who have developed a program of research
may get inspiration for “next steps” from their own findings or from a discussion of
those findings with others.
Example of a problem source for a quantitative study
Beck, one of this book’s authors, has developed a strong research program on
postpartum depression (PPD). Beck was approached by Dr. Carol Lammi-Keefe, a professor in nutritional sciences, and her PhD student, Michelle Judge, who had been researching the effect of DHA (docosahexaenoic acid, a fat found in cold-water fish)
on fetal brain development. The literature suggested that DHA might play a role in
reducing the severity of PPD, and so these researchers collaborated in a project to test
the effectiveness of dietary supplements of DHA during pregnancy on the incidence
and severity of PPD. The researchers found that women in the DHA experimental
group had fewer symptoms of PPD compared to women who did not receive the
DHA intervention (Judge et al., 2014).
Development and Refinement of Research Problems
Developing a research problem is a creative process. Researchers often begin with
interests in a broad topic area and then develop a more specific researchable problem.
For example, suppose a hospital nurse begins to wonder why some patients complain
about having to wait for pain medication when certain nurses are assigned to them. The
general topic is differences in patients’ complaints about pain medications. The nurse
might ask, What accounts for this discrepancy? This broad question may lead to other
questions, such as How do the nurses differ? or What characteristics do patients with
complaints share? The nurse may then observe that the ethnic background of the
patients and nurses could be relevant. This may direct the nurse to look at the literature
on nursing behaviors and ethnicity, or it may lead to a discussion with peers. These
efforts may result in several research questions, such as the following:
What is the nature of patient complaints among patients of different ethnic
backgrounds?
Is the ethnic background of nurses related to the frequency with which they dispense
pain medication?
Does the number of patient complaints increase when patients are of dissimilar ethnic
backgrounds as opposed to when they are of the same ethnic background as nurses?
These questions stem from the same problem, yet each would be studied differently;
for example, some suggest a qualitative approach, and others suggest a quantitative one.
Both ethnicity and nurses' dispensing behaviors are variables that can be measured reliably, making those questions amenable to quantitative study. A qualitative researcher, in contrast, would be more interested in understanding the essence
of patients’ complaints, patients’ experience of frustration, or the process by which the
problem got resolved. These aspects of the problem would be difficult to measure.
Researchers choose a problem to study based on its inherent interest to them and on its
fit with a paradigm of preference.
COMMUNICATING RESEARCH PROBLEMS AND
QUESTIONS
Every study needs a problem statement that articulates what is problematic and what
must be solved. Most research reports also present either a statement of purpose,
research questions, or hypotheses, and often, combinations of these three elements are
included.
Many students do not really understand problem statements and may have trouble
identifying them in a research article. A problem statement is presented early and often
begins with the first sentence after the abstract. Research questions, purpose statements,
or hypotheses appear later in the introduction.
Problem Statements
A good problem statement is a declaration of what it is that is problematic, what it is
that “needs fixing,” or what it is that is poorly understood. Problem statements,
especially for quantitative studies, often have most of the following six components:
1. Problem identification: What is wrong with the current situation?
2. Background: What is the nature of the problem, or the context of the situation, that
readers need to understand?
3. Scope of the problem: How big a problem is it, and how many people are affected?
4. Consequences of the problem: What is the cost of not fixing the problem?
5. Knowledge gaps: What information about the problem is lacking?
6. Proposed solution: How will the new study contribute to the solution of the
problem?
Let us suppose that our topic was humor as a complementary therapy for reducing
stress in hospitalized patients with cancer. One research question (discussed later in this
section) might be “What is the effect of nurses’ use of humor on stress and natural killer
cell activity in hospitalized cancer patients?” Box 6.1 presents a rough draft of a
problem statement for such a study. This problem statement is a reasonable draft, but it
could be improved.
Box 6.1 Draft Problem Statement on Humor and Stress
A diagnosis of cancer is associated with high levels of stress. Sizeable numbers of
patients who receive a cancer diagnosis describe feelings of uncertainty, fear, anger,
and loss of control. Interpersonal relationships, psychological functioning, and role
performance have all been found to suffer following cancer diagnosis and treatment.
A variety of alternative/complementary therapies have been developed in an effort
to decrease the harmful effects of cancer-related stress on psychological and
physiological functioning, and resources devoted to these therapies (money and staff)
have increased in recent years. However, many of these therapies have not been
carefully evaluated to assess their efficacy, safety, or cost-effectiveness. For example,
the use of humor has been recommended as a therapeutic device to improve quality of
life, decrease stress, and perhaps improve immune functioning, but the evidence to
justify its advocacy is scant.
Box 6.2 illustrates how the problem statement could be made stronger by adding
information about scope (component 3), long-term consequences (component 4), and
possible solutions (component 6). This second draft builds a more compelling argument
for new research: Millions of people are affected by cancer, and the disease has adverse
consequences not only for patients and their families but also for society. The revised
problem statement also suggests a basis for the new study by describing a possible
solution on which the new study might build.
Box 6.2 Some Possible Improvements to Problem Statement on Humor and
Stress
Each year, more than 1 million people are diagnosed with cancer, which remains one
of the top causes of death among both men and women (reference citations).*
Numerous studies have documented that a diagnosis of cancer is associated with high
levels of stress. Sizeable numbers of patients who receive a cancer diagnosis describe
feelings of uncertainty, fear, anger, and loss of control (citations). Interpersonal
relationships, psychological functioning, and role performance have all been found to
suffer following cancer diagnosis and treatment (citations). These stressful outcomes
can, in turn, adversely affect health, long-term prognosis, and medical costs among
cancer survivors (citations).
A variety of alternative/complementary therapies have been developed in an effort
to decrease the harmful effects of cancer-related stress on psychological and
physiological functioning, and resources devoted to these therapies (money and staff)
have increased in recent years (citations). However, many of these therapies have not
been carefully evaluated to assess their efficacy, safety, or cost-effectiveness. For
example, the use of humor has been recommended as a therapeutic device to improve
quality of life, decrease stress, and perhaps improve immune functioning (citations),
but the evidence to justify its advocacy is scant. Preliminary findings from a recent small-scale endocrinology study with a healthy sample exposed to a humorous intervention (citation), however, hold promise for further inquiry with immunocompromised populations.
*Reference citations would be inserted to support the statements.
HOW-TO-TELL TIP How can you recognize a problem statement?
Problem statements are rarely explicitly labeled. The first sentence of a
research report is often the starting point of a problem statement. The
problem statement is usually interwoven with findings from the research
literature. Prior findings provide evidence supporting assertions in the
problem statement and suggest gaps in knowledge. In many articles, it is
difficult to disentangle the problem statement from the literature review,
unless there is a subsection specifically labeled “Literature Review” or
something similar.
Problem statements for a qualitative study similarly express the nature of the
problem, its context, its scope, and information needed to address it. Qualitative studies
embedded in a research tradition often incorporate terms and concepts that foreshadow
the tradition in their problem statements. For example, a problem statement for a
phenomenological study might note the need to know more about people’s experiences
or meanings they attribute to those experiences.
Statements of Purpose
Many researchers articulate their research goals as a statement of purpose. The purpose
statement establishes the general direction of the inquiry and captures the study’s
substance. It is usually easy to identify a purpose statement because the word purpose is
explicitly stated: “The purpose of this study was . . . ”—although sometimes the words
aim, goal, or objective are used instead, as in “The aim of this study was . . . .”
In a quantitative study, a statement of purpose identifies the key study variables and
their possible interrelationships as well as the population of interest (i.e., all the PICO
elements).
Example of a statement of purpose from a quantitative study
The purpose of this study was to examine the effects of an education-support
intervention delivered in home settings to people with chronic heart failure, in terms
of their functional status, self-efficacy, quality of life, and self-care ability (Clark et
al., 2015).
This purpose statement identifies the population (P) of interest as patients with heart
failure living at home. The key study variables were the patients’ exposure or
nonexposure to the special intervention (the independent variable encompassing the I
and C components) and the patient’s functional status, self-efficacy, quality of life, and
self-care ability (the dependent variables or Os).
In qualitative studies, the statement of purpose indicates the nature of the inquiry;
the key concept or phenomenon; and the group, community, or setting under study.
Example of a statement of purpose from a qualitative study
The purpose of this study was to explore the influence of religiosity and spirituality
on rural parents’ decision to vaccinate their 9- to 13-year-old children against human
papillomavirus (HPV) (Thomas et al., 2015).
This statement indicates that the group under study is rural parents with children aged 9 to 13 years and that the central phenomenon is the parents' decision making about vaccinations within the context of their spirituality and religious beliefs.
Researchers often communicate information about their approach through their
choice of verbs. A study whose purpose is to explore or describe some phenomenon is
likely to be an investigation of a little-researched topic, often involving a qualitative
approach such as phenomenology or ethnography. A statement of purpose for a
qualitative study—especially a grounded theory study—may also use verbs such as
understand, discover, or generate. Statements of purpose in qualitative studies also may
“encode” the tradition of inquiry through certain terms or “buzz words” associated with
those traditions, as follows:
Grounded theory: processes; social structures; social interactions
Phenomenological studies: experience; lived experience; meaning; essence
Ethnographic studies: culture; roles; lifeways; cultural behavior
Quantitative researchers also use verbs to communicate the nature of the inquiry. A
statement indicating that the study purpose is to test or evaluate something (e.g., an
intervention) suggests an experimental design, for example. A study whose purpose is
to examine or explore the relationship between two variables is more likely to involve a
nonexperimental design. Sometimes the verb is ambiguous: If a purpose statement
states that the researcher’s intent is to compare two things, the comparison could
involve alternative treatments (using an experimental design) or two preexisting groups
such as smokers and nonsmokers (using a nonexperimental design). In any event, verbs
such as test, evaluate, and compare suggest quantifiable variables and designs with
scientific controls.
The verbs in a purpose statement should connote objectivity. A statement of purpose
indicating that the study goal was to prove, demonstrate, or show something suggests a
bias.
Research Questions
Research questions are, in some cases, direct rewordings of statements of purpose,
phrased interrogatively rather than declaratively, as in the following example:
Purpose: The purpose of this study is to assess the relationship between the functional
dependence level of renal transplant recipients and their rate of recovery.
Question: Is the functional dependence level (I) of renal transplant recipients (P)
related to their rate of recovery (O)?
Some research articles omit a statement of purpose and state only research
questions, but in many cases researchers use research questions to add greater
specificity to a global purpose statement.
Research Questions in Quantitative Studies
In Chapter 2, we discussed clinical foreground questions to guide an EBP inquiry. The
EBP question templates in Table 2.1 could yield questions to guide a research project as
well, but researchers tend to conceptualize their questions in terms of their variables.
Take, for example, the first question in Table 2.1: “In (population), what is the effect of
(intervention) on (outcome)?” A researcher would be more likely to think of the question
in these terms: “In (population), what is the effect of (independent variable) on
(dependent variable)?” Thinking in terms of variables helps to guide researchers’
decisions about how to operationalize them. Thus, in quantitative studies, research
questions identify the population (P) under study, the key study variables (I, C, and O
components), and relationships among the variables.
Most research questions concern relationships among variables, and thus, many
quantitative research questions could be articulated using a general question template:
“In (population), what is the relationship between (independent variable or IV) and
(dependent variable or DV)?” Examples of variations include the following:
Therapy/treatment/intervention: In (population), what is the effect of (IV: intervention
vs. an alternative) on (DV)?
Prognosis: In (population), does (IV: disease or illness vs. its absence) affect or
increase the risk of (DV)?
Etiology/harm: In (population), does (IV: exposure vs. nonexposure) cause or increase
risk of (DV)?
Not all research questions are about relationships—some are descriptive. As
examples, here are two descriptive questions that could be answered in a quantitative
study on nurses’ use of humor:
What is the frequency with which nurses use humor as a complementary therapy with
hospitalized cancer patients?
What are the characteristics of nurses who use humor as a complementary therapy with
hospitalized cancer patients?
Answers to such questions might be useful in developing effective strategies for
reducing stress in patients with cancer.
Example of a research question from a quantitative study
Chang and colleagues (2015) undertook a study that addressed the following
question: Among community-dwelling elders aged 65 years and older, does regular
exercise have an association with depressive symptoms?
In this example, the question asks about the relationship between an independent
variable (regular participation in exercise) and a dependent variable (depressive
symptoms) in a population of community-dwelling older adults.
Research Questions in Qualitative Studies
Research questions in qualitative studies stipulate the phenomenon and the population
of interest. Grounded theory researchers are likely to ask process questions,
phenomenologists tend to ask meaning questions, and ethnographers generally ask
descriptive questions about cultures. The terms associated with the various traditions,
discussed previously in connection with purpose statements, are likely to be
incorporated into the research questions.
Example of a research question from a phenomenological study
What is the meaning of the lived experience of encounters with a therapy dog for
persons with Alzheimer’s disease? (Swall et al., 2015).
Not all qualitative studies are rooted in a specific research tradition. Many
researchers use constructivist methods to describe or explore phenomena without
focusing on cultures, meaning, or social processes.
Example of a research question from a descriptive qualitative study
In their descriptive qualitative study, Yeager and coresearchers (2016) asked, “What
do low-income African American adults with advanced cancer do on a day-to-day
basis to relieve and manage symptoms?”
In qualitative studies, research questions sometimes evolve during the study.
Researchers begin with a focus that defines the broad boundaries of the inquiry, but the
boundaries are not cast in stone. Constructivists are often sufficiently flexible that the
question can be modified as new information makes it relevant to do so.
TIP Researchers most often state their purpose or research questions at the
end of the introduction or immediately after the review of the literature.
Sometimes, a separate section of a research article is devoted to formal statements about the research problem and might be labeled
“Purpose,” “Statement of Purpose,” “Research Questions,” or, in quantitative
studies, “Hypotheses.”
RESEARCH HYPOTHESES
A hypothesis is a prediction, usually involving a predicted relationship between two or
more variables. Qualitative researchers do not have formal hypotheses because
qualitative researchers want the inquiry to be guided by participants’ viewpoints rather
than by their own hunches. Thus, our discussion focuses on hypotheses in quantitative
research.
Function of Hypotheses in Quantitative Research
Many research questions ask about relationships between variables, and hypotheses are
predicted answers to these questions. For instance, the research question might ask
“Does sexual abuse in childhood affect the development of irritable bowel syndrome in
women?” The researcher might predict the following: Women (P) who were sexually
abused in childhood (I) have a higher incidence of irritable bowel syndrome (O) than
women who were not abused (C).
Hypotheses sometimes emerge from a theory. Scientists reason from theories to
hypotheses and test those hypotheses in the real world (see Chapter 8). Even in the
absence of a theory, hypotheses offer direction and suggest explanations. For example,
suppose we hypothesized that the incidence of desaturation in low-birth-weight infants
undergoing intubation and ventilation would be lower using the closed tracheal suction
system (CTSS) than using partially ventilated endotracheal suction (PVETS). Our
hypothesis might be based on prior studies or clinical observations.
Now let us suppose the hypothesis is not confirmed in a study; that is, we find that
rates of desaturation are similar for both the PVETS and CTSS methods. The failure of
data to support a prediction forces researchers to analyze theory or previous research
critically, to review study limitations, and to explore alternative explanations for the
findings. The use of hypotheses tends to promote critical thinking. Now suppose we
conducted the study guided only by the question, Is there a relationship between suction
method and rates of desaturation? Without a hypothesis, the researcher is seemingly
prepared to accept any results. The problem is that it is almost always possible to
explain something superficially after the fact, no matter what the findings are.
Hypotheses reduce the possibility that spurious results will be misconstrued.
TIP Some quantitative research articles explicitly state the hypotheses that
guided the study, but many do not. The absence of a hypothesis may indicate
that researchers have failed to consider critically the existing evidence or
theory or have failed to disclose their hunches.
Characteristics of Testable Hypotheses
Research hypotheses usually state the expected relationship between the independent
variable (the presumed cause or influence) and the dependent variable (the presumed
outcome or effect) within a population.
Example of a research hypothesis
Forbes and colleagues (2015) studied cancer survivors’ engagement in strength
exercise behaviors. They hypothesized that prostate cancer survivors would have a
higher rate of strength exercise participation than breast or colon cancer survivors.
In this example, the population is cancer survivors. The IV is type of cancer, and the
outcome variable is participation in strength exercise. The hypothesis predicts that, in
the population, type of cancer is related to rates of strength exercise participation.
Hypotheses that do not make a relational statement are difficult to test. Take the
following example: Pregnant women who receive prenatal instruction about
postpartum experiences are not likely to experience postpartum depression. This
statement expresses no anticipated relationship and cannot be tested using standard
statistical procedures. In our example, how would we decide whether to accept or reject
the hypothesis?
We could, however, modify the hypothesis as follows: Pregnant women who receive
prenatal instruction are less likely than those who do not to experience postpartum
depression. Here, the outcome variable (O) is postpartum depression, and the IV is
receipt (I) versus nonreceipt (C) of prenatal instruction. The relational aspect of the
prediction is embodied in the phrase less than. If a hypothesis lacks a phrase such as
more than, less than, different from, related to, or something similar, it is not testable.
To test the revised hypothesis, we could ask two groups of women with different
prenatal instruction experiences to respond to questions on depression and then compare
the groups’ responses.
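To illustrate how such a relational hypothesis might be tested, here is a minimal sketch in Python with entirely hypothetical depression scores for the two groups (the numbers and group sizes are invented; an independent-samples t-test from the scipy library is one common way to compare two group means):

from scipy import stats

# Hypothetical depression scores (higher = more depressive symptoms).
instructed = [9, 11, 8, 12, 10, 7, 9, 10]          # received prenatal instruction
not_instructed = [14, 12, 15, 11, 13, 16, 12, 14]  # did not receive it

# The t-test evaluates the null hypothesis that the two group means are
# equal, i.e., that instruction and postpartum depression are unrelated.
t_stat, p_value = stats.ttest_ind(instructed, not_instructed)

if p_value < .05:
    print(f"p = {p_value:.3f}: null hypothesis rejected; the prediction is supported")
else:
    print(f"p = {p_value:.3f}: the data fail to support the prediction")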
TIP Hypotheses are typically fairly easy to identify because researchers
make statements such as “The study tested the hypothesis that . . . ” or “It
was predicted that . . . .”
Wording of Hypotheses
Hypotheses can be stated in various ways, as in the following example:
1. Older patients are more likely to fall than younger patients.
2. There is a relationship between a patient’s age and the likelihood of falling.
3. The risk of falling increases with the age of the patient.
4. Older patients differ from younger ones with respect to their risk of falling.
In each example, the hypothesis states the population (patients), the IV (age), the
outcome variable (falling), and an anticipated relationship between them.
Hypotheses can be either directional or nondirectional. A directional hypothesis
specifies the expected direction of the relationship between variables. In the four
versions of the hypothesis, versions 1 and 3 are directional because they predict that
older patients are more likely to fall than younger ones. A nondirectional hypothesis
does not stipulate the direction of the relationship (versions 2 and 4). These versions
predict that a patient’s age and falling are related but do not specify whether older or
younger patients are predicted to be at greater risk.
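In statistical software, this distinction surfaces as the choice between a two-sided and a one-sided test. The following hypothetical Python sketch (ours, using the SciPy library; the scores are invented for illustration) shows both forms of the age-and-falls hypothesis:

    # Hypothetical sketch: directional vs. nondirectional tests of fall-risk scores.
    from scipy import stats

    older = [4, 5, 6, 5, 7, 6]    # invented fall-risk scores for older patients
    younger = [3, 4, 3, 5, 4, 3]  # invented fall-risk scores for younger patients

    # Nondirectional (two-sided): are the two groups different at all?
    print(stats.ttest_ind(older, younger, alternative="two-sided").pvalue)

    # Directional (one-sided): do older patients score higher than younger ones?
    print(stats.ttest_ind(older, younger, alternative="greater").pvalue)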
TIP Hypotheses can be either simple hypotheses (with a single independent
variable and dependent variable) or complex (multiple independent or
dependent variables). Information about this distinction is available in
the Supplement to this chapter on the website.
Another distinction is between research and null hypotheses. Research hypotheses
are statements of expected relationships between variables. All the hypotheses presented
thus far are research hypotheses that indicate actual expectations.
Statistical inference operates on a logic that may be confusing. This logic requires
that hypotheses be expressed as an expected absence of a relationship. Null hypotheses
state that there is no relationship between the independent and dependent variables. The
null form of the hypothesis in our preceding example would be “Older patients are just
as likely as younger patients to fall.” The null hypothesis can be compared with the
assumption of innocence in many systems of criminal justice: The variables are
assumed to be “innocent” of a relationship until they can be shown “guilty” through
statistical tests.
Research articles typically state research rather than null hypotheses. In statistical
testing, underlying null hypotheses are assumed, without being stated.
TIP If a researcher uses statistical tests (which is true in most quantitative
studies), it means that there are underlying hypotheses—regardless of
whether the researcher explicitly stated them—because statistical tests are
designed to test hypotheses.
Hypothesis Testing and Proof
Hypotheses are formally tested through statistical analysis. Researchers use statistics to
assess whether their results are likely to reflect true relationships rather than chance; by
convention, results are considered statistically significant when the probability that they
reflect chance alone is less than .05. Statistical analysis does not offer proof; it only
supports inferences that a hypothesis is probably correct (or not). Hypotheses are never proved or
disproved; rather, they are supported or rejected. Hypotheses come to be increasingly
supported with evidence from multiple studies.
To illustrate why this is so, suppose we hypothesized that height and weight are
related. We predict that, on average, tall people weigh more than short people. Suppose
we happened by chance to get a sample of short, heavy people and tall, thin people. Our
results might indicate that there is no relationship between a person’s height and weight.
But we would not be justified in concluding that the study proved or demonstrated that
height and weight are unrelated.
This example illustrates the difficulty of using observations from a sample to
generalize to a population. Other issues, such as the accuracy of the measures and the
effects of uncontrolled variables, prevent researchers from concluding that hypotheses
are proved.
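To make this logic concrete, here is a minimal, hypothetical sketch in Python (using the NumPy and SciPy libraries, which are our choice for illustration and not part of this textbook) that simulates the height–weight example:

    # Hypothetical illustration of testing a hypothesis on a small sample.
    # All numbers are invented; requires numpy and scipy.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)

    n = 20
    height = rng.normal(170, 10, size=n)               # heights in cm
    weight = 0.9 * height + rng.normal(0, 12, size=n)  # kg; truly related, but noisy

    r, p = stats.pearsonr(height, weight)
    print(f"r = {r:.2f}, p = {p:.3f}")

If p is below .05, we say the data support the hypothesized relationship. With a small or unlucky sample, p can exceed .05 even though the relationship is real in the population; in that case the hypothesis is simply not supported, never disproved.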
CRITIQUING RESEARCH PROBLEMS, RESEARCH
QUESTIONS, AND HYPOTHESES
In a comprehensive critique of a research article, you would evaluate whether
researchers have adequately communicated their research problem. The problem
statement, purpose, research questions, and hypotheses set the stage for describing what
was done and what was learned. You should not have to dig too deeply to figure out the
research problem or discover the questions.
A critique of the research problem involves multiple dimensions. Substantively, you
need to consider whether the problem has significance for nursing. Studies that build on
existing evidence in a meaningful way can make contributions to EBP. Also, research
problems stemming from research priorities (see Chapter 1) have a high likelihood of
yielding important evidence for nurses.
Another dimension in critiquing the research problem concerns methodologic issues
—in particular, whether the research problem is compatible with the chosen research
paradigm and its associated methods. You should also evaluate whether the statement of
purpose or research questions lend themselves to research inquiry.
If a research article describing a quantitative study does not state hypotheses, you
should consider whether their absence is justified. If there are hypotheses, you should
evaluate whether the hypotheses are sensible and consistent with existing evidence or
relevant theory. Also, hypotheses are valid guideposts in scientific inquiry only if they
are testable. To be testable, hypotheses must predict a relationship between two or more
measurable variables.
Specific guidelines for critiquing research problems, research questions, and
hypotheses are presented in Box 6.3.
Box 6.3 Guidelines for Critiquing Research Problems, Research Questions, and
Hypotheses
1. What was the research problem? Was the problem statement easy to locate and
was it clearly stated? Did the problem statement build a cogent and persuasive
argument for the new study?
2. Does the problem have significance for nursing?
3. Was there a good fit between the research problem and the paradigm (and
tradition) within which the research was conducted?
4. Did the report formally present a statement of purpose, research question, and/or
hypotheses? Was this information communicated clearly and concisely, and was it
placed in a logical and useful location?
5. Were purpose statements or research questions worded appropriately (e.g., Were
key concepts/variables identified and the population specified?)
6. If there were no formal hypotheses, was their absence justified? Were statistical
tests used in analyzing the data despite the absence of stated hypotheses?
7. Were hypotheses (if any) properly worded—did they state a predicted relationship
between two or more variables? Were they presented as research or as null
hypotheses?
This section describes how the research problem and research questions were
communicated in two nursing studies, one quantitative and one qualitative.
Read the summaries and then answer the critical thinking questions that
follow, referring to the full research report if necessary. Examples 1 and 2 are
featured on the interactive Critical Thinking Activity on the website. The
critical thinking questions for Examples 3 and 4 are based on the studies that
appear in their entirety in Appendices A and B of this book. Our comments
for these exercises are in the Student Resources section on the website.
EXAMPLE 1: QUANTITATIVE RESEARCH
Study: Association of maternal and infant salivary testosterone and cortisol
and infant gender with mother–infant interaction in very-low-birthweight
infants (Cho et al., 2015).
Problem Statement (excerpt): “Prematurity-related health and
developmental problems are more common in male VLBW (very-low-
birthweight, less than 1500 g) infants than in females. . . . Furthermore, male
VLBW infants experience less positive mother-infant interactions than
females do. These associations raise important questions about whether the
vulnerability of male VLBW infants to suboptimal mother-infant interactions
is due to factors beyond gender socialization. . . . Based on the association of
elevated testosterone in infants. . . . with negative cognitive and behavioral
outcomes and of high or low cortisol with infant health and development,
both hormones may affect mother-infant interactions” (pp. 357–359)
(citations were omitted to streamline the presentation).
Statement of Purpose: “The purpose of this . . . study was to examine
possible associations between these steroid hormonal levels and mother-
VLBW-infant interactions and their potential importance for gender
differences” (p. 359).
Research Questions: One of the research questions for this study was “Are
elevated levels of salivary testosterone and cortisol negatively associated with
the quality of mother-VLBW infant interactions at three and six months?” (p.
359).
Hypothesis: “We hypothesized that the levels of testosterone and cortisol in
VLBW infants would be negatively associated with mother-infant
interactions, especially among male infants” (p. 359).
Study Methods: The study participants were 62 mother–VLBW infant pairs
recruited from a level IV neonatal intensive care unit. Data were collected
through infant record review, interviews with the mothers, biochemical
measurements of both mothers and infants, and observation of mother–infant
interactions at 40 weeks postmenstrual age and at 3 and 6 months corrected
age.
Key Findings: Higher maternal testosterone and infant cortisol were
associated with more positive and more frequent maternal interactive
behaviors. Mothers interacted with their infants more frequently when the
infants had lower levels of testosterone.
Critical Thinking Exercises
1. Answer the relevant questions from Box 6.3 regarding this study.
2. Also consider the following targeted questions:
a. Where in the research report do you think the researchers presented the
hypotheses? Where in the report would the results of the hypothesis
tests be placed?
b. Was the stated hypothesis directional or nondirectional?
c. Was the researchers’ hypothesis supported in the statistical analysis?
3. If the results of this study are valid and generalizable, what are some of the
uses to which the findings might be put in clinical practice?
EXAMPLE 2: QUALITATIVE RESEARCH
Study: Adolescent and young adult survivors of childhood brain tumors: Life
after treatment in their own words (Hobbie et al., 2016)
Problem Statement (excerpt): “Although 5-year survival rates for children
diagnosed with brain tumors have improved to 75%, survivors report late
effects that can be acute or long term, episodic, or progressive. . . . Gaps exist
in evidence regarding the perspectives of AYA (adolescents and young
adults) regarding their HRQOL (health-related quality of life). . . . To date,
there are few studies that examine the perspectives of AYA survivors of
childhood brain tumors in terms of their sense of self and their role in their
families” (p. 135) (citations were omitted to streamline the presentation).
Statement of Purpose: “The aim of this study was to describe how
adolescent and young adult survivors of childhood brain tumors describe their
health-related quality of life, that is, their physical, emotional, and social
functioning” (p. 134).
Research Question: “We specifically asked: How do AYA survivors of
childhood brain tumors describe their HRQOL (physical, emotional, and
social functioning)?” (p. 135).
Method: The researchers recruited a sample of 41 adolescents and young
adult survivors of a childhood brain tumor who were living with their
families. In-depth interviews were conducted in a private setting at the homes
of study participants. Participants were asked several conversational
questions, such as “Tell me about yourself” and “What parts of your life are
most challenging?”
Key Findings: The researchers found that the survivors struggled for
normalcy in the face of changed functioning due to their cancer and the late
effects of their treatment.
Critical Thinking Exercises
1. Answer the relevant questions from Box 6.3 regarding this study.
2. Also consider the following targeted questions:
a. Where in the research report do you think the researchers presented the
statement of purpose and research questions?
b. Does it appear that this study was conducted within one of the three
main qualitative traditions? If so, which one?
3. If the results of this study are trustworthy, what are some of the uses to
which the findings might be put in clinical practice?
EXAMPLE 3: QUANTITATIVE RESEARCH IN APPENDIX A
• Read the abstract and introduction of Swenson and colleagues’ (2016)
study (“Parents’ use of praise and criticism in a sample of young children
seeking mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 6.3 regarding this study.
2. Also answer the following question: What might a hypothesis for this
study be? State it as a research hypothesis and as a null hypothesis.
EXAMPLE 4: QUALITATIVE RESEARCH IN APPENDIX B
• Read the abstract and introduction of Beck and Watson’s (2010) study
(“Subsequent childbirth after a previous traumatic birth”) in Appendix B
of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 6.3 regarding this study.
2. Also consider the following targeted questions:
a. Do you think that Beck and Watson provided a sufficient rationale for
the significance of their research problem?
b. In their argument for their study, did Beck and Watson say anything
about the fourth element of an argument identified in the book—i.e., the
consequences of the problem?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Simple and Complex Hypotheses
• Answers to the Critical Thinking Exercises for Examples 3 and 4
• Internet Resources with useful websites for Chapter 6
• A Wolters Kluwer journal article in its entirety—the Hobbie et al. study
described as Example 2 on pp. 103–104.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
• A research problem is a perplexing or troubling situation that a researcher wants to address through disciplined inquiry.
• Researchers usually identify a broad topic, narrow the scope of the problem, and then identify research questions consistent with a paradigm of choice.
• Researchers communicate their aims in research articles as problem statements, statements of purpose, research questions, or hypotheses.
• The problem statement articulates the nature, context, and significance of a problem to be studied. Problem statements typically include several components: problem identification; background, scope, and consequences of the problem; knowledge gaps; and possible solutions to the problem.
• A statement of purpose, which summarizes the overall study goal, identifies the key concepts (variables) and the study group or population. Purpose statements often communicate, through the choice of verbs and other key terms, aspects of the study design or the research tradition.
• Research questions are the specific queries researchers want to answer in addressing the research problem.
• A hypothesis states predicted relationships between two or more variables—that is, the anticipated association between independent and dependent variables.
• Directional hypotheses predict the direction of a relationship; nondirectional hypotheses predict the existence of relationships, not their direction.
• Research hypotheses predict the existence of relationships; null hypotheses, which express the absence of a relationship, are the hypotheses subjected to statistical testing.
• Hypotheses are never proved or disproved—they are accepted or rejected, supported or not supported by the data.
REFERENCES FOR CHAPTER 6
Chang, S., Chien, N., & Chen, M. (2015). Regular exercise and depressive symptoms in community-dwelling
elders in northern Taiwan. Journal of Nursing Research. Advance online publication.
Cho, J., Su, X., Phillips, V., & Holditch-Davis, D. (2015). Association of maternal and infant salivary testosterone
and cortisol and infant gender with mother–infant interaction in very-low-birthweight infants. Research in
Nursing & Health, 38, 357–368.
*Clark, A., McDougall, G., Riegel, B., Joiner-Rogers, G., Innerarity, S., Meraviglia, M., . . . Davila, A. (2015).
Health status and self-care outcomes after an education-support intervention for people with chronic heart
failure. Journal of Cardiovascular Nursing, 30, S3–S13.
*Forbes, C., Blanchard, C., Mummery, W., & Courneya, K. (2015). Prevalence and correlates of strength exercise
among breast, prostate, and colorectal cancer survivors. Oncology Nursing Forum, 42, 118–127.
**Hobbie, W., Ogle, S., Reilly, M., Barakat, L., Lucas, M., Ginsberg, J., . . . Deatrick, J. (2016). Adolescent and
young adult survivors of childhood brain tumors: Life after treatment in their own words. Cancer Nursing, 39,
134–143.
Judge, M., Beck, C. T., Durham, H., McKelvey, M., & Lammi-Keefe, C. (2014). Pilot trial evaluating maternal
docosahexaenoic acid consumption during pregnancy: Decreased postpartum depressive symptomatology.
International Journal of Nursing Sciences, 1, 339–345.
*Swall, A., Ebbeskog, B., Lundh Hagelin, C., & Fagerberg, I. (2015). Can therapy dogs evoke awareness of one’s
past and present life in persons with Alzheimer’s disease? International Journal of Older People Nursing, 10,
84–93.
*Thomas, T., Blumling, A., & Delaney, A. (2015). The influence of religiosity and spirituality on rural parents’
health decision-making and human papillomavirus vaccine choices. Advances in Nursing Science, 38, E1–E12.
Yeager, K., Sterk, C., Quest, T., Dilorio, C., Vena, C., & Bauer-Wu, S. (2016). Managing one’s symptoms: A
qualitative study of low-income African Americans with advanced cancer. Cancer Nursing, 39(4), 303–312.
*A link to this open-access article is provided in the Internet Resources section on the website.
**This journal article is available on the website for this chapter.
7 Finding and Reviewing Research
Evidence in the Literature
Learning Objectives
On completing this chapter, you will be able to:
• Understand the steps involved in doing a literature review
• Identify bibliographic aids for retrieving nursing research reports and locate references for a research topic
• Understand the process of screening, abstracting, critiquing, and organizing research evidence
• Evaluate the style, content, and organization of a literature review
• Define new terms in the chapter
Key Terms
Bibliographic database
CINAHL
Google Scholar
Keyword
Literature review
MEDLINE
MeSH
Primary source
PubMed
Secondary source
A literature review is a written summary of the state of evidence on a research
problem. It is useful for consumers of nursing research to acquire skills for reading,
critiquing, and preparing written evidence summaries.
BASIC ISSUES RELATING TO LITERATURE
REVIEWS
Before discussing the activities involved in undertaking a research-based literature
review, we briefly discuss some general issues. The first concerns the purposes of doing
a literature review.
Purposes of Research Literature Reviews
The primary purpose of literature reviews is to summarize evidence on a topic—to sum
up what is known and what is not known. Literature reviews are sometimes stand-alone
reports intended to communicate the state of evidence to others, but reviews are also
used to lay the foundation for new studies and to help researchers interpret their
findings.
In qualitative research, opinions about literature reviews vary. Grounded theory
researchers typically begin to collect data before examining the literature. As a theory
takes shape, researchers turn to the literature, seeking to relate prior findings to the
theory. Phenomenologists and ethnographers often undertake a literature search at the
outset of a study.
Regardless of when they perform the review, researchers usually include a brief
summary of relevant literature in their introductions. The literature review summarizes
current evidence on a topic and illuminates the significance of the new study. Literature
reviews are often intertwined with the problem statement as part of the argument for the
study.
Types of Information to Seek for a Research Review
Findings from prior studies are the “data” for a research review. If you are preparing a
literature review, you should rely mostly on primary sources, which are descriptions of
studies written by the researchers who conducted them. Secondary source research
documents are descriptions of studies prepared by someone else. Literature reviews are
secondary sources. Recent reviews are a good place to start because they offer
overviews and valuable bibliographies. If you are doing your own literature review,
however, secondary sources should not be considered substitutes for primary sources
because secondary sources are not adequately detailed and may not be completely
objective.
TIP For an evidence-based practice (EBP) project, a recent, high-quality
systematic review may be sufficient to provide the needed information about
the evidence base, although it is usually a good idea to search for studies
published after the review. We provide more explicit guidance on searching
for evidence for an EBP query in the chapter supplement on
website.
171
A literature search may yield nonresearch references, including opinion articles,
case reports, and clinical anecdotes. Such materials may broaden understanding of a
problem or demonstrate a need for research. These writings, however, may have limited
utility in research reviews because they do not address the central question: What is the
current state of evidence on this research problem?
Major Steps and Strategies in Doing a Literature Review
Conducting a literature review is a little bit like doing a study: A reviewer starts with a
question and then must gather, analyze, and interpret the information. Figure 7.1 depicts
the literature review process and shows that there are potential feedback loops, with
opportunities to go back to earlier steps in search of more information.
Reviews should be unbiased, thorough, and up-to-date. Also, high-quality reviews
are systematic. Decision rules for including a study should be explicit because a good
review should be reproducible. This means that another diligent reviewer would be able
to apply the same decision rules and come to similar conclusions about the state of
evidence on the topic.
TIP Locating all relevant information on a research question is like being a
detective. The literature retrieval tools we discuss in this chapter are helpful,
but there inevitably needs to be some digging for, and sifting of, the clues to
evidence on a topic. Be prepared for sleuthing!
Doing a literature review is in some ways similar to undertaking a qualitative study.
It is useful to have a flexible approach to “data collection” and to think creatively about
opportunities for new sources of information.
LOCATING RELEVANT LITERATURE FOR A
RESEARCH REVIEW
An early step in a literature review is devising a strategy to locate relevant studies. The
ability to locate evidence on a topic is an important skill that requires adaptability—
rapid technological changes mean that new methods of searching the literature are
introduced continuously. We urge you to consult with librarians or faculty at your
institution for updated suggestions.
Developing a Search Strategy
Having good search skills is important. A particularly productive approach is to search for
evidence in bibliographic databases, which we discuss next. Reviewers also use the
ancestry approach (“footnote chasing”), in which citations from relevant studies are
used to track down earlier research on which the studies are based (the “ancestors”). A
third strategy, the descendancy approach, involves finding a pivotal early study and
searching forward to find more recent studies (“descendants”) that cited the key study.
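For readers who like to think of these strategies concretely, both approaches amount to walking a citation graph backward or forward. The short Python sketch below is purely illustrative; the paper names and citation links are invented:

    # Hypothetical citation graph: paper -> papers it cites (its "ancestors").
    cites = {
        "Key2010": ["Early2001", "Early2003"],
        "New2015": ["Key2010"],
        "New2017": ["Key2010", "New2015"],
    }

    def ancestors(paper):
        # Ancestry approach ("footnote chasing"): follow the reference list.
        return cites.get(paper, [])

    def descendants(paper):
        # Descendancy approach: find later papers that cite this one.
        return [p for p, refs in cites.items() if paper in refs]

    print(ancestors("Key2010"))    # ['Early2001', 'Early2003']
    print(descendants("Key2010"))  # ['New2015', 'New2017']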
TIP You may be tempted to begin a literature search through an Internet
search engine, such as Yahoo, Google, or Bing. Such a search is likely to
yield a lot of “hits” on your topic but is unlikely to give you full
bibliographic information on research literature on your topic.
Decisions must also be made about limiting the search. For example, reviewers may
constrain their search to reports written in one language. You may also want to limit
your search to studies conducted within a certain time frame (e.g., within the past 10
years).
Searching Bibliographic Databases
Bibliographic databases are accessed by computer. Most databases can be accessed
through user-friendly software with menu-driven systems and on-screen support so that
minimal instruction is needed to retrieve articles. Your university or hospital library
probably has subscriptions to these services.
Getting Started With an Electronic Search
Before searching a bibliographic database electronically, you should become familiar
with the features of the software you are using to access it. The software has options for
restricting or expanding your search, for combining two searches, for saving your
search, and so on. Most programs have tutorials, and most also have Help buttons.
An early task in an electronic search is identifying keywords to launch the search
(although an author search for prominent researchers in a field is also possible). A
keyword is a word or phrase that captures key concepts in your question. For
quantitative studies, the keywords are usually the independent or dependent variables
(i.e., at a minimum, the “I” and “O” of the PICO components) and perhaps the
population. For qualitative studies, the keywords are the central phenomenon and the
population. If you use the question templates for asking clinical questions in Table 2.1,
the words you enter in the blanks are likely to be good keywords.
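As a quick, hypothetical illustration (the clinical question and terms are invented, and the code is ours rather than the textbook's), keywords can be read directly off the PICO components:

    # Hypothetical example: starter keywords drawn from PICO components.
    pico = {
        "P": "dementia",       # population
        "I": "music therapy",  # intervention or influence
        "C": "usual care",     # comparison (often not needed as a keyword)
        "O": "agitation",      # outcome
    }

    # At a minimum, search on the I and O components, perhaps adding the population:
    keywords = [pico["I"], pico["O"], pico["P"]]
    print(" AND ".join(keywords))  # music therapy AND agitation AND dementia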
TIP If you want to identify all research reports on a topic, you need to be
flexible and to think broadly about keywords. For example, if you are
interested in anorexia, you might look up anorexia, eating disorders, and
weight loss and perhaps appetite, eating behavior, food habits, bulimia, and
body weight changes.
There are various search approaches for a bibliographic search. All citations in a
database have to be coded so they can be retrieved, and databases and programs use
their own system of categorizing entries. The indexing systems have specific subject
headings (subject codes).
You can undertake a subject search by entering a subject heading into the search
field. You do not have to worry about knowing the subject codes because most software
has mapping capabilities. Mapping is a feature that allows you to search for topics using
your own keywords rather than the exact subject heading used in the database. The
software translates (“maps”) your keywords into the most plausible subject heading and
then retrieves citation records that have been coded with that subject heading.
When you enter a keyword into the search field, the program likely will launch both
a subject search and a textword search. A textword search looks for your keyword in the
text fields of the records, i.e., in the title and the abstract. Thus, if you searched for lung
cancer in the MEDLINE database (which we describe in a subsequent section), the
search would retrieve citations coded for the subject code of lung neoplasms (the
MEDLINE subject heading used to code entries) and also any entries in which the
phrase lung cancer appeared, even if it had not been coded for the lung neoplasm
subject heading.
Some features of an electronic search are similar across databases. One feature is
that you usually can use Boolean operators to expand or delimit a search. Three widely
used Boolean operators are AND, OR, and NOT (in all caps). The operator AND
delimits a search. If we searched for pain AND children, the software would retrieve
only records that have both terms. The operator OR expands the search: pain OR
children could be used in a search to retrieve records with either term. Finally, NOT
narrows a search: pain NOT children would retrieve all records with pain that did not
include the term children.
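The following short Python sketch mimics the effect of the three operators on a toy set of records; it is our own illustration, not how any real database is implemented:

    # Hypothetical records, each indexed by a set of terms.
    records = [
        {"id": 1, "terms": {"pain", "children"}},
        {"id": 2, "terms": {"pain", "adults"}},
        {"id": 3, "terms": {"children", "fever"}},
    ]

    def search(recs, must=(), either=(), exclude=()):
        """AND = all `must` terms; OR = at least one `either` term;
        NOT = none of the `exclude` terms."""
        hits = []
        for rec in recs:
            t = rec["terms"]
            if (all(m in t for m in must)
                    and (not either or any(e in t for e in either))
                    and not any(x in t for x in exclude)):
                hits.append(rec["id"])
        return hits

    print(search(records, must=("pain", "children")))             # pain AND children -> [1]
    print(search(records, either=("pain", "children")))           # pain OR children -> [1, 2, 3]
    print(search(records, must=("pain",), exclude=("children",))) # pain NOT children -> [2]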
Wildcard and truncation symbols are other useful tools. A truncation symbol (often
an asterisk, *) expands a search term to include all forms of a root. For example, a
search for child* would instruct the computer to search for any word that begins with
“child” such as children, childhood, or childrearing. In some databases, wildcard
symbols (often ? or *) inserted in the middle of a search term permit a search for
alternative spellings. For example, a search for behavio?r would retrieve records with
either behavior or behaviour. For each database, it is important to learn what these
special symbols are and how they work. Note that the use of special symbols, while
useful, may turn off a software’s mapping feature.
One way to force a textword search is to use quotation marks around a phrase,
which yields citations in which the exact phrase appears in text fields. In other words,
lung cancer and “lung cancer” might yield different results. A thorough search strategy
might entail doing a search with and without wildcard characters and with and without
quotation marks.
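Conceptually, truncation and wildcard symbols behave like simple regular expressions. The hypothetical Python sketch below models them one way; real databases implement this internally, so the mapping is ours, for illustration only:

    # Hypothetical mapping of search symbols onto regular expressions.
    import re

    def to_regex(term):
        # Trailing '*' = truncation (any ending); '?' = an optional character.
        pattern = re.escape(term).replace(r"\*", r"\w*").replace(r"\?", r"\w?")
        return re.compile(rf"\b{pattern}\b", re.IGNORECASE)

    titles = ["Childhood obesity", "Childrearing practices",
              "Behaviour change", "Behavior therapy", "Adult health"]

    for term in ["child*", "behavio?r"]:
        rx = to_regex(term)
        print(term, "->", [t for t in titles if rx.search(t)])
    # child*    -> ['Childhood obesity', 'Childrearing practices']
    # behavio?r -> ['Behaviour change', 'Behavior therapy']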
Two especially useful electronic databases for nurses are CINAHL (Cumulative
Index to Nursing and Allied Health Literature) and MEDLINE (Medical Literature On-
Line), which we discuss in the next sections. We also briefly discuss Google Scholar.
Other useful bibliographic databases for nurses include the Cochrane Database of
Systematic Reviews, Web of Knowledge, Scopus, and EMBASE (the Excerpta Medica
database). The Web of Knowledge database is useful for a descendancy search strategy
because of its strong citation indexes.
TIP If your goal is to conduct a systematic review, you will need to
establish an explicit formal plan about your search strategy and keywords, as
discussed in Chapter 18.
The CINAHL Database
CINAHL is an important electronic database for nurses. It covers references to
hundreds of nursing and allied health journals as well as to books and dissertations.
CINAHL contains about 3 million records.
CINAHL provides information for locating references (i.e., the author, title, journal,
year of publication, volume, and page numbers) and abstracts for most citations. Links
to actual articles are often provided. We illustrate features of CINAHL but note that
some features may be different at your institution and changes are introduced
periodically.
A “basic search” in CINAHL involves entering keywords in the search field (more
options for expanding and limiting the search are available in the “Advanced Search”
mode). You can restrict your search to records with certain features (e.g., only ones with
abstracts), to specific publication dates (e.g., only those after 2010), to those published
in English, or to those coded as being in a certain subset (e.g., nursing). The basic
search screen also allows you to expand the search by clicking the option “Apply related
words.”
To illustrate with a concrete example, suppose we were interested in research on the
effect of music on agitation in people with dementia. We entered the following terms in
the search field and placed only one limit on the search—only records with abstracts:
    music AND agitat* AND (dementia OR alzheimer)
By clicking the Search button, we got dozens of “hits” (citations). Note that we used
two Boolean operators. The use of “AND” ensured that retrieved records had to include
all three keywords, and the use of “OR” allowed either dementia or Alzheimer to be the
third keyword. Also, we used a truncation symbol * in the second keyword. This
instructed the computer to search for any word that begins with “agitat” such as agitated
or agitation.
All of the identified references were then displayed on
the monitor, and we could view and print full information for those that seemed
promising. An example of an abridged CINAHL record entry for a report identified
through this search is presented in Figure 7.2. The title of the article and author
information is displayed, followed by source information. The source indicates the
following:
• Name of the journal (Geriatric Nursing)
• Year and month of publication (Jan/Feb 2016)
• Volume (37)
• Issue (1)
• Page numbers (25–29)
Figure 7.2 also shows the CINAHL major and minor subject headings that were
coded for this particular study. Any of these headings could have been used in a subject
heading search to retrieve this reference. Note that the subject headings include
substantive headings such as Agitation–Therapy—In Old Age as well as methodologic
(e.g., Paired T-Tests) and sample characteristic headings (e.g., Aged; Inpatients). The
subject terms have hyperlinks so that we could expand the search by clicking on them
(we could also click on the author’s name or on the journal). The abstract for the study
is then presented, with the search terms bolded. Next, the names of any formal
instruments used in the study are printed under “Instrumentation.” Based on the
abstract, we would then decide whether this reference was pertinent to our inquiry. Note
that there is also a sidebar link in each record called Times Cited in this Database,
which would retrieve records for articles that had cited this paper (for a descendancy
search).
The MEDLINE Database
The MEDLINE database, developed by the U.S. National Library of Medicine, is the
premier source for bibliographic coverage of the biomedical literature. MEDLINE
covers about 5,600 medical, nursing, and health journals and has more than 24 million
records. MEDLINE can be accessed for free on the Internet at the PubMed website.
PubMed is a lifelong resource regardless of your institution’s access to bibliographic
databases.
MEDLINE uses a controlled vocabulary called MeSH (Medical Subject Headings)
to index articles. MeSH terminology provides a consistent way to retrieve information
that may use different terminology for the same concepts. Once you have begun a
search, a field on the right side of the screen labeled “Search Details” lets you see how
the keywords you entered mapped onto MeSH terms, which might lead you to pursue
other leads.
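Searches like these can also be scripted. As one hedged example, the freely available Biopython library (our choice of tool; it is not mentioned in this textbook) wraps NCBI's E-utilities. The sketch below assumes biopython is installed and that you supply your own e-mail address, as NCBI requests:

    # Hypothetical sketch: querying PubMed programmatically via Biopython.
    from Bio import Entrez

    Entrez.email = "you@example.edu"  # NCBI asks for a contact address

    handle = Entrez.esearch(db="pubmed", term="lung cancer", retmax=10)
    result = Entrez.read(handle)
    handle.close()

    print(result["Count"])             # total number of matching records
    print(result["IdList"])            # PubMed IDs of the first 10 hits
    print(result["QueryTranslation"])  # how the keywords were mapped

The QueryTranslation field plays the same role as the "Search Details" box: it shows that lung cancer was expanded to the lung neoplasms subject heading plus textword matches.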
When we did a PubMed search of MEDLINE analogous to the one we described
earlier for CINAHL, using the same keywords and restrictions, 90 records were
retrieved. The list of records in the two PubMed and CINAHL searches overlapped
considerably, but new references were found in each search. Both searches, however,
retrieved the study by Davison—the CINAHL record for which was shown in Figure
7.2. The PubMed record for the same reference is presented in Figure 7.3. As you can
see, the MeSH terms in Figure 7.3 are different from the CINAHL subject headings in
Figure 7.2.
TIP After you have found a study that is a good exemplar of what you are
looking for, you can search for other similar studies in the database. In
PubMed, after identifying a key study, you could click on “Similar articles”
on the right of the screen to locate similar studies. In CINAHL, you would
click on “Find Similar Results.”
Google Scholar
Google Scholar (GS) is a popular bibliographic search engine that was launched in
2004. GS includes articles in journals from scholarly publishers in all disciplines and
also includes books, technical reports, and other documents. One advantage of GS is
that it is accessible free of charge over the Internet. Like other bibliographic search
engines, GS allows users to search by topic, by a title, and by author and uses Boolean
operators and other search conventions. Also like PubMed and CINAHL, GS has a
Cited By feature for a descendancy search and a Related Articles feature to locate other
sources with relevant content to an identified article. Because of its expanded coverage
of material, GS can provide greater access to free full-text publications.
In the field of medicine, GS has generated controversy, with some arguing that it is
of similar utility and quality to popular medical databases and others urging caution in
depending primarily on GS. The capabilities and features of GS may improve in the
years ahead, but at the moment, it may be risky to depend on GS exclusively. For a
comprehensive literature review, we think it is best to combine searches using GS with
searches of other databases.
Example of a bibliographic search
Zuckerman (2016) did a literature review on the use of oral chlorhexidine to prevent
ventilator-associated pneumonia. She searched for relevant studies in four
bibliographic databases: CINAHL, PubMed, Scopus, and EMBASE. A total of 47
articles were initially identified; only 16 were duplicates. (This journal article is
available on the website.)
Screening, Documentation, and Abstracting
After searching for and retrieving references, several important steps remain before a
synthesis can begin.
Screening and Gathering References
References that have been identified in the search need to be screened for relevance.
You can usually surmise relevance by reading the abstract. When you find a relevant
article, try to obtain a full copy rather than relying on information in the abstract only.
TIP The open-access journal movement is gaining momentum in health
care publishing. Open-access journals provide articles free of charge online.
When an article is not available online, you may be able to access it by
communicating with the lead author, either directly through an e-mail or
through a resource called ResearchGate (www.researchgate.net).
Documentation in Literature Retrieval
Search strategies are often complex, so it is wise to document your search actions and
results. You should make note of databases searched, keywords used, limits instituted,
and any other information that would help you keep track of what you did. Part of your
strategy can be documented by printing your search history from the electronic
databases. Documentation will promote efficiency by preventing unintended duplication
and will also help you to assess what else needs to be tried.
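A running log file is one low-tech way to do this. The sketch below is ours; the file name and example values are invented:

    # Hypothetical search log kept as a CSV file, one row per search.
    import csv
    from datetime import date

    def log_search(path, database, keywords, limits, hits):
        """Append one row describing a search so it can be reproduced later."""
        with open(path, "a", newline="") as f:
            csv.writer(f).writerow(
                [date.today().isoformat(), database, keywords, limits, hits])

    log_search("search_log.csv", "CINAHL",
               "music AND agitat* AND (dementia OR alzheimer)",
               "abstracts only", 41)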
Abstracting and Recording Information
Once you have retrieved useful articles, you need a strategy to organize the information
in the articles. For simple reviews, it may be sufficient to make notes about key features
of the retrieved studies and to base your review on these notes. When a literature review
involves a large number of studies, a formal system of recording information from each
study may be needed. One mechanism that we recommend for complex reviews is to
code the characteristics of each study and then record codes in a set of matrices, a
system that we describe in detail elsewhere (Polit & Beck, 2017).
Another approach is to “copy and paste” each abstract and citation information from
the bibliographic database into a word processing document. Then, the bottom of each
page could have a “mini-protocol” for recording important information that you want to
record consistently across studies. There is no fixed format for such a protocol—you
must decide what elements are important to record systematically to help you organize
and analyze information. We present an example for a half-page protocol in Figure 7.4,
with entries that would be most suitable for Therapy/Intervention questions. Although
many of the terms on this protocol are probably not familiar to you at this point, you
will learn their meaning in subsequent chapters.
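For those who prefer structured records to a word processing document, the same mini-protocol idea can be captured in code. The field names below are illustrative only; they echo the spirit of Figure 7.4 rather than reproduce it:

    # Hypothetical structured "mini-protocol" for abstracting studies.
    from dataclasses import dataclass, field

    @dataclass
    class StudyRecord:
        citation: str
        design: str = ""        # e.g., "RCT," "cohort," "grounded theory"
        sample: str = ""        # who was studied, and how many
        intervention: str = ""  # relevant for Therapy/Intervention questions
        key_findings: str = ""
        quality_notes: str = "" # critique notes to support the synthesis
        themes: list = field(default_factory=list)

    rec = StudyRecord(
        citation="Davison et al. (2016). Geriatric Nursing, 37(1), 25-29.",
        sample="older inpatients with dementia (illustrative)",
        key_findings="music associated with reduced agitation (illustrative)",
    )
    rec.themes.append("substantive: music and agitation")
    print(rec.citation)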
EVALUATING AND ANALYZING THE EVIDENCE
In drawing conclusions about a body of evidence, reviewers must make judgments
about the worth of the studies. Thus, an important part of doing a literature review is
evaluating the body of completed studies and integrating the evidence across studies.
Evaluating Studies for a Review
In reviewing the literature, you would not undertake a comprehensive critique of each
study, but you would need to assess the quality of each study so that you could draw
conclusions about the overall body of evidence and about gaps in the evidence.
Critiques for a literature review tend to focus on study methods, so the critiquing
guidelines in Tables 4.1 and 4.2 might be useful.
In literature reviews, methodologic features of the studies under review need to be
assessed with an eye to answering a broad question: To what extent do the findings
reflect the truth (the true state of affairs) or, conversely, to what extent do flaws
undermine the believability of the evidence? The “truth” is most likely to be discovered
when researchers use powerful designs, good sampling plans, high-quality data
collection procedures, and appropriate analyses.
Analyzing and Synthesizing Evidence
Once relevant studies have been retrieved and critiqued, the information has to be
analyzed and synthesized. We find the analogy between doing a literature review and
doing a qualitative study useful: In both, the focus is on the identification of important
themes.
A thematic analysis essentially involves detecting patterns and regularities—as well
as inconsistencies. A number of different types of themes can be identified in a
literature review analysis, three of which are as follows:
• Substantive themes: What is the pattern of evidence—what findings predominate? How much evidence is there? How consistent is the body of evidence? What gaps are there in the evidence?
• Methodologic themes: What methods have been used to address the question? What are major methodologic deficiencies and strengths?
• Generalizability/transferability themes: To what population does the evidence apply? Do the findings vary for different types of people (e.g., men vs. women) or settings (e.g., urban vs. rural)?
In preparing a review, you would need to determine which themes are most relevant
for the purpose at hand. Most often, substantive themes are of greatest interest.
PREPARING A WRITTEN LITERATURE REVIEW
Writing literature reviews can be challenging, especially when voluminous information
and thematic analyses must be condensed into a few pages. We offer some suggestions,
but we recognize that skills in writing literature reviews develop over time.
Organizing for a Written Review
Organization is crucial in preparing a written review. When literature on a topic is
extensive, it is useful to summarize the retrieved information in a table. The table could
include columns with headings such as Author, Sample Characteristics, Design, and
Key Findings. Such a table provides a quick overview that allows you to make sense of
a mass of information.
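If you have kept structured notes, such a table can even be generated automatically. Here is a small, hypothetical sketch using pandas (our choice of tool; the rows condense the two example studies from Chapter 6, and the design labels are our shorthand):

    # Hypothetical summary table of retrieved studies, built with pandas.
    import pandas as pd

    studies = [
        {"Author": "Cho et al. (2015)", "Sample": "62 mother-VLBW infant pairs",
         "Design": "correlational", "Key Findings": "hormone levels linked to interaction"},
        {"Author": "Hobbie et al. (2016)", "Sample": "41 AYA brain tumor survivors",
         "Design": "qualitative", "Key Findings": "struggle for normalcy after treatment"},
    ]
    print(pd.DataFrame(studies).to_string(index=False))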
Most writers find an outline helpful. Unless the review is very simple, it is important
to have an organizational plan so that the review has a meaningful and understandable
flow. Although the specifics of the organization differ from topic to topic, the goal is to
structure the review to lead logically to a conclusion about the state of evidence on the
topic. After finalizing an organizing structure, you should review your notes or
protocols to decide where a particular reference fits in the outline. If some references do
not seem to fit anywhere, they may need to be omitted. Remember that the number of
references is less important than their relevance.
Writing a Literature Review
It is beyond the scope of this textbook to offer detailed guidance on writing research
reviews, but we offer a few comments on their content and style. Additional assistance
is provided in books such as those by Fink (2014) and Garrard (2014).
Content of the Written Literature Review
A written research review should provide readers with an objective synthesis of current
evidence on a topic. Although key studies may be described in detail, it is not necessary
to provide particulars for every reference. Studies with comparable findings often can
be summarized together, as illustrated in the third paragraph of Example 1 at the end of
this chapter.
Findings should be summarized in your own words. The review should demonstrate
that you have considered the cumulative worth of the body of research. Stringing
together quotes from articles fails to show that previous research has been assimilated
and understood.
The review should be as unbiased as possible. The review should not omit a study
because its findings contradict those of other studies or conflict with your ideas.
Inconsistent results should be analyzed and the supporting evidence evaluated
objectively.
A literature review typically concludes with a summary of current evidence on the
topic. The summary should recap key findings, assess their credibility, and point out
gaps in the evidence. When the literature review is conducted for a new study, the
summary should demonstrate the need for the research and clarify the context for any
hypotheses.
As you read this book, you will become increasingly proficient in critically
evaluating the research literature. We hope you will understand the mechanics of doing
a research review once you have completed this chapter, but we do not expect that you
will be in a position to write a state-of-the-art review until you have acquired more
skills in research methods.
Style of a Research Review
Students preparing research reviews often have trouble writing in an appropriately
tentative style. Remember that hypotheses cannot be proved or disproved by statistical
testing, and no question can be definitively answered in a single study. The problem is
partly semantic: Hypotheses are not proved or verified; they are supported by research findings.
TIP Phrases indicating the tentativeness of research results, such as the
following, are appropriate:
Several studies have found . . .
Findings thus far suggest . . .
The results are consistent with the conclusion that . . .
There appears to be fairly strong evidence that . . .
Also, a literature review should include opinions sparingly and should explicitly
reference the source. Reviewers’ own opinions do not belong in a review, with the
exception of assessments of study quality.
CRITIQUING RESEARCH LITERATURE REVIEWS
Some nurses never prepare a written research review, and perhaps you will never be
required to do one. Most nurses, however, do read research reviews (including the
literature review sections of research reports), and they should be prepared to evaluate
such reviews critically.
It is often difficult to critique a research review if you are not familiar with the topic.
You may not be able to judge whether the author has included all relevant literature and
has adequately summarized knowledge on that topic. Some aspects of a research review,
however, are amenable to evaluation by readers who are not experts on the topic. A few
suggestions for critiquing research reviews are presented in Box 7.1. Extra critiquing
questions are relevant for systematic reviews, as we discuss in Chapter 18.
Box 7.1 Guidelines for Critiquing Literature Reviews
1. Does the review seem thorough and up-to-date? Did it include major studies on the
topic? Did it include recent research?
2. Did the review rely mainly on research reports, using primary sources?
3. Did the review critically appraise and compare key studies? Did it identify
important gaps in the literature?
4. Was the review well organized? Is the development of ideas clear?
5. Did the review use appropriate language, suggesting the tentativeness of prior
findings? Is the review objective?
6. If the review was in the introduction for a new study, did the review support the
need for the study?
7. If the review was designed to summarize evidence for clinical practice, did it draw
appropriate conclusions about practice implications?
In assessing a literature review, the overarching question is whether it summarizes
the current state of research evidence. If the review is written as part of an original
research report, an equally important question is whether the review lays a solid
foundation for the new study.
TIP Literature reviews in the introductions of research articles are almost
always very brief and are unlikely to present a thorough critique of existing
studies. Gaps in what has been studied, however, should be identified.
The best way to learn about the style, content, and organization of a research
literature review is to read reviews that appear in the nursing literature. We
present an excerpt from a review for a mixed method study—one involving
the collection and analysis of both quantitative and qualitative data. The
excerpt is followed by some questions to guide critical thinking—you can
refer to the entire report if needed. Example 1 is featured on the interactive
Critical Thinking Activity on the website. The critical thinking questions
for Examples 2 and 3 are based on the studies that appear in their entirety in
Appendices A and B of this book. Our comments for these exercises are in the
Student Resources section on the website.
EXAMPLE 1: LITERATURE REVIEW FROM A MIXED
METHOD STUDY
Study: Symptoms in women with peripartum cardiomyopathy: A mixed
method study (Patel et al., 2016)
Statement of Purpose: The purpose of this study was to explore and describe
women’s experiences of symptoms in peripartum cardiomyopathy.
Literature Review (excerpt): “Peripartum Cardiomyopathy (PPCM) is
idiopathic disease, rare in high income countries and a diagnosis of exclusion.
It is associated with, at times, severe heart failure (HF) occurring toward the
end of pregnancy or in the months following birth. The left ventricle may not
be dilated but the left ventricle ejection fraction is nearly always reduced
below 45%. The Heart Failure Association of the European Society of
Cardiology Working Group on PPCM defined it as: An idiopathic
cardiomyopathy presenting with HF secondary to left ventricle systolic
dysfunction towards the end of pregnancy or in the months following delivery,
where no other cause of HF is found. It is a diagnosis of exclusion. The left
ventricle may not be dilated but the ejection fraction is nearly always reduced
below 45% (Sliwa et al., 2010).
The incidence and prognosis of PPCM varies globally (Elkayam, 2011).
The true incidence is unknown, as the clinical presentation varies. Current
estimates range between 1:299 (Haiti), 1:1000 (South Africa), and 1:2500-
4000 births (USA) (Sliwa et al., 2006, 2010; Blauwet and Cooper, 2011;
Elkayam, 2011). No data exists on the prevalence of the disease in Europe
(Haghikia et al., 2013). Assuming an incidence of 1:3500 to 1:1400 births
would yield an expected incidence of up to 300 patients per year in Germany,
with severe, critical cardiac failure in around 30 (Hilfiker-Kleiner et al.,
2008). The incidence in Sweden has been estimated to be 1:9191 births
(Barasa et al., 2012).
The anatomical and physiological changes in the mother associated with
normal pregnancy are profound, and this may result in symptoms and signs
that overlap with those usually associated with disease outside of pregnancy
(Germain and Nelson-Piercy, 2011). The main/cardinal symptoms of PPCM
are those of HF and include fatigue, shortness of breath, and fluid retention
and thus diagnosis is often missed or delayed as initial symptoms are similar
to those of hemodynamic changes in normal pregnancy or early postpartum
period (Groesdonk et al., 2009; Sliwa et al., 2010; Germain and Nelson-
Piercy, 2011; Givertz, 2013). An analysis of internet narratives of women
with PPCM showed that symptoms overlap with normal discomforts of
pregnancy, and thus create space for clinicians to overlook the seriousness of
their situation (Morton, et al., 2014). A survey of women with PPCM
participating in an online support group showed their frustration with the
nursing staff (Hess et al., 2012) for being ignored, dismissed and neglected.
Only 4% of the posts on the forum described interactions with health care
professionals as positive.
The causes, risk factors, aetiology, treatment and prognosis of PPCM
have been described elsewhere (Ferriere et al., 1990; Cenac and Djibo, 1998;
Groesdonk et al., 2009; Sliwa et al., 2010; Elkayam, 2011; Germain and
Nelson-Piercy, 2011; Bachelier-Walenta et al., 2013; Givertz, 2013). There
are, however, a lot more questions that remain unanswered and women’s
experiences of symptoms of PPCM are rarely explored. As understanding
specific conditions from the ‘sufferers’ perspective is a foundational starting
point for caring (Watson, 2011), it is important to understand the subjective
experience and meaning of PPCM from the affected person’s perspective.
The lack of research in this area points to the need for knowledge
acquirement from those who are affected, to assist with differential and early
diagnosis of PPCM” (pp. 14–15).
Critical Thinking Exercises
1. Answer the relevant questions from Box 7.1 regarding this literature
review.
2. Also consider the following targeted questions, which may further sharpen
your critical thinking skills and assist you in understanding this study:
a. In performing the literature review, what keywords might the
researchers have used to search for prior studies?
b. Using the keywords, perform a computerized search to see if you can
find a recent relevant study to augment the review.
EXAMPLE 2: QUANTITATIVE RESEARCH IN APPENDIX A
• Read the introduction to Swenson and colleagues’ (2016) study (“Parents’
use of praise and criticism in a sample of young children seeking mental
health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 7.1 regarding this study.
2. Also consider the following targeted questions:
a. In performing the literature review, what keywords might have been
used to search for prior studies?
b. Using the keywords, perform a computerized search to see if you can
find a recent relevant study to augment the review.
EXAMPLE 3: QUALITATIVE RESEARCH IN APPENDIX B
• Read the abstract and introduction of Beck and Watson’s (2010) study
(“Subsequent childbirth after a previous traumatic birth”) in Appendix B
of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 7.1 regarding this study.
2. Also consider the following targeted questions:
a. What was the central phenomenon in this study? Was that phenomenon
adequately covered in the literature review?
b. In performing their literature review, what keywords might Beck and
Watson have used to search for prior studies?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Finding Evidence for an EBP Inquiry in PubMed
• Answers to the Critical Thinking Exercises for Examples 2 and 3
• Internet Resources with useful websites for Chapter 7
• A Wolters Kluwer journal article in its entirety—the Zuckerman study
described on p. 114.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
• A research literature review is a written summary of the state of evidence on a research problem.
• The major steps in preparing a written research review include formulating a question, devising a search strategy, searching and retrieving relevant sources, abstracting and encoding information, critiquing studies, analyzing and integrating the information, and preparing a written synthesis.
• Research reviews rely primarily on findings in research reports. Information in nonresearch references (e.g., opinion articles, case reports) may broaden understanding of a problem but has limited utility in summarizing evidence.
• A primary source is the original description of a study prepared by the researcher who conducted it; a secondary source is a description of a study by another person. Literature reviews should rely mostly on primary source material.
• Strategies for finding studies on a topic not only include the use of bibliographic tools but also include the ancestry approach (tracking down earlier studies cited in a reference list of a report) and the descendancy approach (using a pivotal study to search forward to subsequent studies that cited it).
• Key resources for a research literature search are the bibliographic databases that can be searched electronically. For nurses, the CINAHL and MEDLINE databases are especially useful.
• In searching a bibliographic database, users can do a keyword search that looks for terms in text fields of a database record (or that maps keywords onto the database's subject codes) or can search according to the subject heading codes themselves.
• Retrieved references must be screened for relevance, and then pertinent information can be abstracted and encoded for subsequent analysis. Studies must also be critiqued to assess the strength of evidence in existing research.
• The analysis of information from a literature search essentially involves the identification of important themes—regularities and patterns in the information.
• In preparing a written review, it is important to organize materials coherently. Preparation of an outline is recommended. The reviewers' role is to point out what has been studied, how adequate and dependable the studies are, and what gaps exist in the body of research.
REFERENCES FOR CHAPTER 7
Fink, A. (2014). Conducting research literature reviews: From the Internet to paper (4th ed.). Thousand Oaks, CA:
Sage.
Garrard, J. (2014). Health sciences literature review made easy: The matrix method (4th ed.). Burlington, MA:
Jones & Bartlett Learning.
*Patel, H., Berg, M., Barasa, A., Begley, C., & Schaufelberger, M. (2016). Symptoms in women with peripartum
cardiomyopathy: A mixed method study. Midwifery, 32, 14–20.
Polit, D., & Beck, C. (2017). Nursing research: Generating and assessing evidence for nursing practice (10th ed.).
Philadelphia, PA: Wolters Kluwer.
**Zuckerman, L. M. (2016). Oral chlorhexidine use to prevent ventilator-associated pneumonia in adults: Review
of the current literature. Dimensions of Critical Care Nursing, 35, 25–36.
*A link to this open-access article is provided in the Internet Resources section on the companion website.
**This journal article is available on the companion website for this chapter.
8 Theoretical and Conceptual
Frameworks
Learning Objectives
On completing this chapter, you will be able to:
Identify major characteristics of theories, conceptual models, and frameworks
Identify several conceptual models or theories frequently used by nurse researchers
Describe how theory and research are linked in quantitative and qualitative studies
Critique the appropriateness of a theoretical framework—or its absence—in a study
Define new terms in the chapter
Key Terms
Conceptual framework
Conceptual map
Conceptual model
Descriptive theory
Framework
Middle-range theory
Model
Schematic model
Theoretical framework
Theory
High-quality studies typically achieve a high level of conceptual integration. This
happens when the research questions fit the chosen methods, when the questions are
consistent with existing evidence, and when there is a plausible conceptual rationale for
expected outcomes—including a rationale for any hypotheses or interventions. For
example, suppose a research team hypothesized that a nurse-led smoking cessation
intervention would reduce smoking among patients with cardiovascular disease. Why
would they make this prediction—what is the “theory” about how the intervention
might change people’s behavior? Do the researchers predict that the intervention will
change patients’ knowledge? their attitudes? their motivation? The researchers’ view of
how the intervention would “work” should drive the design of the intervention and the
study.
Studies are not developed in a vacuum—there must be an underlying
conceptualization of people’s behaviors and characteristics. In some studies, the
underlying conceptualization is fuzzy or unstated, but in good research, a defensible
conceptualization is made explicit. This chapter discusses theoretical and conceptual
contexts for nursing research problems.
THEORIES, MODELS, AND FRAMEWORKS
Many terms are used in connection with conceptual contexts for research, such as
theories, models, frameworks, schemes, and maps. These terms are interrelated but are
used differently by different writers. We offer guidance in distinguishing these terms as
we define them.
Theories
In nursing education, the term theory is used to refer to content covered in classrooms,
as opposed to actual nursing practice. In both lay and scientific language, theory
connotes an abstraction.
Theory is often defined as an abstract generalization that explains how phenomena
are interrelated. As classically defined, theories consist of two or more concepts and a
set of propositions that form a logically interrelated system, providing a mechanism for
deducing hypotheses. To illustrate, consider reinforcement theory, which posits that
behavior that is reinforced (i.e., rewarded) tends to be repeated and learned. The
proposition lends itself to hypothesis generation. For example, we could deduce from
the theory that hyperactive children who are rewarded when they engage in quiet play
will exhibit fewer acting-out behaviors than unrewarded children. This prediction, as
well as others based on reinforcement theory, could be tested in a study.
The term theory is also used less restrictively to refer to a broad characterization of a
phenomenon. A descriptive theory accounts for and thoroughly describes a
phenomenon. Descriptive theories are inductive, observation-based abstractions that
describe or classify characteristics of individuals, groups, or situations by summarizing
their commonalities. Such theories are important in qualitative studies.
Theories can help to make research findings interpretable. Theories may guide
researchers’ understanding not only of the “what” of natural phenomena but also of the
“why” of their occurrence. Theories can also help to stimulate research by providing
direction and impetus.
Theories vary in their level of generality. Grand theories (or macrotheories) claim
to explain large segments of human experience. In nursing, there are grand theories that
offer explanations of the whole of nursing and that characterize the nature and mission
of nursing practice, as distinct from other disciplines. An example of a nursing theory
that has been described as a grand theory is Parse’s Humanbecoming Paradigm
(Parse, 2014). Theories of relevance to researchers are often less abstract than grand
theories. Middle-range theories attempt to explain such phenomena as stress, comfort,
and health promotion. Middle-range theories, compared to grand theories, are more
specific and more amenable to empirical testing.
Models
A conceptual model deals with abstractions (concepts) that are assembled because of
their relevance to a common theme. Conceptual models provide a conceptual
perspective on interrelated phenomena, but they are more loosely structured than
theories and do not link concepts in a logical deductive system. A conceptual model
broadly presents an understanding of a phenomenon and reflects the assumptions of the
model’s designer. Conceptual models can serve as springboards for generating
hypotheses.
Some writers use the term model to designate a method of representing phenomena
with a minimal use of words, because words can convey different meanings to different people.
Two types of models used in research contexts are schematic models and statistical
models. Statistical models, not discussed here, are equations that mathematically
express relationships among a set of variables and that are tested statistically.
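As a generic illustration (not an equation from any particular study), a simple
statistical model might take the form of a regression equation:

Y = b0 + b1X1 + b2X2 + e

where Y is an outcome variable, X1 and X2 are predictor variables, the b values are
coefficients estimated from the data, and e is a random error term.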
Schematic models (or conceptual maps) visually represent relationships among
phenomena and are used in both quantitative and qualitative research. Concepts and
linkages between them are depicted graphically through boxes, arrows, or other
symbols. As an example of a schematic model, Figure 8.1 shows Pender’s Health
Promotion Model, which is a model for explaining and predicting the health-promotion
component of lifestyle (Pender et al., 2015). Schematic models are appealing as visual
summaries of complex ideas.
Frameworks
A framework is the conceptual underpinning of a study. Not every study is based on a
theory or model, but every study has a framework. In a study based on a theory, the
framework is called the theoretical framework; in a study that has its roots in a
conceptual model, the framework may be called the conceptual framework. However,
the terms conceptual framework, conceptual model, and theoretical framework are often
used interchangeably.
A study’s framework is often implicit (i.e., not formally acknowledged or
described). Worldviews shape how concepts are defined, but researchers often fail to
clarify the conceptual foundations of their concepts. Researchers who clarify conceptual
definitions of key variables provide important information about the study’s framework.
Quantitative researchers are less likely to identify their frameworks than qualitative
researchers. In qualitative research within a research tradition, the framework is part of
that tradition. For example, ethnographers generally begin with a theory of culture.
Grounded theory researchers incorporate sociological principles into their framework
and approach. The questions that qualitative researchers ask often inherently reflect
certain theoretical formulations.
In recent years, concept analysis has become an important enterprise among
students and nurse scholars. Several methods have been proposed for undertaking a
concept analysis and clarifying conceptual definitions (e.g., Walker & Avant, 2011).
Efforts to analyze concepts of relevance to nursing should facilitate greater conceptual
clarity among nurse researchers.
Example of developing a conceptual definition
Ramezani and colleagues (2014) used Walker and Avant’s (2011) eight-step concept
analysis methods to conceptually define spiritual care in nursing. They searched and
analyzed national and international databases and found 151 relevant articles and 7
books. They proposed the following definition: “The attributes of spiritual care are
healing presence, therapeutic use of self, intuitive sense, exploration of the spiritual
perspective, patient centredness, meaning-centred therapeutic intervention and
creation of a spiritually nurturing environment” (p. 211).
The Nature of Theories and Conceptual Models
Theories, conceptual frameworks, and models are not discovered; they are created.
Theory building depends not only on observable evidence but also on a theorist’s
ingenuity in pulling evidence together and making sense of it. Because theories are not
just “out there” waiting to be discovered, it follows that theories are tentative. A theory
cannot be proved—a theory represents a theorist’s best efforts to describe and explain
phenomena. Through research, theories evolve and are sometimes discarded. This may
happen if new evidence undermines a previously accepted theory. Or, a new theory
might integrate new observations with an existing theory to yield a more parsimonious
explanation of a phenomenon.
Theory and research have a reciprocal relationship. Theories are built inductively
from observations, and research is an excellent source for those observations. The
theory, in turn, must be tested by subjecting deductions from it (hypotheses) to
systematic inquiry. Thus, research plays a dual and continuing role in theory building
and testing.
CONCEPTUAL MODELS AND THEORIES USED IN
NURSING RESEARCH
Nurse researchers have used both nursing and nonnursing frameworks as conceptual
contexts for their studies. This section briefly discusses several frameworks that have
been found useful by nurse researchers.
Conceptual Models of Nursing
Several nurses have formulated conceptual models representing explanations of what
the nursing discipline is and what the nursing process entails. As Fawcett and DeSanto-
Madeya (2013) have noted, four concepts are central to models of nursing: human
beings, environment, health, and nursing. The various conceptual models define these
concepts differently, link them in diverse ways, and emphasize different relationships
among them. Moreover, the models emphasize different processes as being central to
nursing.
The conceptual models were not developed primarily as a base for nursing research.
Indeed, most models have had more impact on nursing education and clinical practice
than on research. Nevertheless, nurse researchers have turned to these conceptual
frameworks for inspiration in formulating research questions and hypotheses.
TIP The Supplement to Chapter 8 on the companion website includes a table of
several prominent conceptual models in nursing. The table describes the
model’s key features and identifies a study that claimed the model as its
framework.
Let us consider one conceptual model of nursing that has received research
attention, Roy’s Adaptation Model. In this model, humans are viewed as
biopsychosocial adaptive systems who cope with environmental change through the
process of adaptation (Roy & Andrews, 2009). Within the human system, there are four
subsystems: physiologic/physical, self-concept/group identity, role function, and
interdependence. These subsystems constitute adaptive modes that provide mechanisms
for coping with environmental stimuli and change. Health is viewed as both a state and
a process of being and becoming integrated and whole, reflecting the mutuality of
persons and environment. The goal of nursing, according to this model, is to promote
client adaptation. Nursing interventions usually take the form of increasing, decreasing,
modifying, removing, or maintaining internal and external stimuli that affect adaptation.
Roy’s Adaptation Model has been the basis for several middle-range theories and
dozens of studies.
Research example using Roy’s Adaptation Model
Alvarado-García and Salazar Maya (2015) used Roy’s Adaptation Model as a basis
for their in-depth study of how elderly adults adapt to chronic benign pain.
Middle-Range Theories Developed by Nurses
In addition to conceptual models that describe and characterize the nursing process,
nurses have developed middle-range theories and models that focus on more specific
phenomena of interest to nurses. Examples of middle-range theories that have been used
in research include Beck’s (2012) Theory of Postpartum Depression, Kolcaba’s (2003)
Comfort Theory, Pender and colleagues’ (2015) Health Promotion Model, and Mishel’s
(1990) Uncertainty in Illness Theory. The latter two are briefly described here.
Nola Pender’s Health Promotion Model (HPM; Pender et al., 2015) focuses on explaining
health-promoting behaviors, using a wellness orientation. According to the model (see
Fig. 8.1), health promotion entails activities directed toward developing resources that
maintain or enhance a person’s well-being. The model embodies a number of
propositions that can be used in developing and testing interventions and understanding
health behaviors. For example, one HPM proposition is that people engage in behaviors
from which they anticipate deriving valued benefits, and another is that perceived
competence (or self-efficacy) relating to a given behavior increases the likelihood of
performing the behavior.
Example using the Health Promotion Model
Cole and Gaspar (2015) used the HPM as their framework for an evidence-based
project designed to examine the disease management behaviors of patients with
epilepsy and to guide the implementation of a self-management protocol for these
patients.
Mishel’s Uncertainty in Illness Theory (Mishel, 1990) focuses on the concept of
uncertainty—the inability of a person to determine the meaning of illness-related events.
According to this theory, people develop subjective appraisals to assist them in
interpreting the experience of illness and treatment. Uncertainty occurs when people are
unable to recognize and categorize stimuli. Uncertainty results in the inability to obtain
a clear conception of the situation, but a situation appraised as uncertain will mobilize
individuals to use their resources to adapt to the situation. Mishel’s conceptualization of
uncertainty and her Uncertainty in Illness Scale have been used in many nursing studies.
Example using Uncertainty in Illness Theory
Cypress (2016) used Mishel’s Uncertainty in Illness Theory as a foundation for
exploring uncertainty among chronically ill patients in the intensive care unit.
Other Models Used by Nurse Researchers
Many concepts in which nurse researchers are interested are not unique to nursing, and
so their studies are sometimes linked to frameworks that are not models from nursing.
Several alternative models have gained prominence in the development of nursing
interventions to promote health-enhancing behaviors and life choices. Four nonnursing
theories have frequently been used in nursing studies: Bandura’s (2001) Social
Cognitive Theory, Prochaska et al.’s (2002) Transtheoretical (Stages of Change) Model,
the Health Belief Model (Becker, 1974), and the Theory of Planned Behavior (Ajzen,
2005).
Social Cognitive Theory (Bandura, 2001), which is sometimes called self-efficacy
theory, offers an explanation of human behavior using the concepts of self-efficacy,
outcome expectations, and incentives. Self-efficacy concerns people’s belief in their
own capacity to carry out particular behaviors (e.g., smoking cessation). Self-efficacy
expectations determine the behaviors a person chooses to perform, their degree of
perseverance, and the quality of the performance. For example, C. Lee and colleagues
(2016) examined whether social cognitive theory–based factors, including self-efficacy,
were determinants of physical activity maintenance in breast cancer survivors 6 months
after a physical activity intervention.
TIP Self-efficacy is a key construct in several models discussed in this
chapter. Self-efficacy has repeatedly been found to affect people’s behaviors
and to be amenable to change, and so self-efficacy enhancement is often a
goal in interventions designed to change people’s health-related behavior.
In the Transtheoretical Model (Prochaska et al., 2002), the core construct is stages
of change, which conceptualizes a continuum of motivational readiness to change
problem behavior. The five stages of change are precontemplation, contemplation,
preparation, action, and maintenance. Studies have shown that successful self-changers
use different processes at each particular stage, thus suggesting the desirability of
interventions that are individualized to the person’s stage of readiness for change. For
example, M. K. Lee and colleagues (2014) tested a web-based self-management
intervention for breast cancer survivors. The exercise and diet intervention program
incorporated transtheoretical model–based strategies.
Becker’s (1974) Health Belief Model (HBM) is a framework for explaining people
’s health-related behavior, such as compliance with a medical regimen. According to the
model, health-related behavior is influenced by a person’s perception of a threat posed
by a health problem as well as by the value associated with actions aimed at reducing
the threat (Becker, 1974). A revised HBM (RHBM) has incorporated the concept of
self-efficacy (Rosenstock et al., 1988). Nurse researchers have used the HBM
extensively. For example, Jeihooni and coresearchers (2015) developed and tested an
osteoporosis prevention program based on the HBM.
The Theory of Planned Behavior (TPB; Ajzen, 2005), which is an extension of
another theory called the Theory of Reasoned Action, offers a framework for
understanding people’s behavior and its psychological determinants. According to the
theory, behavior that is volitional is determined by people’s intention to perform that
behavior. Intentions, in turn, are affected by attitudes toward the behavior, subjective
norms (i.e., perceived social pressure to perform or not perform the behavior), and
perceived behavioral control (i.e., anticipated ease or difficulty of engaging in the
behavior). Newham and colleagues (2016), for example, used TPB as a framework in
their study of pregnant women’s intentions toward physical activity and resting
behavior.
Although the use of theories and models from other disciplines such as psychology
(borrowed theories) has stirred some controversy, nursing research is likely to continue
on its current path of conducting studies within a multidisciplinary perspective. A
borrowed theory that is tested and found to be empirically adequate in health-relevant
situations of interest to nurses becomes shared theory.
TIP Links to websites devoted to several theories mentioned in this chapter
are provided in the Internet Resources on the companion website.
USING A THEORY OR FRAMEWORK IN RESEARCH
The ways in which theory is used by quantitative and qualitative researchers are
elaborated on in this section. The term theory is used in its broadest sense to include
conceptual models, formal theories, and frameworks.
Theories in Qualitative Research
Theory is almost always present in studies that are embedded in a qualitative research
tradition such as ethnography or phenomenology. However, different traditions involve
theory in different ways.
Sandelowski (1993) distinguished between substantive theory (conceptualizations of
a specific phenomenon under study) and theory reflecting a conceptualization of human
inquiry. Some qualitative researchers insist on an atheoretical stance vis-à-vis the
phenomenon of interest, with the goal of suspending prior conceptualizations
(substantive theories) that might bias their inquiry. For example, phenomenologists are
committed to theoretical naiveté and try to hold preconceived views of the phenomenon
in check. Nevertheless, phenomenologists are guided by a framework that focuses their
inquiry on certain aspects of a person’s lifeworld—i.e., lived experiences.
Ethnographers bring a cultural perspective to their studies, and this perspective
shapes their fieldwork. Cultural theories include ideational theories, which suggest that
cultural conditions stem from mental activity and ideas, and materialistic theories,
which view material conditions (e.g., resources, production) as the source of cultural
developments (Fetterman, 2010).
The theoretical underpinning of grounded theory is a melding of sociological
formulations, the most prominent of which is symbolic interaction (or interactionism).
Three underlying premises are that (1) humans act toward things based on the meanings
that the things have for them; (2) the meaning of things is derived from human
interactions; and (3) meanings are handled in, and modified through, an interpretive
process (Blumer, 1986).
Example of a grounded theory study
Babler and Strickland (2015) did a grounded theory study within a symbolic
interaction framework to gain an understanding of the efforts of adolescents with type
1 diabetes mellitus to “normalize.”
Despite this theoretical perspective, grounded theory researchers, like
phenomenologists, try to hold prior substantive theory about the phenomenon in
abeyance until their own substantive theory emerges. The goal of grounded theory is to
develop a conceptually dense understanding of a phenomenon that is grounded in actual
observations. Once the theory starts to take shape, grounded theorists use previous
literature for comparison with the emerging categories of the theory. Grounded theory
researchers, who focus on social or psychological processes, often develop conceptual
maps to illustrate how a process unfolds. Figure 8.2 illustrates such a conceptual map
for a study of the transition from patient to survivor in African American breast cancer
survivors (Mollica & Nemeth, 2015); this study is described at the end of this chapter.
In recent years, some qualitative nurse researchers have used critical theory as a
framework in their research. Critical theory is a paradigm that involves a critique of
society and societal processes and structures, as we discuss in Chapter 11.
Qualitative researchers sometimes use conceptual models of nursing or other formal
theories as interpretive frameworks. For example, a number of qualitative nurse
researchers acknowledge that the philosophic roots of their studies lie in conceptual
models of nursing such as those developed by Parse (2014), Roy (Roy & Andrews,
2009), Rogers (1994), or Newman (1997).
TIP Systematic reviews of qualitative studies on a specific topic can lead to
substantive theory development. In metasyntheses, qualitative studies are
combined to identify their essential elements. The findings from different
sources are then used for theory building, as discussed in Chapter 18.
Theories in Quantitative Research
Quantitative researchers link research to theory or models in various ways. The classic
approach is to test hypotheses deduced from an existing theory. For example, a nurse
might read about Pender’s HPM (Pender et al., 2015; see Fig. 8.1) and might reason as follows: If
the HPM is valid, then I would expect that patients with osteoporosis who perceive the
benefit of a calcium-enriched diet would be more likely to alter their eating patterns
than those who perceive no benefits. This hypothesis could be tested through statistical
analysis of data on patients’ perceptions in relation to their eating habits. Repeated
acceptance of hypotheses derived from a theory lends support to the theory.
TIP When a quantitative study is based on a theory or model, the research
article typically states this fact early—often in the abstract, or even in the
title. Some reports also have a subsection of the introduction called
“Theoretical Framework.” The report usually includes a brief overview of the
theory so that all readers can understand, in a broad way, the conceptual
context of the study.
Some researchers test theory-based interventions. Theories have implications for
influencing people’s attitudes or behavior and hence their health outcomes.
Interventions based on an explicit conceptualization of human behavior have a better
chance of being effective than ones developed in a conceptual vacuum. Interventions
rarely affect outcomes directly—there are mediating factors that play a role in the
pathway between the intervention and desired outcomes. For example, researchers
developing interventions based on Social Cognitive Theory posit that improvements to a
person’s self-efficacy will, in turn, result in positive changes in health behaviors and
health outcomes.
Example of theory testing in an intervention study
Smith and colleagues (2015) tested the effectiveness of a theory-based (Social
Cognitive Theory) antenatal lifestyle program for pregnant women whose body mass
index exceeded 30.
Many researchers who cite a theory or model as their framework are not directly
testing the theory but may use the theory to provide an organizing structure. In such an
approach, researchers assume that the model they espouse is valid and then use its
constructs or schemas to provide an interpretive context.
Quantitative researchers also use another approach to creating a conceptual context:
using findings from prior research to develop an original model. In
some cases, the model incorporates elements or constructs from an existing theory.
Example of developing a new model
Hoffman and colleagues (2014) developed a rehabilitation program for lung cancer
patients and then pilot tested it. The intervention was based on their own conceptual
model, which represented a synthesis of two theories, the Theory of Symptom-Care
Management and the Transitional Care Model.
CRITIQUING FRAMEWORKS IN RESEARCH
REPORTS
It is often challenging to critique the theoretical context of a published research report—
or its absence—but we offer a few suggestions.
In a qualitative study in which a grounded theory is developed, you may not be
given enough information to refute the proposed theory because only evidence
supporting the theory is presented. You can, however, assess whether
conceptualizations are insightful and whether the evidence is convincing. In a
phenomenological study, you should look for a discussion of the study’s philosophical
underpinnings, that is, the philosophy of phenomenology.
For quantitative studies, the first task is to see whether the study has an explicit
conceptual framework. If there is no mention of a theory, model, or framework (and
often there is not), you should consider whether this absence diminishes the value of the
study. Research often benefits from an explicit conceptual context, but some studies are
so pragmatic that the lack of a theory has no effect on their utility. If, however, the study
involves the test of a hypothesis or a complex intervention, the absence of a formal
framework suggests conceptual fuzziness.
If the study does have an explicit framework, you can reflect on its appropriateness.
You may not be able to challenge the researcher’s use of a particular theory, but you can
assess whether the link between the problem and the theory is genuine. Did the
researcher present a convincing rationale for the framework used? In quantitative
studies, did the hypotheses flow from the theory? Did the researcher interpret the
findings within the context of the framework? If the answer to such questions is no, you
may have grounds for criticizing the study’s framework, even though you may not be
able to suggest ways to improve the conceptual basis of the study. Some suggestions for
evaluating the conceptual basis of a quantitative study are offered in Box 8.1.
Box 8.1 Guidelines for Critiquing Theoretical and Conceptual Frameworks
1. Did the report describe an explicit theoretical or conceptual framework for the
study? If not, does the absence of a framework detract from the study’s conceptual
integration?
2. Did the report adequately describe the major features of the theory or model so
that readers could understand the conceptual basis of the study?
3. Is the theory or model appropriate for the research problem? Does the purported
link between the problem and the framework seem contrived?
4. Was the theory or model used for generating hypotheses, or was it used as an
organizational or interpretive framework? Do the hypotheses (if any) naturally
flow from the framework?
5. Were concepts defined in a way that is consistent with the theory? If there was an
intervention, were intervention components consistent with the theory?
6. Did the framework guide the study methods? For example, was the appropriate
research tradition used if the study was qualitative? If quantitative, do the
operational definitions correspond to the conceptual definitions?
7. Did the researcher tie the study findings back to the framework at the end of the
report? Were the findings interpreted within the context of the framework?
TIP Some studies claim theoretical linkages that are contrived. This is
most likely to occur when researchers first formulate the research problem
and then later find a theoretical context to fit it. An after-the-fact linkage of
theory to a research question is often artificial. If a research problem is truly
linked to a conceptual framework, then the design of the study, the
measurement of key constructs, and the analysis and interpretation of data
will flow from that conceptualization.
This section presents two examples of studies that have strong theoretical
links. Read the summaries and then answer the critical thinking questions,
referring to the full research report if necessary. Examples 1 and 2 are featured
on the interactive Critical Thinking Activity on the companion website. The critical
thinking questions for Examples 3 and 4 are based on the studies that appear
in their entirety in Appendices A and B of this book. Our comments for these
exercises are in the Student Resources section on the companion website.
EXAMPLE 1: THE HEALTH PROMOTION MODEL IN A
QUANTITATIVE STUDY
Study: The effects of coping skills training among teens with asthma (Srof et
al., 2012)
Statement of Purpose: The purpose of the study was to evaluate the effects
of a school-based intervention, coping skills training (CST), for teenagers
with asthma.
Theoretical Framework: The HPM, shown in Figure 8.1, was the guiding
framework for the intervention. The authors noted that within the HPM,
various behavior-specific cognitions (e.g., perceived barriers to behavior,
perceived self-efficacy) influence health-promoting behavior and are
modifiable through an intervention. In this study, the overall behavior of
interest was asthma self-management. The CST intervention was a five-
session small-group strategy designed to promote problem solving,
cognitive–behavior modification, and conflict resolution using strategies to
improve self-efficacy and reduce perceived barriers. The researchers
hypothesized that participation in CST would result in improved outcomes in
asthma self-efficacy, asthma-related quality of life, social support, and peak
expiratory flow rate (PEFR).
Method: In this pilot study, 39 teenagers with asthma were randomly
assigned to one of two groups—one that participated in the intervention and
the other that did not. The researchers collected data about the outcomes from
all participants at two points in time: before the start of the intervention and 6
weeks later.
Key Findings: Teenagers in the treatment group scored significantly higher
at the end of the study on self-efficacy, activity-related quality of life, and
social support than those in the control group.
Conclusions: The researchers noted that the self-efficacy and social support
effects of the intervention were consistent with the HPM. They recommended
that, although the findings were promising, replication of the study and an
extension to specifically examine asthma self-management behavior would be
useful.
Critical Thinking Exercises
1. Answer the relevant questions from Box 8.1 regarding this study.
2. Also consider the following targeted questions:
a. In the model shown in Figure 8.1, which factors did the researchers
predict that the intervention would affect, according to the abbreviated
description in the textbook?
b. Is there another model or theory that was described in this chapter that
could have been used to study the effect of this intervention?
3. If the results of this study are valid and generalizable, what might be some
of the uses to which the findings could be put in clinical practice?
EXAMPLE 2: A GROUNDED THEORY STUDY
Study: Transition from patient to survivor in African American breast cancer
survivors (Mollica & Nemeth, 2015)
Statement of Purpose: The purpose of the study was to examine the
experience of African American women as they transition between breast
cancer patient and breast cancer survivor.
Theoretical Framework: A grounded theory approach was chosen because
the researchers noted as a goal “the discovery of theory from data
systematically obtained and analyzed” (p. 17). The researchers further noted
the use of induction that is inherent in a grounded theory approach: “An open,
exploratory approach was used to identify recurrent meaningful concepts
through systematic, inductive analysis of content” (p. 17).
Method: Data were collected through interviews with 15 community-based
African American women who had completed treatment for primary breast
cancer between 6 and 18 months prior to the interviews. Women were
recruited from community settings in two American cities. The women were
interviewed by telephone. Each interview, which lasted about 45 minutes,
was audiotaped so that the interviews could be transcribed. The interviewer
asked broad questions about the women’s experiences following their
treatment for breast cancer. Recruitment and interviewing continued until no
new information was revealed—i.e., until data saturation occurred.
Key Findings: Based on their analysis of the in-depth interviews, the
researchers identified four main processes: perseverance through struggles
supported by reliance on faith, dealing with persistent physical issues,
needing anticipatory guidance after treatment, and finding emotional needs as
important as physical ones. A schematic model for the substantive theory is
presented in Figure 8.2.
Critical Thinking Exercises
1. Answer the relevant questions from Box 8.1 regarding this study.
2. Also consider the following targeted questions:
a. In what way was the use of theory different in the Mollica and Nemeth
study than in the previous study by Srof and colleagues?
b. Comment on the utility of the schematic model shown in Figure 8.2.
3. If the results of this study are trustworthy, what might be some of the uses
to which the findings could be put in clinical practice?
EXAMPLE 3: QUANTITATIVE RESEARCH IN APPENDIX A
• Read the introduction of Swenson and colleagues’ (2016) study (“Parents’
use of praise and criticism in a sample of young children seeking mental
health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 8.1 regarding this study.
2. Also consider the following question: Would any of the theories or models
described in this chapter have provided an appropriate conceptual context
for this study?
EXAMPLE 4: QUALITATIVE RESEARCH IN APPENDIX B
• Read the introduction of Beck and Watson’s (2010) study (“Subsequent
childbirth after a previous traumatic birth”) in Appendix B of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 8.1 regarding this study.
2. Also consider the following targeted questions:
a. Do you think that a schematic model would have helped to present the
findings in this report?
b. Did Beck and Watson present convincing evidence to support their use
of the philosophy of phenomenology?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the companion website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Prominent Conceptual Models of Nursing Used
by Nurse Researchers
• Answers to the Critical Thinking Exercises for Examples 3 and 4
• Internet Resources with useful websites for Chapter 8
• A Wolters Kluwer journal article in its entirety—the Mollica and Nemeth
study described as Example 2 on p. 133.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
High-quality research requires conceptual integration, one aspect of which is
having a defensible theoretical rationale for the study.
As classically defined, a theory is an abstract generalization that systematically
explains relationships among phenomena. Descriptive theory thoroughly
describes a phenomenon.
Grand theories (or macrotheories) attempt to describe large segments of the
human experience. Middle-range theories are specific to certain phenomena.
Concepts are also the basic elements of conceptual models, but concepts are not
linked in a logically ordered, deductive system.
In research, the goals of theories and models are to make findings meaningful, to
integrate knowledge into coherent systems, to stimulate new research, and to
explain phenomena and relationships among them.
Schematic models (or conceptual maps) are graphic representations of
phenomena and their interrelationships using symbols or diagrams and a minimal
use of words.
A framework is the conceptual underpinning of a study, including an overall
rationale and conceptual definitions of key concepts. In qualitative studies, the
framework often springs from distinct research traditions.
Several conceptual models of nursing have been used in nursing research. The
concepts central to models of nursing are human beings, environment, health, and
nursing. An example of a model of nursing used by nurse researchers is Roy’s
Adaptation Model.
Nonnursing models used by nurse researchers (e.g., Bandura’s Social Cognitive
Theory) are referred to as borrowed theories; when the appropriateness of
borrowed theories for nursing inquiry is confirmed, the theories become shared
theories.
In some qualitative research traditions (e.g., phenomenology), the researcher
strives to suspend previously held substantive theories of the specific phenomena
under study, but each tradition has rich theoretical underpinnings.
Some qualitative researchers seek to develop grounded theories, data-driven
explanations to account for phenomena under study through inductive processes.
In the classical use of theory, researchers test hypotheses deduced from an existing
theory. An emerging trend is the testing of theory-based interventions.
In both quantitative and qualitative studies, researchers sometimes use a theory or
model as an organizing framework, or as an interpretive tool.
REFERENCES FOR CHAPTER 8
Ajzen, I. (2005). Attitudes, personality, and behavior (2nd ed.). Berkshire, United Kingdom: Open University
Press.
*Alvarado-García, A., & Salazar Maya, Á. (2015). Adaptation to chronic benign pain in elderly adults.
Investigación y Educación en Enfermería, 33, 128–137.
Babler, E., & Strickland, C. (2015). Moving the journey towards independence: Adolescents transitioning to
successful diabetes self-management. Journal of Pediatric Nursing, 30, 648–660.
Bandura, A. (2001). Social cognitive theory: An agentic perspective. Annual Review of Psychology, 52, 1–26.
Beck, C. T. (2012). Exemplar: Teetering on the edge: A second grounded theory modification. In P. L. Munhall
(Ed.), Nursing research: A qualitative perspective (5th ed., pp. 257–284). Sudbury, MA: Jones & Bartlett
Learning.
Becker, M. (1974). The health belief model and personal health behavior. Thorofare, NJ: Slack.
Blumer, H. (1986). Symbolic interactionism: Perspective and method. Berkeley, CA: University of California
Press.
Cole, K. A., & Gaspar, P. (2015). Implementation of an epilepsy self-management protocol. Journal of
Neuroscience Nursing, 47, 3–9.
Cypress, B. S. (2016). Understanding uncertainty among critically ill patients in the intensive care unit using
Mishel’s Theory of Uncertainty of Illness. Dimensions of Critical Care Nursing, 35, 42–49.
Fawcett, J., & DeSanto-Madeya, S. (2013). Contemporary nursing knowledge: Analysis and evaluation of nursing
models and theories (3rd ed.). Philadelphia, PA: F. A. Davis.
Fetterman, D. M. (2010). Ethnography: Step-by-step (3rd ed.). Thousand Oaks, CA: Sage.
*Hoffman, A., Brintnall, R., von Eye, A., Jones, L., Alderink, G., Patzelt, L., & Brown, J. (2014). A rehabilitation
program for lung cancer patients during postthoracotomy chemotherapy. OncoTargets and Therapy, 7, 415–423.
*Jeihooni, A. K., Hidarnia, A., Kaveh, M., Hajizadeh, E., & Askari, A. (2015). Effects of an osteoporosis
prevention program based on Health Belief Model among females. Nursing and Midwifery Studies, 4, e26731.
Kolcaba, K. (2003). Comfort theory and practice: A vision for holistic health care and research. New York, NY:
Springer Publishing.
Lee, C., Szuck, B., & Lau, Y. (2016). Determinants of physical activity maintenance in breast cancer survivors
after a community-based intervention. Oncology Nursing Forum, 43, 93–102.
Lee, M. K., Yun, Y., Park, H., Lee, E., Jung, K., & Noh, D. (2014). A web-based self-management exercise and
diet intervention for breast cancer survivors: Pilot randomized controlled trial. International Journal of Nursing
Studies, 51, 1557–1567.
Mishel, M. H. (1990). Reconceptualization of the uncertainty in illness theory. Image, 22(4), 256–262.
**Mollica, M., & Nemeth, L. (2015). Transition from patient to survivor in African American breast cancer
survivors. Cancer Nursing, 38, 16–22.
Newham, J., Allan, C., Leahy-Warren, P., Carrick-Sen, D., & Alderdice, F. (2016). Intentions toward physical
activity and resting behavior in pregnant women: Using the theory of planned behavior framework in a cross-
sectional study. Birth, 43, 49–57.
Newman, M. (1997). Evolution of the theory of health as expanding consciousness. Nursing Science Quarterly, 10,
22–25.
Parse, R. R. (2014). The humanbecoming paradigm: A transformational worldview. Pittsburgh, PA: Discovery
International.
Pender, N. J., Murdaugh, C., & Parsons, M. A. (2015). Health promotion in nursing practice (7th ed.). Upper
Saddle River, NJ: Prentice Hall.
Prochaska, J. O., Redding, C. A., & Evers, K. E. (2002). The transtheoretical model and stages of change. In K.
Glanz, B. K. Rimer, & F. M. Lewis (Eds.). Health behavior and health education: Theory, research, and
practice (pp. 99–120). San Francisco, CA: Jossey-Bass.
Ramezani, M., Ahmadi, F., Mohammadi, E., & Kazemnejad, A. (2014). Spiritual care in nursing: A concept
analysis. International Nursing Review, 61, 211–219.
Rogers, M. E. (1994). The science of unitary human beings: Current perspectives. Nursing Science Quarterly, 7,
33–35.
Rosenstock, I., Strecher, V., & Becker, M. (1988). Social learning theory and the health belief model. Health
Education Quarterly, 15, 175–183.
Roy, C., & Andrews, H. (2009). The Roy adaptation model (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
Sandelowski, M. (1993). Theory unmasked: The uses and guises of theory in qualitative research. Research in
Nursing & Health, 16, 213–218.
Smith, D. M., Taylor, W., Whitworth, M., Roberts, S., Sibley, C., & Lavender, T. (2015). The feasibility phase of a
community antenatal lifestyle programme [The Lifestyle Course (TLC)] for women with a body mass index
(BMI) ≥ 30 kg/m². Midwifery, 31, 280–287.
Srof, B., Velsor-Friedrich, B., & Penckofer, S. (2012). The effects of coping skills training among teens with
asthma. Western Journal of Nursing Research, 34, 1043–1061.
Walker, L., & Avant, K. (2011). Strategies for theory construction in nursing (5th ed.). Upper Saddle River, NJ:
Prentice Hall.
*A link to this open-access article is provided in the Internet Resources section on the companion website.
**This journal article is available on the companion website for this chapter.
Part 3 Designs and Methods for Quantitative
and Qualitative Nursing Research
9 Quantitative Research Design
Learning Objectives
On completing this chapter, you will be able to:
Discuss key research design decisions for a quantitative study
Discuss the concepts of causality and identify criteria for causal relationships
Describe and identify experimental, quasi-experimental, and nonexperimental designs
Distinguish between cross-sectional and longitudinal designs
Identify and evaluate alternative methods of controlling confounding variables
Understand various threats to the validity of quantitative studies
Evaluate a quantitative study in terms of its research design and methods of controlling
confounding variables
Define new terms in the chapter
Key Terms
Attrition
Baseline data
Blinding
Case-control design
Cause
Cohort design
Comparison group
Construct validity
Control group
Correlation
Correlational research
Crossover design
Cross-sectional design
Descriptive research
Effect
Experiment
Experimental group
External validity
History threat
Homogeneity
Internal validity
Intervention
Longitudinal design
Matching
Maturation threat
Mortality threat
Nonequivalent control group design
Nonexperimental study
Placebo
Posttest data
Pretest–posttest design
Prospective design
Quasi-experiment
Randomization (random assignment)
Randomized controlled trial (RCT)
Research design
Retrospective design
Selection threat (self-selection)
Statistical conclusion validity
Statistical power
Threats to validity
Time-series design
Validity
For quantitative studies, no aspect of a study’s methods has a bigger impact on the
validity of the results than the research design—particularly if the inquiry is cause-
probing. This chapter explains how to draw conclusions about key
aspects of evidence quality in a quantitative study.
OVERVIEW OF RESEARCH DESIGN ISSUES
The research design of a study spells out the strategies that researchers adopt to answer
their questions and test their hypotheses. This section describes some basic design
issues.
Key Research Design Features
Table 9.1 describes seven key features that are typically addressed in the design of a
quantitative study. Design decisions that researchers must make include the following:
Will there be an intervention? A basic design issue is whether or not researchers will
introduce an intervention and test its effects—the distinction between experimental
and nonexperimental research.
What types of comparisons will be made? Quantitative researchers often make
comparisons to provide an interpretive context. Sometimes, the same people are
compared at different points in time (e.g., preoperatively vs. postoperatively), but
often, different people are compared (e.g., those getting vs. not getting an
intervention).
How will confounding variables be controlled? In quantitative research, efforts are
often made to control factors extraneous to the research question. This chapter
discusses techniques for controlling confounding variables.
Will blinding be used? Researchers must decide if information about the study (e.g.,
who is getting an intervention) will be withheld from data collectors, study
participants, or others to minimize the risk of expectation bias—i.e., the risk that
such knowledge could influence study outcomes.
How often will data be collected? Data sometimes are collected from participants at a
single point in time (cross-sectionally), but other studies involve multiple points of
data collection (longitudinally).
When will “effects” be measured, relative to potential causes? Some studies collect
information about outcomes and then look back retrospectively for potential causes.
Other studies begin with a potential cause and then see what outcomes ensue, in a
prospective fashion.
Where will the study take place? Data for quantitative studies are collected in various
settings, such as in hospitals or people’s homes. Another decision concerns how
many different sites will be involved in the study—a decision that could affect the
generalizability of the results.
Many design decisions are independent of the others. For example, both
experimental and nonexperimental studies can compare different people or the same
people at different times. This chapter describes the implications of design decisions on
the study’s rigor.
TIP Information about the research design usually appears early in the
method section of a research article.
Causality
Many research questions are about causes and effects. For example, does turning
patients cause reductions in pressure ulcers? Does exercise cause improvements in heart
function? Causality is a hotly debated issue, but we all understand the general concept
of a cause. For example, we understand that failure to sleep causes fatigue and that high
caloric intake causes weight gain. Most phenomena are multiply determined. Weight
gain, for example, can reflect high caloric intake or other factors. Causes are seldom
deterministic; they only increase the likelihood that an effect will occur. For example,
smoking is a cause of lung cancer, but not everyone who smokes develops lung cancer,
and not everyone with lung cancer smoked.
While it might be easy to grasp what researchers mean when they talk about a cause,
what exactly is an effect? One way to understand an effect is by conceptualizing a
counterfactual (Shadish et al., 2002). A counterfactual is what would happen to people
if they were exposed to a causal influence and were simultaneously not exposed to it.
An effect represents the difference between what actually did happen with the exposure
and what would have happened without it. A counterfactual clearly can never be
realized, but it is a good model to keep in mind in thinking about research design.
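One common way to formalize this idea—our sketch, borrowing notation from the
potential outcomes framework rather than from this chapter—is:

Effect for person i = Yi(exposed) - Yi(not exposed)
Average effect = mean of Y(exposed) - mean of Y(not exposed)

Because each person can actually be observed under only one of the two conditions,
research designs aim to construct a credible stand-in for the unobservable half of the
comparison.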
Three criteria for establishing causal relationships are attributed to John Stuart Mill.
1. Temporal: A cause must precede an effect in time. If we test the hypothesis that
smoking causes lung cancer, we need to show that cancer occurred after smoking
began.
2. Relationship: There must be an association between the presumed cause and the
effect. In our example, we have to demonstrate an association between smoking and
cancer—that is, that a higher percentage of smokers than nonsmokers get lung
cancer.
3. Confounders: The relationship cannot be explained as being caused by a third
variable. Suppose that smokers tended to live predominantly in urban environments.
There would then be a possibility that the relationship between smoking and lung
cancer reflects an underlying causal connection between the environment and lung
cancer.
Other criteria for causality have been proposed. One important criterion in health
research is biologic plausibility—evidence from basic physiologic studies that a causal
pathway is credible. Researchers investigating causal relationships must provide
persuasive evidence regarding these criteria through their research design.
Research Questions and Research Design
Quantitative research is used to address different types of research questions, and
different designs are appropriate for different questions. In this chapter, we focus
primarily on designs for Therapy, Prognosis, Etiology/Harm, and Description questions
(Meaning questions require a qualitative approach and are discussed in Chapter 11).
Except for Description, questions that call for a quantitative approach usually
concern causal relationships:
Does a telephone counseling intervention for patients with prostate cancer cause
improvements in their psychological distress? (Therapy question)
Do birth weights under 1,500 g cause developmental delays in children? (Prognosis
question)
Does salt cause high blood pressure? (Etiology/Harm question)
Some designs are better at revealing cause-and-effect relationships than others. In
particular, experimental designs (randomized controlled trials or RCTs) are the best
possible designs for illuminating causal relationships—but it is not always possible to
use such designs. Table 9.2 summarizes a “hierarchy” of designs for answering different
types of causal questions and augments the evidence hierarchy presented in Figure 2.1
(see Chapter 2).
EXPERIMENTAL, QUASI-EXPERIMENTAL, AND
NONEXPERIMENTAL DESIGNS
This section describes designs that differ with regard to whether or not there is an
intervention.
Experimental Design: Randomized Controlled Trials
Early scientists learned that complexities occurring in nature can make it difficult to
understand relationships through pure observation. This problem was addressed by
isolating phenomena and controlling the conditions under which they occurred. These
experimental procedures have been adopted by researchers interested in human
physiology and behavior.
Characteristics of True Experiments
A true experiment or RCT is characterized by the following properties:
Intervention—The experimenter does something to some participants by manipulating
the independent variable.
Control—The experimenter introduces controls into the study, including devising an
approximation of a counterfactual—usually a control group that does not receive the
intervention.
Randomization—The experimenter assigns participants to a control or experimental
condition on a random basis.
By introducing an intervention, experimenters consciously vary the independent
variable and then observe its effect on the outcome. To illustrate, suppose we were
investigating the effect of gentle massage (I), compared to no massage (C), on pain (O)
in nursing home residents (P). One experimental design for this question is a pretest–
posttest design, which involves observing the outcome (pain levels) before and after
the intervention. Participants in the experimental group receive a gentle massage,
whereas those in the control group do not. This design permits us to see if changes in
pain were caused by the massage because only some people received it, providing an
important comparison. In this example, we met the first criterion of a true experiment by
varying massage receipt, the independent variable.
This example also meets the second requirement for experiments, use of a control
group. Inferences about causality require a comparison, but not all comparisons yield
equally persuasive evidence. For example, if we were to supplement the diet of
premature babies (P) with special nutrients (I) for 2 weeks, their weight (O) at the end
of 2 weeks would tell us nothing about the intervention’s effectiveness. At a minimum,
we would need to compare posttreatment weight with pretreatment weight to see if
weight had increased. But suppose we find an average weight gain of 1 pound. Does
this finding support an inference of a causal connection between the nutritional
intervention (the independent variable) and weight gain (the outcome)? No, because
infants normally gain weight as they mature. Without a control group—a group that
does not receive the supplements (C)—it is impossible to separate the effects of
maturation from those of the treatment. The term control group refers to a group of
participants whose performance on an outcome is used to evaluate the performance of
the experimental group (the group getting the intervention) on the same outcome.
Experimental designs also involve placing participants in groups at random.
Through randomization (also called random assignment), every participant has an
equal chance of being included in any group. If people are randomly assigned, there is
no systematic bias in the groups with regard to attributes that may affect the dependent
variable. Randomly assigned groups are expected to be comparable, on average, with
respect to an infinite number of biologic, psychological, and social traits at the outset of
the study. Group differences on outcomes observed after randomization can therefore be
inferred as being caused by the intervention.
Random assignment can be accomplished by flipping a coin or pulling names from a
hat. Researchers typically use computers to perform the randomization.
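As a minimal sketch of how a computer might carry out simple random assignment
(written in Python; the function name and participant IDs are hypothetical, not from
any particular trial):

import random

def randomize(participant_ids, seed=None):
    # Shuffle a copy of the ID list so that every participant has an
    # equal chance of landing in either group.
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}

# Assign six participants; fixing the seed makes the assignment reproducible.
print(randomize(["P01", "P02", "P03", "P04", "P05", "P06"], seed=42))

Actual trials often use more elaborate schemes (e.g., blocked or stratified
randomization) to keep group sizes balanced, but the underlying logic is the same.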
TIP There is a lot of confusion about random assignment versus random
sampling. Random assignment is a signature of an experimental design
(RCT). If subjects are not randomly assigned to intervention groups, then the
design is not a true experiment. Random sampling, by contrast, refers to a
method of selecting people for a study, as we discuss in Chapter 10. Random
sampling is not a signature of an experimental design. In fact, most RCTs do
not involve random sampling.
Experimental Designs
The most basic experimental design involves randomizing people to different groups
and then measuring outcomes. This design is sometimes called a posttest-only design. A
more widely used design, discussed earlier, is the pretest–posttest design, which
involves collecting pretest data (often called baseline data) on the outcome before the
intervention and posttest (outcome) data after it.
Example of a pretest–posttest design
Berry and coresearchers (2015) tested the effectiveness of a postpartum weight
management intervention for low-income women. The women were randomly
assigned to be in the intervention group or in a control group. Data on weight,
adiposity, and health behaviors were gathered at baseline and at the end of the
intervention.
TIP Experimental designs can be depicted graphically using symbols to
represent features of the design. Space does not permit us to present these
diagrams here, but many are shown in the Supplement to this chapter on the book's
companion website.
The people who are randomly assigned to different conditions usually are different
people. For example, if we were testing the effect of music on agitation (O) in patients
with dementia (P), we could give some patients music (I) and others no music (C). A
crossover design, by contrast, involves exposing people to more than one treatment.
Such studies are true experiments only if people are randomly assigned to different
orderings of treatment. For example, if a crossover design were used to compare the
effects of music on patients with dementia, some would be randomly assigned to receive
music first followed by a period of no music, and others would receive no music first. In
such a study, the three conditions for an experiment have been met: There is
intervention, randomization, and control—with subjects serving as their own control
group.
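A brief Python sketch of randomizing people to treatment orderings in a crossover design follows; the condition labels and participant IDs are illustrative assumptions, not from any actual study.

```python
import itertools
import random

def assign_orderings(participants, conditions, seed=7):
    """Give each participant one randomly chosen ordering of the
    treatment conditions, so that ordering itself is randomized."""
    rng = random.Random(seed)
    orderings = list(itertools.permutations(conditions))
    return {person: rng.choice(orderings) for person in participants}

schedule = assign_orderings(["P01", "P02", "P03", "P04"], ["music", "no music"])
for person, order in schedule.items():
    print(person, "->", " then ".join(order))
```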
A crossover design has the advantage of ensuring the highest possible equivalence
among the people exposed to different conditions. Such designs are sometimes
inappropriate, however, because of possible carryover effects. When subjects are
exposed to two different treatments, they may be influenced in the second condition by
their experience in the first. However, when carryover effects are implausible, as when
intervention effects are immediate and short-lived, a crossover design is powerful.
Example of a crossover design
DiLibero and colleagues (2015) used a crossover design to test whether withholding
or continuing enteral feedings during repositioning of patients affected the incidence
of aspiration. Patients were randomly assigned to different orderings of enteral
feeding treatment.
Experimental and Control Conditions
To give an intervention a fair test, researchers need to design one that is of sufficient
intensity and duration that effects on the outcome might reasonably be expected.
Researchers describe the intervention in formal protocols that stipulate exactly what the
treatment is.
Researchers have choices about what to use as the control condition, and the
decision affects the interpretation of the findings. Among the possibilities for the control
condition are the following:
“Usual care”—standard or normal procedures are used to treat patients
An alternative treatment (e.g., music vs. massage)
A placebo or pseudointervention presumed to have no therapeutic value
An attention control condition (the control group gets attention but not the
intervention’s active ingredients)
Delayed treatment, i.e., control group members are wait-listed and exposed to the
intervention at a later point
Example of a wait-listed control group
Song and Lindquist (2015) tested the effectiveness of a mindfulness-based stress
reduction intervention for reducing stress, anxiety, and depression in Korean nursing
students. Students were randomly assigned to the intervention group or a wait-listed
control group.
Ethically, the delayed treatment design is attractive but is not always feasible.
Testing two alternative interventions is also appealing ethically, but the risk is that the
results will be inconclusive because it may be difficult to detect differential effects of
two good treatments.
Researchers must also consider possibilities for blinding. Many nursing
interventions do not lend themselves easily to blinding. For example, if the intervention
were a smoking cessation program, participants would know whether they were
receiving the intervention, and the intervener would know who was in the program. It is
usually possible and desirable, however, to conceal participants’ group status from the
people collecting outcome data.
Example of an experiment with blinding
Kundu and colleagues (2014) studied the effect of Reiki therapy on postoperative
pain in children undergoing dental procedures. Study participants were blinded—
those in the control group received a sham Reiki treatment. Those who recorded the
children’s pain scores, the nurses caring for the children, and the children’s parents
were also blinded to group assignments.
TIP The term double blind is widely used when more than one group is
blinded (e.g., participants and interventionists). However, this term is falling
into disfavor because of its ambiguity, in favor of clear specifications about
exactly who was blinded and who was not.
Advantages and Disadvantages of Experiments
RCTs are the “gold standard” for intervention studies (Therapy questions) because they
yield the most persuasive evidence about the effects of an intervention. Through
randomization to groups, researchers come as close as possible to attaining an “ideal”
counterfactual.
The great strength of experiments lies in the confidence with which causal
relationships can be inferred. Through the controls imposed by intervening, comparing,
and—especially—randomizing, alternative explanations can often be ruled out. For this
reason, meta-analyses of RCTs, which integrate evidence from multiple experimental
studies, are at the pinnacle of evidence hierarchies for questions relating to causes (Fig.
2.1 of Chapter 2).
Despite the advantages of experiments, they have limitations. First, many interesting
variables simply are not amenable to intervention. A large number of human traits, such
as disease or health habits, cannot be randomly conferred. That is why RCTs are not at
the top of the hierarchy for Prognosis questions (Table 9.2), which concern the
consequences of health problems. For example, infants could not be randomly assigned
to having cystic fibrosis to see if this disease causes poor psychosocial adjustment.
Second, many variables could technically—but not ethically—be experimentally
varied. For example, there have been no RCTs to study the effect of cigarette smoking
on lung cancer. Such a study would require people to be assigned randomly to a
smoking group (people forced to smoke) or a nonsmoking group (people prohibited
from smoking). Thus, although RCTs are technically at the top of the evidence
hierarchy for Etiology/Harm questions (Table 9.2), many etiology questions cannot be
answered using an experimental design.
Sometimes, RCTs are not feasible because of practical issues. It may, for instance,
be impossible to secure administrative approval to randomize people to groups. In
summary, experimental designs have some limitations that restrict their use for some
real-world problems; nevertheless, RCTs have a clear superiority to other designs for
testing causal hypotheses.
HOW-TO-TELL TIP How can you tell if a study is experimental?
Researchers usually indicate in the method section of their reports that they
used an experimental or randomized design (RCT). If such terms are
missing, you can conclude that a study is experimental if the article says
that the study purpose was to test the effects of an intervention AND if
participants were put into groups at random.
Quasi-Experiments
Quasi-experiments (called trials without randomization in the medical literature) also
involve an intervention; however, quasi-experimental designs lack randomization, the
signature of a true experiment. Some quasi-experiments even lack a control group. The
signature of a quasi-experimental design is the implementation and testing of an
intervention in the absence of randomization.
Quasi-Experimental Designs
A frequently used quasi-experimental design is the nonequivalent control group
pretest–posttest design, which involves comparing two or more groups of people before
and after implementing an intervention. For example, suppose we wished to study the
effect of a chair yoga intervention (I) for older people (P) on quality of life (O). The
intervention is being offered to everyone at a community senior center, and
randomization is not possible. For comparative purposes, we collect outcome data at a
different senior center that is not instituting the intervention (C). Data on quality of life
(QOL) are collected from both groups at baseline and 10 weeks later.
This quasi-experimental design is identical to a pretest–posttest experimental design
except people were not randomized to groups. The quasi-experimental design is weaker
because, without randomization, it cannot be assumed that the experimental and
comparison groups are equivalent at the outset. The design is, nevertheless, strong
because the baseline data allow us to see whether elders in the two senior centers had
similar QOL scores, on average, before the intervention. If the groups are comparable at
baseline, we could be relatively confident inferring that posttest differences in QOL
were the result of the yoga intervention. If QOL scores are different initially, however,
postintervention differences are hard to interpret. Note that in quasi-experiments, the
term comparison group is sometimes used in lieu of control group to refer to the group
against which outcomes in the treatment group are evaluated.
Now suppose we had been unable to collect baseline data. Such a design
(nonequivalent control group posttest-only) has a flaw that is hard to remedy. We no
longer have information about initial equivalence. If QOL in the experimental group is
higher than that in the control group at the posttest, can we conclude that the
intervention caused improved QOL? There could be other explanations for the
differences. In particular, QOL in the two centers might have differed initially. The
hallmark of strong quasi-experiments is the effort to introduce some controls, such as
baseline measurements.
Example of a nonequivalent control group design
Hsu and colleagues (2015) used a nonequivalent control group pretest–posttest design
to test the effect of an online caring curriculum to enhance nurses’ caring behavior.
Nurses in one hospital received the intervention, whereas those in another hospital
did not. Data on the nurses’ caring behaviors were gathered from both groups before
and after the intervention.
Some quasi-experiments have neither randomization nor a comparison group.
Suppose a hospital implemented rapid response teams (RRTs) in its acute care units and
wanted to learn the effects on patient outcomes (e.g., mortality). For the purposes of this
example, assume no other hospital would be a good comparison, and so the only
possible comparison is a before–after contrast. If RRTs were implemented in January,
we could compare the mortality rate, for example, during the 3 months before RRTs
with the mortality rate in the subsequent 3-month period.
This one-group pretest–posttest design seems logical, but it has weaknesses. What if
one of the 3-month periods is atypical, apart from the RRTs? What about the effect of
other changes instituted during the same period? What about the effects of external
factors, such as seasonal morbidity? The design in question offers no way to control
these factors.
However, the design could be modified so that some alternative explanations for
changes in mortality could be ruled out. For example, the time-series design involves
collecting data over an extended time period and introducing the treatment during that
period. The present study could be designed with four observations before the RRTs are
introduced (e.g., four quarters of mortality data for the prior year) and four observations
after it (mortality for the next four quarters). Although a time-series design does not
eliminate all interpretive problems, the extended time perspective strengthens the ability
to attribute improvements to the intervention.
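A minimal sketch of the before–after logic, using entirely hypothetical quarterly mortality figures, might look like this in Python; a real time-series analysis would also examine trends within each period.

```python
# Hypothetical mortality rates (deaths per 1,000 discharges) per quarter
before_rrt = [12.1, 11.8, 12.4, 12.0]   # four quarters before RRTs began
after_rrt = [11.0, 10.6, 10.9, 10.4]    # four quarters after RRTs began

mean_before = sum(before_rrt) / len(before_rrt)
mean_after = sum(after_rrt) / len(after_rrt)

print(f"Mean before: {mean_before:.2f} per 1,000 discharges")
print(f"Mean after:  {mean_after:.2f} per 1,000 discharges")
print(f"Change:      {mean_after - mean_before:+.2f}")
```

Having multiple observations on each side of the intervention makes it easier to judge whether mortality was already drifting downward before the RRTs were introduced.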
Example of a time-series design
Burston and colleagues (2015) used a time-series design to study the effect of a
“transforming care” initiative on two patient outcomes—inpatient falls and hospital-
acquired pressure ulcers. Patients who were discharged from surgical units of an
acute care hospital over a 29-month period comprised the sample.
Advantages and Disadvantages of Quasi-Experiments
One strength of quasi-experiments is their practicality. Nursing research often occurs in
natural settings, where it is difficult to deliver an innovative treatment randomly to some
people but not to others. Strong quasi-experimental designs introduce some research
control when full experimental rigor is not possible.
Another issue is that people are not always willing to be randomized. Quasi-
experimental designs, because they do not involve random assignment, are likely to be
acceptable to more people. This, in turn, has implications for the generalizability of the
results—but the problem is that the results are less conclusive.
The major disadvantage of quasi-experiments is that causal inferences cannot be
made as readily as with RCTs. Alternative explanations for results abound with quasi-
experiments. For example, suppose we administered a special diet to a group of frail
nursing home residents to assess its impact on weight gain. If we use a nonequivalent
control group and then observe a weight gain, we must ask: Is it plausible that some
other factor caused the gain? Is it plausible that pretreatment differences between the
intervention and comparison groups resulted in differential gain? Is it plausible that
there was an average weight gain simply because the most frail died or were transferred
to a hospital? If the answer to any of these rival hypotheses is yes, then inferences about
the causal effect of the intervention are weakened. With quasi-experiments, there is
almost always at least one plausible rival explanation.
HOW-TO-TELL TIP How can you tell if a study is quasi-
experimental? Researchers do not always identify their designs as quasi-
experimental. If a study involves the testing of an intervention and if the
report does not explicitly mention random assignment, it is probably safe to
conclude that the design is quasi-experimental.
Nonexperimental Studies
Many cause-probing research questions cannot be addressed with an RCT or quasi-
experiment. For example, take this Prognosis question: Do birth weights under 1,500 g
cause developmental delays in children? Clearly, we cannot manipulate birth weight,
the independent variable. When researchers do not intervene by controlling the
independent variable, the study is nonexperimental, or, in the medical literature,
observational.
There are various reasons for doing a nonexperimental study, including situations in
which the independent variable inherently cannot be manipulated (Prognosis questions)
or in which it would be unethical to manipulate the independent variable (some Etiology
questions). Experimental designs are also not appropriate for Descriptive questions.
Types of Nonexperimental/Observational Studies
When researchers study the effect of a cause they cannot manipulate, they undertake
correlational research to examine relationships between variables. A correlation is an
association between two variables, that is, a tendency for variation in one variable to be
related to variation in another (e.g., people’s height and weight). Correlations can be
detected through statistical analyses.
It is risky to infer causal relationships in correlational research. In RCTs,
investigators predict that deliberate variation of the independent variable will result in a
change to the outcome variable. In correlational research, investigators do not control
the independent variable, which often has already occurred. A famous research dictum
is relevant: Correlation does not prove causation. The mere existence of a relationship
between variables is not enough to conclude that one variable caused the other, even if
the relationship is strong.
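To illustrate what a correlation is computationally, here is a small Python sketch of the Pearson correlation coefficient using hypothetical height and weight values; r ranges from -1 to +1, with 0 indicating no linear relationship.

```python
import statistics

def pearson_r(x, y):
    """Pearson r: the covariance of x and y divided by the product
    of their standard deviations."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

# Hypothetical data: taller people tend to weigh more
height_cm = [160, 165, 170, 175, 180]
weight_kg = [55, 61, 66, 72, 79]
print(f"r = {pearson_r(height_cm, weight_kg):.2f}")  # close to +1
```

A high r such as this one says only that the two variables move together; it says nothing about which one, if either, is the cause.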
Correlational studies are weaker than RCTs for cause-probing questions, but
different designs offer varying degrees of supportive evidence. The strongest design for
Prognosis questions, and for Etiology questions when randomization is impossible, is a
cohort design (Table 9.2). Observational studies with a cohort design (sometimes
called a prospective design) start with a presumed cause and then go forward to the
presumed effect. For example, in prospective lung cancer studies, researchers start with
a cohort of adults (P) that includes smokers (I) and nonsmokers (C) and then compare
subsequent lung cancer incidence (O) in the two groups.
Example of a cohort (prospective) design
Giurgescu and colleagues (2015) studied the relationship between levels of
depressive symptoms in African American women during their pregnancy and
subsequent birth outcomes, such as birth weight and the incidence of preterm birth.
TIP Experimental studies are inherently prospective because the researcher
institutes the intervention and subsequently examines its effect.
In correlational studies with a retrospective design, an effect (outcome) observed in
the present is linked to a potential cause that occurred in the past. For example, in
retrospective lung cancer research, researchers begin with some people who have lung
cancer and others who do not and then look for differences in antecedent behaviors or
conditions, such as smoking habits. Such a study uses a case-control design—that is,
cases with a certain condition such as lung cancer are compared to controls without it.
In designing a case-control study, researchers try to identify controls who are as similar
as possible to cases with regard to confounding variables (e.g., age, gender). The
difficulty, however, is that the two groups are almost never comparable with respect to
all factors influencing the outcome.
Example of a case-control design
Delmore and colleagues (2015) conducted a study of the risk factors associated with
heel pressure ulcers in hospitalized patients. Patients with hospital-acquired pressure
ulcers were compared to those without such ulcers in terms of patient characteristics
(e.g., immobility, presence of vascular disease).
Prospective studies are more costly, but stronger, than retrospective studies. For one
thing, any ambiguity about the temporal sequence of phenomena is resolved in
prospective research (i.e., smoking is known to precede the lung cancer). In addition,
samples are more likely to be representative of smokers and nonsmokers.
A second broad class of nonexperimental studies is descriptive research. The
purpose of descriptive studies is to observe, describe, and document aspects of a
situation. For example, an investigator may wish to discover the percentage of teenagers
who smoke—i.e., the prevalence of certain behaviors. Sometimes, a study design is
descriptive correlational, meaning that researchers seek to describe relationships among
variables, without inferring causal connections. For example, researchers might be
interested in describing the relationship between fatigue and psychological distress in
HIV patients. In such situations, a descriptive nonexperimental design is appropriate.
Example of a descriptive correlational study
Cullum and colleagues (2016) conducted a descriptive correlational study of
adolescents with Type 2 diabetes to examine relationships among symptoms of
depression, perceived social support, and months since diagnosis.
TIP For Descriptive questions, the strongest design is a nonexperimental
study that relies on random sampling of participants. Random sampling is
discussed in Chapter 10.
Advantages and Disadvantages of Nonexperimental Research
The major disadvantage of nonexperimental studies is that they do not yield persuasive
evidence for causal inferences. This is not a problem when the aim is description, but
correlational studies are often undertaken to discover causes. Yet correlational studies
are susceptible to faulty interpretation because groups being compared have formed
through self-selection. A researcher doing a correlational study cannot assume that the
groups being compared were similar before the occurrence of the independent variable.
As an example of such interpretive problems, suppose we studied differences in
depression (O) of patients with cancer (P) who do or do not have adequate social
support (I and C). Suppose we found a correlation—i.e., that patients without social
support were more depressed than patients with social support. We could interpret this
to mean that patients’ emotional state is influenced by the adequacy of their social
support, as diagrammed in Figure 9.1A. There are, however, alternative interpretations.
Maybe a third variable influences both social support and depression, such as whether
the patients are married. Having a spouse may affect patients’ depression and the
quality of their social support (Fig. 9.1B). A third possibility is reversed causality (Fig.
9.1C). Depressed cancer patients may find it more difficult to elicit social support than
patients who are cheerful. In this interpretation, the person’s depression causes the
amount of social support received, not the other way around. The point is that
correlational results should be interpreted cautiously.
TIP Be prepared to think critically when a researcher claims to be studying
the “effects” of one variable on another in a nonexperimental study. For
example, if a report title were “The Effects of Eating Disorders on
Depression,” the study would be nonexperimental (i.e., participants were not
randomly assigned to an eating disorder). In such a situation, you might ask,
Did the eating disorder have an effect on depression—or did depression have
an effect on eating patterns? or Did a third variable (e.g., childhood abuse)
have an effect on both?
Nevertheless, nonexperimental studies play a big role in nursing because many
interesting problems do not lend themselves to intervention. An example is whether
smoking causes lung cancer. Despite the absence of any RCTs with humans, few people
doubt that this causal connection exists. There is ample evidence of a relationship
between smoking and lung cancer and, through prospective studies, that smoking
precedes lung cancer. In numerous replications, researchers have been able to control
for, and thus rule out, other possible “causes” of lung cancer.
Correlational research can offer an efficient way to collect large amounts of data
about a problem. For example, it would be possible to collect information about
people’s health problems and eating habits. Researchers could then examine which
problems correlate with which eating patterns. By doing this, many relationships could
be discovered in a short time. By contrast, an experimenter looks at only a few variables
at a time. For example, one RCT might manipulate cholesterol, whereas another might
manipulate protein. Nonexperimental work is often necessary before interventions can
be justified.
THE TIME DIMENSION IN RESEARCH DESIGN
Research designs incorporate decisions about when and how often data will be
collected, and studies can be categorized in terms of how they deal with time. The major
distinction is between cross-sectional and longitudinal designs.
Cross-Sectional Designs
In cross-sectional designs, data are collected at one point in time. For example, a
researcher might study whether psychological symptoms in menopausal women are
correlated contemporaneously with physiologic symptoms. Retrospective studies are
usually cross-sectional: Data on the independent and outcome variables are collected
concurrently (e.g., participants’ lung cancer status and smoking habits), but the
independent variable usually concerns events or behaviors occurring in the past.
Cross-sectional designs can be used to study time-related phenomena, but they are
less persuasive than longitudinal designs. Suppose we were studying changes in
children’s health-promotion activities between ages 8 and 10 years. One way to
investigate this would be to interview children at age 8 years and then 2 years later at
age 10 years—a longitudinal design. Or, we could question two groups of children, ages
8 and 10 years, at one point in time and then compare responses—a cross-sectional
design. If 10-year-olds engaged in more health-promoting activities than 8-year-olds, it
might be inferred that children made healthier choices as they aged. To make this
inference, we have to assume that the older children would have responded as the
younger ones did had they been questioned 2 years earlier or, conversely, that 8-year-
olds would report more health-promoting activities if they were questioned again 2
years later.
Cross-sectional designs are economical, but they pose problems for inferring
changes over time. The amount of social and technological change that characterizes
our society makes it questionable to assume that differences in the behaviors or
characteristics of different age groups reflect the passage of time rather than
cohort differences.
Example of a cross-sectional study
Brito and colleagues (2015) studied the relationship between functional disability and
demographic factors—including age—in the elderly. Three age groups were
compared—those aged 60 to 69 years, 70 to 79 years, and 80 years or older.
Longitudinal Designs
Longitudinal designs involve collecting data multiple times over an extended period.
Such designs are useful for studying changes over time and for establishing the
sequencing of phenomena, which is a criterion for inferring causality.
In nursing research, longitudinal studies are often follow-up studies of a clinical
population, undertaken to assess the subsequent status of people with a specified
condition or who received an intervention. For example, patients who received a
smoking cessation intervention could be followed up to assess its long-term
effectiveness. As a nonexperimental example, samples of premature infants could be
followed up to assess subsequent motor development.
Example of a follow-up study
Pien and coresearchers (2015) examined quality of life among 96 suicidal
individuals, who were followed up 3 months after a suicide attempt.
In longitudinal studies, researchers must decide the number of data collection points
and the time intervals between them. When change is rapid, numerous data collection
points at relatively short intervals may be required to understand transitions. By
convention, however, the term longitudinal implies multiple data collection points over
an extended period of time.
A challenge in longitudinal studies is the loss of participants (attrition) over time.
Attrition is problematic because those who drop out of the study usually differ in
important ways from those who continue to participate, resulting in potential biases and
problems with generalizability.
TIP Not all longitudinal studies are prospective because sometimes, the
independent variable occurred even before the initial wave of data collection.
And not all prospective studies are longitudinal in the classic sense. For
example, an experimental study that collects data at 1, 2, and 4 hours after an
intervention would be prospective but not longitudinal (i.e., data are not
collected over a long time period).
TECHNIQUES OF RESEARCH CONTROL
A major goal of research design in quantitative studies is to maximize researchers’
control over confounding variables. Two broad categories of confounders need to be
controlled—those intrinsic to study participants and those arising from situational
factors.
Controlling the Study Context
External factors, such as the research context, can affect outcomes. In well-controlled
quantitative research, steps are taken to achieve constancy of conditions so that
researchers can be confident that outcomes reflect the effect of the independent variable
and not the study context.
Researchers cannot totally control study contexts, but many opportunities exist. For
example, blinding is a way to control bias. By keeping data collectors and others
unaware of group allocation, researchers minimize the risk that other people involved in
the study will influence the results.
Most quantitative studies also standardize communications to participants. Formal
scripts are often prepared to inform participants about the study purpose and methods.
In intervention studies, researchers develop formal intervention protocols. Careful
researchers pay attention to intervention fidelity—that is, they monitor whether an
intervention is faithfully delivered in accordance with its plan and whether the intended
treatment was actually received.
Example of attention to intervention fidelity
McCarthy and colleagues (2015) described their efforts to monitor intervention
fidelity in implementing an exercise counseling intervention that used motivational
interviewing. For example, the researchers examined whether motivational interviewing
was faithfully used by the counselors.
Controlling Participant Factors
Outcomes of interest to nurse researchers are affected by dozens of attributes, and most
are irrelevant to the research question. For example, suppose we were investigating the
effects of a physical fitness program on the physical functioning of nursing home
residents. In this study, variables such as the participants’ age, gender, and smoking
history would be confounding variables; each is likely to be related to the outcome
variable (physical functioning), independent of the program. In other words, the effects
that these variables have on the outcome are extraneous to the study. In this section, we
review strategies researchers can use to control confounding variables.
Randomization
Randomization is the most effective way to control participants’ characteristics. A
critical advantage of randomization, compared with other control strategies, is that it
controls all possible sources of extraneous variation, without any conscious decision
about which variables should be controlled. In our example of a physical fitness
intervention, random assignment of elders to an intervention or control group would
yield groups presumably comparable in terms of age, gender, smoking history, and
dozens of other characteristics that could affect the outcome. Randomization to different
treatment orderings in a crossover design is especially powerful: Participants serve as
their own controls, thereby controlling all confounding characteristics.
Homogeneity
When randomization is not feasible, other methods of controlling extraneous
characteristics can be used. One alternative is homogeneity, in which only people who
are similar with respect to confounding variables are included in the study. In the
physical fitness example, if gender were a confounding variable, we could recruit only
men (or women) as participants. If age were considered a confounder, participation could
be limited to a specified age range. Using a homogeneous sample is easy, but one
problem is limited generalizability.
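In code, homogeneity is nothing more than filtering the sample on inclusion criteria before the study begins. A hypothetical Python sketch follows; the fields and cutoffs are illustrative assumptions, loosely echoing the example below.

```python
candidates = [
    {"id": "P01", "age": 62, "on_dialysis": False, "from_institution": False},
    {"id": "P02", "age": 48, "on_dialysis": False, "from_institution": False},
    {"id": "P03", "age": 70, "on_dialysis": True,  "from_institution": False},
    {"id": "P04", "age": 75, "on_dialysis": False, "from_institution": True},
]

# Homogeneity: include only people who are similar on the confounders.
eligible = [p for p in candidates
            if p["age"] >= 55
            and not p["on_dialysis"]
            and not p["from_institution"]]

print([p["id"] for p in eligible])  # only P01 qualifies — a smaller, narrower sample
```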
Example of control through homogeneity
Bowles and coresearchers (2014) used a quasi-experimental design to examine the
effect of a discharge planning decision support on time to readmission among older
adult hospital patients. Several variables were controlled through homogeneity,
including age (all 55 years or older), condition (none was on dialysis), and admission
(none was admitted from an institution).
Matching
A third method of controlling confounding variables is matching, which involves
consciously forming comparable groups. For example, suppose we began with a group
of nursing home residents who agreed to participate in the physical fitness program. A
comparison group of nonparticipating residents could be created by matching
participants on the basis of important confounding variables (e.g., age and gender). This
procedure results in groups known to be similar on specific confounding variables.
Matching is often used to form comparable groups in case-control designs.
Matching has some drawbacks, however. To match effectively, researchers must
know what the relevant confounders are. Also, after two or three variables, it becomes
difficult to match. Suppose we wanted to control age, gender, and length of nursing
home stay. In this situation, if a program participant were an 80-year-old woman whose
length of stay was 5 years, we would have to seek another woman with these
characteristics as a comparison group counterpart. With more than three variables,
matching may be impossible. Thus, matching is a control method used primarily when
more powerful procedures are not feasible.
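A hypothetical Python sketch of one-to-one matching illustrates the difficulty: every added matching variable shrinks the pool of acceptable controls. The field names and tolerances here are assumptions for illustration only.

```python
def find_match(case, pool, max_age_diff=2, max_stay_diff=1):
    """Return the first control with the same gender whose age and
    length of stay (years) are within the given tolerances."""
    for control in pool:
        if (control["gender"] == case["gender"]
                and abs(control["age"] - case["age"]) <= max_age_diff
                and abs(control["stay"] - case["stay"]) <= max_stay_diff):
            return control
    return None  # no acceptable match — a common outcome with many criteria

case = {"id": "P01", "age": 80, "gender": "F", "stay": 5}
pool = [{"id": "C01", "age": 74, "gender": "F", "stay": 5},
        {"id": "C02", "age": 81, "gender": "F", "stay": 4}]
print(find_match(case, pool))  # C01 fails on age; C02 matches on all three
```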
Example of control through matching
Stavrinos and coresearchers (2015) compared teenagers with and without attention
deficit/hyperactivity disorder (ADHD) in terms of distractibility while driving, tested
in a simulator while either conversing on a cell phone or text messaging. The ADHD
and non-ADHD groups were matched for gender, ethnicity, and months of driving
experience.
Statistical Control
Researchers can also control confounding variables statistically. Methods of statistical
control are complex, and so a detailed description of powerful statistical control
mechanisms, such as analysis of covariance, will not be attempted. You should
recognize, however, that nurse researchers are increasingly using powerful statistical
techniques to control confounding variables. A brief description of methods of
statistical control is presented in Chapter 14.
Evaluation of Control Methods
Random assignment is the most effective approach to controlling confounding variables
because randomization tends to control individual variation on all possible confounders.
Crossover designs are especially powerful, but they cannot be used in many situations
because of the possibility of carryover effects. The alternatives described here share two
disadvantages. First, researchers must decide in advance which variables to control: to
use homogeneity, matching, or statistical control, they must identify the relevant
confounders. Second, these methods control only the specified
characteristics, leaving others uncontrolled.
Although randomization is an excellent tool, it is not always feasible. It is better to
use matching or statistical control than to ignore the problem of confounding variables.
CHARACTERISTICS OF GOOD DESIGN
A critical question in critiquing a quantitative study is whether the research design
yielded valid evidence. Four key questions regarding research design, particularly in
cause-probing studies, are as follows:
1. What is the strength of the evidence that a relationship between variables really
exists?
2. If a relationship exists, what is the strength of the evidence that the independent
variable (e.g., an intervention), rather than other factors, caused the outcome?
3. What is the strength of evidence that observed relationships are generalizable across
people, settings, and time?
4. What are the theoretical constructs underlying the study variables, and are those
constructs adequately captured?
These questions, respectively, correspond to four aspects of a study’s validity: (1)
statistical conclusion validity, (2) internal validity, (3) external validity, and (4)
construct validity (Shadish et al., 2002).
Statistical Conclusion Validity
As noted previously, one criterion for establishing causality is a demonstrated
relationship between the independent and dependent variable. Statistical tests are used
to support inferences about whether such a relationship exists. We note here a few
threats that can affect a study’s statistical conclusion validity.
Statistical power, the capacity to detect true relationships, affects statistical
conclusion validity. The most straightforward way to achieve statistical power is to use
a large enough sample. With small samples, the analyses may fail to show that the
independent variable and the outcome are related—even when they are. Power and
sample size are discussed in Chapter 10.
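As a rough illustration of how sample size drives power, the following Python sketch uses the standard normal approximation for comparing two group means; the formula is a textbook approximation, and the effect sizes shown are conventional benchmarks, not figures from any particular study.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of
    two means, using the normal approximation: n = 2 * ((z_a + z_b) / d)**2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# A "medium" standardized effect (d = 0.5) needs far fewer people than
# a "small" one (d = 0.2) to reach the same power.
print(n_per_group(0.5))  # about 63 per group
print(n_per_group(0.2))  # about 393 per group
```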
Researchers can also enhance power by enhancing differences on the independent
variables (i.e., making the cause powerful) to maximize differences on the outcome (the
effect). If the groups or treatments are not very different, the statistical analysis might
not be sufficiently sensitive to detect effects that actually exist. Intervention fidelity can
enhance the power of an intervention.
Thus, if you are critiquing a study in which outcomes for the groups being compared
were not significantly different, one possibility is that the study had low statistical
conclusion validity. The report might give clues about this possibility (e.g., too small a
sample or substantial attrition) that should be taken into consideration in interpreting
what the results mean.
Internal Validity
Internal validity is the extent to which it can be inferred that the independent variable
is causing the outcome. RCTs tend to have high internal validity because randomization
enables researchers to rule out competing explanations for group differences. With
quasi-experiments and correlational studies, there are competing explanations for what
is causing the outcome, which are sometimes called threats to internal validity.
Evidence hierarchies rank study designs mainly in terms of internal validity.
Threats to Internal Validity
Temporal Ambiguity
In a causal relationship, the cause precedes the effect. In RCTs, researchers create the
independent variable and then observe the outcome, so establishing a temporal sequence
is never a problem. In correlational studies, however—especially ones using a cross-
sectional design—it may be unclear whether the independent variable preceded the
dependent variable, or vice versa, as illustrated in Figure 9.1.
Selection
The selection threat (self-selection) reflects biases stemming from preexisting
differences between groups. When people are not assigned randomly to groups, the
groups being compared may not be equivalent; group differences in the outcome may be
caused by extraneous factors rather than by the independent variable. Selection bias is
the most challenging threat to the internal validity of studies not using an experimental
design but can be partially addressed using control strategies described in the previous
section.
History
The history threat is the occurrence of events concurrent with the independent variable
that can affect the outcome. For example, suppose we were studying the effectiveness of
a senior center program to encourage flu shots among the elderly. Now suppose a story
about a flu epidemic was aired in the national media at about the same time. Our
outcome variable, number of flu shots administered, is now influenced by at least two
forces, and it would be hard to disentangle the two effects. In RCTs, history is not
typically a threat because external events are as likely to affect one randomized group as
another. The designs most likely to be affected by the history threat are one-group
pretest–posttest designs and time-series designs.
Maturation
The maturation threat arises from processes occurring as a result of time (e.g., growth,
fatigue) rather than the independent variable. For example, if we were studying the
effect of an intervention for developmentally delayed children, our design would have
to deal with the fact that progress would occur without an intervention. Maturation does
not refer only to developmental changes but to any change that occurs as a function of
time. Phenomena such as wound healing or postoperative recovery occur with little
intervention, and so maturation may be a rival explanation for favorable posttreatment
outcomes if the design does not include a comparison group. One-group pretest–posttest
designs are especially vulnerable to the maturation threat.
Mortality/Attrition
Mortality is the threat that arises from attrition in groups being compared. If different
kinds of people remain in the study in one group versus another, then these differences,
rather than the independent variable, could account for group differences in outcomes.
The most severely ill patients might drop out of an experimental condition because it is
too demanding, for example. Attrition bias essentially is a selection bias that occurs
after the study unfolds: Groups initially equivalent can lose comparability because of
attrition, and differential group composition, rather than the independent variable, could
be the “cause” of any group differences on outcomes.
TIP If attrition is random (i.e., those dropping out of a study are similar to
those remaining in it), then there would not be bias. However, attrition is
rarely random. In general, the higher the rate of attrition, the greater the risk
of bias. Biases are usually of concern if the rate exceeds 10% to 15%.
Internal Validity and Research Design
Quasi-experimental and correlational studies are especially susceptible to internal
validity threats. These threats compete with the independent variable as a cause of the
outcome. The aim of a good quantitative research design is to rule out these competing
explanations. The control mechanisms previously described are strategies for improving
internal validity—and thus for strengthening the quality of evidence that studies yield.
An experimental design often, but not always, eliminates competing explanations.
Experimental mortality is a particularly salient threat. Because researchers do different
things with the groups, members may drop out of the study for different reasons. This is
particularly likely to happen if the intervention is stressful or time-consuming or if the
control condition is boring or disappointing. Participants remaining in a study may
differ from those who left, nullifying the initial equivalence of the groups.
You should carefully consider possible rival explanations for study results,
especially in non-RCT studies. When researchers do not have control over critical
confounding variables, caution in drawing conclusions about the evidence is
appropriate.
External Validity
External validity concerns inferences about whether relationships found for study
participants might hold true for different people and settings. External validity is critical
to evidence-based practice (EBP) because it is important to generalize evidence from
controlled research settings to real-world practice settings.
External validity questions can take several different forms. For example, we may
ask whether relationships observed with a study sample can be generalized to a larger
population—for example, whether results about rates of postpartum depression in
Boston can be generalized to mothers in the northeastern United States. Thus, one aspect of
a study’s external validity concerns sampling. If the sample is representative of the
population, generalizing results to the population is safer (Chapter 10).
Other external validity questions are about generalizing to different types of people,
settings, or situations. For example, can findings about a pain reduction treatment in
Norway be generalized to people in the United States? New studies are often needed to
answer questions about generalizability. An important concept here is replication.
Multisite studies are powerful because generalizability of the results can be enhanced if
the results have been replicated in several sites—particularly if the sites differ on
important dimensions (e.g., size). In studies with a diverse sample of participants,
researchers can assess whether results are replicated for various subgroups—for
example, whether an intervention benefits men and women. Systematic reviews
represent a crucial aid to external validity precisely because they explore consistency in
results based on replications across time, space, people, and settings.
The demands for internal and external validity may conflict. If a researcher
exercises tight control to maximize internal validity, the setting may become too
artificial to generalize to more naturalistic environments. Compromises are often
necessary.
Construct Validity
Research involves constructs. Researchers conduct a study with specific exemplars of
treatments, outcomes, settings, and people, but these are all stand-ins for broad
constructs. Construct validity involves inferences from the particulars of the study to
the higher order constructs they are intended to represent. If studies contain construct
errors, the evidence could be misleading. One aspect of construct validity concerns the
degree to which an intervention is a good representation of the construct that was
theorized as having the potential to cause beneficial outcomes. Lack of blinding can be
a threat to construct validity: Is it an intervention, or awareness of the intervention, that
resulted in benefits? Another issue is whether the measures of the research variables are
good operationalizations of constructs. This aspect of construct validity is discussed in
Chapter 10.
CRITIQUING QUANTITATIVE RESEARCH DESIGNS
A key evaluative question is whether the research design enabled researchers to get
good answers to the research question. This question has both substantive and
methodologic facets.
Substantively, the issue is whether the design matches the aims of the research. If
the research purpose is descriptive or exploratory, an experimental design is not
appropriate. If the researcher is searching to understand the full nature of a phenomenon
about which little is known, a structured design that allows little flexibility might block
insights (flexible designs are discussed in Chapter 11). We have discussed research
control as a bias-reducing strategy, but too much control can introduce bias—for
example, when a researcher tightly controls how phenomena under study can be
manifested and so obscures their true nature.
Methodologically, the main design issue in quantitative studies is whether the
research design provides the most valid, unbiased, and interpretable evidence possible.
Indeed, there usually is no other aspect of a quantitative study that affects the quality of
evidence as much as research design. Box 9.1 provides questions to assist you in
evaluating research designs.
Box 9.1 Guidelines for Critiquing Research Design in a Quantitative Study
1. Was the design experimental, quasi-experimental, or nonexperimental? What
specific design was used? Was this a cause-probing study? Given the type of
question (Therapy, Prognosis, etc.), was the most rigorous possible design used?
2. What type of comparison was called for in the research design? Was the
comparison strategy effective in illuminating key relationships?
3. If the study involved an intervention, were the intervention and control conditions
adequately described? Was blinding used, and if so, who was blinded? If not, is
there a good rationale for failure to use blinding?
4. If the study was nonexperimental, why did the researcher opt not to intervene? If
the study was cause-probing, which criteria for inferring causality were potentially
compromised? Was a retrospective or prospective design used, and was such a
design appropriate?
5. Was the study longitudinal or cross-sectional? Were the number and timing of data
collection points appropriate?
6. What did the researcher do to control confounding participant characteristics, and
were the procedures effective? What are the threats to the study’s internal validity?
Did the design enable the researcher to draw causal inferences about the
relationship between the independent variable and the outcome?
7. What are the major limitations of the design used? Were these limitations
acknowledged by the researcher and taken into account in interpreting results?
What can be said about the study’s external validity?
This section presents examples of studies with different research designs.
Read these summaries and then answer the critical thinking questions,
referring to the full research report if necessary. Examples 1 and 2 are featured
in the interactive Critical Thinking Activity on the book's companion website. The critical
thinking questions for Example 3 are based on the study that appears in its
entirety in Appendix A of this book. Our comments for these questions are in
the Student Resources section of the companion website.
EXAMPLE 1: A RANDOMIZED CONTROLLED
CROSSOVER TRIAL
Study: Hydrocortisone cream to reduce perineal pain after vaginal birth: A
randomized controlled trial (Manfre et al., 2015)
Statement of Purpose: The purpose of the study was to evaluate whether the
use of hydrocortisone cream can decrease perineal pain in the immediate
postpartum period.
Design and Treatment Conditions: The researchers used a randomized
crossover design in which participants received three different methods for
pain management at three sequential pain treatments after birth: two topical
creams (corticosteroid and placebo) and a control treatment (no cream
application). The placebo cream was a similar cetyl alcohol–based cream.
Method: A sample of 29 mothers who gave birth vaginally was randomly
assigned to different orderings of the three conditions. The sample size was
based on an analysis undertaken to ensure adequate statistical power. Mothers
were first asked to rate their pain within 2 hours of admission to the
postpartum unit. After the rating, the investigator applied the first randomly
assigned treatment to a witch hazel pad and placed the pad on the perineum.
Participants rated their pain again 30 to 60 minutes later. Following the initial
application, the process was repeated every 6 hours for the second and third
randomly assigned perineal treatment. The dependent variable was the change
in perineal pain levels before and 30 to 60 minutes after application of the
treatment. Both the participants and the investigators were blinded to cream
type—a pharmacist prepared the study treatments and packaged them in
sterile tubes. A total of 29 participants were enrolled in the study, with 27
completing all three treatments over a 12-hour period.
Key Findings: A significant reduction in pain was found after application of
both the topical creams. The application of either hydrocortisone cream or
placebo cream provided significantly better pain relief than no cream
application. The average decline in pain was similar in the two cream groups,
6.7 points for the placebo cream and 4.8 points for the hydrocortisone cream.
Critical Thinking Exercises
1. Answer the relevant questions from Box 9.1 regarding this study.
2. Also consider the following targeted questions:
a. Could a three-group design (i.e., three different groups of mothers) have
been used in this study?
b. Why might the two creams have been comparably effective in reducing
pain?
3. If the results of this study are valid, what are some of the uses to which the
findings might be put in clinical practice?
EXAMPLE 2: A QUASI-EXPERIMENTAL DESIGN
Study: A study to promote breast feeding in the Helsinki Metropolitan area in
Finland (Hannula et al., 2014)
Statement of Purpose: The purpose of the study was to test the effect of
providing intensified support for breastfeeding during the perinatal period on
the breastfeeding behavior of women in Finland.
Treatment Groups: The women in the intervention group were offered a
free, noncommercial web-based service that provided intensified support for
parenthood, child care, and breastfeeding from the 20th gestation week until
the child was 1 year old. The mothers were cared for by staff with specialized
training who also provided individualized support. Women in the comparison
group received usual care from midwifery and nursing professionals.
Method: The study was conducted in three public maternity hospitals in
Helsinki. Because randomization was not possible, two of the hospitals
implemented the intensified support services, and the third hospital served as
the control. Women who were 18 to 21 weeks gestation were recruited into
the intervention group if they were expecting a singleton birth. Altogether,
705 women participated in the study, 431 in the intervention group and 274 in
the comparison group. Study participants completed questionnaires at
hospital discharge or shortly afterward. The primary outcome in the study
was whether or not the mother breastfed exclusively in the hospital.
Secondary outcomes included the mothers’ breastfeeding confidence,
breastfeeding attitudes, and coping with breastfeeding.
Key Findings: The intervention and comparison group members were similar
demographically in some respects (e.g., education, marital status), but several
preintervention group differences were found. For example, patients in the
intervention group were more likely to be primiparas and more likely to have
participated in parenting education than women in the comparison group. To
address this selection bias problem, these characteristics were controlled
statistically. Women in the intervention group were significantly more likely
to breastfeed exclusively at the time of the follow-up (76%) than those in the
comparison group (66%). The authors concluded that intensive support
helped the mothers to breastfeed exclusively.
Critical Thinking Exercises
1. Answer the relevant questions from Box 9.1 regarding this study.
2. Also consider the following targeted questions:
a. Is this study prospective or retrospective?
b. What other quasi-experimental designs could have been used in this
study?
3. If the results of this study are valid, what are some of the uses to which the
findings might be put in clinical practice?
EXAMPLE 3: NONEXPERIMENTAL STUDY IN APPENDIX
A
• Read the methods section of Swenson and colleagues’ (2016) study
(“Parents’ use of praise and criticism in a sample of young children
seeking mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 9.1 regarding this study.
2. Suggest modifications to the design of this study that might improve its
external validity.
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the book's companion website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Selected Experimental and Quasi-Experimental
Designs: Diagrams, Uses, and Drawbacks
• Answers to the Critical Thinking Exercises for Example 3
• Internet Resources with useful websites for Chapter 9
• A Wolters Kluwer journal article in its entirety—the Manfre et al. study
described as Example 1 on p. 156.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
• The research design is the overall plan for answering research questions. In quantitative studies, the design designates whether there is an intervention, the nature of any comparisons, methods for controlling confounding variables, whether there will be blinding, and the timing and location of data collection.
• Therapy, Prognosis, and Etiology questions are cause-probing, and there is a hierarchy of designs for yielding best evidence for these questions.
• Key criteria for inferring causality include the following: (1) a cause (independent variable) must precede an effect (outcome), (2) there must be a detectable relationship between a cause and an effect, and (3) the relationship between the two does not reflect the influence of a third (confounding) variable.
• A counterfactual is what would have happened to the same people simultaneously exposed and not exposed to a causal factor. The effect is the difference between the two. A good research design for cause-probing questions entails finding a good approximation to the idealized counterfactual.
• Experiments (or randomized controlled trials [RCTs]) involve an intervention (the researcher manipulates the independent variable by introducing an intervention), control (including the use of a control group that is not given the intervention), and randomization/random assignment (with participants allocated to experimental and control groups at random to make the groups comparable at the outset).
• RCTs are considered the gold standard because they come closer than any other design to meeting the criteria for inferring causal relationships.
• In pretest–posttest designs, data are collected both before the intervention (at baseline) and after it.
• In crossover designs, people are exposed to more than one experimental condition in random order and serve as their own controls. Crossover designs are inappropriate if there is a risk of carryover effects.
• The control group can undergo various conditions, including an alternative treatment, a placebo or pseudointervention, standard treatment ("usual care"), or a wait-list (delayed treatment) condition.
• Quasi-experiments (trials without randomization) involve an intervention but lack a comparison group or randomization. Strong quasi-experimental designs introduce controls to compensate for these missing components.
• The nonequivalent control-group, pretest–posttest design involves a comparison group that was not created through randomization and the collection of pretreatment data from both groups to assess initial group equivalence.
• In a time-series design, outcome data are collected over a period of time before and after the intervention, usually for a single group.
• Nonexperimental (observational) studies include descriptive research—studies that summarize the status of phenomena—and correlational studies that examine relationships among variables but involve no intervention.
• In prospective (cohort) designs, researchers begin with a possible cause and then subsequently collect data about outcomes.
• Retrospective designs (case-control designs) involve collecting data about an outcome in the present and then looking back in time for possible causes.
• Making causal inferences in correlational studies is risky; a basic research dictum is that correlation does not prove causation.
• Cross-sectional designs involve the collection of data at one time period, whereas longitudinal designs involve data collection at two or more times over an extended period. In nursing, longitudinal studies often are follow-up studies of clinical populations.
• Longitudinal studies are typically expensive, time-consuming, and subject to the risk of attrition (loss of participants over time) but yield valuable information about time-related phenomena.
• Quantitative researchers strive to control external factors that could affect study outcomes and subject characteristics that are extraneous to the research question.
• Researchers delineate the intervention in formal protocols that stipulate exactly what the treatment is. Careful researchers attend to intervention fidelity—whether the intervention was properly implemented and actually received.
• Techniques for controlling subject characteristics include homogeneity (restricting participants to reduce variability on confounding variables), matching (deliberately making groups comparable on some extraneous variables), statistical procedures, and randomization—the most effective method because it controls all possible confounding variables without researchers having to identify them.
• Study validity concerns the extent to which appropriate inferences can be made. Threats to validity are reasons that an inference could be wrong. A key function of quantitative research design is to rule out validity threats.
• Statistical conclusion validity concerns the strength of evidence that a relationship exists between two variables. A threat to statistical conclusion validity is low statistical power (the ability to detect true relationships among variables).
• Internal validity concerns inferences that the outcomes were caused by the independent variable, rather than by extraneous factors. Threats to internal validity include temporal ambiguity (uncertainty about whether the presumed cause preceded the outcome), selection (preexisting group differences), history (external events that could affect outcomes), maturation (changes due to the passage of time), and mortality (effects attributable to attrition).
• External validity concerns inferences about generalizability—whether findings hold true over variations in people, conditions, and settings.
REFERENCES FOR CHAPTER 9
Berry, D., Verbiest, S., Hall, E., Dawson, I., Norton, D., Willis, S., . . . Stuebe, A. (2015). A postpartum
community-based weight management intervention designed for low-income women: Feasibility and initial
efficacy testing. Journal of National Black Nurses’ Association, 26, 29–39.
*Bowles, K., Hanlon, A., Holland, D., Potashnik, S., & Topaz, M. (2014). Impact of discharge planning decision
support on time to readmission among older adult medical patients. Professional Case Management, 19, 29–38.
*Brito, K., de Menezes, T., & de Olinda, R. (2015). Functional disability and socioeconomic and demographic
factors in elderly. Revista Brasileira de Enfermagem, 68, 548–555.
Burston, S., Chaboyer, W., Gillespie, B., & Carroll, R. (2015). The effect of a transforming care initiative on
patient outcomes in acute surgical units: A time series study. Journal of Advanced Nursing, 71, 417–429.
Cullum, K., Howland, L., & Instone, S. (2016). Depressive symptoms and social support in adolescents with type 2
diabetes. Journal of Pediatric Health Care, 30, 57–64.
Delmore, B., Lebovits, S., Suggs, B., Rolnitzky, L., & Ayello, E. (2015). Risk factors associated with heel pressure
ulcers in hospitalized patients. Journal of Wound, Ostomy, and Continence Nursing, 42, 242–248.
DiLibero, J., Lavieri, M., O’Donoghue, S., & DeSanto-Madeya, S. (2015). Withholding or continuing enteral
feedings during repositioning and the incidence of aspiration. American Journal of Critical Care, 24, 258–261.
Giurgescu, C., Engeland, C., & Templin, T. (2015). Symptoms of depression predict negative birth outcomes in
African American women: A pilot study. Journal of Midwifery & Women’s Health, 60, 570–577.
Hannula, L. S., Kaunonen, M., & Puukka, P. (2014). A study to promote breast feeding in the Helsinki
Metropolitan area in Finland. Midwifery, 30, 696–704.
Hsu, T., Chiang-Hanisko, L., Lee-Hsieh, J., Lee, G., Turton, M., & Tseng, Y. (2015). Effectiveness of an online
caring curriculum in enhancing nurses’ caring behavior. Journal of Continuing Education in Nursing, 46, 416–
424.
*Kundu, A., Lin, Y., Oron, A., & Doorenbos, A. (2014). Reiki therapy for postoperative oral pain in pediatric
patients: Pilot data from a double-blind, randomized clinical trial. Complementary Therapies in Clinical
Practice, 20, 21–25.
**Manfre, M., Adams, D., Callahan, G., Gould, P., Lang, S., McCubbins, H., . . . Chulay, M. (2015).
Hydrocortisone cream to reduce perineal pain after vaginal birth: A randomized controlled trial. MCN: The
American Journal of Maternal/Child Nursing, 40, 306–312.
McCarthy, M., Dickson, V., Katz, S., Sciacca, K., & Chyun, D. (2015). Process evaluation of an exercise
counseling intervention using motivational interviewing. Applied Nursing Research, 28, 156–162.
Pien, F., Chang, Y., Feng, H., Hung, P., Huang, S., & Tzeng, W. (2015). Changes in quality of life after a suicide
attempt. Western Journal of Nursing Research, 38(6), 721–737.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for
generalized causal inference. Boston, MA: Houghton Mifflin.
Song, Y., & Lindquist, R. (2015). Effects of mindfulness-based stress reduction on depression, anxiety, stress and
mindfulness in Korean nursing students. Nurse Education Today, 35, 86–90.
*Stavrinos, D., Garner, A., Franklin, C., Johnson, H., Welburn, S., Griffin, R., . . . Fine, P. (2015). Distracted
driving in teens with and without attention-deficit/hyperactivity disorder. Journal of Pediatric Nursing, 30,
e183–e191.
*A link to this open-access article is provided in the Internet Resources section on the companion website.
**This journal article is available on the companion website for this chapter.
10 Sampling and Data Collection in
Quantitative Studies
Learning Objectives
On completing this chapter, you will be able to:
• Distinguish between nonprobability and probability samples and compare their advantages and disadvantages
• Identify and describe several types of sampling designs in quantitative studies
• Evaluate the appropriateness of the sampling method and sample size used in a study
• Identify phenomena that lend themselves to self-reports, observation, and physiologic measurement
• Describe various approaches to collecting self-report data (e.g., interviews, questionnaires, composite scales)
• Describe methods of collecting and recording observational data
• Describe the major features and advantages of biophysiologic measures
• Critique a researcher's decisions regarding the data collection plan
• Describe approaches for assessing the reliability and validity of measures
• Define new terms in the chapter
Key Terms
Biophysiologic measure
Category system
Checklist
Closed-ended question
Consecutive sampling
Construct validity
Content validity
Convenience sampling
Criterion validity
Eligibility criteria
Face validity
Internal consistency
Interrater reliability
Interview schedule
Likert scale
Measurement
Measurement property
Nonprobability sampling
Observation
Open-ended question
Patient-reported outcome (PRO)
Population
Power analysis
Probability sampling
Psychometric assessment
Purposive sampling
Questionnaire
Quota sampling
Rating scale
Reliability
Response options
Response rate
Response set bias
Sample
Sample size
Sampling bias
Sampling plan
Scale
Self-report
Simple random sampling
Strata
Stratified random sampling
Systematic sampling
Test–retest reliability
Validity
Visual analog scale
This chapter covers two important research topics—how quantitative researchers select
their study participants and how they collect data from them.
SAMPLING IN QUANTITATIVE RESEARCH
Researchers answer research questions using a sample of participants. In testing the
effects of an intervention for pregnant women, nurse researchers reach conclusions
without testing it with all pregnant women. Quantitative researchers develop a sampling
plan that specifies in advance how participants will be selected and how many to
include.
Basic Sampling Concepts
Let us begin by considering some terms associated with sampling.
Populations
A population (“P” in PICO questions) is the entire group of interest. For instance, if a
researcher were studying American nurses with doctoral degrees, the population could
be defined as all registered nurses (RNs) in the United States with a doctoral-level
degree. Other populations might be all patients who had cardiac surgery in St. Peter’s
Hospital in 2017 or all Australian children younger than age 10 years with cystic
fibrosis. Populations are not restricted to people. A population might be all patient
records in Memorial Hospital. A population is an entire aggregate of elements.
Researchers specify population characteristics through eligibility criteria. For
example, consider the population of American nursing students. Does the population
include part-time students? Are RNs returning to school for a bachelor’s degree
included? Researchers establish criteria to determine whether a person qualifies as a
member of the population (inclusion criteria) or should be excluded (exclusion criteria),
for example, excluding patients who are severely ill.
Example of inclusion and exclusion criteria
Joseph and colleagues (2016) studied children’s sensitivity to sucrose detection
(sweet taste). To be eligible, children had to be healthy and between the ages of 7 and
14 years. Children were excluded if they had a major medical illness, such as
diabetes, heart disease, or asthma.
Quantitative researchers sample from an accessible population in the hope of
generalizing to a target population. The target population is the entire population of
interest. The accessible population is the portion of the target population that is
accessible to the researcher. For example, a researcher’s target population might be all
diabetic patients in the United States, but, in reality, the population that is accessible
might be diabetic patients in a particular hospital.
Samples and Sampling
Sampling involves selecting a portion of the population to represent the population. A
sample is a subset of population elements. In nursing research, the elements (basic
units) are usually humans. Researchers work with samples rather than populations for
practical reasons.
Information from samples can, however, lead to faulty conclusions. In quantitative
studies, a criterion for judging a sample is its representativeness. A representative
sample is one whose characteristics closely approximate those of the population. Some
sampling plans are more likely to yield biased samples than others. Sampling bias is
the systematic overrepresentation or underrepresentation of a population segment in
terms of key characteristics.
Strata
Populations consist of subpopulations, or strata. Strata are mutually exclusive segments
of a population based on a specific characteristic. For instance, a population consisting
of all RNs in the United States could be divided into two strata based on gender. Strata
can be used in sample selection to enhance the sample’s representativeness.
TIP The sampling plan is usually discussed in a report’s method section,
sometimes in a subsection called “Sample” or “Study participants.” Sample
characteristics (e.g., average age) are often described in the results section.
Sampling Designs in Quantitative Studies
The two broad classes of sampling designs in quantitative research are probability
sampling and nonprobability sampling.
Nonprobability Sampling
In nonprobability sampling, researchers select elements by nonrandom methods in
which not every element has a chance of being included. Nonprobability sampling is
less likely than probability sampling to produce representative samples—and yet, most
research samples in nursing and other disciplines are nonprobability samples.
Convenience sampling entails selecting the most conveniently available people as
participants. A nurse who distributes questionnaires about vitamin use to college
students leaving the library is sampling by convenience, for example. The problem with
convenience sampling is that people who are readily available might be atypical of the
population. The price of convenience is the risk of bias. Convenience sampling is the
weakest form of sampling, but it is also the most commonly used sampling method.
Example of a convenience sample
Huang and coresearchers (2016) studied the effects of risk factors and coping style on
the quality of life and depressive symptoms of adults with type 2 diabetes. A
convenience sample of 241 adults was recruited from a hospital metabolic outpatient
department.
In quota sampling, researchers identify population strata and figure out how many
people are needed from each stratum. By using information about the population,
researchers can ensure that diverse segments are represented in the sample. For
example, if the population is known to have 50% males and 50% females, then the
sample should have similar percentages. Procedurally, quota sampling is similar to
convenience sampling: Participants are a convenience sample from each stratum.
Because of this fact, quota sampling shares some weaknesses of convenience sampling.
Nevertheless, quota sampling is a big improvement over convenience sampling and
does not require sophisticated skills or a lot of effort. Surprisingly, few researchers use
this strategy.
Example of a quota sample
Wang and coresearchers (2015) described the protocol for a study of the effects of a
health program being implemented at a university in Singapore. The researchers plan
to use a quota sample, stratifying participants based on the type of work they do
(academic, administrative, support).
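To make the quota arithmetic concrete, here is a minimal Python sketch (not from the textbook) that converts known population proportions into per-stratum recruitment targets; the strata and percentages are invented for illustration:

def quota_targets(total_n, strata_proportions):
    # Number to recruit from each stratum so the sample mirrors
    # the known population proportions.
    return {stratum: round(total_n * proportion)
            for stratum, proportion in strata_proportions.items()}

# A sample of 200 from a population known to be 50% male and 50% female
print(quota_targets(200, {"male": 0.50, "female": 0.50}))
# -> {'male': 100, 'female': 100}

Recruitment within each stratum then proceeds by convenience until each target is filled.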
Consecutive sampling is a nonprobability sampling method that involves recruiting
all people from an accessible population over a specific time interval or for a specified
sample size. For example, in a study of ventilator-associated pneumonia in intensive
care unit (ICU) patients, a consecutive sample might consist of all eligible patients who
were admitted to an ICU over a 6-month period. Or it might be the first 250 eligible
patients admitted to the ICU, if 250 were the targeted sample size. Consecutive
sampling is often the best possible choice when there is “rolling enrollment” into an
accessible population.
Example of a consecutive sample
Bryant and colleagues (2015) compared radiographic reports of feeding tube
placement with images generated by an electromagnetic feeding tube placement
device. The sample consisted of 200 consecutive patients who had feeding tubes
inserted.
Purposive sampling involves using researchers’ knowledge about the population to
handpick sample members. Researchers might decide purposely to select people judged
to be knowledgeable about the issues under study. This method can lead to bias but can
be a useful approach when researchers want a sample of experts.
Example of purposive sampling
Hewitt and Cappiello (2015) invited a purposively sampled panel of experts
knowledgeable in the provision of reproductive health care to offer their viewpoints
for identifying essential nursing competencies for prevention and care relating to
unintended pregnancy.
HOW-TO-TELL TIP How can you tell what type of sampling design
was used in a quantitative study? If the report does not explicitly mention
or describe the sampling design, it is usually safe to assume that a
convenience sample was used.
Probability Sampling
Probability sampling involves random selection of elements from a population. With
random sampling, each element in the population has an equal, independent chance of
being selected. Random selection should not be (although it often is) confused with
random assignment, which is a signature of an RCT (see Chapter 9). Random
assignment to different treatment conditions has no bearing on how participants in the
RCT were selected.
Simple random sampling is the most basic probability sampling. In simple random
sampling, researchers establish a sampling frame—a list of population elements. If
nursing students at the University of Connecticut were the population, a student roster
would be the sampling frame. Elements in a sampling frame are numbered and then a
table of random numbers or an online randomizer is used to draw a random sample of
the desired size. Samples selected randomly are unlikely to be biased. There is no
guarantee of a representative sample, but random selection guarantees that differences
between the sample and the population are purely a function of chance. The probability
of selecting a markedly atypical sample through random sampling is low and decreases
as sample size increases.
Example of a simple random sample
Neta et al. (2015) studied adherence to foot self-care in patients with diabetes mellitus
in Brazil. The population included 8,709 patients with type 2 diabetes. The
researchers randomly sampled 368 of these patients.
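As a hypothetical illustration (not part of the Neta et al. study), the following Python sketch numbers a sampling frame and draws a simple random sample without replacement, giving every element an equal, independent chance of selection:

import random

# Invented sampling frame: a numbered roster of 500 students
sampling_frame = ["student_%d" % i for i in range(1, 501)]

random.seed(42)  # fixed seed only so the draw can be reproduced
sample = random.sample(sampling_frame, k=50)  # 50 elements, no repeats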
In stratified random sampling, the population is first divided into two or more
strata, from which elements are randomly selected. As with quota sampling, the aim of
stratified sampling is to enhance representativeness.
Example of stratified random sampling
Buettner-Schmidt and colleagues (2015) studied the impact of smoking legislation on
smoke pollution levels in bars and restaurants in North Dakota. A total of 135 venues
were randomly sampled from three strata: restaurants, bars in communities with
ordinances stronger than the state law, and bars not in such communities.
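A minimal sketch of the stratified logic, with invented strata: the frame is first split by the stratifying characteristic, and a simple random sample is then drawn within each stratum.

import random

def stratified_sample(frames_by_stratum, n_per_stratum, seed=1):
    # Draw a simple random sample of the same size from every stratum
    rng = random.Random(seed)
    sample = []
    for frame in frames_by_stratum.values():
        sample.extend(rng.sample(frame, n_per_stratum))
    return sample

# Hypothetical venue lists for two strata
strata = {"restaurants": ["r%d" % i for i in range(100)],
          "bars": ["b%d" % i for i in range(100)]}
print(len(stratified_sample(strata, 20)))  # 40 venues, 20 per stratum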
TIP Many large national studies use multistage sampling, in which large
units are first randomly sampled (e.g., census tracts, hospitals), then smaller
units are selected (e.g., individual people).
Systematic sampling involves the selection of every kth case from a list, such as
every 10th person on a patient list. Systematic sampling can be done so that an
essentially random sample is drawn. First, the size of the population is divided by the
size of the desired sample to obtain the sampling interval (the fixed distance between
selected cases). For instance, if we needed a sample of 50 from a population of 5,000,
our sampling interval would be 100 (5,000/50 = 100). Every 100th case on a sampling
frame would be sampled, with the first case selected randomly. If our random number
were 73, the people corresponding to numbers 73, 173, 273, and so on would be in the
sample. Systematic sampling done in this manner is essentially the same as simple
random sampling and is often convenient.
Example of a systematic sample
Ridout and colleagues (2014) studied the incidence of failure to communicate vital
information as patients progressed through the perioperative process. From a
population of 1,858 patient records in a health care system meeting eligibility criteria,
the researchers selected every sixth case, for a sample of 294 cases.
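The interval arithmetic is easy to express in code. This sketch (an illustration, not the Ridout et al. procedure) reproduces the chapter's example of drawing 50 cases from a frame of 5,000:

import random

def systematic_sample(frame, n):
    k = len(frame) // n                # sampling interval (5,000/50 = 100)
    start = random.randrange(k)        # random start within the first interval
    return frame[start::k][:n]         # every kth case thereafter

frame = list(range(1, 5001))           # numbered population elements
sample = systematic_sample(frame, 50)  # e.g., cases 73, 173, 273, ... for one random start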
Evaluation of Nonprobability and Probability Sampling
Probability sampling is the only viable method of obtaining representative samples. If
all elements in a population have an equal chance of being selected, then the resulting
sample is likely to do a good job of representing the population. Probability sampling
also allows researchers to estimate the magnitude of sampling error, which is the
difference between population values (e.g., the average age of the population) and
sample values (e.g., the average age of the sample).
Nonprobability samples are rarely representative of the population—some segment
of the population is likely to be underrepresented. When there is sampling bias, there is
a chance that the results could be misleading. Why, then, are nonprobability samples
used in most studies? Clearly, the advantage lies in their expediency: Probability
sampling is often impractical. Quantitative researchers using nonprobability samples
must be cautious about the inferences drawn from the data, and consumers should be
alert to possible sampling biases.
TIP The quality of the sampling plan is of particular importance when the
focus of the research is to obtain descriptive information about prevalence or
average values for a population. National surveys almost always use
probability samples. For studies whose purpose is primarily description, data
from a probability sample are at the top of the evidence hierarchy for
individual studies.
Sample Size in Quantitative Studies
Sample size—the number of study participants—is a major concern in quantitative
research. There is no simple formula to determine how large a sample should be, but
larger is usually better than smaller. When researchers calculate a percentage or an
average using sample data, the purpose is to estimate a population value, and larger
samples have less sampling error.
Researchers can estimate how large their samples should be for testing hypotheses
through power analysis. An example can illustrate basic principles of power analysis.
Suppose we were testing an intervention to help people quit smoking; smokers would be
randomized to an intervention or a control group. How many people should be in the
sample? When using power analysis, researchers must estimate how large the group
difference will be (e.g., group differences in daily number of cigarettes smoked). The
estimate might be based on prior research. When expected differences are sizeable, a
large sample is not needed to reveal group differences statistically, but when small
differences are predicted, large samples are necessary. In our example, if a small-to-
moderate group difference in postintervention smoking were expected, the sample size
needed to test group differences in smoking, with standard statistical criteria, would be
about 250 smokers (125 per group).
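Researchers often delegate these calculations to software. The sketch below uses the statsmodels library to run an analysis similar to the chapter's smoking-cessation illustration; the effect size of 0.35 (a small-to-moderate standardized difference) and the conventional criteria (alpha = .05, power = .80) are assumptions chosen for illustration:

from statsmodels.stats.power import TTestIndPower

# Sample size per group for a two-group comparison of means
n_per_group = TTestIndPower().solve_power(effect_size=0.35,
                                          alpha=0.05,
                                          power=0.80)
print(round(n_per_group))  # about 130 per group, in line with the
                           # chapter's figure of roughly 125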
The risk of “getting it wrong” (statistical conclusion validity) increases when
samples are too small: Researchers risk gathering data that will not support their
hypotheses even when those hypotheses are correct. Large samples are no assurance of
accuracy, though: With nonprobability sampling, even a large sample can harbor bias.
The famous example illustrating this point is the 1936 U.S. presidential poll conducted
by the magazine Literary Digest, which predicted that Alfred Landon would defeat
Franklin Roosevelt by a landslide. A sample of about 2.5 million people was polled, but
biases arose because the sample was drawn from telephone directories and automobile
registrations during a Depression year when only the well-to-do (who favored Landon)
had a car or telephone.
A large sample cannot correct for a faulty sampling design; nevertheless, a large
nonprobability sample is better than a small one. When critiquing quantitative studies,
you must assess both the sample size and the sample selection method to judge how
good the sample was.
TIP The sampling plan is often one of the weakest aspects of quantitative
studies. Most nursing studies use samples of convenience, and many are
based on samples that are too small to provide an adequate test of the
research hypotheses.
Critiquing Sampling Plans
In coming to conclusions about the quality of evidence that a study yields, the sampling
plan merits special scrutiny. If the sample is seriously biased or too small, the findings
may be misleading or just plain wrong.
In critiquing a description of a sampling plan, you should consider whether the
researcher has adequately described the sampling strategy. Ideally, research reports
should describe the following:
• The type of sampling approach used (e.g., convenience, consecutive, random)
• The population and eligibility criteria for sample selection
• The sample size, with a rationale
• A description of the sample's main characteristics (e.g., age, gender, clinical status, and so on)
A second issue is whether the researcher made good sampling decisions. We have
stressed that a key criterion for assessing a sampling plan in quantitative research is
whether the sample is representative of the population. You will never know for sure, of
course, but if the sampling strategy is weak or if the sample size is small, there is reason
to suspect some bias.
Even with a rigorous sampling plan, the sample may be biased if not all people
invited to participate in a study agree to do so. If certain subgroups in the population
decline to participate, then a biased sample can result, even when probability sampling
is used. Research reports ideally should provide information about response rates (i.e.,
the number of people participating in a study relative to the number of people sampled)
and about possible nonresponse bias—differences between participants and those who
declined to participate (also sometimes referred to as response bias). In a longitudinal
study, attrition bias should be reported.
Your job as reviewer is to come to conclusions about the reasonableness of
generalizing the findings from the researcher’s sample to the accessible population and
a broader target population. If the sampling plan is flawed, it may be risky to generalize
the findings at all without replicating the study with another sample.
Box 10.1 presents some guiding questions for critiquing the sampling plan of a
quantitative research report.
Box 10.1 Guidelines for Critiquing Quantitative Sampling Plans
1. Was the population identified? Were eligibility criteria specified?
2. What type of sampling design was used? Was the sampling plan one that could be
expected to yield a representative sample?
3. How many participants were in the sample? Was the sample size affected by high
rates of refusals or attrition? Was the sample size large enough to support
statistical conclusion validity? Was the sample size justified on the basis of a
power analysis or other rationale?
4. Were key characteristics of the sample described (e.g., mean age, percentage of females)?
5. To whom can the study results reasonably be generalized?
DATA COLLECTION IN QUANTITATIVE
RESEARCH
Phenomena in which researchers are interested must be translated into data that can be
analyzed. This section discusses the challenging task of collecting quantitative research
data.
Overview of Data Collection and Data Sources
Data collection methods vary along several dimensions. One issue is whether the
researcher collects original data or uses existing data. Existing records, for example, are
an important data source for nurse researchers. A wealth of clinical data gathered for
nonresearch purposes can be fruitfully analyzed to answer research questions.
Example of a study using records
Draughton Moret and colleagues (2016) explored factors associated with patients’
acceptance of nonoccupational postexposure HIV prophylaxis following a sexual
assault. Data were obtained from forensic nursing charts.
Researchers most often collect new data. In developing a data collection plan,
researchers must decide the type of data to gather. Three types have been frequently
used by nurse researchers: self-reports, observations, and biophysiologic measures. Self-
report data—also called patient-reported outcome (PRO) data—are participants’
responses to researchers’ questions, such as in an interview. In nursing studies, self-
reports are the most common data collection approach. Direct observation of people’s
behaviors and characteristics can be used for certain questions. Nurses also use
biophysiologic measures to assess important clinical variables.
Regardless of the type of data collected in a study, data collection methods vary along
several dimensions, including structure, quantifiability, and objectivity. Data for
quantitative studies tend to be quantifiable and structured, with the same information
gathered from all participants in a comparable, prespecified way. Quantitative
researchers generally strive for methods that are as objective as possible.
Self-Reports/Patient-Reported Outcomes
Structured self-report methods are used when researchers know in advance exactly what
they need to know and can frame appropriate questions to obtain the needed
information. Structured self-report data are collected with a formal, written document—
an instrument. The instrument is known as an interview schedule when the questions
are asked orally face-to-face or by telephone or as a questionnaire when respondents
complete the instrument themselves.
Question Form and Wording
In a totally structured instrument, respondents are asked to respond to the same
questions in the same order. Closed-ended (or fixed-alternative) questions are ones in
which the response options are prespecified. The options may range from a simple yes
or no to complex expressions of opinion. Such questions ensure comparability of
responses and facilitate analysis. Some examples of closed-ended questions are
presented in Table 10.1.
Some structured instruments, however, also include open-ended questions, which
allow participants to respond to questions in their own words (e.g., Why did you stop
smoking?). When open-ended questions are included in questionnaires, respondents
must write out their responses. In interviews, the interviewer records responses
verbatim.
Good closed-ended questions are more difficult to construct than open-ended ones
but easier to analyze. Also, people may be unwilling to compose lengthy written
responses to open-ended questions in questionnaires. A major drawback of closed-
ended questions is that researchers might omit potentially important responses. If
respondents are verbally expressive and cooperative, open-ended questions allow for
richer information than closed-ended questions. Finally, some respondents object to
choosing from alternatives that do not reflect their opinions precisely.
In drafting questions for a structured instrument, researchers must carefully monitor
the wording of each question for clarity, absence of bias, and (in questionnaires) reading
level. Questions must be sequenced in a psychologically meaningful order that
encourages cooperation and candor. Developing, pretesting, and refining a self-report
instrument can take many months.
Interviews Versus Questionnaires
Researchers using structured self-reports must decide whether to use interviews or self-
administered questionnaires. Questionnaires have the following advantages:
• Questionnaires are less costly and are advantageous for geographically dispersed samples. Internet questionnaires are especially economical and are an increasingly important means of gathering self-report data—although response rates to Internet questionnaires tend to be low.
• Questionnaires offer the possibility of anonymity, which may be crucial in obtaining information about certain opinions or traits.
Example of Internet questionnaires
Ratanasiripong (2015) sent a web-based questionnaire to a convenience sample of
3,300 male college students attending a public university. The purpose of the study
was to document the rate of human papillomavirus vaccination in college men and to
examine factors associated with being vaccinated. Responses were received from 410
students.
The strengths of interviews outweigh those of questionnaires. Among the
advantages are the following:
• Response rates tend to be high in face-to-face interviews. Respondents are less likely to refuse to talk to an interviewer than to ignore a questionnaire. Low response rates can lead to bias because respondents are rarely a random subset of the original sample. In the Internet questionnaire study of college men (Ratanasiripong, 2015), the response rate was under 15%.
• Some people cannot fill out a questionnaire (e.g., young children). Interviews are feasible with most people.
Some advantages of face-to-face interviews also apply to telephone interviews.
Long or complex instruments are not well suited to telephone administration, but for
relatively brief instruments, telephone interviews combine relatively low costs with high
response rates.
Example of telephone interviews
Oliver and colleagues (2016) conducted telephone interviews with a sample of 1,024
participants. The interviews included questions about cancer risk knowledge, with
particular emphasis on colorectal cancer risk knowledge.
Scales
Social psychological scales are often incorporated into questionnaires or interview
schedules. A scale is a device that assigns a numeric score to people along a continuum,
like a scale for measuring weight. Social psychological scales differentiate people with
different attitudes, perceptions, and psychological traits.
One technique is the Likert scale, which consists of several declarative statements
(items) that express a viewpoint on a topic. Respondents are asked to indicate how much
they agree or disagree with the statement. Table 10.2 presents a six-item Likert scale for
measuring attitudes toward condom use. In this example, agreement with positively
worded statements is assigned a higher score. The first statement is positively worded;
agreement indicates a favorable attitude toward condom use. Because there are five
response alternatives, a score of 5 would be given for strongly agree, 4 for agree, and so
on. Responses of two hypothetical participants are shown by a check or an X, and their
item scores are shown in the right-hand columns. Person 1, who agreed with the first
statement, has a score of 4, whereas person 2, who strongly disagreed, got a score of 1.
The second statement is negatively worded, and so scoring is reversed—a 1 is assigned
for strongly agree and so forth. Item reversals ensure that a high score consistently
reflects positive attitudes toward condom use.
A person’s total score is the sum of item scores—hence, these scales are sometimes
called summated rating scales or composite scales. In our example, person 1 has a more
positive attitude toward condoms (total score = 26) than person 2 (total score = 11).
Summing item scores makes it possible to finely discriminate among people with
different opinions. Composite scales are often composed of two or more subscales that
measure different aspects of a construct. Developing high-quality scales requires a lot of
skill and effort.
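The scoring logic—reverse-coding negatively worded items so that a high score always means a more positive attitude, then summing—can be sketched as follows. The response pattern is invented, not the actual data in Table 10.2:

def item_score(response, positively_worded):
    # Responses run 1 (strongly disagree) to 5 (strongly agree);
    # negatively worded items are reversed (1 becomes 5, and so on)
    return response if positively_worded else 6 - response

# Hypothetical six-item response pattern; True marks positively worded items
responses = [4, 2, 5, 1, 4, 2]
wording   = [True, False, True, False, True, False]
total = sum(item_score(r, w) for r, w in zip(responses, wording))
print(total)  # 26, a favorable attitude on a scale ranging from 6 to 30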
Example of a Likert scale
Ranse and colleagues (2015) studied factors influencing the provision of end-of-life
care in critical care settings and created a 58-item Likert-type scale with 8 subscales.
Examples of statements include the following: “Patients at the end-of-life require
little nursing care” and “I feel a sense of personal failure when a patient dies.”
Responses were on a 5-point scale: strongly disagree, disagree, neutral, agree, and
strongly agree.
Another type of scale is the visual analog scale (VAS), which can be used to
measure subjective experiences such as pain or fatigue. The VAS is a straight line, and
the end anchors are labeled as the extreme limits of the sensation being measured (Fig.
10.1). People mark a point on the line corresponding to the amount of sensation
experienced. Traditionally, a VAS line is 100 mm in length, which makes it easy to
derive a score from 0 to 100 by measuring the distance from one end of the scale to the
mark on the line.
Example of a visual analog scale
Hu and coresearchers (2015) tested the effects of earplugs, eye masks, and relaxing
music on sleep quality in ICU patients. Sleep quality was measured using a 0 to 100
VAS.
Scales permit researchers to efficiently quantify subtle gradations in the intensity of
individual characteristics. Scales can be administered either verbally or in writing and
so can be used with most people. Scales are susceptible to several common problems,
however, many of which are referred to as response set biases. The most important
biases include the following:
• Social desirability response set bias—a tendency to misrepresent attitudes or traits by giving answers that are consistent with prevailing social views
• Extreme response set bias—a tendency to consistently express extreme attitudes (e.g., strongly agree), leading to distortions because extreme responses may be unrelated to the trait being measured
• Acquiescence response set bias—a tendency by some people (yea-sayers) to agree with statements regardless of their content. The opposite tendency—for other people (naysayers) to disagree with statements independently of their content—is less common.
Researchers can reduce these biases by developing sensitively worded questions,
creating a permissive, nonjudgmental atmosphere, and guaranteeing the confidentiality
of responses.
TIP Other self-report approaches include vignettes and Q-sorts. Vignettes
are brief descriptions of situations to which respondents are asked to react.
Q-sorts present participants with a set of cards on which statements are
written. Participants are asked to sort the cards along a specified dimension,
such as most helpful/least helpful. Vignettes and Q-sorts are described in the
chapter Supplement on the companion website.
Evaluation of Self-Report Methods
If researchers want to know how people feel or what they believe, the most direct
approach is to ask them. Self-reports frequently yield information that would be difficult
or impossible to gather by other means. Behaviors can be observed but only if people
are willing to engage in them publicly and to do so at the time of data collection.
Nevertheless, self-reports have some weaknesses. The most serious issue concerns
the validity and accuracy of self-reports: How can we be sure that respondents feel or
act the way they say they do? Investigators usually have no choice but to assume that
most respondents have been frank. Yet, we all have a tendency to present ourselves in
the best light, and this may conflict with the truth. When reading research reports, you
should be alert to potential biases in self-reported data.
Observational Methods
For some research questions, direct observation of people’s behavior is an alternative to
self-reports, especially in clinical settings. Observational methods can be used to gather
such information as patients’ conditions (e.g., their sleep–wake state), verbal
communication (e.g., exchange of information at discharge), nonverbal communication
(e.g., body language), activities (e.g., geriatric patients’ self-grooming activities), and
environmental conditions (e.g., noise levels).
In studies that use observation, researchers have flexibility with regard to several
important dimensions. For example, the focus of the observation can be on broadly
defined events (e.g., patient mood swings) or on small, specific behaviors (e.g., facial
expressions). Observations can be made through the human senses and then recorded
manually, but they can also be done with equipment such as video recorders.
Researchers do not always tell people they are being observed because awareness of
being observed may cause people to behave atypically. Behavioral distortion due to the
known presence of an observer is called reactivity.
Structured observation involves the use of formal instruments and protocols that
dictate what to observe, how long to observe it, and how to record the data. Structured
observation is not intended to capture a broad slice of life but rather to document
specific behaviors, actions, and events. Structured observation requires the formulation
of a system for accurately categorizing, recording, and encoding the observations.
TIP Researchers often use structured observations when participants
cannot be asked questions or cannot be expected to provide reliable answers.
Many observational instruments are designed to capture the behaviors of
infants, children, or people whose communication skills are impaired.
Methods of Structured Observation
The most common approach to making structured observations is to use a category
system for classifying observed phenomena. A category system represents a method of
recording in a systematic fashion the behaviors and events of interest that transpire
within a setting.
Some category systems require that all observed behaviors in a specified domain
(e.g., body positions) be classified. A contrasting technique is a system in which only
particular types of behavior (which may or may not occur) are categorized. For
example, if we were studying children’s aggressive behavior, we might develop such
categories as “strikes another child” or “throws objects.” In this category system, many
behaviors—all that are nonaggressive—would not be classified; some children may
exhibit no aggressive actions.
Example of nonexhaustive categories
Nilsen and colleagues (2014) conducted a study of nursing care quality that involved
observations of communication between nurses and mechanically ventilated patients
in an ICU. Among many different types of observations made, observers recorded
instances of positive and negative nurse behaviors, according to carefully defined
criteria. Nurse behaviors that were neutral were not categorized.
Category systems must have careful, explicit operational definitions of the behaviors
and characteristics to be observed. Each category must be explained, giving observers
clear-cut criteria for assessing the occurrence of the phenomenon.
Category systems are the basis for constructing a checklist—the instrument
observers use to record observations. The checklist is usually formatted with a list of
behaviors from the category system on the left and space for tallying the frequency or
duration on the right. The task of the observer using an exhaustive category system is to
place all observed behaviors in one category for each “unit” of behavior (e.g., a time
interval). With nonexhaustive category systems, categories of behaviors that may or
may not be manifested by participants are listed. The observer watches for instances of
these behaviors and records their occurrence.
Another approach to structured observations is to use a rating scale, an instrument
that requires observers to rate phenomena along a descriptive continuum. The observer
may be required to make ratings at intervals throughout the observation or to summarize
an entire event after observation is completed. Rating scales can be used as an extension
of checklists, in which the observer records not only the occurrence of some behavior
but also some qualitative aspect of it, such as its intensity. Although this approach
yields a lot of information, it places an immense burden on observers.
Example of observational ratings
Burk and colleagues (2014) sought to identify factors that would predict agitation in
critically ill adults. Patients’ degree of agitation was observed and measured using the
Richmond Agitation-Sedation Scale, which requires ratings on a 10-point scale, from
+4 (combative) to −5 (unarousable).
Observational Sampling
Researchers must decide when to apply their observational systems. Observational
sampling methods are a means of obtaining representative examples of the behaviors
being observed. One system is time sampling, which involves selecting time periods
during which observations will occur. Time frames may be selected systematically (e.g.,
every 30 seconds at 2-minute intervals) or at random.
With event sampling, researchers select integral events to observe. Event sampling
requires researchers to either know when events will occur (e.g., nursing shift changes)
or wait for their occurrence. Event sampling is a good choice when events of interest are
infrequent and may be missed if time sampling is used. When behaviors and events are
relatively frequent, however, time sampling enhances the representativeness of the
observed behaviors.
Example of event and time sampling
In the previously mentioned observational study of nurse–patient communication in
the ICU (Nilsen et al., 2014), events were first sampled (occasions of nurse–patient
interaction), and then 3-minute segments of interaction on four separate occasions
over a 2-day period were videotaped and then coded for a range of outcomes (e.g.,
making eye contact).
Evaluation of Observational Methods
Certain research questions are better suited to observation than to self-reports, such as
when people cannot describe their own behaviors. This may be the case when people
are unaware of their behavior (e.g., stress-induced behavior), when behaviors are
emotionally laden (e.g., grieving), or when people are not capable of reporting their
actions (e.g., young children). Observational methods have an intrinsic appeal for
directly capturing behaviors. Nurses are often in a position to watch people’s behaviors
and may, by training, be especially sensitive observers.
Shortcomings of observational methods include possible reactivity when the
observer is conspicuous and the vulnerability of observations to bias. For example, the
observer’s values and prejudices may lead to faulty inference. Observational biases
probably cannot be eliminated, but they can be minimized through careful observer
training and assessment.
Biophysiologic Measures
Clinical nursing studies involve biophysiologic instruments both for creating
independent variables (e.g., a biofeedback intervention) and for measuring dependent
variables. Our discussion focuses on the use of biophysiologic measures as dependent
(outcome) variables.
Nurse researchers have used biophysiologic measures for a wide variety of
purposes. Examples include studies of basic biophysiologic processes, explorations of
the ways in which nursing actions and interventions affect physiologic outcomes,
product assessments, studies to evaluate the accuracy of biophysiologic information
gathered by nurses, and studies of the correlates of physiologic functioning in patients
with health problems.
Both in vivo and in vitro measurements are used in research. In vivo measurements
are those performed directly within or on living organisms, such as blood pressure and
body temperature measurement. Technological advances continue to improve the ability
to measure biophysiologic phenomena accurately and conveniently. With in vitro
measures, data are gathered from participants by extracting biophysiologic material
from them and subjecting it to analysis by laboratory technicians. In vitro measures
include chemical measures (e.g., the measurement of hormone levels), microbiologic
measures (e.g., bacterial counts and identification), and cytologic or histologic measures
(e.g., tissue biopsies). Nurse researchers also use anthropometric measures, such as
the body mass index and waist circumference.
Example of a study with in vivo and in vitro measures
Okoli et al. (2016) examined the physiological responses of nonsmokers to nicotine
patch administration. The researchers measured heart rate, blood pressure, and serum
nicotine levels at 0.5 hour, 1 hour, and 2 hours after applying a nicotine patch.
Biophysiologic measures offer a number of advantages to nurse researchers. They
are relatively accurate and precise, especially compared to psychological measures, such
as self-report measures of anxiety or pain. Also, biophysiologic measures are objective.
Two nurses reading from the same spirometer output are likely to record identical tidal
volume measurements, and two spirometers are likely to produce the same readouts.
Patients cannot easily distort measurements of biophysiologic functioning. Finally,
biophysiologic instruments provide valid measures of targeted variables: Thermometers
can be relied on to measure temperature and not blood volume, and so forth. For
nonbiophysiologic measures, there are typically concerns about whether an instrument
is really measuring the target concept.
Data Quality in Quantitative Research
In developing a data collection plan, researchers must strive for the highest possible
quality data. One aspect of data quality concerns the procedures used to collect the data.
For example, the people who collect and record the data must be properly trained and
monitored to ensure that procedures are diligently followed. Another issue concerns the
circumstances under which data were gathered. For example, it is important for
researchers to ensure privacy and to create an atmosphere that encourages participants
to be candid or behave naturally.
A crucial issue for data quality concerns the adequacy of the instruments or scales
used to measure constructs. Researchers seek to enhance the quality of their data by
selecting excellent measures. Measurement involves assigning numbers to represent
the amount of an attribute present in a person or object. When a new measure of a
construct (e.g., anxiety) is developed, rules for assigning numerical values (scores) need
to be established. Then, the rules must be evaluated to see if they are good rules—they
must yield numbers that truly and accurately correspond to different amounts of the
targeted trait.
Measures that are not perfectly accurate yield measurement that contains some error.
Many factors contribute to measurement error, including personal states (e.g., mood,
fatigue), response set biases, and situational factors (e.g., temperature, lighting). In self-
report measures, measurement errors can result from how questions are worded.
Careful researchers select measures that are known to be psychometrically sound.
Psychometrics is the branch of psychology concerned with the theory and methods of
psychological measurements. When a new measure is developed, the developers
undertake a psychometric assessment, which involves an evaluation of the measure’s
measurement properties.
Psychometricians (and most nurse researchers) have traditionally focused on two
measurement properties when assessing the quality of a measure: reliability and
validity. In recent years, measurement experts in medicine have advocated attending to
additional measurement properties that concern the measurement of change (Polit &
Yang, 2016). Here, we describe the two properties that you are most likely to encounter
in reading articles in the nursing literature. Methods used to assess these properties are
described in the chapter on statistical analysis (see Chapter 14).
Reliability
Reliability, broadly speaking, is the extent to which scores are free from measurement
error. Reliability can also be defined as the extent to which scores for people who have
not changed are the same for repeated measurements. In other words, reliability
concerns consistency—the absence of variation—in measuring a stable attribute for an
individual. In all types of assessments, reliability involves a replication to evaluate the
extent to which scores for a stable trait are the same.
In test–retest reliability, replication takes the form of administering a measure to
the same people on two occasions (e.g., 1 week apart). The assumption is that for traits
that have not changed, any differences in people’s scores on the two testings are the
result of measurement error. When score differences across waves are small, reliability
is high. This type of reliability is sometimes called stability or reproducibility—the
extent to which scores can be reproduced on a repeated administration. Except for
highly volatile constructs (e.g., mood), test–retest reliability can be assessed for most
measures, including biophysiologic ones.
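In practice, test–retest reliability is often summarized by correlating the two sets of scores. A minimal sketch with invented data:

from scipy.stats import pearsonr

# Hypothetical scores for five people tested 1 week apart
time1 = [12, 18, 25, 30, 22]
time2 = [13, 17, 26, 29, 23]
r, _ = pearsonr(time1, time2)
print(round(r, 2))  # near 1.0: score differences across waves are small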
When measurements involve using people who make scoring judgments, a key
source of measurement error stems from the person making the measurements. This is
the situation for observational measures (e.g., ratings to measure agitation) and is also
true for some biophysiologic measurements (e.g., skinfold measurement). In such
situations, it is important to evaluate how reliably the measurements reflect attributes of
the person being rated rather than attributes of the raters. The most typical approach is
to undertake an interrater (or inter-observer) reliability assessment, which involves
having two or more observers independently applying the measure with the same people
to see if the scores are consistent across raters.
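One common statistic for this purpose (not named in the text itself) is Cohen's kappa, which corrects raw agreement between raters for agreement expected by chance. A sketch using scikit-learn, with invented codes:

from sklearn.metrics import cohen_kappa_score

# Hypothetical category codes assigned independently by two observers
# to the same ten episodes of behavior
observer_1 = [1, 2, 2, 3, 1, 2, 3, 3, 1, 2]
observer_2 = [1, 2, 2, 3, 1, 2, 3, 2, 1, 2]
kappa = cohen_kappa_score(observer_1, observer_2)
print(round(kappa, 2))  # agreement beyond what chance alone would produce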
Another aspect of reliability is internal consistency. In responding to a self-report
item, people are influenced not only by the underlying construct but also by
idiosyncratic reactions to the words. By combining multiple items with various
wordings, item irrelevancies are expected to cancel each other out. An instrument is said
to be internally consistent to the extent that its items measure the same trait. For internal
consistency, replication involves people’s responses to multiple items during a single
administration. Whereas other reliability estimates assess a measure’s degree of
consistency across time or raters, internal consistency captures consistency across items.
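The usual index of internal consistency is coefficient alpha (Cronbach's alpha), computed from the item variances relative to the variance of total scores; the computational details are deferred to later in the book, but the logic can be sketched with invented data:

import numpy as np

def cronbach_alpha(item_scores):
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals);
    # rows are people, columns are items
    items = np.asarray(item_scores, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Hypothetical responses of four people to a three-item scale
data = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [3, 3, 2]]
print(round(cronbach_alpha(data), 2))  # about .89, above the .80 benchmark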
As we explain in Chapter 14, assessments of reliability yield coefficients that
summarize how reliable a measure is. Reliability coefficients normally range in
value from 0.0 to 1.0, with higher values reflecting greater reliability. Coefficients of .80
or higher are considered desirable. Researchers should select instruments with
demonstrated reliability and should document this in their reports. In undertaking a
study, researchers do not usually do a full psychometric assessment of an existing
measure, but they often do compute internal consistency reliability coefficients with
their data.
Example of internal consistency reliability
Kennedy and colleagues (2015) developed and assessed a scale to measure nursing
students’ self-efficacy for practice competence. The 22-item scale had high internal
consistency: the reliability coefficient was .92.
Validity
Validity in a measurement context is the degree to which an instrument is measuring
the construct it purports to measure. When researchers develop a scale to measure
resilience, they need to be sure that the resulting scores validly reflect this construct and
not something else, such as self-efficacy or perseverance. Assessing the validity of
abstract constructs requires a careful conceptualization of the construct—as well as a
conceptualization of what the construct is not. Like reliability, validity has different
aspects and assessment approaches. Four aspects of measurement validity are face
validity, content validity, criterion validity, and construct validity.
Face validity refers to whether the instrument looks like it is measuring the target
construct. Although face validity is not considered good evidence of validity, it is
helpful for a measure to have face validity if other types of validity have also been
demonstrated. If patients’ resistance to being measured reflects the view that the scale is
not relevant to their problems or situations, then face validity is an issue.
Content validity may be defined as the extent to which an instrument’s content
adequately captures the construct—that is, whether a composite instrument (e.g., a
multi-item scale) has an appropriate sample of items for the construct being measured.
If the content of an instrument is a good reflection of a construct, then the instrument
has a greater likelihood of achieving its measurement objectives. Content validity is
usually assessed by having a panel of experts rate the scale items for relevance to the
construct and comment on the need for additional items.
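Expert relevance ratings of this kind are often summarized with a content validity index (CVI). The sketch below assumes one common convention—each expert rates an item's relevance from 1 (not relevant) to 4 (highly relevant), and an item counts as relevant when rated 3 or 4; the ratings shown are hypothetical.

```python
# A minimal sketch of item-level and scale-level content validity
# indexes (I-CVI and S-CVI/Ave); all expert ratings are hypothetical.

# Rows = items; columns = relevance ratings (1-4) from five experts.
expert_ratings = [
    [4, 4, 3, 4, 4],   # item 1
    [3, 4, 4, 3, 4],   # item 2
    [2, 3, 2, 3, 3],   # item 3: weaker relevance
]

item_cvis = []
for i, ratings in enumerate(expert_ratings, start=1):
    # I-CVI: proportion of experts rating the item 3 or 4.
    i_cvi = sum(r >= 3 for r in ratings) / len(ratings)
    item_cvis.append(i_cvi)
    print(f"Item {i}: I-CVI = {i_cvi:.2f}")

# S-CVI/Ave: the mean of the item-level indexes.
print(f"S-CVI/Ave = {sum(item_cvis) / len(item_cvis):.2f}")
```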
Criterion validity is the extent to which the scores on a measure are a good
reflection of a “gold standard”—i.e., a criterion considered an ideal measure of the
construct. Not all measures can be validated using a criterion approach because there is
not always a “gold standard” criterion. Two types of criterion validity exist. Concurrent
validity is the type of criterion validity that is assessed when the measurements of the
criterion and the focal instrument occur at the same time. In such a situation, the
implicit hypothesis is that the focal measure is an adequate substitute for a
contemporaneous criterion. For example, scores on a scale to measure stress could be
compared to wake-up salivary free cortisol levels (the criterion). In predictive validity,
the focal measure is tested against a criterion that is measured in the future. Screening
scales are often tested against some future criterion—namely, the occurrence of the
phenomenon for which a screening tool is sought (e.g., a patient fall).
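Criterion validity is usually quantified by correlating scores on the focal measure with the criterion. The sketch below works through the concurrent validity example with hypothetical stress-scale scores and cortisol values; a strong correlation in the expected direction would support validity.

```python
# A minimal sketch of a concurrent criterion validity check using a
# Pearson correlation; the scores and cortisol values are hypothetical.
from math import sqrt

stress_scores = [12, 25, 31, 18, 40, 22, 35, 15]
cortisol = [0.31, 0.52, 0.61, 0.40, 0.77, 0.45, 0.66, 0.35]

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

print(f"r = {pearson_r(stress_scores, cortisol):.2f}")
```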
For many abstract, unobservable human attributes (constructs), no gold standard
criterion exists, and so other validation avenues must be pursued. Construct validity is
the degree to which evidence about a measure’s scores in relation to other variables
supports the inference that the construct has been well represented. Construct validity
typically involves hypothesis testing, which follows a similar path: Hypotheses are
developed about a relationship between scores on the focal measure and values on other
constructs, data are collected to test the hypotheses, and then validity conclusions are
reached based on the results of the hypothesis tests.
One widely used hypothesis testing approach to construct validity is sometimes
called known-groups validity, which tests hypotheses about a measure’s ability to
discriminate between two or more groups known (or expected) to differ with regard to
the construct of interest. For instance, in validating a measure of anxiety about the labor
experience, the scores of primiparas and multiparas could be contrasted. On average,
women who had never given birth would likely experience more anxiety than women
who had already had children; one might question the validity of the instrument if such
differences did not emerge.
Example of known-groups validity
Peters and colleagues (2014) evaluated the validity of an existing scale, the Trust in
Provider Scale, for a new population, namely, pregnant African American women.
Consistent with hypotheses, women who had experienced racism in health care had
significantly lower scores on the trust scale than women who had not.
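In practice, a known-groups analysis usually reduces to comparing group means on the focal measure. The sketch below applies this logic to the labor anxiety example described earlier, using hypothetical scores; Welch's t statistic is one common way to test the group difference.

```python
# A minimal sketch of a known-groups validity check comparing two
# groups expected to differ on the construct; all scores are hypothetical.
from statistics import mean, variance
from math import sqrt

primiparas = [32, 28, 35, 30, 33, 29, 31]   # first-time mothers
multiparas = [24, 26, 22, 27, 23, 25, 21]   # experienced mothers

m1, m2 = mean(primiparas), mean(multiparas)
n1, n2 = len(primiparas), len(multiparas)

# Welch's t statistic for the difference in group means.
t = (m1 - m2) / sqrt(variance(primiparas) / n1 + variance(multiparas) / n2)

print(f"Means: {m1:.1f} vs. {m2:.1f}; t = {t:.2f}")
# A sizable difference in the expected direction supports validity.
```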
TIP Another aspect of construct validity is called cross-cultural validity,
which is relevant for measures that have been translated or adapted for use
with a different cultural group than that for the original instrument. Cross-
cultural validity is the degree to which the components (e.g., items) of a
translated or culturally adapted measure perform adequately and equivalently
relative to their performance on the original instrument.
An instrument does not possess or lack validity; it is a question of degree. An
instrument’s validity is not proved, established, demonstrated, or verified but rather is
supported to a greater or lesser extent by evidence. Researchers undertaking a study
should select measures for which good validity information is available.
Critiquing Data Collection Methods
The goal of a data collection plan is to produce data that are of excellent quality. Every
decision researchers make about data collection methods and procedures can affect data
quality and hence the overall quality of the study.
It may, however, be difficult to critique data collection methods in studies reported
in journals because researchers' descriptions are seldom detailed. Nevertheless, researchers
do have a responsibility to communicate basic information about their approach so that
readers can assess the quality of evidence that the study yields. One important issue is
the mix of data collection approaches. Triangulation of methods (e.g., self-report and
observation) is often desirable.
Information about data quality (reliability and validity of the measures) should be
provided in every quantitative research report. Ideally—especially for composite scales
—the report should provide internal consistency coefficients based on data from the
study itself, not just from previous research. Interrater or interobserver reliability is
especially crucial for assessing data quality in studies that use observation. The values
of the reliability coefficients should be sufficiently high to support confidence in the
findings.
Validity is more difficult to document than reliability. At a minimum, researchers
should defend their choice of existing measures based on validity information from the
developers, and they should cite the relevant publication. Guidelines for critiquing data
collection methods are presented in Box 10.2.
Box 10.2 Guidelines for Critiquing Quantitative Data Collection Plans
1. Did the researchers use the best method of capturing study phenomena (i.e., self-
reports, observation, biophysiologic measures)? Was triangulation of methods
used to advantage?
2. If self-report methods were used, did the researchers make good decisions about
the specific methods used to solicit information (e.g., in-person interviews,
Internet questionnaires, etc.)? Were composite scales used? If not, should they
have been?
3. If observational methods were used, did the report adequately describe what the
observations entailed and how observations were sampled? Were risks of
observational bias addressed? Were biophysiologic measures used in the study,
and was this appropriate?
4. Did the report provide adequate information about data collection procedures?
Were data collectors properly trained?
5. Did the report offer evidence of the reliability of measures? Did the evidence come
from the research sample itself, or is it based on other studies? If reliability was
reported, which estimation method was used? Was the reliability sufficiently high?
6. Did the report offer evidence of the validity of the measures? If validity
information was reported, which validity approach was used?
7. If there was no reliability or validity information, what conclusion can you reach
about the quality of the data in the study?
In this section, we describe the sampling and data collection plan of a
quantitative nursing study. Read the summary and then answer the critical
thinking questions that follow, referring to the full research report if
necessary. Example 1 is featured on the interactive Critical Thinking Activity
on the accompanying website. The critical thinking questions for Example 2 are based
on the study that appears in its entirety in Appendix A of this book. Our
comments for these exercises are in the Student Resources section on the
accompanying website.
EXAMPLE 1: SAMPLING AND DATA COLLECTION IN A
QUANTITATIVE STUDY
Study: Insomnia symptoms are associated with abnormal endothelial
function (Routledge et al., 2015) (Some information about the study was
provided in Rask et al., 2011.)
Purpose: The purpose of this study was to test the hypothesis that insomnia
symptoms are associated with reduced endothelial function in working adults.
Design: The researchers used cross-sectional baseline data from a
longitudinal study that involved the collection of extensive data from people
enrolled in an Emory-Georgia Tech Predictive Health Institute study. The
design for the study reported by Routledge and colleagues was descriptive
correlational.
Sampling: The initial cohort of the study was a sample of full-time
employees of a large university. The population of eligible employees was
stratified by type of employees (faculty, exempt, and nonexempt employees).
From the stratified sampling frame, every 10th employee was invited to
participate in the research. About 30% of the solicited employees agreed to be
contacted, and about 10% were ultimately enrolled. Additionally, about 10%
of the sample was a convenience sample of workers from self-referral or
health care provider referral. Specific criteria for enrollment included
employees aged 18 years or older, with no hospitalization in the prior year
except for accidents. Exclusion criteria included a history in the previous year
of a severe psychosocial disorder, substance/drug abuse or alcoholism,
current active malignant neoplasm, and any acute illness in the 2 weeks
before baseline data collection. For the purpose of the Routledge et al. study,
participants were excluded if they had a sleep apnea diagnosis or reported
symptoms of sleep apnea. The sample for this study was 496 adults aged 19
to 82 years.
Data Collection: The overall study involved two baseline assessments, a 6-
month assessment, and four annual assessments. Baseline measures used in
the Routledge et al. study included both self-report and biophysiologic
measures. In terms of self-reports, participants completed an online
questionnaire that asked questions about background characteristics (e.g., age,
gender, smoking status). The questionnaire also included several composite
scales to measure sleep quality (the Pittsburgh Sleep Quality Index),
depression (the Beck Depression Inventory), and sleepiness (the Epworth
Sleepiness Scale). Information from the sleep scales was used to categorize
participants as being in an insomnia group or a “better sleepers” group.
Anthropometric measurements (height, weight, and body mass index) were
obtained, blood pressure was measured, and a blood draw was completed and
analyzed for lipids. Finally, endothelial function was measured using brachial
artery flow-mediated dilation (FMD) measures. FMD measurements were
read by two ultrasound technicians. Information about the reliability and
validity of the various measures was not provided.
Key Findings: In this sample, insomnia symptoms were reported by 40% of
the participants. After controlling statistically for age and other variables, the
researchers found that participants reporting insomnia symptoms had lower
FMD than did participants reporting better sleep.
Critical Thinking Exercises
1. Answer the relevant questions from Box 10.1 regarding this study.
2. Answer the relevant questions from Box 10.2 regarding this study.
3. Are there variables in this study that could have been measured through
observation but were not?
4. If the results of this study are valid and reliable, what might be some of the
uses to which the findings could be put in clinical practice?
EXAMPLE 2: SAMPLING AND DATA COLLECTION IN THE
STUDY IN APPENDIX A
• Read the methods section of Swenson and colleagues’ (2016) study
(“Parents’ use of praise and criticism in a sample of young children
seeking mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 10.1 regarding this study.
2. Answer the relevant questions from Box 10.2 regarding this study.
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the accompanying website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Vignettes and Q-Sorts
• Answers to the Critical Thinking Exercises for Example 2
• Internet Resources with useful websites for Chapter 10
• A Wolters Kluwer journal article in its entirety—the Routledge et al.
study described as Example 1 on pp. 178–179.
Additional study aids, including eight journal articles and related questions,
are also available in Study Guide for Essentials of Nursing Research,
9e.
Summary Points
Sampling is the process of selecting elements from a population, which is an
entire aggregate of cases. An element is the basic unit of a population—usually
humans in nursing research.
Eligibility criteria (including both inclusion criteria and exclusion criteria) are
used to define population characteristics.
A key criterion in assessing a sample in a quantitative study is its
representativeness—the extent to which the sample is similar to the population
and avoids bias. Sampling bias is the systematic overrepresentation or
underrepresentation of some segment of the population.
Nonprobability sampling (in which elements are selected by nonrandom
methods) includes convenience, quota, consecutive, and purposive sampling.
Nonprobability sampling is convenient and economical; a major disadvantage is
its potential for bias.
Convenience sampling uses the most readily available or convenient people.
Quota sampling divides the population into homogeneous strata (subpopulations)
to ensure representation of the subgroups in the sample; within each stratum,
people are sampled by convenience.
Consecutive sampling involves taking all of the people from an accessible
population who meet the eligibility criteria over a specific time interval or for a
specified sample size.
In purposive sampling, participants are handpicked to be included in the sample
based on the researcher’s knowledge about the population.
Probability sampling designs, which involve the random selection of elements
from the population, yield more representative samples than nonprobability
designs and permit estimates of the magnitude of sampling error.
Simple random sampling involves the random selection of elements from a
sampling frame that enumerates all the elements; stratified random sampling
divides the population into homogeneous subgroups from which elements are
selected at random.
Systematic sampling is the selection of every kth case from a list. By dividing the
population size by the desired sample size, the researcher establishes the sampling
interval, which is the standard distance between the selected elements (a brief
sketch of this procedure appears after these summary points).
In quantitative studies, researchers can use a power analysis to estimate sample
size needs. Large samples are preferable because they enhance statistical
conclusion validity and tend to be more representative, but even large samples do
not guarantee representativeness.
The three principal data collection methods for nurse researchers are self-reports,
observations, and biophysiologic measures.
Self-reports, which are also called patient-reported outcomes or PROs, involve
directly questioning study participants and are the most widely used method of
collecting data for nursing studies.
Structured self-reports for quantitative studies involve a formal instrument—a
questionnaire or interview schedule—that may contain open-ended questions
(which permit respondents to respond in their own words) and closed-ended
questions (which offer respondents response options from which to choose).
Questionnaires are less costly than interviews and offer the possibility of
anonymity, but interviews yield higher response rates and are suitable for a wider
variety of people.
Social psychological scales are self-report instruments for measuring such
characteristics as attitudes and psychological attributes. Likert scales (summated
rating scales) present respondents with a series of items; each item is scored (e.g.,
on a continuum from strongly agree to strongly disagree) and then summed into a
composite score.
A visual analog scale (VAS) is used to measure subjective experiences (e.g., pain,
fatigue) along a 100-mm line designating a bipolar continuum.
Scales are versatile and powerful but are susceptible to response set biases—the
tendency of some people to respond to items in characteristic ways, independently
of item content.
Observational methods are techniques for acquiring data through the direct
observation of phenomena.
Structured observations dictate what the observer should observe; they often
involve checklists—instruments based on category systems for recording the
appearance, frequency, or duration of behaviors or events. Observers may also use
rating scales to rate phenomena along a dimension of interest (e.g.,
lethargic/energetic).
Structured observations often involve a sampling plan (such as time sampling or
event sampling) for selecting the behaviors, events, and conditions to be observed.
Observational techniques are often essential, but observational biases can reduce
data quality.
Data may also be derived from biophysiologic measures, which include in vivo
measurements (those performed within or on living organisms) and in vitro
measurements (those performed outside the organism’s body, such as blood tests).
Biophysiologic measures have the advantage of being objective, accurate, and
precise.
In developing a data collection plan, researchers must decide who will collect the
data, how the data collectors will be trained, and what the circumstances for data
collection will be.
In quantitative studies, variables are measured. Measurement involves assigning
numbers to represent the amount of an attribute present in a person, using a set of
rules; researchers strive to use measures that have good rules that minimize
measurement errors.
Measures (and the quality of the data that the measures yield) can be evaluated in a
psychometric assessment in terms of several measurement properties, most
often reliability and validity.
Reliability is the extent to which scores for people who have not changed are the
same for repeated measurements. A reliable measure minimizes measurement
error.
Methods of assessing reliability include test–retest reliability (administering a
measure twice in a short period to see if the measure yields consistent scores),
interrater reliability (assessing whether two raters or observers independently
assign similar scores), and internal consistency (assessing whether there is
consistency across items in a composite scale in measuring a trait).
Reliability is assessed statistically by computing coefficients that range from .00 to
1.00; higher values indicate greater reliability.
Validity is the degree to which an instrument measures what it is supposed to
measure.
Aspects of validity include face validity (the extent to which a measure looks like
it is measuring the target construct), content validity (in composite scales, the
extent to which an instrument’s content adequately captures the construct),
criterion validity (the extent to which scores on a measure are a good reflection
of a “gold standard”), and construct validity (the extent to which an instrument
adequately measures the targeted construct, as assessed mainly by testing
hypotheses).
A measure’s validity is not proved or established but rather is supported to a
greater or lesser extent by evidence.
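As the systematic sampling summary point notes, the sampling interval is found by dividing the population size by the desired sample size. The sketch below illustrates the procedure with a hypothetical sampling frame of 1,000 names.

```python
# A minimal sketch of systematic sampling from a hypothetical frame:
# interval k = population size / desired sample size; select every kth
# element after a random start within the first interval.
import random

population = [f"employee_{i:04d}" for i in range(1, 1001)]  # N = 1,000
desired_n = 100

k = len(population) // desired_n        # sampling interval: 1,000 / 100 = 10
start = random.randint(0, k - 1)        # random starting point
sample = population[start::k]           # every kth element thereafter

print(f"Interval k = {k}; sample size = {len(sample)}")
```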
REFERENCES FOR CHAPTER 10
Bryant, V., Phang, J., & Abrams, K. (2015). Verifying placement of small-bore feeding tubes: Electromagnetic
device images versus abdominal radiographs. American Journal of Critical Care, 24, 525–530.
Buettner-Schmidt, K., Lobo, M., Travers, M., & Boursaw, B. (2015). Tobacco smoke exposure and impact of
smoking legislation on rural and non-rural hospitality venues in North Dakota. Research in Nursing & Health,
38, 268–277.
*Burk, R., Grap, M., Munro, C., Schubert, C., & Sessler, C. (2014). Predictors of agitation in the adult critically ill.
American Journal of Critical Care, 23, 414–423.
Draughton Moret, J., Hauda, W., II, Price, B., & Sheridan, D. (2016). Nonoccupational postexposure human
immunodeficiency virus prophylaxis: Acceptance following sexual assault. Nursing Research, 65, 47–54.
Hewitt, C., & Cappiello, J. (2015). Essential competencies in nursing education for prevention and care related to
unintended pregnancy. Journal of Obstetric, Gynecologic, & Neonatal Nursing, 44, 69–76.
*Hu, R., Jiang, X., Hegadoren, K., & Zhang, Y. (2015). Effects of earplugs and eye masks combined with relaxing
music on sleep, melatonin and cortisol levels in ICU patients: A randomized controlled trial. Critical Care, 19,
115.
Huang, C. Y., Lai, H., Lu, Y., Chen, W., Chi, S., Lu, C., & Chen, C. (2016). Risk factors and coping style affect
health outcomes in adults with type 2 diabetes. Biological Research for Nursing, 18, 82–89.
*Joseph, P. V., Reed, D., & Mennella, J. (2016). Individual differences among children in sucrose detection
thresholds. Nursing Research, 65, 3–12.
Kennedy, E., Murphy, G., Misener, R., & Alder, E. (2015). Development and psychometric assessment of the
Nursing Competence Self-Efficacy Scale. Journal of Nursing Education, 54, 550–558.
*Neta, D., DaSilva, A., & DaSilva, G. (2015). Adherence to foot self-care in diabetes mellitus patients. Revista
Brasileira de Enfermagem, 68, 103–108.
*Nilsen, M., Sereika, S., Hoffman, L., Barnato, A., Donovan, H., & Happ, M. (2014). Nurse and patient interaction
behaviors’ effects on nursing care quality for mechanically ventilated older adults in the ICU. Research in
Gerontological Nursing, 7, 113–125.
Okoli, C., Kodet, J., & Robertson, H. (2016). Behavioral and physiological responses to nicotine patch
administration among nonsmokers based on acute and chronic secondhand tobacco smoke exposure. Biological
Research for Nursing, 18, 60–67.
Oliver, J., Ewell, P., Nicholls, K., Chapman, K., & Ford, S. (2016). Differences in colorectal cancer risk knowledge
among Alabamians. Oncology Nursing Forum, 43, 77–85.
Peters, R. M., Benkert, R., Templin, T., & Cassidy-Bushrow, A. (2014). Measuring African American women’s
trust in provider during pregnancy. Research in Nursing & Health, 37, 144–154.
Polit, D. F., & Yang, F. M. (2016). Measurement and the measurement of change: A primer for health
professionals. Philadelphia, PA: Wolters Kluwer.
Ranse, K., Yates, P., & Coyer, F. (2015). Factors influencing the provision of end-of-life care in critical care
settings: Development and testing of a survey instrument. Journal of Advanced Nursing, 71, 697–709.
Rask, K., Brigham, K., & Johns, M. (2011). Integrating comparative effectiveness research programs into
predictive health: A unique role for academic health centers. Academic Medicine, 86, 718–723.
Ratanasiripong, N. T. (2015). Factors related to human papillomavirus (HPV) vaccination in college men. Public
Health Nursing, 32, 645–653.
Ridout, J., Aucoin, J., Browning, A., Piedra, K., & Weeks, S. (2014). Does perioperative documentation transfer
reliably? Computers, Informatics, Nursing, 32, 37–42.
**Routledge, F., Dunbar, S., Higgins, M., Rogers, A., Feeley, C., Ioachimescu, O., . . . Quyyumi, A. (2015).
Insomnia symptoms are associated with abnormal endothelial function. Journal of Cardiovascular Nursing.
Advance online publication.
Wang, W., Zhang, H., Lopez, V., Wu, V., Poo, D., & Kowitlawakul, Y. (2015). Improving awareness, knowledge
and heart-related lifestyle of coronary heart disease among working population through a mHealth programme:
Study protocol. Journal of Advanced Nursing, 71, 2200–2207.
*A link to this open-access article is provided in the Internet Resources section on the accompanying website.
**This journal article is available on the accompanying website for this chapter.
11 Qualitative Designs and Approaches
Learning Objectives
On completing this chapter, you will be able to:
Discuss the rationale for an emergent design in qualitative research and describe
qualitative design features
Identify the major research traditions for qualitative research and describe the domain
of inquiry of each
Describe the main features and methods associated with ethnographic,
phenomenologic, and grounded theory studies
Describe key features of historical research, case studies, narrative analysis, and
descriptive qualitative studies
Discuss the goals and features of research with an ideological perspective
Define new terms in the chapter
Key Terms
Basic social process (BSP)
Bracketing
Case study
Constant comparison
Constructivist grounded theory
Core variable
Critical ethnography
Critical theory
Descriptive phenomenology
Descriptive qualitative study
Emergent design
Ethnonursing research
Feminist research
Grounded theory
Hermeneutics
Historical research
Interpretive phenomenology
Narrative analysis
Participant observation
Participatory action research (PAR)
Reflexive journal
THE DESIGN OF QUALITATIVE STUDIES
Quantitative researchers develop a research design before collecting their data and
rarely depart from that design once the study is underway: They design and then they
do. In qualitative research, by contrast, the study design often evolves during the
project: Qualitative researchers design as they do. Qualitative studies use an emergent
design that evolves as researchers make ongoing decisions about their data needs based
on what they have already learned. An emergent design supports the researchers’ desire
to have the inquiry reflect the realities and viewpoints of those under study—realities
and viewpoints that are not known at the outset.
Characteristics of Qualitative Research Design
Qualitative inquiry has been guided by different disciplines with distinct methods and
approaches. Some characteristics of qualitative research design are broadly applicable,
however. In general, qualitative design
Is flexible, capable of adjusting to what is learned during data collection
Often involves triangulating various data collection strategies
Tends to be holistic, striving for an understanding of the whole
Requires researchers to become intensely involved and reflexive and can require a lot
of time
Benefits from ongoing data analysis to guide subsequent strategies
Although design decisions are not finalized beforehand, qualitative researchers
typically do advance planning that supports their flexibility. For example, qualitative
researchers make advance decisions with regard to their research tradition, the study
site, a broad data collection strategy, and the equipment they will need in the field.
Qualitative researchers plan for a variety of circumstances, but decisions about how to
deal with them are resolved when the social context is better understood.
Qualitative Design Features
Some of the design features discussed in Chapter 9 apply to qualitative studies. To
contrast quantitative and qualitative research design, we consider the elements identified
in Table 9.1.
Intervention, Control, and Blinding
Qualitative research is almost always nonexperimental—although a qualitative substudy
may be embedded in an experiment (see Chapter 13). Qualitative researchers do not
conceptualize their studies as having independent and dependent variables and rarely
control the people or environment under study. Blinding is rarely used by qualitative
researchers. The goal is to develop a rich understanding of a phenomenon as it exists
and as it is constructed by individuals within their own context.
Comparisons
Qualitative researchers typically do not plan to make group comparisons because the
intent is to thoroughly describe or explain a phenomenon. Yet, patterns emerging in the
data sometimes suggest illuminating comparisons. Indeed, as Morse (2004) noted in an
editorial in Qualitative Health Research, “All description requires comparisons” (p.
1323). In analyzing qualitative data and in determining whether categories are saturated,
there is a need to compare “this” to “that.”
Example of qualitative comparisons
Olsson and coresearchers (2015) studied patients’ decision making about undergoing
transcatheter aortic valve implantation of severe aortic stenosis. They identified three
distinct patterns of decision making in their sample of 24 patients, who were either
ambivalent about the treatment, obedient and willing to let others decide, or
reconciled and accepting of the treatment.
Research Settings
Qualitative researchers usually collect their data in naturalistic settings. And, whereas
quantitative researchers usually strive to collect data in one type of setting to maintain
constancy of conditions (e.g., conducting all interviews in participants’ homes),
qualitative researchers may deliberately study phenomena in a variety of natural contexts,
especially in ethnographic research.
Time Frames
Qualitative research, like quantitative research, can be either cross-sectional, with one
data collection point, or longitudinal, with multiple data collection points designed to
observe the evolution of a phenomenon.
Example of a longitudinal qualitative study
Hansen and colleagues (2015) studied the illness experiences of patients with
hepatocellular carcinoma near the end of life. Data were collected through in-depth
interviews once a month for up to 6 months from 14 patients.
Causality and Qualitative Research
In evidence hierarchies that rank evidence in terms of support of causal inferences (e.g.,
the one in Fig. 2.1), qualitative research is often near the base, which has led some to
criticize evidence-based initiatives. The issue of causality, which has been controversial
throughout the history of science, is especially contentious in qualitative research.
Some believe that causality is an inappropriate construct within the naturalistic
paradigm. For example, Lincoln and Guba (1985) devoted an entire chapter of their
book to a critique of causality and argued that it should be replaced with a concept that
they called mutual shaping. According to their view, “Everything influences everything
else, in the here and now” (p. 151).
Others, however, believe that qualitative methods are particularly well suited to
understanding causal relationships. For example, Huberman and Miles (1994) argued
that qualitative studies “can look directly and longitudinally at the local processes
underlying a temporal series of events and states, showing how these led to specific
outcomes, and ruling out rival hypotheses” (p. 434).
In attempting to not only describe but also explain phenomena, qualitative
researchers who undertake in-depth studies will inevitably reveal patterns and processes
suggesting causal interpretations. These interpretations can be (and often are) subjected
to more systematic testing using more controlled methods of inquiry.
QUALITATIVE RESEARCH TRADITIONS
There is a wide variety of qualitative approaches. One classification system involves
categorizing qualitative research according to disciplinary traditions. These traditions
vary in their conceptualization of what types of questions are important to ask and in the
methods considered appropriate for answering them. Table 11.1 provides an overview
of several such traditions, some of which we introduced previously. This section
describes traditions that have been prominent in nursing research.
Ethnography
Ethnography involves the description and interpretation of a culture and cultural
behavior. Culture refers to the way a group of people live—the patterns of human
activity and the values and norms that give activity significance. Ethnographies
typically involve extensive fieldwork, which is the process by which the ethnographer
comes to understand a culture. Because culture is, in itself, not visible or tangible, it
must be inferred from the words, actions, and products of members of a group.
Ethnographic research sometimes concerns broadly defined cultures (e.g., the Maori
culture of New Zealand) in what is called a macroethnography. Ethnographers may
also focus on more narrowly defined cultures in a focused
ethnography. Focused ethnographies are studies of small units in a group or culture
(e.g., the culture of an intensive care unit). An underlying assumption of the
ethnographer is that every human group eventually evolves a culture that guides the
members’ view of the world and the way they structure their experiences.
Example of a focused ethnography
Taylor and colleagues (2015) used a focused ethnographic approach to study nurses’
experiences of caring for older adults in the emergency department.
Ethnographers seek to learn from (rather than to study) members of a cultural group
—to understand their worldview. Ethnographers distinguish “emic” and “etic”
perspectives. An emic perspective refers to the way the members of the culture regard
their world—the insiders’ view. The emic is the local concepts or means of expression
used by members of the group under study to characterize their experiences. The etic
perspective, by contrast, is the outsiders’ interpretation of the culture’s experiences—the
words and concepts they use to refer to the same phenomena. Ethnographers strive to
acquire an emic perspective of a culture and to reveal tacit knowledge—information
about the culture that is so deeply embedded in cultural experiences that members do
not talk about it or may not even be consciously aware of it.
Three broad types of information are usually sought by ethnographers: cultural
behavior (what members of the culture do), cultural artifacts (what members make and
use), and cultural speech (what they say). Ethnographers rely on a wide variety of data
sources, including observations, in-depth interviews, records, and other types of
physical evidence (e.g., photographs, diaries). Ethnographers typically use a strategy
called participant observation in which they make observations of the culture under
study while participating in its activities. Ethnographers also enlist key
informants to help them understand and interpret the events and activities being
observed.
Ethnographic research is time-consuming—months and even years of fieldwork
may be required to learn about a culture. Ethnography requires a certain level of
intimacy with members of the cultural group, and such intimacy can be developed only
over time and by working with those members as active participants.
The products of ethnographies are rich, holistic descriptions and interpretations of
the culture under study. Among health care researchers, ethnography provides access to
the health beliefs and health practices of a culture. Ethnographic inquiry can thus help to
foster understanding of behaviors affecting health and illness. Leininger (1985) coined
the phrase ethnonursing research, which she defined as “the study and analysis of the
local or indigenous people’s viewpoints, beliefs, and practices about nursing care
behavior and processes of designated cultures” (p. 38).
Example of an ethnonursing study
López Entrambasaguas and colleagues (2015) conducted an ethnonursing study to
describe and understand cultural patterns related to HIV risk in Ayoreo women
(indigenous Bolivians) who work in sex trades.
Ethnographers are often, but not always, “outsiders” to the culture under study. A
type of ethnography that involves self-scrutiny (including scrutiny of groups or cultures
to which researchers themselves belong) is called autoethnography or insider research.
Autoethnography has several advantages, including ease of recruitment and the ability
to get candid data based on preestablished trust. The drawback is that an “insider” may
have biases about certain issues or may be so entrenched in the culture that valuable
data get overlooked.
Phenomenology
Phenomenology is an approach to understanding people’s everyday life experiences.
Phenomenologic researchers ask: What is the essence of this phenomenon as
experienced by these people, and what does it mean? Phenomenologists assume there is
an essence—an essential structure—that can be understood, much as ethnographers
assume that cultures exist. Essence is what makes a phenomenon what it is, and without
which, it would not be what it is. Phenomenologists investigate subjective phenomena
in the belief that critical truths about reality are grounded in people’s lived experiences.
The topics appropriate to phenomenology are ones that are fundamental to the life
experiences of humans, such as the meaning of suffering or the quality of life with
chronic pain.
In phenomenologic studies, the main data source is in-depth conversations. Through
these conversations, researchers strive to gain entrance into the informants’ world and to
have access to their experiences as lived. Phenomenologic studies usually involve a
small number of participants—often, 10 or fewer. For some phenomenologic
researchers, the inquiry includes gathering not only information from informants but
also efforts to experience the phenomenon, through participation, observation, and
reflection. Phenomenologists share their insights in rich, vivid reports that describe key
themes. The results section in a phenomenological report should help readers “see”
something in a different way that enriches their understanding of experiences.
Phenomenology has several variants and interpretations. The two main schools of
thought are descriptive phenomenology and interpretive phenomenology
(hermeneutics).
Descriptive Phenomenology
Descriptive phenomenology was developed first by Husserl, who was primarily
interested in the question, What do we know as persons? Descriptive phenomenologists
insist on the careful portrayal of ordinary conscious experience of everyday life—a
depiction of “things” as people experience them. These “things” include hearing, seeing,
believing, feeling, remembering, deciding, and evaluating.
Descriptive phenomenologic studies often involve the following four steps:
bracketing, intuiting, analyzing, and describing. Bracketing refers to the process of
identifying and holding in abeyance preconceived beliefs and opinions about the
phenomenon under study. Researchers strive to bracket out presuppositions in an effort
to confront the data in pure form. Phenomenological researchers (as well as other
qualitative researchers) often maintain a reflexive journal in their efforts to bracket.
Intuiting, the second step in descriptive phenomenology, occurs when researchers
remain open to the meanings attributed to the phenomenon by those who have
experienced it. Phenomenologic researchers then proceed to an analysis (i.e., extracting
significant statements, categorizing, and making sense of essential meanings). Finally,
the descriptive phase occurs when researchers come to understand and define the
phenomenon.
Example of a descriptive phenomenological study
Meyer and coresearchers (2016) used a descriptive phenomenological approach in
their study of spouses’ experiences of living with a partner affected with dementia.
Interpretive Phenomenology
Heidegger, a student of Husserl, is the founder of interpretive phenomenology or
hermeneutics. Heidegger stressed interpreting and understanding—not just describing—
human experience. He believed that lived experience is inherently an interpretive
process and argued that hermeneutics (“understanding”) is a basic characteristic of
human existence. (The term hermeneutics refers to the art and philosophy of
interpreting the meaning of an object, such as a text or work of art.) The goals of
interpretive phenomenological research are to enter another’s world and to discover the
understandings found there.
Gadamer, another interpretive phenomenologist, described the interpretive process
as a circular relationship—the hermeneutic circle—where one understands the whole of
a text (e.g., an interview transcript) in terms of its parts and the parts in terms of the
whole. Researchers continually question the meanings of the text.
Heidegger believed it is impossible to bracket one’s being-in-the-world, so
bracketing does not occur in interpretive phenomenology. Hermeneutics presupposes
prior understanding on the part of the researcher. Interpretive phenomenologists ideally
approach each interview text with openness—they must be open to hearing what it is the
text is saying.
Interpretive phenomenologists, like descriptive phenomenologists, rely primarily on
in-depth interviews with individuals who have experienced the phenomenon of interest,
but they may go beyond a traditional approach to gathering and analyzing data. For
example, interpretive phenomenologists sometimes augment their understandings of the
phenomenon through an analysis of supplementary texts, such as novels, poetry, or
other artistic expressions—or they use such materials in their conversations with study
participants.
Example of an interpretive phenomenological study
LaDonna and colleagues (2016) used an interpretive phenomenological approach in
their exploration of the experience of caring for individuals with dysphagia and
myotonic dystrophy.
HOW-TO-TELL TIP How can you tell if a phenomenological study
is descriptive or interpretive? Phenomenologists often use terms that can
help you make this determination. In a descriptive phenomenological study,
such terms may be bracketing, description, essence, and Husserl. The
names Colaizzi, Van Kaam, or Giorgi may be mentioned in the methods
section. In an interpretive phenomenological study, key terms can include
being-in-the-world, hermeneutics, understanding, and Heidegger. The
names van Manen or Benner may appear in the methods section, as we
discuss in Chapter 16 on qualitative data analysis.
Grounded Theory
Grounded theory has contributed to the development of many middle-range theories of
phenomena relevant to nurses. Grounded theory was developed in the 1960s by two
sociologists, Glaser and Strauss (1967), whose theoretical roots were in symbolic
interaction, which focuses on the manner in which people make sense of social
interactions.
Grounded theory tries to account for people’s actions from the perspective of those
involved. Grounded theory researchers seek to identify a main concern or problem and
then to understand the behavior designed to resolve it—the core variable. One type of
core variable is a basic social process (BSP). Grounded theory researchers generate
conceptual categories and integrate them into a substantive theory, grounded in the data.
Grounded Theory Methods
Grounded theory methods constitute an entire approach to the conduct of field research.
A study that truly follows Glaser and Strauss’s (1967) precepts does not begin with a
focused research problem. The problem and the process used to resolve it emerge from
the data and are discovered during the study. In grounded theory research, data
collection, data analysis, and sampling of participants occur simultaneously. The
grounded theory process is recursive: Researchers collect data, categorize them,
describe the emerging central phenomenon, and then recycle earlier steps.
A procedure called constant comparison is used to develop and refine theoretically
relevant concepts and categories. Categories elicited from the data are constantly
compared with data obtained earlier so that commonalities and variations can be
detected. As data collection proceeds, the inquiry becomes increasingly focused on the
emerging theory.
In-depth interviews and participant observation are common data sources in
grounded theory studies, but existing documents and other data may also be used.
Typically, a grounded theory study involves interviews with a sample of about 20 to 30
people.
Alternate Views of Grounded Theory
In 1990, Strauss and Corbin published a controversial book, Basics of Qualitative
Research: Techniques and Procedures for Developing Grounded Theory. The book’s
stated purpose was to provide beginning grounded theory researchers with basic
procedures for building a grounded theory. That book is currently in its fourth edition
(Corbin & Strauss, 2015).
Glaser, however, disagreed with some procedures advocated by Strauss (his original
coauthor) and Corbin (a nurse researcher). Glaser (1992) believed that Strauss and
Corbin (1990) developed a method that is not grounded theory but rather what he called
“full conceptual description.” According to Glaser, the purpose of grounded theory is to
generate concepts and theories that explain and account for variation in behavior in the
substantive area under study. Conceptual description, by contrast, is aimed at describing
the full range of behavior of what is occurring in the substantive area.
Nurse researchers have conducted grounded theory studies using both the original
Glaser and Strauss (1967) and the Corbin and Strauss (2015) approaches. They also use
an approach called constructivist grounded theory (Charmaz, 2014). Charmaz (2014)
regards Glaser and Strauss’ grounded theory as having positivist roots. In Charmaz’s
approach, the developed grounded theory is seen as an interpretation. The data collected
and analyzed are acknowledged to be constructed from shared experiences and
relationships between the researcher and the participants. Data and analyses are viewed
as social constructions.
Example of a grounded theory study
Johansson and coresearchers (2015) used constructivist grounded theory methods to
explore self-reorientation in the early recovery phase following colorectal cancer
treatment. Data were gathered through in-depth interviews with 17 patients 3 to 9
months after surgery.
Historical Research
Historical research is the systematic collection and critical evaluation of data relating
to past occurrences. Historical research relies primarily on qualitative (narrative) data
but can sometimes involve statistical analysis of quantitative data. Nurses have used
historical research methods to examine a wide range of phenomena in both the recent
and more distant past.
Data for historical research are usually in the form of written records: diaries, letters,
newspapers, medical documents, and so forth. Nonwritten materials, such as photographs
and films, can be forms of historical data. In some cases, it is possible to conduct
interviews with people who participated in historical events (e.g., nurses who served in
recent wars).
Historical research is usually interpretive. Historical researchers try to describe what
happened and also how and why it happened. Relationships between events and ideas,
between people and organizations, are explored and interpreted within their historical
context and within the context of new viewpoints about what is historically significant.
Example of historical research
Irwin (2016) conducted a historical study of the role of nurses in post-World War I
relations between the United States and Europe. The analysis was based on letters,
diaries, official reports, and published articles on the role of American Red Cross
nurses.
OTHER TYPES OF QUALITATIVE RESEARCH
Qualitative studies often can be characterized and described in terms of the disciplinary
research traditions discussed in the previous section. However, several other important
types of qualitative research not associated with a particular discipline also deserve
mention.
Case Studies
Case studies are in-depth investigations of a single entity or small number of entities.
The entity may be an individual, family, institution, or other social unit. Case study
researchers attempt to understand issues that are important to the circumstances of the
focal entity.
In most studies, whether quantitative or qualitative, certain phenomena or variables
are the core of the inquiry. In a case study, the case itself is at “center stage.” The focus
of case studies is typically on understanding why an individual thinks, behaves, or
develops in a particular manner rather than on what his or her status or actions are.
Probing research of this type may require study over a considerable period. Data are
often collected not only about the person’s present state but also about past experiences
relevant to the problem being examined.
The greatest strength of case studies is the depth that is possible when a small
number of entities is being investigated. Case study researchers can gain an intimate
knowledge of a person’s feelings, actions, and intentions. Yet, this same strength is a
potential weakness: Researchers’ familiarity with the case may make objectivity more
difficult. Another limitation of case studies concerns generalizability: If researchers
discover important relationships, it is difficult to know whether the same relationships
would occur with others. However, case studies can play a role in challenging
generalizations from other types of research.
Example of a case study
Graneheim and colleagues (2015) conducted an in-depth case study that focused on
the interactions between professional caregivers and one woman with schizophrenia
and dementia. The data were obtained through observations and interviews at the
woman’s residential home.
Narrative Analyses
Narrative analysis focuses on story as the object of inquiry to understand how
individuals make sense of events in their lives. The underlying premise of narrative
research is that people most effectively make sense of their world—and communicate
these meanings—by narrating stories. Individuals construct stories when they wish to
understand specific events and situations that require linking an inner world of needs to
an external world of observable actions. Analyzing stories opens up the forms of telling
about experience, not simply the content. Narrative analysts ask, Why did the story
get told that way? A number of structural approaches can be used to analyze stories,
including ones based in literary analysis and linguistics.
Example of a narrative analysis
Tobin and colleagues (2014), including Beck, an author of this textbook, conducted a
narrative analysis of asylum-seeking women’s experience of childbirth in Ireland.
Twenty-two mothers told their stories during in-depth interviews lasting from 40
minutes to 1.5 hours. Highlighted in their narratives was the lack of communication,
connection, and culturally competent care.
Descriptive Qualitative Studies
Many qualitative studies claim no particular disciplinary or methodologic roots. The
researchers may simply indicate that they have conducted a qualitative study, a
naturalistic inquiry, or a content analysis of qualitative data (i.e., an analysis of themes
and patterns that emerge in the narrative content). Thus, some qualitative studies do not
have a formal name or do not fit into the typology we have presented in this chapter. We
refer to these as descriptive qualitative studies.
Descriptive qualitative studies tend to be eclectic in their designs and methods and
are based on the general premises of constructivist inquiry. These studies are
infrequently discussed in research methods textbooks. The chapter supplement on
the accompanying website presents information on descriptive qualitative studies and on
studies that nurse researcher Sally Thorne (2013) called interpretive description.
Example of a descriptive qualitative study
Cal and Bahar (2015) did a descriptive qualitative study to explore women’s barriers
to the prevention of lymphedema after surgery for breast cancer. The researchers
conducted in-depth interviews with 14 women with lymphedema.
Research With Ideological Perspectives
Some qualitative researchers conduct inquiries within an ideological framework,
typically to draw attention to certain social problems or the needs of certain groups and
to bring about change. These approaches represent important investigative avenues.
Critical Theory
Critical theory originated with a group of Marxist-oriented German scholars in the
1920s. Essentially, a critical researcher is concerned with a critique of society and with
envisioning new possibilities. Critical social science is action oriented. Its aim is to
make people aware of contradictions and disparities in social practices and become
inspired to change them. Critical theory calls for inquiries that foster enlightened self-
knowledge and sociopolitical action.
Critical researchers often triangulate methods and emphasize multiple perspectives
(e.g., alternative racial or social class perspectives) on problems. Critical researchers
typically interact with participants in ways that emphasize participants’ expertise.
Critical theory has been applied in several disciplines but has played an especially
important role in ethnography. Critical ethnography focuses on raising consciousness
in the hope of effecting social change. Critical ethnographers attempt to increase the
political dimensions of cultural research and undermine oppressive systems.
Example of a critical ethnography
Speechley and colleagues (2015) developed an “ethnodrama” to catalyze dialogue in
home-based dementia care. The script was derived from a critical ethnographic study
that followed people living with dementia, and their caregivers, over an 18-month
period. Their script was designed to disseminate their research findings “in a way that
catalyzes and fosters critical (actionable) dialogue” (p. 1551).
Feminist Research
Feminist research is similar to critical theory research, but the focus is on gender
domination and discrimination within patriarchal societies. Similar to critical
researchers, feminist researchers seek to establish collaborative and nonexploitative
relationships with their informants and to conduct research that is transformative.
Feminist investigators seek to understand how gender and a gendered social order have
shaped women’s lives. The aim is to facilitate change in ways relevant to ending
women’s unequal social position.
Feminist research methods typically include in-depth, interactive, and collaborative
individual or group interviews that offer the possibility of reciprocally educational
encounters. Feminists usually seek to negotiate the meanings of the results with those
participating in the study and to be self-reflective about what they themselves are
learning.
Example of feminist research
Sutherland and colleagues (2016) used a critical feminist lens in their study of the
processes that shape gender inequities in hospice and palliative home care for seniors
with cancer.
Participatory Action Research
Participatory action research (PAR) is based on the view that the production of
knowledge can be used to exert power. PAR researchers typically work with groups or
communities that are vulnerable to the control or oppression of a dominant group.
The PAR tradition has as its starting point a concern for the powerlessness of the
group under study. In PAR, researchers and participants collaborate in defining the
problem, selecting research methods, analyzing the data, and deciding how the findings
will be used. The aim of PAR is to produce not only knowledge but also action,
empowerment, and consciousness raising.
In PAR, the research methods are designed to facilitate processes of collaboration
that can motivate and generate community solidarity. Thus, “data-gathering” strategies
are not only the traditional methods of interview and observation but also storytelling,
sociodrama, photography, and other activities designed to encourage people to find
creative ways to explore their lives, tell their stories, and recognize their own strengths.
Example of participatory action research
Baird and colleagues (2015) used community-based action research to explore the
partnership between researchers, students, and South Sudanese refugee women to
address health challenges stemming from the women’s resettlement to the United
States.
CRITIQUING QUALITATIVE DESIGNS
Evaluating a qualitative design is often difficult. Qualitative researchers do not always
document design decisions or describe the process by which such decisions were made.
Researchers often do, however, indicate whether the study was conducted within a
specific qualitative tradition. This information can be used to come to some conclusions
about the study design. For example, if a report indicated that the researcher conducted
2 months of fieldwork for an ethnographic study, you might suspect that insufficient
time had been spent in the field to obtain an emic perspective of the culture under study.
Ethnographic studies may also be critiqued if their only source of information was from
interviews rather than from a broader range of data sources, particularly observations.
In a grounded theory study, look for evidence about when the data were collected
and analyzed. If the researcher collected all the data before analyzing any of it, you
might question whether the constant comparative method was used correctly.
In critiquing a phenomenological study, you should first determine if the study is
descriptive or interpretive. This will help you to assess how closely the researcher kept
to the basic tenets of that qualitative research tradition. For example, in a descriptive
phenomenological study, did the researcher bracket? When critiquing a
phenomenological study, in addition to critiquing the methodology, you should also
look at its power in capturing the meaning of the phenomena being studied.
No matter what qualitative design is identified in a study, look to see if the
researchers stayed true to a single qualitative tradition throughout the study or if they
mixed qualitative traditions. For example, did the researcher state that grounded theory
was used but then present results that described themes instead of a substantive theory?
The guidelines in Box 11.1 are designed to assist you in critiquing the designs of
qualitative studies.
Box 11.1 Guidelines for Critiquing Qualitative Designs
1. Was the research tradition for the qualitative study identified? If none was
identified, can one be inferred?
2. Is the research question congruent with a qualitative approach and with the
specific research tradition? Are the data sources and research methods congruent
with the research tradition?
3. How well was the research design described? Are design decisions explained and
justified? Does it appear that the design emerged during data collection, allowing
researchers to capitalize on early information?
4. Did the design lend itself to a thorough, in-depth examination of the focal
phenomenon? Was there evidence of reflexivity? What design elements might
have strengthened the study (e.g., a longitudinal perspective rather than a cross-
sectional one)?
5. Was the study undertaken with an ideological perspective? If so, is there evidence
that ideological goals were achieved? (e.g., Was there full collaboration between
researchers and participants? Did the research have the power to be
transformative?)
This section presents examples of qualitative studies. Read these summaries
and then answer the critical thinking questions, referring to the full research
report if necessary. Example 1 is featured on the interactive Critical Thinking
Activity on the book's website. The critical thinking questions for Example 2 are
based on the study that appears in its entirety in Appendix B of this book. Our
comments for this exercise are in the Student Resources section on the book's website.
EXAMPLE 1: A GROUNDED THEORY STUDY
Study: The psychological process of breast cancer patients receiving initial
chemotherapy (Chen et al., 2015)
Statement of Purpose: The purpose of the study was to generate a theory to
describe the psychological stages of Taiwanese breast cancer patients going
through initial chemotherapy.
Method: The researchers used Glaser's (1992) approach to grounded theory to
understand women's psychological processes. Women were recruited from a
teaching hospital in southern Taiwan. Twenty breast cancer patients who had
finished their first round of chemotherapy were invited to participate in the
study, and none refused. Toward the end of the study, participants were
selected on the basis of categories that emerged from the analysis
of early data. In-depth interviews, lasting 30 to 60 minutes, were conducted in
a quiet private room in the hospital by a nurse who had extensive knowledge
about breast cancer chemotherapy. The interviewer asked broad questions,
such as “What was on your mind before receiving chemotherapy?” and “How
did the chemotherapy affect your life?” The interviewer was guided by the
women’s answers to ask more probing questions that could be linked to
emergent concepts to reach theoretical saturation. The interviews were
audiorecorded and subsequently transcribed for analysis. Constant
comparison was used in the analysis. The lead researcher maintained a
reflective journal “to help with self-awareness.”
Key Findings: The main concern in this study was the psychological aspects
of going through chemotherapy. The analysis revealed a core category that
the researchers called “rising from the ashes.” The four stages of the
psychological process were (1) the fear stage, (2) the hardship stage, (3) the
adjustment stage, and (4) the relaxation stage, when patients accepted the
disease-related changes in their lives.
Critical Thinking Exercises
1. Answer the relevant questions from Box 11.1 regarding this study.
2. Also consider the following targeted questions:
a. Was this study cross-sectional or longitudinal?
b. Could this study have been undertaken as an ethnography? A
phenomenological inquiry?
3. If the results of this study are trustworthy, what are some of the uses to
which the findings might be put in clinical practice?
EXAMPLE 2: PHENOMENOLOGICAL STUDY IN
APPENDIX B
• Read the methods section of Beck and Watson’s (2010) study
(“Subsequent childbirth after a previous traumatic birth”) in Appendix B
of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 11.1 regarding this study.
2. Also consider the following targeted questions:
a. Was this study a descriptive or interpretive phenomenology?
b. Could this study have been conducted as a grounded theory study? As
an ethnographic study? Why or why not?
c. Could this study have been conducted as a feminist inquiry? If yes, what
might Beck have done differently?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the book's website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Qualitative Descriptive Studies
• Answer to the Critical Thinking Exercises for Example 2
• Internet Resources with useful websites for Chapter 11
• A Wolters Kluwer journal article in its entirety—the Chen et al. study
described as Example 1 on p. 194.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
Qualitative research involves an emergent design that develops in the field as the
study unfolds. Qualitative studies can be either cross-sectional or longitudinal.
Ethnography focuses on the culture of a group of people and relies on extensive
fieldwork that usually includes participant observation and in-depth interviews
with key informants. Ethnographers strive to acquire an emic (insider’s)
perspective of a culture rather than an etic (outsider’s) perspective.
Nurses sometimes refer to their ethnographic studies as ethnonursing research.
Phenomenologists seek to discover the essence and meaning of a phenomenon as it
is experienced by people, mainly through in-depth interviews with people who
have had the relevant experience.
In descriptive phenomenology, which seeks to describe lived experiences,
researchers strive to bracket out preconceived views and to intuit the essence of
the phenomenon by remaining open to meanings attributed to it by those who have
experienced it.
Interpretive phenomenology (hermeneutics) focuses on interpreting the
meaning of experiences rather than just describing them.
Grounded theory researchers try to account for people’s actions by focusing on
the main concern that their behavior is designed to resolve. The manner in which
people resolve this main concern is the core variable. A prominent type of core
variable is called a basic social process (BSP) that explains the process of
resolving the problem.
Grounded theory uses constant comparison: Categories elicited from the data are
constantly compared with data obtained earlier.
A controversy in grounded theory concerns whether to follow the original Glaser
and Strauss (1967) procedures or to use procedures adapted by Strauss and Corbin
(2015). Glaser argued that the latter approach does not result in grounded theories
but rather in conceptual descriptions. More recently, Charmaz’s constructivist
grounded theory has emerged, emphasizing interpretive aspects in which the
grounded theory is constructed from relationships between the researcher and
participants.
Case studies are intensive investigations of a single entity or a small number of
entities, such as individuals, groups, families, or communities.
Narrative analysis focuses on story in studies in which the purpose is to
determine how individuals make sense of events in their lives.
Descriptive qualitative studies are not embedded in a disciplinary tradition. Such
studies may be referred to as qualitative studies, naturalistic inquiries, or
qualitative content analyses.
Research is sometimes conducted within an ideological perspective. Critical
theory is concerned with a critique of existing social structures; critical
researchers conduct studies in collaboration with participants in an effort to foster
self-knowledge and transformation. Critical ethnography uses the principles of
critical theory in the study of cultures.
Feminist research, like critical research, aims at being transformative, but the
focus is on how gender domination and discrimination shape women’s lives.
Participatory action research (PAR) produces knowledge through close
collaboration with groups that are vulnerable to control or oppression by a
dominant culture; in PAR, a goal is to develop processes that can motivate people
and generate community solidarity.
REFERENCES FOR CHAPTER 11
Baird, M., Domian, E., Mulcahy, E., Mabior, R., Jemutai-Tanui, G., & Filippi, M. (2015). Creating a bridge of
understanding between two worlds: Community-based collaborative-action research with Sudanese refugee
women. Public Health Nursing, 32, 388–396.
Cal, A., & Bahar, Z. (2015). Women’s barriers to prevention of lymphedema after breast surgery and home care
needs: A qualitative study. Cancer Nursing. Advance online publication.
Charmaz, K. (2014). Constructing grounded theory (2nd ed.). Thousand Oaks, CA: Sage.
**Chen, Y. C., Huang, H., Kao, C., Sun, C., Chiang, C., & Sun, F. (2015). The psychological process of breast
cancer patients receiving initial chemotherapy. Cancer Nursing. Advance online publication.
Corbin, J., & Strauss, A. (2015). Basics of qualitative research: Techniques and procedures for developing
grounded theory (4th ed.). Thousand Oaks, CA: Sage.
Glaser, B. G. (1992). Basics of grounded theory analysis: Emergence vs. forcing. Mill Valley, CA: Sociology
Press.
Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative research.
Chicago, IL: Aldine.
Graneheim, U., Jansson, L., & Lindgren, B. (2015). Hovering between heaven and hell: An observational study
focusing on the interactions between one woman with schizophrenia, dementia, and challenging behaviour and
her care providers. Issues in Mental Health Nursing, 36, 543–550.
Hansen, L., Rosenkranz, S., Vaccaro, G., & Chang, M. (2015). Patients with hepatocellular carcinoma near the end
of life: A longitudinal qualitative study of their illness experiences. Cancer Nursing, 38, E19–E27.
Huberman, A. M., & Miles, M. (1994). Data management and analysis methods. In N. K. Denzin & Y. S. Lincoln
(Eds.), Handbook of qualitative research (pp. 428–444). Thousand Oaks, CA: Sage.
Irwin, J. F. (2016). Beyond Versailles: Recovering the voices of nurses in post-World War I U.S.-European
relations. Nursing History Review, 24, 12–40.
*Johansson, A., Axelsson, M., Berndtsson, I., & Brink, E. (2015). Self-orientation following colorectal cancer
treatment—a grounded theory study. The Open Nursing Journal, 9, 25–31.
LaDonna, K., Koopman, W., Ray, S., & Venance, S. (2016). Hard to swallow: A phenomenological exploration of
the experience of caring for individuals with myotonic dystrophy and dysphagia. Journal of Neuroscience
Nursing, 48, 42–51.
Leininger, M. M. (Ed.). (1985). Qualitative research methods in nursing. New York, NY: Grune & Stratton.
Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park, CA: Sage.
López Entrambasaguas, O., Granero-Molina, J., Hernández-Padilla, J., & Fernández-Sola, C. (2015).
Understanding sociocultural factors contributing to HIV risk among Ayoreo Bolivian sex workers. Journal of
the Association of Nurses in AIDS Care, 26, 781–793.
Meyer, J., McCullough, J., & Berggren, I. (2016). A phenomenological study of living with a partner affected with
dementia. British Journal of Community Nursing, 21, 24–30.
Morse, J. M. (2004). Qualitative comparison: Appropriateness, equivalence, and fit. Qualitative Health Research,
14(10), 1323–1325.
Olsson, K., Näslund, U., Nilsson, J., & Hörnsten, Å. (2015). Patients’ decision making about undergoing
transcatheter aortic valve implantation for severe aortic stenosis. Journal of Cardiovascular Nursing. Advance
online publication.
Speechley, M., DeForge, R., Ward-Griffin, C., Marlatt, N., & Gutmanis, I. (2015). Creating an ethnodrama to
catalyze dialogue in home-based dementia care. Qualitative Health Research, 25, 1551–1559.
Strauss, A., & Corbin, J. (1990). Basics of qualitative research: Grounded theory procedures and techniques.
Thousand Oaks, CA: Sage.
Sutherland, N., Ward-Griffin, C., McWilliam, C., & Stajduhar, K. (2016). Gendered processes in hospice and
palliative home care for seniors with cancer and their family caregivers. Qualitative Health Research, 26(7),
907–920.
Taylor, B., Rush, K., & Robinson, C. (2015). Nurses’ experiences of caring for the older adult in the emergency
department: A focused ethnography. International Emergency Nursing, 23, 185–189.
Thorne, S. (2013). Interpretive description. In C. T. Beck (Ed.), Routledge international handbook of qualitative
nursing research (pp. 295–306). New York, NY: Routledge.
Tobin, C., Murphy-Lawless, J., & Beck, C. T. (2014). Childbirth in exile: Asylum seeking women’s experience of
childbirth in Ireland. Midwifery, 30, 831–838.
*A link to this open-access article is provided in the Internet Resources section on the book's website.
**This journal article is available on the book's website for this chapter.
12 Sampling and Data Collection in
Qualitative Studies
Learning Objectives
On completing this chapter, you will be able to:
Describe the logic of sampling for qualitative studies
Identify and describe several types of sampling in qualitative studies
Evaluate the appropriateness of the sampling method and sample size used in a
qualitative study
Identify and describe methods of collecting unstructured self-report data
Identify and describe methods of collecting and recording unstructured observational
data
Critique a qualitative researcher’s decisions regarding the data collection plan
Define new terms in the chapter
Key Terms
Data saturation
Diary
Field notes
Focus group interview
Key informant
Log
Maximum variation sampling
Participant observation
Photo elicitation
Photovoice
Purposive (purposeful) sampling
Semistructured interview
Snowball sampling
Theoretical sampling
Topic guide
Unstructured interview
This chapter covers two important aspects of qualitative studies—sampling (selecting
informative study participants) and data collection (gathering the right types and amount
of information to address the research question).
SAMPLING IN QUALITATIVE RESEARCH
Qualitative studies typically rely on small nonprobability samples. Qualitative
researchers are as concerned as quantitative researchers with the quality of their
samples, but they use different considerations in selecting study participants.
The Logic of Qualitative Sampling
Quantitative researchers measure attributes and identify relationships in a population;
they desire a representative sample so that findings can be generalized. The aim of most
qualitative studies is to discover meaning and to uncover multiple realities, not to
generalize to a population.
Qualitative researchers ask such sampling questions as, Who would be an
information-rich data source for my study? Whom should I talk to, or what should I
observe, to maximize my understanding of the phenomenon? A first step in qualitative
sampling is selecting settings with potential for information richness.
As the study progresses, new sampling questions emerge, such as, Whom can I talk
to or observe who would confirm, challenge, or enrich my understandings? As with the
overall design, sampling design in qualitative studies is an emergent one that capitalizes
on early information to guide subsequent action.
TIP Like quantitative researchers, qualitative researchers often identify
eligibility criteria for their studies. Although they do not specify an explicit
population to whom results could be generalized, they do establish the kinds
of people who are eligible to participate in their research.
Types of Qualitative Sampling
Qualitative researchers avoid random samples because they are not the best method of
selecting people who are knowledgeable, articulate, reflective, and willing to talk at
length with researchers. Qualitative researchers use various nonprobability sampling
designs.
Convenience and Snowball Sampling
Qualitative researchers often begin with a volunteer (convenience) sample. Volunteer
samples are often used when researchers want participants to come forward and identify
themselves. For example, if we wanted to study the experiences of people with frequent
nightmares, we might recruit them by placing a notice on a bulletin board or on the
Internet. We would be less interested in obtaining a representative sample of people
with nightmares than in recruiting a group with diverse nightmare experiences.
Sampling by convenience is efficient but is not a preferred approach. The aim in
qualitative studies is to extract the greatest possible information from a small number of
people, and a convenience sample may not provide the most information-rich sources.
However, a convenience sample may be an economical way to begin the sampling
process.
Example of a convenience sample
Wise (2015) explored pregnant adolescents’ beliefs about healthy eating and food
choices. The convenience sample of 14 adolescents was recruited from teen parenting
programs.
Qualitative researchers also use snowball sampling (or network sampling), asking
early informants to make referrals. A weakness of this approach is that the eventual
sample might be restricted to a small network of acquaintances. Also, the quality of the
referrals may be affected by whether the referring sample member trusted the researcher
and truly wanted to cooperate.
Example of a snowball sample
In a focused ethnography, Martin and colleagues (2016) studied family health
concerns from the perspective of adult tribal members residing on an American
Indian reservation. A snowball process was used to recruit tribal members.
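Readers who think algorithmically can picture snowball sampling as a breadth-first expansion through participants' referral networks. The short Python sketch below is purely illustrative—the names and referral lists are invented—but it makes the approach's main weakness concrete: recruitment can never reach anyone outside the network of the initial volunteers.

from collections import deque

# Hypothetical referral network: each recruited participant names acquaintances.
referrals = {
    "Ana": ["Ben", "Cara"],
    "Ben": ["Cara", "Dev"],
    "Cara": ["Ana"],
    "Dev": [],
    "Eve": ["Fay"],  # Eve's circle is not reachable from Ana
    "Fay": [],
}

def snowball_sample(seeds, max_size):
    """Recruit by following referrals outward from the initial volunteers."""
    sample, seen, queue = [], set(seeds), deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sample

print(snowball_sample(["Ana"], max_size=10))
# -> ['Ana', 'Ben', 'Cara', 'Dev']: Eve and Fay are never recruited,
# showing how the eventual sample can stay confined to one social network.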
Purposive Sampling
Qualitative sampling may begin with volunteer informants and may be supplemented
with new participants through snowballing. Many qualitative studies, however, evolve
to a purposive (or purposeful) sampling strategy in which researchers deliberately
choose the cases or types of cases that will best contribute to the study.
Dozens of purposive sampling strategies have been identified (Patton, 2002), only
some of which are mentioned here. Researchers do not necessarily refer to their
sampling plans with Patton’s labels; his classification shows the diverse strategies
qualitative researchers have adopted to meet the conceptual needs of their research:
Maximum variation sampling involves deliberately selecting cases with a wide range
of variation on dimensions of interest.
Extreme (deviant) case sampling provides opportunities for learning from the most
unusual and extreme informants (e.g., outstanding successes and notable failures).
Typical case sampling involves the selection of participants who illustrate or highlight
what is typical or average.
Criterion sampling involves studying cases who meet a predetermined criterion of
importance.
Maximum variation sampling is often the sampling mode of choice in qualitative
research because it is useful in illuminating the scope of a phenomenon and in
identifying important patterns that cut across variations. Other strategies can also be
used advantageously, however, depending on the nature of the research question.
Example of maximum variation sampling
Tobiano and colleagues (2016) studied patients’ perceptions of participating in
nursing care on medical wards. Maximum variation sampling was used to recruit
patients who varied in terms of age, gender, and mobility status.
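Maximum variation sampling has a simple combinatorial flavor: among candidate cases, favor the subset that differs most on the dimensions of interest. The Python sketch below is a toy illustration—the candidate pool, the dimensions, and the diversity score are all hypothetical. In practice, researchers make this judgment qualitatively and iteratively rather than by optimizing a numeric score.

from itertools import combinations

# Hypothetical candidate pool, described on three dimensions of interest.
pool = [
    {"id": 1, "age_group": "young",  "gender": "F", "mobility": "independent"},
    {"id": 2, "age_group": "young",  "gender": "F", "mobility": "assisted"},
    {"id": 3, "age_group": "older",  "gender": "M", "mobility": "independent"},
    {"id": 4, "age_group": "older",  "gender": "F", "mobility": "assisted"},
    {"id": 5, "age_group": "middle", "gender": "M", "mobility": "assisted"},
]

DIMENSIONS = ("age_group", "gender", "mobility")

def variation(subset):
    """Toy diversity score: count pairwise attribute differences."""
    return sum(
        sum(a[d] != b[d] for d in DIMENSIONS)
        for a, b in combinations(subset, 2)
    )

# Choose the 3-person subset with the widest variation on the key traits.
best = max(combinations(pool, 3), key=variation)
print([p["id"] for p in best])  # -> [1, 3, 5]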
Sampling confirming and disconfirming cases is another purposive strategy used
toward the end of data collection. As researchers analyze their data, emerging
conceptualizations sometimes need to be checked. Confirming cases are additional
cases that fit researchers’ conceptualizations and strengthen credibility. Disconfirming
cases are new cases that do not fit and serve to challenge researchers’ interpretations.
These “negative” cases may offer insights about how the original conceptualization
needs to be revised.
TIP Some qualitative researchers call their sample purposive simply
because they “purposely” selected people who experienced the phenomenon
of interest. Exposure to the phenomenon is, however, an eligibility criterion.
If the researcher then recruits any person with the desired experience, the
sample is selected by convenience, not purposively. Purposive sampling
implies an intent to choose particular exemplars or types of people who can
best enhance the researcher’s understanding of the phenomenon.
Theoretical Sampling
Theoretical sampling is a method used in grounded theory studies. Theoretical
sampling involves decisions about where to find data to develop an emerging theory
optimally. The basic question in theoretical sampling is, What types of people should the
researcher turn to next to further the development of the emerging conceptualization?
Participants are selected as they are needed for their theoretical relevance in developing
emerging categories.
Example of theoretical sampling
Slatyer and colleagues (2015) used theoretical sampling in their grounded theory
study of hospital nurses’ perspective on caring for patients in severe pain. Early
interviews and observations in a renal/hepatology unit provided data on caring for
patients who had problems tolerating analgesic medications. The emerging category,
labeled “medication ineffectiveness,” guided the researchers to observe in an
orthopedic ward where nurses cared for older patients who continued to experience
severe pain for months after hip surgery. This theoretical sampling led the researchers
to notice differences in nurses’ responses to patients with acute and chronic pain
conditions. In turn, this prompted the researchers to sample in the eye/ear/plastic
surgery ward where patients were treated for long-term pain.
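The contingent, serial logic of the Slatyer example—letting what the analysis reveals decide where to recruit next—can be caricatured in a few lines of Python. Everything here (the settings, the categories, and the rule for picking the next setting) is hypothetical and greatly simplified; real theoretical sampling rests on analytic judgment, not a lookup table.

# Hypothetical pools of potential informants, grouped by clinical setting,
# and an invented mapping of which emerging category each setting can
# illuminate. In a real study, this knowledge emerges from ongoing coding.
pools = {
    "renal unit":      ["R1", "R2", "R3"],
    "orthopedic ward": ["O1", "O2"],
    "surgical ward":   ["S1", "S2"],
}
illuminates = {
    "renal unit":      "medication ineffectiveness",
    "orthopedic ward": "responses to chronic pain",
    "surgical ward":   "long-term pain treatment",
}

sample, developed = [], set()
setting = "renal unit"  # begin where access is convenient
while pools[setting]:
    sample.append(pools[setting].pop(0))  # recruit for theoretical relevance
    developed.add(illuminates[setting])   # analysis develops a category
    # Let the emerging conceptualization direct the next sampling decision:
    undeveloped = [s for s in pools if illuminates[s] not in developed and pools[s]]
    if not undeveloped:
        break
    setting = undeveloped[0]

print(sample)  # -> ['R1', 'O1', 'S1']: one informant per theoretically relevant setting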
Sample Size in Qualitative Research
Sample size in qualitative research is usually based on informational needs. Data
saturation involves sampling until no new information is obtained and redundancy is
achieved. The number of participants needed to reach saturation depends on various
factors. For example, the broader the scope of the research question, the more
participants will likely be needed. Data quality can affect sample size: If participants are
insightful and can communicate effectively, saturation can be achieved with a relatively
small sample. Also, a larger sample is likely to be needed with maximum variation
sampling than with typical case sampling.
Example of saturation
Van Rompaey and colleagues (2016) studied patients’ perceptions of delirium in
a Belgian intensive care unit (ICU). Adult patients in the ICU were interviewed at
least 48 hours after the last positive score for delirium. Data collection continued until
“data saturation was achieved after interviewing 30 patients” (p. 68).
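One way to picture data saturation is to track how many new codes each successive interview contributes. The minimal Python sketch below uses invented codes. In an actual study, saturation is a holistic judgment about informational redundancy—often requiring several consecutive interviews with no new insights—not a simple count.

# Hypothetical codes assigned to five successive interview transcripts.
interviews = [
    {"fear", "fatigue", "family support"},
    {"fear", "body image", "fatigue"},
    {"family support", "work disruption"},
    {"fear", "fatigue"},
    {"body image", "work disruption"},
]

seen = set()
for i, codes in enumerate(interviews, start=1):
    new = codes - seen
    seen |= codes
    print(f"Interview {i}: {len(new)} new code(s)")
    if not new:
        print("No new codes; redundancy reached, so saturation is plausible.")
        break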
TIP Sample size adequacy in a qualitative study is difficult to evaluate
because the main criterion is information redundancy, which consumers
cannot judge. Some reports explicitly mention that saturation was achieved.
Sampling in the Three Main Qualitative Traditions
There are similarities among the main qualitative traditions with regard to sampling:
Samples are small, nonrandom methods are used, and final sampling decisions take
place during data collection. However, there are differences as well.
Sampling in Ethnography
Ethnographers often begin with a “big net” approach—they mingle and converse with
many members of the culture. However, they usually rely heavily on a smaller number
of key informants, who are knowledgeable about the culture and serve as the
researcher’s main link to the “inside.” Ethnographers may use an initial framework to
develop a pool of potential key informants. For example, an ethnographer might decide
to recruit different types of key informants based on their roles (e.g., nurses, advocates).
Once potential key informants are identified, key considerations for final selection are
their level of knowledge about the culture and willingness to collaborate with the
ethnographer in revealing and interpreting the culture.
Sampling in ethnography typically involves sampling things as well as people. For
example, ethnographers make decisions about observing events and activities, about
examining records and artifacts, and about exploring places that provide clues about the
culture. Key informants often help ethnographers decide what to sample.
Example of an ethnographic sample
In their ethnographic study, Michel and colleagues (2015) studied the meanings
assigned to health care by nurses and long-lived elders in a health care setting in
Brazil. The data collection, which involved observations and interviews, relied on the
assistance of 20 key informants: 10 nursing professionals and 10 elders.
Sampling in Phenomenological Studies
Phenomenologists tend to rely on very small samples of participants—often 10 or
fewer. Two principles guide the selection of a sample for a phenomenological study: (1)
All participants must have experienced the phenomenon and (2) they must be able to
articulate what it is like to have lived that experience. Phenomenological researchers
often want to explore diversity of individual experiences, and so, they may specifically
look for people with demographic or other differences who have shared a common
experience.
Example of a sample in a phenomenological study
Pedersen and colleagues (2016) studied the meaning of weight changes among
women treated for breast cancer. A purposive sample of 12 women being treated for
breast cancer at a Danish university hospital was recruited. “Variations were sought
regarding age, initial cancer treatment, type of surgery and change in weight and
waist” (p. 18).
Interpretive phenomenologists may, in addition to sampling people, sample artistic
or literary sources. Experiential descriptions of a phenomenon may be selected from
literature, such as poetry, novels, or autobiographies. These sources can help increase
phenomenologists’ insights into the phenomena under study.
Sampling in Grounded Theory Studies
Grounded theory research is typically done with samples of about 20 to 30 people, using
theoretical sampling. The goal in a grounded theory study is to select informants who
can best contribute to the evolving theory. Sampling, data collection, data analysis, and
theory construction occur concurrently, and so, study participants are selected serially
and contingently (i.e., contingent on the emerging conceptualization). Sampling might
evolve as follows:
1. The researcher begins with a general notion of where and with whom to start. The
first few cases may be solicited by convenience.
2. Maximum variation sampling might be used next to gain insights into the range and
complexity of the phenomenon.
3. The sample is continually adjusted: Emerging conceptualizations inform the
theoretical sampling process.
4. Sampling continues until saturation is achieved.
5. Final sampling may include a search for confirming and disconfirming cases to test,
refine, and strengthen the theory.
Critiquing Qualitative Sampling Plans
Qualitative sampling plans can be evaluated in terms of their adequacy and
appropriateness (Morse, 1991). Adequacy refers to the sufficiency and quality of the
data the sample yielded. An adequate sample provides data without “thin” spots. When
researchers have truly obtained saturation, informational adequacy has been achieved,
and the resulting description or theory is richly textured and complete.
Appropriateness concerns the methods used to select a sample. An appropriate
sample results from the selection of participants who can best supply information that
meets the study’s conceptual requirements. The sampling strategy must yield a full
understanding of the phenomenon of interest. A sampling approach that excludes
negative cases or that fails to include people with unusual experiences may not fully
address the study’s information needs.
Another important issue concerns the potential for transferability of the findings.
The transferability of study findings is a function of the similarity between the study
sample and other people to whom the findings might be applied. Thus, in critiquing a
report, you should assess whether the researcher provided an adequately thick
description of the sample and the study context so that someone interested in
transferring the findings could make an informed decision. Further guidance in
critiquing qualitative sampling decisions is presented in Box 12.1.
Box 12.1 Guidelines for Critiquing Qualitative Sampling Plans
1. Was the setting appropriate for addressing the research question, and was it
adequately described?
2. What type of sampling strategy was used?
3. Were the eligibility criteria for the study specified? How were participants
recruited into the study?
4. Given the information needs of the study—and, if applicable, its qualitative
tradition—was the sampling approach effective?
5. Was the sample size adequate and appropriate? Did the researcher indicate that
saturation had been achieved? Do the findings suggest a richly textured and
comprehensive set of data without any apparent “holes” or thin areas?
6. Were key characteristics of the sample described (e.g., age, gender)? Was a rich
description of participants and context provided, allowing for an assessment of the
transferability of the findings?
TIP The issue of transferability within the context of broader models of
generalizability is discussed in the Supplement to this chapter on the book’s
website.
DATA COLLECTION IN QUALITATIVE STUDIES
In-depth interviews are the most common method of collecting qualitative data.
Observation is used in some qualitative studies as well. Physiologic data are rarely
collected in a constructivist inquiry. Table 12.1 compares the types of data and aspects
of data collection used by researchers in the three main qualitative traditions.
Ethnographers typically collect a wide array of data, with observation and interviews
being the primary methods. Ethnographers also gather or examine products of the
culture under study, such as documents, records, artifacts, photographs, and so on.
Phenomenologists and grounded theory researchers rely primarily on in-depth
interviews, although observation also plays a role in grounded theory studies.
Qualitative Self-Report Techniques
Qualitative researchers do not have a set of questions that must be asked in a specific
order and worded in a given way. Instead, they start with general questions and allow
respondents to tell their narratives in a naturalistic fashion. Qualitative interviews tend
to be conversational. Interviewers encourage respondents to define the important
dimensions of a phenomenon and to elaborate on what is relevant to them.
Types of Qualitative Self-Reports
Researchers use completely unstructured interviews when they have no preconceived
view of the information to be gathered. Researchers begin by asking a grand tour
question such as “What happened when you first learned that you had AIDS?”
Subsequent questions are guided by initial responses. Ethnographic and
phenomenologic studies often rely on unstructured interviews.
Semistructured (or focused) interviews are used when researchers have a list of
topics or broad questions that must be covered in an interview. Interviewers use a
written topic guide to ensure that all question areas are addressed. The interviewer’s
function is to encourage participants to talk freely about all the topics on the guide.
Example of a semistructured interview
Duck and colleagues (2015) studied the perceptions and experiences of patients with
idiopathic pulmonary fibrosis. Semistructured interviews lasting about an hour were
conducted with 17 patients. The topic guide covered topics suggested in the literature
and from a patient/carer support group. The researcher posed broad, open-ended
questions that “gave participants the opportunity to tell their story” (p. 1057).
Focus group interviews involve groups of about 5 to 10 people whose opinions and
experiences are solicited simultaneously. The interviewer (or moderator) guides the
discussion using a topic guide. A group format is efficient and can generate a lot of
dialogue, but not everyone is comfortable sharing their views or experiences in front of
a group.
Example of focus group interviews
Neville and colleagues (2015) explored perceptions of staff working in residential
care homes toward older lesbian, gay, and bisexual people. A total of 47 care workers
from seven residential care facilities participated in seven focus groups. The topic
guide included two vignettes highlighting the stories of two hypothetical gay/lesbian
older people.
Personal diaries are a standard data source in historical research. It is also possible
to generate new data for a study by asking participants to maintain a diary over a
specified period. Diaries can be useful in providing an intimate description of a person’s
everyday life. The diaries may be completely unstructured; for example, individuals
who had an organ transplantation could be asked to spend 15 minutes a day jotting
down their thoughts. Frequently, however, people are asked to make diary entries
regarding some specific aspect of their lives.
Example of diaries
Curtis and colleagues (2014) explored responses to stress among Irish women with
breast cancer. Thirty women with newly diagnosed breast cancer maintained diaries
during their participation in a clinical trial. They were asked to write regularly about
their experiences and feelings. A facilitator reminded them weekly about the diaries
over a 5-week period but gave no further instructions.
Photo elicitation involves an interview guided by photographic images. This
procedure, most often used in ethnographies and participatory action research, can help
to promote a collaborative discussion. The photographs sometimes are ones that
researchers have made of the participants’ world, but photo elicitation can also be used
with photos in participants’ homes. Researchers have also used the technique of asking
participants to take photographs themselves and then interpret them, a method
sometimes called photovoice.
Example of a photovoice study
Evans-Agnew (2016) used photovoice to explore disparities in asthma management
with African American youth. Adolescents participated in a three-session photovoice
project; their phototexts were analyzed and compared to youth-related asthma
policies in the state of Washington.
Gathering Qualitative Self-Report Data
Researchers gather narrative self-report data to develop a construction of a phenomenon
that is consistent with that of participants. This goal requires researchers to overcome
communication barriers and to enhance the flow of information. Although qualitative
interviews are conversational, the conversations are purposeful ones that require
preparation. For example, the wording of questions should reflect the participants’
worldview and language. In addition to being good questioners, researchers must be
good listeners. Only by attending carefully to what respondents are saying can in-depth
interviewers develop useful follow-up questions.
Unstructured interviews are typically long, sometimes lasting an hour or more, and
so an important issue is how to record such abundant information. Some researchers
take notes during the interview, but this is risky in terms of data accuracy. Most
researchers record the interviews for later transcription. Although some respondents are
self-conscious when their conversation is recorded, they typically forget about the
presence of recording equipment after a few minutes.
TIP Although qualitative self-report data are often gathered in face-to-face
interviews, they can also be collected in writing. Internet “interviews” are
increasingly common.
Evaluation of Qualitative Self-Report Methods
In-depth interviews are a flexible approach to gathering data and, in many research
contexts, offer distinct advantages. In clinical situations, for example, it is often
appropriate to let people talk freely about their problems and concerns, allowing them to
take the initiative in directing the flow of conversation. Unstructured self-reports may
allow investigators to ascertain what the basic issues or problems are, how sensitive or
controversial the topic is, how individuals conceptualize and talk about the problems,
and what range of opinions or behaviors exist relevant to the topic. In-depth interviews
may also help elucidate the underlying meaning of a relationship repeatedly observed in
more structured research. On the other hand, qualitative methods are very time-
consuming and demanding of researchers’ skills in gathering, analyzing, and
interpreting the resulting data.
Qualitative Observational Methods
Qualitative researchers sometimes collect loosely structured observational data, often as
a supplement to self-report data. The aim of qualitative observation is to understand the
behaviors and experiences of people as they occur in naturalistic settings. Skillful
observation permits researchers to see the world as participants see it, to develop a rich
understanding of the focal phenomena, and to grasp subtleties of cultural variation.
Unstructured observational data are often gathered through participant
observation. Participant observers take part in the functioning of the group under study
and strive to observe, ask questions, and record information within the contexts and
structures that are relevant to group members. Participant observation is characterized
by prolonged periods of social interaction between researchers and participants. By
assuming a participating role, observers often have insights that would have eluded
more passive or concealed observers.
TIP Not all qualitative observational research is participant observation
(i.e., with observations occurring from within the group). Some unstructured
observations involve watching and recording behaviors without the
observers’ active participation in activities. Be on the alert for the misuse of
the term “participant observation.” Some researchers use the term
inappropriately to refer to all unstructured observations conducted in the
field.
The Observer-Participant Role in Participant Observation
In participant observation, the role that observers play in the group is important because
their social position determines what they are likely to see. The extent of the observers’
actual participation in a group is best thought of as a continuum. At one extreme is
complete immersion in the setting, with researchers assuming full participant status; at
the other extreme is complete separation, with researchers as onlookers. Researchers
may in some cases assume a fixed position on this continuum throughout the study, but
often, a researcher’s role evolves toward increasing participation over the course of the
fieldwork.
Observers must overcome two major hurdles in assuming a satisfactory role vis-à-
vis participants. The first is to gain entrée into the social group under study; the second
is to establish rapport and trust within that group. Without gaining entrée, the study
cannot proceed; but without the trust of the group, the researcher will be restricted to
“front stage” knowledge—information distorted by the group’s protective facades. The
goal of participant observers is to “get backstage”—to learn about the true realities of
the group’s experiences. On the other hand, being a fully participating member does not
necessarily offer the best perspective for studying a phenomenon—just as being an
actor in a play does not offer the most advantageous view of the performance.
Example of participant-observer roles
Michaelsen (2012) studied nurses’ relationships with patients they regarded as being
difficult. Data were collected by means of participant observation and in-depth
interviews over an 18-month period. Michaelsen conducted 18 observation sessions,
lasting between 3 and 4 hours, of the nurses interacting with patients during home
visits. She kept “a balance between being an ‘insider’ and an ‘outsider,’ between
participation and observation” (p. 92).
Gathering Participant Observation Data
Participant observers typically place few restrictions on the nature of the data collected,
but they often have a broad plan for types of information desired. Among the aspects of
an observed activity likely to be considered relevant are the following:
1. The physical setting—“Where” questions. What are the main features of the setting?
2. The participants—“Who” questions. Who is present, and what are their
characteristics?
3. Activities—“What” questions. What is going on? What are participants doing?
4. Frequency and duration—“When” questions. When did the activity begin and end?
Is the activity a recurring one?
5. Process—“How” questions. How is the activity organized? How does it unfold?
6. Outcomes—“Why” questions. Why is the activity happening? What did not happen
(especially if it ought to have happened) and why?
Participant observers must decide how to sample events and select observational
locations. They often use a combination of positioning approaches—staying in a single
location to observe activities in that location (single positioning), moving around to
observe behaviors from different locations (multiple positioning), or following a person
around (mobile positioning).
Direct observation is usually supplemented with information from interviews. For
example, key informants may be asked to describe what went on in a meeting the
observer was unable to attend or to describe an event that occurred before the study
began. In such cases, the informant functions as the observer’s observer.
Recording Observations
The most common forms of record keeping for participant observation are logs and field
notes, but photographs and videorecordings may also be used. A log (or field diary) is a
daily record of events and conversations. Field notes are broader and more interpretive.
Field notes represent the observer’s efforts to record information and to synthesize and
understand the data.
Field notes serve multiple purposes. Descriptive notes are objective descriptions of
events and conversations that were observed. Reflective notes document researchers’
personal experiences, reflections, and progress in the field. For example, some notes
document the observers’ interpretive efforts; others are reminders about how subsequent
observations should be made. Observers often record personal notes, which are
comments about their own feelings during the research process.
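The distinctions among descriptive, reflective, and personal notes can be made concrete with a small data structure. The Python sketch below is illustrative only; the fields and the sample entries are hypothetical, not a prescribed format for field notes.

from dataclasses import dataclass
from datetime import datetime

@dataclass
class FieldNote:
    """One entry in an observer's record (fields are illustrative only)."""
    timestamp: datetime
    kind: str      # "descriptive", "reflective", or "personal"
    setting: str
    text: str

notes = [
    FieldNote(datetime(2017, 3, 2, 9, 15), "descriptive", "day ward",
              "Two nurses reviewed the medication chart with the patient."),
    FieldNote(datetime(2017, 3, 2, 9, 40), "reflective", "day ward",
              "Interaction echoes Tuesday's pattern; observe the shift handoff next."),
    FieldNote(datetime(2017, 3, 2, 10, 5), "personal", "day ward",
              "Felt conspicuous standing near the nurses' station."),
]

# Later, pull out only the interpretive (reflective) entries for analysis.
reflective = [n.text for n in notes if n.kind == "reflective"]
print(reflective)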
The success of participant observation depends on the quality of the logs and field
notes. It is essential to record observations as quickly as possible, but participant
observers cannot usually record information by openly carrying a clipboard or a
recording device because this would undermine their role as ordinary participants.
Observers must develop skills in making detailed mental notes that can later be written
or recorded.
Evaluation of Unstructured Observational Methods
Qualitative observational methods, and especially participant observation, can provide a
deeper understanding of human behaviors and social situations than is possible with
structured methods. Participant observation is valuable for its ability to “get inside” a
situation and illuminate its complexities. Participant observation is a good method for
answering questions about phenomena that are difficult for insiders themselves to
explain because these phenomena are taken for granted.
Like all research methods, however, participant observation faces potential
problems. Observers may lose objectivity in sampling, viewing, and recording
observations. Once they begin to participate in a group’s activities, the possibility of
emotional involvement becomes a concern. Researchers in their member role may fail
to attend to key aspects of the situation or may develop a myopic view on issues of
importance to the group. Finally, the success of participant observation depends on the
observer’s observational and interpersonal skills—skills that may be difficult to
cultivate.
Critiquing Unstructured Data Collection
It is often difficult to critique the decisions that researchers made in collecting
qualitative data because details about those decisions are seldom spelled out. In
particular, there is often scant information about participant observation. It is not
uncommon for a report to simply say that the researcher undertook participant
observation, without descriptions of how much time was spent in the field, what exactly
was observed, how observations were recorded, and what level of participation was
involved. Thus, one aspect of a critique is likely to involve an appraisal of how much
information the article provided about the data collection methods. Even though space
constraints in journals make it impossible for researchers to fully elaborate their
methods, researchers have a responsibility to communicate basic information about their
approach so that readers can assess the quality of evidence that the study yields.
Researchers should provide examples of questions asked and types of observations
made.
Triangulation of methods provides important opportunities for qualitative
researchers to enhance the integrity of their data. Thus, an important issue to consider in
evaluating unstructured data is whether the types and amount of data collected are
sufficiently rich to support an in-depth, holistic understanding of the phenomena under
study. Box 12.2 provides guidelines for critiquing the collection of unstructured data.
Box 12.2 Guidelines for Critiquing Data Collection Methods in Qualitative
Studies
1. Given the research question and the characteristics of study participants, did the
researcher use the best method of capturing study phenomena (i.e., self-reports,
observation)? Should supplementary methods have been used to enrich the data
available for analysis?
2. If self-report methods were used, did the researcher make good decisions about
the specific method used to solicit information (e.g., unstructured interviews,
focus group interviews, etc.)?
3. If a topic guide was used, did the report present examples of specific questions?
Were the questions appropriate and comprehensive? Did the wording encourage
rich responses?
4. Were interviews recorded and transcribed? If interviews were not recorded, what
steps were taken to ensure data accuracy?
5. If observational methods were used, did the report adequately describe what the
observations entailed? What did the researcher actually observe, in what types of
setting did the observations occur, and how often and over how long a period
were observations made?
6. What role did the researcher assume in terms of being an observer and a
participant? Was this role appropriate?
7. How were observational data recorded? Did the recording method maximize data
quality?
In this section, we describe the sampling plans and data collection strategies
used in a qualitative nursing study. Read the summary and then answer the
critical thinking questions that follow, referring to the full research report if
necessary. Example 1 is featured on the interactive Critical Thinking Activity
on the book's website. The critical thinking questions for Example 2 are based
on the study that appears in its entirety in Appendix B of this book. Our
comments for these exercises are in the Student Resources section on the book's
website.
EXAMPLE 1: SAMPLING AND DATA COLLECTION IN A
QUALITATIVE STUDY
Study: Canadian adolescents’ perspectives of cancer risk: A qualitative study
(Woodgate et al., 2015)
Statement of Purpose: The purpose of this study was to understand
Canadian adolescents’ perspectives of cancer and cancer prevention,
including how they conceptualize and understand cancer risk.
Design: The researchers described their approach as ethnographic:
“Exploring the shared understanding and perceptions of adolescents toward
cancer and cancer risk lent itself to an ethnographic design using multiple
data collection methods” (p. 686). Data were collected over a 3-year period.
Sampling Strategy: A purposive sample of 75 adolescents was recruited
from four schools in a western Canadian province, with efforts to “maximize
variation in demographic (e.g., age, gender, SES, urban/rural residency) and
cancer experiences” (p. 686). Recruitment and analysis occurred
concurrently, and recruitment ended when saturation was achieved. The study
took place over a 3-year period. The sample included both males (27%) and
females (73%), ranging in age from 11 to 19 years; 56% lived in an urban
area, and about 30% had a family member with a history of cancer. The
majority were described as “middle income” (72%) and of European descent
(63%).
Data Collection: Data collection took place in the schools the youth
attended. Two face-to-face interviews were planned for each adolescent, with
the second one scheduled 4 to 5 weeks after the first. The second interview
was intended to ensure “thick description” and to provide an opportunity for
follow-up questions that helped to clarify issues identified in the initial
interview. Each interview, lasting between 60 and 90 minutes, was digitally
recorded and transcribed. For the first interview, the topic guide included
general questions about cancer risk and prevention (e.g., “How do people get
cancer?”). Photovoice methods were also introduced. The participants were
given cameras and were asked to take pictures over a 4-week period of what
they felt depicted cancer, cancer risks, and cancer prevention. Then, in the
second interview, the adolescents were asked to describe what the photos
meant to them. They were guided by such questions as “How does this
[picture] relate to cancer?” (p. 687). Finally, four focus group interviews were
conducted with adolescents who were previously interviewed “to complement
existing findings and gather new group-based knowledge on cancer risks” (p.
687). Field notes were maintained to describe verbal and nonverbal behaviors
of participants after both individual and focus group interviews.
Key Findings: The adolescents conceptualized cancer risk in terms of
specific risk factors; lifestyle factors (e.g., smoking) were prominent. They
rationalized risky health behaviors using a variety of cognitive strategies that
helped to make cancer risks more acceptable to them. However, they did
believe that it was possible for individuals to delay getting cancer by making
the right choices.
Critical Thinking Exercises
1. Answer the relevant questions from Boxes 12.1 and 12.2 regarding this
study.
2. Also consider the following targeted questions:
a. Comment on the variation the researcher achieved in type of study
participants.
b. Comment on the researchers’ overall data collection plan in terms of the
amount of information gathered.
3. If the results of this study are valid and trustworthy, what might be some
of the uses to which the findings could be put in clinical practice?
EXAMPLE 2: SAMPLING AND DATA COLLECTION IN THE
STUDY IN APPENDIX B
• Read the method section of Beck and Watson’s (2010) study (“Subsequent
childbirth after a previous traumatic birth”) in Appendix B of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Boxes 12.1 and 12.2 regarding this
study.
2. Also consider the following targeted questions, which may further sharpen
your critical thinking skills and assist you in assessing aspects of the
study’s merit:
a. Comment on the characteristics of the participants, given the purpose of
the study.
b. Do you think that Beck and Watson should have limited their sample to
women from one country only? Provide a rationale for your answer.
c. Could any of the concepts in this study have been captured by
observation? Should they have been?
d. Did Beck and Watson’s study involve a “grand tour” question?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the book's website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Transferability and Generalizability
• Answer to the Critical Thinking Exercises for Example 2
• Internet Resources with useful websites for Chapter 12
• A Wolters Kluwer journal article in its entirety—the Curtis et al. study
described on p. 205.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
Qualitative researchers typically select articulate and reflective informants with
certain types of experience in an emergent way, capitalizing on early learning to
guide subsequent sampling decisions.
Qualitative researchers may start with convenience or snowball sampling but
usually rely eventually on purposive sampling to guide them in selecting data
sources that maximize information richness.
One purposive strategy is maximum variation sampling, which entails purposely
selecting diverse cases on key traits. Another important strategy is sampling
confirming and disconfirming cases—i.e., selecting cases that enrich and
challenge the researchers’ conceptualizations.
Samples in qualitative studies are typically small and based on information needs.
A guiding principle is data saturation, which involves sampling to the point at
which no new information is obtained and redundancy is achieved.
Ethnographers make numerous sampling decisions, including not only whom to
sample but what to sample (e.g., activities, events, documents, artifacts); decision
making is often aided by their key informants who serve as guides and
interpreters of the culture.
Phenomenologists typically work with a small sample of people (often 10 or
fewer) who meet the criterion of having lived the experience under study.
Grounded theory researchers typically use theoretical sampling in which
sampling decisions are guided in an ongoing fashion by the emerging theory.
Samples of about 20 to 30 people are typical.
In-depth interviews are the most widely used method of collecting data for
qualitative studies. Self-reports in qualitative studies include completely
unstructured interviews, which are conversational discussions on the topic of
interest; semistructured (or focused) interviews, using a broad topic guide; focus
group interviews, which involve discussions with small groups; diaries, in which
respondents are asked to maintain daily records about some aspects of their lives;
and photo elicitation interviews, which are guided and stimulated by
photographic images, sometimes using photos that participants themselves take
(photovoice).
In qualitative research, self-reports are often supplemented by direct observation in
naturalistic settings. One type of unstructured observation is participant
observation, in which the researcher gains entrée into a social group and
participates to varying degrees in its functioning while making in-depth
observations of activities and events. Logs of daily events and field notes of
experiences and interpretations are the major methods of recording observational data.
REFERENCES FOR CHAPTER 12
**Curtis, R., Groarke, A., McSharry, J., & Kerin, M. (2014). Experience of breast cancer: Burden, benefit, or both?
Cancer Nursing, 37, E21–E30.
*Duck, A., Spencer, L., Bailey, S., Leonard, C., Ormes, J., & Caress, A. (2015). Perceptions, experiences and
needs of patients with idiopathic pulmonary fibrosis. Journal of Advanced Nursing, 71, 1055–1065.
Evans-Agnew, R. (2016). Asthma management disparities: A photovoice investigation with African American
youth. The Journal of School Nursing, 32, 99–111.
Martin, D., Yurkovich, E., & Anderson, K. (2016). American Indians’ family health concern on a Northern Plains
reservation: “Diabetes runs rampant here.” Public Health Nursing, 33, 73–81.
Michaelsen, J. J. (2012). Emotional distance to so-called difficult patients. Scandinavian Journal of Caring
Sciences, 26, 90–97.
*Michel, T., Lenardt, M., Willig, M., & Alvarez, A. (2015). From real to ideal—the health (un)care of long-lived
elders. Revista Brasileira de Enfermagem, 68, 343–349.
Morse, J. M. (1991). Strategies for sampling. In J. M. Morse (Ed.), Qualitative nursing research: A contemporary
dialogue (Rev. ed., pp. 127–145). Newbury Park, CA: Sage.
*Neville, S., Adams, J., Bellamy, G., Boyd, M., & George, N. (2015). Perceptions towards lesbian, gay and
bisexual people in residential care facilities: A qualitative study. International Journal of Older People Nursing,
10, 73–81.
Patton, M. Q. (2002). Qualitative research & evaluation methods (3rd ed.). Thousand Oaks, CA: Sage.
Pedersen, B., Groenkjaer, M., Falkmer, U., Mark, E., & Delmar, C. (2016). “The ambiguous transforming body”—
a phenomenological study of the meaning of weight changes among women treated for breast cancer.
International Journal of Nursing Studies, 55, 15–25.
Slatyer, S., Williams, A. M., & Michael, R. (2015). Seeking empowerment to comfort patients in severe pain: A
grounded theory study of the nurse’s perspective. International Journal of Nursing Studies, 52, 229–239.
Tobiano, G., Bucknall, T., Marshall, A., Guinane, J., & Chaboyer, W. (2016). Patients’ perceptions of participation
in nursing care on medical wards. Scandinavian Journal of Caring Sciences, 30(2), 260–270.
Van Rompaey, B., Van Hoof, A., van Bogaert, P., Timmermans, O., & Dilles, T. (2016). The patient’s perception
of a delirium: A qualitative research in a Belgian intensive care unit. Intensive and Critical Care Nursing, 32,
66–74.
Wise, N. J. (2015). Pregnant adolescents, beliefs about healthy eating, factors that influence food choices, and
nutrition education preferences. Journal of Midwifery & Women’s Health, 60, 410–418.
*Woodgate, R. L., Safipour, J., & Tailor, K. (2015). Canadian adolescents’ perspectives of cancer risk: A
qualitative study. Health Promotion International, 30, 684–694.
*A link to this open-access article is provided in the Internet Resources section on website.
**This journal article is available on for this chapter.
307
13 Mixed Methods and Other Special
Types of Research
Learning Objectives
On completing this chapter, you will be able to:
Identify advantages of mixed methods research and describe specific applications
Describe strategies and designs for conducting mixed methods research
Identify the purposes and some of the distinguishing features of specific types of
research (e.g., clinical trials, evaluations, outcomes research, surveys)
Define new terms in the chapter
Key Terms
Clinical trial
Concurrent design
Convergent design
Delphi survey
Economic (cost) analysis
Evaluation research
Explanatory design
Exploratory design
Health services research
Intervention research
Intervention theory
Methodologic study
Mixed methods research
Nursing sensitive outcome
Outcomes research
Pragmatism
Process analysis
Quality improvement (QI)
Secondary analysis
Sequential design
Surveys
In this final chapter on research designs, we explain several special types of research.
We begin by discussing mixed methods research that combines quantitative and
qualitative approaches.
MIXED METHODS RESEARCH
A growing trend in nursing research is the planned collection and integration of
quantitative and qualitative data within a single study or coordinated clusters of studies.
This section discusses the rationale for such mixed methods research and presents a
few applications.
Rationale for Mixed Methods Research
The dichotomy between quantitative and qualitative data represents a key methodologic
distinction. Some argue that the paradigms that underpin quantitative and qualitative
research are incompatible. Most people, however, now believe that many areas of
inquiry can be enriched by triangulating quantitative and qualitative data. The
advantages of a mixed methods (MM) design include the following:
Complementarity. Quantitative and qualitative approaches are complementary. By
using mixed methods, researchers can possibly avoid the limitations of a single
approach.
Practicality. Given the complexity of phenomena, it is practical to use whatever
methodological tools are best suited to addressing pressing research questions.
Enhanced validity. When a hypothesis or model is supported by multiple and
complementary types of data, researchers can be more confident about their
inferences.
Perhaps the strongest argument for MM research, however, is that some questions
require MM. Pragmatism, a paradigm often associated with MM research, provides a
basis for a position that has been stated as the “dictatorship of the research question”
(Tashakkori & Teddlie, 2003, p. 21). Pragmatist researchers consider that it is the
research question that should drive the design of the inquiry. They reject a forced choice
between the traditional postpositivist and constructivist approaches to research.
Purposes and Applications of Mixed Methods Research
In MM research, there is typically an overarching goal, but there are inevitably at least
two research questions, each of which requires a different type of approach. For
example, MM researchers may simultaneously ask exploratory (qualitative) questions
and confirmatory (quantitative) questions. In an MM study, researchers can examine
causal effects in a quantitative component but can shed light on causal mechanisms in a
qualitative component.
Creswell and Plano Clark (2011) identified six types of research situations that are
especially well suited to MM research:
1. The concepts are new and poorly understood, and there is a need for qualitative
exploration before more formal, structured methods can be used.
2. Neither a qualitative nor a quantitative approach, by itself, is adequate in addressing
the complexity of the research problem.
3. The findings from one approach can be greatly enhanced with a second source of
data.
4. The quantitative results are puzzling and difficult to interpret, and qualitative data
can help to explain the results.
5. A particular theoretical perspective might require both quantitative and qualitative
data.
6. A multiphase project is needed to attain key objectives, such as the development and
assessment of an intervention.
As this list suggests, MM research can be used in various situations. Some of the
major applications include the following:
Instrument development. Nurse researchers sometimes gather qualitative data as the
basis for developing formal instruments—that is, for generating and wording the
questions on quantitative scales that are subsequently subjected to rigorous testing.
Intervention development. Qualitative research is also playing an important role in the
development of promising nursing interventions that are then rigorously tested for
efficacy.
Hypothesis generation. In-depth qualitative studies are often fertile with insights about
constructs or relationships among them. These insights then can be tested and
confirmed with larger samples in quantitative studies.
Theory building and testing. A theory gains acceptance as it escapes disconfirmation,
and the use of multiple methods provides opportunity for potential disconfirmation of
a theory. If the theory can survive these assaults, it can provide a stronger context for
the organization of clinical and intellectual work.
Explication. Qualitative data are sometimes used to explicate the meaning of
quantitative descriptions or relationships. Quantitative methods can demonstrate that
variables are systematically related but may fail to explain why they are related.
Example of explicating with qualitative data
Edinburgh and coresearchers (2015) undertook an MM study of the abuse
experiences of 62 sexually exploited runaway adolescents seen at a Child Advocacy
Center. Quantitative data came from physical exams and responses to psychological
scales. Qualitative data from forensic interviews were analyzed to explore the
experience of sexual exploitation. On a scale to measure posttraumatic stress disorder
(PTSD), nearly 80% of the youth had symptoms severe enough to meet Diagnostic
and Statistical Manual of Mental Disorders (4th ed.; DSM-IV) criteria for PTSD. The
in-depth interviews revealed how exploited youth were recruited and abused.
Mixed Methods Designs and Strategies
In designing MM studies, researchers make many important decisions. We briefly
describe a few.
Design Decisions and Notation
Two decisions in MM design concern sequencing and prioritization. There are three
options for sequencing components of an MM study: Qualitative data are collected first,
quantitative data are collected first, or both types are collected simultaneously. When
the data are collected at the same time, the approach is concurrent. The design is
sequential when the two types of data are collected in phases. In well-conceived
sequential designs, the analysis and interpretation in one phase informs the collection of
data in the second.
In terms of prioritization, researchers usually decide which approach—quantitative
or qualitative—to emphasize. One option is to give the two components (strands) equal,
or roughly equal, weight. Usually, however, one approach is given priority. The
distinction is sometimes referred to as equal status versus dominant status.
Janice Morse (1991), a prominent nurse researcher, made a major contribution to
MM research by proposing a widely used notation system for sequencing and
prioritization. In this system, priority is designated by uppercase and lowercase letters:
QUAL/quan designates an MM study in which the dominant approach is qualitative,
whereas QUAN/qual designates the reverse. If neither approach is dominant (i.e., both
are equal), the notation is QUAL/QUAN. Sequencing is indicated by the symbols + or
→. The arrow designates a sequential approach. For example, QUAN → qual is the
notation for a primarily quantitative MM study in which qualitative data are collected in
phase 2. When both approaches occur concurrently, a plus sign is used (e.g., QUAL +
quan).
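The notation is compact enough to treat as a tiny formal language. The sketch below is purely illustrative (the Strand class and morse_notation function are invented names, not part of Morse's system), but it shows how priority (letter case) and sequencing (symbol) combine:

from dataclasses import dataclass

@dataclass
class Strand:
    approach: str    # "qualitative" or "quantitative"
    dominant: bool   # True: uppercase (QUAL/QUAN); False: lowercase (qual/quan)

def morse_notation(first: Strand, second: Strand, sequential: bool) -> str:
    """Render a two-strand mixed methods design in Morse-style notation."""
    def label(s: Strand) -> str:
        abbrev = "QUAL" if s.approach == "qualitative" else "QUAN"
        return abbrev if s.dominant else abbrev.lower()
    connector = " → " if sequential else " + "   # arrow: sequential; plus: concurrent
    return label(first) + connector + label(second)

# A primarily quantitative study with qualitative data collected in phase 2:
print(morse_notation(Strand("quantitative", True), Strand("qualitative", False), True))  # QUAN → qual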
Specific Mixed Methods Designs
Numerous design typologies have been proposed by different MM methodologists. We
illustrate a few basic designs described by Creswell (2015).
The purpose of the convergent design (sometimes called a triangulation design) is
to obtain different, but complementary, data about the central phenomenon under study
—i.e., to triangulate data sources. The goal of this design is to converge on “the truth”
about a problem or phenomenon by allowing the limitations of one approach to be offset
by the strengths of the other. In this design, quantitative and qualitative data are
collected simultaneously, with equal priority (QUAL + QUAN).
Example of a convergent design
Wittenberg-Lyles and colleagues (2015) used a QUAL + QUAN design in their MM
study that assessed the potential benefits of a secret Facebook group for bereaved
hospice caregivers. Data were collected concurrently by means of posts and
comments in the secret Facebook group and through standardized scales of anxiety
and depression.
Explanatory designs are sequential designs with quantitative data collected in the
first phase, followed by qualitative data collected in the second phase. Either the
quantitative or the qualitative strand can be given a stronger priority: The design can be
either QUAN → qual or quan → QUAL. In explanatory designs, qualitative data from
the second phase are used to build on or explain the quantitative data from the initial
phase. This design is especially suitable when results are complex and tricky to
interpret.
Example of an explanatory design
Polivka and colleagues (2015) studied environmental health and safety hazards
experienced by home health care providers. A sample of 68 nurses, aides, and other
home health care workers completed a structured questionnaire that asked about
health care tasks performed and injuries or adverse outcomes experienced. Then,
sample members participated in in-depth focus group interviews. The focus group
data allowed the researchers to do a room-by-room analysis of hazards.
Exploratory designs are sequential MM designs, with qualitative data being
collected first. The design has as its central premise the need for initial in-depth
exploration of a concept. Usually, the first phase focuses on exploration of a poorly
understood phenomenon, and the second phase is focused on measuring it or classifying
it. In an exploratory design, either the qualitative phase can be dominant (QUAL →
quan) or the quantitative phase can be dominant (qual → QUAN).
Example of an exploratory design
Yang and colleagues (2016) developed a checklist for assessing thirst in patients with
advanced dementia. The items on the checklist were developed through in-depth
interviews with nurses caring for patients with advanced dementia. The checklist was
then tested quantitatively (e.g., for reliability) with caregivers from eight facilities.
TIP Creswell and Plano Clark (2011) described a design called the
embedded design—a term that is sometimes used in nursing studies.
However, Creswell (2015) subsequently stopped referencing this design. An
embedded design is one in which a second type of data is totally subservient
to the other type of data. Creswell now views embedding as an analytic
strategy rather than as a design type.
Sampling and Data Collection in Mixed Methods Research
Sampling and data collection in MM studies are often a blend of approaches described
in earlier chapters. A few special issues for an MM study merit brief discussion.
MM researchers can combine sampling designs in various ways. The quantitative
component is likely to rely on a sampling strategy that enhances the researcher’s ability
to generalize from the sample to a population. For the qualitative component, MM
researchers usually adopt purposive sampling methods to select information-rich cases
who are good informants about the phenomenon of interest. Sample sizes are also likely
to be different in the quantitative and qualitative strands in ways one might expect—i.e.,
larger samples for the quantitative component. A unique sampling issue in MM studies
concerns whether the same people will be in both the quantitative and qualitative
strands. The best strategy depends on the study purpose and the research design, but
using overlapping samples can be advantageous. Indeed, a particularly popular strategy
is a nested approach in which a subset of participants from the quantitative strand is
used in the qualitative strand.
Example of nested sampling
Nguyen and coresearchers (2016) studied the medical, service-related, and emotional
reasons for emergency room visits of older cancer patients. They undertook a
statistical analysis of administrative databases for 792 cancer patients aged 70 years
or older. They conducted semistructured interviews with a subsample of 11 patients
to better understand the experiences from the patients’ perspective.
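The mechanics of nesting can be sketched in a few lines of code. Everything here is hypothetical (the participant IDs, the sample sizes echoing the Nguyen et al. study, and the nested_subsample helper), and a real qualitative subsample would usually be selected purposively rather than at random:

import random

# Hypothetical quantitative strand: IDs for 792 survey participants.
quantitative_sample = [f"P{i:03d}" for i in range(1, 793)]

def nested_subsample(quant_ids, k, seed=1):
    """Draw k participants from the quantitative strand for in-depth interviews.
    random.sample keeps the sketch self-contained; purposive selection of
    information-rich cases would be more typical in practice."""
    return random.Random(seed).sample(quant_ids, k)

qualitative_sample = nested_subsample(quantitative_sample, k=11)
print(len(qualitative_sample))   # 11 interviewees nested within the 792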
In terms of data collection, all of the data collection methods discussed previously
can be creatively combined and triangulated in an MM study. Thus, possible sources of
data include group and individual interviews, psychosocial scales, observations,
biophysiological measures, records, diaries, and so on. MM studies can involve
intramethod mixing (e.g., structured and unstructured self-reports) and intermethod
mixing (e.g., biophysiologic measures and unstructured observation). A fundamental
issue concerns the methods’ complementarity—that is, having the limitations of one
method be balanced and offset by the strengths of the other.
TIP One challenge in doing MM research concerns how best to analyze the
quantitative and qualitative data. The benefits of MM research require an
effort to merge results from the two strands and to develop interpretations
and recommendations based on integrated understandings.
OTHER SPECIAL TYPES OF RESEARCH
The remainder of this chapter briefly describes types of research that vary by study
purpose rather than by research design or tradition.
Intervention Research
In Chapter 9, we discussed randomized controlled trials (RCTs) and other experimental
and quasi-experimental designs for testing the effects of interventions. In actuality,
intervention research is often more complex than a simple experimental–control group
comparison of outcomes—indeed, intervention research often relies on MM to develop,
refine, test, and understand the intervention.
Different disciplines have developed their own approaches and terminology in
connection with intervention efforts. Clinical trials are associated with medical
research, evaluation research is linked to the fields of education and public policy, and
nurses are developing their own tradition of intervention research. We briefly describe
these three approaches.
Clinical Trials
Clinical trials are studies that test clinical interventions. Trials undertaken to evaluate an
innovative therapy or drug are often designed in a series of phases:
Phase I of the trial is designed to establish safety, tolerance, and dose with a simple
design (e.g., one-group pretest–posttest). The focus is on developing the best
treatment.
Phase II is a pilot test of treatment effectiveness. Researchers see if the intervention is
feasible and acceptable and holds promise. This phase is designed as a small-scale
experiment or a quasi-experiment.
Phase III is a full experimental test of the intervention—an RCT with random
assignment to treatment conditions. The objective is to develop evidence about the
treatment’s efficacy—i.e., whether the intervention is more efficacious than usual
care or another alternative. When the term clinical trial is used, it often refers to a
phase III trial.
Phase IV of clinical trials involves studies of the effectiveness of an intervention in the
general population. The emphasis in effectiveness studies is on the external validity
of an intervention that has demonstrated efficacy under controlled (but artificial)
conditions.
Evaluation Research
Evaluation research focuses on developing useful information about a program or
policy—information that decision makers need on whether to adopt, modify, or abandon
the program.
Evaluations are undertaken to answer various questions. Questions about program
effectiveness rely on experimental or quasi-experimental designs, but other questions do
not. Many evaluations are MM studies with distinct components.
For example, a process analysis is often undertaken to obtain descriptive
information about the process by which a program gets implemented and how it actually
functions. A process analysis addresses such questions as the following: What exactly is
the treatment, and how does it differ from traditional practices? What are the barriers to
successful program implementation? How do staff and clients feel about the
intervention? Qualitative data play a big role in process analyses.
Evaluations may also include an economic (or cost) analysis to assess whether
a program’s benefits outweigh its monetary costs. Administrators make decisions about
resource allocation for health services not only on the basis of whether something
“works” but also on the basis of economic viability. Cost analyses are often done when
researchers are also evaluating program efficacy.
Example of an economic analysis
Sahlen and colleagues (2016) assessed the cost-effectiveness of person-centered
integrated heart failure and palliative home care based on data gathered in an RCT of
intervention efficacy. The analysis showed significant cost reductions compared to
usual care.
Nursing Intervention Research
Both clinical trials and evaluations involve interventions. However, the term
intervention research is increasingly being used by nurse researchers to describe an
approach characterized by a distinctive process of planning, developing, and testing
interventions—especially complex interventions. Proponents of the process are critical
of the simplistic, atheoretical approach that is often used to design and evaluate
interventions. The recommended process involves an in-depth understanding of the
problem and the target population; careful, collaborative planning with a diverse team;
and the development or adoption of a theory to guide the inquiry.
Similar to clinical trials, nursing intervention research that involves the development
of a complex intervention involves several phases: (1) basic developmental research, (2)
pilot research, (3) efficacy research, and (4) effectiveness research.
Conceptualization, a major focus of the development phase, is supported through
collaborative discussions, consultations with experts, critical literature reviews, and in-
depth qualitative research to understand the problem. The construct validity of the
intervention is enhanced through efforts to develop an intervention theory that clearly
articulates what must be done to achieve desired outcomes. The intervention design,
which emerges from the intervention theory, specifies what the clinical inputs should
be. During the developmental phase, key stakeholders—people who have a stake in the
intervention—are often identified and “brought on board.” Stakeholders include
potential beneficiaries of the intervention and their families, advocates and community
leaders, and health care staff.
The second phase of nursing intervention research is a pilot test of the intervention.
The central activities during the pilot test are to secure preliminary evidence of the
intervention’s benefits, to assess the feasibility of a rigorous test, and to refine the
intervention theory and intervention protocols. The feasibility assessment should
involve an analysis of factors that affected implementation during the pilot (e.g.,
recruitment, retention, and adherence problems). Qualitative research may be used to
gain insight into how the intervention should be refined.
As in a classic clinical trial, the third phase involves a full experimental test of the
intervention; the final phase focuses on effectiveness and utility in real-world clinical
settings. This full model of intervention research is, at this point, more of an ideal than
an actuality. For example, effectiveness studies in nursing research are rare. A few
research teams have begun to implement portions of the model, and efforts are likely to
expand.
Example of nursing intervention research
Rossen and colleagues (2016) developed and pilot tested a complex nurse-led
psychoeducational intervention to address the physical and psychological needs of
women receiving radiotherapy for gynecologic cancer. The researchers developed the
intervention based on relevant theory and consumer and expert consultations. Two
theoretical perspectives informed intervention development: self-determination
theory and a peer support theory. The intervention was pilot tested with six patients.
The peer volunteers and nurse delivering the intervention maintained reflective
diaries regarding feasibility and acceptability. The intervention is being formally
tested in an RCT.
Health Services and Outcomes Research
Health services research is the broad interdisciplinary field that studies how
organizational structures and processes, health technologies, social factors, and personal
behaviors affect access to health care, the cost and quality of health care, and,
ultimately, people’s health and well-being. Outcomes research, a subset of health
services research, comprises efforts to understand the end results of particular health
care practices and to assess the effectiveness of health care services. Outcomes research
represents a response to the increasing demand from policy makers and the public to
justify care practices in terms of improved patient outcomes and costs.
Many nursing studies evaluate patient outcomes, but efforts to appraise the quality
and impact of nursing care—as distinct from care provided by the overall health care
system—are less common. A major obstacle is attribution—that is, linking patient
outcomes to specific nursing actions, distinct from those of other members of the health
care team. It is also often difficult to ascertain a causal connection between outcomes
and health care interventions because factors outside the health care system (e.g., patient
characteristics) affect outcomes in complex ways.
Donabedian (1987), whose pioneering efforts created a framework for outcomes
research, emphasized three factors in appraising quality in health care services:
structure, process, and outcomes. The structure of care refers to broad organizational
and administrative features. Nursing skill mix, for example, is a structural variable that
has been found to be related to patient outcomes. Processes involve aspects of clinical
management and decision making. Outcomes refer to specific clinical end results of
patient care. Much progress has been made in identifying nursing sensitive outcomes
—patient outcomes that improve if there is greater quantity or quality of nurses’ care.
Several modifications to Donabedian’s (1987) framework for appraising health care
quality have been proposed, the most noteworthy of which is the Quality Health
Outcomes Model developed by the American Academy of Nursing (Mitchell et al.,
1998). This model is less linear and more dynamic than Donabedian’s original
framework and takes client characteristics (e.g., illness severity) and system
characteristics into account.
Outcomes research usually concentrates on studying linkages within such models
rather than on testing the overall model. Some studies have examined the effect of
health care structures on health care processes or outcomes, for example. Outcomes
research in nursing often has focused on the process–patient–outcomes nexus. Examples
of nursing process variables include nursing actions, nurses’ problem-solving and
decision-making skills, clinical competence and leadership, and specific activities or
interventions (e.g., communication, touch).
Example of outcomes research
Pitkäaho and colleagues (2015) studied the relationship between nurse staffing
(proportion of registered nurses and skill mix) on the one hand and patient outcomes
(e.g., length of hospital stay) on the other in 35,306 patient episodes in acute care
units of a Finnish hospital.
Survey Research
A survey obtains quantitative information about the prevalence, distribution, and
interrelations of variables within a population. Political opinion polls are examples of
surveys. Survey data are used primarily in correlational studies and are often used to
gather information from nonclinical populations (e.g., college students, nurses).
Surveys obtain information about people’s actions, knowledge, intentions, and
opinions by self-report. Surveys, which yield quantitative data primarily, may be cross-
sectional or longitudinal. Any information that can reliably be obtained by direct
questioning can be gathered in a survey, although surveys include mostly closed-ended
questions.
Survey data can be collected in a number of ways, but the most respected method is
through personal interviews in which interviewers meet in person with respondents to
ask them questions. Personal interviews are expensive because they involve a lot of
personnel time, but they yield high-quality data, and the refusal rate tends to be low.
Telephone interviews are less costly, but when the interviewer is unknown, respondents
may be uncooperative on the phone. Self-administered questionnaires (especially those
delivered over the Internet) are an economical approach to doing a survey but are not
appropriate for surveying certain populations (e.g., the elderly, children) and tend to
yield low response rates.
The greatest advantage of surveys is their flexibility and broadness of scope.
Surveys can be used with many populations, can focus on a wide range of topics, and
can be used for many purposes. The information obtained in most surveys, however,
tends to be relatively superficial: Surveys rarely probe deeply into complexities of
human behavior and feelings. Survey research is better suited to extensive rather than
intensive analysis.
Example of a survey
Kleinpell and coresearchers (2016) conducted a national benchmarking survey of
nurses working in intensive care telemedicine facilities in the United States. The
survey focused on the perceived benefits of telemedicine and barriers to its use. More
than 1,200 nurses completed an online survey.
Quality Improvement Studies
One further type of research-like endeavor is quality improvement (QI) projects. As
discussed in Chapter 2, the purpose of QI is to improve practices and processes within a
specific organization—not to generate knowledge that can be generalized beyond the
specific context of the study. Nevertheless, there are similarities between QI, health care
research, and evidence-based practice (EBP) projects. All three have a lot in common
(e.g., the use of systematic methods of collecting and analyzing data to address a
problem), but there are also differences.
A comparison chart describing the similarities and differences of the three types of
efforts on over 20 dimensions has been prepared by Shirey and colleagues (2011). One
dimension is “expectations for knowledge dissemination.” In QI, the major expectation
is that results would be disseminated internally—publication in a professional journal is
not usually considered necessary. In EBP projects, knowledge dissemination is
“increasingly becoming an expectation within facility in which EBP project undertaken
and beyond that setting” (Shirey et al., 2011, p. 63). For research, widespread
dissemination in accessible publications is the norm and often considered an obligation.
A decade ago, publication in a professional journal was considered by many a criterion
for classifying something as “research” rather than QI or EBP, but this is no longer the
case. Many QI projects are described in professional journals.
How “generalizable” the knowledge gained from a project is presents another point of
comparison. Shirey and colleagues’ (2011) chart states that knowledge from QI is not
generalizable—it is specific to the organization in which the QI is undertaken. However,
some QI projects test improvements that could be effectively implemented in other
institutions. Research, which is supposed to be generalizable, often is not as amenable to
generalization as one might wish. Many nursing and health studies are done in local
settings using convenience samples that provide little basis for generalization without
replications. Thus, one cannot necessarily distinguish QI and research based on whether
the patients are from a specific clinical microsystem.
The field of QI has developed some distinctive methodologies and models for
conducting inquiries. A frequently mentioned model is Plan-Do-Study-Act (PDSA),
which is sometimes referred to as Plan-Do-Check-Act (PDCA). The PDSA cycle, part
of the Institute for Healthcare Improvement’s Model for Improvement, was designed as
a tool for accelerating QI. The steps in the cycle, sketched in code after the list, are the following:
1. Plan: Plan a change and develop a test or observation, including a plan for data
collection.
2. Do: Try out the change on a small scale.
3. Study: Review and analyze the data, study the results, and identify what has been
learned.
4. Act: Refine the change and take action based on the lessons learned from the test.
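Viewed loosely as an algorithm, and only as an illustration (the function names and stopping rule below are ours, not part of the IHI model), the cycle is a refinement loop:

def pdsa(plan, do, study, act, max_cycles=3):
    """Run repeated Plan-Do-Study-Act cycles; each cycle refines the change."""
    change = plan()                    # Plan: define the change and the data collection
    lessons = None
    for _ in range(max_cycles):
        data = do(change)              # Do: try the change on a small scale
        lessons = study(data)          # Study: analyze the data, identify lessons
        change = act(change, lessons)  # Act: refine the change based on lessons learned
        if change is None:             # illustrative stopping rule: no further change proposed
            break
    return lessons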
Example of a quality improvement study
Zimnicki (2015) used the PDCA model in a QI project involving the development of
a flow chart for caring for patients undergoing planned ostomy surgery and an
educational intervention to help staff nurses to perform preoperative stoma site
marking and patient teaching.
A Few Other Types of Research
The majority of quantitative studies that nurse researchers have conducted are the types
described thus far in this and earlier chapters. However, nurse researchers have pursued
a few other specific types of research, as briefly described here. The supplement for this
chapter on the website provides more details about each type.
Secondary analysis. Secondary analyses involve the use of existing data from a
previous or ongoing study to test new hypotheses or answer questions that were not
initially envisioned. Secondary analyses are often based on quantitative data from a
large data set (e.g., from national surveys), but secondary analyses of data from
qualitative studies have also been undertaken. The study in Appendix A of this book
is a secondary analysis.
Delphi surveys. Delphi surveys were developed as a tool for short-term forecasting.
The technique involves a panel of experts who are asked to complete several rounds
of questionnaires focusing on their judgments about a topic of interest. Multiple
iterations are used to achieve consensus.
Methodologic studies. Nurse researchers have undertaken many methodologic
studies, which focus on the development, validation, and assessment of methodologic
tools or strategies (e.g., the psychometric testing of a new scale).
CRITIQUING STUDIES DESCRIBED IN THIS
CHAPTER
It is difficult to provide guidance on critiquing the types of studies described in this
chapter because they are so varied and because many fundamental methodologic issues
require a critique of the overall design. Guidelines for critiquing design-related issues
were presented in previous chapters.
You should, however, consider whether researchers took appropriate advantage of
the possibilities of an MM design. Collecting both quantitative and qualitative data is
not always necessary or practical, but in critiquing studies, you can consider whether the
study would have been strengthened by triangulating different types of data. In studies
in which MM were used, you should carefully consider whether the inclusion of both
types of data was justified and whether the researcher really made use of both types of
data to enhance knowledge on the research topic. Box 13.1 offers a few specific
questions for critiquing the types of studies included in this chapter.
Box 13.1 Guidelines for Critiquing Studies Described in Chapter 13
1. Was the study exclusively quantitative or exclusively qualitative? If so, could the
study have been strengthened by incorporating both approaches?
2. If the study used an MM design, did the inclusion of both approaches contribute
to enhanced validity? In what other ways (if any) did the inclusion of both types
of data strengthen the study and further the aims of the research?
3. If the study used an MM approach, what was the design—how were the
components sequenced, and which had priority? Was this approach appropriate?
4. If the study was a clinical trial or intervention study, was adequate attention paid
to developing an appropriate intervention? Was there a well-conceived
intervention theory that guided the endeavor? Was the intervention adequately
pilot tested?
5. If the study was a clinical trial, evaluation, or intervention study, was there an
effort to understand how the intervention was implemented (i.e., a process-type
analysis)? Were the financial costs and benefits assessed? If not, should they have
been?
6. If the study was outcomes research, which segments of the structure–process–
outcomes model were examined? Would it have been desirable (and feasible) to
expand the study to include other aspects? Do the findings suggest possible
improvements to structures or processes that would be beneficial to patient
outcomes?
7. If the study was a survey, was the most appropriate method used to collect the
data (i.e., in-person interviews, telephone interviews, mail or Internet
questionnaires)?
The nursing literature abounds with studies of the types described in this
chapter. Here, we describe an important example. Read the summary and then
answer the critical thinking questions that follow, referring to the full research
report if necessary. Example 1 is featured on the interactive Critical Thinking
Activity on the website. The critical thinking questions for Example 2 are
based on the study that appears in its entirety in Appendix D of this book. Our
comments for this exercise are in the Student Resources section on the website.
EXAMPLE 1: MIXED METHODS STUDY WITH A SURVEY
Study: A mixed-methods study of secondary traumatic stress in certified
nurse-midwives: Shaken belief in the birth process (Beck et al., 2015)
Statement of Purpose: The purpose of this study was to examine secondary
traumatic stress (STS) among certified nurse-midwives (CNMs) exposed to
traumatized patients during childbirth. The research questions were (1) What
are the prevalence and severity of STS in CNMs exposed to traumatic birth?
(2) Are CNMs’ demographic characteristics related to STS? (3) What are the
experiences of CNMs who attend at traumatic births? and (4) How do the
quantitative and qualitative sets of results develop a more complete picture of
STS in CNMs?
Methods: A convergent design (QUAL + QUAN) was used, i.e., independent
strands of data were collected in a single phase. CNMs who had attended at
least one traumatic birth were invited to participate in a survey. A total of 473
CNMs completed the quantitative portion—a questionnaire that included
background questions and the 17-item STS Scale. Data for the qualitative
strand, obtained from a nested sample of 246 survey participants, came from
responses to the following: “Please describe in as much detail as you can
remember your experience of attending one or more traumatic births. Please
describe all of your thoughts, feelings, and perceptions until you have no
more to write. If attending traumatic births has impacted your midwifery
practice, please describe this impact” (p. 17).
Data Analysis and Integration: Statistical methods were used to answer
research questions 1 and 2. Question 3 was addressed by means of a content
analysis of the qualitative data on the CNMs’ actual experiences. Themes
were cross-tabulated with information about CNMs’ characteristics and
reported symptoms. The merged results were then integrated into an overall
interpretation.
Key Findings: In this sample, 29% of the CNMs reported high to severe
STS; 36% screened positive for PTSD due to attending traumatic births. Six
themes were identified in the analysis of qualitative data (e.g., protecting my
patients: agonizing sense of powerlessness and helplessness; shaken belief in
the birth process: impacting midwifery practice). More than half the
participants said that their practice had been impacted. Having both
quantitative and qualitative data provided a richer, more complete picture of
STS in CNMs. The quantitative results revealed the previously unknown high
percentage of CNMs experiencing STS. The qualitative results, however,
provided an insider’s glimpse to what it is like for the CNMs to struggle with
STS. For example, one highly rated item on the STS Scale was “I had trouble
sleeping.” Here is an excerpt from the qualitative data that brought this scale
item to life: “The baby must have been dead for 5 days or so as the skin was
peeling badly and blistered. Between the slime of the meconium and the skin
issues it was hard to grip the head to help deliver the rest of the body. I felt
like I was pulling off skin and worried I would pull off the head. For weeks I
could not get pictures of that dead baby girl out of my mind. I had difficulty
sleeping due to the nightmares” (p. 21).
Critical Thinking Exercises
1. Answer the relevant questions from Box 13.1 regarding this study.
2. Also consider the following targeted questions:
a. Comment on the sampling design in this study.
b. What might be an advantage of using a sequential rather than a
concurrent design in this study?
3. If the results of this study are valid, what are some of the uses to which the
findings might be put in clinical practice?
EXAMPLE 2: MIXED METHODS STUDY IN APPENDIX D
• Read the report of the MM study (“Differences in perceptions of diagnosis
and treatment of obstructive sleep apnea and continuous positive airway
pressure therapy among adherers and nonadherers”) by Sawyer and
colleagues (2010) in Appendix D and then address the following suggested
activities.
Critical Thinking Exercises
1. Answer questions 1 to 3 in Box 13.1 regarding this study.
2. Suppose that Sawyer and colleagues had only collected qualitative data.
Comment on how this might have affected the results and the overall
quality of the evidence. Then suppose they had collected all of their data in
a structured, quantitative manner. How might this have changed the results
and affected the quality of the evidence?
3. If the results of this study are valid, what are some of the uses to which the
findings might be put in clinical practice?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Other Specific Types of Research
• Answer to the Critical Thinking Exercise for Example 2
• Internet Resources with useful websites for Chapter 13
• A Wolters Kluwer journal article in its entirety—the Rossen et al. study
described as an example on p. 219.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
For many research purposes, mixed methods studies are advantageous. Mixed
methods research involves the collection, analysis, and integration of both
quantitative and qualitative data within a study or series of studies, often with an
overarching goal of achieving both discovery and verification.
Mixed methods (MM) research has numerous advantages, including the
complementarity of quantitative and qualitative data and the practicality of using
methods that best address a question. MM research has many applications,
including the development and testing of instruments, theories, and interventions.
The paradigm most often associated with MM research is pragmatism, which has
as a major tenet, “the dictatorship of the research question.”
Key decisions in designing an MM study involve how to sequence the components
and which strand (if either) will be given priority. In terms of sequencing, MM
designs are either concurrent (both strands occurring in one simultaneous phase)
or sequential (one strand occurring prior to and informing the second strand).
Notation for MM research often designates priority—all capital letters for the
dominant strand and all lowercase letters for the nondominant strand—and
sequence. An arrow is used for sequential designs, and a “+” is used for concurrent
designs. QUAL → quan, for example, is a sequential, qualitative-dominant design.
Specific MM designs include the convergent design (QUAL + QUAN),
explanatory design (e.g., QUAN → qual), and exploratory design (e.g., QUAL
→ quan).
Sampling in MM studies can involve the same or different people in the different
components. Nesting is a common sampling approach in which a subsample of the
participants in one strand also participates in the other.
Different disciplines have developed different approaches to (and terms for) efforts
to evaluate interventions. Clinical trials, which are studies designed to assess the
effectiveness of clinical interventions, often involve a series of phases. Phase I is
designed to finalize features of the intervention. Phase II involves seeking
preliminary evidence of efficacy and opportunities for refinements. Phase III is a
full experimental test of treatment efficacy. In Phase IV, the researcher focuses
primarily on generalized effectiveness and evidence about costs and benefits.
Evaluation research assesses the effectiveness of a program, policy, or procedure
to assist decision makers in choosing a course of action. Evaluations can answer a
variety of questions. Process analyses describe the process by which a program
gets implemented and how it functions in practice. Economic (cost) analyses seek
to determine whether the monetary costs of a program are outweighed by benefits.
Intervention research is a term sometimes used to refer to a distinctive process of
planning, developing, testing, and disseminating interventions. The construct
validity of an emerging intervention is enhanced through efforts to develop an
intervention theory that articulates what must be done to achieve desired
outcomes.
Outcomes research (a subset of health services research) is undertaken to
document the quality and effectiveness of health care and nursing services. A
model of health care quality encompasses several broad concepts, including
structure (e.g., nursing skill mix), process (nursing interventions and actions), and
outcomes (the specific end results of patient care in terms of patient functioning).
Efforts have been made to identify nursing sensitive outcomes.
Survey research examines people’s characteristics, behaviors, intentions, and
opinions by asking them to answer questions. Surveys can be administered
through personal (face-to-face) interviews, telephone interviews, or self-
administered questionnaires.
Quality improvement (QI) projects are designed to improve practices in a
specific organization; they often use a model called Plan-Do-Study-Act (PDSA) or
Plan-Do-Check-Act (PDCA).
REFERENCES FOR CHAPTER 13
Beck, C. T., LoGiudice, J., & Gable, R. (2015). A mixed-methods study of secondary traumatic stress in certified
nurse-midwives: Shaken belief in the birth process. Journal of Midwifery & Women’s Health, 60, 16–23.
Creswell, J. W. (2015). A concise introduction to mixed methods research. Thousand Oaks, CA: Sage.
Creswell, J. W., & Plano Clark, V. L. (2011). Designing and conducting mixed methods research (2nd ed.).
Thousand Oaks, CA: Sage.
Donabedian, A. (1987). Some basic issues in evaluating the quality of health care. In L. T. Rinke (Ed.), Outcome
measures in home care (Vol. 1, pp. 3–28). New York, NY: National League for Nursing.
*Edinburgh, L., Pape-Blabolil, J., Harpin, S., & Saewyc, E. (2015). Assessing exploitation experiences of girls and
boys seen at a Child Advocacy Center. Child Abuse & Neglect, 46, 47–59.
Kleinpell, R., Barden, C., Rincon, T., McCarthy, M., & Zapatochny Rufo, R. (2016). Assessing the impact of
telemedicine on nursing care in intensive care units. American Journal of Critical Care, 25, e14–e20.
Mitchell, P., Ferketich, S., & Jennings, B. (1998). Quality health outcomes model. Image: The Journal of Nursing
Scholarship, 30, 43–46.
Morse, J. M. (1991). Approaches to qualitative-quantitative methodological triangulation. Nursing Research, 40,
120–123.
Nguyen, B., Tremblay, D., Mathieu, L., & Groleau, D. (2016). Mixed method exploration of the medical, service-
related, and emotional reasons for emergency room visits of older cancer patients. Supportive Care in Cancer,
24, 2549–2556.
Pitkäaho, T., Partanen, P., Miettinen, M., & Vehviläinen-Julkunen, K. (2015). Non-linear relationships between
nurse staffing and patients’ length of stay in acute care units: Bayesian dependence modelling. Journal of
Advanced Nursing, 71, 458–473.
*Polivka, B., Wills, C., Darragh, A., Lavender, S., Sommerich, C., & Stredney, D. (2015). Environmental health
and safety hazards experienced by home health care providers: A room-by-room analysis. Workplace Health &
Safety, 63, 512–522.
**Rossen, S., Hansen-Nord, N., Kayser, L., Borre, M., Borre, M., Larsen, R., . . . Hansen, R. (2016). The impact of
husbands’ prostate cancer diagnosis and participation in a behavioral lifestyle intervention on spouses’ lives and
relationships with their partners. Cancer Nursing, 39, E1–E9.
Sahlen, K., Boman, K., & Brännström, M. (2016). A cost-effectiveness study of person-centered integrated heart
failure and palliative home care: Based on randomized controlled trial. Palliative Medicine, 30, 296–302.
Shirey, M., Hauck, S., Embree, J., Kinner, T., Schaar, G., Phillips, L., . . . McCool, I. (2011). Showcasing
differences between quality improvement, evidence-based practice, and research. The Journal of Continuing
Education in Nursing, 42, 57–68.
Tashakkori, A., & Teddlie, C. (2003). Handbook of mixed methods in social & behavioral research (2nd ed.).
Thousand Oaks, CA: Sage.
Wittenberg-Lyles, E., Washington, K., Oliver, D. P., Shaunfield, S., Gage, L. A., Mooney, M., & Lewis, A. (2015).
“It is the ‘starting over’ part that is so hard”: Using an online group to support hospice bereavement. Palliative
& Supportive Care, 13, 351–357.
Yang, Y. P., Wang, C., & Wang, J. (2016). The initial development of a checklist for assessing thirst in patients
with advanced dementia. The Journal of Nursing Research, 24, 224–230.
Zimnicki, K. M. (2015). Preoperative teaching and stoma marking in an inpatient population: A quality
improvement process using a FOCUS-Plan-Do-Check-Act model. Journal of Wound, Ostomy, and Continence
Nursing, 42, 165–169.
*A link to this open-access article is provided in the Internet Resources section on the website.
**This journal article is available on the website for this chapter.
Part 4 Analysis and Interpretation in
Quantitative and Qualitative Research
14 Statistical Analysis of Quantitative
Data
Learning Objectives
On completing this chapter, you will be able to:
Describe the four levels of measurement and identify which level was used for
measuring specific variables
Describe characteristics of frequency distributions and identify and interpret various
descriptive statistics
Describe the logic and purpose of parameter estimation and interpret confidence
intervals
Describe the logic and purpose of hypothesis testing and interpret p values
Specify appropriate applications for t-tests, analysis of variance, chi-squared tests, and
correlation coefficients and interpret the meaning of the calculated statistics
Understand the results of simple statistical procedures described in a research report
Identify several types of multivariate statistics and describe situations in which they
could be used
Identify indexes used in assessments of reliability and validity
Define new terms in the chapter
Key Terms
Absolute risk (AR)
Absolute risk reduction (ARR)
Alpha (α)
Analysis of covariance (ANCOVA)
Analysis of variance (ANOVA)
Central tendency
Chi-squared test
Coefficient alpha
Cohen’s kappa
Confidence interval (CI)
Continuous variable
Correlation
Correlation coefficient
Correlation matrix
Crosstabs table
d statistic
Descriptive statistics
Effect size
F ratio
Frequency distribution
Hypothesis testing
Inferential statistics
Interval measurement
Intraclass correlation coefficient (ICC)
Level of measurement
Level of significance
Logistic regression
Mean
Median
Mode
Multiple correlation coefficient
Multiple regression
Multivariate statistics
N
Negative relationship
Nominal measurement
Nonsignificant result (NS)
Normal distribution
Number needed to treat (NNT)
Odds ratio (OR)
Ordinal measurement
p value
Parameter
Parameter estimation
Pearson’s r
Positive relationship
Predictor variable
r
R²
Range
Ratio measurement
Repeated measures ANOVA
Sensitivity
Skewed distribution
Spearman’s rho
Specificity
Standard deviation
Statistic
Statistical test
Statistically significant
Symmetric distribution
Test statistic
t-test
Type I error
Type II error
Variability
Statistical analysis is used in quantitative research for three main purposes—to describe
the data (e.g., sample characteristics), to test hypotheses, and to provide evidence
regarding measurement properties of quantified variables (see Chapter 10). This chapter
provides a brief overview of statistical procedures for these purposes. We begin,
however, by explaining levels of measurement.
TIP Although the thought of learning about statistics may be anxiety-
provoking, consider Florence Nightingale’s view of statistics: “To
understand God’s thoughts we must study statistics, for these are the measure
of His purpose.”
LEVELS OF MEASUREMENT
Statistical operations depend on a variable’s level of measurement. There are four
major levels of measurement.
Nominal measurement, the lowest level, involves using numbers simply to
categorize attributes. Gender is an example of a nominally measured variable (e.g.,
females = 1, males = 2). The numbers used in nominal measurement do not have
quantitative meaning and cannot be treated mathematically. It makes no sense to
compute a sample’s average gender.
Ordinal measurement ranks people on an attribute. For example, consider this
ordinal scheme to measure ability to perform activities of daily living (ADL): 1 =
completely dependent, 2 = needs another person’s assistance, 3 = needs mechanical
assistance, and 4 = completely independent. The numbers signify incremental ability to
perform ADL independently, but they do not tell us how much greater one level is than
another. As with nominal measures, the mathematic operations with ordinal-level data
are restricted.
Interval measurement occurs when researchers can rank people on an attribute and
specify the distance between them. Most psychological scales and tests yield interval-
level measures. For example, the Stanford-Binet Intelligence (IQ) test is an interval
measure. The difference between a score of 140 and 120 is equivalent to the difference
between 120 and 100. Many statistical procedures require interval data.
Ratio measurement is the highest level. Ratio scales, unlike interval scales, have a
meaningful zero and provide information about the absolute magnitude of the attribute.
Many physical measures, such as a person’s weight, are ratio measures. It is meaningful
to say that someone who weighs 200 pounds is twice as heavy as someone who weighs
100 pounds. Statistical procedures suitable for interval data are also appropriate for
ratio-level data. Variables with interval and ratio measurements often are called
continuous variables.
Example of different measurement levels
Grønning and colleagues (2014) tested the effect of a nurse-led education program for
patients with chronic inflammatory polyarthritis. Gender and diagnosis were
measured as nominal-level variables. Education (10 years, 11 to 12 years, 13+ years)
was an ordinal measurement. Many outcomes (e.g., self-efficacy, coping) were
measured on interval-level scales. Other variables were measured on a ratio level
(e.g., age, number of hospital admissions).
Researchers usually strive to use the highest levels of measurement possible because
higher levels yield more information and are amenable to powerful analyses.
HOW-TO-TELL TIP How can you tell a variable’s measurement
level? A variable is nominal if the values could be interchanged (e.g., 1 =
male, 2 = female OR 1 = female, 2 = male). A variable is usually ordinal if
there is a quantitative ordering of values AND if there are a small number
of values (e.g., excellent, good, fair, poor). A variable is usually considered
interval if it is measured with a composite scale or test. A variable is ratio
level if it makes sense to say that one value is twice as much as another
(e.g., 100 mg is twice as much as 50 mg).
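The practical payoff of pinning down the measurement level is knowing which computations are defensible. A minimal sketch with invented data:

from statistics import mean

gender = [1, 2, 2, 1, 2]        # nominal codes: females = 1, males = 2 (labels only)
adl    = [1, 3, 4, 2, 4]        # ordinal ranks: ability to perform ADL
weight = [85, 109, 120, 135]    # ratio level: pounds, with a meaningful zero

print(mean(gender))   # 1.6: computable but meaningless, since "average gender" has no interpretation
print(mean(weight))   # 112.25: meaningful, because pounds have a true zero
# An ADL rank of 4 is not "twice" a rank of 2; ratio statements need ratio-level data.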
DESCRIPTIVE STATISTICS
Statistical analysis enables researchers to make sense of numeric information.
Descriptive statistics are used to synthesize and describe data. When indexes such as
averages and percentages are calculated with population data, they are parameters. A
descriptive index from a sample is a statistic. Most research questions are about
parameters; researchers calculate statistics to estimate parameters and use inferential
statistics to make inferences about the population.
Descriptively, data for a continuous variable can be depicted in terms of three
characteristics: the shape of the distribution of values, central tendency, and variability.
Frequency Distributions
Data that are not organized are overwhelming. Consider the 60 numbers in Table 14.1.
Assume that these numbers are the scores of 60 preoperative patients on an anxiety
scale. Visual inspection of these numbers provides little insight into patients’ anxiety.
Frequency distributions impose order on numeric data. A frequency distribution is
an arrangement of values from lowest to highest and a count or percentage of how many
times each value occurred. A frequency distribution for the 60 anxiety scores (Table
14.2) makes it easy to see the highest and lowest scores, where scores clustered, and
how many patients were in the sample (total sample size is designated as N in research
reports).
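In code, building a frequency distribution is a simple tally. Because Tables 14.1 and 14.2 are not reproduced here, this sketch uses a small set of made-up anxiety scores:

from collections import Counter

scores = [24, 22, 24, 30, 15, 24, 23, 22, 25, 24]   # hypothetical anxiety scores
freq = Counter(scores)
N = len(scores)                                      # total sample size

for value in sorted(freq):                           # arrange values lowest to highest
    print(f"{value}: f = {freq[value]}, {100 * freq[value] / N:.1f}%")
print(f"N = {N}")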
Frequency data can be displayed graphically in a frequency polygon (Fig. 14.1). In
such graphs, scores typically are on the horizontal line, and counts or percentages are on
the vertical line. Distributions can be described by their shapes. Symmetric
distribution occurs if, when folded over, the two halves of a frequency polygon would
be superimposed (Fig. 14.2). In an asymmetric or skewed distribution, the peak is off
center, and one tail is longer than the other. When the longer tail points to the right, the
distribution has a positive skew, as in Figure 14.3A. Personal income is positively
skewed: Most people have moderate incomes, with only a few people with high
incomes at the distribution’s right end. If the longer tail points to the left, the
distribution has a negative skew (Fig. 14.3B). Age at death is negatively skewed: Most
people are at the far right end of the distribution, with fewer people dying young.
Another aspect of a distribution’s shape concerns how many peaks it has. A
unimodal distribution has one peak (Fig. 14.2A), whereas a multimodal distribution has
two or more peaks—two or more values of high frequency. A distribution with two
peaks is bimodal (Fig. 14.2B).
A special distribution called the normal distribution (a bell-shaped curve) is
symmetric, unimodal, and not very peaked (Fig. 14.2A). Many human attributes (e.g.,
height, intelligence) approximate a normal distribution.
Central Tendency
Frequency distributions clarify patterns, but an overall summary often is desired.
Researchers ask questions such as “What is the average daily calorie consumption of
nursing home residents?” Such a question seeks a single number to summarize a
distribution. Indexes of central tendency indicate what is “typical.” There are three
indexes of central tendency: the mode, the median, and the mean.
Mode: The mode is the number that occurs most frequently in a distribution. In the
following distribution, the mode is 53:
50 51 51 52 53 53 53 53 54 55 56
The value of 53 occurred four times, more than any other number. The mode of the
patients’ anxiety scores in Table 14.2 was 24. The mode identifies the most “popular”
value.
Median: The median is the point in a distribution that divides scores in half. Consider
the following set of values:
2 2 3 3 4 5 6 7 8 9
The value that divides the cases in half is midway between 4 and 5; thus, 4.5 is the
median. The median anxiety score is 24, the same as the mode. The median does not
take into account individual values and is insensitive to extremes. In the given set of
numbers, if the value of 9 were changed to 99, the median would remain 4.5.
Mean: The mean equals the sum of all values divided by the number of participants—
what we usually call the average. The mean of the patients’ anxiety scores is 23.4
(1,405 ÷ 60). As another example, here are the weights of eight people:
85 109 120 135 158 177 181 195
In this example, the mean is 145. Unlike the median, the mean is affected by the
value of every score. If we exchanged the 195-pound person for one weighing 275
pounds, the mean would increase from 145 to 155 pounds. In research articles, the mean
is often symbolized as M or X̄ (e.g., X̄ = 145).
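For readers who wish to verify these computations, the following brief sketch in Python (illustrative only, using the standard library) reproduces the mode, median, and mean for the small distributions shown above.

```python
import statistics

# Values from the mode example above
mode_example = [50, 51, 51, 52, 53, 53, 53, 53, 54, 55, 56]
print(statistics.mode(mode_example))       # 53, the most frequent value

# Values from the median example above
median_example = [2, 2, 3, 3, 4, 5, 6, 7, 8, 9]
print(statistics.median(median_example))   # 4.5, midway between 4 and 5

# Weights of eight people, from the mean example above
weights = [85, 109, 120, 135, 158, 177, 181, 195]
print(statistics.mean(weights))            # 145, the sum divided by N
```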
For continuous variables, the mean is usually reported. Of the three indexes, the
mean is most stable: If repeated samples were drawn from a population, the means
would fluctuate less than the modes or medians. Because of its stability, the mean
usually is the best estimate of a population central tendency. When a distribution is
skewed, however, the median is preferred. For example, the median is a better index for
“average” (typical) income than the mean because income is positively skewed.
Variability
Two distributions with identical means could differ with respect to how spread out the
data are—how different people are from one another on the attribute. This section
describes the variability of distributions.
Consider the two distributions in Figure 14.4, which represent hypothetical scores
for students from two schools on an IQ test. Both distributions have a mean of 100, but
school A has a wider range of scores, with some below 70 and some above 130. In
school B, there are few low or high scores. School A is more heterogeneous (i.e., more
varied) than school B, and school B is more homogeneous than school A. Researchers
compute an index of variability to express the extent to which scores in a distribution
differ from one another. Two common indexes are the range and standard deviation.
Range: The range is the highest minus the lowest score in a distribution. In our
anxiety score example, the range is 15 (30 − 15). In the distributions in Figure 14.4,
the range for school A is about 80 (140 − 60), whereas the range for school B is
about 50 (125 − 75). The chief virtue of the range is ease of computation. Because it
is based on only two scores, however, the range is unstable: From sample to sample
drawn from a population, the range can fluctuate greatly.
Standard deviation: The most widely used variability index is the standard deviation.
Like the mean, the standard deviation is calculated based on every value in a
distribution. The standard deviation summarizes the average amount of deviation of
values from the mean.* In the example of patients’ anxiety scores (Table 14.2), the
standard deviation is 3.725. In research reports, the standard deviation is often
abbreviated as SD.
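Both variability indexes can be computed in a few lines. The sketch below uses hypothetical scores; note that Python's statistics module, like most statistical software, uses the sample formula for the SD, with N − 1 in the denominator.

```python
import statistics

scores = [15, 18, 20, 22, 23, 24, 24, 25, 27, 30]   # hypothetical anxiety scores

value_range = max(scores) - min(scores)   # highest score minus lowest score
sd = statistics.stdev(scores)             # sample standard deviation

print(value_range)     # 15
print(round(sd, 2))    # average deviation of scores from the mean
```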
TIP SDs sometimes are shown in relation to the mean without a label. For
example, the anxiety scores might be shown as M = 23.4 (3.7) or M = 23.4 ±
3.7, where 23.4 is the mean and 3.7 is the SD.
An SD is more difficult to interpret than the range. For the SD of anxiety scores, you
might ask: 3.725 what? What does the number mean? We can answer these questions
from several angles. First, the SD is an index of how variable scores in a distribution
are, and so, if (for example) male and female patients had means of 23.0 on the anxiety
scale, but their SDs were 7.0 and 3.0, respectively, it means that females were more
homogeneous (i.e., their scores were more similar to one another).
The SD represents the average of deviations from the mean. The mean tells us the
best value for summarizing an entire distribution, and an SD tells us how much, on
average, the scores deviate from the mean. An SD can be interpreted as our degree of
error when we use a mean to describe an entire sample.
In normal and near-normal distributions, there are roughly three SDs above and
below the mean, and a fixed percentage of cases fall within certain distances from the
mean. For example, with a mean of 50 and an SD of 10 (Fig. 14.5), 68% of all cases fall
within 1 SD above and below the mean. Thus, nearly 7 of 10 scores are between 40 and
60. In a
normal distribution, 95% of the scores fall within 2 SDs of the mean. Only a handful of
cases—about 2% at each extreme—lie more than 2 SDs from the mean. Using this
figure, we can see that a person with a score of 70 achieved a higher score than about
98% of the sample.
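These percentages follow directly from the mathematics of the normal curve and can be verified with software. A minimal sketch, assuming the SciPy library is available, for a distribution with a mean of 50 and an SD of 10:

```python
from scipy.stats import norm

dist = norm(loc=50, scale=10)   # normal distribution: mean = 50, SD = 10

print(round(dist.cdf(60) - dist.cdf(40), 3))   # ~.683 within 1 SD of the mean
print(round(dist.cdf(70) - dist.cdf(30), 3))   # ~.954 within 2 SDs of the mean
print(round(dist.cdf(70), 3))                  # ~.977: a score of 70 exceeds ~98%
```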
TIP Descriptive statistics (percentages, means, SDs) are most often used to
describe sample characteristics and key research variables and to document
methodological features (e.g., response rates). They are seldom used to
answer research questions—inferential statistics usually are used for this
purpose.
Example of descriptive statistics
Awoleke and coresearchers (2015) studied factors that predicted delays in seeking
care for a ruptured tubal pregnancy in Nigeria. They presented descriptive statistics
about participants’ characteristics. The mean age of the 92 women in the sample was
30.3 years (SD = 5.6); 76.9% were urban dwellers, 74.7% were married, and 27.5%
had no prior births. The mean duration of amenorrhea before hospital presentation
was 5.5 weeks (SD = 4.0).
Bivariate Descriptive Statistics
So far, our discussion has focused on univariate (one-variable) descriptive statistics.
Bivariate (two-variable) descriptive statistics describe relationships between two
variables.
Crosstabulations
A crosstabs table is a two-dimensional frequency distribution in which the frequencies
of two variables are crosstabulated. Suppose we had data on patients’ gender and
whether they were nonsmokers, light smokers (<1 pack of cigarettes a day), or heavy
smokers (≥1 pack a day). The question is whether men smoke more heavily than
women, or vice versa (i.e., whether there is a relationship between smoking and
gender). Fictitious data for this example are shown in Table 14.3. Six cells are created
by placing one variable (gender) along one dimension and the other variable (smoking
status) along the other dimension. After subjects’ data are allocated to the appropriate
cells, percentages are computed. The crosstab shows that women in this sample were
more likely than men to be nonsmokers (45.4% vs. 27.3%) and less likely to be heavy
smokers (18.2% vs. 36.4%). Crosstabs are used with nominal data or ordinal data with
few values. In this example, gender is nominal, and smoking, as operationalized, is
ordinal.
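Crosstabs tables are easily produced with statistical software. As an illustration (the raw data below are made up and are not the values behind Table 14.3), the pandas library can crosstabulate two variables and compute column percentages in a single call:

```python
import pandas as pd

# Hypothetical raw data: one row per patient
data = pd.DataFrame({
    "gender": ["F", "F", "M", "M", "F", "M", "F", "M"],
    "smoking": ["none", "light", "heavy", "none",
                "none", "heavy", "light", "light"],
})

# Two-dimensional frequency distribution, shown as column percentages
table = pd.crosstab(data["smoking"], data["gender"], normalize="columns") * 100
print(table.round(1))
```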
Correlation
Relationships between two variables can be described by correlation methods. The
correlation question is: To what extent are two variables related to each other? For
example, to what degree are anxiety scores and blood pressure values related? This
question can be answered by calculating a correlation coefficient, which describes
intensity and direction of a relationship.
Two variables that are related are height and weight: Tall people tend to weigh more
than short people. The relationship between height and weight would be a perfect
relationship if the tallest person in a population was the heaviest, the second tallest
person was the second heaviest, and so on. A correlation coefficient indicates how
“perfect” a relationship is. Possible values for a correlation coefficient range from −1.00
through .00 to +1.00. If height and weight were perfectly correlated, the correlation
coefficient would be 1.00 (the actual correlation coefficient is in the vicinity of .50 to
.60 for a general population). Height and weight have a positive relationship because
greater height tends to be associated with greater weight.
When two variables are unrelated, the correlation coefficient is zero. One might
anticipate that women’s shoe size is unrelated to their intelligence. Women with large
feet are as likely to perform well on IQ tests as those with small feet. The correlation
coefficient summarizing such a relationship would be in the vicinity of .00.
Correlation coefficients between .00 and −1.00 express a negative (inverse)
relationship. When two variables are inversely related, higher values on one variable
are associated with lower values in the second. For example, there is a negative
correlation between depression and self-esteem. This means that, on average, people
with high self-esteem tend to be low on depression. If the relationship were perfect (i.e.,
if the person with the highest self-esteem score had the lowest depression score and so
on), then the correlation coefficient would be −1.00. In actuality, the relationship
between depression and self-esteem is moderate—usually in the vicinity of −.30 or
−.40. Note that the higher the absolute value of the coefficient (i.e., the value
disregarding the sign), the stronger the relationship. A correlation of −.50, for instance,
is stronger than a correlation of +.30.
The most widely used correlation statistic is Pearson’s r (the product–moment
correlation coefficient), which is computed with continuous measures. For correlations
between variables measured on an ordinal scale, researchers usually use an index called
Spearman’s rho. There are no guidelines on what should be interpreted as strong or
weak correlations because it depends on the variables. If we measured patients’ body
temperature orally and rectally, an r of .70 between the two measurements would be
low. For most psychosocial variables (e.g., stress and depression), however, an r of .70
would be high.
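Both coefficients are one-line computations in most statistical software. A sketch assuming SciPy and hypothetical paired measurements:

```python
from scipy.stats import pearsonr, spearmanr

# Hypothetical height (cm) and weight (kg) measurements for eight people
height = [160, 165, 168, 170, 172, 175, 180, 185]
weight = [55, 62, 58, 70, 66, 74, 78, 82]

r, p = pearsonr(height, weight)         # Pearson's r, for continuous measures
rho, p_rho = spearmanr(height, weight)  # Spearman's rho, for ordinal data

print(round(r, 2), round(rho, 2))
```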
Correlation coefficients are often reported in tables displaying a two-dimensional
correlation matrix, in which every variable is displayed in both a row and a column,
and coefficients are displayed at the intersections. An example of a correlation matrix is
presented at the end of this chapter.
Example of correlations
Elder et al. (2016) investigated sleep and activity as they relate to body mass index
and waist circumference (WC). They found a modest positive correlation between
WC and sedentary activity (r = .17) and a modest negative correlation between sleep
duration and WC (r = −.11).
Describing Risk
The evidence-based practice (EBP) movement has made decision making based on
research findings an important issue. Several descriptive indexes can be used to
facilitate such decision making. Many of these indexes involve calculating risk
differences—for example, differences in risk before and after exposure to a beneficial
intervention.
We focus on describing dichotomous outcomes (e.g., had a fall/did not have a fall)
in relation to exposure or nonexposure to a beneficial treatment or protective factor.
This situation results in a 2 × 2 crosstabs table with four cells. The four cells in the
crosstabs table in Table 14.4 are labeled, so various indexes can be explained. Cell a is
the number of cases with an undesirable outcome (e.g., a fall) in an
intervention/protected group, cell b is the number with a desirable outcome (e.g., no
fall) in an intervention/protected group, and cells c and d are the two outcome
possibilities for a nontreated/unprotected group. We can now explain the meaning and
calculation of some indexes of interest to clinicians.
Absolute Risk
Absolute risk can be computed for those exposed to an intervention/protective factor
and for those not exposed. Absolute risk (AR) is simply the proportion of people who
experienced an undesirable outcome in each group. Suppose 200 smokers were
randomly assigned to a smoking cessation intervention or to a control group (Table
14.5). The outcome is smoking status 3 months later. Here, the AR of continued
smoking is .50 in the intervention group and .80 in the control group. Without the
intervention, 20% of those in the experimental group would presumably have stopped
smoking anyway, but the intervention boosted the rate to 50%.
Absolute Risk Reduction
The absolute risk reduction (ARR) index, a comparison of the two risks, is computed
by subtracting the AR for the exposed group from the AR for the unexposed group. This
index is the estimated proportion of people who would be spared the undesirable
outcome through exposure to an intervention/protective factor. In our example, the
value of ARR is .30: 30% of the control group subjects would presumably have stopped
smoking if they had received the intervention, over and above the 20% who stopped
without it.
Odds Ratio
The odds ratio is a widely reported risk index. The odds, in this context, is the
proportion of people with the adverse outcome relative to those without it. In our
example, the odds of continued smoking for the intervention group is 1.0: 50 (those who
continued smoking) divided by 50 (those who stopped). The odds for the control group
is 80 divided by 20, or 4.0. The odds ratio (OR) is the ratio of these two odds—here,
.25. The estimated odds of continuing to smoke are one-fourth as high among
intervention group members as for control group members. Turned around, the
estimated odds of continued smoking is 4 times higher among smokers who do not get
the intervention as among those who do.
Example of odds ratios
Draughon Moret and colleagues (2016) examined factors associated with patients’
acceptance of nonoccupational postexposure prophylaxis (nPEP) for HIV following
sexual assault; many results were reported as ORs. For example, patients were nearly
13 times more likely to accept the offer of nPEP if they were assaulted by more than
one assailant (OR = 12.66).
Number Needed to Treat
The number needed to treat (NNT) index estimates how many people would need to
receive an intervention to prevent one undesirable outcome. NNT is computed by
dividing 1 by the ARR. In our example, ARR = .30, and so NNT is 3.33. About three
smokers would need to be exposed to the intervention to avoid one person’s continued
smoking. The NNT is valuable because it can be integrated with monetary information
to show if an intervention is likely to be cost-effective.
TIP Another risk index is known as relative risk (RR). The RR is the
estimated proportion of the original risk of an adverse outcome (in our
example, continued smoking) that persists when people are exposed to the
intervention. In our example, RR is .625 (.50 / .80): The risk of continued
smoking is estimated as 62.5% of what it would have been without the
intervention.
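All of the risk indexes described in this section can be derived from the four cells of the 2 × 2 table. The sketch below plugs in the cell counts from the smoking cessation example (Table 14.5) and reproduces the values given in the text.

```python
# Cell counts from the smoking cessation example (Table 14.5)
a, b = 50, 50   # intervention group: continued smoking (a), stopped (b)
c, d = 80, 20   # control group: continued smoking (c), stopped (d)

ar_exposed = a / (a + b)           # absolute risk, intervention group: .50
ar_unexposed = c / (c + d)         # absolute risk, control group: .80
arr = ar_unexposed - ar_exposed    # absolute risk reduction: .30
odds_ratio = (a / b) / (c / d)     # odds ratio: 1.0 / 4.0 = .25
nnt = 1 / arr                      # number needed to treat: about 3.33
rr = ar_exposed / ar_unexposed     # relative risk: .625

print(ar_exposed, ar_unexposed, round(arr, 2), odds_ratio, round(nnt, 2), rr)
```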
INTRODUCTION TO INFERENTIAL STATISTICS
Descriptive statistics are useful for summarizing data, but researchers usually do more
than describe. Inferential statistics, based on the laws of probability, provide a means
for drawing inferences about a population, given data from a sample. Inferential
statistics are used to test research hypotheses.
Sampling Distributions
Inferential statistics are based on the assumption of random sampling of cases from
populations—although this assumption is widely ignored. Even with random sampling,
however, sample characteristics are seldom identical to those of the population. Suppose
we had a population of 100,000 nursing home residents whose mean score on a physical
function (PF) test was 500 with an SD of 100. We do not know these parameters—
assume we must estimate them based on scores from a random sample of 100 residents.
It is unlikely that we would obtain a mean of exactly 500. Our sample mean might be,
say, 505. If we drew a new random sample of 100 residents, the mean PF score might
be 497. Sample statistics fluctuate and are unequal to the parameter because of sampling
error. Researchers need a way to assess whether sample statistics are good estimates of
population parameters.
To understand the logic of inferential statistics, we must perform a mental exercise.
Consider drawing 5,000 consecutive samples of 100 residents per sample from the
population of all residents. If we calculated a mean PF score each time, we could plot
the distribution of these sample means, as shown in Figure 14.6. This distribution is a
sampling distribution of the mean. A sampling distribution is theoretical: No one
actually draws consecutive samples from a population and plots their means.
Statisticians have shown that sampling distributions of means are normally distributed,
and their mean equals the population mean. In our example, the mean of the sampling
distribution is 500, the same as the population mean.
For a normally distributed sampling distribution of means, the probability is 95 out
of 100 that a sample mean lies between +2 SD and −2 SD of the population mean. The
SD of the sampling distribution—called the standard error of the mean (or SEM)—can
be estimated using a formula that uses two pieces of information: the SD for the sample
and sample size. In our example, the SEM is 10 (Fig. 14.6), which is the estimate of
how much sampling error there would be from one sample mean to another in an
infinite number of samples of 100 residents.
We can now estimate the probability of drawing a sample with a certain mean. With
a sample size of 100 and a population mean of 500, the chances are 95 out of 100 that a
sample mean would fall between 480 and 520—2 SDs above and below the mean. Only
5 times out of 100 would the mean of a random sample of 100 residents be greater than
520 or less than 480.
The SEM is partly a function of sample size, so an increased sample size improves
the accuracy of the estimate. If we used a sample of 400 residents to estimate the
population mean, the SEM would be only 5. The probability would be 95 in 100 that a
sample mean would be between 490 and 510. The chance of drawing a sample with a
mean very different from that of the population is reduced as sample size increases.
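The SEM formula itself is simple: the sample SD divided by the square root of the sample size. A quick check of the values in this example:

```python
import math

def sem(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

print(sem(100, 100))   # 10.0 for a sample of 100 residents
print(sem(100, 400))   # 5.0 for a sample of 400 residents
```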
You may wonder why you need to learn about these abstract statistical notions.
Consider, though, that we are talking about the accuracy of researchers’ results. As an
intelligent consumer, you need to evaluate critically how believable research evidence is
so that you can decide whether to incorporate it into your nursing practice.
Parameter Estimation
Statistical inference consists of two techniques: parameter estimation and hypothesis
testing. Parameter estimation is used to estimate a population parameter—for
example, a mean, a proportion, or a difference in means between two groups (e.g.,
smokers vs. nonsmokers). Point estimation involves calculating a single statistic to
estimate the parameter. In our example, if the mean PF score for a sample of 100
nursing home residents was 510, this would be the point estimate of the population
mean.
Point estimates convey no information about the estimate’s margin of error. Interval
estimation of a parameter provides a range of values within which the parameter has a
specified probability of lying. With interval estimation, researchers construct a
confidence interval (CI) around the point estimate. The CI around a sample mean
establishes a range of values for the population value and the probability of being right.
By convention, researchers use either a 95% or a 99% CI.
TIP CIs address a key EBP question for appraising evidence, as presented
in Box 2.1: How precise is the estimate of effects?
As noted previously, 95% of the scores in a normal distribution lie within about 2
SDs (more precisely, 1.96 SDs) from the mean. In our example, if the point estimate for
mean scores is 510 with an SD = 100, the SEM for a sample of 100 would be 10. We
can build a 95% CI using this formula: 95% CI = (X̄ ± 1.96 × SEM). The confidence is
95% that the population mean lies between the values equal to 1.96 times the SEM,
above and below the sample mean. In our example, with an SEM of 10, the 95% CI
around the sample mean of 510 is between 490.4 and 529.6.
CIs reflect how much risk of being wrong researchers take. With a 95% CI,
researchers risk being wrong 5 times out of 100. A 99% CI sets the risk at only 1% by
allowing a wider range of possible values. In our example, the 99% CI around 510 is
484.2 to 535.8. With a lower risk of being wrong, precision is reduced. For a 95%
interval, the CI range is about 39 points; for a 99% interval, the range is about 52 points.
The acceptable risk of error depends on the nature of the problem, but for most studies,
a 95% CI is sufficient.
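The CI arithmetic can be reproduced directly, using the multipliers from the normal distribution (1.96 for a 95% CI, 2.58 for a 99% CI):

```python
import math

mean, sd, n = 510, 100, 100
sem = sd / math.sqrt(n)   # 10.0

ci_95 = (round(mean - 1.96 * sem, 1), round(mean + 1.96 * sem, 1))
ci_99 = (round(mean - 2.58 * sem, 1), round(mean + 2.58 * sem, 1))

print(ci_95)   # (490.4, 529.6)
print(ci_99)   # (484.2, 535.8)
```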
Example of confidence intervals around odds ratio
Steindal and colleagues (2015) compared analgesics administered in the last 3 days of
life to young-old patients (aged 65 to 84 years) and oldest old patients (aged 85 years
and older). The young old were more than 3 times more likely than the oldest old to
receive paracetamol with codeine (OR = 3.25, 95% CI [1.02, 10.40]).
Hypothesis Testing
With statistical hypothesis testing, researchers use objective criteria to decide whether
hypotheses should be accepted or rejected. Suppose we hypothesized that maternity
patients who received online interactive breastfeeding support would breastfeed longer
than mothers who did not. The mean number of days of breastfeeding is 131.5 for 25
intervention group mothers and 125.1 for 25 control group mothers. Should we
conclude that our hypothesis has been supported? Group differences are in the predicted
direction, but in another sample, the group means might be more similar. Two
explanations for the observed outcome are possible: (1) The intervention was effective
in encouraging breastfeeding or (2) the mean difference in this sample was due to
chance (sampling error).
The first explanation is the research hypothesis, and the second is the null
hypothesis, which is that there is no relationship between the independent variable (the
intervention) and the dependent variable (breastfeeding duration). Statistical hypothesis
testing is a process of disproof. It cannot be demonstrated directly that the research
hypothesis is correct. But it is possible to show that the null hypothesis has a high
probability of being incorrect, and such evidence lends support to the research
hypothesis. Hypothesis testing helps researchers to make objective decisions about
whether results are likely to reflect chance differences or hypothesized effects.
Researchers use statistical tests in the hopes of rejecting the null hypothesis.
Null hypotheses are accepted or rejected based on sample data, but hypotheses are
about population values. The interest in testing hypotheses, as in all statistical inference,
is to use a sample to make inferences about a population.
Type I and Type II Errors
Researchers decide whether to accept or reject the null hypothesis by estimating how
probable it is that observed group differences are due to chance. Without population
data, it cannot be asserted that the null hypothesis is or is not true. Researchers must be
content to say that hypotheses are either probably true or probably false.
Researchers can make two types of error: rejecting a true null hypothesis or
accepting a false null hypothesis. Figure 14.7 summarizes possible outcomes of
researchers’ decisions. Researchers make a Type I error by rejecting a null hypothesis
that is, in fact, true. For instance, if we decided that online support effectively promoted
breastfeeding when, in fact, group differences were merely due to sampling error, we
would be making a Type I error—a false-positive conclusion. If we decided that
differences in breastfeeding were due to sampling fluctuations, when the intervention
actually did have an effect, we would be making a Type II error—a false-negative
conclusion.
Level of Significance
Researchers do not know when they have made an error in statistical decision making.
However, they control the risk for a Type I error by selecting a level of significance,
which is the probability of making a Type I error. The two most frequently used levels
of significance (referred to as alpha or α) are .05 and .01. With a .05 significance level,
we accept the risk that out of 100 samples from a population, a true null hypothesis
would be wrongly rejected 5 times. In 95 out of 100 cases, however, a true null
hypothesis would be correctly accepted. With a .01 significance level, the risk of a Type
I error is lower: In only 1 sample out of 100 would we wrongly reject the null. By
convention, the minimal acceptable alpha level is .05.
TIP Levels of significance are analogous to the CI values described earlier
—an alpha of .05 is analogous to the 95% CI, and an alpha of .01 is
analogous to the 99% CI.
Researchers would like to reduce the risk of committing both types of error, but
unfortunately, lowering the risk of a Type I error increases the risk of a Type II error.
Researchers can reduce the risk of a Type II error, however, by increasing the sample
size. The probability of committing a Type II error can be estimated through power
analysis, the procedure we mentioned in Chapter 10 with regard to sample size. Power
is the ability of a statistical test to detect true relationships. Researchers ideally use a
sample size that gives them a minimum power of .80 and thus a risk for a Type II error
of no more than .20 (i.e., a 20% risk).
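In practice, power analyses are run with software. As an illustration (not a procedure covered in this chapter), the statsmodels library in Python can estimate the per-group sample size needed for a two-group mean comparison, here assuming a moderate effect size of .50:

```python
from statsmodels.stats.power import TTestIndPower

# Per-group n for 80% power and alpha = .05, assuming a moderate effect (d = .50)
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_per_group))   # roughly 64 participants per group
```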
TIP If a report indicates that a research hypothesis was not supported by
the data, consider whether a Type II error might have occurred as a result of
an inadequate sample size.
Tests of Statistical Significance
In hypothesis testing, researchers use study data to compute a test statistic. For every
test statistic, there is a theoretical sampling distribution, similar to the sampling
distribution of means. Hypothesis testing uses theoretical distributions to establish
probable and improbable values for the test statistics, which are used to accept or reject
the null hypothesis.
An example can illustrate this process. In our example of a physical functioning test
for nursing home residents, suppose that there are population norms, which are values
derived from large, representative samples. Let us assume that in the sampling
distribution for the norming data, the mean is 500 with an SEM of 10, as in Fig. 14.6.
Now let us say we recruited 100 nursing home residents to participate in an intervention
to improve physical functioning. The null hypothesis is that those receiving the
intervention have mean posttest scores that are not different from those in the overall
population—i.e., 500—but the research hypothesis is that they will have higher scores.
After the intervention, the mean PF score for the intervention group is 528. Now let us
suppose that Figure 14.6 shows the sampling distribution for this example, for a
population mean of 500 with an SEM of 10. As we can see, a mean score of 528 is more
than 2 SDs above the population mean—it is a value that is improbable if the null
hypothesis is true. Thus, we accept the research hypothesis that the intervention resulted
in higher physical functioning scores than those in the population.†
We would not be justified in saying that we had proved the research hypothesis
because the possibility of a Type I error remains—but the possibility is less than 5 in
100. Researchers reporting the results of hypothesis tests state whether their findings are
statistically significant.
The word significant does not mean important or meaningful. In statistics, the term
significant means that results are not likely to have been due to chance at some specified
level of probability. A nonsignificant result (NS) means that any observed difference
or relationship could have been the result of chance.
Overview of Hypothesis Testing Procedures
In the next section, a few statistical tests are discussed. We emphasize applications and
interpretations of statistical tests, not computations. Each statistical test can be used with
specific kinds of data, but the overall hypothesis testing process is similar for all tests:
1. Select a statistical test. Researchers select a test based on factors such as the
variables’ levels of measurement.
2. Specify the level of significance. An α level of .05 is usually chosen.
3. Compute a test statistic. The value for a test statistic is calculated with study data.
4. Determine degrees of freedom. The term degrees of freedom (df ) refers to the
number of observations free to vary about a parameter. The concept can be
confusing, but computing degrees of freedom is easy.
5. Compare the test statistic to a theoretical value. Theoretical distributions exist for all
test statistics. The computed value of the test statistic is compared to a theoretical
value to establish significance or nonsignificance.
When a computer is used for the analysis, as is almost always the case, researchers
follow only the first step. The computer calculates the test statistic, degrees of freedom,
and the actual probability that the relationship being tested is due to chance. For
example, the printout may indicate that the probability (p) of an intervention group
having a higher mean number of days of breastfeeding than a control group on the basis
of chance alone is .025. This means that fewer than 3 times out of 100 (only 25 times
out of 1,000) would a group difference of the size observed occur by chance. The
computed p value is then compared with the desired alpha. In this example, if we had
set the significance level to .05, the results would be significant because .025 is smaller
than .05. Any computed probability greater than .05 (e.g., .15) indicates a
nonsignificant relationship (i.e., one that could have occurred on the basis of chance in
more than 5 out of 100 samples).
TIP Most tests discussed in this chapter are parametric tests, which are
ones that focus on population parameters and involve certain assumptions
about variables in the analysis, notably the assumption that they are normally
distributed in the population. Nonparametric tests, by contrast, do not
estimate parameters and involve less restrictive assumptions about the
distribution’s shape.
BIVARIATE STATISTICAL TESTS
Researchers use a variety of statistical tests to make inferences about their hypotheses.
Several frequently used bivariate tests are briefly described and illustrated.
t-Tests
Researchers frequently compare two groups of people on an outcome. A parametric test
for testing differences in two group means is called a t-test.
Suppose we wanted to test the effect of early discharge of maternity patients on
perceived maternal competence. We administer a scale of perceived maternal
competence at discharge to 20 primiparas who had a vaginal delivery: 10 who remained
in the hospital 25 to 48 hours (regular discharge group) and 10 who were discharged 24
hours or less after delivery (early discharge group). Data for this example are presented
in Table 14.6. Mean scores for these two groups are 25.0 and 19.0, respectively. Are
these differences real (i.e., do they exist in the population of early and regular discharge
mothers?), or do group differences reflect chance fluctuations? The 20 scores vary from
one mother to another, ranging from a low of 13 to a high of 30. Some variation reflects
individual differences in maternal competence, some might result from participants’
moods on a particular day, and so forth. The research question is whether a significant
amount of the variation is associated with the independent variable—time of hospital
discharge. The t-test allows us to make inferences about this question objectively.
The formula for calculating the t statistic uses group means, variability, and sample
size. The computed value of t for the data in Table 14.6 is 2.86. Degrees of freedom
here is the total sample size minus 2 (df = 20 − 2 = 18). For an α level of .05, the cutoff
value for t with 18 degrees of freedom is 2.10. This value is the upper limit to what is
probable if the null hypothesis is true. Thus, the calculated t of 2.86, which is larger
than the theoretical value of t, is improbable (i.e., statistically significant). The
primiparas discharged early had significantly lower perceived maternal competence
than those who were not discharged early. In fewer than 5 out of 100 samples would a
difference in means this large be found by chance. In fact, the actual p value is .011:
Only in about 1 sample out of 100 would this size difference be found by chance.
The situation we just described requires an independent groups t-test: Mothers in the
two groups were different people, independent of each other. There are situations for
which this type of t-test is not appropriate. For example, if means for a single group of
people measured before and after an intervention were being compared, researchers
would compute a paired t-test (also called a dependent groups t-test), using a different
formula.
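Statistical software performs both forms of the t-test directly. A minimal sketch with SciPy, using hypothetical scores rather than the actual values in Table 14.6:

```python
from scipy.stats import ttest_ind, ttest_rel

# Hypothetical maternal competence scores for two independent groups
regular = [25, 28, 22, 27, 24, 26, 23, 25, 27, 23]
early = [19, 17, 21, 18, 20, 16, 22, 19, 18, 20]

t, p = ttest_ind(regular, early)   # independent groups t-test
print(round(t, 2), round(p, 3))

# Hypothetical pretest/posttest scores for a single group: paired t-test
pre = [20, 22, 19, 24, 21]
post = [23, 25, 20, 27, 24]
t_paired, p_paired = ttest_rel(pre, post)
print(round(t_paired, 2), round(p_paired, 3))
```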
Example of t-tests
Najafi Ghezeljeh and colleagues (2016) tested the effects of a music intervention on
pain and anxiety in burn patients. They used independent groups t-tests to compare
the pain and anxiety scores of those in the music intervention versus those in the
control group, and they also used paired t-tests to assess differences before and after
the intervention within each group.
In lieu of t-tests, CIs can be constructed around the difference between two means.
In the example in Table 14.6, we can construct CIs around the mean difference of 6.0 in
maternal competence scores (25.0 − 19.0 = 6.0). For a 95% CI, the confidence limits are
1.6 and 10.4: We can be 95% confident that the difference between population means
for early and regular discharge mothers lies between these values. With CI information,
we can also see that the mean difference is significant at p < .05 because the range does
not include 0. There is a 95% probability that the mean difference is not lower than 1.6,
so this means that there is less than a 5% probability that there is no difference at all—
thus, the null hypothesis can be rejected.
Analysis of Variance
Analysis of variance (ANOVA) is used to test mean group differences of three or more
groups. ANOVA sorts out the variability of an outcome variable into two components:
variability due to the independent variable (e.g., experimental group status) and
variability due to all other sources (e.g., individual differences). Variation between
groups is contrasted with variation within groups to yield an F ratio statistic.
Suppose we were comparing the effectiveness of interventions to help people stop
smoking. Group A smokers receive nurse counseling, Group B smokers receive a
nicotine patch, and a control group (Group C) gets no intervention. The outcome is 1-
day cigarette consumption 1 month after the intervention. Thirty smokers are randomly
assigned to one of the three groups. The null hypothesis is that the population means for
posttreatment cigarette smoking are the same for all three groups, and the research
hypothesis is inequality of means. Table 14.7 presents fictitious data for the 30
participants. The mean numbers of posttreatment cigarettes consumed are 16.6, 19.2,
and 34.0 for groups A, B, and C, respectively. These means are different, but are they
significantly different—or do differences reflect random fluctuations?
An ANOVA applied to these data yields an F ratio of 4.98. For α = .05 and df = 2
and 27 (2 df between groups and 27 df within groups), the theoretical F value is 3.35.
Because our obtained F value of 4.98 exceeds 3.35, we reject the null hypothesis that
the population means are equal. The actual probability, as calculated by a computer, is
.014. In only 14 samples out of 1,000 would group differences this great be obtained by
chance alone.
ANOVA results support the hypothesis that different treatments were associated
with different cigarette smoking, but we cannot tell from these results whether treatment
A was significantly more effective than treatment B. Statistical analyses known as post
hoc tests (or multiple comparison procedures) are used to isolate the differences
between group means that resulted in the rejection of the overall null hypothesis.
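A one-way ANOVA can be run with a single function call. A sketch with SciPy, using hypothetical cigarette counts for three groups (a post hoc procedure such as Tukey's test would then isolate which pairs of means differ):

```python
from scipy.stats import f_oneway

# Hypothetical posttreatment daily cigarette counts
group_a = [10, 15, 20, 12, 18, 22, 14, 19, 16, 20]   # nurse counseling
group_b = [18, 22, 16, 24, 20, 17, 21, 19, 15, 20]   # nicotine patch
group_c = [30, 35, 28, 40, 32, 36, 31, 38, 34, 36]   # control (no intervention)

# F ratio contrasts between-group with within-group variation
f_ratio, p = f_oneway(group_a, group_b, group_c)
print(round(f_ratio, 2), round(p, 4))
```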
A type of ANOVA known as repeated measures ANOVA (RM-ANOVA) can be
used when the means being compared are means at different points in time (e.g., mean
blood pressure at 2, 4, and 6 hours after surgery). This is analogous to a paired t-test,
extended to three or more points of data collection. When two or more groups are
measured several times, an RM-ANOVA provides information about a main effect for
time (Do the measures change significantly over time, irrespective of group?), a main
effect for groups (Do the group means differ significantly, irrespective of time?), and an
interaction effect (Do the groups differ more at certain times?).
Example of an ANOVA
In a cross-sectional study, Lester and coresearchers (2015) studied distress levels
among women who survived breast cancer, selected to represent four time periods in
the cancer trajectory. One-way ANOVA was used to compare the four groups in
terms of scores on a stress scale. Significant differences were found between groups
(F[3, 96] = 5.3, p = .002). Stress levels were lowest among women who were 6
months posttreatment, compared to women who had been treated more recently.
Chi-Squared Test
The chi-squared (χ2) test is used to test hypotheses about differences in proportions, as
in a crosstab. For example, suppose we were studying the effect of nursing instruction
on patients’ compliance with self-medication. Nurses implement a new instructional
strategy with 50 patients, whereas 50 control group patients get usual care. The research
hypothesis is that a higher proportion of people in the intervention than in the control
condition will be compliant. Some fictitious data for this example are presented in Table
14.8, which shows that 60% of those in the intervention group were compliant,
compared to 40% in the control group. But is this 20 percentage point difference
statistically significant—i.e., likely to be “real”?
The value of the χ2 statistic for the data in Table 14.8 is 4.00, which we can compare
with the value from a theoretical chi-squared distribution. In this example, the
theoretical value that must be exceeded to establish significance at the .05 level is 3.84.
The obtained value of 4.00 is larger than would be expected by chance (the actual p =
.046). We can conclude that a significantly larger proportion of experimental patients
than control patients were compliant.
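With software, the chi-squared test is computed from the cell counts of the crosstabs table. The sketch below, assuming SciPy, uses the counts implied by Table 14.8 (50 patients per group); setting correction=False reproduces the uncorrected χ2 of 4.00 given in the text.

```python
from scipy.stats import chi2_contingency

#         compliant  noncompliant
table = [[30, 20],   # intervention group (60% compliant)
         [20, 30]]   # control group (40% compliant)

chi2, p, df, expected = chi2_contingency(table, correction=False)
print(round(chi2, 2), round(p, 3), df)   # 4.0 0.046 1
```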
Example of chi-squared test
Zou and coresearchers (2016) undertook a randomized controlled trial to assess
whether sweet potato alleviates constipation in leukemia patients undergoing
chemotherapy. They used chi-squared tests to study group differences on several
outcomes. For example, a higher percentage of patients in the intervention group
(82.5%) than in the control group (52.4%) first defecated within 24 hours of
chemotherapy initiation (χ2 = 12.2, df = 1, p < .001).
As with means, we can construct CIs around the difference between two
proportions. In our example, the group difference in proportion compliant was .20 (.60
− .40 = .20). The 95% CI around .20 is .06 to .34. We can be 95% confident that the true
population difference in compliance rates between the groups is between 6% and 34%.
This interval does not include 0%, so we can be 95% confident that group differences
are “real” in the population.
Correlation Coefficients
Pearson’s r is both descriptive and inferential. As a descriptive statistic, r summarizes
the magnitude and direction of a relationship between two variables. As an inferential
statistic, r tests hypotheses about population correlations; the null hypothesis is that
there is no relationship between two variables, i.e., that the population r = .00.
Suppose we were studying the relationship between patients’ self-reported level of
stress (higher scores indicate more stress) and the pH level of their saliva. With a
sample of 50 patients, we find that r = −.29. This value indicates a tendency for people
with high stress to have lower pH levels than those with low stress. But is the r of −.29 a
random fluctuation observed only in this sample, or is the relationship significant?
Degrees of freedom for correlation coefficients equal N minus 2—48 in this example.
The theoretical value for r with df = 48 and α = .05 is .28. Because the absolute value of
the calculated r (.29) exceeds this theoretical value, the null hypothesis is rejected: The
relationship between patients’ stress level and the acidity of their saliva is statistically significant.
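The significance test for r can be reproduced from the values just given. A sketch (assuming SciPy) converts r = −.29 with N = 50 into a t statistic and a two-tailed p value, using the standard conversion t = r√(df / (1 − r²)):

```python
import math
from scipy.stats import t as t_dist

r, n = -0.29, 50
df = n - 2                                # 48 degrees of freedom
t_stat = r * math.sqrt(df / (1 - r**2))   # t statistic for testing r = 0
p = 2 * t_dist.sf(abs(t_stat), df)        # two-tailed p value

print(round(t_stat, 2), round(p, 3))      # p just under .05
```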
Example of Pearson’s r
Lewis and Cunningham (2016) studied nurses’ perceptions of nurse leadership in
relation to nurse burnout and engagement in a sample of 120 working nurses. Many
correlations were statistically significant. For example, scores on a burnout scale
were negatively correlated with perceptions of transformational leadership (r = −.54,
p < .05).
Effect Size Indexes
Effect size indexes are estimates of the magnitude of effects of an “I” component on an
“O” component in PICO questions—an important issue in EBP (see Box 2.1). Effect
size information can be crucial because, with large samples, even minuscule effects can
be statistically significant. P values tell you whether results are likely to be real, but
effect sizes suggest whether they are important. Effect size plays an important role in
meta-analyses.
It is beyond our scope to explain effect sizes in detail, but we offer an illustration. A
frequently used effect size index is the d statistic, which summarizes the magnitude of
differences in two means, such as the difference between intervention and control group
means on an outcome. Thus, d can be calculated to estimate effect size when t-tests are
used. When d is zero, it means that there is no effect—the means of the two groups
being compared are the same. By convention, a d of .20 or less is considered small, a d
of .50 is considered moderate, and a d of .80 or greater is considered large.
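The d statistic is the difference between two means divided by the pooled SD. A minimal sketch, assuming equal group sizes so that simple pooling applies:

```python
import math
import statistics

def cohens_d(group1, group2):
    """Difference in means divided by the pooled standard deviation."""
    var1 = statistics.variance(group1)
    var2 = statistics.variance(group2)
    pooled_sd = math.sqrt((var1 + var2) / 2)   # simple pooling for equal n's
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

# Hypothetical outcome scores for an intervention and a control group
print(round(cohens_d([24, 27, 25, 29, 26], [20, 22, 21, 23, 19]), 2))
```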
Different effect size indexes and interpretive conventions are associated with
different situations. For example, the r statistic can be interpreted directly as an effect
size index, as can the OR. The key point is that they encapsulate information about how
powerful the effect of an independent variable is on an outcome.
TIP Researchers who conduct a power analysis to estimate how big a
sample size they need to adequately test their hypotheses (i.e., to avoid a
Type II error) must estimate in advance how large the effect size will be—
usually based on prior research or a pilot study.
Example of calculated effect size
Hevezi (2015) conducted a pilot study of a meditation intervention to reduce the
stress associated with compassion fatigue among nurses, using a pretest–posttest
design and paired t-tests. Effect size indexes were also computed. For example,
scores on a burnout scale declined significantly after the intervention (t = 3.58, p =
.003), and the effect size was large: d = .92.
Guide to Bivariate Statistical Tests
The selection of a statistical test depends on several factors, such as number of groups
and the levels of measurement of the research variables. To aid you in evaluating the
appropriateness of statistical tests used by nurse researchers, Table 14.9 summarizes key
features of the bivariate tests mentioned in this chapter.
TIP Every time a report presents information about statistical tests such as
those described in this section, it means that the researcher was testing
hypotheses—whether those hypotheses were formally stated in the
introduction or not.
MULTIVARIATE STATISTICAL ANALYSIS
We wish we could avoid discussing complex statistical methods in this introductory-
level book. The fact is, however, that most quantitative nursing studies today rely on
multivariate statistics that involve the analysis of three or more variables
simultaneously. The increased use of sophisticated analytic methods has resulted in
greater rigor in nursing studies, but it can be challenging for those without statistical
training to fully understand research reports.
Given the introductory nature of this book and the fact that many of you are not
proficient with even basic statistical tests, we present only a brief description of three
widely used multivariate statistics. The supplement to this chapter on the book’s website
expands on this presentation.
Multiple Regression
Correlations enable researchers to make predictions. For example, if the correlation
between secondary school grades and nursing school grades were .60, nursing school
administrators could make predictions—albeit imperfect ones—about applicants’
performance in nursing school. Researchers can improve their prediction of an outcome
by performing a multiple regression in which several independent variables are
included in the analysis. As an example, we might predict infant birth weight (the
outcome) from such variables as mothers’ smoking, amount of prenatal care, and
gestational period. In multiple regression, outcome variables are continuous variables.
Independent variables (often called predictor variables in regression) are either
continuous variables or dichotomous nominal-level variables, such as male/female.
The statistic used in multiple regression is the multiple correlation coefficient,
symbolized as R. Unlike Pearson’s r, R does not have negative values. R varies from .00
to 1.00, showing the strength of the relationship between several predictors and an
outcome but not direction. Researchers can test whether R is statistically significant—
i.e., different from .00. R, when squared, can be interpreted as the proportion of the
variability in the outcome that is explained by the predictors. In predicting birth weight,
if we achieved an R of .50 (R2 = .25), we could say that the predictors accounted for one
fourth of the variation in birth weights. Three fourths of the variation, however, resulted
from factors not in the analysis. Researchers usually report multiple correlation results
in terms of R2 rather than R.
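Multiple regression is a standard routine in statistical software. The sketch below, with entirely hypothetical data and the scikit-learn library, shows how R2 is obtained for several predictors of birth weight at once:

```python
from sklearn.linear_model import LinearRegression

# Hypothetical predictors: cigarettes/day, prenatal visits, gestational weeks
X = [[0, 12, 40], [10, 8, 38], [20, 5, 36], [0, 14, 41],
     [15, 6, 37], [5, 10, 39], [25, 4, 35], [0, 11, 40]]
y = [3500, 3100, 2700, 3600, 2900, 3300, 2500, 3450]   # birth weight in grams

model = LinearRegression().fit(X, y)
r_squared = model.score(X, y)   # proportion of outcome variance explained
print(round(r_squared, 2))
```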
Example of multiple regression analysis
Bhandari and Kim (2015) explored factors that predicted health-promoting behaviors
among Nepalese migrant workers. In their multiple regression analysis, the
researchers found that age, gender, education, and perceived health were not
significant predictors of scores on a health-promoting behavior scale, but perceived
self-efficacy was. The overall R2 was modest (.06), but it was significant (p < .05).
Analysis of Covariance
Analysis of covariance (ANCOVA), which combines features of ANOVA and
multiple regression, is used to control confounding variables statistically—that is, to
“equalize” groups being compared. This approach is valuable in certain situations, such
as when a nonequivalent control group design is used. When control through
randomization is lacking, ANCOVA offers the possibility of statistical control.
In ANCOVA, the confounding variables being controlled are called covariates.
ANCOVA tests the significance of differences between group means on an outcome
after removing the effect of covariates. ANCOVA produces F statistics to test the
significance of group differences. ANCOVA is a powerful and useful analytic technique
for controlling confounding influences on outcomes.
Example of ANCOVA
Ham (2015) studied socioeconomic and behavioral characteristics associated with
metabolic syndrome among overweight and obese school-aged children. The
biomarkers included such outcomes as blood pressure, cholesterol measurements, and
waist circumference. In the ANCOVA, behavioral factors such as fast food
consumption and engaging in regular exercise were the independent variables, and
age and gender were the covariates.
Logistic Regression
Logistic regression analyzes the relationships between multiple independent variables
and a nominal-level outcome (e.g., compliant vs. noncompliant). It is similar to multiple
regression, although it employs a different statistical estimation procedure. Logistic
regression transforms the probability of an event occurring (e.g., that a woman will
practice breast self-examination) into its odds. After further transformations, the
analysis examines the relationship of the predictor variables to the transformed outcome
variable. For each predictor, the logistic regression yields an OR, which is the factor by
which the odds change for a unit change in that predictor, after controlling for the other
predictors. Logistic regression yields ORs for each predictor as well as CIs around the
ORs.
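In software, a logistic regression yields a coefficient for each predictor, and exponentiating a coefficient gives the OR. A sketch using the statsmodels library with hypothetical data:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: one predictor (a risk score) and a binary outcome
x = np.array([1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7])
y = np.array([0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1])

X = sm.add_constant(x)                    # add an intercept term
result = sm.Logit(y, X).fit(disp=False)   # fit the logistic regression

print(np.exp(result.params))       # exponentiated coefficients = ORs
print(np.exp(result.conf_int()))   # CIs around the ORs
```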
Example of logistic regression
Miller and colleagues (2016) examined the extent to which Braden Scale scores and
other nutrition screening variables (e.g., body mass index, weight loss) predict the
development of pressure ulcers (PU) in hospitalized patients. The initial Braden Scale
score was a significant predictor of hospital-acquired PU in the first week of
hospitalization (OR = .64, p = .009). The results indicated that every 5-point increase
on the Braden Scale was associated with a 36% reduction in the odds of PU
development.
MEASUREMENT STATISTICS
In Chapter 10, we described two measurement properties that represent key aspects of
measurement quality—reliability and validity. When a new measure is developed,
researchers undertake a psychometric assessment to estimate its reliability and validity.
Such psychometric assessments rely on statistical analyses, using indexes that we
briefly describe here. Researchers often report measurement statistics when they
describe the measures they opted to use, to provide evidence that their data can be
trusted.
Reliability Assessment
Reliability, it may be recalled, is the extent to which scores on a measure are consistent
across repeated measurements if the trait itself has not changed. In Chapter 10, we
mentioned three major types of reliability, each of which relies on different statistical
indexes: test–retest reliability, interrater reliability, and internal consistency reliability.
Test–retest reliability, which concerns the stability of a measure, is assessed by
making two separate measurements of the same people, often 1 to 2 weeks apart, and
then testing the extent to which the two sets of scores are consistent. Some
researchers use Pearson’s r to correlate the scores at Time 1 with those at Time 2, but
the preferred index for test–retest reliability is the intraclass correlation coefficient
(ICC), which can range in value from .00 to 1.00.
Interrater reliability is used to assess the extent to which two independent raters or
observers assign the same score in measuring an attribute. When the ratings are
dichotomous classifications (e.g., presence vs. absence of infusion phlebitis), the
preferred index is Cohen’s kappa, whose values also range from .00 to 1.00. If the
ratings are continuous scores, the ICC is usually used.
Internal consistency reliability concerns the extent to which the various components of
a multicomponent measure (e.g., items on a psychosocial scale) are consistently
measuring the same attribute. Internal consistency, a widely reported aspect of
reliability, is estimated by an index called coefficient alpha (or Cronbach’s alpha).
If a psychosocial scale includes several subscales, coefficient alpha is usually
computed for each subscale separately.
For all of these reliability indexes, the closer the value is to 1.00, the stronger is the
evidence of good reliability. Although opinions about minimally acceptable values vary,
values of .80 or higher are usually considered good. Researchers try to select measures
with previously demonstrated high levels of reliability, but if they are using a multi-item
scale, they usually compute coefficient alpha with their own data as well.
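Coefficient alpha follows directly from its definition and is easy to compute. A sketch with numpy and a small hypothetical matrix of item scores:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for a score matrix (rows = people, columns = items)."""
    k = items.shape[1]                            # number of items
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of the item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical responses of five people to a 4-item scale
scores = np.array([[3, 4, 3, 4],
                   [2, 2, 3, 2],
                   [4, 5, 4, 5],
                   [3, 3, 3, 4],
                   [1, 2, 2, 1]])
print(round(cronbach_alpha(scores), 2))
```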
Validity Assessment
Validity is the measurement property that concerns the degree to which an instrument is
measuring what it purports to measure. Like reliability, validity has several aspects.
Unlike reliability, however, it is challenging to establish a measure’s validity.
Validation is a process of evidence building, and typically, multiple forms of evidence
are sought.
Content Validity
Content validity is relevant for composite measures, such as multi-item scales. The issue
is whether the content of the items adequately reflects the construct of interest. Content
validation usually relies on expert ratings of each item, and the ratings are used to
compute an index called the content validity index (CVI). A value of .90 or higher has
been suggested as providing evidence of good content validity.
Criterion Validity
Criterion validity concerns the extent to which scores on a measure are consistent with a
“gold standard” criterion. The methods used to assess criterion validity depend on the
level of measurement of the focal measure and the criterion.
When both the focal measure and the criterion are continuous, researchers
administer the two measures to a sample and then compute a Pearson’s r between the
two scores. Larger coefficients are desirable, but there is no threshold value that is
considered a minimum. Usually, statistical significance is the standard for concluding
that criterion validity is adequate.
If both the measure and the gold standard are dichotomous variables, researchers
often apply methods of assessing diagnostic accuracy. Sensitivity is the ability of a
measure to correctly identify a “case,” that is, to correctly screen in or diagnose a
condition. A measure’s sensitivity is its rate of yielding true positives. Specificity is the
measure’s ability to correctly identify noncases, that is, to screen out those without the
condition. Specificity is an instrument’s rate of yielding true negatives.
To assess an instrument’s sensitivity and specificity, researchers need a highly
reliable and valid criterion of “caseness” against which scores on the instrument can be
assessed. For example, if we wanted to test the validity of adolescents’ self-reports
about smoking (yes/no in past 24 hours), we could use urinary cotinine level, using a
cutoff value for a positive test of ≥200 ng/mL as the gold standard. Sensitivity would be
calculated as the proportion of teenagers who said they smoked and who had high
concentrations of cotinine, divided by all real smokers as indicated by the urine test.
Specificity would be the proportion of teenagers who accurately reported they did not
smoke, or the true negatives, divided by all real negatives. Both sensitivity and
specificity can range from .00 to 1.00. It is difficult to set standards of acceptability for
sensitivity and specificity, but both should be as high as possible.
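Both indexes follow directly from the four cells of the 2 × 2 classification table. A sketch with hypothetical counts for the smoking self-report example:

```python
# Hypothetical counts: self-report vs. urinary cotinine gold standard
true_pos = 40    # reported smoking, cotinine positive
false_neg = 10   # denied smoking, cotinine positive (missed cases)
true_neg = 45    # denied smoking, cotinine negative
false_pos = 5    # reported smoking, cotinine negative

sensitivity = true_pos / (true_pos + false_neg)   # true-positive rate: .80
specificity = true_neg / (true_neg + false_pos)   # true-negative rate: .90
print(sensitivity, specificity)
```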
When the focal measure is continuous and the gold standard is dichotomous,
researchers often use a statistical tool called a receiver operating characteristic (ROC)
curve. An ROC curve involves plotting each score on the focal measure against its
sensitivity and specificity for correct classification based on a dichotomous criterion. A
discussion of ROC curves is beyond the scope of this book, but interested readers can
consult Polit and Yang (2016).
Construct Validity
Construct validity concerns the extent to which a measure is truly measuring the target
construct and is often assessed using hypothesis testing procedures like those described
in previous sections of this chapter. For example, a researcher might hypothesize that
scores on a new measure (e.g., a scale of caregiver burden) would correlate with scores
on another established measure (e.g., a depression scale). Pearson’s r would be used to
test this hypothesis, and a significant correlation would provide some evidence of
construct validity. For known groups validity, which involves testing hypotheses about
expected group differences on a new measure, an independent groups t-test could be
used. Both bivariate and multivariate statistical tests are appropriate in assessments of a
new measure’s construct validity.
READING AND UNDERSTANDING STATISTICAL
INFORMATION
Measurement statistics are most likely to be presented in the methods section of a report
and are usually statistics reported previously by the instrument developer. Statistical
findings, however, are communicated in the results section. Statistical information is
described in the text and in tables (or, less frequently, in figures). This section offers
assistance in reading and interpreting statistical information.
Tips on Reading Text With Statistical Information
Both descriptive and inferential statistics are reported in results sections. Descriptive
statistics typically summarize sample characteristics. Information about the participants’
background helps readers to draw conclusions about the people to whom the findings
can be applied. Researchers may provide statistical information for evaluating biases.
For example, when a quasi-experimental or case-control design has been used,
researchers may test the equivalence of the groups being compared on baseline or
background variables, using tests such as t-tests.
For hypothesis testing, the text of research articles usually provides the following
information about statistical tests: (1) the test used, (2) the value of the calculated
statistic, (3) degrees of freedom, and (4) level of statistical significance. Examples of
how the results of various statistical tests might be reported in the text are shown in the following examples.
1. t-Test: t = 1.68, df = 160, p = .09
2. Chi-squared: χ2 = 16.65, df = 2, p < .001
3. Pearson’s r: r = .36, df = 100, p < .01
4. ANOVA: F = 0.18; df = 1, 69, ns
The preferred approach is to report the exact computed probability of obtaining the results by chance if the null hypothesis were true, as in Example 1. In this case, group mean differences as large as those observed would occur by chance in 9 out of 100 samples. This result is not statistically significant because the mean difference had an unacceptably high chance of being spurious. The probability level is sometimes reported simply as falling below or above certain thresholds (Examples 2 and 3). These results are significant because the probability of obtaining such results by chance is less than 1 in 100. You must be careful to read the symbol following the p value correctly: The symbol < means less than, and the symbol > means greater than—i.e., the results are not significant if the p value is .05 or greater. When results do not achieve statistical significance at the desired level, researchers may simply indicate that the results were not significant (ns), as in Example 4.
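For readers who want to see where such numbers come from, the following sketch computes and reports an independent groups t-test in the style of Example 1, using Python's scipy library with simulated scores; scipy offers analogous functions (e.g., chi2_contingency, pearsonr, f_oneway) for the other tests listed:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(50, 10, size=80)   # simulated scores for one group
group_b = rng.normal(53, 10, size=82)   # simulated scores for the other group

t, p = stats.ttest_ind(group_a, group_b)
df = len(group_a) + len(group_b) - 2    # degrees of freedom for an independent groups t-test
print(f"t = {t:.2f}, df = {df}, p = {p:.2f}")   # reported in the same style as Example 1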
Statistical information often is noted parenthetically in a sentence describing the
findings, as in “Patients in the intervention group had a significantly lower rate of
infection than those in the control group (χ2 = 5.41, df = 1, p = .02).” In reading research
reports, the actual values of the test statistics (e.g., χ2) are of no inherent interest. What
is important is whether the statistical tests indicate that the research hypotheses were
accepted as probably true (as demonstrated by significant results) or rejected as
probably false (as suggested by nonsignificant results).
Tips on Reading Statistical Tables
Tables allow researchers to condense a lot of statistical information and minimize
redundancy. Consider, for example, how cumbersome it would be to present dozens of correlation coefficients in the text.
Tables are efficient, but they may be daunting for novice readers partly because of
the absence of standardization. There is no universally accepted format for presenting t-
test results, for example. Thus, each table may present a new deciphering challenge.
We have a few suggestions for helping you to comprehend statistical tables. First,
read the text and the tables simultaneously—the text may help you figure out what the
table is communicating. Second, before trying to understand the numbers in a table, try
to glean information from the accompanying words. Table titles and footnotes often
present critical information. Table headings should be carefully scrutinized because they
indicate what the variables in the analysis are (often listed as row labels in the first
column, as in Table 14.10 on page 255) and what statistical information is included
(often specified as column headings). Third, you may find it helpful to consult the
glossary of symbols on the inside back cover of this book to check the meaning of a
statistical symbol. Not all symbols in this glossary were described in this chapter, so it
may be necessary to refer to a statistics textbook, such as that of Polit (2010), for further
information.
TIP In tables, probability levels associated with significance tests are
sometimes presented directly in the table, in a column labeled “p” (e.g., p =
.03). However, researchers sometimes indicate significance levels in tables
with asterisks placed next to the value of the test statistic. One asterisk
usually signifies p < .05, two asterisks signify p < .01, and three asterisks
signify p < .001 (there should be a key at the bottom of the table indicating
what the asterisks mean). Thus, a table might show t = 3.00 in one column
and p < .01 in another. Alternatively, the table might show t = 3.00**. The
absence of an asterisk would signify an NS result.
CRITIQUING QUANTITATIVE ANALYSES
It is often difficult to critique statistical analyses. We hope this chapter has helped to
demystify statistics, but we recognize the limited scope of our coverage. It would be
unreasonable to expect you to be adept at evaluating statistical analyses, but you can be
on the lookout for certain things in reviewing research articles. Some specific guidelines
are presented in Box 14.1.
Box 14.1 Guidelines for Critiquing Statistical Analyses
1. Did the descriptive statistics in the report sufficiently describe the major variables
and background characteristics of the sample? Were appropriate descriptive
statistics used—for example, was a mean presented when percentages would have
been more informative?
2. Were statistical analyses undertaken to assess threats to the study’s validity (e.g.,
to test for selection bias or attrition bias)?
3. Did the researchers report any inferential statistics? If inferential statistics were
not used, should they have been?
4. Was information provided about both hypothesis testing and parameter estimation
(i.e., confidence intervals)? Were effect sizes reported? Overall, did the reported
statistics provide readers with sufficient information about the study results?
5. Were any multivariate procedures used? If not, should they have been used—for
example, would the internal validity of the study be strengthened by statistically
controlling confounding variables?
6. Were the selected statistical tests appropriate, given the level of measurement of
the variables and the nature of the hypotheses?
7. Were the results of any statistical tests significant? What do the tests tell you
about the plausibility of the research hypotheses? Were effects sizeable?
8. Were the results of any statistical tests nonsignificant? Is it possible that these
reflect Type II errors? What factors might have undermined the study’s statistical
conclusion validity?
9. Was information about the reliability and validity of measures reported? Did the
researchers use measures with good measurement properties?
10. Was there an appropriate amount of statistical information? Were findings clearly
and logically organized? Were tables or figures used judiciously to summarize
large amounts of statistical information? Are the tables clear, with good titles and
row/column labels?
One aspect of the critique should focus on which analyses were reported. You
should assess whether the statistical information adequately describes the sample and
reports the results of statistical tests for all hypotheses. Another presentational issue
concerns the researcher’s judicious use of tables to summarize statistical information.
A thorough critique also addresses whether researchers used the appropriate
statistics. Table 14.9 provides guidelines for some frequently used bivariate statistical
tests. The major issues to consider are the number of independent and dependent
variables, the levels of measurement of the research variables, and the number of groups
(if any) being compared.
If researchers did not use a multivariate technique, you should consider whether the
bivariate analysis adequately tests the relationship between the independent and
dependent variables. For example, if a t-test or ANOVA was used, could the internal
validity of the study have been enhanced through the statistical control of confounding
variables, using ANCOVA? The answer will often be “yes.”
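If you are curious how such statistical control works in practice, an ANCOVA can be run as a regression model in which the covariate is entered alongside the group indicator. Here is a minimal sketch in Python with the statsmodels library; all of the data are simulated:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 120
baseline = rng.normal(50, 10, size=n)                      # confounding covariate
group = rng.integers(0, 2, size=n)                         # 0 = control, 1 = intervention
outcome = baseline + 3 * group + rng.normal(0, 5, size=n)  # simulated outcome

data = pd.DataFrame({"outcome": outcome, "group": group, "baseline": baseline})
# ANCOVA: the group effect is tested after statistically controlling the covariate
fit = smf.ols("outcome ~ C(group) + baseline", data=data).fit()
print(fit.params)    # covariate-adjusted group difference
print(fit.pvalues)   # significance of each term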
Finally, you can be alert to possible exaggerations or subjectivity in the reported
results. Researchers should never claim that the data proved, verified, confirmed, or
demonstrated that the hypotheses were correct or incorrect. Hypotheses should be
described as being supported or not supported, accepted or rejected.
The main task for beginning consumers in reading a results section of a research
report is to understand the meaning of the statistical tests. What do the quantitative
results indicate about the researcher’s hypothesis? How believable are the findings? The
answers to such questions form the basis for interpreting the research results, a topic
discussed in Chapter 15.
In this section, we provide details about the analysis in a nursing study,
followed by some questions to guide critical thinking. Read the summary and
then answer the critical thinking questions that follow, referring to the full
research report if necessary. Example 1 is featured on the interactive Critical
Thinking Activity on the companion website. The critical thinking questions for
Example 2 are based on the study that appears in its entirety in Appendix A of
this book. Our comments for these exercises are in the Student Resources
section on the companion website.
EXAMPLE 1: DESCRIPTIVE AND INFERENTIAL
STATISTICS
Study: Psychological characteristics and traits for finding benefit from
prostate cancer: Correlates and predictors (Pascoe & Edvardsson, 2015)
Statement of Purpose: The purpose of this study was to explore the
correlates and predictors of finding benefit from prostate cancer among men
undergoing androgen deprivation therapy (ADT).
Methods: The researchers used a descriptive correlational design. They
collected data from a sample of 209 men undergoing ADT in an acute tertiary
hospital outpatient setting in Australia. Study participants completed self-
report questionnaires that asked questions about demographic and clinical
characteristics. The questionnaire also included several psychological scales,
including scales to measure coping, anxiety, depression, and resilience. The
researchers noted that a theoretical model of the coping process led them to
select independent variables that comprise “psychological factors that may be
influential to fostering or maintaining positive emotional states, which
includes finding benefit” (p. 3). Participants completed the Benefit Finding
Scale, a 17-item scale that asks about potential benefits of having experienced
prostate cancer (e.g., “ . . . has helped me take things as they come”). The
researchers indicated that, in their sample of men, internal consistency for this
scale was strong (α = .96). Good internal consistency was also found for the
coping scale (α = .85), the anxiety scale (α = .85), the depression scale (α =
.79), and the resilience scale (α = .90).
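As an aside, coefficient alpha is easy to compute from item-level data. The following short Python function illustrates the calculation with a small hypothetical item matrix (not the researchers' data):

import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for a respondents-by-items matrix of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                           # number of items
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of the item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical 5-respondent, 3-item matrix
print(round(cronbach_alpha([[3, 4, 3], [2, 2, 1], [5, 4, 5], [1, 2, 2], [4, 5, 4]]), 2))  # 0.94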
Descriptive Statistics: The researchers presented descriptive statistics
(means, SDs, ranges, and percentages) to describe the characteristics of
sample members, in terms of both demographic characteristics and scores on
the psychological scales. Table 14.10 presents descriptive information for
selected variables. The men in the sample ranged in age from 53 to 92 years,
and their mean age was 72.0 years (±7.2). The typical participant was in a
relationship (76.6%) and retired (73.2%). Just over half of the men had
postsecondary education (53.1%). In terms of the participants’ scores on the
psychological scales, there was a good range of values, indicating adequate
variability. Scores on the Benefit Finding Scale ranged from 17 to 85, which
corresponds to the full range of possible scores.
Hypothesis Tests: The researchers used Pearson’s r to test hypotheses that
benefit finding for these men was correlated with various psychological
characteristics. Table 14.11 presents a correlation matrix that shows the
values of r for pairs of selected variables (the researchers’ correlation matrix
was more comprehensive). This table lists, on the left, six variables: Variable
1, scores on the Benefit Finding Scale (the dependent variable); Variable 2,
education level; Variable 3, scores on the coping scale; Variable 4, scores on
the depression scale; Variable 5, scores on the anxiety scale; and Variable 6,
age. The correlation matrix shows, in column 1, the correlation coefficient
between benefit finding scores and all other variables. At the intersection of
row 1–column 1, we find 1.00, which indicates that the scores are perfectly
correlated with themselves. The next entry in column 1 is the r between
benefit finding scores and education level. The value of .09 indicates a very
modest, positive relationship between these two variables—a relationship that
was not statistically significant and so could be zero. The strongest
correlation for the finding benefit scores was with scores on the coping scale,
r = .59, p < .01.
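A correlation matrix such as Table 14.11 can be generated in one step with statistical software. This sketch uses Python's pandas library and simulated variables that merely mimic the structure of the study's data:

import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
coping = rng.normal(size=209)
benefit = 0.6 * coping + rng.normal(size=209)  # built-in positive relationship
education = rng.normal(size=209)               # unrelated to the other variables

frame = pd.DataFrame({"benefit": benefit, "coping": coping, "education": education})
print(frame.corr(method="pearson"))            # symmetric matrix; 1.00 on the diagonal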
Multivariate Analyses: The researchers found that six of their independent
variables were significantly correlated with scores on the benefit finding
scale. These six variables were entered into a multiple regression analysis.
The R2 for these six predictor variables was .38, p < .001. These variables
explained 38% of the variance in finding benefit from prostate cancer. Self-
reported coping made the largest contribution, suggesting that helping
patients identify coping strategies might be valuable.
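To illustrate where an R2 value comes from, the following sketch fits a multiple regression with Python's statsmodels library; the six predictors and their coefficients are simulated, not taken from the study:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
X = rng.normal(size=(209, 6))                                   # six simulated predictors
y = X @ np.array([0.5, 0.3, 0.2, 0.2, 0.1, 0.1]) + rng.normal(size=209)

fit = sm.OLS(y, sm.add_constant(X)).fit()   # multiple regression with an intercept
print(f"R-squared = {fit.rsquared:.2f}")    # proportion of outcome variance explained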
Critical Thinking Exercises
1. Answer the relevant questions from Box 14.1 regarding this study.
2. Also consider the following targeted questions:
a. Using information from Table 14.11, with which variable was the men’s
educational level significantly correlated? What does the correlation
indicate?
b. What is the strongest correlation in Table 14.11? What is the weakest
correlation in this table? What do the correlations indicate?
3. What might be some of the uses to which the findings could be put in
clinical practice?
EXAMPLE 2: STATISTICAL ANALYSIS IN THE STUDY IN
APPENDIX A
• Read the results section of Swenson and colleagues’ (2016) study
(“Parents’ use of praise and criticism in a sample of young children
seeking mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 14.1 regarding this study.
2. Also consider the following targeted questions:
a. Looking at Table 1, what percentage of parents had graduated from
college? What was the mean score (and the SD) of the parents on the
CES-D depression scale?
b. In Table 2, what percentage of parents reported that they “almost never”
praised their child? And what percentage reported that they “almost
never” criticized their child?
c. In Table 4, what was the correlation coefficient between parents’ self-
reported use of criticism and their score on the depressive symptom
scale? Was this correlation statistically significant?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the companion website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Multivariate Statistics
• Answer to the Critical Thinking Exercise for Example 2
• Internet Resources with useful websites for Chapter 14
• A Wolters Kluwer journal article in its entirety—the Pascoe and
Edvardsson study described as Example 1 on pp. 254–256.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary
Points
There are four levels of measurement: (1) nominal measurement—the
classification of attributes into mutually exclusive categories, (2) ordinal
measurement—the ranking of people based on their relative standing on an
attribute, (3) interval measurement—indicating not only people’s rank order but
also the distance between them, and (4) ratio measurement—distinguished from
interval measurement by having a rational zero point. Interval- and ratio-level
measures are often called continuous.
Descriptive statistics are used to summarize and describe quantitative data.
In frequency distributions, numeric values are ordered from lowest to highest,
together with a count of the number (or percentage) of times each value was
obtained.
Data for a continuous variable can be completely described in terms of the shape
of the distribution, central tendency, and variability.
A distribution’s shape can be symmetric or skewed, with one tail longer than the
other; it can also be unimodal with one peak (i.e., one value of high frequency) or
multimodal with more than one peak. A normal distribution (bell-shaped curve)
is symmetric, unimodal, and not too peaked.
Indexes of central tendency represent average or typical value of a set of scores.
The mode is the value that occurs most frequently, the median is the point above
which and below which 50% of the cases fall, and the mean is the arithmetic
average of all scores. The mean is the most stable index of central tendency.
Indexes of variability—how spread out the data are—include the range and
standard deviation. The range is the distance between the highest and lowest
scores. The standard deviation (SD) indicates how much, on average, scores
deviate from the mean.
In a normal distribution, 95% of values lie within 2 SDs above and below the
mean.
A crosstabs table is a two-dimensional frequency distribution in which the
frequencies of two nominal- or ordinal-level variables are crosstabulated.
Correlation coefficients describe the direction and magnitude of a relationship
between two variables and range from −1.00 (perfect negative correlation)
through .00 to +1.00 (perfect positive correlation). The most frequently used
correlation coefficient is Pearson’s r, used with continuous variables.
Spearman’s rho is usually the correlation coefficient used when variables are
measured on an ordinal scale.
Statistical indexes that describe the effects of exposure to risk factors or
interventions provide useful information for clinical decisions. A widely reported
risk index is the odds ratio (OR), which is the ratio of the odds for an exposed
versus unexposed group, with the odds reflecting the proportion of people with an
adverse outcome relative to those without it.
Inferential statistics, based on laws of probability, allow researchers to make
inferences about population parameters based on data from a sample.
The sampling distribution of the mean is a theoretical distribution of the means of
an infinite number of same-sized samples drawn from a population. Sampling
distributions are the basis for inferential statistics.
The standard error of the mean (SEM)—the SD of this theoretical distribution—
indicates the degree of average error of a sample mean; the smaller the SEM, the
more accurate are estimates of the population value.
Statistical inference consists of two approaches: hypothesis testing and parameter
estimation (estimating a population value).
Point estimation provides a single value of a population estimate (e.g., a mean).
Interval estimation provides a range of values—a confidence interval (CI) —
between which the population value is expected to fall, at a specified probability.
Most often, the 95% CI is reported, which indicates that there is a 95% probability
that the true population value lies between the upper and lower confidence limits.
Hypothesis testing through statistical tests enables researchers to make objective
decisions about relationships between variables.
The null hypothesis is that no relationship exists between variables; rejection of the
null hypothesis lends support to the research hypothesis. In testing hypotheses,
researchers compute a test statistic and then see if the statistic falls beyond a
critical region on the theoretical distribution. The value of the test statistic
indicates whether the null hypothesis is “improbable.”
A Type I error occurs if a null hypothesis is wrongly rejected (false positives). A
Type II error occurs when a null hypothesis is wrongly accepted (false
negatives).
Researchers control the risk of making a Type I error by selecting a level of
significance (or alpha level), which is the probability that such an error will
occur. The .05 level (the conventional standard) means that in only 5 out of 100
samples would the null hypothesis be rejected when it should have been accepted.
The probability of committing a Type II error is related to power, the ability of a
statistical test to detect true relationships. The standard criterion for an acceptable
level of power is .80. Power increases as sample size increases.
Results from hypothesis tests are either significant or nonsignificant; statistically
significant means that the obtained results are not likely to be due to chance
fluctuations at a given probability (p value).
Two common statistical tests are the t-test and analysis of variance (ANOVA),
both of which can be used to test the significance of the difference between group
means; ANOVA is used when there are three or more groups. Repeated measures
ANOVA (RM-ANOVA) is used when data are collected at multiple time points.
The chi-squared test is used to test hypotheses about group differences in
proportions.
Pearson’s r can be used to test whether a correlation is significantly different from
zero.
Effect size indexes (such as the d statistic) summarize the strength of the effect of
an independent variable (e.g., an intervention) on an outcome variable.
Multivariate statistics are used in nursing research to untangle complex
relationships among three or more variables.
Multiple regression analysis is a method for understanding the effect of two or
more predictor (independent) variables on a continuous dependent variable. The
squared multiple correlation coefficient (R2) is an estimate of the proportion of
variability in the outcome variable accounted for by the predictors.
Analysis of covariance (ANCOVA) controls confounding variables (called
covariates) before testing whether group mean differences are statistically
significant.
Logistic regression is used in lieu of multiple regression when the outcome is
dichotomous.
Statistics are also used in psychometric assessments to quantify a measure’s
reliability and validity.
For test–retest reliability, the preferred index is the intraclass correlation
coefficient (ICC). Cohen’s kappa is used to estimate interrater reliability when
the ratings of two independent raters are dichotomous. The index used to estimate
internal consistency reliability is coefficient alpha. Reliability coefficients of .80
or higher are desirable.
In terms of content validity, expert ratings of scale items are used to compute a
content validity index (CVI).
Criterion validity is assessed with different statistical methods depending on the
measurement level of the focal measure and the criterion. When both are
dichotomous, sensitivity and specificity are usually calculated. Sensitivity is the
instrument’s ability to identify a case correctly (i.e., its rate of yielding true
positives). Specificity is the instrument’s ability to identify noncases correctly
(i.e., its rate of yielding true negatives).
Construct validity is evaluated using hypothesis testing procedures, so statistical
tests such as those described in this chapter (e.g., Pearson’s r, t-tests) are
appropriate.
REFERENCES FOR CHAPTER 14
*Awoleke, J., Adanikin, A., & Awoleke, A. (2015). Ruptured tubal pregnancy: Predictors of delays in seeking and
obtaining care in a Nigerian population. International Journal of Women’s Health, 27, 141–147.
Bhandari, P., & Kim, M. (2015). Predictors of the health-promoting behaviors of Nepalese migrant workers. The
Journal of Nursing Research, 24, 232–239.
Draughon Moret, J., Hauda, W., II, Price, B., & Sheridan, D. (2016). Nonoccupational postexposure human
immunodeficiency virus prophylaxis: Acceptance following sexual assault. Nursing Research, 65, 47–54.
Elder, B., Ammar, E., & Pile, D. (2016). Sleep duration, activity levels, and measures of obesity in adults. Public
Health Nursing, 33(3), 200–205.
*Grønning, K., Rannestad, T., Skomsvoll, J., Rygg, L., & Steinsbekk, A. (2014). Long-term effects of a nurse-led
group and individual patient education programme for patients with chronic inflammatory polyarthritis—a
randomised controlled trial. Journal of Clinical Nursing, 23, 1005–1017.
Ham, O. K. (2015). Socioeconomic and behavioral characteristics associated with metabolic syndrome among
overweight/obese school-age children. Journal of Cardiovascular Nursing. Advance online publication.
Hevezi, J. A. (2015). Evaluation of a meditation intervention to reduce the effects of stressors associated with
compassion fatigue among nurses. Journal of Holistic Nursing. Advance online publication.
*Lester, J., Crosthwaite, K., Stout, R., Jones, R., Holloman, C., Shapiro, C., & Andersen, B. (2015). Women with
breast cancer: Self-reported distress in early survivorship. Oncology Nursing Forum, 42, E17–E23.
Lewis, H. S., & Cunningham, C. (2016). Linking nurse leadership and work characteristics to nurse burnout and
engagement. Nursing Research, 65, 13–23.
Miller, N., Frankenfield, D., Lehman, E., Maguire, M., & Schirm, V. (2016). Predicting pressure ulcer
development in clinical practice: Evaluation of Braden Scale scores and nutrition parameters. Journal of Wound,
Ostomy, and Continence Nursing, 43, 133–139.
Najafi Ghezeljeh, T., Mohades Ardebilii, F., Rafii, F., & Haghani, H. (2016). The effects of music intervention on
background pain and anxiety in burn patients: Randomized controlled clinical trial. Journal of Burn Care &
Research, 37(4), 226–234.
**Pascoe, E. C., & Edvardsson, D. (2015). Psychological characteristics and traits for finding benefit from prostate
cancer: Correlates and predictors. Cancer Nursing. Advance online publication.
Polit, D. F. (2010). Statistics and data analysis for nursing research (2nd ed.). Upper Saddle River, NJ: Pearson.
Polit, D. F., & Yang, F. (2016). Measurement and the measurement of change. Philadelphia, PA: Wolters Kluwer.
Steindal, S., Bredal, I., Ranhoff, A., Sørbye, L., & Lerdal, A. (2015). The last three days of life: A comparison of
pain management in the young old and the oldest old hospitalised patients using the Resident Assessment
Instrument for Palliative Care. International Journal of Older People Nursing, 10, 263–272.
Zou, J., Xu, Y., Wang, X., Jiang, Q., & Zhu, X. (2016). Improvement of constipation in leukemia patients
undergoing chemotherapy using sweet potato. Cancer Nursing, 39, 181–186.
*A link to this open-access article is provided in the Internet Resources section on the companion website.
**This journal article is available on the companion website for this chapter.
*Formulas for computing the SD and other statistics discussed in this chapter are not
shown in this textbook. The emphasis here is on helping you to understand statistical
applications. Polit (2010) can be consulted for computation.
†The design for our fictitious example is highly flawed, with several serious threats to
internal validity. We used this contrived example purely as a simple way to illustrate
hypothesis testing.
15 Interpretation and Clinical
Significance in Quantitative
Research
Learning Objectives
On completing this chapter, you will be able to:
Describe dimensions for interpreting quantitative research results
Describe the mindset conducive to a critical interpretation of research results
Identify approaches to an assessment of the credibility of quantitative results, and
undertake such an assessment
Distinguish statistical and clinical significance
Identify some methods of drawing conclusions about clinical significance at the group
and individual levels
Critique researchers’ interpretation of their results in a discussion section of a report
Define new terms in the chapter
Key Terms
Benchmark
Change score
Clinical significance
CONSORT guidelines
Minimal important change (MIC)
Results
In this chapter, we consider approaches to interpreting researchers’ statistical results,
which requires consideration of the various theoretical, methodological, and practical
decisions that researchers make in undertaking a study. We also discuss an important but often overlooked topic: clinical significance.
INTERPRETATION OF QUANTITATIVE RESULTS
Statistical results are summarized in the “Results” section of a research article.
Researchers present their interpretations of the results in the “Discussion” section.
Researchers are seldom totally objective, though, so you should develop your own
interpretations.
Aspects of Interpretation
Interpreting study results involves attending to six different but overlapping
considerations, which intersect with the “Questions for Appraising the Evidence”
presented in Box 2.1:
The credibility and accuracy of the results
The precision of the estimate of effects
The magnitude of effects and importance of the results
The meaning of the results, especially with regard to causality
The generalizability of the results
The implications of the results for nursing practice, theory development, or further
research
Before discussing these considerations, we want to remind you about the role of
inference in research thinking and interpretation.
Inference and Interpretation
An inference involves drawing conclusions based on limited information, using logical
reasoning. Interpreting research findings entails making multiple inferences. In research,
virtually everything is a “stand-in” for something else. A sample is a stand-in for a
population, a scale score is a proxy for the magnitude of an abstract attribute, and so on.
Research findings are meant to reflect “truth in the real world”—the findings are
“stand-ins” for the true state of affairs (Fig. 15.1). Inferences about the real world are
valid to the extent that the researchers have made good decisions in selecting proxies
and have controlled sources of bias. This chapter offers several vantage points for
assessing whether study findings really do reflect “truth in the real world.”
The Interpretive Mindset
Evidence-based practice (EBP) involves integrating research evidence into clinical
decision making. EBP encourages clinicians to think critically about clinical practice
and to challenge the status quo when it conflicts with “best evidence.” Thinking
critically and demanding evidence are also part of a research interpreter’s job. Just as
clinicians should ask, “What evidence is there that this intervention will be beneficial?”
so must interpreters ask, "What evidence is there that the results are real and true?"
To be a good interpreter of research results, you can profit by starting with a
skeptical (“show me”) attitude and a null hypothesis. The null hypothesis in
interpretation is that the results are wrong and the evidence is flawed. The “research
hypothesis” is that the evidence reflects the truth. Interpreters decide whether the null
hypothesis has merit by critically examining methodologic evidence. The greater the
evidence that the researcher’s design and methods were sound, the less plausible is the
null hypothesis that the evidence is inaccurate.
CREDIBILITY OF QUANTITATIVE RESULTS
A critical interpretive task is to assess whether the results are right. This corresponds to
the first question in Box 2.1: “What is the quality of the evidence—i.e., how rigorous
and reliable is it?” If the results are not judged to be credible, the remaining interpretive
issues (the meaning, magnitude, precision, generalizability, and implications of results)
are unlikely to be relevant.
A credibility assessment requires a careful analysis of the study’s methodologic and
conceptual limitations and strengths. To come to a conclusion about whether the results
closely approximate “truth in the real world,” each aspect of the study—its design,
sampling plan, data collection, and analyses—must be subjected to critical scrutiny.
There are various ways to approach the issue of credibility, including the use of the
critiquing guidelines we have offered throughout this book and the overall critiquing
protocol presented in Table 4.1. We share some additional perspectives in this section.
Proxies and Interpretation
Researchers begin with constructs and then devise ways to operationalize them. The
constructs are linked to actual research strategies in a series of approximations; the
better the proxies, the more credible the results are likely to be. In this section, we
illustrate successive proxies using sampling concepts to highlight the potential for
inferential challenges.
When researchers formulate research questions, the population of interest is often
abstract. For example, suppose we wanted to test the effectiveness of an intervention to
increase physical activity in low-income women. Figure 15.2 shows the series of steps
between the abstract population construct (low-income women) and actual study
participants. Using data from the actual sample on the far right, the researcher would
like to make inferences about the effectiveness of the intervention for a broader group,
but each proxy along the way represents a potential problem for achieving the desired
inference. In interpreting a study, readers must consider how plausible it is that the
actual sample reflects the recruited sample, the accessible population, the target
population, and the population construct.
Table 15.1 presents a description of a hypothetical scenario in which the researchers
moved from the population construct (low-income women) to a sample of 161
participants (recent welfare recipients from two neighborhoods in Los Angeles). The
table identifies questions that could be asked in drawing inferences about the study
results. Answers to these questions would affect the interpretation of whether the
intervention really is effective with low-income women—or only with recent welfare
recipients in Los Angeles who were cooperative with the research team.
Researchers make methodologic decisions that affect inferences, and these decisions
must be scrutinized. However, prospective participants’ behavior also needs to be
considered. In our example, 300 women were recruited for the study, but only 161
provided data. The final sample of 161 almost surely would differ in important ways
from the 139 who declined, and these differences affect the study evidence.
Fortunately, researchers are increasingly documenting participant flow in their
studies—especially in intervention studies. Guidelines called the Consolidated
Standards of Reporting Trials or CONSORT guidelines have been adopted by major
medical and nursing journals to help readers track study participants. CONSORT flow
charts, when available, should be scrutinized in interpreting study results. Figure 15.3
provides an example of such a flowchart for a randomized controlled trial (RCT). The
chart shows that 295 people were assessed for eligibility, but 95 either did not meet
eligibility criteria or refused to be in the study. Of the 200 study participants, half were
randomized to the experimental group and the other half to the control group (N = 100
in each group). However, only 83 in the intervention group actually received the full
intervention. At the 3-month follow-up, researchers attempted to obtain data from 96
people in the intervention group (everyone who did not move or die). They did get
follow-up data from 92 in the intervention group (and 89 in the control group), and
these 181 comprised the analysis sample.
Credibility and Validity
Inference and validity are inextricably linked. To be careful interpreters, readers must
search for evidence that the desired inferences are, in fact, valid. Part of this process
involves considering alternative competing hypotheses about the credibility and
meaning of the results.
In Chapter 9, we discussed four types of validity that relate to the credibility of
study results: statistical conclusion validity, internal validity, external validity, and
construct validity. We use our sampling example (Fig. 15.2 and Table 15.1) to
demonstrate the relevance of methodologic decisions to all four types of validity—and
hence to inferences about study results.
In our example, the population construct is low-income women, which was
translated into population eligibility criteria stipulating California public assistance
recipients. Yet, there are alternative operationalizations of the population construct (e.g.,
California women living below the official poverty level). Construct validity, it may be
recalled, involves inferences from the particulars of the study to higher order constructs.
So it is fair to ask, Do the eligibility criteria adequately capture the population construct,
low-income women?
Statistical conclusion validity—the extent to which correct inferences can be made
about the existence of “real” group differences—is also affected by sampling decisions.
Ideally, researchers would do a power analysis at the outset to estimate how large a
sample they needed. In our example, let us assume (based on previous research) that the
effect size for the exercise intervention would be small to moderate, with d = .40. For a
power of .80, with risk of a Type I error set at .05, we would need a sample of about
200 participants. The actual sample of 161 yields a nearly 30% risk of a Type II error,
i.e., wrongly concluding that the intervention was not successful.
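Calculations like these can be verified with standard software. This sketch uses Python's statsmodels library; the inputs mirror the hypothetical scenario just described:

from statsmodels.stats.power import TTestIndPower

power_calc = TTestIndPower()

# Sample size per group needed for d = .40, alpha = .05, power = .80
n_per_group = power_calc.solve_power(effect_size=0.40, alpha=0.05, power=0.80)
print(f"Required n per group: {n_per_group:.0f}")   # about 100 per group (200 total)

# Power actually achieved with 161 participants (about 80 per group)
achieved = power_calc.power(effect_size=0.40, nobs1=80, alpha=0.05)
print(f"Achieved power: {achieved:.2f}")            # about .71, a nearly 30% Type II risk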
External validity—the generalizability of the results—is affected by sampling. To
whom would it be safe to generalize the results in this example—to the population
construct of low-income women? to all welfare recipients in California? to all new
welfare recipients in Los Angeles who speak English or Spanish? Inferences about the
extent to which the study results correspond to “truth in the real world” must take
sampling decisions and sampling problems (e.g., recruitment difficulties) into account.
Finally, the study’s internal validity (the extent to which a causal inference can be
made) is also affected by sample composition. In this example, attrition would be a
concern. Were those in the intervention group more likely (or less likely) than those in
the control group to drop out of the study? If so, any observed differences in outcomes
could be caused by individual differences in the groups (e.g., differences in motivation
to stay in the study) rather than by the intervention itself.
Methodological decisions and the careful implementation of those decisions—
whether they be about sampling, intervention design, measurement, research design, or
analysis—inevitably affect the rigor of a study. And all of them can affect the four types
of validity and hence the interpretation of the results.
Credibility and Bias
A researcher’s job is to translate abstract constructs into appropriate proxies. Another
major job concerns efforts to eliminate, reduce, or control biases—or, as a last resort, to
detect and understand them. As a reader of research reports, your job is to be on the
lookout for biases and to factor them into your assessment of the credibility of the
results.
Biases create distortions and undermine researchers’ efforts to reveal “truth in the
real world.” Biases are pervasive and virtually inevitable. It is important to consider
what types of bias might be present and how extensive, sizeable, and systematic they
are. We have discussed many types of bias in this book—some reflect design
inadequacies (e.g., selection bias), others reflect recruitment problems (nonresponse
bias), and others relate to measurement (social desirability). Table 15.2 presents biases
and errors mentioned in this book. This table is meant to serve as a reminder of some of
the problems to consider in interpreting study results.
TIP The supplement to this chapter on the companion website includes a longer
list of biases, including some not described in this book; we offer definitions
for all biases listed. Different disciplines, and different writers, use different
names for the same or similar biases. The actual names are unimportant—but
it is important to reflect on how different forces can distort results and affect
inferences.
Credibility and Corroboration
Earlier, we noted that research interpreters should seek evidence to disconfirm the “null
hypothesis” that research results are wrong. Some evidence to discredit this null
hypothesis comes from the quality of the proxies that stand in for abstractions. Ruling
out biases also undermines the null hypothesis. Another strategy is to seek corroboration
for the results.
Corroboration can come from internal and external sources, and the concept of
replication is an important one in both cases. Interpretations are aided by considering
prior research on the topic, for example. Interpreters can examine whether the study
results are congruent with those of other studies. Consistency across studies tends to
discredit the “null hypothesis” of erroneous results.
Researchers may have opportunities for replication themselves. For example, in
multisite studies, if the results are similar across sites, this suggests that something
“real” is occurring. Triangulation can be another form of replication. We are strong
advocates of mixed methods studies (see Chapter 13). When findings from the analysis
of qualitative data are consistent with the results of statistical analyses, internal
corroboration can be especially powerful and persuasive.
OTHER ASPECTS OF INTERPRETATION
If an assessment leads you to accept that the results of a study are probably “real,” you
have made important progress in interpreting the study findings. Other interpretive tasks
depend on a conclusion that the results are likely credible.
Precision of the Results
Results from statistical hypothesis tests indicate whether a relationship or group
difference is probably “real.” A p value in hypothesis testing offers information that is
important (whether the null hypothesis is probably false) but incomplete. Confidence
intervals (CIs), by contrast, communicate information about how precise the study
results are. Dr. David Sackett, a founding father of the EBP movement, and his
colleagues (2000) said this about CIs: “P values on their own are . . . not informative. . .
. By contrast, CIs indicate the strength of evidence about quantities of direct interest,
such as treatment benefit. They are thus of particular relevance to practitioners of
evidence-based medicine” (p. 232). It seems likely that nurse researchers will
increasingly report CI information in the years ahead because of its value for
interpreting study results and assessing their utility for nursing practice.
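To make this concrete, here is a brief sketch in Python showing how a 95% CI for a mean difference is computed from raw data; the two groups' scores are simulated:

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
treated = rng.normal(48, 10, size=60)   # simulated outcome scores, intervention group
control = rng.normal(52, 10, size=60)   # simulated outcome scores, control group

diff = treated.mean() - control.mean()
df = len(treated) + len(control) - 2
pooled_var = ((len(treated) - 1) * treated.var(ddof=1)
              + (len(control) - 1) * control.var(ddof=1)) / df
se = np.sqrt(pooled_var * (1 / len(treated) + 1 / len(control)))
t_crit = stats.t.ppf(0.975, df)         # critical t value for a 95% CI

print(f"Mean difference = {diff:.2f}, "
      f"95% CI [{diff - t_crit * se:.2f}, {diff + t_crit * se:.2f}]")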
Magnitude of Effects and Importance
In quantitative studies, results that support the researcher’s hypotheses are described as
significant. A careful analysis of study results involves evaluating whether, in addition
to being statistically significant, the effects are large and clinically important.
Attaining statistical significance does not necessarily mean that the results are
meaningful to nurses and clients. Statistical significance indicates that the results are
unlikely to be due to chance—not that they are important. With large samples, even
modest relationships are statistically significant. For instance, with a sample of 500, a
correlation coefficient of .10 is significant at the .05 level, but a relationship this weak
may have little practical relevance. This issue concerns an important EBP question (Box
2.1): “What is the evidence—what is the magnitude of effects?” Estimating the
magnitude and importance of effects is relevant to the issue of clinical significance, a
topic we discuss later in this chapter.
The Meaning of Quantitative Results
In quantitative studies, statistical results are in the form of p values, effect sizes, and
CIs, to which researchers and consumers must attach meaning. Questions about the
meaning of statistical results often reflect a desire to interpret causal connections.
Interpreting what descriptive results mean is not typically a challenge. For example,
suppose we found that, among patients undergoing electroconvulsive therapy (ECT),
the percentage who experience an ECT-induced headache is 59.4% (95% CI [56.3,
63.1]). This result is directly interpretable. But if we found that headache prevalence is
significantly lower in a cryotherapy intervention group than among patients given
acetaminophen, we would need to interpret what the results mean. In particular, we need
to interpret whether it is plausible that cryotherapy caused the reduced prevalence of
headaches. In this section, we discuss the interpretation of research outcomes within a
hypothesis testing context, with an emphasis on causal interpretations.
Interpreting Hypothesized Results
Interpreting statistical results is easiest when hypotheses are supported, i.e., when there
are positive results. Researchers have already considered prior findings and theory in
developing hypotheses. Nevertheless, a few caveats should be kept in mind.
It is important to avoid the temptation of going beyond the data to explain what
results mean. For example, suppose we hypothesized that pregnant women’s anxiety
level about childbearing is correlated with the number of children they have. The data
reveal a significant negative relationship between anxiety levels and parity (r = −.40).
We interpret this to mean that increased experience with childbirth results in decreased
anxiety. Is this conclusion supported by the data? The conclusion appears logical, but in
fact, there is nothing in the data that leads to this interpretation. An important, indeed
critical, research precept is correlation does not prove causation. The finding that two
variables are related offers no evidence suggesting which of the two variables—if either
—caused the other. In our example, perhaps causality runs in the opposite direction, i.e.,
a woman’s anxiety level influences how many children she bears. Or maybe a third
variable, such as the woman’s relationship with her husband, influences both anxiety
and number of children. Inferring causality is especially difficult in studies that have not
used an experimental design.
Empirical evidence supporting research hypotheses never constitutes proof of their
veracity. Hypothesis testing is probabilistic. There is always a possibility that observed
relationships resulted from chance—that is, that a Type I error has occurred.
Researchers must be tentative about their results and about interpretations of them.
Thus, even when the results are in line with expectations, researchers should draw
conclusions with restraint.
Example of corroboration of a hypothesis
Houck and colleagues (2011) studied factors associated with self-concept in 145
children with attention deficit hyperactivity disorder (ADHD). They hypothesized
that behavior problems in these children would be associated with less favorable self-
concept, and they found that internalizing behavior problems were significantly
predictive of lower self-concept scores. In their discussion, they stated that “age and
internalizing behaviors were found to negatively influence the child’s self-concept”
(p. 245).
This study is a good example of the challenges of interpreting findings in
correlational studies. The researchers’ interpretation was that behavior problems
influenced (“caused”) low self-concept. This conclusion is supported by earlier research,
yet there is nothing in the data that would rule out the possibility that a child’s self-
concept influenced his or her behavior or that some other factor influenced both
behavior and self-concept. The researchers’ interpretation is plausible, but their cross-
sectional design makes it difficult to rule out other explanations. A major threat to the
internal validity of the inference in this study is temporal ambiguity.
Interpreting Nonsignificant Results
Nonsignificant results pose interpretative challenges. Statistical tests are geared toward
disconfirmation of the null hypothesis. Failure to reject a null hypothesis can occur for
many reasons, and the real reason may be hard to figure out.
The null hypothesis could actually be true, accurately reflecting the absence of a
relationship among research variables. On the other hand, the null hypothesis could be
false. Retention of a false null hypothesis (a Type II error) can result from such
methodologic problems as poor internal validity, an anomalous sample, a weak
statistical procedure, or unreliable measures. In particular, failure to reject null
hypotheses is often a consequence of insufficient power, usually reflecting too small a
sample size.
It is important to recognize that a null hypothesis that is not rejected does not
confirm the absence of relationships among variables. Nonsignificant results provide no
evidence of the truth or the falsity of the hypothesis.
Because statistical procedures are designed to test support for rejecting null
hypotheses, they are not well suited for testing actual research hypotheses about the
absence of relationships or about equivalence between groups. Yet sometimes, this is
exactly what researchers want to do, especially in clinical situations in which the goal is
to test whether one practice is as effective as another—but perhaps less painful or
costly. When the actual research hypothesis is null (e.g., a prediction of no group
difference), stringent additional strategies must be used to provide supporting evidence.
It is useful to compute effect sizes or CIs to illustrate that the risk of a Type II error was
small.
Example of support for a hypothesized nonsignificant result
Lavender and colleagues (2013) conducted a trial to test the hypothesis that a baby
wash product formulated for newborn bathing is not inferior to bathing with water
alone, in terms of transepidermal water loss (TEWL) and other secondary outcomes.
In their relatively large sample of 307 healthy infants, none of the group differences
was statistically significant. The difference in TEWL values was only .08 g/m2/h, p =
.89, 95% CI [−1.24, 1.07]. The researchers concluded, “We were unable to detect any
differences between newborn wash product and water” (p. 203).
Interpreting Unhypothesized Significant Results
Unhypothesized significant results can occur in two situations. The first involves
exploring relationships that were not considered during the design of the study. For
example, in examining correlations among research variables, a researcher might notice
that two variables that were not central to the research questions were nevertheless
significantly correlated—and interesting.
Example of a serendipitous significant finding
Latendresse and Ruiz (2011) studied the relationship between chronic maternal stress
and preterm birth. They reported an unexpected finding that maternal use of selective
serotonin reuptake inhibitors (SSRIs) was associated with a 12-fold increase in
preterm births.
The second situation is more perplexing and happens infrequently: obtaining results
opposite to those hypothesized. For instance, a researcher might hypothesize that
individualized teaching about AIDS risks is more effective than group instruction, but
the results might indicate that the group method was significantly better. Although this
might seem disconcerting, research should not be undertaken to corroborate predictions
but rather to arrive at truth. There is no such thing as a study whose results “came out
wrong” if they reflect the truth. When significant findings are opposite to what was
hypothesized, the interpretation should involve comparisons with other research, a
consideration of alternate theories, and a critical scrutiny of the research methods.
Example of a significant result contrary to hypothesis
Dotson and colleagues (2014), who tested hypotheses about nurse retention with a
sample of 861 registered nurses (RNs), predicted that higher levels of altruism would
be associated with stronger intentions to stay in nursing; however, the opposite was
found. They speculated that this might mean that some nurses “are no longer
experiencing the fulfillment of their altruistic desires in the field of nursing” (p. 115).
In summary, interpreting the meaning of research results is a demanding task, but it
offers the possibility of intellectual rewards. Interpreters must play the role of scientific
detectives, trying to make pieces of the puzzle fit together so that a coherent picture
emerges.
Generalizability of the Results
Researchers typically seek evidence that can be used by others. If a new nursing
intervention is found to be successful, others might want to adopt it. Therefore, another
interpretive question is whether the intervention will “work” or whether the
relationships will “hold” in other settings, with other people. Part of the interpretive
process involves asking the question, “To what groups, environments, and conditions
can the results reasonably be applied?”
In interpreting a study’s generalizability, it is useful to consider our earlier
discussion about proxies. For which higher order constructs, which populations, which
settings, or which versions of an intervention were the study operations good “stand-
ins”?
Implications of the Results
Once you have reached conclusions about the credibility, precision, importance,
meaning, and generalizability of the results, you are ready to think about their
implications. You might consider the implications of the findings with respect to future
research: What should other researchers in this area do—what is the right “next step”?
You are most likely to consider the implications for nursing practice: How should the
results be used by nurses in their practice?
All of the interpretive dimensions we have discussed are critical in evidence-based
nursing practice. With regard to generalizability, it may not be enough to ask a broad
question about to whom the results could apply—you need to ask, Are these results
relevant to my particular clinical situation? Of course, if you have concluded that the
results have limited credibility or importance, they may be of little utility to your
practice.
CLINICAL SIGNIFICANCE
It has long been recognized that statistical hypothesis testing provides limited
information for interpretation purposes. In particular, attaining statistical significance
does not address the question of whether a finding is clinically meaningful or relevant.
With a large enough sample, a trivial relationship can be statistically significant.
Broadly speaking, we define clinical significance as the practical importance of
research results in terms of whether they have genuine, palpable effects on the daily
lives of patients or on the health care decisions made on their behalf.
In fields other than nursing, notably in medicine and psychotherapy, recent attention
has been paid to defining clinical significance and developing ways to operationalize it.
There has been no consensus on either front, but a few conceptual and statistical
solutions are being used with some regularity. In this section, we provide a brief
overview of recent advances in defining and operationalizing clinical significance;
further information is available in Polit and Yang (2016).
In statistical hypothesis testing, consensus was reached decades ago—for better or
worse—that a p value of .05 would be the standard criterion for statistical significance.
It is unlikely that a uniform standard will ever be adopted for clinical significance,
however, because of its complexity. For example, in some cases, no change over time
could be clinically significant if it means that a group with a progressive disease has not
deteriorated. In other cases, clinical significance is associated with improvements.
Another issue concerns whose perspective on clinical significance is relevant.
Sometimes, clinicians’ perspective is key because of implications for health
management (e.g., regarding cholesterol levels). For other outcomes, the patient’s view
is what matters (e.g., about quality of life). Two other issues concern whether clinical
significance is for group-level findings or about individual patients and whether clinical
significance is attached to point-in-time outcomes or to change scores. Most recent
work is about the clinical significance of change scores for individual patients (e.g., a
change from a baseline measurement to a follow-up measurement). We begin, however,
with a brief discussion of group-level clinical significance.
Clinical Significance at the Group Level
Many studies concern group-level comparisons. For example, one-group pretest–
posttest designs involve comparing a group at two or more points in time, to examine
whether or not a change in outcomes has occurred, on average. In RCTs and case-
control studies, the central comparison is about average differences for different groups
of people. Group-level clinical significance typically involves using statistical
information other than p values to draw conclusions about the usefulness of research
findings. The most widely used statistics for this purpose are effect size (ES) indexes,
CIs, and number needed to treat (NNT).
ES indexes summarize the magnitude of a change or a relationship and thus provide
insights into how a group, on average, might benefit from a treatment. In most cases, a
clinically significant finding at the group level means that the ES is sufficiently large to
have relevance for patients. CIs are espoused by several writers as useful tools for
understanding clinical significance; CIs provide the most plausible range of values, at a
given level of confidence, for the unknown population parameter. NNTs are sometimes
promoted as useful indicators of clinical significance because the information is
relatively easy to understand. For example, if the NNT for an important outcome is
found to be 2.0, only two patients have to receive a particular treatment in order for one
patient to benefit. If the NNT is 10.0, however, 9 patients out of 10 receiving the
treatment would get no benefit.
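To make these group-level indexes concrete, consider the following minimal sketch in Python. All of the numbers are hypothetical and are not drawn from any study discussed in this chapter; the sketch simply shows how a d-type effect size, a 95% confidence interval for a mean difference, and an NNT might be computed from summary data.

```python
import math

# Hypothetical summary statistics for two groups (illustrative values only)
n1, mean1, sd1 = 60, 52.0, 10.0   # intervention group
n2, mean2, sd2 = 60, 47.0, 11.0   # control group

# Effect size: d = mean difference divided by the pooled standard deviation
pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
d = (mean1 - mean2) / pooled_sd

# 95% CI for the mean difference (large-sample normal approximation)
se_diff = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
ci_low = (mean1 - mean2) - 1.96 * se_diff
ci_high = (mean1 - mean2) + 1.96 * se_diff

# NNT = 1 / absolute risk reduction, for a dichotomous "improved" outcome
p_treatment, p_control = 0.60, 0.40   # hypothetical improvement rates
nnt = 1 / (p_treatment - p_control)   # 1 / 0.20 = 5 patients

print(f"d = {d:.2f}; 95% CI ({ci_low:.1f}, {ci_high:.1f}); NNT = {nnt:.0f}")
```

In this hypothetical case, five patients would need to be treated for one additional patient to benefit.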
With any of these group-level indexes, researchers should designate in advance what
would constitute clinical significance—just as they would establish an alpha value for
statistical significance. For example, would an ES of .20 (for the d index described in
Chapter 14) be considered clinically significant? A d of .20 has been described as a
“small” effect, but sometimes, small improvements can have clinical relevance. Claims
about attainment of clinical significance for groups should be based on defensible
criteria.
Example of clinical significance at the group level
Despriee and Langeland (2016) tested the effect of 30% sucrose compared with a
placebo (water) on relieving pain during the immunization of 15-month-old children.
The mean group difference of 15 fewer seconds of crying among infants in the
intervention group was statistically significant. The large ES led the researchers to
conclude that the improvement was also clinically significant.
Clinical Significance at the Individual Level
Clinicians usually are not interested in what happens in a group of people—they are
concerned with individual patients. As noted in Chapter 2, a key goal in EBP is to
personalize “best evidence” into decisions for a specific patient’s needs, within a
particular clinical context. Efforts to come to conclusions about clinical significance at
the individual level can be directly linked to EBP goals.
Dozens of approaches to defining and operationalizing clinical significance at the
individual level have been developed, but they share one thing in common: They
involve establishing a benchmark (or threshold) that designates the score value on a
measure (or the value of a change score) that would be considered clinically important.
With an established benchmark for clinical significance, each person in a study can be
classified as having or not having a score or change score that is clinically significant.
Conceptual Definitions of Clinical Significance
Numerous definitions of clinical significance can be found in the health literature, most
of which concern changes in measures of patient outcomes (e.g., a score at Time 1
subtracted from a score at Time 2). One approach to conceptualizing clinical
significance dominates medical fields. In a paper cited hundreds of times in the medical
literature, Jaeschke and colleagues (1989) offered the following definition: “The
minimal clinically important difference (MCID) can be defined as the smallest
difference in score in the domain of interest which patients perceive as beneficial and
which would mandate, in the absence of troublesome side effects and excessive cost, a
change in the patient’s management” (p. 408). Although these researchers referred to
the conceptual threshold for clinical significance as a minimal clinically important
difference (MCID), we follow an influential group of measurement experts in using the
term minimal important change (MIC) because the focus is on individual change
scores, not differences between groups.
Operationalizing Clinical Significance: Establishing the Minimal
Important Change Benchmark
The Jaeschke et al. (1989) definition regarding change score benchmarks has inspired
researchers to go in many different directions to quantify it. Broadly speaking, the MIC
benchmark is usually operationalized as a value for the amount of change in score
points on a measure that an individual patient must achieve to be considered as having a
clinically important change.
A traditional approach to setting a benchmark for health outcomes is to obtain input
from a panel of health care experts—sometimes called a consensus panel. For example, a
consensus panel convened in 2005 set the benchmark for clinically significant change in
self-reported pain intensity (e.g., on a visual analog scale) at a 30% reduction in pain.
Another approach is to undertake a study to determine what patients themselves
think is a minimally important change on a focal measure. The developers of many new
multi-item scales now use this approach to estimate the MIC as part of the psychometric
assessment of their instrument. Calculating an MIC using patient ratings of important
change requires a lot of work, however, and a careful research design with a large
sample of people whose change over time is expected to vary.
A third approach to defining the MIC is based on the distributional characteristics of
a measure. Most often, the MIC using this approach is set to a threshold of 0.5 SDs—
i.e., one half a standard deviation (SD) on a distribution of baseline scores. For example,
if the baseline SD for a scale were 6.0, then the MIC using the 0.5 SD criterion would be
3.0. This value, like any MIC, can be used as the benchmark to classify individual
patients as having or not having experienced clinically meaningful change.
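As a brief illustration of this distribution-based approach, the following sketch (with invented change scores; the baseline SD of 6.0 mirrors the example just given) sets the MIC at one half of the baseline SD and classifies each patient's change score against that benchmark:

```python
# Distribution-based MIC: one half the SD of baseline scores (hypothetical data)
baseline_sd = 6.0
mic = 0.5 * baseline_sd                 # MIC = 3.0, as in the example in the text

# Classify each patient's change score (follow-up minus baseline) against the MIC
change_scores = [1.5, 4.0, 2.9, 3.2, 6.1, -0.5]
meaningful_change = [change >= mic for change in change_scores]
print(meaningful_change)                # [False, True, False, True, True, False]
```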
Many researchers have used the MIC to interpret group-level findings. The MIC is,
however, an index of individual change, not group differences. Experts have warned
that it is inappropriate to interpret mean differences in relation to the MIC. For example,
if the MIC on an important outcome has been established as 4.0, this value should not
be used to interpret the clinical significance of the mean difference between two groups.
If the mean group difference were found to be 3.0, for instance, it would be wrong to
conclude that the results were not clinically significant. A mean difference of 3.0
suggests that a sizeable percentage of participants did achieve a clinically meaningful
benefit—i.e., an improvement of 4 points or more.
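A small simulation makes this point vivid. Assuming, purely for illustration, that individual change scores are normally distributed with an SD of 6.0 around each group's mean, a mean group difference of 3.0 still leaves many treated patients above an MIC of 4.0:

```python
import random

random.seed(1)
MIC = 4.0

# Assumed: individual change scores vary around group means with an SD of 6.0
treated = [random.gauss(3.0, 6.0) for _ in range(10_000)]   # mean change 3.0
control = [random.gauss(0.0, 6.0) for _ in range(10_000)]   # mean change 0.0

pct_treated = 100 * sum(c >= MIC for c in treated) / len(treated)
pct_control = 100 * sum(c >= MIC for c in control) / len(control)
print(f"{pct_treated:.0f}% vs. {pct_control:.0f}%")
# Under these assumptions, roughly 43% of treated patients (vs. about 25% of
# controls) achieve a change of 4 points or more, despite a mean difference
# of only 3.0.
```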
MIC thresholds can be used to calculate rates of clinical significance for individual
study participants. Once the MIC is known, researchers can classify all people in a study
in terms of their having attained or not attained the threshold. Then, researchers can
compare the percentage of people who “responded” at clinically important levels in the
study groups (e.g., those in the intervention and those in the control group). Such a
responder analysis is easy to understand and has strong implications for EBP.
Example of a responder analysis
Lima and colleagues (2015) examined blood pressure responses to walking and
resistance exercise in patients with peripheral artery disease. The researchers used a
previously established MIC of a 4-mm Hg decrease in diastolic or systolic blood
pressure to classify participants. Chi-squared analysis and t-tests were used to
compare the clinical characteristics of responders (those who benefited at clinically
significant levels from exercise) and nonresponders.
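A responder analysis along these lines could be set up as in the sketch below. The counts are invented (they are not Lima and colleagues' data), and the comparison uses SciPy's chi-squared test of independence:

```python
from scipy.stats import chi2_contingency

# Hypothetical 2 x 2 table: rows = study group, columns = responder status
#                  responder  nonresponder
table = [[34, 16],    # group A (e.g., walking exercise)
         [21, 29]]    # group B (e.g., resistance exercise)

chi2, p_value, dof, expected = chi2_contingency(table)

# The responder rates themselves (68% vs. 42% here) are the quantities of
# greatest clinical interest
print(f"chi-squared = {chi2:.2f}, p = {p_value:.3f}")
```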
CRITIQUING INTERPRETATIONS
Researchers offer an interpretation of their findings and discuss what the findings might
imply for nursing in the discussion section of research articles. When critiquing a study,
your own interpretation can be contrasted against those of the researchers.
A good discussion section should point out study limitations. Researchers are in the
best position to detect and assess sampling deficiencies, practical constraints, data
quality problems, and so on, and it is a professional responsibility to alert readers to
these difficulties. Also, when researchers acknowledge methodologic shortcomings,
readers know that these limitations were considered in interpreting the results. Of
course, researchers are unlikely to note all relevant limitations. Your task as reviewer is
to develop your own interpretation and assessment of methodologic problems, to
challenge conclusions that do not appear to be warranted, and to consider how the
study’s evidence could have been enhanced.
You should also carefully scrutinize causal interpretations, especially in
nonexperimental studies. Sometimes, even the titles of reports suggest a potentially
inappropriate causal inference. If the title of a nonexperimental study includes terms
like “the effect of . . . ” or “the impact of . . . ,” this may signal the need for critical
scrutiny of the researcher’s inferences.
In addition to comparing your interpretation with that of the researchers, your
critique should also draw conclusions about the stated implications of the study. Some
researchers make grandiose claims or offer unfounded recommendations on the basis of
modest results.
Clinical significance is a new topic in this edition of our book. The
conceptualization and operationalization of clinical significance have not received much
attention in nursing, and so studies that do not mention clinical significance should not
be faulted for this omission—but studies that do address clinical significance should be
lauded. We hope that nurse researchers will pay more attention to this issue in the years
ahead.
Some guidelines for evaluating researchers’ interpretation are offered in Box 15.1.
Box 15.1 Guidelines for Critiquing Interpretations/Discussions in Quantitative
Research Reports
Interpretation of the Findings
1. Were all the important results discussed?
2. Did the researchers discuss any study limitations and their possible effects on the
credibility of the findings? Did the interpretations take limitations into account?
3. What types of evidence were offered in support of the interpretation, and was that
evidence persuasive? Were results interpreted in light of findings from other
studies?
4. Did the researchers make any unjustifiable causal inferences? Were alternative
explanations for the findings considered? Were the rationales for rejecting these
alternatives convincing?
5. Did the interpretation take into account the precision of the results and/or the
magnitude of effects?
6. Did the researchers draw any unwarranted conclusions about the generalizability
of the results?
Implications of the Findings and Recommendations
7. Did the researchers discuss the study’s implications for clinical practice or future
nursing research? Did they make specific recommendations?
8. If yes, are the stated implications appropriate, given the study’s limitations and
the magnitude of the effects—as well as evidence from other studies? Are there
important implications that the report neglected to include?
Clinical Significance
9. Did the researchers mention or assess clinical significance? Did they make a
distinction between statistical and clinical significance?
10. If clinical significance was examined, was it assessed in terms of group-level
information (e.g., effect sizes) or individual-level results? If the latter, how was
clinical significance operationalized?
In this section, we provide details about the interpretive portion of a
quantitative study. Read the summary and then answer the critical thinking
questions that follow, referring to the full research report if necessary.
Example 1 is featured on the interactive Critical Thinking Activity on the
companion website. The critical thinking questions for Examples 2 and 3 are based on
the studies that appear in their entirety in Appendices A and C of this book. Our
comments for Example 2 are in the Student Resources section of the companion website.
EXAMPLE 1: INTERPRETATION IN A QUANTITATIVE
STUDY
Study: Neurobehavioral effects of aspartame consumption (Lindseth et al.,
2014)
Statement of Purpose: The purpose of this study was to examine the effects
of consuming diets with higher amounts of aspartame (25 mg/kg body
weight/day) versus lower amounts of aspartame (10 mg/kg body weight/day)
on neurobehavioral outcomes.
Method: The researchers used a randomized crossover design to assess the
effects of aspartame amounts. Study participants were 28 healthy adults,
university students, who consumed study-prepared diets. Participants were
randomized to orderings of the aspartame protocol (i.e., some received the
high-aspartame diet first, others received the low amount first). Participants
were blinded to which diet they were receiving, and data collectors were also
blinded. Participants consumed one of the diets for an 8-day period, followed by a 2-
week washout period. Then, they consumed the alternative diet for another 8
days. At the end of each 8-day session, measurements were made for
neurobehavioral outcomes, including cognition (working memory and spatial
visualization), depression, and mood (irritability).
Analyses: Within-subjects tests (paired t-tests, repeated measures analysis of
variance) were used to test the statistical significance of differences in
outcomes for the two dietary protocols, with alpha set at .05. In terms of
clinical significance, a participant was considered to have a clinically
significant neurobehavioral effect if his or her score was 2+ SDs outside the
mean score for normal functioning based on norms for each measure. Thus,
change scores for participants were not computed. Rather, each score was
assessed for crossing the benchmark value for a normative state—a criterion
that has been frequently used in trials of psychotherapeutic interventions.
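A normative benchmark of this kind is easy to operationalize, as the sketch below suggests. The norms and the direction of scoring are hypothetical and are not the actual values used in this study:

```python
# Hypothetical normative values for a neurobehavioral measure
NORM_MEAN, NORM_SD = 50.0, 10.0

def clinically_significant(score, higher_is_worse=True):
    """Flag a score 2+ SDs beyond the normative mean in the unfavorable direction."""
    if higher_is_worse:
        return score > NORM_MEAN + 2 * NORM_SD
    return score < NORM_MEAN - 2 * NORM_SD

print(clinically_significant(72.0))   # True: more than 2 SDs above the norm
print(clinically_significant(65.0))   # False: within 2 SDs of the norm
```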
Results: Statistically significant differences, favoring the low-aspartame diet,
were observed for three neurobehavioral outcomes: spatial orientation,
depression, and irritability. Despite the fact that the participants were healthy
adult students, a few of them experienced clinically significant outcomes in
the high-aspartame condition. For example, four participants had clinically
significant cognitive impairment (two with working memory deficits and two
others with spatial orientation impairment) after 8 days of consuming the
high-aspartame diet. Three other participants (different from the four with
cognitive impairment) had clinically relevant levels of depression at the end
of the high-aspartame condition. None of the participants’ scores was
clinically significant after 8 days on the low-aspartame diet.
Discussion: The researchers devoted a large portion of their discussion
section to the issue of corroboration, which we mentioned in connection with
efforts to interpret the credibility of study results. They pointed out ways in
which their findings were consistent with (or diverged from) other studies on
the effects of aspartame. In keeping with the researchers’ use of a strong
experimental design, they concluded that there was a causal relationship
between high amounts of aspartame consumption and negative
neurobehavioral effects: “A high dose of aspartame caused more irritability
and depression than a low-aspartame dose consumed by the same
participants, supporting earlier study findings by Walton et al. (1993)” (p.
191). The researchers also commented on the clinical significance findings:
“Additionally, three participants in our study scored in the clinically
depressed category while consuming the high-aspartame diet, despite no
previous histories of depression” (p. 191). The researchers concluded their
discussion section with remarks about the limitations of their study, which
included problems of generalizability: “Limitations of our study included the
small homogeneous sample, which may make it difficult to apply our
conclusions to other study populations. Also, our sample size of 28
participants resulted in statistical power of .72, which is on the lower end of
the acceptable range. A washout period before the baseline assessments and
using food diaries during the between-treatment washout period to verify that
aspartame was not consumed would have strengthened the design” (p. 191).
Critical Thinking Exercises
1. Answer the relevant questions from Box 15.1 regarding this study. (We
encourage you to read the report in its entirety, especially the discussion,
to answer these questions.)
2. Also consider the following targeted questions:
a. Comment on the statistical conclusion validity of this study.
b. Would this study benefit from the inclusion of a CONSORT-type flow
chart?
3. What might be some of the uses to which the findings could be put in
clinical practice?
EXAMPLE 2: DISCUSSION SECTION IN THE STUDY IN
APPENDIX A
Read the “Discussion” section of Swenson and colleagues’ (2016) study
(“Parents’ use of praise and criticism in a sample of young children seeking
mental health services”) in Appendix A of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 15.1 regarding this study.
2. Also consider the following targeted questions:
a. Was a CONSORT-type flow chart used in this study? If not, was
information about participant flow provided in the text?
b. Can you think of any limitations of this study that the researchers did
not mention?
EXAMPLE 3: QUANTITATIVE STUDY IN APPENDIX C
Read Wilson and colleagues’ (2016) study (“A randomized controlled trial of
an individualized preoperative education intervention for symptom
management after total knee arthroplasty”) in Appendix C and then address
the following suggested activities or questions.
Critical Thinking Exercises
1. Before reading our critique, which accompanies the full report, write your
own critique or prepare a list of what you think are the study’s major
strengths and weaknesses. Pay particular attention to validity threats and
bias. Then contrast your critique with ours. Remember that you (or your
instructor) do not necessarily have to agree with all of the points made in
our critique and you may identify strengths and weaknesses that we
overlooked. You may find the broad critiquing guidelines in Table 4.1
helpful.
2. Write a short summary of how credible, important, and generalizable you
find the study results to be. Your summary should conclude with your
interpretation of what the results mean and what their implications are for
nursing practice. Contrast your summary with the discussion section in the
report itself.
3. In selecting studies to include with this textbook, we deliberately chose a
study with many strengths. In the following questions, we offer some
“pretend” scenarios in which the researchers for the study in Appendix C
made different methodologic decisions than the ones they in fact did make.
Write a paragraph or two critiquing these “pretend” decisions, pointing out
how these alternatives would have affected the rigor of the study and the
inferences that could be made.
a. Pretend that the researchers had been unable to randomize subjects to
treatments. The design, in other words, would be a nonequivalent
control group quasi-experiment.
b. Pretend that 143 participants were randomized (this is actually what did
happen) but that only 80 participants remained in the study at Time 3.
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the companion website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Research Biases
• Answer to the Critical Thinking Exercise for Examples 2 and 3
• Internet Resources with useful websites for Chapter 15
• A Wolters Kluwer journal article in its entirety—the Dotson et al. study
described as an example on page 268.
Additional study aids, including eight journal articles and related questions,
are also available in Study Guide for Essentials of Nursing Research,
9e.
Summary Points
The interpretation of quantitative research results (the outcomes of the statistical
analyses) typically involves consideration of (1) the credibility of the results, (2)
precision of estimates of effects, (3) magnitude of effects, (4) underlying meaning,
(5) generalizability, and (6) implications for future research and nursing practice.
The particulars of the study—especially the methodologic decisions made by
researchers—affect the inferences that can be made about the correspondence
between study results and “truth in the real world.”
A cautious outlook is appropriate in drawing conclusions about the credibility and
meaning of study results.
An assessment of a study’s credibility can involve various approaches, one of
which involves an evaluation of the degree of congruence between the abstract
constructs or idealized methods on the one hand and the proxies actually used on
the other.
Credibility assessments also involve an assessment of study rigor through an
analysis of validity threats and biases that could undermine the accuracy of the
results.
Corroboration (replication) of results, through either internal or external sources, is
another approach in a credibility assessment.
Researchers can facilitate interpretations by carefully documenting methodologic
decisions and the outcomes of those decisions (e.g., by using the CONSORT
guidelines to document participant flow).
Broadly speaking, clinical significance refers to the practical importance of
research results—i.e., whether the effects are genuine and palpable in the daily
lives of patients or in the management of their health. Clinical significance has not
received great attention in nursing research.
Clinical significance for group-level results is often inferred on the basis of such
statistics as effect size indexes, confidence intervals, and number needed to treat.
However, clinical significance is most often discussed in terms of effects for
individual patients—especially whether they have achieved a clinically
meaningful change.
Definitions and operationalizations of clinical significance for individuals typically
involve a benchmark or threshold to designate a meaningful amount of change.
This benchmark is often called a minimal important change (MIC), which is a
value for the amount of change score points on a measure that an individual
patient must achieve to be classified as having a clinically important change.
MICs cannot legitimately be used to interpret group means or differences in
means. However, the MIC can be used to ascertain whether each person in a
sample has or has not achieved a change greater than the MIC and then a
responder analysis can be undertaken to compare the percentage of people
meeting the threshold in different study groups.
In their discussions of study results, researchers should themselves point out
known study limitations, but readers should draw their own conclusions about the
rigor of the study and about the plausibility of alternative explanations for the
results.
REFERENCES FOR CHAPTER 15
Despriee, Å., & Langeland, E. (2016). The effect of sucrose as pain relief/comfort during immunisation of 15-
month-old children in health care centres: A randomised controlled trial. Journal of Clinical Nursing, 25, 372–
380.
**Dotson, M. J., Dave, D., Cazier, J., & Spaulding, T. (2014). An empirical analysis of nurse retention: What
keeps RNs in nursing? The Journal of Nursing Administration, 44, 111–116.
Houck, G., Kendall, J., Miller, A., Morrell, P., & Wiebe, G. (2011). Self-concept in children and adolescents with
attention deficit hyperactivity disorder. Journal of Pediatric Nursing, 26, 239–247.
Jaeschke, R., Singer, J., & Guyatt, G. H. (1989). Measurement of health status: Ascertaining the minimal clinically
important difference. Controlled Clinical Trials, 10, 407–415.
Latendresse, G., & Ruiz, R. (2011). Maternal corticotropin-releasing hormone and the use of selective serotonin
reuptake inhibitors independently predict the occurrence of preterm birth. Journal of Midwifery & Women’s
Health, 56, 118–126.
*Lavender, T., Bedwell, C., Roberts, S., Hart, A., Turner, M., Carter, L., & Cork, M. (2013). Randomized,
controlled trial evaluating a baby wash product on skin barrier function in healthy, term neonates. Journal of
Obstetric, Gynecologic, & Neonatal Nursing, 42, 203–214.
Lima, A., Miranda, A., Correia, M., Soares, A., Cucato, G., Sobral Filho, D., . . . Ritti-Dias, R. (2015). Individual
blood pressure responses to walking and resistance exercise in peripheral artery disease patients: Are the mean
values describing what is happening? Journal of Vascular Nursing, 33, 150–156.
Lindseth, G. N., Coolahan, S., Petros, T., & Lindseth, P. (2014). Neurobehavioral effects of aspartame
consumption. Research in Nursing & Health, 37, 185–193.
Polit, D. F., & Yang, F. M. (2016). Measurement and the measurement of change. Philadelphia, PA: Wolters
Kluwer.
Sackett, D. L., Straus, S., Richardson, W., Rosenberg, W., & Haynes, R. (2000). Evidence-based medicine: How to
practice and teach EBM (2nd ed.). Edinburgh, United Kingdom: Churchill Livingstone.
*A link to this open-access article is provided in the Internet Resources section of the companion website.
**This journal article is available on the companion website for this chapter.
16 Analysis of Qualitative Data
Learning Objectives
On completing this chapter, you will be able to:
Describe activities that qualitative researchers perform to manage and organize their
data
Discuss the procedures used to analyze qualitative data, including both general
procedures and those used in ethnographic, phenomenologic, and grounded theory
research
Assess the adequacy of researchers’ descriptions of their analytic procedures and
evaluate the suitability of those procedures
Define new terms in the chapter
Key Terms
Axial coding
Basic social process (BSP)
Central category
Constant comparison
Core category
Domain
Emergent fit
Hermeneutic circle
Metaphor
Open coding
Paradigm case
Qualitative content analysis
Selective coding
Substantive codes
Taxonomy
Theme
Theoretical codes
Qualitative data are derived from narrative materials, such as transcripts from
audiotaped interviews or participant observers’ field notes. This chapter describes
methods for analyzing such qualitative data.
INTRODUCTION TO QUALITATIVE ANALYSIS
Qualitative data analysis is challenging, for several reasons. First, there are no universal
rules for analyzing qualitative data. A second challenge is the enormous amount of work
required. Qualitative analysts must organize and make sense of hundreds or even
thousands of pages of narrative materials. Qualitative researchers typically scrutinize
their data carefully, often reading the data over and over in a search for understanding.
Also, doing qualitative analysis well requires creativity and strong inductive skills
(inducing universals from particulars). A qualitative analyst must be proficient in
discerning patterns and weaving them together into an integrated whole.
Another challenge comes in reducing data for reporting purposes. Quantitative
results can often be summarized in a few tables. Qualitative researchers, by contrast,
must balance the need to be concise with the need to maintain the richness of their data.
TIP Qualitative analyses are more difficult to do than quantitative ones, but
qualitative findings are easier to understand than quantitative ones because
the stories are told in everyday language. Qualitative analyses are often hard
to critique, however, because readers cannot know if researchers adequately
captured thematic patterns in the data.
QUALITATIVE DATA MANAGEMENT AND
ORGANIZATION
Qualitative analysis is supported by several tasks that help to organize and manage the
mass of narrative data.
Developing a Coding Scheme
Qualitative researchers begin their analysis by developing a method to classify and
index their data. Researchers must be able to gain access to parts of the data without
having to repeatedly reread the data set in its entirety.
The usual procedure is to create a coding scheme, based on a scrutiny of actual data,
and then code data according to the categories in the coding scheme. Developing a high-
quality coding scheme involves a careful reading of the data, with an eye to identifying
underlying concepts. The nature of the codes may vary in level of detail as well as in
level of abstraction.
Researchers whose aims are primarily descriptive often use codes that are fairly
concrete. The codes may differentiate various types of actions or events, for example. In
developing a coding scheme, related concepts are grouped together to facilitate the
coding process.
Example of a descriptive coding scheme
Ersek and Jablonski (2014) studied the adoption of evidence-based pain practices in
nursing homes. Data from focus group interviews with staff were coded into broad
categories of facilitators and barriers within Donabedian’s schema of structure,
process, and outcome. For example, categories of barriers in the process group
included provider mistrust, lack of time, and staff and family knowledge and
attitudes.
Many studies, such as those designed to develop a theory, are more likely to involve
the development of abstract, conceptual coding categories. In creating abstract
categories, researchers break the data into segments, closely examine them, and
compare them to other segments to uncover the meaning of those phenomena. The
researcher asks questions such as the following about discrete statements: What is this?
What is going on? What else is like this? What is this distinct from?
Important concepts that emerge from examining the data are then given a label.
These names are abstractions, but the labels are usually sufficiently graphic that the
nature of the material to which they refer is clear—and often provocative.
Example of an abstract coding scheme
Box 16.1 shows the category scheme developed by Beck and Watson (2010) to code
data from their interviews on childbirth after a previous traumatic birth (the full study
is in Appendix B). The coding scheme includes major thematic categories with
subcodes. For example, an excerpt that described how a mother viewed this
subsequent birth as healing because she felt respected during this subsequent labor
and delivery would be coded 3A, the category for “Treated with respect.”
Box 16.1 Beck and Watson’s (2010) Coding Scheme for the Subsequent
Childbirth After a Previous Traumatic Birth
Theme 1: Riding the Turbulent Wave of Panic During Pregnancy
A. Reactions to learning of pregnancy
B. Denial during the first trimester
C. Heightened state of anxiety
D. Panic attacks as delivery date gets closer
E. Feeling numb toward the baby
Theme 2: Strategizing: Attempts to Reclaim Their Body and Complete the
Journey to Motherhood
A. Spending time nurturing self by exercising, going to yoga classes, and swimming
B. Keeping a journal throughout pregnancy
C. Turning to doulas for support during labor
D. Reading avidly to understand the birth process
E. Engaging in birth art exercises
F. Opening up to health care providers about their previous birth trauma
G. Sharing with partners about their fears
H. Learned relaxation techniques
Theme 3: Bringing Reverence to the Birthing Process and Empowering Women
A. Treated with respect
B. Pain relief taken seriously
C. Communicated with labor and delivery staff
D. Reclaimed their body
E. Strong sense of control
F. Birth plan honored by labor and delivery staff
G. Mourned what they missed out with prior birth
H. Healing subsequent birth but it can never change the past
Theme 4: Still Elusive: The Longed for Healing Birth Experience
A. Failed again as a woman
B. Better than first traumatic birth but not healing
C. Hopes of a healing home birth dashed
Coding Qualitative Data
After a coding scheme has been developed, the data are read in their entirety and coded
for correspondence to the categories—a task that is seldom easy. Researchers may have
difficulty deciding the most appropriate code, for example. It sometimes takes several
readings of the material to grasp its nuances.
Also, researchers often discover during coding that the initial coding system was
incomplete. Categories may emerge that were not initially identified. When this
happens, it is risky to assume that the category was absent in previously coded
materials. A concept might not be identified as salient until it has emerged several
times. In such a case, it would be necessary to reread all previously coded material to
check if the new code should be applied.
Narrative materials usually are not linear. For example, paragraphs from transcribed
interviews may contain elements relating to three or four different categories.
Example of a multitopic segment
Figure 16.1 shows an example of a multitopic segment of an interview from Beck and
Watson’s (2010) subsequent childbirth after a previous traumatic birth study. The
codes in the margin represent codes from the scheme in Box 16.1.
Methods of Organizing Qualitative Data
Before the advent of software for qualitative data management, analysts used
conceptual files to organize their data. This approach involves creating a physical file
for each category and then cutting out and inserting all the materials relating to that
category into the file. Researchers then retrieve the content on a particular topic by
reviewing the applicable file folder.
Creating conceptual files is a cumbersome, labor-intensive task, particularly when
segments of the narratives have multiple codes. For example, in Figure 16.1, seven
copies of the paragraph would be needed, corresponding to seven codes that were used.
Researchers must also provide enough context that the cut-up material can be
understood, and so it is often necessary to include material preceding or following the
relevant material.
Computer-assisted qualitative data analysis software (CAQDAS) removes the work
of cutting and pasting pages of narrative material. These programs permit an entire data
set to be entered onto the computer and coded; text corresponding to specified codes can
then be retrieved for analysis. The software can also be used to examine relationships
between codes. Computer programs offer many advantages for managing qualitative
data, but some people prefer manual methods because they allow researchers to get
closer to the data. Others object to having a cognitive process turned into a
technological activity. Despite concerns, many researchers have switched to
computerized data management because it frees up their time and permits them to
devote more attention to conceptual issues.
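The basic code-and-retrieve function that such software provides can be conveyed with a simple data structure. The sketch below is a deliberate simplification (real CAQDAS packages add memoing, hierarchical code trees, and query tools), and the excerpts are invented, loosely echoing the codes in Box 16.1:

```python
from collections import defaultdict

# Each excerpt is stored once; codes serve as an index into the data set, so a
# multicoded segment needs no physical copies (unlike conceptual files)
coded_data = defaultdict(list)

def code_segment(codes, excerpt):
    """Attach one or more codes from the coding scheme to a text segment."""
    for code in codes:
        coded_data[code].append(excerpt)

code_segment(["3A", "3D"], "I felt respected, like my body was mine again . . .")
code_segment(["3A"], "This time the midwife listened to every concern I raised . . .")

# Retrieve all material coded 3A ("Treated with respect") for analysis
for excerpt in coded_data["3A"]:
    print(excerpt)
```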
ANALYTIC PROCEDURES
Data management in qualitative research is reductionist in nature: It involves converting
masses of data into smaller, more manageable segments. By contrast, qualitative data
analysis is constructionist: It involves putting segments together into meaningful
conceptual patterns. Various approaches to qualitative data analysis exist, but some
elements are common to several of them.
A General Analytic Overview
The analysis of qualitative materials often begins with a search for broad categories or
themes. In their review of how the term theme is used among qualitative researchers,
DeSantis and Ugarriza (2000) offered this definition: “A theme is an abstract entity that
brings meaning and identity to a current experience and its variant manifestations. As
such, a theme captures and unifies the nature or basis of the experience into a
meaningful whole” (p. 362).
Themes emerge from the data. They may not only develop within categories of data
(i.e., within categories of the coding scheme) but may also cut across them. The search
for themes involves not only discovering commonalities across participants but also
seeking variation. Themes are never universal. Researchers must attend not only to what
themes arise but also to how they are patterned. Does the theme apply only to certain
types of people or in certain contexts? At certain periods? In other words, qualitative
analysts must be sensitive to relationships within the data.
TIP Qualitative researchers often use major themes as subheadings in the
“Results” section of their reports. For example, in their analysis of interviews
about the experiences of 14 family caregivers of patients with heart failure,
Gusdal and coresearchers (2016) identified two main themes that were used
to organize their results: “Living in a changed existence” and “Struggling and
sharing with healthcare.” Subthemes in the two categories were also given
headings in the report.
Researchers’ search for themes and patterns in the data can sometimes be facilitated
by devices that enable them to chart the evolution of behaviors and processes. For
example, for qualitative studies that focus on dynamic experiences (e.g., decision
making), flow charts or timelines can be used to highlight time sequences or major
decision points.
Some qualitative researchers use metaphors as an analytic strategy. A metaphor is a
symbolic comparison, using figurative language to evoke a visual analogy. Metaphors
can be expressive tools for qualitative analysts, but they can run the risk of “supplanting
creative insight with hackneyed cliché masquerading as profundity” (Thorne &
Darbyshire, 2005, p. 1111).
Example of a metaphor
Patel and colleagues (2016) studied the symptom experiences of women with
peripartum cardiomyopathy. The researchers captured the nature of the main theme
with the metaphor “Being caught in a spider web.”
A further analytic step involves validation. In this phase, the concern is whether the
themes accurately represent the participants’ perspectives. Several validation procedures
are discussed in Chapter 17.
In the final analysis stage, researchers strive to weave the thematic pieces together
into an integrated whole. The various themes are integrated to provide an overall
structure (such as a theory or full description) to the data. Successful integration
demands creativity and intellectual rigor.
TIP Although relatively few qualitative researchers make formal efforts to
quantify features of their data, be alert to quantitative implications when you
read a qualitative report. Qualitative researchers routinely use words like
“some,” “most,” or “many” in characterizing participants’ experiences and
actions, which implies some level of quantification.
Qualitative Content Analysis
In the remainder of this section, we discuss analytic procedures used by ethnographers,
phenomenologists, and grounded theory researchers. Qualitative researchers who
conduct descriptive qualitative studies may, however, simply say that they performed a
content analysis. Qualitative content analysis involves analyzing the content of
narrative data to identify prominent themes and patterns among the themes. Qualitative
content analysis involves breaking down data into smaller units, coding and naming the
units according to the content they represent, and grouping coded material based on
shared concepts. The literature on content analysis often refers to meaning units. A
meaning unit, essentially, is the smallest segment of a text that contains a recognizable
piece of information.
Content analysts often make the distinction between manifest and latent content.
Manifest content is what the text actually says. In purely descriptive studies, qualitative
researchers may focus mainly on summarizing the manifest content communicated in
the text. Often, however, content analysts also analyze what the text talks about, which
involves interpretation of the meaning of its latent content. Interpretations vary in depth
and level of abstraction and are usually the basis for themes.
Example of a content analysis
Herling and colleagues (2016) did a content analysis of semistructured interviews
with 12 women with early-stage endometrial cancer who had had a robotic-assisted
laparoscopic hysterectomy. Four overarching themes emerged: “Surgery was a piece
of cake,” “Recovering physically after surgery,” “Going from being off guard to
being on guard,” and “Preparing oneself by seeking information.”
Ethnographic Analysis
Analysis typically begins the moment ethnographers set foot in the field. Ethnographers
are continually looking for patterns in the behavior and thoughts of participants,
comparing one pattern against another, and analyzing many patterns simultaneously. As
they analyze patterns of everyday life, ethnographers acquire a deeper understanding of
the culture being studied. Maps, flow charts, and organizational charts are also useful
tools that help to crystallize and illustrate the data being collected. Matrices (two-
dimensional displays) can also help to highlight a comparison graphically, to cross-
reference categories, and to discover emerging patterns.
Spradley’s (1979) research sequence is sometimes used for ethnographic data
analyses. His 12-step sequence included strategies for both data collection and data
analysis. In Spradley’s method, there are four levels of data analysis: domain analysis,
taxonomic analysis, componential analysis, and theme analysis. Domains are broad
categories that represent units of cultural knowledge. During this first level of analysis,
ethnographers identify relational patterns among terms in the domains that are used by
members of the culture. The ethnographer focuses on the cultural meaning of terms and
symbols (objects and events) used in a culture and their interrelationships.
In taxonomic analysis, the second level in Spradley’s (1979) data analytic method,
ethnographers decide how many domains the analysis will encompass. Will only one or
two domains be analyzed in depth, or will several domains be studied less intensively?
After making this decision, a taxonomy—a system of classifying and organizing terms
—is developed to illustrate the internal organization of a domain.
In componential analysis, multiple relationships among terms in the domains are
examined. The ethnographer analyzes data for similarities and differences among
cultural terms in a domain. Finally, in theme analysis, cultural themes are uncovered.
Domains are connected in cultural themes, which help to provide a holistic view of the
culture being studied. The discovery of cultural meaning is the outcome.
Example using Spradley’s method
Michel and colleagues (2015) studied the meanings assigned to health care by long-
lived elders and nurses in a health care setting. They used Spradley’s method of
ethnographic analysis and identified and analyzed six domains. The overarching
cultural theme that emerged was the real to the ideal—the health (un)care of long-
lived elders.
Other approaches to ethnographic analysis have been developed. For example, in
Leininger’s ethnonursing research method, as described in McFarland and Wehbe-
Alamah (2015), ethnographers follow a four-phase ethnonursing data analysis guide. In
the first phase, ethnographers collect, describe, and record data. The second phase
involves identifying and categorizing descriptors. In phase 3, data are analyzed to
discover repetitive patterns in their context. The fourth and final phase involves
abstracting major themes and presenting findings.
Example using Leininger’s method
Raymond and Omeri (2015) studied the culture care for Mauritian immigrant
childbearing families living in Australia. Using Leininger’s four phases of
ethnonursing inquiry, the researchers identified five dominant themes: care as
extended family and friendship support, care as best professional and/or folk
practices, self-care as responsibility, care as enabling and empowerment, and care as
maintenance of a hygienic and supportive environment.
Phenomenological Analysis
Schools of phenomenology have developed different approaches to data analysis. Three
frequently used methods for descriptive phenomenology are the methods of Colaizzi
(1978), Giorgi (1985), and van Kaam (1966), all of whom are from the Duquesne
School of phenomenology, based on Husserl’s philosophy.
The basic outcome of all three methods is the description of the essential nature of
an experience, often through the identification of essential themes. Some important
differences among these three approaches exist. Colaizzi’s (1978) method, for example,
is the only one that calls for a validation of results by querying study participants.
Giorgi’s (1985) view is that it is inappropriate either to return to participants to validate
findings or to use external judges to review the analysis. Van Kaam’s (1966) method
requires that intersubjective agreement be reached with other expert judges.
Figure 16.2 provides an illustration of the steps involved in Colaizzi’s (1978) data
analysis approach, which is the most widely used of the three approaches by nurse
researchers.
Example of a study using Colaizzi’s method
Knecht and Fischer (2015) explored undergraduate nursing students’ experience of
service learning. Transcribed interviews with 10 students were analyzed using
Colaizzi’s method. Five themes emerged: “Shattering stereotypes,” “Overwhelmed
with their need,” “Transitioning to community caregiver,” “Advocating,” and
“Reciprocal benefits.”
Phenomenologists from the Utrecht School, such as van Manen (1997), combine
characteristics of descriptive and interpretive phenomenology. Van Manen’s approach
involves six activities: (1) turning to the nature of the lived experience, (2) exploring the
experience as we live it, (3) reflecting on essential themes, (4) describing the
phenomenon through the art of writing and rewriting, (5) maintaining a strong relation
to the phenomenon, and (6) balancing the research context by considering parts and
whole. According to van Manen, thematic aspects of experience can be uncovered from
participants’ descriptions of the experience by three methods: the holistic, selective, or
detailed approach. In the holistic approach, researchers view the text as a whole and try
to capture its meanings. In the selective (or highlighting) approach, researchers pull out
statements that seem essential to the experience under study. In the detailed (or line-by-
line) approach, researchers analyze every sentence. Once themes have been identified,
they become the objects of interpretation through follow-up interviews with
participants. Through this process, essential themes are discovered.
Example of a study using van Manen’s method
Rasmussen and Delmar (2014) provided a detailed description of their use of van
Manen’s methods in the study of patient dignity as perceived by surgical patients in a
Danish hospital. Holistic, selective, and detailed analyses were undertaken to reveal
the basic theme: to be an important person.
In addition to identifying themes from participants’ descriptions, van Manen (1997)
also called for gleaning thematic descriptions from artistic sources. Van Manen urged
qualitative researchers to keep in mind that literature, painting, and other art forms can
provide rich experiential data that can increase insights into the essential meaning of the
experience being studied.
A third school of phenomenology is an interpretive approach called Heideggerian
hermeneutics. Central to analyzing data in a hermeneutic study is the notion of the
hermeneutic circle. The circle signifies a methodological process in which, to reach
understanding, there is continual movement between the parts and the whole of the text
being analyzed. Gadamer (1975) stressed that, to interpret a text, researchers cannot
separate themselves from the meanings of the text and must strive to understand
possibilities that the text can reveal.
Benner (1994) offered an analytic approach for hermeneutic analysis that involves
three interrelated processes: the search for paradigm cases, thematic analysis, and
analysis of exemplars. Paradigm cases are “strong instances of concerns or ways of
being in the world” (Benner, 1994, p. 113). Paradigm cases are used early in the
analytic process as a strategy for gaining understanding. Thematic analysis is done to
compare and contrast similarities across cases. Lastly, paradigm cases and thematic
analysis can be enhanced by exemplars that illuminate aspects of a paradigm case or
theme. Paradigm cases and exemplars presented in research reports allow readers to
play a role in consensual validation of the results by deciding whether the cases support
the researchers’ conclusions.
Example using Benner’s hermeneutical analysis
Solomon and Hansen (2015) conducted an interpretive phenomenological study of
the unique lived experience of a dying patient and her family members. The
researchers used Benner’s approach in their analysis, which included paradigm cases,
thematic analysis, and exemplars. Exemplars included “Driving her own course” and
“Not being a burden.”
Grounded Theory Analysis
Grounded theory methods emerged in the 1960s when two sociologists, Glaser and
Strauss, were studying dying in hospitals. The two co-originators eventually split and
developed divergent approaches, which have been called the “Glaserian” and
“Straussian” versions of grounded theory. A third analytic approach by Charmaz
(2014), constructivist grounded theory, has also emerged.
Glaser and Strauss’s Grounded Theory Method
Grounded theory in all three analytic systems uses constant comparison, a method that
involves comparing elements present in one data source (e.g., in one interview) with
those in another. The process continues until the content of all sources has been
compared so that commonalities are identified. The concept of fit is an important
element in Glaserian grounded theory analysis. Fit has to do with how closely the
emerging concepts fit with the incidents they are representing—which depends on how
thoroughly constant comparison was done.
Coding in the Glaserian approach is used to conceptualize data into patterns. Coding
helps the researcher to discover the basic problem with which participants must
contend. The substance of the topic under study is conceptualized through substantive
codes, of which there are two types: open and selective. Open coding, used in the first
stage of constant comparison, captures what is going on in the data. Open codes may be
the actual words participants used. Through open coding, data are broken down, and
their similarities and differences are examined.
There are three levels of open coding that vary in degree of abstraction. Level I
codes (or in vivo codes) are derived directly from the language of the substantive area.
They have vivid imagery and “grab.” Table 16.1 presents five level I codes and
illustrative interview excerpts from Beck’s (2002) grounded theory study on mothering
twins.
As researchers constantly compare new level I codes with previously identified
ones, they condense them into broader level II codes. For example, in Table 16.1,
Beck’s (2002) five level I codes were collapsed into a single level II code, “Reaping the
Blessings.” Level III codes (or theoretical constructs) are the most abstract. Collapsing
level II codes aids in identifying constructs.
TIP Additional material relating to Beck’s (2002) twin study is presented
in the Supplement to this chapter on the companion website.
Open coding ends when the core category is discovered and then selective coding
begins. The core category (or core variable) is a pattern of behavior that is relevant
and/or problematic for study participants. In selective coding, researchers code only
those data that are related to the core category. One kind of core category is a basic
social process (BSP) that evolves over time in two or more phases. All BSPs are core
categories, but not all core categories have to be BSPs.
Glaser (1978) provided criteria to help researchers decide on a core category. Here
are a few examples: It must be central, meaning that it is related to many categories; it
must recur frequently in the data; it must relate meaningfully and easily to other
categories; and it must have clear and compelling implications for formal theory.
Theoretical codes provide insights into how substantive codes relate to each other.
Theoretical codes help grounded theorists to weave the broken pieces of data back
together again. Glaser (1978) proposed 18 families of theoretical codes that researchers
can use to conceptualize how substantive codes relate to each other (although he
subsequently expanded possibilities in 2005). Four examples of his families of
theoretical codes include the following:
Process: stages, phases, passages, transitions
Strategy: tactics, techniques, maneuverings
Cutting point: boundaries, critical junctures, turning points
The six Cs: causes, contexts, contingencies, consequences, covariances, and conditions
Throughout coding and analysis, grounded theory analysts document their ideas
about the data and emerging conceptual scheme in memos. Memos encourage
researchers to reflect on and describe patterns in the data, relationships between
categories, and emergent conceptualizations.
The product of a typical Glaserian grounded theory analysis is a theoretical model
that endeavors to explain a pattern of behavior that is relevant for study participants.
Once the basic problem emerges, the grounded theorist goes on to discover the process
these participants experience in coping with or resolving this problem.
Example of a Glaser and Strauss grounded theory analysis
Figure 16.3 presents Beck’s (2002) model from a study in which “Releasing the
Pause Button” was conceptualized as the core category and process through which
mothers of twins progressed as they tried to resume their lives after giving birth. The
process involves four phases: Draining Power, Pausing Own Life, Striving to Reset,
and Resuming Own Life. Beck used 10 coding families in her theoretical coding for
the study. The family cutting point offers an illustration. Three months seemed to be
a turning point for mothers, when life started to be more manageable. Here is an
excerpt from an interview that Beck coded as a cutting point: “Three months came
around and the twins sort of slept through the night and it made a huge, huge
difference.”
Glaser and Strauss cautioned against consulting the literature before a framework is
stabilized, but they also saw the benefit of scrutinizing other work. Glaser (1978)
discussed the evolution of grounded theories through the process of emergent fit to
prevent individual substantive theories from being “respected little islands of
knowledge” (p. 148). As he noted, generating grounded theory does not necessarily
require discovering all new categories or ignoring ones previously identified in the
literature. Through constant comparison, researchers can compare concepts emerging
from the data with similar concepts from existing theory or research to evaluate which
parts have emergent fit with the theory being generated.
Strauss and Corbin’s Approach
The Strauss and Corbin approach to grounded theory analysis, most recently described
in Corbin and Strauss (2015), differs from the original Glaser and Strauss method with regard
to method, processes, and outcomes. Table 16.2 summarizes major analytic differences
between these two grounded theory analysis methods.
Glaser (1978) stressed that to generate a grounded theory, the basic problem must
emerge from the data—it must be discovered. The theory is, from the very start,
grounded in the data rather than starting with a preconceived problem. Strauss and
Corbin, however, argued that the research itself is only one of four possible sources of a
research problem. Research problems can, for example, come from the literature or a
researcher’s personal and professional experience.
The Corbin and Strauss (2015) method involves two types of coding: open and axial
coding. In open coding, data are broken down into parts, and concepts are identified to
stand for the interpreted meaning of the raw data. In axial coding, the analyst codes for context.
Here, the analyst is “locating and linking action-interaction within a framework of
subconcepts that give it meaning and enable it to explain what interactions are
occurring, and why and what consequences real or anticipated are happening” (Corbin
& Strauss, 2015, p. 156). The paradigm is used as an analytic strategy to help integrate
structure and process. The basic components of the paradigm include conditions,
actions–interactions, and consequences or outcomes. Corbin and Strauss suggested the
conditional/consequential matrix as an analytic strategy for considering the range of
possible conditions and consequences that can enter into the context.
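To make the two coding steps concrete, the following is a minimal sketch, in Python, of how artifacts from open and axial coding might be organized. It is illustrative only, not part of the Corbin and Strauss method itself; the interview excerpt, concept labels, and data structures are all invented.

```python
# Hypothetical sketch of open and axial coding artifacts (illustrative only).

# Open coding: break raw data into parts and attach concept labels that
# capture the interpreted meaning of each part.
open_codes = {
    "excerpt_01": {
        "text": "I stopped asking for help once the nurses seemed rushed.",
        "concepts": ["withholding requests", "perceived staff pressure"],
    },
}

# Axial coding: code for context by linking action-interaction to the
# paradigm's basic components: conditions, actions-interactions, and
# consequences or outcomes.
axial_links = [
    {
        "condition": "perceived staff pressure",
        "action_interaction": "withholding requests",
        "consequence": "unmet care needs",
    },
]

for link in axial_links:
    print(f"{link['condition']} -> {link['action_interaction']} -> "
          f"{link['consequence']}")
```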
The first step in integrating the findings is to decide on the central category
(sometimes called the core category), which is the main construct in the research. The
outcome of the Strauss and Corbin approach is a full conceptual description. The
original grounded theory method, by contrast, generates a theory that explains how a
basic social problem that emerged from the data is processed in a social setting.
Example of a Strauss and Corbin grounded theory analysis
Lawler and colleagues (2015) sought to understand the process of transitioning to
motherhood for women with a disability. Data from interviews with 22 women were
analyzed using Strauss and Corbin’s method of open and axial coding: “The data
were broken down, examined, compared, conceptualized and categorized so that the
data could be interpreted, concepts and categories selected. Once the categories and
subcategories were sufficiently reinforced the data were reconstructed in different
ways through the linking of categories and subcategories. . . . Categories were then
integrated to refine the evolving theory” (p. 1675).
Constructivist Grounded Theory Approach
The constructivist approach to grounded theory is in some ways similar to a Glaserian
approach. According to Charmaz (2014), in constructivist grounded theory, the “coding
generates the bones of your analysis. Theoretical integration will assemble these bones
into a working skeleton” (p. 113). Charmaz offered guidelines for different types of
coding: word-by-word coding, line-by-line coding, and incident-to-incident coding.
Unlike Glaser and Strauss’s grounded theory approach in which theory is discovered
from data separate from the researcher, Charmaz’s position is that researchers construct
grounded theories by means of their past and current involvements and interactions with
individuals and research practices.
Charmaz (2014) distinguished initial coding and focused coding. In initial coding,
the pieces of data (e.g., words, lines, segments, incidents) are studied so the researcher
can learn what the participants view as problematic. In focused coding, the analysis is
directed toward identifying the most significant initial codes, which are then
theoretically coded.
Example of constructivist grounded theory analysis
Giles and coresearchers (2016) used constructivist methods to develop a grounded
theory of family presence during resuscitation, which they called “The Social
Construction of Conditional Permission.” Their article provides an excellent, detailed
description of their methods, tracing the construction of the core category
(“conditional permission”) from initial and focused codes through to the final
substantive grounded theory. This article is available on the companion website.
TIP Grounded theory researchers often present conceptual maps or models
to summarize their results, such as the one in Figure 16.2, especially when
the central phenomenon is a dynamic or evolving process.
CRITIQUING QUALITATIVE ANALYSIS
Evaluating a qualitative analysis is not easy to do. Readers do not have access to the
information they would need to assess whether researchers exercised good judgment
and critical insight in coding the narrative materials, developing a thematic analysis, and
integrating materials into a meaningful whole. Researchers are seldom able to include
more than a handful of examples of actual data in a journal article. Moreover, the
process they used to inductively abstract meaning from the data is difficult to describe
and illustrate.
A major focus of a critique of qualitative analyses is whether the researchers have
adequately documented the analytic process. The report should provide information
about the approach used to analyze the data. For example, a report for a grounded
theory study should indicate whether the researchers used the Glaser and Strauss,
Corbin and Strauss, or constructivist method.
Another aspect of a qualitative analysis that can be critiqued is whether the
researchers have documented that they have used one approach consistently and have
been faithful to the integrity of its procedures. Thus, for example, if researchers say they
are using the Glaserian approach to grounded theory analysis, they should not also
include elements from the Strauss and Corbin method. An even more serious problem
occurs when, as sometimes happens, the researchers “muddle” traditions. For example,
researchers who describe their study as a grounded theory study should not present
themes because grounded theory analysis does not yield themes. Researchers who
attempt to blend elements from two traditions may not have a clear grasp of the analytic
precepts of either one. For example, a researcher who claims to have undertaken an
ethnography using a grounded theory approach to analysis may not be well informed
about the underlying goals and philosophies of these two traditions.
Some further guidelines that may be helpful in evaluating qualitative analyses are
presented in Box 16.2.
Box 16.2 Guidelines for Critiquing Qualitative Analyses
1. Was the data analysis approach appropriate for the research design or tradition?
2. Was the category scheme described? If so, does the scheme appear logical and
complete?
3. Did the report adequately describe the process by which the actual analysis was
performed? Did the report indicate whose approach to data analysis was used
(e.g., Glaserian, Straussian, or constructivist in grounded theory studies)?
4. What major themes or processes emerged? Were relevant excerpts from the data
provided, and do the themes or categories appear to capture the meaning of the
narratives—that is, does it appear that the researcher adequately interpreted the
data and conceptualized the themes? Is the analysis parsimonious—could two or
more themes be collapsed into a broader and perhaps more useful
conceptualization?
5. Was a conceptual map, model, or diagram effectively displayed to communicate
important processes?
6. Was the context of the phenomenon adequately described? Did the report give
you a clear picture of the social or emotional world of study participants?
7. Did the analysis yield a meaningful and insightful picture of the phenomenon
under study? Is the resulting theory or description trivial or obvious?
This section describes the analytic procedures used in a qualitative study.
Read the summary and then answer the critical thinking questions that follow,
referring to the full research report if necessary. Example 1 is featured on the
interactive Critical Thinking Activity on the companion website. The critical
thinking questions for Example 2 are based on the study that appears in its
entirety in Appendix B of this book. Our comments for this exercise are in the
Student Resources section on the companion website.
EXAMPLE 1: A CONSTRUCTIVIST GROUNDED THEORY
ANALYSIS
Study: Care transition experiences of spousal caregivers: From a geriatric
rehabilitation unit to home (Byrne et al., 2011). (This study appears in its
entirety in the accompanying Study Guide.)
Statement of Purpose: The purpose of this study was to develop a theory
about caregivers’ transition processes and experiences during their spouses’
return home from a geriatric rehabilitation unit (GRU).
Method: This grounded theory study involved in-depth interviews with 18
older adult spousal caregivers. Most of the caregivers were interviewed on
three occasions: 48 hours prior to discharge from a 36-bed GRU in a
Canadian long-term care hospital, 2 weeks postdischarge, and 1 month
postdischarge. In addition to the interviews, which lasted between 35 and 120
minutes, the researchers made observations of interactions between spouses
and care recipients.
Analysis: Analysis began with line-by-line coding by the first author. All
authors contributed to focused coding, followed by theoretical coding. They
used constant comparison throughout the coding and analysis process and
provided a good example: “In the early stages of data collection and analysis,
we noticed that caregivers continually used the phrase ‘I don’t know,’ and
thus an open code by this name was created. . . . As data collection and
analysis proceeded, we engaged in focused coding using the term
knowing/not knowing to reflect these instances” (p. 1374). The researchers
illustrated with an interview excerpt how they came to understand that
knowing/not knowing was part of the process of navigating. The researchers
also noted that “Moving from line-by-line coding to focused coding was not a
linear process. As we engaged with the data, we returned to the data collected
to explore new ideas and conceptualization of codes” (p. 1375).
Key Findings: The basic problem the caregivers faced was “fluctuating
needs,” including the physical, emotional, social, and medical needs of the
caregivers and their spouses. The researchers developed a theoretical
framework in which reconciling in response to fluctuating needs emerged as
the basic social process. Reconciling encompassed three subprocesses:
navigating, safekeeping, and repositioning. The context that shaped
reconciling was a trajectory of prior care transitions and intertwined life
events.
Critical Thinking Exercises
1. Answer the relevant questions from Box 16.2 regarding this study.
2. Also consider the following targeted questions:
a. Comment on the researcher’s decision to use both interview data and
observations.
b. The authors wrote that “to foster theoretical sensitivity, memos focused
on actions and processes, and gradually incorporated relevant literature
(e.g., theoretical perspectives on transition)” (p. 1375). Comment on
this statement.
3. What might be some of the uses to which the findings could be put in
clinical practice?
EXAMPLE 2: A PHENOMENOLOGICAL ANALYSIS IN
APPENDIX B
• Read the methods and results sections of Beck and Watson’s (2010)
phenomenological study (“Subsequent childbirth after a previous traumatic
birth”) in Appendix B of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 16.2 regarding this study.
2. Also consider the following targeted questions:
a. Comment on the amount of data that had to be analyzed in this study.
b. Refer to Table 2 in the article, which presents a list of 10 significant
statements made by participants. In Colaizzi’s approach, the next step is
to construct formulated meanings from the significant statements. Try
to develop your own formulated meanings of one or two of these
significant statements.
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the companion website.
• Interactive Critical Thinking Activity
• Chapter Supplement on a Glaserian Grounded Theory Study: Illustrative
Materials
• Answer to the Critical Thinking Exercise for Example 2
• Internet Resources with useful websites for Chapter 16
• A Wolters Kluwer journal article in its entirety—the Giles et al. study
described as an example earlier in this chapter.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary Points
Qualitative analysis is a challenging, labor-intensive activity, with few fixed rules.
A first step in analyzing qualitative data is to organize and index the materials for
easy retrieval, typically by coding the content of the data according to a coding
scheme that involves devising descriptive or abstract categories.
Traditionally, researchers have organized their data by developing conceptual files,
which are physical files in which coded excerpts of data for specific categories are
placed. Now, however, computer software (CAQDAS) is widely used to perform
basic indexing functions and to facilitate data analysis.
The actual analysis of data begins with a search for patterns and themes, which
involves the discovery not only of commonalities across participants but also of
natural variation in the data. Some qualitative analysts use metaphors or figurative
comparisons to evoke a visual and symbolic analogy. In a final step, analysts try to
weave the thematic strands together into an integrated picture of the phenomenon
under investigation.
Researchers whose goal is qualitative description often say they used qualitative
content analysis as their analytic method. Content analysis can vary in terms of an
emphasis on manifest content or latent content.
In ethnographies, analysis begins as the researcher enters the field. One analytic
approach is Spradley’s method, which involves four levels of analysis: domain
analysis (identifying domains or units of cultural knowledge), taxonomic analysis
(selecting key domains and constructing taxonomies), componential analysis
(comparing and contrasting terms in a domain), and theme analysis (uncovering
cultural themes).
There are numerous approaches to phenomenological analysis, including the
descriptive methods of Colaizzi, Giorgi, and van Kaam, in which the goal is to
identify common patterns of experience shared across particular instances.
In van Manen’s approach, which involves efforts to grasp the essential meaning of
the experience being studied, researchers search for themes, using either a holistic
approach (viewing text as a whole), a selective approach (pulling out key
statements and phrases), or a detailed approach (analyzing every sentence).
Central to analyzing data in a hermeneutic study is the notion of the hermeneutic
circle, which signifies a process in which there is continual movement between
the parts and the whole of the text under analysis.
Benner’s approach consists of three processes: searching for paradigm cases,
thematic analysis, and analysis of exemplars.
Grounded theory uses the constant comparative method of data analysis, a
method that involves comparing elements present in one data source (e.g., in one
interview) with those in another. Fit has to do with how closely concepts fit with
incidents they represent, which is related to how thoroughly constant comparison
was done.
One grounded theory approach is the Glaser and Strauss (Glaserian) method, in
which there are two broad types of codes: substantive codes (in which the
empirical substance of the topic is conceptualized) and theoretical codes (in
which the relationships among the substantive codes are conceptualized).
Substantive coding involves open coding to capture what is going on in the data
and then selective coding, in which only variables relating to a core category are
coded. The core category, a behavior pattern that has relevance for participants, is
sometimes a basic social process (BSP) that involves an evolutionary process of
coping or adaptation.
In the Glaserian method, open codes begin with level I (in vivo) codes, which are
collapsed into a higher level of abstraction in level II codes. Level II codes are
then used to formulate level III codes, which are theoretical constructs. Through
constant comparison, the researcher compares concepts emerging from the data
with similar concepts from existing theory or research to see which parts have
emergent fit with the theory being generated.
Strauss and Corbin’s method is an alternative grounded theory method whose
outcome is a full conceptual description. This approach to grounded theory
analysis involves two types of coding: open (in which categories are generated)
and axial coding (where categories are linked with subcategories and integrated).
In Charmaz’s constructivist grounded theory, coding can be word-by-word, line-
by-line, or incident-by-incident. Initial coding leads to focused coding, which is
then followed by theoretical coding.
REFERENCES FOR CHAPTER 16
Beck, C. T. (2002). Releasing the pause button: Mothering twins during the first year of life. Qualitative Health
Research, 12, 593–608.
Beck, C. T., & Watson, S. (2010). Subsequent childbirth after a previous traumatic birth. Nursing Research, 59,
241–249.
Benner, P. (1994). The tradition and skill of interpretive phenomenology in studying health, illness, and caring
practices. In P. Benner (Ed.), Interpretive phenomenology: Embodiment, caring, and ethics in health and illness
(pp. 99–128). Thousand Oaks, CA: Sage.
Byrne, K., Orange, J., & Ward-Griffin, C. (2011). Care transition experiences of spousal caregivers: From a
geriatric rehabilitation unit to home. Qualitative Health Research, 21, 1371–1387.
Charmaz, K. (2014). Constructing grounded theory (2nd ed.). Thousand Oaks, CA: Sage.
Colaizzi, P. (1978). Psychological research as the phenomenologist views it. In R. Valle & M. King (Eds.),
Existential-phenomenological alternatives for psychology (pp. 48–71). New York, NY: Oxford University
Press.
Corbin, J., & Strauss, A. (2015). Basics of qualitative research: Techniques and procedures for developing
grounded theory. Thousand Oaks, CA: Sage.
DeSantis, L., & Ugarriza, D. N. (2000). The concept of theme as used in qualitative nursing research. Western
Journal of Nursing Research, 22, 351–372.
*Ersek, M., & Jablonski, A. (2014). A mixed-methods approach to investigating the adoption of evidence-based
pain practices in nursing homes. Journal of Gerontological Nursing, 40, 52–60.
Gadamer, H. G. (1975). Truth and method (G. Borden & J. Cumming, Trans.). London, United Kingdom: Sheed &
Ward. (Original work published 1960)
**Giles, T. M., de Lacey, S., & Muir-Cochrane, E. (2016). Coding, constant comparisons, and core categories: A
worked example for novice constructivist grounded theorists. Advances in Nursing Science, 39, E29–E44.
Giorgi, A. (1985). Phenomenology and psychological research. Pittsburgh, PA: Duquesne University Press.
Glaser, B. G. (1978). Theoretical sensitivity. Mill Valley, CA: Sociology Press.
Glaser, B. G. (2005). The grounded theory perspective III: Theoretical coding. Mill Valley, CA: Sociology Press.
Gusdal, A., Josefsson, K., Adolfsson, E., & Martin, L. (2016). Informal caregivers’ experiences and needs when
caring for a relative with heart failure: An interview study. Journal of Cardiovascular Nursing, 31(4), E1–E8.
Herling, S., Palle, C., Moeller, A., & Thomsen, T. (2016). The experience of robotic-assisted laparoscopic
hysterectomy for women treated for early-stage endometrial cancer: A qualitative study. Cancer Nursing, 39,
125–133.
Knecht, J. G., & Fischer, B. (2015). Undergraduate nursing students’ experience of service-learning: A
phenomenological study. Journal of Nursing Education, 54, 378–384.
Lawler, D., Begley, C., & Lalor, J. (2015). (Re)constructing myself: The process of transition to motherhood for
women with a disability. Journal of Advanced Nursing, 71, 1672–1683.
McFarland, M. R., & Wehbe-Alamah, H. B. (2015). Leininger’s culture care diversity and universality: A
worldwide nursing theory. Burlington, MA: Jones & Bartlett.
*Michel, T., Lenardt, M., Willig, M., & Alvarez, A. (2015). From real to ideal—the health (un)care of long-lived
elders. Revista Brasileira de Enfermagem, 68, 343–349.
*Patel, H., Berg, M., Barasa, A., Begley, C., & Schaufelberger, M. (2016). Symptoms in women with peripartum
cardiomyopathy: A mixed method study. Midwifery, 32, 14–20.
*Rasmussen, T. S., & Delmar, C. (2014). Dignity as an empirical lifeworld construction—in the field of surgery in
Denmark. International Journal of Qualitative Studies on Health and Well-Being, 9, 24849.
Raymond, L. M., & Omeri, A. (2015). Transcultural midwifery: Culture care for Mauritian immigrant childbearing
families living in New South Wales, Australia. In M. R. McFarland & H. B. Wehbe-Alamah (Eds.), Leininger’s
culture care diversity and universality: A worldwide nursing theory (pp. 183–254). Burlington, MA: Jones &
Bartlett.
Solomon, D., & Hansen, L. (2015). Living through the end: The phenomenon of dying at home. Palliative &
Supportive Care, 13, 125–134.
Spradley, J. P. (1979). The ethnographic interview. Belmont, CA: Wadsworth, Cengage Learning.
Thorne, S., & Darbyshire, P. (2005). Land mines in the field: A modest proposal for improving the craft of
qualitative health research. Qualitative Health Research, 15, 1105–1113.
van Kaam, A. (1966). Existential foundations of psychology. Pittsburgh, PA: Duquesne University Press.
van Manen, M. (1997). Researching lived experience: Human science for an action sensitive pedagogy (2nd ed.).
Ontario, Canada: The Althouse Press.
*A link to this open-access article is provided in the Internet Resources section of the companion website.
**This journal article is available on the companion website for this chapter.
17 Trustworthiness and Integrity in
Qualitative Research
Learning Objectives
On completing this chapter, you will be able to:
Discuss some controversies relating to the issue of quality in qualitative research
Identify the quality criteria proposed in a major framework for evaluating quality and
integrity in qualitative research
Discuss strategies for enhancing quality in qualitative research
Describe different dimensions relating to the interpretation of qualitative results
Define new terms in the chapter
Key Terms
Audit trail
Authenticity
Confirmability
Credibility
Data triangulation
Dependability
Disconfirming cases
Inquiry audit
Investigator triangulation
Member check
Method triangulation
Negative case analysis
Peer debriefing
Persistent observation
Prolonged engagement
Reflexivity
Researcher credibility
Thick description
Transferability
Triangulation
Trustworthiness
Integrity in qualitative research is a critical issue for both those doing the research and
those considering the use of qualitative evidence.
PERSPECTIVES ON QUALITY IN QUALITATIVE
RESEARCH
Qualitative researchers agree on the importance of doing high-quality research, yet
defining “high quality” has been controversial. We offer a brief overview of the
arguments on each side of the debate.
Debates About Rigor and Validity
One contentious issue concerns use of the terms rigor and validity—terms some people
shun because they are associated with the positivist paradigm. For these critics, the
concept of rigor is by its nature a term that does not fit into an interpretive paradigm that
values insight and creativity.
Others disagree with those opposing the term validity. Morse (2015), for example,
has argued that qualitative researchers should return to the terminology of the social
sciences—i.e., rigor, reliability, validity, and generalizability.
The complex debate has given rise to a variety of positions. At one extreme are
those who think that validity is an appropriate quality criterion in both quantitative and
qualitative studies, although qualitative researchers use different methods to achieve it.
At the opposite extreme are those who berate the “absurdity” of validity. A widely
adopted stance is what has been called a parallel perspective. This position was
proposed by Lincoln and Guba (1985), who created standards for the trustworthiness
of qualitative research that parallel the standards of reliability and validity in
quantitative research.
Generic Versus Specific Standards
Another controversy concerns whether there should be a generic set of quality standards
or whether specific standards are needed for different qualitative traditions. Some
writers believe that research conducted within different disciplinary traditions must
attend to different concerns and that techniques for enhancing research integrity vary.
Thus, different writers have offered standards for specific forms of qualitative inquiry,
such as grounded theory, phenomenology, ethnography, and critical research. Some
writers believe, however, that some quality criteria are fairly universal within the
constructivist paradigm. For example, Whittemore and colleagues (2001) prepared a
synthesis of criteria that they viewed as essential to all qualitative inquiry.
Terminology Proliferation and Confusion
The result of these controversies is that there is no common vocabulary for quality
criteria in qualitative research. Terms such as truth value, goodness, integrity, and
trustworthiness abound, but each proposed term has been refuted by some critics. With
regard to actual criteria for evaluating quality in qualitative research, dozens have been
suggested. Establishing a consensus on what the quality criteria should be, and what
they should be named, remains elusive.
Given the lack of consensus and the heated arguments supporting and contesting
various frameworks, it is difficult to provide guidance about quality standards. We
present information about criteria from the Lincoln and Guba (1985) framework in the
next section. (Criteria from another framework are described in the supplement to this
chapter on the companion website.) We then describe strategies that researchers use to
strengthen integrity in qualitative research. These strategies should provide guidance for
considering whether a qualitative study is sufficiently rigorous, trustworthy, insightful,
or valid.
LINCOLN AND GUBA’S FRAMEWORK OF QUALITY
CRITERIA
Although not without critics, the criteria often viewed as the “gold standard” for
qualitative research are those outlined by Lincoln and Guba (1985). These researchers
suggested four criteria for developing the trustworthiness of a qualitative inquiry:
credibility, dependability, confirmability, and transferability. These criteria represent
parallels to the positivists’ criteria of internal validity, reliability, objectivity, and
external validity, respectively. In later writings, responding to criticisms and to their
own evolving views, a fifth criterion more distinctively aligned with the constructivist
paradigm was added: authenticity (Guba & Lincoln, 1994).
Credibility
Credibility refers to confidence in the truth value of the data and interpretations of
them. Qualitative researchers must strive to establish confidence in the truth of the
findings for the particular participants and contexts in the research. Lincoln and Guba
(1985) pointed out that credibility involves two aspects: first, carrying out the study in a
way that enhances the believability of the findings, and second, taking steps to
demonstrate credibility to external readers. Credibility is a crucial criterion in
qualitative research that has been proposed in several quality frameworks.
Dependability
Dependability refers to the stability (reliability) of data over time and over conditions.
The dependability question is Would the study findings be repeated if the inquiry were
replicated with the same (or similar) participants in the same (or similar) context?
Credibility cannot be attained in the absence of dependability, just as validity in
quantitative research cannot be achieved in the absence of reliability.
Confirmability
Confirmability refers to objectivity—the potential for congruence between two or more
independent people about the data’s accuracy, relevance, or meaning. This criterion is
concerned with establishing that the data represent the information participants provided
and that the interpretations of those data are not imagined by the inquirer. For this
criterion to be achieved, the findings must reflect the participants’ voice and the
conditions of the inquiry and not the researcher’s biases.
Transferability
Transferability, analogous to generalizability, is the extent to which qualitative
findings have applicability in other settings or groups. Lincoln and Guba (1985) noted
that the investigator’s responsibility is to provide sufficient descriptive data that readers
can evaluate the applicability of the data to other contexts: “Thus the naturalist cannot
specify the external validity of an inquiry; he or she can provide only the thick
description necessary to enable someone interested in making a transfer to reach a
conclusion about whether transfer can be contemplated as a possibility” (p. 316).
Authenticity
Authenticity refers to the extent to which researchers fairly and faithfully show a range
of different realities. Authenticity emerges in a report when it conveys the feeling tone
of participants’ lives as they are lived. A text has authenticity if it invites readers into a
vicarious experience of the lives being described and enables readers to develop a
heightened sensitivity to the issues being depicted. When a text achieves authenticity,
readers are better able to understand the lives being portrayed “in the round,” with some
sense of the mood, experience, language, and context of those lives.
STRATEGIES TO ENHANCE QUALITY IN
QUALITATIVE INQUIRY
This section describes some of the strategies that qualitative researchers can use to
establish trustworthiness in their studies. We hope this description will prompt you to
carefully assess the steps researchers did or did not take to enhance quality.
We have not organized strategies according to the five criteria just described (e.g.,
strategies researchers use to enhance credibility) because many strategies
simultaneously address multiple criteria. Instead, we have organized strategies by phase
of the study—data collection, coding and analysis, and report preparation. Table 17.1
indicates how various quality-enhancement strategies map onto Lincoln and Guba’s
(1985) criteria.
Quality-Enhancement Strategies During Data Collection
Some of the strategies that qualitative researchers use are difficult to discern in a report.
For example, intensive listening during an interview, careful probing to obtain rich and
comprehensive data, and taking pains to gain participants’ trust are all strategies to
enhance data quality that cannot easily be communicated in a report. In this section, we
focus on some strategies that can be described to readers to increase their confidence in
the integrity of the study results.
Prolonged Engagement and Persistent Observation
An important step in establishing integrity in qualitative studies is prolonged
engagement—the investment of sufficient time collecting data to have an in-depth
understanding of the culture, language, or views of the people or group under study; to
test for misinformation; and to ensure saturation of important categories. Prolonged
engagement is also important for building trust with informants, which in turn makes it
more likely that useful and rich information will be obtained.
Example of prolonged engagement
Zakerihamidi and colleagues (2015) conducted a focused ethnographic study of
pregnant women’s perceptions of vaginal versus cesarean section in Iran. The lead
researcher had “long-term involvement with the participants during data collection”
(p. 43). It was noted that “the researcher fully immersed herself in the culture related
to the selection of the mode of delivery” (p. 42).
High-quality data collection in qualitative studies also involves persistent
observation, which concerns the salience of the data being gathered. Persistent
observation refers to the researchers’ focus on the characteristics or aspects of a
situation that are relevant to the phenomena being studied. As Lincoln and Guba (1985)
noted, “If prolonged engagement provides scope, persistent observation provides depth”
(p. 304).
Example of persistent observation
Nortvedt and colleagues (2016) conducted a qualitative study of 14 immigrant
women on long-term sick leave during their rehabilitation in Norway. In addition to
interviews, the first author conducted participant observation during two
rehabilitation courses at an outpatient clinic. Each course took place over 10 days
spread across 10 weeks, and observation of these immigrant women totaled 45 hours.
Reflexivity Strategies
Reflexivity involves awareness that the researcher as an individual brings to the inquiry
a unique background, set of values, and a professional identity that can affect the
research process. Reflexivity involves attending continually to the researcher’s effect on
the collection, analysis, and interpretation of data.
The most widely used strategy for maintaining reflexivity is to maintain a reflexive
journal or diary. Reflexive writing can be used to record, in an ongoing fashion,
thoughts about how previous experiences and readings about the phenomenon are
affecting the inquiry. Through self-interrogation and reflection, researchers seek to be
well positioned to probe deeply and to grasp the experience, process, or culture under
study through the lens of participants.
TIP Researchers sometimes begin a study by being interviewed themselves
with regard to the phenomenon under study. Of course, this approach is
possible only if the researcher has experienced that phenomenon.
Data and Method Triangulation
Triangulation refers to the use of multiple referents to draw conclusions about what
constitutes truth. The aim of triangulation is to “overcome the intrinsic bias that comes
from single-method, single-observer, and single-theory studies” (Denzin, 1989, p. 313).
Triangulation can also help to capture a more complete, contextualized picture of the
phenomenon under study. Denzin (1989) identified four types of triangulation (data
triangulation, investigator triangulation, method triangulation, and theory triangulation),
and other types have been proposed. Two types are relevant to data collection.
Data triangulation involves the use of multiple data sources for the purpose of
validating conclusions. There are three types of data triangulation: time, space, and
person. Time triangulation involves collecting data on the same phenomenon or about
the same people at different points in time (e.g., at different times of the year). This
concept is similar to test–retest reliability assessment—the point is not to study a
phenomenon longitudinally to assess change but to establish the congruence of the
phenomenon across time. Space triangulation involves collecting data on the same
phenomenon in multiple sites to test for cross-site consistency. Finally, person
triangulation involves collecting data from different types or levels of people (e.g.,
patients, health care staff) with the aim of validating data through multiple perspectives
on the phenomenon.
Example of person and space triangulation
Mill and a multidisciplinary team (2013) undertook participatory action research in
four countries (Jamaica, Kenya, Uganda, and South Africa) to explore stigma in
AIDS nursing care. They gathered data from frontline registered nurses, enrolled
nurses, and midwives, using personal and focus group interviews. They noted that
triangulation was specifically designed to “enhance the rigor of the study” (p. 1068).
Method triangulation involves using multiple methods of data collection. In
qualitative studies, researchers often use a rich blend of unstructured data collection
methods (e.g., interviews, observations, documents) to develop a comprehensive
understanding of a phenomenon. Diverse data collection methods provide an
opportunity to evaluate the extent to which a consistent and coherent picture of the
phenomenon emerges.
Example of method triangulation
Nilmanat and coresearchers (2015) studied the end-of-life experiences of Thai
patients with advanced cancer. Patients were interviewed multiple times, and
interviews lasted about an hour each. In-depth observations were made in the
participants’ homes, in hospitals, and at their funeral ceremonies at Buddhist temples.
Comprehensive and Vivid Recording of Information
In addition to taking steps to record interview data accurately (e.g., via careful
transcriptions of recorded interviews), researchers ideally prepare field notes that are
rich with descriptions of what transpired in the field—even if interviews are the primary
source of data.
Some researchers specifically develop an audit trail—a systematic collection of
materials that would allow an independent auditor to draw conclusions about the data.
An audit trail might include the raw data (e.g., interview transcripts), methodologic and
reflexive notes, topic guides, and data reconstruction products (e.g., drafts of the final
report). Similarly, the maintenance of a decision trail that articulates the researcher’s
decision rules for categorizing data and making analytic inferences is a useful way to
enhance the dependability of the study. When researchers share some decision trail
information in their reports, readers can better evaluate the soundness of the decisions.
Example of an audit trail
In their phenomenological study of the illness experiences of patients suffering with
rare diseases and of the health care providers who care for them, Garrino and
colleagues (2015) maintained careful documentation and an audit trail to enhance
credibility.
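As a way to picture what an audit trail might collect, here is a small hypothetical sketch in Python; the categories mirror the kinds of materials listed above, but every file name is invented for illustration.

```python
# Hypothetical audit trail manifest (all file names invented for illustration).
audit_trail = {
    "raw_data": ["interview_01_transcript.txt", "interview_02_transcript.txt"],
    "methodologic_notes": ["sampling_decisions.txt"],
    "reflexive_notes": ["reflexive_journal.txt"],
    "topic_guides": ["interview_guide_v1.txt", "interview_guide_v2.txt"],
    "data_reconstruction": ["draft_findings_v1.txt", "final_report_draft.txt"],
}

# An independent auditor could use such a manifest to verify that each
# category of material is present and traceable.
for category, files in audit_trail.items():
    print(f"{category}: {len(files)} item(s)")
```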
Member Checking
In a member check, researchers give participants feedback about emerging
interpretations and then seek participants’ reactions. The argument is that participants
should have an opportunity to assess and validate whether the researchers’
interpretations are good representations of their realities. Member checking can be
carried out as data are being collected (e.g., through probing to ensure that interviewers
have properly interpreted participants’ meanings) and more formally after data have
been analyzed in follow-up interviews.
Despite the potential that member checking has for enhancing credibility, it has
potential drawbacks. For example, member checks can lead to erroneous conclusions if
participants share a common façade or a desire to “cover up.” Also, some participants
might agree with researchers’ interpretations out of politeness or in the belief that
researchers are “smarter” than they are. Thorne and Darbyshire (2005) cautioned against
what they called adulatory validity, “a mutual stroking ritual that satisfies the agendas
of both researcher and researched” (p. 1110). They noted that member checking tends to
privilege interpretations that place participants in a charitable light.
Few strategies for enhancing data quality are as controversial as member checking.
Nevertheless, it is a strategy that has the potential to enhance credibility if it is done in a
manner that encourages candor and critical appraisal by participants.
Example of member checking
Smith (2015) conducted a qualitative study of the sexual protective strategies and
condom use in middle-aged African American women. A sample of 10 women, ages
45 to 56 years, participated in in-depth interviews. Themes relating to individual
factors, gender/relationship power factors, and sociocultural elements that influenced
sexual protection or risk-taking behavior were reviewed by three participants, who
confirmed that the themes accurately reflected their experiences.
Strategies Relating to Coding and Analysis
Excellent qualitative inquiry is likely to involve the simultaneous collection and
analysis of data, and so several of the strategies described earlier also contribute to
analytic integrity. Member checking, for example, can occur in an ongoing fashion as
part of the data collection process but typically also involves participants’ review of
preliminary analytic constructions. In this section, we introduce a few additional
quality-enhancement strategies associated with the coding, analysis, and interpretation
of qualitative data.
Investigator Triangulation
Investigator triangulation refers to the use of two or more researchers to make data
collection, coding, and analysis decisions. The underlying premise is that through
collaboration, investigators can reduce the possibility of biased decisions and
idiosyncratic interpretations.
Conceptually, investigator triangulation is analogous to interrater reliability in
quantitative studies and is a strategy that is often used in coding qualitative data. Some
researchers take formal steps to compare two or more independent category schemes or
independent coding decisions.
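Because investigator triangulation in coding parallels interrater reliability, some teams quantify agreement between independent coders. The following is a minimal sketch, assuming hypothetical codes for 10 interview excerpts, of how Cohen's kappa might be computed; grounded theorists more often resolve coding discrepancies through discussion, so this is an illustration rather than a required procedure.

```python
# Minimal sketch: Cohen's kappa for two coders' independent code assignments.
# All codes and excerpts are hypothetical.
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement given each coder's category frequencies
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Codes assigned to the same 10 excerpts by two researchers (hypothetical)
coder_1 = ["fear", "hardship", "fear", "adjust", "relax",
           "fear", "hardship", "adjust", "relax", "relax"]
coder_2 = ["fear", "hardship", "adjust", "adjust", "relax",
           "fear", "fear", "adjust", "relax", "relax"]

print(f"kappa = {cohens_kappa(coder_1, coder_2):.2f}")  # prints kappa = 0.73
```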
Example of independent coding
Aujoulat and coresearchers (2014) studied the challenges of self-care for young liver
transplant recipients. In-depth interviews were conducted with 18 patients (ages 16 to
30 years) and with several parental caregivers. Initial coding of transcripts was
undertaken by two researchers, who met regularly to discuss emerging categories.
Collaboration can also be used at the analysis stage. If investigators bring to the
analysis task a complementary blend of skills and expertise, the analysis and
interpretation can potentially benefit from divergent perspectives. In Aujoulat et al.’s
(2014) study of young liver transplant recipients, emerging themes were discussed in
four “focus group” meetings with the multidisciplinary team of researchers.
Searching for Disconfirming Evidence and Competing Explanations
A powerful verification procedure involves a systematic search for data that will
challenge a categorization or explanation that has emerged early in the analysis. The
search for disconfirming cases occurs through purposive or theoretical sampling
methods. Clearly, this strategy depends on concurrent data collection and data analysis:
Researchers cannot look for disconfirming data unless they have a sense of what they
need to know.
Example of searching for disconfirming evidence
Andersen and Owen (2014) conducted a grounded theory study to explain the process
of quitting smoking cigarettes. The two investigators worked together to analyze
transcripts from interviews with 16 participants: “We engaged in discussion of
emerging categories, seeking out contradictory evidence” (p. 254).
Lincoln and Guba (1985) discussed the related activity of negative case analysis.
This strategy (sometimes called deviant case analysis) is a process by which researchers
search for cases that appear to disconfirm earlier hypotheses and then revise their
interpretations as necessary. The goal of this procedure is to continuously refine a
hypothesis or theory until it accounts for all cases.
Example of a negative case analysis
Begley and colleagues (2015) studied whether clinical specialists in Ireland were
fulfilling role expectations in terms of involvement with research and evidence-based
practice (EBP) activities. After collecting interview and observational data, the team
came together to develop themes. The team searched for examples of negative cases
“that might disprove, or validate, emerging findings” (p. 104).
Peer Review and Debriefing
Peer debriefing involves external validation, often in face-to-face sessions with peers of
the researchers, to review aspects of the inquiry. Peer debriefing exposes researchers to
the searching questions of others who are experienced in the methods of
constructivist inquiry, in the phenomenon being studied, or in both.
In a peer review or debriefing session, researchers might present written or oral
summaries of the data that have been gathered, categories and themes that are emerging,
and researchers’ interpretations of the data. In some cases, taped interviews might be
played. Among the questions that peer debriefers might address are the following:
Do the gathered data adequately portray the phenomenon? Have all important themes
or categories been identified?
If there are important omissions, what strategies might remedy this problem?
Are there any apparent errors of fact or possible errors of interpretation?
Is there evidence of researcher bias?
Are the themes and interpretations knit together into a cogent, useful, and creative
conceptualization of the phenomenon?
Example of peer review
Belpame and colleagues (2016) explored the psychosocial experiences of 23
adolescents and young adults with cancer. They established a research committee that
served as a guide throughout the research process, including a review of the findings.
Inquiry Audits
A similar, but more formal, approach is to undertake an inquiry audit, a procedure that
involves a scrutiny of the actual data and relevant supporting documents by an external
reviewer. Such an audit requires careful documentation of all aspects of the inquiry.
Once the audit trail materials are assembled, the inquiry auditor proceeds to audit, in a
fashion analogous to a financial audit, the trustworthiness of the data and the meanings
attached to them. Such audits are a good tool for persuading others that qualitative data
are worthy of confidence. Relatively few comprehensive inquiry audits have been
reported in the literature, but some studies report partial audits.
Example of an inquiry audit
Rotegård and colleagues (2012) studied cancer patients’ experiences and perceptions
of their personal strengths through their illness and recovery in four focus group
interviews with 26 participants. A partial audit was undertaken by having an external
researcher review a sample of transcripts and interpretations.
Strategies Relating to Presentation
This section describes some aspects of the qualitative report itself that can help to
persuade readers of the high quality of the inquiry.
Thick and Contextualized Description
Thick description refers to a rich, thorough, and vivid description of the research
context, the study participants, and events and experiences observed during the inquiry.
Transferability cannot occur unless investigators provide sufficient information for
judging contextual similarity. Lucid and textured descriptions, with the judicious
inclusion of verbatim quotes from study participants, also contribute to the authenticity
of a qualitative study.
TIP Sandelowski (2004) cautioned as follows: “ . . . the phrase thick
description likely ought not to appear in write-ups of qualitative research at
all, as it is among those qualitative research words that should be seen but not
written” (p. 215).
In high-quality qualitative studies, descriptions typically go beyond a faithful
rendering of information. Powerful description is evocative and has the capacity for
emotional impact. Qualitative researchers must be careful, however, not to misrepresent
their findings by sharing only the most poignant stories. Thorne and Darbyshire (2005)
cautioned against what they called lachrymal validity, a criterion for evaluating research
by the extent to which the report can wring tears from its readers. At the same time, they
noted the opposite problem with reports that are “bloodless.” Bloodless findings are
characterized by a tendency of some researchers to “play it safe in writing up the
research, reporting the obvious . . . [and] failing to apply any inductive analytic spin to
the sequence, structure, or form of the findings” (p. 1109).
Researcher Credibility
Another aspect of credibility is researcher credibility. In qualitative studies,
researchers are the data-collecting instruments—as well as creators of the analytic
process—and so, their qualifications, experience, and reflexivity are relevant in
establishing confidence in the data. Patton (2002) has argued that trustworthiness is
enhanced if the report contains information about the researchers, including information
about credentials and any personal connections the researchers had to the people, topic,
or community under study. For example, it is relevant for a reader of a report on the
coping mechanisms of patients with AIDS to know that the researcher is HIV-positive.
Researcher credibility is also enhanced when reports describe the researchers’ efforts to
be reflexive.
Example of researcher credibility
Kindell and colleagues (2014) explored the experience of living with semantic
dementia. Kindell described herself as a speech and language therapist, specializing
in dementia care. She conducted all interviews, and her experience was described as a
“sensitizing experience [that] served as a resource for facilitating their stories” (p.
403). A coauthor was a community mental health nurse with 25 years of experience
in dementia care.
INTERPRETATION OF QUALITATIVE FINDINGS
It is difficult to describe the interpretive process in qualitative studies, but there is
considerable agreement that the ability to “make meaning” from qualitative texts
depends on researchers’ immersion in and closeness to the data. Incubation is the
process of living the data, a process in which researchers must try to understand their
meanings, find essential patterns, and draw insightful conclusions. Another ingredient in
interpretation and meaning making is researchers’ self-awareness and the ability to
reflect on their own worldview—that is, reflexivity. Creativity also plays an important
role in uncovering meaning in the data. Researchers need to devote sufficient time to
achieve the aha that comes with making meaning beyond the facts.
For readers of qualitative reports, interpretation is hampered by having limited
access to the data and no opportunity to “live” the data. Researchers are selective in the
amount and types of information to include in their reports. Nevertheless, you should
strive to consider some of the same interpretive dimensions for qualitative studies as for
quantitative ones (see Chapter 15).
The Credibility of Qualitative Results
As with quantitative reports, you should consider whether the results of a qualitative
inquiry are believable. It is reasonable to expect authors of qualitative reports to provide
evidence of the credibility of the findings. Because consumers view only a portion of
the data, they must rely on researchers’ efforts to corroborate findings through such
strategies as peer debriefings, member checks, audits, triangulation, and negative case
analysis. They must also rely on researchers’ frankness in acknowledging known
limitations.
In considering the believability of qualitative results, it makes sense to adopt the
posture of a person who needs to be persuaded about the researcher’s conceptualization
and to expect the researcher to present evidence with which to persuade you. It is also
appropriate to consider whether the researcher’s conceptualization is consistent with
your own clinical insights.
The Meaning of Qualitative Results
The researcher’s interpretation and analysis of qualitative data occur virtually
simultaneously in an iterative process. Unlike in quantitative analysis, the meaning of the
data flows directly from qualitative analysis. Efforts to validate the analysis are
necessarily efforts to validate interpretations as well. Nevertheless, prudent qualitative
researchers hold their interpretations up for closer scrutiny—self-scrutiny as well as
review by external reviewers.
TIP Interpretation in qualitative studies sometimes yields hypotheses that
can be tested in more controlled quantitative studies. Qualitative studies are
well suited to generating causal hypotheses but not to testing them.
The Importance of Qualitative Results
Qualitative research is especially productive when it is used to describe and explain
poorly understood phenomena. However, the phenomenon must be one that merits
scrutiny.
You should also consider whether the findings themselves are trivial. Perhaps the
topic is worthwhile, but you may feel after reading a report that nothing has been
learned beyond what is everyday knowledge—this can happen when the data are “thin”
or when the conceptualization is shallow. Readers, like researchers, want to have an aha
experience when they read about the lives of clients and their families. Qualitative
researchers often attach catchy labels to their themes, but you should ask yourself
whether the labels have really portrayed an insightful construct.
The Transferability of Qualitative Results
Qualitative researchers do not strive for generalizability, but the possible application of
the results to other settings is important to EBP. Thus, in interpreting qualitative results,
you should consider how transferable the findings are. In what types of settings and
contexts would you expect the phenomena under study to be manifested in a similar
fashion? Of course, to make such an assessment, the researchers must have described
the participants and context in sufficient detail. Because qualitative studies are context-
bound, it is only through a careful analysis of the key features of the study context that
transferability can be assessed.
The Implications of Qualitative Results
If the findings are judged to be believable and important and if you are satisfied with the
interpretation of the results, you can begin to consider what the implications of the
findings might be. First, you can consider implications for further research: Should a
similar study be undertaken in a different setting? Has an important construct been
identified that merits the development of a formal measuring instrument? Do the results
suggest hypotheses that could be tested through controlled quantitative research?
Second, do the findings have implications for nursing practice? For example, could the
health care needs of a subculture (e.g., the homeless) be addressed more effectively as a
result of the study? Finally, do the findings shed light on fundamental processes that
could play a role in nursing theories?
CRITIQUING INTEGRITY AND INTERPRETATIONS
IN QUALITATIVE STUDIES
For qualitative research to be judged trustworthy, investigators must earn the trust of
their readers. In a world that is conscious about the quality of research evidence,
qualitative researchers need to be proactive in doing high-quality research and
persuading others that they were successful.
Demonstrating integrity to others involves providing a good description of the
quality-enhancement activities that were undertaken. Yet many qualitative reports do
not provide much information about efforts to ensure that the study is strong with
respect to trustworthiness. Just as clinicians seek evidence for clinical decisions,
research consumers need evidence that findings are valid. Researchers should include
enough information about their quality-enhancement strategies for readers to draw
conclusions about study quality.
Part of the difficulty that qualitative researchers face in demonstrating
trustworthiness is that page constraints in journals impose conflicting demands. It takes
a precious amount of space to present quality-enhancement strategies adequately and
convincingly. Using space for such documentation means that there is less space for the
thick description of context and rich verbatim accounts that support authenticity and
vividness. Qualitative research is often characterized by the need for critical
compromises. It is well to keep such compromises in mind in critiquing qualitative
research reports.
An important point in thinking about quality in qualitative inquiry is that attention
needs to be paid to both “art” and “science” and to interpretation and description.
Creativity and insightfulness need to be attained but not at the expense of soundness.
And the quest for soundness cannot sacrifice inspiration, or else the results are likely to
be “perfectly healthy but dead” (Morse, 2006, p. 6). Good qualitative work is both
descriptively accurate and explicit, and interpretively rich and innovative. Some
guidelines that may be helpful in evaluating qualitative methods and analyses are
presented in Box 17.1.
Box 17.1 Guidelines for Evaluating Trustworthiness and Integrity in
Qualitative Studies
1. Did the report discuss efforts to enhance or evaluate the quality of the data and the
overall inquiry? If so, was the description sufficiently detailed and clear? If not,
was there other information that allowed you to draw inferences about the quality
of the data, the analysis, and the interpretations?
2. Which specific techniques (if any) did the researcher use to enhance the
trustworthiness and integrity of the inquiry? What quality-enhancement strategies
were not used? Would additional strategies have strengthened your confidence in
the study and its evidence?
3. Did the researcher adequately represent the multiple realities of those being
studied? Do the findings seem authentic?
4. Given the efforts to enhance data quality, what can you conclude about the study’s
validity/integrity/rigor/trustworthiness?
5. Did the report discuss any study limitations and their possible effects on the
credibility of the results or on interpretations of the data? Were results interpreted
in light of findings from other studies?
6. Did the researchers discuss the study’s implications for clinical practice or future
research? Were the implications well grounded in the study evidence and in
evidence from earlier research?
This section describes quality-enhancement efforts in a grounded theory study—a study that was also described in Chapter 11 and that is available on the companion website. Read the summary and then answer the critical thinking questions that follow, referring to the full research report if necessary. Example 1 is featured in the interactive Critical Thinking Activity on the companion website. The critical thinking questions for Example 2 are based on the study that appears in its entirety in Appendix B of this book. Our comments for these exercises are in the Student Resources section of the companion website.
EXAMPLE 1: TRUSTWORTHINESS IN A GROUNDED
THEORY STUDY
Study: The psychological process of breast cancer patients receiving initial chemotherapy (Chen et al., 2015). The full article is available on the companion website as the reading for Chapter 11.
Statement of Purpose: The purpose of this study was to explore patients’
suffering and adverse effects during the process of receiving the first course
of chemotherapy for breast cancer.
Method: The researchers used Glaser’s grounded theory methods. Twenty
Taiwanese women, ranging in age from 39 to 62 years, were interviewed
within 6 months of completing the first course of chemotherapy. Purposive
sampling was used initially, and then theoretical sampling was used to select
additional participants until categories were saturated. The interviews
included such broad questions as the following: During chemotherapy, what
was on your mind? How did the chemotherapy affect your life? The audio-
recorded interviews were transcribed for analysis.
Quality Enhancement Strategies: The researchers’ report provided good
detail about efforts to enhance the trustworthiness of their study, as described
in a subsection of their “Method” section labeled “Rigor.” The researchers
noted that the lead investigator participated in the care of the women during
their hospitalization and during follow-up visits, thereby contributing to
prolonged engagement—and to the development of a good therapeutic
relationship. The researcher continued to observe the verbal and nonverbal
expressions of these patients during follow-up visits; this strategy was
described as persistent observation but could also be considered data
triangulation if the analysis was informed by both the interview data and the
informal observations. Three experts were invited to review and discuss the
emerging conceptualization (peer debriefing). Two study participants
reviewed the findings in a member check effort. The lead researcher also
maintained a reflexive journal that guided her during data collection. During
the interviews, the questioning was informed by ongoing data analysis so that
questions were linked to emergent categories to achieve saturation. The report
also included explicit statements about the researchers’ credentials and
experience, thus supporting researcher credibility. In terms of thick
description, the researchers provided many vivid excerpts from the
interviews.
Key Findings: The researchers concluded that the core category was “Rising
from the ashes.” Four categories represented four stages of the psychological
process experienced by these patients: the fear stage, the hardship stage, the
adjustment stage, and the relaxation stage. The authors noted that each stage
is likely to occur repeatedly.
Critical Thinking Exercises
1. Answer the relevant questions from Box 17.1 regarding this study.
2. Also consider the following targeted questions:
a. Which quality-enhancement strategy used by Chen et al. (2015) gave
you the most confidence in the integrity and trustworthiness of their
study? Why?
b. Think of an additional type of triangulation that the researchers could
have used in their study and describe how this could have been
implemented.
3. What might be some of the uses to which the findings could be put in
clinical practice?
EXAMPLE 2: TRUSTWORTHINESS IN THE
PHENOMENOLOGIC STUDY IN APPENDIX B
• Read the methods and results sections of Beck and Watson’s (2010)
phenomenological study (“Subsequent childbirth after a previous traumatic
birth”) in Appendix B of this book.
Critical Thinking Exercises
1. Answer the relevant questions from Box 17.1 regarding this study.
2. Also consider the following targeted questions:
a. Suggest one or two ways in which triangulation could have been used in
this study.
b. Which quality-enhancement strategy used by Beck and Watson gave
you the most confidence in the integrity and trustworthiness of their
study? Why?
WANT TO KNOW MORE?
A wide variety of resources to enhance your learning and understanding of
this chapter are available on the companion website.
• Interactive Critical Thinking Activity
• Chapter Supplement on Whittemore and Colleagues’ Framework of
Quality Criteria in Qualitative Research
• Answer to the Critical Thinking Exercise for Example 2
• Internet Resources with useful websites for Chapter 17
• A Wolters Kluwer journal article in its entirety—the Andersen and Owen
study described as an example on page 301.
Additional study aids, including eight journal articles and related
questions, are also available in Study Guide for Essentials of Nursing
Research, 9e.
Summary
Points
One of several controversies regarding quality in qualitative studies involves
terminology. Some argue that rigor and validity are quantitative terms that are not
suitable as goals in qualitative inquiry, but others believe these terms are
appropriate. Other controversies involve what criteria to use as indicators of
integrity and whether there should be generic or study-specific criteria.
Lincoln and Guba proposed one framework for evaluating trustworthiness in
qualitative inquiries in terms of five criteria: credibility, dependability,
confirmability, transferability, and authenticity.
Credibility, which refers to confidence in the truth value of the findings, has been
viewed as the qualitative equivalent of internal validity. Dependability, the
stability of data over time and over conditions, is somewhat analogous to
reliability in quantitative studies. Confirmability refers to the objectivity of the
data. Transferability, the analog of external validity, is the extent to which
findings can be transferred to other settings or groups. Authenticity is the extent
to which researchers faithfully show a range of different realities and convey the
feeling tone of lives as they are lived.
Strategies for enhancing quality during qualitative data collection include
prolonged engagement, which strives for adequate scope of data coverage;
persistent observation, which is aimed at achieving adequate depth;
comprehensive recording of information (including maintenance of an audit
trail); triangulation; and member checks (asking study participants to review and
react to emerging conceptualizations).
Triangulation is the process of using multiple referents to draw conclusions about
what constitutes the truth. This includes data triangulation (using multiple data
sources to validate conclusions) and method triangulation (using multiple
methods to collect data about the same phenomenon).
Strategies for enhancing quality during the coding and analysis of qualitative data
include investigator triangulation (independent coding and analysis of data by
two or more researchers), searching for disconfirming evidence, searching for
rival explanations and undertaking a negative case analysis (revising
interpretations to account for cases that appear to disconfirm early conclusions),
external validation through peer debriefings (exposing the inquiry to the
searching questions of peers), and launching an inquiry audit (a formal scrutiny
of audit trail documents by an independent auditor).
Strategies that can be used to convince report readers of the high quality of
qualitative inquiries include using thick description to vividly portray
contextualized information about study participants and the focal phenomenon and
making efforts to be transparent about researcher credentials and reflexivity so that
researcher credibility can be established.
Interpretation in qualitative research involves “making meaning”—a process that
is difficult to describe or critique. Yet interpretations in qualitative inquiry need to
be reviewed in terms of credibility, importance, transferability, and implications.
REFERENCES FOR CHAPTER 17
**Andersen, J. S., & Owen, D. (2014). Helping relationships for smoking cessation: Grounded theory development
of the process of finding help to quit. Nursing Research, 63, 252–259.
Aujoulat, I., Janssen, M., Libion, F., Charles, A., Struyf, C., Smets, F., . . . Reding, R. (2014). Internalizing
motivation to self-care: A multifaceted challenge for young liver transplant recipients. Qualitative Health
Research, 24, 357–365.
Begley, C., Elliott, N., Lalor, J., & Higgins, A. (2015). Perceived outcomes of research and audit activities of
clinical specialists in Ireland. Clinical Nurse Specialist, 29, 100–111.
Belpame, N., Kars, M., Beeckman, D., Decoene, E., Quaghebeur, M., Van Hecke, A., & Verhaeghe, S. (2016).
“The AYA Director”: A synthesizing concept to understand psychosocial experiences of adolescents and young
adults with cancer. Cancer Nursing, 39(4), 292–302.
Chen, Y. C., Huang, H., Kao, C., Sun, C., Chiang, C., & Sun, F. (2015). The psychological process of breast cancer
patients receiving initial chemotherapy. Cancer Nursing. Advance online publication.
Denzin, N. K. (1989). The research act: A theoretical introduction to sociological methods (3rd ed.). Upper Saddle
River, NJ: Prentice Hall.
Garrino, L., Picco, E., Finiguerra, I., Rossi, D., Simone, P., & Roccatello, D. (2015). Living with and treating rare
diseases: Experiences of patients and professional health care providers. Qualitative Health Research, 25, 636–
651.
Guba, E., & Lincoln, Y. (1994). Competing paradigms in qualitative research. In N. Denzin & Y. Lincoln (Eds.),
Handbook of qualitative research (pp. 105–117). Thousand Oaks, CA: Sage.
Kindell, J., Sage, K., Wilkinson, R., & Keady, J. (2014). Living with semantic dementia: A case study of one
family’s experience. Qualitative Health Research, 24, 401–411.
Lincoln, Y. S., & Guba, E. G. (1985). Naturalistic inquiry. Newbury Park, CA: Sage.
Mill, J., Harrowing, J., Rae, T., Richter, S., Minnie, K., Mbalinda, S., & Hepburn-Brown, C. (2013). Stigma in
AIDS nursing care in Sub-Saharan Africa and the Caribbean. Qualitative Health Research, 23, 1066–1078.
Morse, J. M. (2006). Insight, inference, evidence, and verification: Creating a legitimate discipline. International
Journal of Qualitative Methods, 5(1), 93–100. Retrieved from
https://ejournals.library.ualberta.ca/index.php/IJQM/article/view/4412
Morse, J. M. (2015). Critical analysis of strategies for determining rigor in qualitative inquiry. Qualitative Health
Research, 25, 1212–1222.
Nilmanat, K., Promnoi, C., Phungrassami, T., Chailungka, P., Tulathamkit, K., Noo-urai, P., & Phattaranavig, S.
(2015). Moving beyond suffering: The experiences of Thai persons with advanced cancer. Cancer Nursing, 38,
224–231.
Nortvedt, L., Lohne, V., Kumar, B. N., & Hansen, H. P. (2016). A lonely life—a qualitative study of immigrant
women on long-term sick leave in Norway. International Journal of Nursing Studies, 54, 54–64.
Patton, M. Q. (2002). Qualitative research & evaluation methods (3rd ed.). Thousand Oaks, CA: Sage.
Rotegård, A., Fagermoen, M., & Ruland, C. (2012). Cancer patients’ experiences of their personal strengths
through illness and recovery. Cancer Nursing, 35, E8–E17.
Sandelowski, M. (2004). Counting cats in Zanzibar. Research in Nursing & Health, 27, 215–216.
Smith, T. K. (2015). Sexual protective strategies and condom use in middle-aged African American women: A
qualitative study. Journal of the Association of Nurses in AIDS Care, 26, 526–541.
Thorne, S., & Darbyshire, P. (2005). Land mines in the field: A modest proposal for improving the craft of
qualitative health research. Qualitative Health Research, 15, 1105–1113.
Whittemore, R., Chase, S. K., & Mandle, C. L. (2001). Validity in qualitative research. Qualitative Health
Research, 11, 522–537.
*Zakerihamidi, M., Latifnejad Roudsari, R., & Merghati Khoei, E. (2015). Vaginal delivery vs. cesarean section: A
focused ethnographic study of women’s perceptions in the north of Iran. International Journal of Community
Based Nursing and Midwifery, 3, 39–50.
*A link to this open-access article is provided in the Internet Resources section of the companion website.
**This journal article is available on the companion website for this chapter.
18 Systematic Reviews: Meta-Analysis
and Metasynthesis
Learning Objectives
On completing this chapter, you will be able to:
Discuss alternative approaches to integrating research evidence and advantages to
using systematic methods
Describe key decisions and steps in doing a meta-analysis and metasynthesis
Critique key aspects of a written systematic review
Define new terms in the chapter
Key Terms
Effect size (ES)
Forest plot
Frequency effect size
Intensity effect size
Manifest effect size
Meta-analysis
Meta-ethnography
Meta-summary
Metasynthesis
Primary study
Publication bias
Statistical heterogeneity
Subgroup analysis
Systematic review
In Chapter 7, we described major steps in conducting a literature review. This chapter
also discusses reviews of existing evidence but focuses on systematic reviews,
especially those in the form of meta-analyses and metasyntheses. Systematic reviews, a
cornerstone of evidence-based practice (EBP), are inquiries that follow many of the
same rules as those for primary studies, i.e., original research investigations. This
chapter provides guidance in helping you to understand and evaluate systematic
research integration.
RESEARCH INTEGRATION AND SYNTHESIS
A systematic review integrates research evidence about a specific research question
using careful sampling and data collection procedures that are spelled out in advance.
The review process is disciplined and transparent so that readers of a systematic review
can assess the integrity of the conclusions.
Twenty years ago, systematic reviews usually involved narrative integration, using
nonstatistical methods to synthesize research findings. Narrative systematic reviews
continue to be published, but meta-analytic techniques that use statistical integration are
widely used. Most reviews in the Cochrane Collaboration, for example, are meta-
analyses. Statistical integration, however, is sometimes inappropriate, as we shall see.
Qualitative researchers have also developed techniques to integrate findings across
studies. Many terms exist for such endeavors (e.g., meta-study, meta-ethnography), but the one that has gained the widest currency is metasynthesis.
The field of research integration is expanding steadily. This chapter provides a brief
introduction to this important and complex topic.
META-ANALYSIS
Meta-analyses of randomized controlled trials (RCTs) are at the pinnacle of traditional
evidence hierarchies for Therapy questions (see Fig. 2.1). The essence of a meta-
analysis is that findings from each study are used to compute a common index, an effect
size. Effect size values are averaged across studies, yielding information about the
relationship between variables across multiple studies.
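To make the idea of a common index concrete, here is a minimal sketch of fixed-effect, inverse-variance pooling, one standard way of averaging effect sizes. This example is ours, not the textbook's, and all of the study values are invented purely for illustration:

```python
# Fixed-effect meta-analysis sketch: pool per-study effect sizes by weighting
# each study by the inverse of its variance (more precise studies count more).
# All numbers below are hypothetical.

effect_sizes = [0.35, 0.20, 0.48]   # standardized mean differences, one per study
variances = [0.04, 0.02, 0.06]      # sampling variance of each effect size

weights = [1 / v for v in variances]
pooled = sum(w * es for w, es in zip(weights, effect_sizes)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5   # standard error of the pooled estimate

print(f"Pooled effect size = {pooled:.3f}, SE = {pooled_se:.3f}")
```

Real meta-analyses also have to contend with statistical heterogeneity (a key term in this chapter) and may pool under a random-effects model instead, but the weighted-average logic is the same.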
Advantages of Meta-Analyses
Meta-analysis offers a simple advantage as an integration method: objectivity. It is
difficult to draw objective conclusions about a body of evidence using narrative
methods when results are inconsistent, as they often are. Narrative reviewers make
subjective decisions about how much weight to give findings from different studies, and
so different reviewers may reach different conclusions in reviewing the same studies.
Meta-analysts make decisions that are explicit and open to scrutiny. The integration
itself also is objective because it uses statistical formulas. Readers of a meta-analysis
can be confident that another analyst using the same data set and analytic decisions
would come to the same conclusions.
Another advantage of meta-analysis concerns power, i.e., the probability of
detecting a true relationship between variables (see Chapter 14). By combining effects
across multiple studies, power is increased. In a meta-analysis, it is possible to conclude
that a relationship is real (e.g., an intervention is effective), even when several small
studies yielded nonsignificant findings. In a narrative review, 10 nonsignificant findings
would almost surely be interpreted as lack of evidence of a true effect, which could be
the wrong conclusion.
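A hypothetical arithmetic illustration of this power advantage (our example, not data from any real review): suppose five small studies each estimate the same modest effect, and none reaches statistical significance on its own, yet the pooled estimate does.

```python
import math

# Five hypothetical small studies with identical results: effect size 0.30,
# variance 0.04 (standard error 0.20). Each study's z = 0.30 / 0.20 = 1.5,
# below the 1.96 needed for significance at the .05 level.
effect_sizes = [0.30] * 5
variances = [0.04] * 5

for es, v in zip(effect_sizes, variances):
    print(f"single study: z = {es / math.sqrt(v):.2f}  -> not significant")

# Inverse-variance pooling shrinks the standard error, so the same effect
# size now clears the significance threshold (pooled z is about 3.35).
weights = [1 / v for v in variances]
pooled = sum(w * es for w, es in zip(weights, effect_sizes)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
print(f"pooled estimate: z = {pooled / pooled_se:.2f}  -> significant")
```

This is precisely the situation described above: a narrative reviewer tallying five "nonsignificant" results might wrongly conclude that there is no true effect.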
Despite these advantages, meta-analysis is not always appropriate. Indiscriminate
use has led critics to warn against potential abuses.
Criteria for Using Meta-Analytic Techniques in a
Systematic Review
Reviewers need to decide whether statistical integration is suitable. A basic criterion is
that the research question should be nearly identical across studies. This means that the
independent and dependent variables, and the study populations, are sufficiently similar
to merit integration. The variables may be operationalized differently, to be sure. For example, nurse-led interventions to promote healthy diets among diabetics could be a 4-week clinic-based program in one study and a 6-week home-based intervention in another. However, a study of the effects of a 1-hour lecture to discourage eating “junk
food” among overweight adolescents would be a poor candidate to include in this meta-
analysis. This is frequently referred to as the “apples and oranges” or “fruit” problem.
Meta-analyses should not be about fruit—i.e., a broad category—but rather about “apples,” or, even better, “Granny Smith apples.”
Another criterion concerns whether there is a sufficient knowledge base for
statistical integration. If there are only a few studies or if all of the studies are weakly
designed, it usually would not make sense to compute an “average” effect.
One other issue concerns the consistency of the evidence. When the same
hypothesis has been tested in multiple studies and the results are highly conflicting,
meta-analysis is likely not appropriate. As an extreme example, if half the studies
testing an intervention found benefits for those in the intervention group,