Quick answer. Reliability is about consistency: would the same study, repeated, produce the same results? Validity is about accuracy: does the study actually measure what it claims to measure? In quantitative research, reliability is assessed with Cronbach’s alpha, test-retest correlations, and inter-rater agreement; validity is assessed as content, construct, and criterion validity. In qualitative research, the equivalent concepts are credibility, transferability, dependability, and confirmability (Lincoln & Guba’s “trustworthiness” framework). Examiners expect explicit treatment of both in your methodology chapter.
Every dissertation has to answer two questions about its methods: are they consistent, and are they accurate? Examiners look for explicit, named procedures that demonstrate both. The vocabulary differs by paradigm — quantitative researchers use reliability and validity; qualitative researchers use trustworthiness criteria — but the underlying concerns are the same. This guide covers both, the procedures that demonstrate each criterion, and how to write up the section in your methodology chapter.
The Quantitative Framework
Reliability types and tests
| Reliability type | What it tests | Statistic | Acceptable threshold |
|---|---|---|---|
| Internal consistency | All items measure the same construct | Cronbach’s α | ≥ 0.70 (acceptable), ≥ 0.80 (good) |
| Test-retest | Same instrument gives stable scores over time | Pearson r between administrations | ≥ 0.70 |
| Parallel-forms | Two versions of the test agree | Pearson r between forms | ≥ 0.80 |
| Inter-rater | Different raters score the same data the same way | Cohen’s κ (categorical) or ICC (continuous) | κ ≥ 0.61 (substantial), ICC ≥ 0.75 |
| Split-half | Two halves of the instrument agree | Spearman–Brown corrected r | ≥ 0.80 |
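As a minimal sketch of how two of the statistics in this table are computed, the Python below calculates Cronbach’s α from a respondents-by-items score matrix and applies the Spearman–Brown correction to a split-half correlation. The function names and the pilot scores are hypothetical, for illustration only; in practice you would normally report the values produced by SPSS, R, or a similar package.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix with rows = respondents, columns = items."""
    k = len(scores[0])  # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def spearman_brown(half_r):
    """Step a split-half correlation up to an estimate for the full-length test."""
    return 2 * half_r / (1 + half_r)


# Hypothetical pilot data: 5 respondents answering 4 Likert items.
pilot = [
    [4, 5, 4, 5],
    [3, 3, 2, 3],
    [5, 5, 4, 4],
    [2, 2, 3, 2],
    [4, 4, 5, 4],
]
print(f"alpha = {cronbach_alpha(pilot):.2f}")
print(f"corrected split-half r = {spearman_brown(0.70):.2f}")
```

Compare the printed values against the thresholds in the table, and compute α separately for each subscale rather than once for the whole questionnaire.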
Validity types
- Content validity: does the instrument cover all aspects of the construct? Assessed by expert panel review (3–5 subject experts rating each item).
- Face validity: does the instrument look reasonable to participants? This is the weakest form and is never sufficient alone.
- Construct validity: does the instrument actually measure the theoretical construct? Tested by factor analysis (exploratory, confirmatory, or both) and by convergent and discriminant correlations, meaning strong correlations with measures of related constructs and weak correlations with measures of unrelated ones.
- Criterion validity: does the instrument correlate with a gold-standard outcome? Sub-types: concurrent (both measured at the same time) and predictive (the instrument predicts a future outcome).
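The convergent/discriminant check for construct validity comes down to comparing correlations. The sketch below, which assumes invented scores and a hypothetical helper named `pearson_r`, correlates a new scale with an established measure of the same construct (convergent) and with a measure of an unrelated construct (discriminant).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("a variable with no variation has no correlation")
    return covariation / sqrt(ss_x * ss_y)


# Hypothetical scores for the same six participants on three instruments.
new_scale = [12, 18, 9, 22, 15, 20]
established_scale = [14, 19, 10, 24, 13, 21]  # same construct
unrelated_measure = [3, 5, 5, 5, 3, 3]        # different construct

convergent = pearson_r(new_scale, established_scale)
discriminant = pearson_r(new_scale, unrelated_measure)
print(f"convergent r = {convergent:.2f}, discriminant r = {discriminant:.2f}")
```

A high convergent correlation alongside a near-zero discriminant correlation supports construct validity; report both, since either one alone says little.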
The Qualitative Framework: Lincoln & Guba
Reliability and validity translate awkwardly into qualitative research because qualitative work explicitly rejects the assumption that there is one fixed reality to measure. Lincoln & Guba (1985) proposed parallel criteria under the umbrella of trustworthiness:
| Quant criterion | Qual equivalent | Procedure to demonstrate |
|---|---|---|
| Internal validity | Credibility | Prolonged engagement, persistent observation, triangulation (data, method, source, investigator), member checking, peer debriefing, negative case analysis |
| External validity | Transferability | Thick description — provide enough detail about context that readers can judge applicability to their setting |
| Reliability | Dependability | Audit trail — document every analytic decision so a second researcher could trace the reasoning |
| Objectivity | Confirmability | Reflexivity journal — explicit acknowledgement of researcher position, biases, and how they shaped the analysis |
Specific Procedures You Can Cite
Triangulation
Four types: data triangulation (multiple sources of data on the same phenomenon), method triangulation (interviews plus observations plus documents), investigator triangulation (multiple researchers analyse independently), theory triangulation (multiple theoretical lenses applied to the same data). Most rigorous qualitative dissertations use at least two.
Member checking
Return your analysis to participants for feedback. Different forms: returning transcripts for accuracy correction, returning preliminary themes for resonance check, or returning the full analytic write-up. Document what participants confirmed, disputed, or added.
Audit trail
A chronological record of every analytic decision: when codes were created, why, what was merged or split, and how themes emerged. Software support (NVivo, ATLAS.ti, Dedoose) makes this systematic; pen-and-paper can work with discipline.
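If you are not using dedicated software, a plain append-only log can serve the same purpose. The sketch below is one possible structure, not a standard: the function names, field names, file name, and example entries are all assumptions made for illustration.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_decision(path, action, codes, rationale):
    """Append one analytic decision to a JSON Lines audit trail."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,        # e.g. "create", "merge", "split"
        "codes": codes,          # labels of the codes affected
        "rationale": rationale,  # why the decision was made
    }
    # Mode "a" only ever adds lines, so earlier entries are never overwritten.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


def read_trail(path):
    """Return all logged decisions in the order they were recorded."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Hypothetical usage, written to a temporary folder for the demo.
trail_path = Path(tempfile.mkdtemp()) / "audit_trail.jsonl"
log_decision(trail_path, "create", ["workload stress"],
             "Recurs across the first three interviews")
log_decision(trail_path, "merge", ["workload stress", "time pressure"],
             "Excerpts overlap almost entirely")

for entry in read_trail(trail_path):
    print(entry["timestamp"], entry["action"], entry["codes"])
```

Because each entry carries a timestamp and a rationale, a second researcher can replay the decisions in order, which is what the dependability criterion asks for.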
Reflexivity journal
Ongoing record of how your position, identity, and experiences shaped each interview, code, and interpretation. Explicit reflexivity is now expected in most qualitative dissertation rubrics at Canadian universities.
Writing the Section in Your Methodology Chapter
Quantitative version structure:
- Instrument source and development history (published scale? adapted? newly developed?).
- Reliability evidence from prior studies, plus from your own data (Cronbach’s α for each subscale).
- Content validity (expert panel) for any newly developed item.
- Construct validity (factor analysis results, convergent/discriminant correlations).
- Pilot study results if conducted.
Qualitative version structure:
- Trustworthiness framework adopted (Lincoln & Guba; Tracy 2010; Yardley 2000).
- Specific procedures for credibility (triangulation methods used, member-checking process, peer debriefing arrangement).
- Transferability evidence (sample description, context detail, thick-description sections of results).
- Dependability (audit trail location and structure).
- Confirmability (reflexivity-journal practice, positionality statement).
Mixed-Methods Studies
Mixed-methods studies need both frameworks applied separately to each strand, plus an integration-quality section. Use Creswell & Plano Clark’s “validity” criteria for mixed-methods designs: data triangulation between strands, a narrative summary that integrates the strands, and joint displays.
Common Pitfalls
- Asserting validity without showing evidence. Saying “the instrument is valid” without naming the type of validity and the procedure used to demonstrate it.
- Confusing reliability and validity. A scale can be perfectly reliable (always gives the same answer) but invalid (the answer is wrong). Reliability is a precondition for validity, not a substitute.
- Skipping qualitative trustworthiness. Examiners flag dissertations that do qualitative interviews and never address credibility, dependability, or reflexivity.
- Using “objective” claims in qualitative work. The qualitative paradigm explicitly rejects objectivism; use “confirmable” or “auditable” instead.
- Reporting only Cronbach’s alpha for validity. Alpha is a reliability measure, not a validity measure. Validity needs its own evidence, such as expert panel review or factor analysis.
Frequently Asked Questions
What is acceptable Cronbach’s alpha?
0.70 is the conventional minimum for research use, and 0.80 is good. Values of 0.90 and above can indicate redundant items, so consider shortening the scale.
Do I need to do factor analysis in my dissertation?
Yes, if you use a multi-item scale and want to demonstrate construct validity. Use exploratory factor analysis (EFA) if the scale is new or adapted, and confirmatory factor analysis (CFA) if you’re testing a hypothesised factor structure. Our data-analysis software guide covers each.
How many participants do I need for member checking?
Aim for 50–100% of participants. If logistics prevent you from contacting the full sample, randomly sample 5–8 representative participants and document the rationale.
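One way to make a random selection documentable is to draw it with a recorded seed, so the same draw can be reproduced later. This is a minimal sketch under that assumption; the function name, participant IDs, and seed value are hypothetical.

```python
import random


def select_for_member_check(participant_ids, k, seed):
    """Draw k participants at random, reproducibly for a given seed."""
    if k > len(participant_ids):
        raise ValueError("cannot select more participants than are in the sample")
    rng = random.Random(seed)  # separate generator, so the draw can be repeated
    # Sort first so the result does not depend on the order of the input list.
    return sorted(rng.sample(sorted(participant_ids), k))


# Hypothetical sample of 20 interviewees labelled P01 to P20.
participants = [f"P{n:02d}" for n in range(1, 21)]
chosen = select_for_member_check(participants, k=6, seed=2024)
print(chosen)
```

Record the seed and the selection date in your audit trail, then check that the drawn participants broadly reflect the range of your sample before contacting them.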
Can I use Cohen’s kappa for inter-rater reliability with ordinal data?
Use weighted kappa for ordinal data; it gives partial credit for near-misses. Use the intraclass correlation coefficient (ICC) for continuous data. Standard (unweighted) Cohen’s kappa is for nominal categories, where there is no ordering to give credit for.
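To see what “partial credit for near-misses” means in practice, the sketch below computes unweighted and linearly weighted kappa for two raters from first principles. The function name and the ratings are hypothetical; statistical packages provide equivalent routines, and this version is only meant to make the calculation visible.

```python
def cohen_kappa(rater_a, rater_b, categories, weights="none"):
    """Cohen's kappa for two raters; weights is "none", "linear" or "quadratic".

    `categories` must list every possible rating in its natural order.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)

    # Cross-tabulate the two raters' scores.
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        counts[index[a]][index[b]] += 1
    row_share = [sum(counts[i]) / n for i in range(k)]
    col_share = [sum(counts[i][j] for i in range(k)) / n for j in range(k)]

    def penalty(i, j):
        """Disagreement weight: 0 for agreement, up to 1 for the widest gap."""
        if weights == "none":
            return 0.0 if i == j else 1.0
        distance = abs(i - j) / (k - 1)
        return distance if weights == "linear" else distance ** 2

    observed = sum(penalty(i, j) * counts[i][j] / n
                   for i in range(k) for j in range(k))
    expected = sum(penalty(i, j) * row_share[i] * col_share[j]
                   for i in range(k) for j in range(k))
    if expected == 0:
        raise ValueError("kappa is undefined when only one category was used")
    return 1 - observed / expected


# Hypothetical ordinal ratings (1 = low, 3 = high) of six cases by two raters.
rater_1 = [1, 2, 3, 1, 2, 3]
rater_2 = [1, 2, 3, 2, 3, 3]
print(cohen_kappa(rater_1, rater_2, [1, 2, 3]))             # unweighted
print(cohen_kappa(rater_1, rater_2, [1, 2, 3], "linear"))   # linear weights
```

Both disagreements in this example are one category apart, so the weighted statistic comes out higher than the unweighted one. State which weighting you used when you report the result.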
What is positionality and where does it go?
Positionality is your explicit statement of identity, prior assumptions, and relationship to the research topic. Standard location: late in chapter 3 (methodology) as a sub-section on reflexivity, or as a stand-alone section in the introduction for sensitive topics.
Is there a quantitative version of credibility?
Internal validity is the closest match — the degree to which observed differences are attributable to the intervention rather than confounders. Threats to internal validity (history, maturation, selection, attrition) need explicit treatment in any experimental or quasi-experimental dissertation.




