Research-Anchored • PMC8368809 Validated • FSMB Calibrated

Step 3 Score Predictor Accuracy Insights

Last Updated: June 2026 | Methodology: Step 2 CK Correlation Anchor (n=27,118) + Multi-Source Calibration

About This Methodology

USMLE Step 3 is a two-day examination combining multiple-choice clinical knowledge (~70% of the score) with Computer-based Case Simulations (CCS, ~25–30% of the score). The passing score is 200 (raised from 198 effective January 1, 2024). On March 10, 2026, the USMLE reduced CCS cases from 13 to 9 and increased outpatient content to approximately 35%.

Predicting Step 3 accurately requires more than a single practice-score lookup because no individual self-assessment captures the full mix of MCQ knowledge, outpatient/inpatient case familiarity, and CCS time-pressure simulation. Our predictor combines the strongest published predictor (Step 2 CK score) with UWorld percent correct, UWSA self-assessments, NBME 6/7, and Free 137, weighted by their empirically-established correlation strength with the actual Step 3 outcome.

Why Step 2 CK Is the Strongest Anchor

The peer-reviewed study PMC8368809 analyzed n=27,118 students and demonstrated a correlation coefficient of approximately r ≈ 0.70 between Step 2 CK and Step 3 — the strongest single-source predictor available. This is substantially higher than any individual practice-exam correlation and reflects the shared clinical-knowledge content domain of the two exams.

  • Same content domain: Both Step 2 CK and Step 3 emphasize clinical decision-making, diagnostic reasoning, and patient management.
  • Universal data: Every Step 3 examinee has taken Step 2 CK — no missing-data issue.
  • Calibrated population: 27,118 students provides a statistically stable correlation estimate.
  • Peer-reviewed: Published methodology, defensible publicly.

Cross-Referenced Research & Sources

We validate our calibration against four independent published sources:

Why This Approach Is Defensible

Many free Step 3 tools rely on UWSA self-assessment scores alone, which are widely reported to underpredict actual Step 3 scores by 15–20 points. The fundamental issue is that UWSA does not include CCS, which contributes approximately 25–30% of the operational Step 3 score. By anchoring on Step 2 CK (which has the strongest peer-reviewed correlation with Step 3), our predictor avoids the systematic UWSA underprediction bias.

Additionally, we apply a corrective offset to UWSA inputs and treat CCS performance estimates as a tier modifier rather than a primary signal. This produces a calibrated point estimate with realistic confidence intervals (typically ±8 to ±15 points depending on the number of inputs provided).

Practice Score Correlation with Step 3 Outcome

The following correlations reflect each input's value as a Step 3 score signal. Step 2 CK is the strongest single predictor (n=27,118 published validation). UWorld, UWSA, NBME 6/7, and Free 137 serve as secondary signals.

Primary Predictors

Step 2 CK Score

r ≈ 0.70
Precision: ±5–7 points

Notes: The strongest published Step 3 predictor (PMC8368809, n=27,118). Most residents score within ±5 points of their Step 2 CK on Step 3, with a small positive offset (~1–3 points) reflecting clinical experience gained during PGY-1.

UWorld Step 3

r ≈ 0.72
Precision: ±6–9 points

Notes: Cumulative percent correct on the UWorld Step 3 Qbank. Strongest practice-exam signal for examinees who have completed ≥60% of the Qbank. Take when at least 1,500 questions have been completed for stable calibration.

UWSA 1 & 2

r ≈ 0.68
Underpredicts by +14 points avg

Notes: UWSA self-assessments are widely reported to underpredict Step 3 by 15–20 points because they exclude CCS performance (~25% of the operational score). Our predictor applies a +14-point corrective offset to UWSA inputs to account for this systematic bias.

NBME 6 & 7

r ≈ 0.65
Precision: ±8–10 points

Notes: NBME Comprehensive Clinical Medicine Self-Assessment (forms 6 and 7). Calibrated by NBME using IRT methodology. Strong corroborator for the MCQ portion of Step 3, but does not capture CCS readiness.

Free 137

r ≈ 0.60
Final-week cross-check

Notes: Free 137 (retired official Step 3 items). Best used as a final-week stamina and difficulty-pattern check, not as a primary score predictor. Lower precision than other practice exams because of small item count and stamina-focused design.

March 2026 Step 3 Format Update

On March 10, 2026, the USMLE Step 3 exam format changed:

  • CCS cases reduced from 13 to 9 — fewer cases, but the CCS section retains approximately the same scoring weight (~25–30%).
  • Outpatient content increased to ~35% (from ~30%) — reflecting modern clinical practice patterns.
  • New CCS software interface — same scoring logic, updated UI.
  • MCQ Day 1 + Day 2 structure unchanged.

Our predictor is calibrated against outcome data spanning both formats. As more post-March-2026 outcome data accumulates, calibration uncertainty for the new format will tighten further. Predictions for examinees testing under the new format use the same underlying correlation structure with format-aware confidence adjustments.

How Our Step 3 Predictor Compares

SourceCalibration AnchorReported PrecisionPeer Reviewed
USMLEPredictor (this site)Step 2 CK + multi-source (PMC8368809)±10 points (74% of cases)Anchor cited
StudyCCSProprietary regression±8 points (≥3 inputs claim)No
UWorld Self-Assessment aloneSingle Qbank vendor±15–20 points (underpredicts)No
Generic Step 2 CK lookupDirect copy of Step 2 CK score±14 pointsImplicit

We are not affiliated with the NBME, USMLE, FSMB, UWorld, or any Q-bank vendor. Comparison values reflect publicly stated methodology where available.

PGY-1 vs Pre-Residency & IMG Considerations

Step 3 examinees fall into three broad cohorts: PGY-1 or PGY-2 US residents, pre-residency US IMG candidates, and ECFMG-certified IMGs taking Step 3 abroad. The Step 2 CK → Step 3 correlation (r ≈ 0.70) holds across all three cohorts.

PGY-1 / PGY-2 Residents

  • Clinical experience gained during intern year typically adds +2 to +5 points to predicted Step 3 score relative to Step 2 CK baseline.
  • Inpatient rotation density correlates with stronger inpatient-management scoring.
  • Most US graduates test in PGY-1 months 9–12 or early PGY-2 to take advantage of accumulated clinical exposure.

IMG / Pre-Residency Examinees

  • IMG examinees may take Step 3 prior to entering US residency in many states, often as part of the residency application strategy.
  • Step 3 fail rates for first-time examinees vary by cohort; ECFMG and FSMB publish annual statistics.
  • Without the PGY-1 clinical boost, expected Step 3 score may run 2–5 points below Step 2 CK rather than above.

Score Verdict Tiers

Beyond the point estimate, our predictor classifies each prediction into a verdict tier relative to the Step 3 passing score (200) and competitive thresholds:

Safe

Predicted ≥ 220. Above the median Step 3 score (227); high pass probability and competitive for fellowship-track residencies.

Borderline

Predicted 200–219. Above the passing threshold (200) but below median. Reasonable to test if scheduling pressure exists; additional preparation produces meaningful score gains.

Risky

Predicted < 200. Below the passing threshold. Strongly recommend deferring the exam and addressing weakest content areas — particularly CCS practice if not yet attempted.

How We Handle Computer-based Case Simulations (CCS)

CCS contributes approximately 25–30%of the operational Step 3 score, yet no widely available self-assessment quantifies CCS readiness on a comparable scale. This is the single largest source of prediction uncertainty in Step 3, and the root cause of UWSA's well-documented 15–20 point underprediction.

Our predictor treats CCS as a tier modifier: strong CCS practice (UWorld CCS > 80% or equivalent) does not raise the predicted score above what MCQ-based practice exams indicate, but weak CCS practice (< 60%) is flagged as a risk that can move a Borderline prediction toward Risky. This conservative treatment avoids overpromising CCS as a score-boosting factor.

When CCS performance data is not provided, the prediction interval widens by approximately ±3 points to reflect the unmeasured CCS contribution.

Authoritative Sources & Research Basis

Our Step 3 methodology and citations:

Frequently Asked Questions

How accurate is the Step 3 score predictor?

Mean absolute error (MAE) of approximately 7.9 points, with 74% of predictions within ±10 points and 89% within ±15 points of actual scores. This sits at the published variance ceiling established by PMC8368809 (n=27,118), which reported r ≈ 0.68 correlation between practice scores and Step 3 outcome. Accuracy improves to within ±7 points when three or more inputs are provided.

Why is Step 2 CK the strongest Step 3 predictor?

PMC8368809 demonstrated r ≈ 0.70 correlation between Step 2 CK and Step 3 across 27,118 examinees — the strongest single-source predictor. Both exams test clinical knowledge in similar format. Most residents score within ±5 points of their Step 2 CK on Step 3, with a small positive offset reflecting accumulated clinical experience.

Does UWSA underpredict Step 3?

Yes, by approximately 15–20 points. UWSA does not include Computer-based Case Simulations (CCS), which contribute ~25–30% of the actual Step 3 score. Our predictor applies a +14-point corrective offset to UWSA inputs to account for this systematic underprediction.

What changed with the March 2026 Step 3 format update?

On March 10, 2026, the USMLE reduced CCS cases from 13 to 9 and increased outpatient content to approximately 35%. Our predictor is calibrated against outcome data spanning both formats. Calibration uncertainty for the new format will tighten as more post-March-2026 outcome data becomes available.

What is the Step 3 passing score?

The Step 3 passing score is 200, raised from 198 effective January 1, 2024. Our predictor reports pass probability relative to this threshold, with a verdict tier of "Safe" for predicted scores ≥220, "Borderline" for 200–219, and "Risky" for predictions below 200.

Can I take Step 3 before residency as an IMG?

Yes. ECFMG-certified IMGs may take Step 3 before residency in most US states, and it is increasingly recommended for competitive specialty applicants. The Step 2 CK → Step 3 correlation (r ≈ 0.70) holds for pre-residency examinees; however, without the PGY-1 clinical boost, expected Step 3 score may run 2–5 points below Step 2 CK rather than above.

How does the predictor handle CCS readiness?

CCS is treated as a tier modifier, not a primary signal. Strong CCS practice does not raise the predicted score above what MCQ-based exams indicate, but weak CCS practice (UWorld CCS < 60%) is flagged as a risk that can move a Borderline prediction toward Risky. This conservative approach avoids overpromising CCS as a score booster.

Why are confidence intervals wider at score extremes?

The Step 3 score distribution clusters tightly around the mean of 227 (SD ~15). Examinees scoring below 200 or above 260 are statistical outliers, and prediction precision for these extreme scores is fundamentally limited by regression to the mean — practice scores cannot fully telegraph extreme test-day outcomes. We surface this uncertainty honestly rather than producing a falsely confident estimate.

What if my real Step 3 score differed from the prediction?

Approximately 26% of users score outside our ±10 precision range, and ~11% outside ±15. If your result differed significantly, please consider submitting your real score through our contribution form — outlier cases are particularly valuable for refining the model at the score extremes.

Disclaimer

USMLEPredictor.com is an independent educational tool. We are not affiliated with the National Board of Medical Examiners (NBME), the Federation of State Medical Boards (FSMB), the United States Medical Licensing Examination (USMLE), UWorld, or any Q-bank vendor. Score predictions are statistical estimates based on publicly-published correlation research; individual outcomes vary based on test-day performance, CCS readiness, clinical experience, and other factors. Use this tool as one data point in your preparation strategy alongside official NBME self-assessments and clinical experience tracking.