What GLP-1 Clinical Trials Actually Showed
Every major trial's headline results in one place — weight loss, completion rates, cardiovascular outcomes, and the gap between on-treatment and intention-to-treat estimates that most presentations don't show.
Last updated: April 2026
Context
What the evidence showed under ideal conditions
GLP-1 clinical trials have demonstrated substantial weight loss and, in the case of SELECT, cardiovascular event reduction. These results are the foundation for coverage decisions, formulary design, and ROI modeling. However, the headline numbers cited in most vendor presentations — often the on-treatment estimate from the highest-responding subgroup — represent one of several ways to read the trial data.
This page presents every major trial's results including both on-treatment and intention-to-treat estimates, completion rates, responder analyses, and the concomitant lifestyle interventions that were part of every trial protocol. Understanding what the trials actually showed — and what they didn't measure — is essential context for interpreting the real-world data that follows in this series.
In most trials, the gap between on-treatment and ITT estimates ranges from 0.7 to 3.0 percentage points, which is meaningful for population-level modeling. Trial completion rates of 82–98% in most programs (as low as 73% in SELECT) far exceed real-world persistence rates of 32–63% at one year. And every obesity-focused trial included 15–30 structured counseling sessions that most employer benefit designs do not replicate.
The data
Every major trial, side by side
Weight-loss figures use the treatment-policy (ITT-like) estimand unless noted. Where the on-treatment figure is the one commonly cited in vendor materials, it is flagged with ⚑.
| Trial (journal, year) | Drug, dose, route | N | Population | Duration | Weight loss (on-tx) | Weight loss (ITT) | Completion |
|---|---|---|---|---|---|---|---|
| STEP 1 (NEJM, 2021) | Semaglutide 2.4 mg, injection | 1,961 | Obesity, no T2D | 68 wk | −16.9% ⚑ | −14.9% | 94.3% |
| STEP 2 (Lancet, 2021) | Semaglutide 2.4 mg, injection | 1,210 | Overweight/obesity + T2D | 68 wk | ~−10.5% | −9.6% | 96% |
| STEP 3 (JAMA, 2021) | Semaglutide 2.4 mg, injection | 611 | Obesity + intensive behavioral tx | 68 wk | ~−17.5% | −16.0% | 92.8% |
| STEP 4 (JAMA, 2021) | Semaglutide 2.4 mg, injection | 803 | Obesity, withdrawal design | 68 wk | −18.2% | −17.4% | 98.0% |
| STEP 5 (Nat Med, 2022) | Semaglutide 2.4 mg, injection | 304 | Obesity, no T2D | 104 wk | ~−16.5% | −15.2% | 92.8% |
| STEP HFpEF (NEJM, 2023) | Semaglutide 2.4 mg, injection | 529 | Obesity + HFpEF, no T2D | 52 wk | — | −13.3% | ~87% |
| STEP HFpEF DM (NEJM, 2024) | Semaglutide 2.4 mg, injection | 616 | Obesity + HFpEF + T2D | 52 wk | — | −9.8% | 82.3% |
| SELECT (NEJM, 2023) | Semaglutide 2.4 mg, injection | 17,604 | Overweight/obesity + CVD | ~40 mo | −11.7% | −10.2% | 73.3% |
| SURMOUNT-1, 15 mg (NEJM, 2022) | Tirzepatide 15 mg, injection | 2,539 | Obesity, no T2D | 72 wk | −22.5% ⚑ | −20.9% | 86% |
| SURMOUNT-1, 10 mg (NEJM, 2022) | Tirzepatide 10 mg, injection | 2,539 | Obesity, no T2D | 72 wk | −21.4% | −19.5% | 86% |
| SURMOUNT-1, 5 mg (NEJM, 2022) | Tirzepatide 5 mg, injection | 2,539 | Obesity, no T2D | 72 wk | −16.0% | −15.0% | 86% |
| SURMOUNT-2, 15 mg (Lancet, 2023) | Tirzepatide 15 mg, injection | 938 | Overweight/obesity + T2D | 72 wk | −15.7% | −14.7% | ~86–91% |
| SURMOUNT-3 (Nat Med, 2023) | Tirzepatide 10/15 mg, injection | 579 | Obesity post-lifestyle lead-in | 84 wk total | −21.1% ⚑ | −18.4% | ~85–90% |
| SURMOUNT-4 (JAMA, 2024) | Tirzepatide 10/15 mg, injection | 670 | Obesity, withdrawal design | 88 wk total | ~−26% | −25.3% | 85.6% |
| SCALE Obesity (NEJM, 2015) | Liraglutide 3.0 mg, injection | 3,731 | Obesity, no T2D | 56 wk | −9.2% | −8.0% | 75% |
| SCALE Maintenance (Int J Obes, 2013) | Liraglutide 3.0 mg, injection | 422 | Post-LCD maintenance | 56 wk | — | −6.2% | ~80% |
| SCALE Diabetes (JAMA, 2015) | Liraglutide 3.0 mg, injection | 846 | Overweight/obesity + T2D | 56 wk | — | −6.0% | ~83% |
| OASIS 1 (Lancet, 2023) | Semaglutide 50 mg, oral | 667 | Obesity, no T2D | 68 wk | −17.4% ⚑ | −15.1% | 82% |
| OASIS 4 (NEJM, 2025) | Semaglutide 25 mg, oral | 307 | Obesity, no T2D | 64 wk | −16.6% ⚑ | −13.6% | 81.5% |
| ATTAIN-1 (NEJM, 2025) | Orforglipron 36 mg, oral | 3,127 | Obesity, no T2D | 72 wk | −12.4% ⚑ | −11.1% | ~76% |
| ATTAIN-2 (Lancet, 2025) | Orforglipron 36 mg, oral | 1,613 | Overweight/obesity + T2D | 72 wk | −10.5% ⚑ | −9.6% | ~78–81% |
| REDEFINE 1 (NEJM, 2025) | CagriSema 2.4/2.4 mg, injection | 3,417 | Obesity, no T2D | 68 wk | −22.7% ⚑ | −20.4% | ~90% |
| REDEFINE 2 (NEJM, 2025) | CagriSema 2.4/2.4 mg, injection | 1,206 | Overweight/obesity + T2D | 68 wk | — | −13.7% | ~88% |
| Retatrutide Ph2 (NEJM, 2023) | Retatrutide 12 mg, injection | 338 | Obesity, no T2D | 48 wk | −24.2% | −24.2% | ~82% |
| Survodutide Ph2 (Lancet D&E, 2024) | Survodutide 4.8 mg, injection | 386 | Obesity | 46 wk | −18.7% | — | ~80% |
| Maritide Ph2 (NEJM, 2025) | Maritide 420 mg, injection | 592 | Obesity | 52 wk | ~−20% | −16.2% | ~78% |
All weight-loss figures are mean change from baseline. On-tx = on-treatment/efficacy estimand (assumes perfect adherence). ITT = treatment-policy/intention-to-treat estimand (includes all randomized patients, including dropouts). ⚑ = the on-treatment figure is the one commonly cited in vendor/press materials. Phase 2 trials may report completer analyses only. Full citations with DOIs are in the sources section.
The gap between on-treatment and ITT estimates ranges from 0.7 to 3.0 percentage points in most trials. This matters for population-level modeling: employer populations — with real-world persistence of 32–63% at one year — will experience results closer to or below the ITT estimate. The on-treatment figure is the right benchmark only if you assume perfect adherence.
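One way to see why lower persistence pulls population results toward or below the ITT figure is a simple two-group blend. The sketch below is illustrative only: the function name, the two-group structure, and the `retained_fraction` value are hypothetical assumptions and do not come from any trial. Only the STEP 1 estimates (−16.9% on-treatment, −14.9% ITT) and the 32–63% persistence range are taken from this page.

```python
def blended_weight_loss(on_tx_loss_pct, persistence, retained_fraction):
    """Population-average weight loss (%) under a two-group assumption.

    on_tx_loss_pct    : mean loss among members who stay on drug (e.g. 16.9)
    persistence       : share of members still on drug at one year (0 to 1)
    retained_fraction : HYPOTHETICAL share of the on-treatment loss that
                        members who stop are assumed to keep (0 to 1)
    """
    if not 0.0 <= persistence <= 1.0:
        raise ValueError("persistence must be between 0 and 1")
    if not 0.0 <= retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be between 0 and 1")
    stopped_loss = on_tx_loss_pct * retained_fraction
    return persistence * on_tx_loss_pct + (1.0 - persistence) * stopped_loss


STEP1_ON_TX = 16.9  # on-treatment estimate from the table above
STEP1_ITT = 14.9    # treatment-policy (ITT) estimate from the table above

for persistence in (0.32, 0.63):  # real-world one-year range quoted on this page
    # retained_fraction=0.25 is a placeholder, not a measured value
    estimate = blended_weight_loss(STEP1_ON_TX, persistence, retained_fraction=0.25)
    print(f"persistence {persistence:.0%}: blended loss {estimate:.1f}% "
          f"(trial ITT {STEP1_ITT:.1f}%)")
```

Under these placeholder inputs both blended estimates fall below the trial ITT figure. The result is sensitive to `retained_fraction`, which should be replaced with an evidence-based value before any real modeling.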
Estimand gap
On-treatment vs. intention-to-treat estimates
Vendor presentations nearly always cite the larger on-treatment number. Employer populations, with lower adherence, will experience results closer to — or below — the ITT estimate. The chart below shows both estimands as paired bars for every major trial.
Why this distinction matters for employers: The on-treatment estimand answers "how much weight do patients lose if they stay on the drug?" The ITT estimand answers "how much weight does the average randomized patient lose, including those who stop?" Neither answer fully captures employer reality, where persistence is far lower than in either estimand's framework. But the ITT estimate is the more conservative and appropriate benchmark for population-level benefit modeling.
The largest gaps appear in OASIS 4 (3.0 pp), SURMOUNT-3 (2.7 pp), and OASIS 1 / REDEFINE 1 (2.3 pp each) — all trials where dropout rates were higher or where the on-treatment figure has been prominently featured in marketing materials.
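The gaps quoted above are simple differences between the two weight-loss columns of the trial table. A minimal sketch of that arithmetic follows, using a subset of rows copied from this page; the dictionary and function names are illustrative, not part of any published analysis.

```python
# (on-treatment, ITT) mean weight loss in %, copied from the trial table on this page
TRIALS = {
    "STEP 1": (16.9, 14.9),
    "SURMOUNT-1 (15 mg)": (22.5, 20.9),
    "SURMOUNT-3": (21.1, 18.4),
    "OASIS 1": (17.4, 15.1),
    "OASIS 4": (16.6, 13.6),
    "ATTAIN-1": (12.4, 11.1),
    "REDEFINE 1": (22.7, 20.4),
}


def estimand_gaps(trials):
    """Return (trial, gap in percentage points), largest gap first."""
    gaps = {name: round(on_tx - itt, 1) for name, (on_tx, itt) in trials.items()}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


for name, gap in estimand_gaps(TRIALS):
    print(f"{name:<20} {gap:.1f} pp")
```

Rounding to one decimal place matches the precision of the published figures and avoids floating-point noise in the subtraction.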
Distribution of response
Who achieves clinically meaningful weight loss?
Average weight loss obscures the distribution. In every trial, some patients lose 20%+ while others lose little. The table below shows what percentage of participants crossed key clinical thresholds — using the more conservative ITT estimand where available.
| Trial | Drug | ≥5% | ≥10% | ≥15% | ≥20% | ≥25% |
|---|---|---|---|---|---|---|
| STEP 1 | Semaglutide 2.4 mg | 86.4% | 69.1% | 50.5% | 32.0% | — |
| STEP 2 (T2D) | Semaglutide 2.4 mg | 68.8% | ~45.6% | ~25.8% | — | — |
| STEP 3 (IBT) | Semaglutide 2.4 mg | 86.6% | 75.3% | 55.8% | ~36% | — |
| STEP 5 (104 wk) | Semaglutide 2.4 mg | 77.1% | 61.8% | 52.1% | 36.1% | — |
| SURMOUNT-1 (15 mg) | Tirzepatide 15 mg | 91% | 84% | ~73% | 57% | 36.2% |
| SURMOUNT-1 (10 mg) | Tirzepatide 10 mg | 89% | 78% | ~65% | 50% | 32.3% |
| SURMOUNT-1 (5 mg) | Tirzepatide 5 mg | 85% | 69% | ~55% | 30% | 15.3% |
| SCALE Obesity | Liraglutide 3.0 mg | 63.2% | 33.1% | ~14.4% | — | — |
| OASIS 1 | Semaglutide 50 mg (oral) | 85% | 69% | 54% | 34% | — |
| ATTAIN-1 (ITT) | Orforglipron 36 mg | ~70% | 54.6% | 36.0% | 18.4% | — |
| ATTAIN-2 (ITT, T2D) | Orforglipron 36 mg | ~60% | 45.6% | 26.0% | — | — |
| REDEFINE 1 (ITT) | CagriSema | 91.9% | ~80% | ~65% | ~50% | 34.7% |
In the STEP trials tabulated above, 13–31% of semaglutide-treated participants lost less than 5% of body weight and were effectively non-responders: 13–14% in STEP 1 and STEP 3, 23% in STEP 5, and 31% in STEP 2. At the other end, 32–40% achieved ≥20% loss. The presence of type 2 diabetes consistently attenuated response: STEP 2 (T2D) showed roughly 35–40% less weight loss than STEP 1 (no T2D), depending on the estimand. This matters for employers because the commercially insured population includes a substantial proportion of individuals with T2D or prediabetes.
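Non-responder shares and the T2D attenuation can be derived directly from the tables on this page. The sketch below shows the arithmetic; the helper names are illustrative, and the inputs are the ≥5% responder rates and the STEP 1 / STEP 2 ITT weight-loss figures as printed above.

```python
# Share of participants reaching >=5% loss, from the responder table on this page
RESPONDERS_5PCT = {
    "STEP 1": 86.4,
    "STEP 2 (T2D)": 68.8,
    "STEP 3 (IBT)": 86.6,
    "STEP 5 (104 wk)": 77.1,
}


def non_responder_share(responder_pct):
    """Percentage of participants who did NOT reach the threshold."""
    return round(100.0 - responder_pct, 1)


def relative_attenuation(reference_loss_pct, comparison_loss_pct):
    """How much smaller the comparison loss is, as a % of the reference loss."""
    return round(100.0 * (reference_loss_pct - comparison_loss_pct) / reference_loss_pct, 1)


for trial, pct in RESPONDERS_5PCT.items():
    print(f"{trial:<16} lost <5%: {non_responder_share(pct):.1f}%")

# ITT weight loss: STEP 1 (no T2D) 14.9% vs. STEP 2 (T2D) 9.6%
print(f"T2D attenuation (ITT): {relative_attenuation(14.9, 9.6):.1f}%")
```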
Cardiovascular evidence
SELECT — the cardiovascular outcomes trial
The SELECT trial (N=17,604; mean follow-up 39.8 months) is the single most important study for employers evaluating GLP-1 coverage for the cost-offset argument. Semaglutide 2.4 mg reduced 3-point MACE by 20% in adults with established cardiovascular disease and overweight/obesity but without diabetes.
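For cost-offset modeling, the relative reduction is usually converted to an absolute risk reduction and a number needed to treat. The sketch below does this with the primary-endpoint event rates reported for SELECT on this page (6.5% with semaglutide vs. 8.0% with placebo). The function names are illustrative. The crude ratio of event rates is not the same quantity as the published hazard ratio of 0.80, which comes from a time-to-event model.

```python
import math


def absolute_risk_reduction(control_rate_pct, treated_rate_pct):
    """Absolute risk reduction in percentage points."""
    return control_rate_pct - treated_rate_pct


def number_needed_to_treat(arr_pp):
    """Patients treated over the trial period to prevent one event (rounded up)."""
    if arr_pp <= 0:
        raise ValueError("absolute risk reduction must be positive")
    return math.ceil(100.0 / arr_pp)


def crude_relative_reduction(control_rate_pct, treated_rate_pct):
    """Relative reduction (%) from raw event rates; not a hazard ratio."""
    return (1.0 - treated_rate_pct / control_rate_pct) * 100.0


PLACEBO_MACE = 8.0      # % with a 3-point MACE event, placebo arm
SEMAGLUTIDE_MACE = 6.5  # % with a 3-point MACE event, semaglutide arm

arr = absolute_risk_reduction(PLACEBO_MACE, SEMAGLUTIDE_MACE)
print(f"ARR: {arr:.1f} percentage points")
print(f"NNT over ~40 months of follow-up: {number_needed_to_treat(arr)}")
print(f"Crude relative reduction: {crude_relative_reduction(PLACEBO_MACE, SEMAGLUTIDE_MACE):.2f}%")
```

These figures apply to the trial population and trial-level adherence. They are not a forecast for an employer population with lower persistence.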
SELECT outcome results
| Outcome | Semaglutide | Placebo | HR (95% CI) | Significance |
|---|---|---|---|---|
| 3-point MACE (primary) | 6.5% | 8.0% | 0.80 (0.72–0.90) | P<0.001 ✓ |
| Nonfatal MI | 2.7% | 3.7% | 0.72 (0.61–0.85) | Significant |
| CV death | 2.5% | 3.0% | 0.85 (0.71–1.01) | P=0.07 — did NOT meet threshold |
| All-cause death | 4.3% | 5.2% | 0.81 (0.71–0.93) | Nominal (hierarchy stopped) |
| HF composite | 3.4% | 4.1% | 0.82 (0.71–0.96) | Nominal (hierarchy stopped) |
| New-onset diabetes | 3.5% | 12.0% | 0.27 (0.24–0.31) | P<0.0001 — 73% reduction |
Because CV death (HR 0.85, P=0.07) did not cross its pre-specified hierarchical significance threshold, the remaining secondary endpoints — including all-cause mortality (HR 0.81) and HF composite (HR 0.82) — cannot formally claim statistical superiority, despite showing large directionally consistent benefits. Vendors sometimes present the all-cause mortality reduction as confirmed; it is not.
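The stopping logic described above is a fixed-sequence (hierarchical) testing procedure: endpoints are tested in a pre-specified order, and the first failure closes the gate for everything after it. The sketch below illustrates that logic only. The `alpha` value is a placeholder, since the trial's actual pre-specified thresholds are not given on this page; `0.001` stands in for the reported P<0.001; and the order of the endpoints after CV death simply follows the table above, which is an assumption.

```python
def fixed_sequence_test(endpoints, alpha):
    """Apply a fixed-sequence gatekeeping procedure.

    endpoints : ordered list of (name, p_value); p_value may be None for
                endpoints that sit after the point where testing stops
    alpha     : significance threshold applied at each step (placeholder here)
    """
    results = []
    gate_open = True
    for name, p_value in endpoints:
        if not gate_open:
            # Anything after the first failure can only be reported nominally
            results.append((name, "nominal only (not formally tested)"))
            continue
        if p_value is None:
            raise ValueError(f"p-value required to test {name!r}")
        if p_value < alpha:
            results.append((name, "significant"))
        else:
            results.append((name, "not significant (hierarchy stops here)"))
            gate_open = False
    return results


SELECT_SEQUENCE = [
    ("3-point MACE (primary)", 0.001),  # stand-in for the reported P<0.001
    ("CV death", 0.07),
    ("All-cause death", None),          # p-value not reported on this page
    ("HF composite", None),             # p-value not reported on this page
]

for name, verdict in fixed_sequence_test(SELECT_SEQUENCE, alpha=0.05):
    print(f"{name:<24} {verdict}")
```

The design choice is what keeps the overall false-positive rate controlled: later endpoints are never formally tested once an earlier one misses its threshold, however favorable their hazard ratios look.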
Subgroup consistency: Benefit was consistent across baseline BMI categories, HbA1c levels, sex, age, race/ethnicity, and heart failure status. The heart failure subgroup showed a particularly strong signal (MACE HR 0.72 in patients with baseline HF).
Mechanism insight: A 2025 Lancet analysis found that approximately 80% of the MACE benefit was mediated through pathways other than weight loss, and only ~33% was mediated through waist circumference reduction. High-sensitivity C-reactive protein (hsCRP) fell by 39% versus placebo as early as week 4, before significant weight loss occurred, suggesting direct anti-inflammatory mechanisms.
Trial persistence was high: 73.3% of participants remained on drug at end of study — far above real-world rates of 8–63% depending on the cohort and time horizon. The interpretation of SELECT for employer populations (where persistence is dramatically lower) is addressed on page 5 (ROI).
Retention
Trial completion rates
Trial completion rates, 82–98% in the STEP, SURMOUNT, OASIS, and REDEFINE programs and 73–81% in SELECT, SCALE Obesity, and ATTAIN, exceed real-world persistence of 32–63% at one year. Trials actively managed side effects through regular clinical contact, typically every 2–4 weeks, that is absent from typical employer benefit designs. GI adverse events were the dominant reason for discontinuation across every program.
| Trial | Trial completion | On-treatment at end | AE discontinuation | Key reasons |
|---|---|---|---|---|
| STEP 1 | 94.3% | 81.1% | 4.5% | GI AEs, consent withdrawal |
| STEP 2 | 96% | 87% | ~4–5% | GI AEs, rescue therapy |
| STEP 3 | 92.8% | 82.7% | 3.4% | GI AEs |
| STEP 4 | 98.0% | 92.3% | Low | Pre-selected tolerators |
| STEP 5 | 92.8% | ~85% | 5.9–7.7% | GI AEs |
| SELECT | ~73% | 73.3% | 16.6% | GI AEs (10.0% vs. 2.0% placebo) |
| SURMOUNT-1 | 86% | 84–86% | 4.3–7.1% | GI AEs, consent withdrawal |
| SURMOUNT-2 | ~86–91% | ~85–91% | <5% | GI AEs |
| SURMOUNT-3 | ~85–90% | ~85% | 4–10.5% | GI AEs |
| SURMOUNT-4 | 85.6% | High | ~5–7% | GI AEs |
| SCALE Obesity | 75% | ~75% | ~5–6% | GI AEs (nausea 40%), consent |
| OASIS 1 | ~82% | 86% | ~5–7% | GI AEs (nausea, vomiting) |
| OASIS 4 | 81.5% | ~81% | 7% | GI AEs (nausea 47%, vomiting 31%) |
| ATTAIN-1 (36 mg) | ~76% | ~76% | 10.3% | GI AEs |
| ATTAIN-2 (36 mg) | ~78–81% | ~78% | 10.6% | GI AEs |
| REDEFINE 1 | ~90% | ~90% | ~6% | GI AEs |
Completion = percentage who completed the trial assessment period (may include patients who stopped drug but remained in study). On-treatment = percentage still taking study drug at trial end. AE discontinuation = percentage who stopped drug specifically due to adverse events. All trials were double-blind, placebo-controlled. GI adverse events (nausea, vomiting, diarrhea, constipation) were the dominant discontinuation reason in every program.
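Because completion and on-treatment are defined differently, their difference approximates the share of participants who stopped the drug but stayed in the study for assessment. The sketch below computes that difference for the four STEP rows where both figures are reported as exact values in the table above; the names are illustrative.

```python
# (trial completion %, on-treatment at end %), from the retention table on this page
RETENTION = {
    "STEP 1": (94.3, 81.1),
    "STEP 2": (96.0, 87.0),
    "STEP 3": (92.8, 82.7),
    "STEP 4": (98.0, 92.3),
}


def off_drug_but_in_study(completion_pct, on_treatment_pct):
    """Approximate share who stopped drug yet completed trial assessments."""
    return round(completion_pct - on_treatment_pct, 1)


for trial, (completion, on_treatment) in RETENTION.items():
    share = off_drug_but_in_study(completion, on_treatment)
    print(f"{trial}: {share:.1f}% completed the trial off drug")
```

These participants still contribute measurements to the ITT estimate, which is one reason it sits below the on-treatment estimate.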
What else was in the protocol
Concomitant lifestyle interventions
Every major GLP-1 obesity trial included structured lifestyle counseling as a mandatory component of both treatment and placebo arms. Most employer GLP-1 benefit designs include the prescription drug only, without the behavioral infrastructure that supported the trial results.
| Trial | Diet Rx | Exercise Rx | Counseling | Frequency | Total sessions |
|---|---|---|---|---|---|
| STEP 1, 2, 4, 5 | −500 kcal/day deficit | ≥150 min/week | Individual, in-person or phone | Every 4 weeks | ~17 sessions |
| STEP 3 | 1,000–1,200 kcal/day × 8 wk, then 1,200–1,800 | 100→200 min/wk (progressive) | Intensive behavioral therapy | >Weekly early on | 30 sessions |
| SURMOUNT-1, 2, 4 | −500 kcal/day (~1,200–1,500 kcal/day) | ≥150 min/week | Monthly counseling | ~Monthly | ~18 sessions |
| SURMOUNT-3 lead-in | 1,200–1,500 kcal/day | ≥150 min/week | ≥14 sessions in 12 weeks | Weekly+ during lead-in | 14+ in 12 wk |
| SCALE O&P | −500 kcal/day deficit | ≥150 min/week | Standardized counseling | Every 2 wk → monthly | ~15 sessions |
| OASIS 1, 4 | −500 kcal/day deficit | ≥150 min/week | Lifestyle counseling | Periodic (likely monthly) | ~15–17 sessions |
| ATTAIN-1, 2 | "Healthy diet" counseling | Physical activity counseling | Counseling sessions | Periodic | Not specified |
| REDEFINE 1 | −500 kcal/day deficit | ≥150 min/week | Counseling sessions | Every 4 weeks | ~17 sessions |
| SELECT | None structured (standard-of-care recommendations only) | None structured (standard-of-care recommendations only) | None structured | Regular study visits every 4–8 weeks | Not applicable |
SELECT, the only trial that did not include structured lifestyle intervention, achieved lower weight loss (−9.4% at week 104) than the obesity-focused trials (−14.9% at week 68 in STEP 1), despite using the same drug at the same dose. Because the populations and time points differ, this 5.5-percentage-point gap provides only a rough estimate of the contribution of lifestyle support.
A 2024 KFF Employer Health Benefits Survey found that nearly 1 in 5 large employers cover GLP-1s for weight loss, but the vast majority provide the prescription drug benefit only — without dietitian access, behavioral counseling, or regular clinical monitoring. Typical employer coverage replicates the SELECT model (drug only), not the STEP model (drug + behavioral support).
Generalizability
Who was excluded from the trials
Employer populations are unselected and include patients with conditions that every trial excluded. A landmark JAMA Internal Medicine analysis applied trial exclusion criteria to nationally representative NHANES data — the results suggest that a substantial portion of the real-world treatment population would not have qualified.
| Exclusion category | Excluded from | Prevalence in US obese population |
|---|---|---|
| Type 2 diabetes | All trials except T2D-specific arms | ~25–30% of adults with BMI ≥30 |
| Prior bariatric surgery | All trials | ~270,000 surgeries/year (growing post-surgical population) |
| Major depression ≤2 years | All trials (PHQ-9 ≥15) | ~8–10% of US adults; higher among obese |
| Anti-obesity medications within 90 days | All trials | Common (phentermine, other agents) |
| GI motility medications | SURMOUNT trials | 23.5% of eligible adults (PPIs, opioids, CCBs, TCAs) |
| Renal impairment (CKD 3+) | Most trials | ~15% of obese adults |
| Insulin therapy | SURMOUNT-2, ATTAIN-2 | ~30% of T2D patients |
Demographic gaps: Trial populations were also non-representative in composition. STEP 1 enrolled only 5% Black participants versus 12.4% in the US obese population. SELECT enrolled only 3.8% Black participants. Across 27 GLP-1 obesity RCTs, 65–73% of participants were female — and women consistently achieved greater weight loss than men (−11.1% vs. −7.5% in SELECT). An employer population closer to 50/50 male/female may see modestly lower average weight loss. ICER rated the STEP and SURMOUNT trials as "Fair" for race/ethnicity diversity and "Fair to Poor" for sex representation.
Source: Bessette & Anderson, JAMA Intern Med 2025;185:108–110. DOI: 10.1001/jamainternmed.2024.6340. Applied trial exclusion criteria to NHANES 2017–2020 data (N=8,767, representing 110.3 million US adults with overweight/obesity). Adults ≥60 years were disproportionately likely to meet exclusion criteria.
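The effect of sex mix on average weight loss, noted in the demographic-gaps paragraph above, can be illustrated with a weighted average. This is a rough sketch under stated assumptions: it reuses the sex-specific losses quoted for SELECT (−11.1% for women, −7.5% for men) and applies a hypothetical 70% female mix, chosen from within the 65–73% range reported for obesity RCTs. Combining figures from different trials this way is for illustration only.

```python
def mix_weighted_loss(female_share, loss_women_pct, loss_men_pct):
    """Average weight loss (%) for a population with the given female share."""
    if not 0.0 <= female_share <= 1.0:
        raise ValueError("female_share must be between 0 and 1")
    return female_share * loss_women_pct + (1.0 - female_share) * loss_men_pct


LOSS_WOMEN = 11.1  # % loss in women, SELECT figure quoted on this page
LOSS_MEN = 7.5     # % loss in men, SELECT figure quoted on this page

trial_like = mix_weighted_loss(0.70, LOSS_WOMEN, LOSS_MEN)  # hypothetical trial-like mix
employer = mix_weighted_loss(0.50, LOSS_WOMEN, LOSS_MEN)    # 50/50 employer population

print(f"70% female mix: {trial_like:.2f}%")
print(f"50% female mix: {employer:.2f}%")
print(f"Difference: {trial_like - employer:.2f} percentage points")
```

With these inputs the shift is under one percentage point, consistent with the "modestly lower" wording above.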
Methodology
How this page was built
All data on this page are drawn from peer-reviewed publications in NEJM, JAMA, Lancet, Nature Medicine, JAMA Internal Medicine, JAMA Health Forum, and specialty journals. DOIs are provided for every primary trial citation. Where trials report multiple estimands (on-treatment vs. treatment-policy), both are presented with the estimand clearly labeled. "Treatment-policy" and "ITT" are used interchangeably on this page; both refer to the analysis that includes all randomized participants regardless of whether they completed treatment.
Responder threshold data uses the treatment-policy estimand where available; where only the on-treatment (efficacy) estimand is published, this is noted. Phase 2 data and press-release-only results are clearly flagged as lower-quality evidence pending peer-reviewed publication. All trials were industry-funded — by Novo Nordisk (STEP, OASIS, SCALE, SELECT, REDEFINE), Eli Lilly (SURMOUNT, ATTAIN, retatrutide), Boehringer Ingelheim/Zealand (survodutide), Amgen (maritide), or Altimmune (pemvidutide).
This page does not editorialize on whether the trial results are "good" or "bad." It presents the data as published and flags the systematic factors — estimand choice, lifestyle support, population selection, completion rates — that create a gap between headline trial numbers and real-world employer outcomes. Corrections and updates can be submitted via the contact page.
References
Sources