GLP-1 weight loss in the real world
What the evidence actually shows, organized by study design: all-initiator averages across all patients who start a GLP-1, persistent-user outcomes for patients who stay on therapy, head-to-head comparisons, and the gap between real-world and trial results. Sources include Cleveland Clinic, Truveta, Vanderbilt, Mayo Clinic, the Danish national registry, and the 2025 Thomsen narrative review in Diabetes, Obesity and Metabolism.
Last updated: April 21, 2026
Context
How to read this page
Real-world GLP-1 weight loss numbers range from under 3% to over 20% at one year. The variation is almost never about the drugs — it's about which patients the study measured.
Three study designs dominate the literature:
All-initiator: weight change averaged across everyone prescribed the drug, including early discontinuers. The population average, most relevant for payer and policy decisions. (The real-world analog to intent-to-treat; RWE studies avoid the term "ITT" because it's technically tied to randomization.)
Persistent-user: weight change only among patients who stayed on therapy, typically for 12 months with no long fill gaps. Higher numbers, but pre-selected for response and tolerability.
Head-to-head: propensity-matched cohorts comparing drugs directly. Typically persistent-user in design — follow-up ends at discontinuation.
Other pages in this series cover clinical trial results, persistence and adherence, weight regain after stopping, cost and ROI, and side effects and safety.
Population average
All-initiator studies
These studies measure weight change across everyone prescribed a GLP-1, including patients who stopped within weeks. This is the most relevant denominator for payer, employer, and policy decisions because it matches what is actually being delivered and paid for at scale.
Published all-initiator studies report roughly 3–9% weight loss at about 12 months. The variation tracks drug generation (older cohorts included liraglutide and dulaglutide, which are less potent), indication mix (T2D patients lose less), and baseline BMI. Gasoyan 2025 is the cleanest modern-practice estimate because it restricts to semaglutide 2.4 mg or tirzepatide for obesity.
| Study | N | Drugs / indication | Cohort setting | Mean weight loss |
|---|---|---|---|---|
| Gasoyan 2025, Obesity (Silver Spring) | 7,881 | Semaglutide or tirzepatide, obesity indication (no T2D) | Cleveland Clinic EHR + Surescripts | −8.7% at 12 mo |
| Gasoyan 2024, JAMA Netw Open | 3,389 | Semaglutide or liraglutide, obesity or T2D (82% T2D) | Cleveland Clinic EHR | −5.1% semaglutide; −2.2% liraglutide at 12 mo |
| Powell 2023*, Obesity | 3,555 | Semaglutide (mostly T2D-labeled doses) | Greater Plains Collaborative EHR network | −4.4% at 52 wk |
| Dandelion Health 2024* (white paper, not peer-reviewed) | ~17,000 | Mixed GLP-1s (semaglutide, tirzepatide, liraglutide, dulaglutide) | Multi-health-system EHR consortium, matched controls | ~3% at 12 mo |
| Luo 2022 (PMC9877131) | 2,405 | Mixed GLP-1s, T2D only (older agents, 2011–2018) | PA health system EHR | −2.5% at 72 wk |
*Not strictly all-initiator. Powell required continuous fills through week 26; Dandelion required 12 months of continuous fills. Both designs exclude the earliest discontinuers, so their averages likely overstate pure all-initiator weight loss for their populations.
Three factors explain most of the difference across these studies. First, drug generation: Dandelion and Luo included substantial liraglutide and dulaglutide use, which are less potent than semaglutide 2.4 mg and tirzepatide at their approved obesity doses. Second, indication mix: T2D patients on GLP-1s lose less weight than obesity-only patients at comparable doses, and T2D-labeled semaglutide (Ozempic) maxes out at 2 mg versus 2.4 mg for obesity-labeled Wegovy. (Mounjaro and Zepbound share the same 15 mg maximum.) Third, baseline BMI: higher baseline BMI produces larger percentage losses. The Gasoyan 2025 cohort is closest to modern US obesity practice because it restricts to semaglutide 2.4 mg or tirzepatide for obesity indication.
Persistent users
Persistent-user studies
These studies restrict the analysis to patients who stayed on therapy, typically for 12 months with no fill gap longer than 30 days. The resulting weight-loss numbers are valid for that sub-population but do not reflect what happens across all initiators. This is where the 13–17% figures cited in clinical and patient-support contexts come from.
Across peer-reviewed persistent-user analyses, mean 12-month weight loss sits in the 13–17% range for semaglutide 2.4 mg and tirzepatide. Academic and manufacturer-sponsored studies agree on this range. Digital programs that add behavioral support to medication (Second Nature, WeGoTogether) report higher numbers, reflecting a combination of program-specific cohort selection, self-reported weights, and the behavioral support itself. All of these cohorts are pre-selected for tolerability, coverage, and adherence.
| Study | N | Drugs / indication | Design | 12-mo weight loss | Sponsor |
|---|---|---|---|---|---|
| Ghusn 2024, Int J Obes | 304 | Semaglutide, obesity indication | Completers-only, Mayo Clinic | −13.4% mean | Academic |
| Samuels 2025, DOM (Vanderbilt) | 2,306 | Semaglutide and/or tirzepatide, obesity | Persistent-user (12 mo), no-cost program | −14.4% median (IQR 9.5–20.5) | Academic |
| Ng 2025 (SHAPE), Adv Ther | 9,916 | Semaglutide 2.4 mg or tirzepatide, obesity (no T2D) | Persistent-user (no gap >30 days) | −14.1% semaglutide; −16.5% tirzepatide | Novo Nordisk |
| Ruseva 2025 (SCOPE), Postgrad Med | 4,424 | Semaglutide 2.4 mg, obesity | Persistent-user, claims-based | −15.5 kg (~15%) | Novo Nordisk |
| Hankosky 2025, DOM | 200 | Tirzepatide, obesity (GLP-1-naïve) | Persistent-user, Optum Market Clarity | −12.9% at 6 mo | Eli Lilly |
| Richards 2025 (Second Nature), JMIR Form Res | 339 | Semaglutide or tirzepatide, obesity | Completers-only, UK digital program | −22.1% tirzepatide; −17.1% semaglutide | Second Nature |
| Johnson 2025 (WeGoTogether), Adv Ther | Not specified | Semaglutide 2.4 mg, obesity | Persistent + engaged, self-reported weights | −17.6% | Novo Nordisk |
A patient in a persistent-user cohort has, by design, already passed the filters that eliminate most initiators: tolerating GI side effects long enough to titrate, keeping continuous coverage, not hitting cost barriers, not experiencing supply disruptions, and not stopping for goal achievement or other personal reasons. The 13–17% figures are valid for patients in that position but are not a forecast for a general member population or a forward-looking estimate of what a benefit design will produce across everyone who starts therapy.
Completers-only vs persistent-user. These related terms are not identical. Persistent-user means the patient kept filling prescriptions through follow-up (e.g., no gap >30 days for 12 months). Completers-only means the patient reached the study endpoint with a recorded final weight — a stricter filter that also excludes loss to follow-up. Completer-only numbers (Ghusn, Richards in the table above) trend higher because the cohort is even more selected.
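To make the nesting of these denominators concrete, here is a minimal Python sketch. The `Patient` fields and every number in `cohort` are invented for illustration and come from no cited study; the sketch also assumes each patient has some last-observed weight, which real datasets often lack.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Patient:
    persistent: bool          # kept filling through follow-up (hypothetical flag)
    endpoint_recorded: bool   # has a recorded weight at the study endpoint
    loss_pct: float           # last observed % weight loss (toy values)


# Toy cohort, invented for illustration only.
cohort = [
    Patient(True, True, 17.0),
    Patient(True, True, 15.0),
    Patient(True, False, 10.0),   # persistent, but missed the endpoint visit
    Patient(False, False, 4.0),
    Patient(False, False, 2.0),
    Patient(False, False, 0.0),
]

# Each filter is a subset of the one before it.
initiators = cohort
persistent_users = [p for p in initiators if p.persistent]
completers = [p for p in persistent_users if p.endpoint_recorded]


def mean_loss(patients):
    """Mean % weight loss across the given patients."""
    return mean(p.loss_pct for p in patients)


all_initiator_avg = mean_loss(initiators)
persistent_avg = mean_loss(persistent_users)
completer_avg = mean_loss(completers)

print(f"all-initiator:   {all_initiator_avg:.1f}% (n={len(initiators)})")
print(f"persistent-user: {persistent_avg:.1f}% (n={len(persistent_users)})")
print(f"completers-only: {completer_avg:.1f}% (n={len(completers)})")
```

In this toy cohort the reported average rises with each filter while the sample shrinks, which is the pattern to look for when comparing headline numbers.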
Drug comparison
Head-to-head: semaglutide vs tirzepatide
Most GLP-1 real-world studies look at one drug or lump them together. Rodriguez et al. (JAMA Internal Medicine, 2024) is the largest direct comparison in routine care and the only one using propensity matching at scale.
Study design: persistent-user (on-treatment). Follow-up ends when a patient discontinues, so the reported weight loss reflects patients still taking the drug at that time point, not the full starting cohort.
The cohort comprised 18,386 propensity-matched adults initiating T2D-labeled semaglutide (Ozempic) or tirzepatide (Mounjaro). Weight loss at 12 months: −8.3% semaglutide, −15.3% tirzepatide. Tirzepatide patients were about 3.2 times as likely to reach 15% weight loss (HR 3.24). GI adverse event rates were similar. The study used T2D-labeled formulations; an equivalent comparison using Wegovy and Zepbound has not been published.
Propensity matching pairs each patient on one drug with a patient on the other who looks similar on baseline characteristics (age, sex, BMI, diabetes, comorbidities), approximating a randomized comparison using observational data. It only adjusts for measured variables, but when done well it meaningfully reduces confounding.
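The matching step itself can be sketched in a few lines of Python. This is a generic greedy 1:1 nearest-neighbor match with a caliper, not necessarily the procedure Rodriguez et al. used. The propensity scores are assumed to be precomputed (for example, by logistic regression on baseline covariates), and all patient IDs, scores, and the 0.05 caliper are hypothetical.

```python
def match_nearest(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dict mapping patient_id -> propensity score in [0, 1].
    Returns a list of (treated_id, control_id) pairs whose scores differ
    by at most `caliper`. Unmatched patients are dropped.
    """
    available = dict(controls)  # copy so the caller's dict is not mutated
    pairs = []
    # Process treated patients in order of ascending score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical propensity scores (probability of receiving tirzepatide).
tirz = {"T1": 0.30, "T2": 0.55, "T3": 0.90}
sema = {"S1": 0.32, "S2": 0.50, "S3": 0.58, "S4": 0.10}

pairs = match_nearest(tirz, sema)
print(pairs)
```

Here T3 has no semaglutide patient within the caliper and drops out, which illustrates why matched cohorts are smaller than the source population and why the result generalizes only to patients who resemble someone in the other arm.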
| Endpoint | Tirzepatide | Semaglutide | Difference (adjusted) |
|---|---|---|---|
| Mean weight change at 3 mo | −5.9% | −3.6% | −2.4 pp |
| Mean weight change at 6 mo | −10.1% | −5.8% | −4.3 pp |
| Mean weight change at 12 mo | −15.3% | −8.3% | −6.9 pp |
| Hazard ratio, ≥5% weight loss | | | HR 1.76 (95% CI 1.68–1.84) favoring tirzepatide |
| Hazard ratio, ≥10% weight loss | | | HR 2.54 (95% CI 2.37–2.73) favoring tirzepatide |
| Hazard ratio, ≥15% weight loss | | | HR 3.24 (95% CI 2.91–3.61) favoring tirzepatide |
| 12-month discontinuation | 55.9% | 52.5% | Similar |
SURMOUNT-5 (Aronne et al., NEJM 2025) is the only completed obesity-indication randomized head-to-head. At 72 weeks, tirzepatide produced −20.2% weight loss vs semaglutide −13.7% in 751 non-diabetic adults. The direction and magnitude match Rodriguez 2024: tirzepatide produces more weight loss than semaglutide, both in trials and in routine care. For complete trial evidence see the clinical trial results page.
The dose caveat
Under-titration
Even patients who stay on therapy mostly aren't taking the dose the trials studied. This is a separate phenomenon from discontinuation and helps explain why persistent-user numbers sit below trial numbers — "persistent" in claims data doesn't mean "trial-equivalent exposure."
The STEP and SURMOUNT trials escalated patients to maximum dose (semaglutide 2.4 mg, tirzepatide 15 mg) and kept them there. In real-world practice, a minority reach those doses: 13% of Danish Wegovy users by the fifth prescription; 23% of Vanderbilt semaglutide users in a no-cost program; 25.9% of tirzepatide users in the Novo-sponsored SHAPE study. Most "persistence" captured in claims data is persistence at lower maintenance doses. Two patients who both appear "persistent for 12 months" in a dataset can be on very different milligram levels, with correspondingly different expected weight outcomes.
| Study / setting | Drug | Max-dose achievement | Context |
|---|---|---|---|
| Færch 2024 (Danish registry), Diabetes Care | Wegovy (sema 2.4 mg), n=110,748 | 13% reached 2.4 mg by 5th Rx; 25% ever filled a 2.4 mg Rx; 10% followed protocol titration | National-scale; broad GP-prescribed population |
| Samuels 2025 (Vanderbilt), DOM | Sema / tirz for obesity, n=2,306 | 23% of sema users reached 2.4 mg; 28% of tirz users reached 15 mg | No-cost academic program; cost removed as a barrier |
| Gasoyan 2025 (Cleveland Clinic), Obesity | Sema / tirz for obesity, n=7,881 | 80.8% on low maintenance dose overall | Full cohort; low-dose status associated with substantially smaller weight loss at 1 year |
| Hankosky 2025 (Optum), DOM, Lilly-sponsored | Tirzepatide (non-T2D), n=4,177 | 56.2% still on <10 mg by 6th Rx fill | More than half remain below the 10 mg threshold by month 6 |
| Ng 2025 (SHAPE), Adv Ther, Novo-sponsored | Sema 2.4 mg / tirz, n=9,916 | 83.5% of sema reached 2.4 mg; 25.9% of tirz reached 15 mg | In persistence-filtered cohorts, sema titration looks better than tirz |
Claims-based "persistence" counts fills, not milligrams. The Gasoyan 2025 decomposition makes the dose effect explicit: non-discontinuers on high maintenance doses lost 13.7% (sema) and 18.0% (tirz) — comparable to trial results — while non-discontinuers on low doses lost substantially less. When a real-world study reports "persistent-user weight loss of X%", the reader should ask what share of the cohort actually reached the maximum dose. Most studies don't report this and assume persistence equals trial-equivalent exposure; it doesn't.
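A small Python sketch shows why fill counts alone hide the dose question. The fill histories below are invented for illustration; the only sourced detail is the 2.4 mg maximum for obesity-labeled semaglutide, and the function names are hypothetical.

```python
MAX_DOSE_MG = 2.4  # obesity-labeled semaglutide maximum dose

# Hypothetical fill histories: patient -> dispensed weekly dose (mg) per fill.
fills = {
    "A": [0.25, 0.5, 1.0, 1.7, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4, 2.4],
    "B": [0.25, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
}


def reached_max(doses, max_dose=MAX_DOSE_MG):
    """True if any fill was at or above the maximum dose."""
    return max(doses) >= max_dose


def share_reaching_max(fill_histories, max_dose=MAX_DOSE_MG):
    """Fraction of patients with at least one fill at the maximum dose."""
    hits = sum(reached_max(d, max_dose) for d in fill_histories.values())
    return hits / len(fill_histories)


for patient, doses in fills.items():
    print(patient, "fills:", len(doses), "final dose:", doses[-1], "mg")

print("share reaching max dose:", share_reaching_max(fills))
```

Both patients have twelve fills and would look identical under a fill-count definition of persistence, yet only one of them ever reached 2.4 mg.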
The literature points to three main drivers: gastrointestinal side effects during titration (nausea, vomiting, diarrhea) that lead clinicians to pause or hold escalation; supply and coverage disruptions that force patients to cycle through available doses rather than follow the protocol schedule; and clinical inertia once patients show a good response at a lower dose. See the side effects page for detailed GI adverse event rates by dose and agent, and the persistence page for the broader drivers of discontinuation and dose interruption.
Comparison with RCTs
Real-world vs trial results
Real-world weight loss is consistently lower than trial weight loss across every published comparison. The difference is driven by at least six measurable factors, all related to how care is delivered rather than the pharmacology of the drugs themselves.
Trial conditions differ from routine practice on persistence (6–10% trial discontinuation vs 36–65% real-world at 12 months), dose titration (~90% reach max dose in trials vs 13–28% real-world), lifestyle support (structured in trials, typically absent in practice), population (trial cohorts are younger, more female, higher baseline BMI, more obesity-focused), drug formulation (trials use max-dose obesity-labeled versions; practice often uses T2D-labeled lower doses), and cost (trials provide drug free; in practice, cost is the most-cited reason for stopping).
| Factor | Trial condition | Real-world reality | Detail |
|---|---|---|---|
| Persistence | 6–10% discontinuation at 68 weeks (STEP) | 36–65% discontinuation at 12 months | Real-world discontinuation rates are 5–10× higher than trial rates; cost and GI side effects are the most-cited reasons |
| Dose titration | ~90% reach max dose | 13–28% reach max dose | Under-titration persists even in no-cost programs (Vanderbilt: 23–28% at max); see under-titration section above |
| Lifestyle support | Structured 500 kcal/day deficit, 150 min/wk exercise, monthly counseling | Minimal or none | STEP enrolled with structured counseling; most real-world patients get a prescription and limited follow-up |
| Population | STEP-1: 73% female, mean BMI 37.8 | Older, broader BMI range, more T2D, more comorbidities | SELECT (27.7% female, BMI 33.3) produced only 10.2% at 4 years — closer to real-world |
| Drug formulation | Wegovy / Zepbound at max dose | Mix including Ozempic / Mounjaro (T2D-labeled, lower doses) | The Truveta head-to-head used T2D-labeled versions; doses max at 2 mg sema and 15 mg tirz but often stay lower |
| Cost / coverage | Drug provided free | Cost / insurance is the most-cited reason for stopping | Gasoyan chart review of reasons for discontinuation: 48% of stoppers cite cost as primary driver |
For the full persistence evidence see the persistence page; for cost and ROI see the ROI page; for GI side effect rates see the side effects page.
Reference guide
How to read a GLP-1 real-world study
RWE studies on GLP-1s report wildly different weight-loss numbers — from 3% to 22% at 12 months. The differences are almost never about the drugs and almost always about design choices.
Is this study all-initiator or persistent-user? All-initiator studies include everyone who started the drug, including patients who stopped within weeks. Persistent-user studies restrict to patients who stayed on therapy, typically for 12 months with no gap longer than 30 days. The same drug, same setting, same year can produce a 5% all-initiator average and a 15% persistent-user average.
Neither number is wrong — they answer different questions. "What happens if we cover this drug?" is an all-initiator question. "How much weight will a patient who completes 12 months lose?" is a persistent-user question. A study that quotes a persistent-user number for a population-level decision is measuring the wrong thing.
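The two averages are linked by a simple weighted-average identity, sketched below in Python. The 5% and 15% values are the illustrative figures from the paragraph above; the 30% persistent share is a hypothetical input chosen for the example and is not taken from any cited study.

```python
def all_initiator_mean(persistent_mean, discontinuer_mean, persistent_share):
    """Population average as a weighted mix of the two subgroups."""
    if not 0.0 <= persistent_share <= 1.0:
        raise ValueError("persistent_share must be between 0 and 1")
    return (persistent_share * persistent_mean
            + (1.0 - persistent_share) * discontinuer_mean)


def implied_discontinuer_mean(overall_mean, persistent_mean, persistent_share):
    """Solve the same identity for the discontinuers' average."""
    if not 0.0 <= persistent_share < 1.0:
        raise ValueError("persistent_share must be in [0, 1)")
    return ((overall_mean - persistent_share * persistent_mean)
            / (1.0 - persistent_share))


share = 0.30  # hypothetical: 30% of initiators stay on therapy for 12 months
implied = implied_discontinuer_mean(5.0, 15.0, share)
print(f"implied discontinuer average: {implied:.2f}%")

# Round trip: mixing the two subgroup means recovers the all-initiator figure.
print(f"all-initiator average: {all_initiator_mean(15.0, implied, share):.2f}%")
```

Under these assumed inputs, the discontinuers would average under 1% weight loss, which shows how a large persistent-user number and a small all-initiator number can describe the same cohort.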
Six other factors worth checking before trusting a headline number: persistence threshold, weight measurement method, drug formulation, cost context, sponsor, and baseline matching. Details below.
1. All-initiator or persistent-user?
The single most important question, expanded. All-initiator (the real-world analog to intent-to-treat) includes everyone prescribed the drug. Persistent-user restricts to patients still filling the drug at follow-up. Example: Gasoyan 2025 on the same Cleveland Clinic cohort reports 8.7% as an all-initiator average but 11.9% among non-discontinuers — same patients, same drugs, same year, different denominator. Always check which one a headline is quoting.
2. What counts as "persistent"?
Persistence thresholds vary: some studies allow 30-day fill gaps, some 60, some 84, some 90. A patient who misses two months of refills counts as "persistent" in one study and "discontinued" in another. When comparing studies, check the allowed gap. Tighter definitions produce smaller but better-behaved cohorts; looser definitions capture more patients but blur the signal.
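The effect of the allowed gap can be shown with a short Python sketch. It assumes a simplified rule (a gap is the number of days between the end of one fill's supply and the next fill date, with no credit for stockpiled doses); published studies define gaps in varying ways, and the fill dates here are invented.

```python
from datetime import date, timedelta


def is_persistent(fills, max_gap_days, follow_up_end):
    """Classify a patient as persistent under a given allowed gap.

    fills: list of (fill_date, days_supply) tuples sorted by fill_date.
    A gap is measured from the end of one fill's supply to the next fill,
    and from the end of the last supply to follow_up_end.
    """
    if not fills:
        return False
    supply_end = None
    for fill_date, days_supply in fills:
        if supply_end is not None and (fill_date - supply_end).days > max_gap_days:
            return False
        supply_end = fill_date + timedelta(days=days_supply)
    return (follow_up_end - supply_end).days <= max_gap_days


# Hypothetical patient: three 28-day fills, a 60-day gap, then one more fill.
history = [
    (date(2025, 1, 1), 28),
    (date(2025, 1, 29), 28),
    (date(2025, 2, 26), 28),
    (date(2025, 5, 25), 28),
]
end = date(2025, 6, 22)

for gap in (30, 90):
    print(f"allowed gap {gap} days -> persistent: {is_persistent(history, gap, end)}")
```

The same fill history is "discontinued" under a 30-day rule and "persistent" under a 90-day rule, so the allowed gap needs to be checked before two persistence figures are compared.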
3. Was weight measured or self-reported?
EHR studies (Cleveland Clinic, Vanderbilt, UNMC) use weights recorded in clinical visits — reasonably accurate for patients who come back, but they miss patients who stopped seeing their provider. Claims studies (Prime, Truveta) often can't measure weight at all. Self-reported weights in digital-program studies (WeGoTogether, Second Nature) are biased upward by motivated reporting and engagement selection — patients who stick with the app and report weights are the ones doing best.
4. Which drug formulation?
Ozempic and Wegovy are the same molecule (semaglutide) at different maximum doses: 2 mg versus 2.4 mg. Mounjaro and Zepbound are the same molecule (tirzepatide) with the same 15 mg maximum, labeled for different indications. T2D-labeled prescriptions often stay at lower doses and go to patients with diabetes, who lose less weight, so cohorts built on them report less weight loss. Rodriguez 2024 (Truveta) used T2D-labeled formulations, which is one reason its 8.3% sema / 15.3% tirz figures are lower than obesity-labeled RCT numbers. Lumping T2D and obesity formulations together in a single analysis is a common source of undercounting.
5. What was the cost context?
Real-world weight loss improves when cost barriers are removed, but not as much as a naive reader might expect. Vanderbilt's no-cost program (Samuels 2025) still saw 50% discontinuation at 12 months and only 23–28% of patients reaching maximum dose. Cost is a major discontinuation driver, but not the only one — side effects, goal achievement, strategic pauses, and fatigue all play meaningful roles even in zero-cost settings.
6. Who sponsored it?
Most large persistent-user studies are manufacturer-sponsored: SHAPE, SCOPE, and WeGoTogether by Novo Nordisk; Hankosky Optum by Eli Lilly. The work is usually rigorous, but study design choices — especially persistent-user framing and completers-only analyses — systematically favor drug performance. Academic-sponsored studies (Ghusn, Gasoyan, Samuels) vary in design but are more likely to include all-initiator decompositions alongside persistent-user results. When a headline says "real-world weight loss on drug X was 16%," the next click should be the methods section.
7. Does it match on baseline factors?
Matched-cohort designs (Truveta, HIRD, Komodo) align treated and control patients on demographics, baseline BMI, and comorbidities before comparing outcomes, which strengthens causal claims. Single-arm cohorts (no control group) report weight change but can't separate drug effect from time trends, Hawthorne effects, or concurrent lifestyle change. When a study reports "patients lost 10% on drug X," check whether there was a comparator and whether matching was propensity-based or simpler.
Transparency
How this page was made
Selection criteria. Peer-reviewed studies were prioritized over white papers, which are flagged in-line when used (Dandelion Health). Where multiple designs answered the same question, preference went to studies of injectable semaglutide 2.4 mg or tirzepatide for obesity indication, since those match modern US clinical practice. Studies using older agents (liraglutide, dulaglutide) or T2D-labeled lower-dose formulations are included to show historical context and to explain why older cohorts report lower numbers.
Where the numbers come from. Weight-loss percentages, sample sizes, discontinuation rates, and dose-titration figures are pulled directly from the published papers and verified against PubMed, PMC, or journal full-text where available. Every study cited has a DOI or source link. The Gasoyan 2025 decomposition values are from the published abstract and results section; the Rodriguez 2024 Truveta persistent-user figures are from the published paper and Truveta's own results summary.
What's not included. Pediatric RWE (too thin to draw conclusions), T1D cohorts (different population and goals), and compounded GLP-1 telehealth (limited published outcome data; most visibility is from industry press). Post-discontinuation weight trajectories are covered on the weight regain page. Persistence rates and drivers are covered on the persistence page.
Known limitations. (1) Most large persistent-user studies are manufacturer-sponsored, and design choices in those studies systematically favor drug performance. This is flagged in the persistent-user table. (2) EHR-based studies miss patients who stop seeing their provider, which likely includes some early discontinuers; the true all-initiator averages may be slightly lower than reported. (3) Real-world data on obesity-labeled sema 2.4 mg and tirzepatide for obesity is still maturing — Zepbound was only approved for obesity in November 2023, so 12-month real-world data on obesity-indication use is just now appearing in the literature. Expect these numbers to refine as more peer-reviewed work is published through 2026.
Conflicts of interest. Key to Health is a behavioral weight management program that can operate alongside or independent of GLP-1 therapy. We have a commercial interest in the question of how GLP-1 outcomes are shaped by factors outside the drug itself (persistence, titration, behavioral support). We try to present the evidence without spin, and readers should weigh this page against primary sources when making clinical or benefits decisions.
Author and update cadence. Compiled by Ray Wu, MD/MBA — physician-founder working on metabolic health technology. Last updated April 21, 2026. Corrections welcome via the contact page.