President Trump has dismissed hundreds of scientists working on the congressionally mandated National Climate Assessment, raising concerns about whether the void will be filled with pseudoscience.
Firefighters watch as flames and smoke move through a valley in the Forest Ranch area of Butte County as the Park Fire continues to burn near Chico, California, on July 26, 2024.
The sixth installment of the congressionally mandated report, which was due to come out by 2028, has typically been put together by about 400 researchers, many of whom are top scientists at universities who volunteer their time. The assessment is used to craft environmental rules, legislation and infrastructure project planning. Work had already begun on the sixth version. The Trump administration ended that with a note sent to researchers Monday. “At this time, the scope of the NCA6 is currently being reevaluated in accordance with the Global Change Research Act of 1990,” contributors were told in an email obtained by POLITICO's E&E News. The White House did not immediately respond to a request for comment. The assessments help Americans “understand how climate change is impacting their daily lives already and what to expect in the future,” said Rachel Cleetus, one of the researchers who was dismissed. “Trying to bury this report won't alter the scientific facts one bit, but without this information our country risks flying blind into a world made more dangerous by human-caused climate change,” said Cleetus, a senior policy director at the Union of Concerned Scientists, in a statement. 
The plan closely tracks with a proposal by White House budget director Russ Vought, who has urged the Trump administration to toss out all work on the assessment that began under former President Joe Biden. Global Change Research Program, which supports the assessment. The program, which coordinated the work of 13 federal agencies, had existed for 35 years through Republican and Democratic presidencies, including Trump's first term. Trump officials were caught by surprise by the timing of the fourth National Climate Assessment as it was being prepared for release in 2018. Some wanted to withhold the report and fire the scientists who worked on it, but that plan was scuttled. It's unclear whom Vought would try to recruit for the next assessment, if there is one. There is a relatively small pool of credentialed researchers who downplay the scientific consensus that climate change could push the planet past a series of dangerous tipping points. Some have already told E&E News that they are willing to be involved with the new effort. That includes Bob Kopp, a climate scientist at Rutgers University, and an author of the chapter on ocean coasts that was being prepared for the sixth report. “I know many of the authors would like to find a way to ensure that Americans can still have an updated, evidence-based assessment of our country's climate,” he wrote on Bluesky.
A new study shows that tiny gold particles could circumvent damaged photoreceptors in patients with macular degeneration and help restore vision. The term “goldeneye” once only described spy thrillers and waterfowl. But soon, it could mean groundbreaking therapy for people with failing vision. Scientists at Brown University injected gold nanoparticles into the retinas of laboratory mice and successfully restored vision in those with retinal disorders like macular degeneration. Well, instead of letting the eyes rely on rods and cones for vision, this method uses gold nanoparticles and laser light to stimulate bipolar and ganglion cells that are “further up on the visual chain,” according to the researchers. This gold-tinted technique bypasses those photoreceptors by focusing infrared light directly on the nanoparticles, which in turn generate heat that activates the bipolar and ganglion cells. This is particularly valuable for the millions of patients with macular degeneration or retinitis pigmentosa: those diseases leave these “further up” cells unscathed, so this stimulation technique could improve sight overall. “This is a new type of retinal prosthesis that has the potential to restore vision lost to retinal degeneration without requiring any kind of complicated surgery or genetic modification,” Jiarui Nie, lead author on the study, said in a press statement. “We believe this technique could potentially transform treatment paradigms for retinal degenerative conditions.” Nie and her team tested this process on mice and, by using probes to analyze increased activity of the visual cortices, confirmed that there was at least partial restoration of vision. However, Nie said that this technique would be much less invasive and could even provide increased resolution, as the nanoparticle solution covers the entire retina. Throughout human history, gold has been one of Earth's most precious metals. 
But for those struggling with retinal disorders, it could very well be priceless.
A proportionate sample of the human brain would be 240 million cells. Rhetoric about artificial intelligence has raced ahead with terms like “human intelligence,” but the human brain is not well enough understood to truly give credence to that idea. Scientists have worked for decades to analyze the brain, and they're making great progress despite the outsized rhetoric working against them. That said, artificial intelligence designed for specific tasks is essential to research like this. The term comes from the same suffix as in biome or genome, referring to a complete picture or map of something. In one of the team's papers, the researchers were able to make an overall classifying system to cover 30,000 neurons by their different shapes, or morphologies. These neurons are excitatory, meaning they're involved with transmitting messages in the brain. In this study, scientists used machine learning to help classify excitatory neurons, which seem to need a more complicated classifying system. By turning the neurons into measurements, observations, and layers, the scientists could then use statistical methods to find how often certain types or qualities of these cells appeared. This may sound like an oxymoron, but code can generalize more precisely than human scientists are able to. Having categories for neurons can be and has been useful in studying the brain, but computing power can deepen this understanding and add a great deal of nuance. Neurons that perform certain tasks in the visual cortex of the mouse brain reach out and link up with each other, whether they're adjacent or layers apart. And since even this large mapping of brain tissue is still very incomplete, the number of “like” neurons is likely even higher in reality. It's wild that you don't even have to download anything: you can map the brain using your web browser. Caroline Delbert is a writer, avid reader, and contributing editor at Pop Mech. 
She's also an enthusiast of just about everything.
Agonists and antagonists of the glucose-dependent insulinotropic polypeptide receptor (GIPR) enhance body weight loss induced by glucagon-like peptide-1 receptor (GLP-1R) agonism. However, while GIPR agonism decreases body weight and food intake in a GLP-1R-independent manner via GABAergic GIPR+ neurons, it remains unclear whether GIPR antagonism affects energy metabolism via a similar mechanism. Here we show that the body weight and food intake effects of GIPR antagonism are eliminated in mice with global loss of either Gipr or Glp-1r but are preserved in mice with loss of Gipr in either GABAergic neurons of the central nervous system or peripherin-expressing neurons of the peripheral nervous system. Single-nucleus RNA-sequencing shows opposing effects of GIPR agonism and antagonism in the dorsal vagal complex, with antagonism, but not agonism, closely resembling GLP-1R signalling. Additionally, GIPR antagonism and GLP-1R agonism both regulate genes implicated in synaptic plasticity. Collectively, we show that GIPR agonism and antagonism decrease body weight via different mechanisms, with GIPR antagonism, unlike agonism, depending on functional GLP-1R signalling. Co-agonism at the receptors for glucagon-like peptide-1 (GLP-1) and glucose-dependent insulinotropic polypeptide (GIP) has been established as a highly effective strategy to manage obesity1,2,3,4 and type 2 diabetes5,6,7,8,9,10. 
Although GIPR agonism has long been stigmatized as potentially enhancing body weight via stimulation of adipocyte lipid deposition11,12, long-acting GIPR agonists decrease body weight and food intake in diet-induced obese (DIO) mice13,14,15 and amplify weight loss induced by GLP-1R agonism13,14,15,16,17. We and others have shown that long-acting GIPR agonists have a preserved ability to decrease body weight and food intake in Glp-1r-deficient mice15,18, which is lost in obese mice with Nestin-Cre-mediated neuronal loss of Gipr15. We and others further showed a similar effect in mice with Vgat-Cre-mediated deletion of Gipr in gamma-aminobutyric acid (GABAergic) neurons14,19. Consistent with the demonstration that GIPR agonism decreases body weight and food intake via central GIPR signalling in rodents14,15, chemogenetic activation of GIPR neurons in either the hypothalamus20,21 or the hindbrain20 decreases food intake in mice. Although infusion of long-acting (acyl) GIP into the lateral ventricle decreases body weight and food intake in DIO wildtype (WT) mice, these effects vanish in mice with central nervous system (CNS) loss of Gipr15. Superiority of the GIPR:GLP-1R co-agonist MAR709 to yield greater weight loss and further inhibition of food intake relative to GLP-1R agonism is diminished in mice with loss of Gipr in either the CNS15 or in GABAergic neurons14, indicating that GIPR agonism also contributes to weight loss induced by such a co-agonist. Notably, while long-acting GIPR agonists act in the brain in a GLP-1R-independent manner to decrease body weight and food intake via GABAergic GIPR neurons, GIPR antagonism also decreases body weight and food intake in DIO mice and non-human primates, particularly when used in co-therapy with GLP-1R agonism22,23,24,25,26,27,28. Thus, surprisingly, GIPR agonism and antagonism appear to have similar metabolic end points when it comes to body weight control. 
AMG133, a bispecific hybrid that comprises two GLP-1R agonists conjugated to a monoclonal anti-GIPR antagonist25,26 is currently in phase 2 clinical development for the treatment of obesity and type 2 diabetes. It shows superiority in decreasing body weight relative to targeting of each individual receptor in DIO mice and non-human primates25. In a recent phase 1 study, AMG133 induced more than 10% weight loss after 12 weeks of treatment in healthy humans26. Together, these findings support the notion that both GIPR agonism and antagonism hold therapeutic value to accelerate GLP-1-induced weight loss. The mechanisms underlying the reduction of body weight induced by GIPR antagonism, however, are largely unknown, although some studies suggest that GIPR agonism and antagonism may decrease body weight via similar mechanisms29. Here, to test this hypothesis, we set out to assess the metabolic effects of two validated GIPR antagonists22,23 in mice with whole-body or targeted deletion of Gipr. Like GIPR agonism14, we find that the body weight and food intake reducing effects of GIPR antagonism are lost in global Gipr-deficient mice. However, in contrast to GIPR agonism14, we find that GIPR antagonism fully retains its body weight and food intake reducing effects in mice with Vgat-Cre-mediated deletion of Gipr in GABAergic neurons, as well as in mice with loss of Gipr in peripherin-expressing neurons of the peripheral nervous system (PNS). However, and again in contrast to GIPR agonism14,18, we find that the body weight and food intake inhibitory effects of GIPR antagonism are absent in global Glp-1r-deficient mice, suggesting dependency on GLP-1R-mediated signalling. 
Consistent with this finding, single-nuclei RNA-sequencing (snRNA-seq) revealed that GIPR agonism and antagonism have opposing effects in the brain, with GIPR antagonism but not agonism mimicking the transcriptional responses of GLP-1R agonism in the dorsal vagal complex of the hindbrain (DVC), and with GIPR antagonism and GLP-1R agonism both modulating DVC gene programmes implicated in synapse formation and neuronal plasticity. Collectively, we show that while GIPR agonism and antagonism have similar effects on body weight and food intake, they do so via different neuronal mechanisms, with GIPR antagonism, but not GIPR agonism, depending on GLP-1R signalling to affect energy metabolism. We recently showed that loss of Gipr in Vgat-expressing GABAergic neurons renders DIO mice resistant to weight loss and inhibition of food intake by GIPR agonism14. To test whether Gipr antagonism affects energy metabolism via a similar mechanism, we treated DIO Vgat-Cre+Giprwt/wt (WT) and Vgat-Cre+Giprflx/flx (Vgat-Gipr knockout (KO)) mice for 24 days with either vehicle, a long-acting GLP-1R agonist (acyl-GLP-1, 10 nmol kg−1)13,14,15 or the combination of acyl-GLP-1 (10 nmol kg−1) and a validated long-acting (acylated) peptide GIPR antagonist (1,500 nmol kg−1)22. Vgat-Cre-mediated Gipr KO was confirmed by RNAscope (Extended Data Fig. Notably, and in contrast to GIPR:GLP-1R co-agonism, which loses its superiority to GLP-1R agonism with respect to decreases in body weight and food intake in Vgat-Gipr KO mice14, the co-therapy of GLP-1R agonism and GIPR antagonism maintained the enhanced effect on weight loss (Fig. 1a,b and Extended Data Fig. 1c–f) and on inhibition of food intake (Fig. 1c) relative to treatment with acyl-GLP-1 alone in Vgat-Gipr KO mice, without a difference of the co-therapy on either body weight or food intake in WT and Vgat-Gipr KO mice. 
Treatment with the co-therapy decreased body fat and lean tissue mass with comparable efficacy in WT and Vgat-Gipr KO mice (Fig. 1d,e and Extended Data Fig. The co-therapy also improved glucose tolerance with comparable efficacy in WT and Vgat-Gipr KO mice relative to vehicle controls, albeit without superiority to acyl-GLP-1 (Fig. No differences were observed in fasting levels of blood glucose (Fig. 1i), but levels of insulin (Fig. 1j) and insulin sensitivity, as estimated by homeostatic model assessment for insulin resistance (HOMA-IR) (Fig. 1k), were equally improved by treatment with acyl-GLP-1 and the co-therapy, and without difference between WT and Vgat-Gipr KO mice. No differences were observed in plasma levels of triglycerides (Fig. 1l), but levels of cholesterol were decreased after treatment with acyl-GLP-1, but not after treatment with the co-therapy, in both WT and Vgat-Gipr KO mice (Fig. In summary, and in contrast to GIPR agonism14, these data indicate that GIPR+ GABAergic neurons are dispensable for GIPR antagonism to amplify GLP-1-induced weight loss and inhibition of food intake. a–c, Body weight development (a), placebo-corrected weight (b) and cumulative food intake (c) of 33-week-old male C57BL/6J WT or Vgat-Gipr KO mice treated daily over 24 days with either vehicle (Vhcl), acyl-GLP-1 (10 nmol kg−1) or the combination of acyl-GLP-1 (10 nmol kg−1) and a GIPR antagonist (ant.) (1,500 nmol kg−1) (n = 8 mice each group). glucose tolerance (f and g) with corresponding area under curve (h) of 36-week-old male C57BL/6J WT and Vgat-Gipr KO mice (n = 8 each group) after 24 days of treatment. i, Fasting plasma levels of blood glucose (n = 8 each group) in 36-week-old male C57BL/6J WT or Vgat-Gipr KO mice. 
j,k, Fasting plasma levels of insulin (j) and corresponding HOMA-IR (k) in 36-week-old male C57BL/6J WT and Vgat-Gipr KO mice treated either with vehicle (n = 8 WT and n = 8 KO), acyl-GLP-1 (n = 8 WT and n = 8 KO) or the co-therapy of acyl-GLP-1 and the GIPR antagonist (n = 7 WT and n = 7 KO). l,m, Ad libitum plasma levels of triglycerides (l) and cholesterol (m) in 36-week-old male C57BL/6J WT or Vgat-Gipr KO mice (n = 8 mice each group). Data in a, c, f and g were analysed by repeated measures two-way ANOVA with Bonferroni's post hoc test for comparison of individual timepoints. Data in b, d, e, h and i–m were analysed using one-way ANOVA. Cumulative food intake (c) was assessed per cage in n = 8 double-housed mice. The blue asterisks in a and c correspond to the comparison of acyl-GLP-1 versus the co-therapy in WT mice, while red asterisks correspond to acyl-GLP-1 versus the co-therapy in the Vgat-Gipr KO mice. Individual P values are shown in the Source data, unless P < 0.0001. Expression of Gipr has been demonstrated in various regions of the PNS30,31. In light of its role in the bi-directional transfer of information between the periphery and the brain, the PNS is well positioned to control energy metabolism, not only by modulating glycaemia via regulation of sympathetic outflow to the skeletal muscle32, but also by promoting GIP-induced vasodilation in the mesenteric vasculature, including the adipose tissue33,34. Considering these effects, we next assessed whether targeted Cre-mediated deletion of Gipr in neurons of the PNS affects energy and glucose metabolism. Mice with deletion of Gipr in neurons of the PNS were generated by crossing C57BL/6J Giprflx/flx mice35,36 with C57BL/6J mice that express Cre-recombinase under control of the promoter for peripherin (MGI 3841120)37. Peripherin is a neuronal intermediate filament protein with largely restricted expression in neurons of the PNS38. 
Consistent with this, we found expression of peripherin largely absent in the hippocampus, DVC, hypothalamus, sciatic nerve, pancreas and white adipose tissue, but it had robust expression in the dorsal root ganglia (DRG) and trigeminal ganglia (Extended Data Fig. Outside the PNS, expression of peripherin was highest in the ileum, but with more than 31-fold lower expression relative to the trigeminal ganglion, and with even lower to absent expression in the cerebellum, cerebral cortex, midbrain, kidney, testis, pituitary, adrenal gland, stomach, duodenum, jejunum and colon (Extended Data Fig. Collectively, these data indicate that expression of peripherin is largely restricted to neurons of the PNS. In line with previous reports in rats showing that peripherin is expressed in only 46% of DRG neurons39, we find expression of Gipr decreased by ∽43% in the DRG of Per-Cre+Giprflx/flx mice (Per-Gipr KO) relative to Per-Cre+Giprwt/wt (WT) controls, and without differences in relative expression of Gipr in either the hypothalamus, hindbrain, sciatic nerve, epididymal white adipose tissue, pancreas, cerebellum, cerebral cortex, pituitary, kidney, duodenum, jejunum, ileum or colon (Extended Data Fig. 2a,b), we also confirmed Per-Cre-mediated deletion of Gipr in the trigeminal ganglion and the DRG in Per-Gipr KO mice using RNAscope (Extended Data Fig. When fed a high-fat diet (HFD), male Per-Gipr KO mice showed no overt differences in body weight, body composition or food intake relative to WT controls (Fig. We further observed no differences in energy expenditure, locomotor activity or substrate utilization (Fig. However, we did find that DIO Per-Gipr KO mice had a higher glycated haemoglobin A1c (HbA1c) and slightly impaired glucose tolerance (Fig. 2h,i) with normal insulin sensitivity, but impaired secretion of insulin and GIP after oral bolus glucose administration compared with WT mice (Fig. 
The insulin secretory response to GIP and GLP-1 was, however, fully preserved in pancreatic islets isolated from WT and Per-Gipr KO mice (Fig. 2m), indicating that the impaired insulinotropic response observed in the Per-Gipr KO mice (Fig. 2k) did not result from impaired GIPR signalling in the islets. We also observed no differences in fasting levels of blood glucose, insulin or triglycerides (Fig. We also found that the metabolic phenotype of male DIO Per-Gipr KO mice was recapitulated in female DIO Per-Gipr KO mice, which, like male Per-Gipr KO mice, showed no difference in body weight, body composition, food intake, energy expenditure, locomotor activity or substrate utilization, but the females did have robust glucose intolerance with impaired glucose-induced insulin secretion, despite normal insulin tolerance and unchanged plasma levels of blood glucose, insulin, triglycerides and cholesterol (Extended Data Fig. Collectively, these data indicate that in both sexes, GIPR signalling in peripherin-expressing peripheral neurons is required for normal GIP and insulin responses to orally ingested glucose, but is not necessary for regulation of body weight, body composition or food intake. a, Body weight development of male C57BL/6J Per-Cre+Giprwt/wt (WT) and Per-Cre+Giprflx/flx (KO) mice fed with a HFD (n = 8 each group). b,c, Fat (b) and lean (c) tissue mass of 44-week-old male WT and KO mice (n = 8 each group). d, Cumulative food intake of male WT and KO mice, measured per cage in double-housed mice from age 14 to 47 weeks (n = 8 each group). e–g, Energy expenditure (e), locomotor activity (f) and RER (g) of 49-week-old male WT and KO mice (n = 8 each group). h,i, HbA1c in 46-week-old male WT and KO mice (n = 8 each group) (h), as well as glucose tolerance (i) after i.p. dosing with 1.5 g kg−1 glucose in 47-week-old male WT and KO mice (n = 7 each group). dosing with 1.5 U kg−1 insulin (Humalog) in 48-week-old male WT and KO mice (n = 8 each group). 
k,l, Glucose-induced insulin secretion (n = 7 WT and n = 8 KO) (k) and corresponding levels of total GIP (n = 8 WT and n = 6 KO) (l) after oral glucose bolus administration of 4 g kg−1 glucose in 51-week-old male WT and KO mice. m, Insulin secretion, expressed as fold difference between high and low glucose (2.68 mM and 20 mM) in isolated islets from 46-week-old chow-fed male WT and KO mice treated with either vehicle or 50 nM of either native mouse GIP or GLP-1 (n = 12 independent biological samples per group). n–p, Fasting levels of blood glucose (n) and insulin (o) in 51-week-old male WT and KO mice (n = 8 each group), as well as triglycerides (p) in 52-week-old male WT (n = 7) and KO mice (n = 8). Data in a, d and i–k were analysed by two-way ANOVA with Bonferroni's post hoc test for comparison of individual timepoints. Data in b, c, h, i and n–p were analysed using two-sided, two-tailed Student's t-test. Data in f and g were analysed using a two-tailed, unpaired Mann–Whitney test. Data in m were analysed using a one-way ANOVA. Data in e were analysed using ANCOVA with body weight as the covariate. Cumulative food intake (d) was assessed per cage in n = 8 double-housed mice in each group. For data in m, handpicked islets of similar size were distributed per animal to achieve one well per treatment group (three wells per animal), each containing ten islets per well. Individual P values are shown in the Source data, unless P < 0.0001. Consistent with previous data showing that GIPR agonism acts in the CNS15 to decrease food intake via GABAergic GIPR neurons14, we found that the inhibition of food intake following single subcutaneous (s.c.) administration of acyl-GIP (100 nmol kg−1) was fully preserved in DIO Per-Gipr KO mice (Fig. 
Notably, however, we found that the co-therapy of acyl-GLP-1 (10 nmol kg−1) and the acylated GIPR antagonist (1,500 nmol kg−1) equally decreased body weight in DIO WT and Per-Gipr KO mice, with superiority of the co-therapy relative to treatment with acyl-GLP-1 alone (Fig. Expectedly, this effect is more clearly pronounced when expressing the data as per cent relative to absolute changes (Fig. 3b,c and Extended Data Fig. 4a). The co-therapy decreased food intake with comparable efficacy in WT and Per-Gipr KO mice, but with significance of the co-therapy over acyl-GLP-1 reached only in the WT mice (Fig. Mice treated with the co-therapy exhibited a greater decrease in fat and lean tissue mass relative to treatment with acyl-GLP-1, without an overt difference between WT and Per-Gipr KO mice (Fig. 3e,f and Extended Data Fig. In both WT and Per-Gipr KO mice, we found that the co-therapy improved glucose tolerance without superiority to GLP-1R agonism alone (Fig. Fasting levels of blood glucose were comparably decreased in mice treated with the co-therapy or acyl-GLP-1, but with significance reached only in the Per-Gipr KO mice (Fig. In both WT and Per-Gipr KO mice, we found that the fasting levels of insulin were decreased and insulin sensitivity increased after treatment with the co-therapy, but without superiority of the co-therapy to GLP-1R agonism alone (Fig. We observed no differences in either treatment or genotype regarding plasma levels of triglycerides (Fig. Collectively, these data show that the ability of GIPR antagonism to enhance GLP-1-induced weight loss is not mediated by GIPR signal inhibition in peripherin-expressing peripheral neurons. Furthermore, and consistent with our data in the Vgat-Gipr KO group (Fig. 1f–h), we found no major additional glycaemic benefits of the co-therapy relative to GLP-1R agonism alone (Fig. a, Acute food intake of 49-week-old male C57BL/6J DIO Per-Cre+Giprwt/wt (WT) or Per-Cre+Giprflx/flx (KO) mice treated s.c. 
with a single dose of either vehicle (Vhcl) or acyl-GIP (100 nmol kg−1). b–d, Body weight development (b), placebo-corrected weight loss after 25 days treatment (c) and food intake (d) of 47-week-old male C57BL/6J WT and Per-Gipr KO mice treated daily with either vehicle, acyl-GLP-1 (10 nmol kg−1) or the combination of acyl-GLP-1 (10 nmol kg−1) and a GIPR antagonist (ant.) e,f, Body composition (fat mass (e) and lean mass (f), n = 8 each group) of 47-week-old male C57BL/6J DIO WT and Per-Gipr KO mice after 25 days of treatment. glucose tolerance (g and h) with corresponding area under curve (AUC) (i) in 47-week-old male C57BL/6J DIO WT (g and i) and Per-Gipr KO mice (h and i) after 25 days of treatment with either vehicle (n = 8 WT and n = 8 KO), acyl-GLP-1 (n = 7 WT and n = 8 KO) or the co-therapy of acyl-GLP-1 and the GIPR antagonist (n = 8 WT and n = 8 KO). j, Fasting plasma levels of blood glucose in 47-week-old male DIO WT and Per-Gipr KO mice treated either with vehicle, acyl-GLP-1 or the co-therapy of acyl-GLP-1 and the GIPR antagonist (n = 8 each group). k,l, Fasting plasma levels of insulin (k) and corresponding HOMA-IR (l) in 47-week-old male DIO WT and Per-Gipr KO mice treated either with vehicle (n = 8 WT and n = 8 KO), acyl-GLP-1 (n = 7 WT and n = 8 KO) or the co-therapy of acyl-GLP-1 and the GIPR antagonist (n = 8 WT and n = 8 KO). m, Ad libitum plasma levels of triglycerides in 47-week-old male DIO WT and Per-Gipr KO mice (n = 8 mice each group). Data in a, b, g and h were analysed by a two-way ANOVA with Bonferroni's post hoc test for comparison of individual timepoints. Data in c–f and i–m were analysed using a one-way ANOVA. Cumulative food intake (d) was assessed per cage in n = 8 double- or single-housed mice each group. The asterisk colours in a correspond to the comparison of vehicle versus acyl-GIP in WT (black) and Per-GIPR KO (red) mice. 
The asterisk colours in b correspond to the comparison of acyl-GLP-1 versus the co-therapy in WT (blue) and Per-GIPR KO (red) mice. Individual P values are shown in the Source data, unless P < 0.0001. We next assessed the ability of a mouse GIPR neutralizing antibody23 (Kb of 5 nmol l−1, potency for antagonism of GIP-induced cAMP accumulation) to affect HFD-induced weight gain and food intake in lean mice kept at thermoneutrality (28 °C), an environmental temperature where potential confounding effects due to differences in metabolic rate are lowest. Interestingly, single s.c. treatment with the anti-GIPR antibody (30 mg kg−1) attenuated body weight gain and decreased food intake in lean WT mice (Fig. 4a–c), but not in mice with global loss of either Gipr (Fig. Of note, these data demonstrate that the body weight and food intake reducing effects of GIPR antagonism not only depend on functional GIPR signalling, but also on GLP-1R signalling. The latter contrasts with GIPR agonism, which we and others showed to exhibit a fully preserved ability to decrease body weight and food intake in Glp-1r-deficient mice15,18. a–c, Body weight in grams (a) and percent (b) and food intake (c) of HFD-fed 14–16-week-old male C57BL/6J WT mice treated s.c. with a single dose (30 mg kg−1) of either a control mAb (vehicle; n = 5) or an anti-GIPR antagonist (ant.) d–f, Body weight in grams (d) and percent (e) and food intake (f) of HFD-fed 14–16-week-old male C57BL/6J global Gipr KO mice treated s.c. with a single dose (30 mg kg−1) of either a control mAb (vehicle) or an anti-GIPR antagonist antibody (n = 6 each group). g–i, Body weight in grams (g) and percent (h) and food intake (i) of HFD-fed 14–16-week-old male C57BL/6J global Glp-1r KO mice treated s.c. with a single dose (30 mg kg−1) of either a control mAb (vehicle) or an anti-GIPR antibody (n = 6 each group). Data in a–i were analysed by a two-way ANOVA with Bonferroni's post hoc test for comparison of individual timepoints. 
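The legends above name Bonferroni's post hoc test for comparisons at individual timepoints. As a minimal sketch of the correction step alone (not of the authors' full analysis, which also involves the ANOVA model), each raw P value is multiplied by the number of comparisons and capped at 1. The P values below are hypothetical and are not taken from the Source data.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted P values: p * m, capped at 1.0,
    where m is the number of comparisons made."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw P values, one per timepoint comparison.
raw = [0.01, 0.02, 0.2, 0.5]
adjusted = bonferroni(raw)
print([round(p, 4) for p in adjusted])  # [0.04, 0.08, 0.8, 1.0]
```

With four comparisons, a raw P value of 0.02 no longer falls below a 0.05 threshold after adjustment, which is the conservative behaviour the correction is designed to produce.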
2a), we recently showed that body weight is decreased in HFD-fed mice with CNS-targeted loss of Gipr15, suggesting that the decrease in body weight that is induced by GIPR antagonism is mediated via neurons of the central rather than the peripheral nervous system. This is consistent with our observation that weight loss induced by GIPR antagonism depends on GLP-1R signalling (Fig. 4g–i), which also decreases body weight via central, rather than peripheral mechanisms40. To delineate the similarities and differences of GIPR (ant)agonism in the brain, we next performed snRNA-seq in the hypothalamus and the DVC, two regions implicated in regulation of food intake by GIPR agonism20,21, after single s.c. treatment of DIO mice with either vehicle, acyl-GIP13,14,15 (150 nmol kg−1), acyl-GLP-1 (50 nmol kg−1)13,14,15, the acylated peptide GIPR antagonist (1,500 nmol kg−1)22 or the GIPR:GLP-1R co-agonist MAR709 (50 nmol kg−1)13,14,15 (Fig. The rationale for assessing drug responses after acute treatment was to minimize confounding effects arising from differences in body weight after chronic drug treatment. Treatment groups largely overlapped across tissues, with comparable distribution of cell types, and with neurons constituting most of the captured nuclei across the treatment groups (Fig. We further found higher expression of Gipr in the DVC relative to the hypothalamus, while the opposite was found for the expression of Glp-1r (Fig. After exclusion of low-quality cells, we notably obtained RNA transcriptomes from 57,798 DVC and 211,537 hypothalamic nuclei (Fig. a,b, A schematic of the experimental design (a) and the body weight (b) of 36-week-old male C57BL/6J WT mice fed either a standard chow diet (cntrl.) or a HFD and treated s.c. with a single dose of either vehicle (vhcl), acyl-GIP (150 nmol kg−1), acyl-GLP-1 (50 nmol kg−1), an acylated peptide GIPR antagonist (1,500 nmol kg−1) or the GIPR:GLP-1R co-agonist MAR709 (50 nmol kg−1) (n = 6 each group). 
c–f, UMAP representations of gene expression coloured by C12-level cell type in the DVC (c) and C25-level cell type in hypothalamus (d), as well as by experimental group in the DVC (e) and hypothalamus (f). g, A bar graph showing mean expression of Glp-1r and Gipr in the DVC and hypothalamus. h, The number of nuclei isolated from each brain region. The colours correspond to log-normalized expression values scaled to the maximum of each gene. Data in b represent mean ± s.e.m. We found that DVC gene expression changes correlated negatively between mice treated with acyl-GIP or the GIPR antagonist (Fig. 6a), but positively between mice treated with acyl-GLP-1 versus the GIPR antagonist (Fig. These data indicate that GIPR antagonism triggers DVC transcriptional responses like those of GLP-1R agonism, and further corroborate that GIPR agonism and antagonism decrease body weight and food intake via different mechanisms. These data are further in agreement with our observation in vivo showing that GIPR antagonism, unlike GIPR agonism15,18, depends on functional GLP-1R signalling to decrease body weight and food intake (Fig. Expectedly, a strong positive correlation in gene expression changes was observed in mice treated with MAR709 versus acyl-GIP (Fig. 6c), but notably not with MAR709 versus acyl-GLP-1 (Fig. 6d) or MAR709 versus GIPR antagonism (Fig. These data are consistent with the established role of MAR709 as a potent GIPR agonist13,14,15,41, and indicate that neither acyl-GIP nor the GIPR:GLP-1R co-agonist MAR709 works as a functional GIPR antagonist. In line with this is our further observation that gene expression changes correlate positively between mice treated with acyl-GLP-1 versus the GIPR antagonist (Fig. 6b), but negatively between mice treated with acyl-GLP-1 versus acyl-GIP (Fig. Notably, the observation that DVC gene expression changes are stronger in mice treated with MAR709 versus acyl-GIP (Fig. 
6c) relative to mice treated with MAR709 versus acyl-GLP-1 (Fig. 6d) indicates that GIPR is the primary target of MAR709 in the DVC, with fewer transcriptional changes induced by MAR709 via GLP-1R. In agreement with this notion, we found that expression of Glp-1r concentrated in specific neuronal populations, which include two GABAergic neuronal clusters (C35 GABA3 and C35 GABA4) and one glutamatergic neuronal cluster (C35 Glut 8) (Fig. 6g), while expression of Gipr is more broadly distributed across DVC neuronal populations, with particularly high expression in a small population of 5-HT-positive neurons (Fig. We observed no large differences in Gipr and Glp-1r expression across experimental groups (Fig. a–f, A comparison of log fold-change differences in gene expression in DVC neurons between male DIO C57BL/6J WT mice (DIO cntrl.) treated with acyl-GIP or the GIPR antagonist (ant.) (a), GIPR antagonist versus acyl-GLP-1 (b), acyl-GIP versus MAR709 (c), acyl-GLP-1 versus MAR709 (d), GIPR antagonist versus MAR709 (e) or acyl-GLP-1 versus acyl-GIP (f) (n = 6 mice per group, from which n = 3 mice were pooled to receive n = 2 independent biological replicates per group). g–i, UMAP representation of gene expression of DVC neurons coloured by neuron type (g), and with expression of Glp-1r (h) and Gipr (i). j,k, Heat maps showing mean gene expression of Glp-1r and Gipr in DVC neuronal populations (j) and experimental group (k), and with the colour corresponding to log-normalized expression values scaled to the maximum of each gene. We next performed cell-type prioritization analysis using Augur42 to determine which types of neurons were most affected by individual drug treatment in the DVC (Fig. The higher the Augur score, the more information about group identity is embedded in its gene expression profile, indicating a greater change in gene expression in response to drug treatment. 
Notably, we found that among the three neuronal populations that were most affected by GIPR antagonism are the two main Glp-1r-expressing clusters C35 GABA4 and C35 GABA3, but with C35 Glut10 neurons being the most affected (Fig. These same neuronal populations also ranked high (sixth, ninth and third, respectively) after treatment with acyl-GLP-1 (Fig. 7b), but ranked low following treatment with acyl-GIP or MAR709 (Fig. Collectively, these data again suggest that GIPR antagonism, unlike agonism, mimics GLP-1R agonism in the DVC and that GIPR, unlike GLP-1R, is the primary target for MAR709 in the DVC. a–d, Bar plots and UMAP representations of gene expression in DVC neurons of male DIO mice treated with a GIPR antagonist (ant.) (a), acyl-GLP-1 (b), acyl-GIP (c) or MAR709 (d) (n = 6 mice per group, from which n = 3 mice were pooled to receive n = 2 independent biological replicates per group). The bar plots and UMAPs are coloured by Augur score, representing cell type-specific changes in gene expression of the treatment group relative to DIO vehicle controls. e–g, Volcano plots (log2 fold change (FC) versus adjusted P values from a two-sided Wilcoxon rank-sum test, corrected for multiple comparison) of differentially expressed genes (DEGs) following treatment with either the GIPR antagonist or acyl-GLP-1 in the top GIPR antagonist affected neuronal clusters Glut10 (e), GABA4 (f) and GABA3 (g). Only the top 15 DEGs are highlighted. Venn diagrams show the overlap of significant DEGs (adjusted P < 0.05) from GIPR antagonist and acyl-GLP-1 groups. P values of DEGs were obtained by Wilcoxon rank-sum tests and were adjusted for multiple comparisons using the Benjamini–Hochberg method. We next compared the differentially expressed genes induced by either GIPR antagonism or GLP-1R agonism in the C35 Glut10, C35 GABA4 and C35 GABA3 clusters (Fig. 7e–g), that is, in the three neuronal populations most affected by GIPR antagonism (Fig. 
In all three neuronal clusters, we found a similar pattern of gene expression changes after treatment with the GIPR antagonist and acyl-GLP-1, with 29 genes in the C35 Glut10 cluster, 29 genes in the C35 GABA4 cluster and 45 genes in the C35 GABA3 cluster affected by both GLP-1R agonism and GIPR antagonism (Fig. Most of the genes downregulated by GLP-1R agonism and GIPR antagonism were associated with neural plasticity and synapse formation, including neuregulin 3 (Nrg3), neurexin 3 (Nrxn3), discs large MAGUK scaffold protein 2 (Dlg2), sodium leak channel, non-selective (Nalcn), neurotrimin (Ntm), leucine rich repeat and Ig domain containing 2 (Lingo2), leucine rich repeat containing 4C (Lrrc4c), interleukin 1 receptor accessory protein like 1 (Il1rapl1) and glutamate ionotropic receptor NMDA type subunit 2B (Grin2b) (Fig. Notably, Nrxn3, which encodes for a synaptic adhesion protein critical for maintaining synaptic function43, was downregulated by GIPR antagonism and GLP-1R agonism in both the C35 GABA4 and the C35 GABA3 cluster (Fig. 7f,g), while Nrg3, which regulates excitatory synapse formation44, was strongly downregulated in C35 Glut10 and C35 GABA3 neurons by GIPR antagonism but not by GLP-1R agonism (Fig. Furthermore, we found that Lrrc4c and Il1rapl1, both of which encode for factors that are involved in excitatory synapse formation45,46,47, were downregulated by GIPR antagonism and by GLP-1R agonism in the C35 Glut10 cluster (Fig. In summary, these data not only show that GIPR antagonism and GLP-1R agonism act on the same neuronal populations in the DVC, but also that they similarly downregulate gene programmes implicated in synaptic plasticity and synapse formation. We next turned our attention to the hypothalamus. 
Here, we found only low expression of Gipr across all neuronal types, while Glp-1r was more robustly expressed; particularly, and keeping with the HypoMap48 annotations, in C66-19: Pomc.GLU-5; C66–22: Caprin2.GLU-6; C66–41: Nkx2-4.GABA-3; C66–45: Ghrh.GABA-3; C66–49: Satb2.GABA-6 and C66–50: Chat.GABA-7 neurons (Fig. 7a,b), cell type prioritization analysis revealed that neuron types with high Glp-1r expression do not consistently have the largest changes in gene expression after treatment with either the GIPR antagonist or acyl-GLP-1 (Fig. This observation aligns also with a generally lower correlation in gene expression in hypothalamic neurons of both mice treated with the GIPR antagonist or acyl-GLP-1, and further in mice treated with acyl-GIP or MAR709 (Extended Data Fig. Notably, we found that C66–48: Meis2.GABA-5 neurons are the most affected by GLP-1R agonism, and C66–45: Ghrh.GABA-3 neurons are most affected by GIPR agonism, but both of these populations were less affected after treatment with the GIPR:GLP-1R co-agonist MAR709 (Extended Data Fig. a–d, UMAP representation of hypothalamic neuronal gene expression coloured by expression of Glp-1r (a), Gipr (b), experimental group (c) or C66-level neuron type (d). e, Heat maps showing Glp-1r and Gipr mean gene expression in hypothalamic neuronal types. f,g, Bar plots and UMAP representations of gene expression in DVC neurons of DIO mice treated with a GIPR antagonist (f) or acyl-GLP-1 (g). The bar plots are ranked by, and UMAPs and bar plots are coloured by Augur score, representing cell-specific change in gene expression of the experimental group versus vehicle DIO controls (f and g). h, The top ten most likely cell–cell communication events between DVC GABA4 or Glut 10 neurons and hypothalamic C66-19: Pomc.GLU-4; C66-46: Agrp.GABA-4 or all other hypothalamic neurons in DIO mice treated with vehicle, the GIPR antagonist, acyl-GLP-1 or acyl-GIP. 
To infer whether the observed changes in transcriptional gene programmes indicative of reduced synaptic plasticity by GIPR antagonism and GLP-1R agonism in the DVC translate to altered signalling in the hypothalamus, we performed a cell–cell communication analysis using the LIANA+49 implementation of the CellPhoneDB50 algorithm, and the receptor–ligand database from NeuronChat51 (Fig. Cell–cell communication analysis in dissociated single-cell data infers likely communication events from the expression of known ligand–receptor pairs across different cell types to predict interactions based on transcriptomic profiles. We found similar alterations in the probability of cell–cell communication events between C35 GABA4 and C35 Glut10 sender neurons and hypothalamic receiver neurons after treatment with acyl-GLP-1 or the GIPR antagonist compared with vehicle DIO controls, which clearly diverge from that of acyl-GIP (Fig. Treatment with the GIPR antagonist and acyl-GLP-1 both led to a reduction in the likelihood of Nrxn3-Nlgn1 signalling between C35 GABA4 neurons and Pomc.GLU-5 and Agrp.GABA-4 neurons, a change that was specific to these feeding-related neurons and absent in other hypothalamic neurons (Fig. Similarly, we observed a decrease in the likelihood of Nrxn1-Nlgn1 signalling from C35 Glut10 neurons to Pomc.GLU-5 and Agrp.GABA-4 neurons (Fig. We did not observe a difference between Pomc.GLU-5 and Agrp.GABA-4 neurons and all other hypothalamic neurons in these signalling pathways in mice treated with acyl-GIP or MAR709 (Fig. Together, these data suggest that GLP-1R agonism and GIPR antagonism may exert their effects on energy balance by downregulating signalling from DVC C35 GABA4 and C35 Glut10 neurons to hypothalamic feeding circuits. In non-neuronal cells, we found in the DVC the highest expression of Gipr in oligodendrocytes (Extended Data Fig. 
6a–d), which further constituted the most affected cell type in this area after treatment with either acyl-GIP or MAR709 (Extended Data Fig. In contrast to this, while non-neuronal Gipr expression was also in the hypothalamus highest in oligodendrocytes (Extended Data Fig. 7a–e), this cell type was, in this area, among the least affected after treatment with either acyl-GIP or MAR709 (Extended Data Fig. We also found tanycytes and ependymal cells among the most affected cell types in all treatment groups in the hypothalamus, that is, cell types with privileged access to the third ventricle and which have previously been implicated in the food intake inhibitory effects of the GLP-1R agonist liraglutide52. To compare cell types with the neuronal populations we identified, we integrated our DVC data with two publicly available murine DVC datasets from Hes et al.53 and from Ludwig et al.54. This presented a particular challenge as, in addition to the expected variation from different laboratories, each dataset had been produced using different experimental groups. As each dataset has multiple experimental groups, to correct for the variance between studies while preserving variance between cell types and experimental groups, we trained the single-cell variational inference (scVI) model55 on the control groups from each study with the study as the batch key, and then integrated the experimental groups (Extended Data Fig. Most experimental groups from all three datasets integrated well, however, neurons from mice treated once daily with semaglutide in the Ludwig54 dataset did not integrate well, suggesting that longer-term administration of GLP-1R agonists continue to have a large impact on DVC neuron cell state after the initial dose, although this difference may be confounded by the reduction in body weight (Extended Data Fig. 
To validate our cell typology framework, we predicted the cell-type labels from our framework using progressive learning through scHPL56 and compared them to the author-provided cell types for both the Hes53 and Ludwig54 datasets (Extended Data Figs. We observed good concordance of major cell types at the C12 annotation level between our data and both the Hes53 and Ludwig54 datasets; however, at the C35 level, most neuron subclusters mapped to the largest glutamatergic or GABAergic neuron cluster, probably owing to the persistent differences between cells from the different experimental conditions. In this study, we assessed the effect on energy metabolism by GIPR antagonism in several mouse lines with global or targeted deficiency of Gipr or Glp-1r. We further delineated the transcriptional similarities and differences of GIPR (ant)agonism, GLP-1R agonism and GIPR:GLP-1R co-agonism in the hypothalamus and DVC of DIO mice using snRNA-seq analysis. Similar to GIPR agonism15,18, we found the reduction of body weight and food intake caused by GIPR antagonism was eliminated in mice with global loss of Gipr. However, while we and others previously showed that GIPR agonism remains fully efficacious to decrease body weight and food intake in mice deficient for Glp-1r15,18, here we found that loss of Glp-1r renders mice resistant to weight loss and inhibition of food intake by GIPR antagonism. Furthermore, while we and others previously showed that GIPR agonism decreases body weight and food intake via Gipr signalling in GABAergic neurons14,19, here we found that the ability of GIPR antagonism to amplify GLP-1-induced weight loss does not depend on the presence of GIPR in GABAergic neurons. Likewise, we show that the ability of GIPR antagonism to enhance GLP-1-induced weight loss is also preserved in mice with peripherin-Cre-mediated loss of Gipr in the PNS. 
We should note here that the preservation of weight loss and food intake inhibition by GIPR antagonism in mice with disturbed GIPR signalling in the PNS is not unexpected, given that mice with CNS loss of Gipr (thus, mimicking the use of an antagonist) show decreased body weight and food intake when fed a HFD15, suggesting that the reduction in body weight by GIPR antagonism is mediated by neurons of the central rather than peripheral nervous system. Consistent with this is also our observation that the body weight-lowering effects of GIPR antagonism depend on GLP-1R signalling, which are likewise mediated via central rather than peripheral mechanisms40. Relevant brain areas implicated in GLP-1 control of body weight and food intake include the hypothalamic arcuate nucleus57,58,59 and the hindbrain DVC59,60, hence the same brain regions that are targeted by long-acting GIPR agonists15,20,21. Notably, the same brain regions are also targeted by the bispecific GIPR antagonist-GLP-1R agonist antibody, GIPR-Ab/GLP-1, as shown in an accompanying manuscript by Liu et al.61. In that study, the authors further utilized pharmacology and mouse genetics to provide complementary evidence supporting a role for attenuation of CNS GIPR signalling in the enhancement of the weight loss effects induced by the GLP-1R agonist dulaglutide61. Furthermore, weight loss achieved using GIPR-Ab/GLP-1 was attenuated in mice with CNS loss of either Gipr or Glp-1r61. Collectively, these findings are consistent with the major experimental findings described herein, and further corroborate that GIPR antagonism acts centrally to amplify GLP-1-induced weight loss. Nonetheless, although mice with GIPR signal inhibition in the PNS do not show alterations in body weight and remain fully sensitive to weight loss induced by GIPR antagonism, these mice develop glucose intolerance with impaired glucose-induced secretion of insulin and GIP when fed a HFD. 
We hence establish a crucial role of GIPR signalling in peripheral neurons in the regulation of glucose homeostasis, but not body weight, under conditions of diet-induced obesity. Collectively, our data show that GIPR agonism and antagonism decrease body weight and food intake via different neuronal mechanisms, with GIPR antagonism, unlike agonism, depending on GLP-1R signalling but not GIPR signalling in either GABAergic or peripheral neurons. In agreement with this finding, our snRNA-seq analysis revealed that GIPR antagonism, but not agonism, mimics GLP-1R agonism in the DVC. DVC neuronal gene expression changes correlate negatively in mice after treatment with GIPR agonism versus antagonism, but positively in mice treated with GIPR antagonism versus GLP-1R agonism. We observed the greatest transcriptional changes induced by GIPR antagonism in the C35 GABA4, C35 GABA3 and C35 Glut10 neurons, which were also among the highest affected neuronal populations targeted by GLP-1R agonism, but not by GIPR agonism. Interestingly, within these neuronal clusters, GIPR antagonism and GLP-1R agonism are separated from GIPR agonism in that they both similarly downregulate gene programmes indicative of neuronal plasticity and synapse formation. These findings further support the notion that GIPR antagonism and GLP-1R agonism are functionally related and act similarly on DVC neurons, and in a clearly distinct manner from GIPR agonism. In summary, we show here that GIPR agonism and antagonism affect body weight and food intake via different, rather than similar mechanisms, with GIPR antagonism affecting body weight and food intake via modulation of GLP-1R signalling. The observation that gene expression changes induced by GIPR agonism versus its antagonism correlate negatively further argues that our GIPR agonist is not a functional antagonist. It warrants clarification as to how GIPR antagonism decreases body weight in a GLP-1R-dependent manner. 
The observation that the body weight-lowering effect of GIPR antagonism vanishes in mice with global deletion of both Gipr and Glp-1r potentially points to an inhibitory mechanism by which non-GABAergic GIPR+ neurons partially silence GLP-1R+ neurons so that the latter are less than maximally efficacious. Antagonization of these GIPR+ neurons may thus either directly or indirectly derepress the action of downstream GLP-1R+ neurons to further decrease body weight and food intake. Notably, like previous studies21, we here find expression of Gipr enriched in 5-HT neurons. Given their established role in regulating hunger and satiety62,63 and the recent demonstration that the 5-HT2C receptor agonist lorcaserin acts on brainstem GLP-1 neurons to augment food intake suppression by GLP-1R agonism64, it seems plausible to hypothesize that weight loss induced by GIPR signal modification may involve modulation of the hypothalamic and/or hindbrain serotonergic system. Limitations of our study include that peripherin-Cre does not target all neurons of the PNS. We hence cannot exclude the possibility that peripherin-negative neurons of the PNS play a functional role in the metabolic effects of GIPR antagonism. Since expression of peripherin is not fully exclusive for the PNS, we further cannot exclude the possibility that GIPR was also deleted in our studies in peripherin-expressing neurons outside the PNS. Different molecules with GIPR (ant)agonism may further differ in their pharmacokinetics, including their biodistribution and brain penetrance, which may affect their mode of action in the brain and the periphery. The lack of commonly available and sufficiently selective antibodies to detect GIPR further remains a notable limitation that hinders in-depth immunohistochemical analysis of GIPR in the brain. 
Notably, expression of drug effects appears generally more robust when comparing relative as compared with absolute values, which is a common problem in biomedical sciences that resides in the typically observed greater data variability in absolute versus relative data. Another limitation of our study is that we only compared drug effects using snRNA-seq after single acute drug treatment, hence not allowing conclusions on transcriptomic changes after more chronic treatment. Further limitations are that the Vgat-Gipr KO and WT mice differ in their starting body weight, which urges caution when comparing drug-induced effects across genotypes. We further only demonstrated the GLP-1R-dependent body weight-lowering effect of GIPR antagonism in mice with global deletion of GLP-1R and GIPR. It warrants clarification whether this effect holds true also in mice with more CNS-targeted deletion of GIPR and GLP-1R. We further only tested drug effects in DIO and glucose intolerant male mice, since female mice are largely resistant to development of diet-induced obesity and glucose intolerance65. It should further be noted that measures of drug effects on body weight are generally more robust than changes in food intake, since mice often have a tendency to shred their food, which if unnoticed, may contribute to a certain degree of bias in the analysis. To not interfere with drug-induced body weight effects, we could further only measure glucose tolerance at the end of the study. Since instant assessment of insulin tolerance using an intraperitoneal (i.p.) insulin tolerance test was not possible due to animal ethics reasons, we were further only able to measure insulin sensitivity using the HOMA-IR, which nonetheless correlates well with direct measures of insulin sensitivity using either i.p. 
Experiments were performed in accordance with the Animal Protection Law of the European Union after permission by the Government of Upper Bavaria, or the Eli Lilly and Company Institutional Animal Care and Use Committee. Mice were double or single housed and, unless otherwise indicated, fed ad libitum with either a regular chow (1314, Altromin) or a HFD (58% fat, D12331, Research Diets) diet under constant ambient conditions of 22 ± 2 °C with constant humidity (45–65%) and a 12 h/12 h light/dark cycle. C57BL/6J Vgat-ires-cre knock-in mice were purchased from The Jackson Laboratory (028862) and crossed with C57BL6/J Giprflx/flx mice35,36 to generate Vgat-cre+/−Giprflx/flx (Vgat-Gipr KO) mice and Vgat-cre+/−Giprwt/wt (WT) controls. Per-Cre mice37 (MGI ID:3841120) were crossed with C57BL/6J mice for >10 generations before pairing with C57BL6/J Giprflx/flx mice35,36 to receive Per-cre+/−Giprflx/flx (Per-Gipr KO) mice and Per-cre+/−Giprwt/wt (WT) controls. Body composition was analysed using a magnetic resonance whole-body composition analyser (EchoMRI). For assessment of drug effects under room temperature (22 ± 2 °C), male age-matched mice were double housed and fed with a 58% HFD (D12331, Research Diets) for approximately 20 weeks, followed by random assignment into groups of matched genotype, body weight and body composition. Mice were treated at the indicated doses with either long-acting acyl-GIP (IUB0271)13,14,15, acyl-GLP-1 (IUB1746)13,14,15, the GIPR:GLP-1R co-agonist MAR709 (refs. 13,14,15) or an acylated peptide GIPR antagonist ([Nα-Ac,L14,R18,E21]hGIP(5–31)-K11(γE-C16))22. All peptides were provided by the Novo Nordisk Research Center Indianapolis, and have been previously validated in vitro and in vivo for receptor specificity and their ability to decrease body weight in DIO mice13,14,15,22,41. All sequences of the used peptides are published elsewhere13,14,15,22. 
For assessment of drug effects under thermoneutrality (28 °C), 12–14-week-old male age-matched mice were acclimatized to the housing temperature 2 weeks before start of the studies. At study start, male C57BL/6J WT, as well as global Glp-1r−/− and Gipr−/− deficient mice continued to be housed at thermoneutrality (28 °C), were given ad libitum access to a HFD (60% fat, D12492; Research Diets) and were treated with a single dose (30 mg kg−1) of either a control mAb or a GIPR antagonist mAb23 (synthesized and provided by Eli Lilly and Company). Plasma levels of glucose and insulin were measured after 6 h fasting. For assessment of glucose tolerance, glucose was administered at a dose of 1.5–2 g kg−1. For assessment of insulin tolerance, insulin (Humalog; Eli Lilly) was injected i.p. at a dose of 0.75–1.5 U kg−1. HbA1c was assessed from fresh blood using the DCA Vantage Analyzer (Siemens). For assessment of glucose-induced insulin secretion, glucose was given orally at a dose of 4 g kg−1 in 6 h fasted mice, followed by blood sampling at timepoints 0, 2, 5, 15 and 30 min after glucose administration. Commercially available enzyme-linked immunosorbent assays (ELISAs) were used according to the manufacturer's instructions to measure insulin (Crystal Chem Zaandam, 90080), total GIP (Sigma-Aldrich, EZRMGIP-55K), triglycerides (Wako Chemicals, 290-63701 or Abcam, ab65336) or total cholesterol (Thermo Fisher Scientific, 10178058). Energy expenditure, food intake, respiratory exchange ratio (RER) and locomotor activity were assessed for 96–132 h, after 24 h of acclimatization, in single-housed mice using the Promethion climate-controlled indirect calorimetric system (Sable Systems). For assessment of acute food intake, mice were treated with either vehicle or acyl-GIP (IUB0271)13,14,15 at the indicated doses, followed by measurement of food intake for 16 h. Data for energy expenditure were analysed using analysis of covariance (ANCOVA) with body weight as a covariate70,71. 
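The ANCOVA adjustment described above can be illustrated with a minimal sketch. It is not the analysis code used in this study: it assumes a simple linear model (energy expenditure = intercept + treatment effect + slope × body weight), fits it by ordinary least squares and reports only the coefficients, without the significance testing that a full ANCOVA provides. The function names and the two-group coding (0 = vehicle, 1 = treated) are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ancova_fit(energy, group, body_weight):
    """Least-squares fit of energy ~ 1 + group + body_weight.

    Returns [intercept, group effect adjusted for body weight,
    body-weight slope]. Assumption: one common slope for both groups.
    """
    rows = [[1.0, float(g), float(w)] for g, w in zip(group, body_weight)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, energy)) for i in range(3)]
    return solve_linear_system(xtx, xty)
```

The second coefficient is the quantity of interest: the difference in energy expenditure between groups at equal body weight, which is why body weight enters as a covariate rather than as a divisor.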
Total RNA was isolated using the RNeasy Kit (Qiagen) according to the manufacturer's instructions. cDNA synthesis was performed using the QuantiTect Reverse Transcription kit (Qiagen) or the High-Capacity cDNA Reverse Transcription kit (Thermo Fisher Scientific), according to the manufacturer's instructions. Gene expression was profiled using SYBR green (Thermo Fisher Scientific) and the Quantstudio 7 flex cycler (Applied Biosystems). The relative expression levels of each gene were normalized to the housekeeping gene peptidylprolyl isomerase A (Ppia), hypoxanthin-phosphoribosyl-transferase 1 (Hprt) or the TATA-binding protein (Tbp). Primer sequences were Ppia-F: 5′-GAG CTG TTT GCA GAC AAA GTT C-3′; Ppia-R: 5′-CCC TGG CAC ATG AAT CCT GG-3′; Hprt-F: 5′-AAG CTT GCT GGT GAA AAG GA-3′; Hprt-R: 5′-TTG CGC TCA TCT TAG GCT TT-3′; Gipr-F: 5′-GGC CCA GAT CAT GAC CCA AT-3′; Gipr-R: 5′-AGC CAA GAA GCA GGT AGC AG-3′; Prph-F: 5′-AAG TTT AAA GAC GAC TGT GCC TG-3′; Prph-R: 5′-TGC TGT TCC TTC TGG GAC TCT-3′; Tbp-F: 5′-GAA GCT GCG GTA CAA TTC CAG-3′; Tbp-R: 5′-CCC CTT GTA CCC TTC ACC AAT-3′. All raw CT values are stated in the Source data. For brain isolation, mice were perfused with PBS, followed by 4% paraformaldehyde (PFA). Brains were then fixed for 24 h at 4 °C in 4% PFA and then transferred to 15% sucrose for 24 h, followed by 24 h in 30% sucrose at 4 °C. For DRG, trigeminal ganglion and nodose ganglion, tissues were extracted and fixed for 1–2 h in 4% PFA and transferred to 30% sucrose overnight at 4 °C. All tissues were frozen in Tissue-Tek O.C.T (Sakura Finetek, 4583), cut in 12–14 µm sections and placed on microscopic slides (Thermo Fisher Scientific, 10149870). The slides were heated for 30 min at 60 °C followed by antigen retrieval using a steamer, then processed by the RNAscope Multiplex Fluorescent Reagent kit v2 (Advanced Cell Diagnostics, 323270) according to the manufacturer's instructions. 
In brief, a custom-made probe was designed to bind to the deleted exons of mouse Gipr (Advanced Cell Diagnostics, 1138821-C1) and Vgat (Advanced Cell Diagnostics, 319191-C2) hybridized to the RNA, before preamplifiers, amplifiers and dyes were added for visualization of GIPR and Vgat. The slides were incubated with rabbit anti-peripherin antibody (Thermo Fisher Scientific, PA316723; 1:200) for 1 h at room temperature, followed by 30 min incubation with goat anti-rabbit-HRP (Thermo Fisher Scientific, A16096, 1:1,000) at room temperature. TSA vivid dyes 650 and 520 (Advanced Cell Diagnostics, 323271 and 323273, both 1:500 dilution) were added to detect GIPR, peripherin or vesicular GABA transporter (VGAT), respectively. Slides were counterstained with 4,6-diamidino-2-phenylindole (DAPI) (Advanced Cell Diagnostics, 320858) and imaged using Leica SP8 Laser Confocal Microscope using LAS X (version 3.5.7.23225). Mice were killed by cervical dislocation, followed immediately by clamping of the bile duct and perfusion with collagenase P (Roche Diagnostics, 11249002001). Tissues were incubated in a 15 ml Falcon tube with 1 ml of collagenase P solution for 15 min at 37 °C, followed by the addition of 12 ml of cold G-solution (Sigma-Aldrich) and centrifugation at 586g at room temperature. The pellet was subsequently washed with 10 ml of G-solution (500 ml HBSS (Life Technologies, BE10-508F) with 10% BSA (Sigma-Aldrich, 126615-25 ml) and 1% penicillin–streptomycin (Life Technologies, 15140122)) and resuspended in 5.5 ml of gradient solution comprising 15% Optiprep (5 ml 10% RPMI (Life Technologies, 11875093) + 3 ml of 40% Optiprep that was diluted from 60% Optiprep with G-solution (Sigma-Aldrich, D1556)) per sample, and placed on top of 2.5 ml of the gradient solution. To form a three-layer gradient, 6 ml of the G-solution was added on the top. Samples were then incubated for 10 min at room temperature and centrifuged at 630g. 
The interphase was then collected and filtered through a 70 μm nylon filter (BD Falcon, 352350), before washing with G-solution. Islets were handpicked by a micropipette under the microscope and cultured in RPMI 1640 medium (Life Technologies, 11875093) overnight. The supernatant was collected as a sample under the low glucose condition for 45 min incubation, and islets were incubated for another 45 min at 37 °C with Krebs Ringer HEPES buffer containing 16.7 mM glucose and supplements as above. The supernatant was collected as a sample under the high glucose condition and stored at −20 °C. For drug-induced insulin secretion, native mouse GIP or GLP-1 (provided by Novo Nordisk) were diluted in 1× KRK buffer with 20 mM glucose to reach a concentration of 50 nM. Cells were subsequently treated with either mouse GIP or GLP-1 for 45 min. Insulin concentrations were determined using a Mouse Insulin ELISA (Crystal Chem, 90082). For snRNA-seq, 35-week-old DIO mice were treated 2 h before the end of the light phase with a single s.c. injection of either vehicle (PBS), acyl-GIP (150 nmol kg−1)13,14,15, acyl-GLP-1 (50 nmol kg−1)13,14,15, the GIPR:GLP-1R co-agonist MAR709 (50 nmol kg−1)13,14,15 or an acylated peptide GIPR antagonist (1,500 nmol kg−1)22. The hypothalamus and DVC were collected 8 h after drug administration and stored in liquid nitrogen. Mice were euthanized followed by immediate decapitation and then the skull was removed. An earlier alignment of the brain was determined using a brain matrix and the entire hypothalamus (includes all the nuclei) was collected by microdissection. The hindbrain DVC was microdissected in an area postrema-centric manner after removal of cerebellar cortex. Tissue samples were flash frozen into liquid nitrogen and frozen tissues were stored in liquid nitrogen vapour phase for further processing to single-nuclei isolation. 
Nuclei were isolated using the 10X Genomics Chromium Nuclei Isolation kit including RNase Inhibitor (10X Genomics, PN-1000494), and using the 10X Genomics protocol for Single Cell Multiome ATAC + Gene Expression (10X Genomics, CG000505 Rev A). Nuclei concentration was determined using a Luna-II Automated Cell Counter (Logos biosystems, L40002) and adjusted to 6,250 nuclei per microlitre after pooling of n = 3 mice per sample. Nuclei were then processed using the 10X Genomics Chromium Next GEM Single Cell Multiome ATAC + Gene Expression (Rev. E) according to the manufacturer's instructions. Pooled samples were loaded into two lanes per group for a total of 24 lanes across three 10X Chromium chips. Equal numbers of cells per sample were loaded on a 10X Genomics Chromium controller instrument to generate single-cell gel beads in emulsion at the Helmholtz Munich Genomics Core Facility. Single-nucleus multimodal libraries were sequenced using the Illumina NovaSeq 6000. FASTQ files were generated from base calls with bcl2fastq software v2.20 (Illumina). Reads were mapped to the pre-built mm10 mouse reference (University of California Santa Cruz mm10 reference genome) using Cell Ranger ARC (v2.0.2, 10X Genomics) with default parameters. The resulting cell-by-peak and cell-by-gene matrices (ATAC and gene expression assays, respectively) from the 24 samples were integrated using Cell Ranger aggr (10X Genomics). The raw gene expression matrix was filtered by removing cells whose mitochondrial gene unique molecular identifier (UMI) counts were more than five mean absolute deviations higher, cells with fewer than 500 detected genes and cells whose total UMI counts deviated by more than five mean absolute deviations. Scrublet72 was used to identify likely doublets, Leiden clustering was performed and clusters consisting mostly of likely doublets were removed. After filtering, 211,537 nuclei from the hypothalamus and 57,798 nuclei from the DVC were used for further analysis. 
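The cell-level filter described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that "mean absolute deviation" is taken literally (mean of absolute deviations around the mean), that both count thresholds are upper bounds, and that per-cell summaries are already available as arrays; the function names are hypothetical.

```python
import numpy as np

def mean_abs_dev(x):
    """Mean absolute deviation around the mean (literal reading of the text)."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(x - x.mean()))

def is_high_outlier(x, n_devs=5.0):
    """Flag values more than n_devs mean absolute deviations above the mean."""
    x = np.asarray(x, dtype=float)
    return x > x.mean() + n_devs * mean_abs_dev(x)

def qc_keep_mask(mito_counts, total_counts, n_genes, n_devs=5.0, min_genes=500):
    """Boolean mask of cells that pass all three filters (hypothetical helper)."""
    drop = (
        is_high_outlier(mito_counts, n_devs)      # excess mitochondrial UMI counts
        | is_high_outlier(total_counts, n_devs)   # excess total UMI counts (assumed upper bound)
        | (np.asarray(n_genes) < min_genes)       # too few detected genes
    )
    return ~drop
```

A median-based deviation is a common alternative in single-cell quality control; swapping it in would only change `mean_abs_dev` and the centre used in `is_high_outlier`.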
The processed gene expression matrix was imported into Scanpy (v1.9.8)73 and normalized using Scran74. The 4,000 most-variable genes were used for principal component analysis and the top 50 principal components were used for the Uniform Manifold Approximation and Projection (UMAP) visualization. We built a k-nearest neighbour graph for clustering using k = 50 nearest neighbours. Then, the Leiden clustering algorithm was used to group the cells into different clusters. To annotate hypothalamic cells, we used scArches75 to transfer labels from the HypoMap48 at the C66 cell annotation level. The expression of HypoMap marker genes was then evaluated in each Leiden cluster, and each cluster was manually annotated on the basis of marker gene expression and the majority cell type from the scArches label transfer. The HypoMap hierarchical cell-type annotation framework was then used to map C25, C7 and C2 cell-type annotations. To annotate DVC cell types, as there is no atlas and annotation framework for the DVC comparable to the HypoMap, each Leiden cluster was manually annotated into 35 fine-grained cell types (C35 cell type), which were then mapped to the coarser-grained C12 and C2 cell-type levels based on the expression of marker genes. DVC neurons were further subdivided into individual clusters labelled by major neurotransmitter expression. For comparison of gene expression differences in DVC and hypothalamic neurons, the mean log fold difference in normalized expression of each gene was compared between the treatment groups and the DIO control group. Linear regression and Spearman correlation coefficients were calculated between treatment groups. Cell type prioritization was done using the Pertpy76 implementation of Augur42, which uses a random forest classifier to assess how accurately the experimental condition of cells within a given cell type can be predicted based on their gene expression profiles. 
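The comparison of treatment effects described above (a per-gene log fold difference against the control group, then a Spearman correlation between treatments) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: inputs are assumed to be cells-by-genes matrices of log-normalized expression, so a difference of means stands in for the log fold difference, and all function names are hypothetical.

```python
import numpy as np

def mean_log_fold_difference(treated, control):
    """Per-gene difference in mean log-normalized expression (cells x genes)."""
    return np.asarray(treated, dtype=float).mean(axis=0) - np.asarray(control, dtype=float).mean(axis=0)

def average_ranks(x):
    """1-based ranks, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    sorted_x = x[order]
    ranks = np.empty(len(x), dtype=float)
    i = 0
    while i < len(x):
        j = i
        # extend j to the end of the current run of tied values
        while j + 1 < len(x) and sorted_x[j + 1] == sorted_x[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])
```

In use, one would compute `mean_log_fold_difference` for each treatment against the vehicle group and pass the two resulting per-gene vectors to `spearman`.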
Cell type prioritization comparisons are always made between the experimental group and the DIO vehicle control. Differential gene expression analysis was performed using Scanpy's tl.rank_genes_groups function to identify genes that were differentially expressed between two experimental groups within a given cell type; genes with fewer than 30 counts were filtered out for each comparison. The Wilcoxon rank-sum test was applied to assess differences in gene expression between groups. Default parameters were used, and multiple testing was accounted for by adjusting P values using the Benjamini–Hochberg method. Only genes with an adjusted P value <0.05 were considered statistically significant. Cell–cell communication between DVC and hypothalamic cells was inferred using the LIANA+49 implementation of the CellPhoneDB50 algorithm combined with the receptor–ligand database from NeuronChat51. DVC neurons were specified as sender cell types and hypothalamic neurons as receiver cell types. We selected the 2,500 most variable genes in our DVC snRNA-seq data and used scVI55 to integrate snRNA-seq data from Ludwig et al.54 and Hes et al.53, subset to the same 2,500 genes. We used treeArches to create a manual tree with three layers of cell-type granularity in our data. We mapped our own data with the Hes53 and Ludwig54 datasets into a joint latent space using scArches75, and then used the parameters for hierarchical progressive learning from scHPL v1.0.5 (ref. 56) to predict which cell type from our annotation framework each cell type from the Hes53 and Ludwig54 datasets corresponds to. In vivo studies were performed in male or female age-matched mice that were randomly distributed to achieve groups of equal body weight and body composition. The number of independent biological samples per group is indicated in the figure legends and Source data. 
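For the differential expression step described earlier in this section, the Benjamini–Hochberg adjustment of P values can be sketched in a few lines. This is a generic illustration of the procedure, not the Scanpy internals; the function name is hypothetical.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # scale each sorted P value by n / rank
    scaled = p[order] * n / np.arange(1, n + 1)
    # enforce monotonicity, working down from the largest P value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n, dtype=float)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted
```

Genes would then be called significant where the adjusted value falls below 0.05, matching the threshold stated in the text.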
No animals were excluded from the studies unless health issues demanded exclusion of single mice (for example, due to fighting injuries) as indicated in the Source data. For in vivo studies, drugs were aliquoted by a lead scientist in number-coded vials and most, but not all, handling investigators were blinded to the treatment condition. Analyses of glucose and insulin tolerance were performed by experienced research assistants who did not know prior treatment conditions. For animal studies, sample sizes were calculated based on a power analysis assuming that a body weight difference of ≥5 g between the treatment groups can be captured with a power of ≥75% when using a two-sided, two-tailed statistical test under the assumption of a s.d. Statistical analyses were performed using the statistical tools implemented in GraphPad Prism10 (version 10.0.3) and after testing of data for normal distribution using the Kolmogorov–Smirnov test, D'Agostino and Pearson test, Anderson–Darling test or Shapiro–Wilk test implemented in GraphPad Prism (version 10.0.3). Nonparametric tests such as the Mann–Whitney U test or the Kruskal–Wallis test were used to analyse data that were not normally distributed. Normally distributed data were analysed with the following parametric tests: two-tailed Student's t-test, one-way analysis of variance (ANOVA) or two-way ANOVA with time and genotype as covariates followed by Bonferroni's post hoc multiple comparison test for individual timepoints. All results are given as mean ± s.e.m. Differences in energy expenditure were calculated using ANCOVA with body weight as covariate using SPSS (version 24). No data were excluded from the analysis except for animal welfare reasons (for example, injury due to fighting) or identification of a single outlier using Grubbs' test. Individual P values and outliers are shown in the Source data, unless P < 0.0001. 
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article. The snRNA-seq data are available in the GEO under SuperSeries accession number GSE288514. All data used for the statistical analysis are available in the data source file, along with the GraphPad Prism-derived report on the statistical analysis. Source data are provided with this paper. Continued treatment with tirzepatide for maintenance of weight reduction in adults with obesity: the SURMOUNT-4 randomized clinical trial. Garvey, W. T. et al. Tirzepatide once weekly for the treatment of obesity in people with type 2 diabetes (SURMOUNT-2): a double-blind, randomised, multicentre, placebo-controlled, phase 3 trial. Jastreboff, A. M. et al. Tirzepatide once weekly for the treatment of obesity. Wadden, T. A. et al. Tirzepatide after intensive lifestyle intervention in adults with overweight or obesity: the SURMOUNT-3 phase 3 trial. Dahl, D. et al. Effect of subcutaneous tirzepatide vs placebo added to titrated insulin glargine on glycemic control in patients with type 2 diabetes: the SURPASS-5 randomized clinical Trial. Del Prato, S. et al. Tirzepatide versus insulin glargine in type 2 diabetes and increased cardiovascular risk (SURPASS-4): a randomised, open-label, parallel-group, multicentre, phase 3 trial. Frias, J. P. et al. Tirzepatide versus semaglutide once weekly in patients with type 2 diabetes. Ludvik, B. et al. Once-weekly tirzepatide versus once-daily insulin degludec as add-on to metformin with or without SGLT2 inhibitors in patients with type 2 diabetes (SURPASS-3): a randomised, open-label, parallel-group, phase 3 trial. Rosenstock, J. et al. Tirzepatide vs insulin lispro added to basal insulin in type 2 diabetes: the SURPASS-6 randomized clinical trial. Rosenstock, J. et al. 
Efficacy and safety of a novel dual GIP and GLP-1 receptor agonist tirzepatide in patients with type 2 diabetes (SURPASS-1): a double-blind, randomised, phase 3 trial. Irwin, N. & Flatt, P. R. Therapeutic potential for GIP receptor agonists and antagonists. Akindehin, S. et al. Loss of GIPR in LEPR cells impairs glucose control by GIP and GIP:GLP-1 co-agonism without affecting body weight and food intake in mice. Liskiewicz, A. et al. Glucose-dependent insulinotropic polypeptide regulates body weight and food intake via GABAergic neurons in mice. The glucose-dependent insulinotropic polypeptide (GIP) regulates body weight and food intake via CNS–GIPR signaling. Coskun, T. et al. LY3298176, a novel dual GIP and GLP-1 receptor agonist for the treatment of type 2 diabetes mellitus: from discovery to clinical proof of concept. Finan, B. et al. Unimolecular dual incretins maximize metabolic benefits in rodents, monkeys, and humans. Mroz, P. A. et al. Optimized GIP analogs promote body weight lowering in mice through GIPR agonism not antagonism. Specific loss of GIPR signaling in GABAergic neurons enhances GLP-1R agonist-induced body weight loss. Adriaenssens, A. et al. Hypothalamic and brainstem glucose-dependent insulinotropic polypeptide receptor neurons employ distinct mechanisms to affect feeding. Adriaenssens, A. E. et al. Glucose-dependent insulinotropic polypeptide receptor-expressing cells in the hypothalamus regulate food intake. Discovery of a potent GIPR peptide antagonist that is effective in rodent and human systems. Killion, E. A. et al. Anti-obesity effects of GIPR antagonists alone and in combination with GLP-1R agonists in preclinical models. Lu, S. C. et al. GIPR antagonist antibodies conjugated to GLP-1 peptide are bispecific molecules that decrease weight in obese mice and monkeys. Veniant, M. M. et al. A GIPR antagonist conjugated to GLP-1 analogues promotes weight loss with improved metabolic parameters in preclinical and phase 1 settings. 
Kaneko, K. et al. Gut-derived GIP activates central Rap1 to impair neural leptin sensitivity during overnutrition. Jensen, M. H. et al. AT-7687, a novel GIPR peptide antagonist, combined with a GLP-1 agonist, leads to enhanced weight loss and metabolic improvements in cynomolgus monkeys. & Rosenkilde, M. M. GIP as a therapeutic target in diabetes and obesity: insight from incretin co-agonists. Okawa, T. et al. Sensory and motor physiological functions are impaired in gastric inhibitory polypeptide receptor-deficient mice. A., Gasis, M., Thorens, B., Muller, H. W. & Bosse, F. Glucose-dependent insulinotropic polypeptide (GIP) and its receptor (GIPR): cellular localization, lesion-affected expression, and impaired regenerative axonal growth. & Schlaich, M. P. Relevance of sympathetic nervous system activation in obesity and metabolic syndrome. Asmar, M. et al. GIP-induced vasodilation in human adipose tissue involves capillary recruitment. The gluco- and liporegulatory and vasodilatory effects of glucose-dependent insulinotropic polypeptide (GIP) are abolished by an antagonist of the human GIP receptor. Campbell, J. E. et al. TCF1 links GIPR signaling to the control of beta cell function and survival. Ussher, J. R. et al. Inactivation of the glucose-dependent insulinotropic polypeptide receptor improves outcomes following experimental myocardial infarction. Zhou, L. et al. Murine peripherin gene sequences direct Cre recombinase expression to peripheral neurons in transgenic mice. A new neuronal intermediate filament protein. & Gainer, H. NF-L and peripherin immunoreactivities define distinct classes of rat sensory ganglion cells. Sisley, S. et al. Neuronal GLP1R mediates liraglutide's anorectic but not glucose-lowering effect. Cell type prioritization in single-cell data. & He, G. Structure, function, and pathology of Neurexin-3. Mei, L. & Nave, K. A. Neuregulin-ERBB signaling in the nervous system and neuropsychiatric diseases. DeNardo, L. 
A., de Wit, J., Otto-Hitt, S. & Ghosh, A. NGL-2 regulates input-specific synapse development in CA1 pyramidal neurons. Choi, Y. et al. NGL-1/LRRC4C deletion moderately suppresses hippocampal excitatory synapse development and function in an input-independent manner. Montani, C., Gritti, L., Beretta, S., Verpelli, C. & Sala, C. The synaptic and neuronal functions of the X-linked intellectual disability protein interleukin-1 receptor accessory protein like 1 (IL1RAPL1). Steuernagel, L. et al. HypoMap-a unified single-cell gene expression atlas of the murine hypothalamus. Dimitrov, D. et al. LIANA+ provides an all-in-one framework for cell–cell communication inference. & Vento-Tormo, R. CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand-receptor complexes. Zhao, W., Johnston, K. G., Ren, H., Xu, X. & Nie, Q. Inferring neuron–neuron communications from single-cell transcriptomics through NeuronChat. Imbernon, M. et al. Tanycytes control hypothalamic liraglutide uptake and its anti-obesity actions. Hes, C. et al. A unified rodent atlas reveals the cellular complexity and evolutionary divergence of the dorsal vagal complex. Ludwig, M. Q. et al. A genetic map of the mouse dorsal vagal complex and its role in obesity. Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Michielsen, L., Reinders, M. J. T. & Mahfouz, A. Hierarchical progressive learning of cell identities in single-cell data. The arcuate nucleus mediates GLP-1 receptor agonist liraglutide-dependent weight loss. The hypothalamic glucagon-like peptide 1 receptor is sufficient but not necessary for the regulation of energy balance and glucose homeostasis in mice. Gabery, S. et al. Semaglutide lowers body weight in rodents via distributed neural pathways. Huang, K. P. et al. Dissociable hindbrain GLP1R circuits for satiety and aversion. Liu, C. M. et al. 
GIPR-Ab/GLP-1 peptide–antibody conjugate requires brain GIPR and GLP-1R for additive weight loss in obese mice. He, Y. et al. 5-HT recruits distinct neurocircuits to inhibit hunger-driven and non-hunger-driven feeding. Xu, Y. et al. 5-HT2CRs expressed by pro-opiomelanocortin neurons regulate energy homeostasis. Wagner, S. et al. Obesity medication lorcaserin activates brainstem GLP-1 neurons to reduce food intake and augments GLP-1 receptor agonist induced appetite suppression. Jall, S. et al. Monomeric GLP-1/GIP/glucagon triagonism corrects obesity, hepatosteatosis, and dyslipidemia in female mice. Antunes, L. C., Elkfury, J. L., Jornada, M. N., Foletto, K. C. & Bertoluci, M. C. Validation of HOMA-IR in a model of insulin-resistance induced by a high-fat diet in Wistar rats. Comparison between surrogate indexes of insulin sensitivity and resistance and hyperinsulinemic euglycemic clamp estimates in mice. Comparison between surrogate indexes of insulin sensitivity/resistance and hyperinsulinemic euglycemic clamp estimates in rats. Muller, T. D., Klingenspor, M. & Tschop, M. H. Revisiting energy expenditure: how to correct mouse metabolic rate for body mass. Tschop, M. H. et al. A guide to analysis of mouse energy metabolism. Wolock, S. L., Lopez, R. & Klein, A. M. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. & Marioni, J. C. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. Lotfollahi, M. et al. Mapping single-cell data to reference atlases by transfer learning. Heumos, L. et al. Pertpy: an end-to-end framework for perturbation analysis. This work was funded by the European Union within the scope of the European Research Council ERC-CoG Trusted no. 
The views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the awarding authority can be held responsible for them. further received funding from the German Research Foundation (grant nos. DFG TRR296, TRR152, SFB1123 and GRK 2816/1) and the German Center for Diabetes Research (DZD e.V.). The skilful technical support of the Core Facility Genomics at Helmholtz Munich is highly acknowledged. We further thank S. Padmarasu and I. De la Rosa Velazquez for their help with the snRNA-seq. Open access funding provided by Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH). These authors contributed equally: Robert M. Gutgesell, Ahmed Khalil. Institute for Diabetes and Obesity, Helmholtz, Munich, Germany Robert M. Gutgesell, Ahmed Khalil, Arkadiusz Liskiewicz, Gandhari Maity-Kumar, Aaron Novikoff, Gerald Grandl, Daniela Liskiewicz, Callum Coupland, Ezgi Karaoglu, Seun Akindehin, Russell Castelino, Xue Liu, Cristina Garcia-Caceres, Alberto Cebrian-Serrano & Timo D. Müller German Center for Diabetes Research, DZD, Neuherberg, Germany Robert M. Gutgesell, Ahmed Khalil, Arkadiusz Liskiewicz, Gandhari Maity-Kumar, Aaron Novikoff, Gerald Grandl, Daniela Liskiewicz, Callum Coupland, Ezgi Karaoglu, Seun Akindehin, Russell Castelino, Xue Liu, Cristina Garcia-Caceres, Alberto Cebrian-Serrano & Timo D. 
Müller Department of Pharmacology, Experimental Therapy and Toxicology, Institute of Experimental and Clinical Pharmacology and Pharmacogenomics, Eberhard Karls University, Tübingen, Germany Department of Computational Health, Institute of Computational Biology, Helmholtz, Munich, Germany Medizinische Klinik und Poliklinik IV, Klinikum der Universität, Ludwig–Maximilians Universität München, Munich, Germany Indiana Biosciences Research Institute, Indianapolis, IN, USA Diabetes, Obesity and Complications Therapeutic Area, Eli Lilly and Company, Indianapolis, IN, USA Brian Finan, Kyle W. Sloop & Ricardo J. Samms TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany Division of Metabolic Diseases, Department of Medicine, Technische Universität, Munich, Germany participated in the molecular design and interpretation of data. participated in the study design, analysis of data and interpretation of results. wrote the paper with support of R.M.G. Correspondence to Matthias H. Tschöp or Timo D. Müller. He was a member of the Research Cluster Advisory Panel (ReCAP) of the Novo Nordisk Foundation between 2017 and 2019. He attended a scientific advisory board meeting of the Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, in 2016. He received funding for his research projects by Novo Nordisk (2016–2020) and Sanofi-Aventis (2012–2019). He was a consultant for Bionorica SE (2013–2017), Menarini Ricerche S.p.A. (2016) and Bayer Pharma AG Berlin (2016). As former Director of the Helmholtz Diabetes Center and the Institute for Diabetes and Obesity at Helmholtz Zentrum München (2011–2018), and since 2018, as CEO of Helmholtz Zentrum München, he has been responsible for collaborations with a multitude of companies and institutions, worldwide. In this capacity, he discussed potential projects with and has signed/signs contracts for his institute(s) and for the staff for research funding and/or collaborations with industry and academia, worldwide, including but not limited to pharmaceutical corporations like Boehringer Ingelheim, Eli Lilly, Novo Nordisk, Medigene, Arbormed, BioSyngen and others. In this role, he was/is further responsible for commercial technology transfer activities of his institute(s), including diabetes related patent portfolios of Helmholtz Zentrum München as, for example, WO/2016/188932 A2 or WO/2017/194499 A1. confirms that, to the best of his knowledge, none of the above funding sources were involved in the preparation of this paper. 
receives research funding by Novo Nordisk and has received speaking fees from Novo Nordisk, Eli Lilly, Boehringer Ingelheim, Merck, AstraZeneca and Mercodia. is a co-inventor on intellectual property owned by Indiana University and licensed to Novo Nordisk. were previously employed by Novo Nordisk. are current employees of Eli Lilly. The other authors declare no competing interests. Nature Metabolism thanks Alice Adriaenssens, Nigel Irwin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Christoph Schmitt, in collaboration with the Nature Metabolism editorial team. Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. RNAscope validation of GIPR deletion in 20 week-old male chow-fed Vgat cre+ Giprwt/wt (WT) and Vgat cre+ Giprflx/flx (KO) mice (pictures are representative examples of n = 3 mice per group) (a,b). Absolute values of body weight development (c–e) and total change in body weight (f) of 33-wk old male C57BL/6 J wildtype (WT) or Vgat-Gipr knockout (KO) mice treated daily over 24 days with either vehicle, acyl-GLP-1 (10 nmol/kg), or the combination of acyl-GLP-1 (10 nmol/kg) and a GIPR antagonist (1,500 nmol/kg) (n = 8 each group). Absolute changes of body composition of 36-wk old male C57BL/6 J wildtype and Vgat-Gipr KO mice treated either with Vehicle (n = 8 WT and n = 8 KO), acyl-GLP-1 (n = 8 WT and n = 8 KO), or the co-therapy of acyl-GLP-1 and the GIPR antagonist (n = 8 WT and n = 7 KO) (g,h). Data in panel d and e were analyzed by repeated measures 2-way ANOVA with Bonferroni's post-hoc test for comparison of individual time points. Data in panel f-h were analyzed using 1-way ANOVA. Individual p-values are shown in the Data Source file, unless p < 0.0001. 
RNAscope validation of Gipr deletion in trigeminal ganglion of 44-week old male chow-fed WT and KO mice (q) and of the DRG of 51-week old male chow-fed WT and KO mice (r). Data in panels q and r are representative examples of n = 3 mice each group. Data in panel c-f and h-p were analyzed using two-sided, two-tailed Student's t-test; data in panel g were analyzed two-sided by Mann-Whitney test. Individual p-values are shown in the Data Source file, unless p < 0.0001. Scale bars in panel q and r are 5 μm. Body weight development of female C57BL6/J Per-Cre+Giprwt/wt (WT) and Per-Cre+Giprflx/flx (KO) mice fed with a HFD (n = 8 each group) (a). Fat and lean tissue mass of 35-week old female WT and KO mice (n = 8 each group) (b,c). Cumulative food intake of 52-week old female WT (n = 6) and KO mice (n = 7) (d). Energy expenditure (e), locomotor activity (f) and respiratory exchange ratio (RER) (g) of 52-week old female WT (n = 6) and KO mice (n = 7). Glucose tolerance (h) and corresponding area under curve (AUC) (i) after i.p. dosing with 2 g/kg glucose in 48-week old female WT and KO mice (n = 7 each group). Glucose-induced insulin secretion (j) and corresponding AUC (k) after oral glucose bolus administration of 4 g/kg glucose in 54-week old female WT and KO mice (n = 6 each group). Insulin tolerance (l) after dosing with 0.75 U/kg insulin (Humalog) in 50-week old female WT (n = 6) and KO mice (n = 8). Fasting levels of blood glucose (n = 8 each group) (m) and insulin (n = 6 WT, n = 7 KO) (n) in 54-week old female WT and KO mice, as well as fasting plasma levels of triglycerides (n = 6 WT, n = 7 KO) (o) and cholesterol (n = 6 WT, n = 7 KO) (p) in 55-week old female WT and KO mice. Data in panel a,c,f and g were analyzed by repeated measures 2-way ANOVA with Bonferroni's post-hoc test for comparison of individual time points. Data in panel f and g were analyzed using Mann-Whitney test. 
Data in panel a,d,e,h,j and l were analyzed using 2-way ANOVA with Bonferroni post-hoc comparison of individual time points. Data in panel e were analyzed using ANCOVA with body weight as covariate. Data in panel b,c,i,k,m-p were analyzed using Student's two-tailed, two-sided t-test. Individual p-values are shown in the Data Source file, unless p < 0.0001. Body weight development of 47-week old male C57BL/6 J wildtype (WT) and Per-Gipr knockout (KO) mice treated daily with either vehicle, acyl-GLP-1 (10 nmol/kg), or the combination of acyl-GLP-1 (10 nmol/kg) and a GIPR antagonist (1,500 nmol/kg) (n = 8 each group) (a). Body composition (n = 8 each group) of 47-wk old male C57BL/6 J DIO wildtype and Per-Gipr KO mice after 25 days of treatment (b,c). Data in panel a were analyzed by 2-way ANOVA with Bonferroni's post-hoc test for comparison of individual time points. Individual p-values are shown in the Data Source file, unless p < 0.0001. Comparison of Log fold change differences in gene expression between male DIO C57BL/6 J wildtype mice treated with acyl-GIP or the GIPR antagonist (a), MAR709 vs. acyl-GLP-1 (b), MAR709 vs. acyl-GIP (c), MAR709 vs. the GIPR antagonist (d), acyl-GLP-1 vs. acyl-GIP (e), or acyl-GLP-1 vs. the GIPR antagonist (f) (n = 6 mice per group, from which n = 3 mice were pooled to receive n = 2 independent biological replicates per group). Bar plots are ranked Augur score in mice treated with either acyl-GIP (g) or MAR709 (h), representing cell-specific change in gene expression of the respective groups relative to Vehicle-treated DIO controls. The top 10 most likely cell-cell communication events between DVC GABA4 or Glut 10 neurons and hypothalamic C66-19: Pomc.GLU-4, C66-46: Agrp.GABA-4, or all other hypothalamic neurons in mice treated with MAR709 (i). Cellphone p-values are permutation-based p-values, Lr means are mean ligand-receptor expression. 
UMAP representations of gene expression of DVC non-neuronal cells colored by experimental group (a), cell type (b), or expression of either Glp-1r (c) or Gipr (d). Bar plots with Augur scores of DVC non-neuronal cells from mice treated with either the GIPR antagonist (e), acyl-GLP-1 (f), acyl-GIP (g), or MAR709 (h). Bar plots are ranked and colored by Augur score, representing cell-specific change in gene expression of the respective group relative to vehicle-treated DIO controls. UMAP representations of gene expression of hypothalamus non-neuronal cells colored by experimental group (a), cell type (b), or expression of Glp-1r (c), or Gipr (d). (e) Heatmap showing Glp-1r and Gipr mean gene expression in hypothalamic non-neuronal cell types. Color corresponds to log-normalized expression values scaled to the maximum of each gene. Bar plots showing Augur scores of hypothalamic non-neuronal cells from mice treated with the GIPR antagonist (f), acyl-GLP-1 (g), acyl-GIP (h), or MAR709 (i). Bar plots are ranked and colored by Augur score, representing cell-specific change in gene expression of the respective group relative to vehicle-treated DIO controls. UMAP representation of an scVI joint embedding showing all DVC nuclei from this study (Gutgesell), with all nuclei from the Hes et al.53 and Ludwig et al.54 datasets (a), and individual UMAPs showing each individual study: Gutgesell (b), Hes et al. (c), Ludwig et al.54 (d), and by the experimental group from each study (e). Pairwise heatmap showing the proportion of cells labeled by Hes et al. (y-axis), predicted to belong to each DVC cell-type from this study (x-axis) using scHPL. Pairwise heatmap showing the proportion of cells labeled by Ludwig et al. (y-axis), predicted to belong to each DVC cell-type from this study (x-axis) using scHPL. RNAscope analysis of GIPR in the nodose ganglion of 59-week-old male chow-fed WT and Per-GIPR KO mice. Data are representative examples of n = 3 mice each group. 
Original pictures for Supplementary Fig. Statistical source data for Fig. Original pictures for Extended Data Figs. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. Gutgesell, R.M., Khalil, A., Liskiewicz, A. et al. GIPR agonism and antagonism decrease body weight and food intake via different mechanisms in male mice.
HPV Infection May Increase the Risk of Heart Disease. A vaccine that blocks infection with the human papillomavirus has helped to lower cervical cancer rates. Researchers want to find out if the shot also prevents heart attacks. Now recent research suggests HPV infection also increases the risk of heart disease. An analysis of seven studies with a total of nearly 250,000 participants found that those who tested positive for HPV were 33 percent more likely than those who tested negative to develop cardiovascular disease. The vaccine, which has been recommended for adolescents since 2006, protects against infection with nine strains of HPV, including high-risk types that are the most likely to cause cervical cancer, as well as strains that cause genital warts. The Centers for Disease Control and Prevention recommends that boys and girls receive a series of two HPV shots at ages 11 or 12 as part of their routine childhood vaccinations, and that people receive three shots if their first dose is instead administered between the ages of 15 and 26. The vaccine is most protective when given before people become sexually active. “We're hoping that [the vaccine] will be a powerful tool for prevention.” It has not yet been published as a peer-reviewed study. The analysis included studies published between 2011 and 2024 that followed women for three to 17 years. The largest study included in the analysis was published by researchers in South Korea in 2024 and followed apparently healthy women who were tested for 13 strains of high-risk HPV as part of a routine screening for cervical cancer. 
Although heart disease and death were rare among these women, who had an average age of 40, those who tested positive for high-risk HPV were nearly four times as likely as those who tested negative to develop blocked arteries or die from heart disease, the study found. Women aren't the only ones at risk, Akinfenwa says. In one paper included in the analysis, a 2017 study of people undergoing radiation therapy for head and neck cancer, 75 percent of patients were men. (Head and neck cancers are more than twice as common in men as they are in women, according to the National Cancer Institute.) The 2017 study found that people who tested positive for HPV were more likely to have strokes compared with those who tested negative. Among sexually active people, more than 90 percent of men and more than 80 percent of women are infected with HPV during their lifetime. Vaccine hesitancy and lack of awareness about HPV has kept many parents from vaccinating their children against the infection, research shows. Some parents are reluctant to vaccinate their kids against HPV because they don't think their children will have sex as teenagers. Only 61 percent of adolescents are up to date on all HPV vaccines. Even without a study that has specifically analyzed the effect of HPV vaccination on heart disease, the link between HPV and heart disease suggests that “vaccination is a good idea, and our study definitely supports that,” Akinfenwa says. Other experts aren't so sure about the link between HPV and heart disease. Mark Einstein, chair of obstetrics and gynecology and women's health at Montefiore Einstein, who also was not involved in the analysis, says researchers have a long way to go before they can confidently say that the virus causes heart disease. HPV causes cancer in parts of the body that come into direct contact with the virus through sexual activity, says Kevin Ault, a professor of obstetrics and gynecology at the Western Michigan University Homer Stryker M.D. 
“We usually don't think of human papillomavirus as going all around the body,” Ault says. “It's going to infect mostly skin” or mucous membranes. Inflammation can also cause those plaques to burst and form blood clots, which can lead to a heart attack or stroke. Although the immune system naturally controls most HPV infections within a year or two, a small number of infections become chronic, a problem that increases the risk of cervical cancer, says Rebecca Perkins, obstetrician and gynecologist at the Woman, Mother and Baby Research Institute at Tufts Medical Center. And just as the varicella-zoster virus can reactivate decades after a childhood infection and cause shingles, HPV can wake up and cause women to test positive during cervical cancer screenings, Perkins says. HPV tests are now included in most routine cervical cancer screenings, either alone or in combination with a Pap smear. A wide variety of viruses, bacteria, parasites and fungi can trigger myocarditis, an inflammation of the heart muscle, which can make the heart too weak to pump blood efficiently. Those include the viruses that cause influenza and COVID. “Many infectious diseases set off inflammatory cascades that can prompt cardiovascular and neurological events like heart attacks, blood clots and strokes,” Adalja says. “By staving off infection, [vaccinating] against these agents—such as influenza, varicella-zoster virus and, presumably, HPV—these events will be prevented or become less likely.” Doctors have frequently been surprised by unexpected or off-target benefits from vaccines, Adalja says. The bacillus Calmette-Guérin (BCG) vaccine against tuberculosis also has been found to reduce risk of other diseases in which the immune system goes awry, including type 1 diabetes, cancer, multiple sclerosis and Alzheimer's disease. And an analysis published in 2024 found that meningitis vaccines reduced the incidence of gonorrhea by 30 to 59 percent. 
Such cross-protective immunity can occur when two bacteria are from similar families. To better understand how HPV damages the heart and whether the HPV vaccine might offer protection, Merz says, researchers could compare rates of chronic inflammation in adolescents who were vaccinated with rates in those who weren't vaccinated. “It's logical to think preventing the HPV infection itself via vaccination will reduce the risk of cardiovascular disease,” Akinfenwa says. “Having said that, it needs to be tested.”
Qubit frequency shifts, which often contain information about a target environment variable, are detected with Ramsey interference measurements. Unfortunately, the sensitivity of this protocol is limited by decoherence. We introduce a new protocol to enhance the sensitivity of a qubit frequency measurement in the presence of decoherence by applying a continuous drive to stabilize one component of the Bloch vector. We demonstrate our protocol on a superconducting qubit, enhancing sensitivity per measurement shot by 1.65 × and sensitivity per qubit evolution time by 1.09 × compared to Ramsey. We also explore the protocol theoretically, finding unconditional enhancements compared to Ramsey interferometry and maximum enhancements of 1.96 × and 1.18 × , respectively. Additionally, our protocol is robust to parameter miscalibrations. It requires no feedback and no extra control or measurement resources, and can be immediately applied in a wide variety of quantum computing and quantum sensor technologies. Ramsey interferometry1 has been long established as the most sensitive measure of a qubit's frequency2. In a Ramsey measurement, a qubit is prepared in a superposition of energy states, allowed to evolve freely and acquire phase, and then measured along some axis. Decoherence at a rate γ2 = 1/T2 limits the signal-to-noise ratio (SNR) of quantum sensors. 
To date, most work has focused on SNR scaling beyond the standard quantum limit of \(\sqrt{N}\) (where N is the number of independent qubits and/or measurements)13,14,15,16,17, using dynamical decoupling to enhance frequency discrimination and reduce or characterize non-Markovian decoherence18,19,20, using measurement-based feedback to rapidly lock in on large signals21, time-varying signals22, or an unknown signal axis23, and compensating for measurement errors24. Likewise, researchers have developed sensors which are less prone to decoherence25,26,27 and couple more strongly to wanted signals28,29,30. However, no results have shown improvement over traditional Ramsey interferometry for increasing the SNR from a single qubit measuring a small, static (zero frequency) signal. Here, we demonstrate a protocol for enhanced quantum sensing of static fields. The protocol is based on a recent theory of quantum property preservation31 generalizing the coherence preservation results of ref. 32, showing how certain scalar functions of a quantum state can be stabilized using purely Hamiltonian control. We use deterministic Hamiltonian control of a single qubit to stabilize one Bloch vector component (vx), enabling increased phase accumulation in the orthogonal component vy and thus enhanced sensitivity. Our protocol gives a significant signal enhancement over standard Ramsey interferometry, up to a factor of 1.96 per measurement shot or 1.18 per qubit evolution time. We derive analytical expressions for the signal enhancement in the small-signal regime, and show simulations of the protocol's robustness to miscalibrations. The protocol is robust to variation in environmental parameters, requires no feedback (i.e., is unconditional and deterministic), and can be applied in a wide variety of experimental systems. Our results demonstrate a general technique for enhanced quantum sensing. 
Consider a sensor that maps some environmental variable B to a qubit's frequency, and thus to the detuning Δ between qubit frequency and drive. We assume the transduction function Δ = f(B) is a fixed and known property of the sensor. The challenge is then to design a protocol that measures Δ as precisely as possible. Detuning causes rotation about the z-axis, leading vy to grow proportionally to vx at a rate Δ. For small rotation angle, vy(t) is linearly proportional to Δ; this is the relevant limit for a weak signal with Δ ≪ γ2 = 1/T2. We can thus write vy = aΔ, where a depends on the protocol used and on γ2. Averaging N independent measurements reduces the uncertainty δvy in vy as \(1/\sqrt{N}\), so the uncertainty in a measurement of Δ is thus \(\delta \Delta =\delta {v}_{y}/a\propto 1/(a\sqrt{N})\) (1). This means that to minimize δΔ, one should maximize \(a\sqrt{N}\), which is equivalent to maximizing \(\sqrt{N}{v}_{y}\) for a given detuning Δ. We assume that state preparation and measurement errors affect all protocols equally and, therefore, ignore them for this analysis. We further assume that γ2 is a property of the sensor and is not affected by the protocol. The task is thus to choose a protocol that will maximize \(\sqrt{N}{v}_{y}=\sqrt{\frac{T}{t+{t}_{i}}}{v}_{y}(t)\) for a given detuning Δ and decoherence rate γ2. Here T is the total experiment time, t is the time the qubit spends accumulating phase in an iteration, and ti is the “inactive” time spent preparing, reading out, and resetting the qubit in an iteration. Consider a qubit subject to relaxation at rate γ1 = 1/T1, dephasing at rate γϕ, detuning Δ, and a coherent drive Hdrive(t) = hy(t)σy. In a Ramsey sequence the qubit is prepared in the state \(\overrightarrow{v}(0)=(1,0,0)\) and allowed to freely evolve (hy = 0). When Δ ≪ γ2 = γϕ + γ1/2, this leads to a maximum y-component \({v}_{y}^{R}=\Delta /(e{\gamma}_{2})\) (2) at t = T2, which is the relevant optimum when the inactive time dominates (t ≪ ti) and N is effectively fixed. When the inactive time is negligible (ti ≪ t), N ~ 1/t and the quantity to maximize is instead \({v}_{y}/\sqrt{t}\), which reaches \({v}_{y}^{R}/\sqrt{t}=\Delta /\sqrt{2e{\gamma}_{2}}\) (3) at t ≈ T2/2 (see Supplementary Material for derivations). These cases have \(a={(e{\gamma}_{2})}^{-1}\) with \(a\sqrt{N} \sim a\) and \(a={(2\sqrt{e}{\gamma}_{2})}^{-1}\) with \(a\sqrt{N} \sim a/\sqrt{t}={(2e{\gamma}_{2})}^{-1/2}\), respectively. 
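The two Ramsey optima quoted above can be checked numerically. The sketch below assumes the small-detuning Ramsey signal vy(t) ≈ Δ t exp(−γ2 t), which follows from free precession with exponential decay of coherence; the function names and the grid search are illustrative choices, not part of the original analysis.

```python
import math

def ramsey_vy(t, delta, gamma2):
    """Small-detuning Ramsey signal, assumed form: delta * t * exp(-gamma2 * t)."""
    return delta * t * math.exp(-gamma2 * t)

def argmax_on_grid(f, t_min, t_max, n=200_000):
    """Return (t, f(t)) at the grid point in [t_min, t_max] that maximizes f."""
    best_t, best_val = t_min, f(t_min)
    for k in range(1, n + 1):
        t = t_min + (t_max - t_min) * k / n
        val = f(t)
        if val > best_val:
            best_t, best_val = t, val
    return best_t, best_val

gamma2 = 1.0            # decoherence rate in illustrative units, so T2 = 1
delta = 0.01 * gamma2   # small detuning, delta << gamma2

# Fixed number of shots: maximize v_y itself.
t_shot, vy_max = argmax_on_grid(lambda t: ramsey_vy(t, delta, gamma2), 1e-6, 10.0)

# Fixed total evolution time (N ~ 1/t): maximize v_y / sqrt(t).
t_time, s_max = argmax_on_grid(
    lambda t: ramsey_vy(t, delta, gamma2) / math.sqrt(t), 1e-6, 10.0
)

print("per-shot optimum: t/T2 =", t_shot, " a*gamma2 =", vy_max / delta)
print("per-time optimum: t/T2 =", t_time, " (v_y/sqrt(t))/delta =", s_max / delta)
```

Under these assumptions the grid search lands at t ≈ T2 with a ≈ 1/(eγ2) in the first case and at t ≈ T2/2 with vy/√t ≈ Δ/√(2eγ2) in the second, matching the values of a stated in the text.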
Likewise, we define the signal per root evolution time improvement ratio Rs (i.e., the SNR improvement for fixed total evolution time) as the ratio of maximum \({v}_{y}/\sqrt{t}\) from the given protocol to that from the Ramsey experiment, the latter being given by (3). Note that the uncertainty in a measurement of detuning δΔ is inversely proportional to \(\sqrt{N}{v}_{y}\) as given in (1). This means that our protocol reduces δΔ by a factor of Rv when N is fixed and by a factor of Rs when N ~ 1/t. Our protocol uses Hamiltonian control to preserve state coherence32 and enhance sensitivity. The general theory is derived in ref. 31. Here we report the central results, with full details given in the Supplementary Material. For small, unknown Δ ≪ γ2, we can stabilize vx(t) ≈ vx(0) for 0 ≤ t ≤ tb, so long as vz(t) ≠ 0, by setting the drive hy(t) appropriately (see Supplementary Material), where the stabilization is exact to 2nd order in Δ/γ2. At the breakdown time tb, the protocol fails and coherence must decay. However, stability can be achieved indefinitely if \({v}_{x}(0)\le \frac{1}{2}\sqrt{\frac{{\gamma}_{1}}{{\gamma}_{2}}}\), since at low temperature relaxation deterministically causes growth of vz towards its thermal value vz = 1; that is, relaxation takes the qubit to the ground state, and so the ground state population in an ensemble grows as a function of time. The drive can then rotate vz towards vx, transferring this ground state population to the desired vx, preserving coherence. See Fig. 1 for an illustration of a Bloch trajectory with coherence stabilization. Temperature effects are discussed later in the text and in the Supplementary Material.
Fig. 1: a Simulated trajectories of the Bloch vector in the xz plane for a qubit state which is either allowed to freely evolve (star markers; steady decay of vx) or stabilized with our protocol (triangle markers; stabilized vx up to a breakdown time, followed by steady decay), with T1 = T2 = 1 and vx(0) = 0.68. b, c 3D Bloch sphere plot of the same trajectories for stabilization (b) and free evolution (c). d Pulse sequence schematic of our stabilization protocol. We prepare a state in the xz plane at an angle θ from the z-axis, then stabilize the Bloch component \({v}_{x}=\sin \theta\) up to a breakdown time. After a variable evolution time we perform quantum state tomography.
When vx is thus coherence-stabilized at \({v}_{x}^{c}\approx {v}_{x}(0)\) and the unknown detuning Δ ≠ 0, vy grows to an asymptotic maximum \({v}_{y}^{c}\to \frac{{v}_{x}(0)}{{\gamma}_{2}}\Delta\), leading to a signal improvement ratio (SNR per measurement shot improvement ratio) relative to Ramsey of Rv = e·vx(0). In the limit where relaxation dominates decoherence (γ1 = 2γ2), the largest permanently stable initial state has \({v}_{x}(0)=1/\sqrt{2}\), giving \({R}_{v}=\frac{e}{\sqrt{2}}\approx 1.922\). For initial states with a finite breakdown time, we stabilize vx(t) until breakdown and then set hy = 0. In the small detuning limit, this gives an improvement ratio (5) that depends on vx(0) and tb. While there is a closed-form expression for tb in terms of vx(0)31, it does not allow an analytical solution for the maximum of (5) over all values of vx(0). Instead we optimize numerically, as discussed below, reaching a maximum of Rv = 1.96 when γ1 = 2γ2. Even in the limit of no relaxation where γ1 → 0, we find Rv = 1.09. Thus, our protocol can achieve an unconditional vy signal boost compared to Ramsey interferometry. When the inactive time ti is negligible and (3) applies, we instead maximize the signal per root evolution time \({v}_{y}^{c}/\sqrt{t}\) over all times and initial states. In this case, using a permanently-stabilized state gives a maximum achievable advantage (when γ1 = 2γ2) of only Rs = 1.052, and can be disadvantageous when dephasing is non-negligible. However, using an initial state with a finite breakdown time leads to a larger and unconditional SNR enhancement. 
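To make the stabilization idea concrete, here is a minimal simulation sketch for a permanently stable initial state. It rests on several assumptions that are not taken from the paper: the Bloch-equation sign conventions written in the comments, a drive expressed as a rotation rate `omega` about the y-axis (which may differ from the paper's hy by a constant factor), and the control law omega = γ2·c/z0 obtained by requiring dvx/dt = 0 at Δ = 0 in this convention. The drive is precomputed from the Δ = 0 trajectory `z0`, so no feedback on the unknown detuning is used.

```python
import math

def simulate_stabilized(c, gamma1, gamma2, delta, t_end, dt=1e-3):
    """RK4 integration of driven Bloch equations with a feed-forward drive.

    Assumed model (low-temperature limit, relaxation toward vz = 1):
        dvx/dt = -gamma2*vx - delta*vy + omega*vz
        dvy/dt =  delta*vx - gamma2*vy
        dvz/dt = -omega*vx + gamma1*(1 - vz)
    The state tuple is (vx, vy, vz, z0), where z0 is the nominal delta = 0
    z-trajectory used to precompute the drive omega(t) = gamma2 * c / z0.
    """
    def rhs(s):
        vx, vy, vz, z0 = s
        omega = gamma2 * c / z0          # feed-forward drive, no feedback
        return (
            -gamma2 * vx - delta * vy + omega * vz,
            delta * vx - gamma2 * vy,
            -omega * vx + gamma1 * (1.0 - vz),
            -gamma2 * c * c / z0 + gamma1 * (1.0 - z0),
        )

    z_init = math.sqrt(1.0 - c * c)      # pure initial state in the xz plane
    s = (c, 0.0, z_init, z_init)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(s)
        k2 = rhs(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
        k3 = rhs(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
        k4 = rhs(tuple(a + dt * b for a, b in zip(s, k3)))
        s = tuple(a + dt * (p + 2.0 * q + 2.0 * r + w) / 6.0
                  for a, p, q, r, w in zip(s, k1, k2, k3, k4))
    return s[:3]

gamma2 = 1.0
gamma1 = 2.0 * gamma2    # relaxation-limited case, gamma1 = 2*gamma2
c = 0.6                  # vx(0), below the stability bound 0.5*sqrt(gamma1/gamma2)
delta = 0.01 * gamma2    # small detuning

vx, vy, vz = simulate_stabilized(c, gamma1, gamma2, delta, t_end=15.0)
ramsey_max = delta / (math.e * gamma2)   # Ramsey maximum of v_y at t = T2
Rv = vy / ramsey_max

print("vx =", vx, " vy*gamma2/delta =", vy * gamma2 / delta, " Rv =", Rv)
```

In this model vx stays at its initial value to within corrections of order (Δ/γ2)², vy saturates near vx(0)Δ/γ2, and the resulting ratio to the Ramsey maximum is close to e·vx(0), in line with the expressions quoted in the text. States with a breakdown time would need the drive switched off at tb, which this sketch does not cover.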
Once again we can find an analytic expression for the maximum SNR and improvement ratio in terms of the initial state vx(0) and breakdown time tb, but must numerically optimize over vx(0), as discussed in the Supplementary Material. We can compare this protocol to others that use control to reduce the effects of decoherence such as spin locking and dynamical decoupling (DD). In spin locking, a strong constant drive along the axis of the Bloch vector causes any noisy Hamiltonian terms along orthogonal axes to average out, provided these terms are constant over the period required for the drive to rotate the vector a full revolution. Likewise, dynamical decoupling uses fast pulsed rotations to average out quasi-static noise. In both cases the protocols suppress decoherence due to slowly-varying noise and would cause complete insensitivity to detection of a static detuning, while sensitivity to an oscillatory detuning at particular frequencies is enhanced; this enhancement has been used in the past for optimizing sensing18,19,20. In contrast, our coherence stabilization protocol works for broadband Markovian decoherence and maintains sensitivity to static Hamiltonian terms. We demonstrate our protocol using a superconducting qubit; device parameters and experimental details are given in the Methods. We first show coherence stabilization with Δ = 0. The experimental pulse sequence is shown in Fig. 1d. We prepare a state in the xz plane with \(\overrightarrow{{{{\bf{v}}}}}(0)=(\sin \theta,0,\cos \theta)\). We cut off the control after breakdown to prevent rotations from vx to vz that would decrease vx faster and reduce the growth of our vy signal. Quantum state tomography data showing coherence stabilization for two different initial states is presented in Fig. 2a, b, along with theoretical predictions (not fits) generated by solving the Bloch dynamics for our system parameters. 
Depending on the initial state and the ratio γ1/γ2, the stabilization may exhibit a breakdown (panel (a)) or long-time stability (panel (b)). When the qubit is detuned from the drive frequency by some small detuning Δ ≪ γ2, vy(t) grows to some maximum. We apply a small detuning to the initial states of Fig. 2a, b, respectively, and measure vy (Fig. 2c, d). We compare to vy from Ramsey sequences (vx(0) = 1) for the same detunings. For both states there is an enhanced vy signal compared to Ramsey, validating the essential aspect of our protocol. Note that these data were taken on different days and T2 drifted from 73 μs (c) to 89 μs (d), which accounts for the larger signal in (d) despite a smaller Δ.
Fig. 2: a, b Bloch vector evolution, as measured through quantum state tomography, for a state with (a) vx(0) = 0.652 and breakdown at 148 μs and (b) vx(0) = 0.599 with no breakdown (i.e., a stable state). c, d Evolution of vy for the same initial states as in (a, b), with added small detuning Δ/2π = 396 Hz (c) and 324 Hz (d).
To quantify the enhancement of signal Rv, we sweep the detuning Δ and measure the coherence-stabilized \({v}_{y}^{c}({t}_{\max},\Delta,\theta)\) for each initial state polar angle θ. Here, \({t}_{\max}\) is the predicted time of maximum \({v}_{y}^{c}\) (see Supplementary Material); for solutions with no breakdown we use \({t}_{\max}=350\,\mu {{{\rm{s}}}}\approx 5{T}_{2}\). We also measure the Ramsey evolution \({v}_{y}^{R}({T}_{2},\Delta)\) at t = T2, when theory predicts \({v}_{y}^{R}\) will be maximized. We then fit the slopes of \({v}_{y}^{c}\) and \({v}_{y}^{R}\) vs Δ and take their ratio to compute Rv. During this experiment we measured T1/T2 = 0.749 ± 0.112 and T2 = 83.4 ± 9.9 μs. 
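The slope-ratio estimate of Rv described here can be illustrated with a short sketch. Everything numerical in it is synthetic: the detuning values, the choice vx(0) = 0.6, the idealized linear responses, and the common readout offset are hypothetical stand-ins for measured data, and `fit_slope` is an ordinary least-squares helper written for this example rather than the authors' analysis code.

```python
import math

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys versus xs, with a free intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

T2 = 83.4e-6                 # seconds, the mean T2 reported in the text
gamma2 = 1.0 / T2
vx0 = 0.6                    # hypothetical stabilized initial state
offset = 0.002               # hypothetical readout offset common to both protocols

# Hypothetical detuning sweep, converted from Hz to rad/s.
detunings = [2.0 * math.pi * f for f in (50.0, 100.0, 150.0, 200.0, 250.0)]

# Synthetic, noiseless signals in the small-detuning limit.
vy_stabilized = [vx0 * d / gamma2 + offset for d in detunings]
vy_ramsey = [d / (math.e * gamma2) + offset for d in detunings]

slope_c = fit_slope(detunings, vy_stabilized)
slope_r = fit_slope(detunings, vy_ramsey)
Rv = slope_c / slope_r

print("Rv from slope ratio =", Rv)
```

Fitting slopes rather than comparing single points means a constant offset shared by both protocols drops out of the ratio, which is one plausible reason to prefer a detuning sweep over a single-detuning comparison.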
Using our protocol with initial state θ = 0.198π and N shots, we are able to detect the qubit frequency with a minimum 1σ uncertainty of \(\sqrt{N}\delta {f}_{c}=3.4\pm 0.8\,{{{\rm{kHz}}}}\sqrt{{{{\rm{shots}}}}}\), compared to \(\sqrt{N}\delta {f}_{{{{\rm{R}}}}}=5.5\pm 0.7\,{{{\rm{kHz}}}}\sqrt{{{{\rm{shots}}}}}\) for Ramsey (variances are over repetitions of the experiment; see Supplementary Material for an explanation of the sensitivity calculation and units). Thus, our protocol reduces qubit frequency detection uncertainty by a factor of Rv = 1.62 ± 0.13 when t ≪ ti. Theory predicts \({v}_{y}^{c}\) will be maximized when θ = 0.213π [vx(0) = 0.671], with predicted Rv = 1.649. We find good agreement with the data with no free parameters, indicating that our protocol behaves as predicted.
Fig. 3: Error bars are calculated from the variance of the ratio across many measurements. The dashed line is a theoretical prediction, not a fit, showing good agreement. Bottom row: Numerically simulated improvement ratio Rv (a.ii) and Rs (b.ii) as a function of initial state and T1/T2 ratio, for small detuning. Also shown are numerically simulated (markers) and analytically derived (dashed line) improvement ratio Rv (a.iii) and Rs (b.iii) at optimal initial state as a function of T1/T2.
To test the protocol under different environmental parameters, we numerically simulate the evolution as a function of initial state and T1/T2 ratio, all at small detuning Δ = 0.01/T2. We maximize over initial state at each T1/T2 and plot these maxima in Fig. 3. We find that Rv reaches a maximum value of 1.96 in the limit where relaxation dominates dephasing (T2 = 2T1), and has a minimum of 1.09 in the limit where dephasing dominates relaxation (T1 ≫ T2), as predicted by analytical theory (shown as a dashed line). 
To quantify the enhancement of SNR per qubit evolution time Rs, we use a similar procedure as above, except that we measure the stabilized vy and Ramsey vy at the times theory predicts will maximize \({v}_{y}/\sqrt{t}\) (see Supplementary Material). Experimental, theoretical, and numerical results are plotted in Fig. 3b. Using our protocol with initial state θ = 0.315π and total experiment time T, we find a minimum frequency detection uncertainty of \(\sqrt{T}\delta {f}_{c}=63\pm 4\,{{{\rm{Hz}}}}/\sqrt{{{{\rm{Hz}}}}}\), compared to \(\sqrt{T}\delta {f}_{{{{\rm{R}}}}}=70\pm 1\,{{{\rm{Hz}}}}/\sqrt{{{{\rm{Hz}}}}}\) with Ramsey. Again, our protocol reduces uncertainty, here by a factor of Rs = 1.11 ± 0.03 when ti ≪ t (see Methods). Our experimental results once more agree well with the theory, which predicts a maximum Rs = 1.094 at θ = 0.283π [vx(0) = 0.776] for this T1/T2 ratio. Theory and simulation show the improvement ratio Rs ranges from 1.184 when T2 = 2T1 to 1 when T1 ≫ T2. A quantum sensor may have comparable inactive time ti and evolution time t, between the limits we study. For a fixed ti, the problem is to optimize \({v}_{y}(t)/\sqrt{t+{t}_{i}}\). Again, the Ramsey protocol can be optimized analytically, while our coherence stabilization protocol can be optimized numerically by the same procedure used above. The SNR improvement ratio will land somewhere between our two limiting cases, but will always be ≥1. For instance, when T2 = 2T1, the SNR improvement is by a factor of 1.23 when ti = 0.1T2, by a factor of 1.43 when ti = T2, and by a factor of 1.76 when ti = 10T2. Optimal sensing depends on accurate knowledge of T1 and T2. To quantify the robustness of our protocol to miscalibrations, we simulate the Bloch evolution using the initial state, control field, and measurement time that would be optimal for a nominal T1 = T2 = 1 while varying the actual values of T1 and T2 that control the dynamics. 
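For the intermediate regime with finite inactive time, the Ramsey optimum can be written in closed form. The sketch below assumes the small-detuning Ramsey signal vy(t) ∝ t exp(−γ2 t) and maximizes t exp(−γ2 t)/√(t + ti); setting the logarithmic derivative to zero gives the quadratic 2γ2t² + (2γ2ti − 1)t − 2ti = 0, whose positive root is returned. This derivation and the function names are worked out for this example and are not quoted from the paper or its supplement.

```python
import math

def ramsey_optimal_time(gamma2, t_inactive):
    """Evolution time maximizing t*exp(-gamma2*t)/sqrt(t + t_inactive).

    Positive root of 2*gamma2*t**2 + (2*gamma2*t_inactive - 1)*t - 2*t_inactive = 0.
    """
    b = 1.0 - 2.0 * gamma2 * t_inactive
    return (b + math.sqrt(b * b + 16.0 * gamma2 * t_inactive)) / (4.0 * gamma2)

def ramsey_figure_of_merit(t, gamma2, t_inactive):
    """Signal per unit detuning, per root of the time spent on one iteration."""
    return t * math.exp(-gamma2 * t) / math.sqrt(t + t_inactive)

gamma2 = 1.0  # illustrative units, so T2 = 1
for t_inactive in (0.0, 0.1, 1.0, 10.0, 1000.0):
    t_opt = ramsey_optimal_time(gamma2, t_inactive)
    fom = ramsey_figure_of_merit(t_opt, gamma2, t_inactive)
    print("t_i/T2 =", t_inactive, " optimal t/T2 =", t_opt, " figure of merit =", fom)
```

The root interpolates between the two limits discussed in the text: it tends to T2/2 as ti → 0 and to T2 as ti becomes much larger than T2.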
Thus, we simulate a situation in which T1 and T2 have changed unbeknownst to the experimenter. We simulate Ramsey experiments with the same miscalibration: measurements are performed at times determined by the nominal T2, not the actual simulated one. We plot the SNR improvement ratios as a function of percentage change in γ1 = 1/T1 and γ2 = 1/T2 in Fig. 4. We see little change in the improvement ratios when T1 and T2 are miscalibrated by the same factor, as shown by the near-constant values along lines of constant T1/T2 ratio running diagonally from bottom left to top right. We see some dependence of the improvement ratios on changes in the T1/T2 ratio, as shown by the steepest gradient running approximately diagonally from top left to bottom right. The majority of this dependence is not due to miscalibration, but rather is due to the fact that Rv and Rs depend on T1/T2 even when perfectly calibrated. This can be seen in the plots as the large regions where SNR is improved even more than at the perfectly calibrated T1 and T2 (red color), due to the fact that T1/T2 has decreased. There is some additional SNR suppression due to using a suboptimal initial state and measurement time, which is evident in the top left corner of Fig. 4(b): Rs dips slightly below 1, indicating worse performance than Ramsey, while an optimal protocol would always have Rs ≥ 1. Still, this requires quite a large deviation, with actual T2/T1 ~ 2/3 of its nominal value, and so the protocol is relatively robust to fluctuations in T1 and T2. Furthermore, we see from the experimental measurements in Fig. 3 that improvement on par with the theoretical maximum value for that T1/T2 ratio can be achieved, even in a real system with fluctuating coherence times.
Fig. 4: Improvement ratios (a) Rv and (b) Rs versus miscalibration of the decoherence rates 1/T1 and 1/T2, for nominal T1/T2 = 1. Miscalibration of 1/T is defined as (Tnominal/Tactual − 1). 
All results and theory thus far are derived or acquired in the low temperature limit where the qubit deterministically relaxes to the ground state (vz grows to 1). Higher temperature reduces the improvement our protocol can provide, since vz no longer grows as quickly and there is less population to transfer back to vx for stabilization. However, this only applies if the qubit begins in a state that is more pure than the thermal equilibrium state. In the case where the qubit is initialized in a partially-mixed thermal state, both Ramsey and stabilization protocols are affected equally, the breakdown time becomes independent of temperature, and the improvement factors Rv and Rs recover their low-temperature values. Even when state preparation is perfect, our protocol provides a signal boost Rv by a factor of at least 1.05 so long as T1 > 10T2 or the thermal state has \({v}_{z}^{{{{\rm{thermal}}}}} > 0.5\). In all cases our stabilization protocol is guaranteed to be non-inferior to Ramsey, as it reduces to a Ramsey protocol in the limit where vx(0) → 1. In conclusion, we have demonstrated a protocol for enhancing qubit sensitivity to weak environmental fields by stabilizing partial qubit coherence. Our protocol requires only deterministic Hamiltonian control and is therefore applicable to a wide variety of qubit technologies. In the limit where decoherence is dominated by relaxation, we show a theoretical maximum of a 1.96× improvement over standard Ramsey interferometry in SNR per measurement shot, and a 1.184× improvement in SNR per root qubit evolution time. In our experimental apparatus with dephasing comparable to relaxation and with fluctuating system parameters, we achieve improvements of 1.6× and 1.1×, respectively. Our results show a resource-efficient, broadly-applicable technique for unconditionally enhancing the SNR from qubit-based sensors and speeding calibration of qubit parameters. 
A natural application of our technique is to sensing magnetic fields or, equivalently, measuring the field-to-frequency transduction function of a spin species. These measurements are typically done in ambient conditions far from the low-temperature limit which would seem to reduce the benefit that our protocol could give. However, often the system is initialized in a thermal state and so the low-temperature enhancement of signal versus Ramsey would apply. While our technique provides a significant SNR boost over Ramsey interferometry, it is not fully optimal. For instance, if γ1 > 0 and the optimal measurement time is after breakdown, there will be some nonzero vz that could be used to boost signal. Thus, for each set of environmental conditions, there is likely a Bloch trajectory (i.e., an initial state and control Hamiltonian) that provides an even larger signal than our protocol of stabilizing coherence. This is a problem of optimal control, and as such can be tackled with control theory techniques33,34. It is a relatively unconstrained problem, as the initial state, final state, and final time are all variable. Given this lack of constraint, numerical solution methods will likely be required. Such optimal control has already been pursued in quantum sensing of time-varying signals20,35 and large signals36, and inspiration can be drawn from these results. In addition, it should be possible to stabilize properties of multi-qubit states31, including various entanglement measures37. Future work could therefore explore the possibility of extending our sensitivity enhancement to entangled states. We note that, like Ramsey, our protocol cannot achieve the Heisenberg limit of SNR scaling linearly with total experiment time T. Instead it maintains the standard quantum limit scaling of SNR with \(\sqrt{T}\). 
Combining our stabilization protocol with entanglement-based sensing, continuous weak-measurement feedback, and other sophisticated techniques could allow for further sensitivity gains38. Our device is a standard grounded superconducting transmon qubit coupled to a quarter-wave transmission line cavity. Device parameters are given in Table 1, and the design is available on the SQuADDS database39. The qubit and cavity are far off resonance. In this dispersive regime there is approximately 0 energy exchange between qubit and cavity, but the cavity frequency shifts by χ/2π = 150 kHz when the qubit changes state. We measure the qubit by driving the cavity with an on-resonant pulse generated by mixing a carrier at the cavity frequency with a Gaussian envelope. The pulse transmits through the device, interacting with the cavity as it passes, then passes through an amplification chain up to room temperature, where we mix it back down to DC with an IQ mixer, giving a two-channel DC voltage signal that we then digitize. A diagram of the experimental setup is given in Fig. We project the measured two-channel voltage onto an axis which gives maximum discrimination between the signals for qubit ground and excited states. We drive rotations of the qubit state by driving it with an on-resonance microwave pulse. All qubit control envelopes are generated by the Zurich Instruments HDAWG, then upconverted and combined with measurement pulses from the ZI UHFQA before being fed into a heavily attenuated line. Readout signals are amplified and then downconverted and fed back into the UHFQA for analysis. We begin by measuring the qubit's T1 (using a population decay measurement) and its T2 and frequency (using a standard Ramsey measurement). If the qubit frequency has drifted, we reset the drive frequency to be resonant, then detune it by Δ. 
We then calculate the control waveform to stabilize vx for our chosen initial state; in the case where we are stabilizing a state with a breakdown time and we want to extend the evolution past breakdown, we set the control to 0 after breakdown. We then calibrate the strength of our drives with a Rabi measurement, where we drive the qubit with a pulse of constant duration and variable amplitude. This pulse has a cosine envelope and is 2.35 μs in duration. We measure oscillations of the qubit population and thus extract the driven Rabi frequency at a given control output voltage. We use this to convert our calculated control waveform into output voltage units for our control electronics. Note that long qubit manipulation pulses are used because our control line is heavily attenuated in order to give fine resolution of the continuous control waveform. We therefore prepare a state with \({v}_{x}(0)=\sin \theta\) by driving a pulse with amplitude \(\frac{\theta}{\pi}{A}_{\pi}\), where Aπ is the amplitude of a π rotation pulse calibrated via the Rabi measurement. We next drive the continuous control waveform to stabilize the state for a time t. We then stop the control and perform quantum state tomography, measuring the qubit state along one axis. To measure vx, we apply a -π/2 rotation pulse about the y-axis, then drive a readout pulse on the cavity. To measure vy we use a π/2 rotation about the x-axis, then a readout pulse; to measure vz we use a 2π rotation about the y-axis, then readout. After a measurement we either do nothing or drive a π rotation of the qubit, conditional on the measurement outcome, to reset it to the ground state. We wait an additional 60 μs to damp any residual excited state population. We repeat each time point three times to measure all three Bloch vector components, then sweep t. 
Before measuring the first time point, we perform a measurement to calibrate the voltage corresponding to the ground state vz = 1; after the last time point, we perform a π rotation on the qubit and then measure to calibrate the voltage corresponding to the excited state vz = −1. We then repeat this entire process many times and directly average the measured voltages. We use these voltages as the reference values for vi = ±1 (i = {x, y, z}). When comparing the signal from coherence stabilization vs Ramsey, we interleave the measurements to avoid errors due to drifts in qubit parameters. We use threshold assignment of the readout voltage to the ground or excited state in order to reduce noise in these small signals. To speed data collection, we only measure vy and ignore the other Bloch components and only measure at the optimal times for maximizing vy and \({v}_{y}/\sqrt{t}\) (T2 and T2/2, respectively, for Ramsey, and the times derived in the Supplementary Material for coherence-stabilized measurements). We interleave coherence-stabilized and Ramsey measurements for a given detuning and coherence-stabilized initial state, repeating many times to build up accurate estimates of vy. We then move on to the next detuning while keeping θ constant. We repeat these detuning sweeps many times to build up statistics, re-measuring T1 and T2 before each sweep. We then move on to the next initial state θ. We take this set of coherence-stabilized and Ramsey data from the many detuning sweep iterations for each θ and break it into chunks of ~10 iterations, interspersed throughout the dataset. For instance, one chunk might contain our 1st, 11th, 21st, ...,91st iteration. We rescale the detuning axis of this data by the T2 measured in that iteration, and likewise divide the measurement times by T2 to render them dimensionless. 
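The voltage calibration, the interleaved chunking, and the rescaling by T2 described above might look like the following sketch. It assumes a linear map between averaged voltage and Bloch component, and detunings expressed as angular frequencies so that ΔT2 is dimensionless; the helper names are hypothetical and indices are 0-based.

```python
def voltage_to_bloch(v, v_ground, v_excited):
    """Linear map sending the ground-state voltage to +1 and the excited-state voltage to -1."""
    return (2.0 * v - v_ground - v_excited) / (v_ground - v_excited)


def interleaved_chunks(n_iterations, n_chunks):
    """Chunk j holds iterations j, j + n_chunks, j + 2 * n_chunks, ... (0-based)."""
    return [list(range(j, n_iterations, n_chunks)) for j in range(n_chunks)]


def dimensionless(detuning, time, t2):
    """Rescale detuning and measurement time by the T2 measured in the same iteration."""
    return detuning * t2, time / t2


chunks = interleaved_chunks(100, 10)
print(chunks[0])  # 0-based indices of the 1st, 11th, 21st, ..., 91st iteration
```

Spreading each chunk across the whole dataset in this way means that slow drifts in qubit parameters affect all chunks similarly rather than biasing any single one.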
We then take all the vy data in a given chunk and fit it simultaneously to a linear dependence on ΔT2, extracting one slope for the coherence-stabilized measurements and one for the Ramsey measurements. In the case where we are calculating Rs, we then divide each slope by √t/T2, where t is the time at which the measurement was taken in each iteration. Note that t will vary linearly with T2, so even as T2 fluctuates through iterations, this ratio remains constant (so long as T1/T2 remains roughly constant). We take the ratio of the coherence-stabilized slope to the Ramsey slope, then average over all N ~10 chunks to give an estimate of Rv or Rs. We compute the variance of these ratios and divide it by N as an estimate of the error. All data are available on Zenodo40 or upon request to the corresponding author. Analytical theory derivation code and numerical simulation code are available at the repository github.com/LFL-Lab/stabilized-sensing. The repository is also linked to Zenodo41. Ramsey, N. F. A molecular beam resonance method with separated oscillating fields. Balasubramanian, G. et al. Nanoscale imaging magnetometry with diamond spins under ambient conditions. Bal, M., Deng, C., Orgiazzi, J.-L., Ong, F. R. & Lupascu, A. Ultrasensitive magnetic field detection using a single artificial atom. Dixit, A. V. et al. Searching for dark matter with a superconducting qubit. Bass, S. D. & Doser, M. Quantum sensing for particle physics. Aslam, N. et al. Quantum sensors for biomedical applications. Ristè, D. et al. Millisecond charge-parity fluctuations and induced decoherence in a superconducting transmon qubit. Hot nonequilibrium quasiparticles in transmon qubits. Liu, C. H. et al. Quasiparticle poisoning of superconducting qubits from resonant absorption of pair-breaking photons. Improving qubit coherence using closed-loop feedback. Magnetic field sensing beyond the standard quantum limit using 10-Spin NOON states. 
Proposed robust entanglement-based magnetic field sensor beyond the standard quantum limit. Magnetic field sensing beyond the standard quantum limit under the effect of decoherence. Lawrie, B. J., Lett, P. D., Marino, A. M. & Pooser, R. C. Quantum sensing with squeezed light. & Degen, C. L. Quantum sensing with arbitrary frequency resolution. Simultaneous spectral estimation of dephasing and amplitude noise on a qubit sensor via optimally band-limited control. Titum, P., Schultz, K., Seif, A., Quiroz, G. & Clader, B. D. Optimal control for quantum detectors. & Paraoanu, G. S. Benchmarking machine learning algorithms for adaptive quantum phase estimation with noisy intermediate-scale quantum sensors. Naghiloo, M., Jordan, A. N. & Murch, K. W. Achieving optimal quantum acceleration of frequency estimation using adaptive coherent control. Song, X. et al. Agnostic phase estimation. & Kołodyński, J. Quantum metrology with imperfect measurements. Shulman, M. D. et al. Suppressing qubit dephasing using real-time Hamiltonian estimation. & Takahashi, S. Reduction of surface spin-induced electron spin relaxations in nanodiamonds. Danilin, S., Nugent, N. & Weides, M. Quantum sensing with tunable superconducting qubits: optimization and speed-up. Taylor, J. M. et al. High-sensitivity diamond magnetometer with nanoscale resolution. Levenson-Falk, E., Antler, N. & Siddiqi, I. Dispersive nanoSQUID magnetometry. Saurav, K. & Lidar, D. A. Quantum property preservation. & Schneider, S. Stabilizing qubit coherence via tracking-control. Roloff, R., Wenin, M. & Pötz, W. Optimal control for open quantum systems: qubits and quantum gates. Rembold, P. et al. Introduction to quantum optimal control for quantum sensing with nitrogen-vacancy centers in diamond. Poggiali, F., Cappellaro, P. & Fabbri, N. Optimal control for one-qubit quantum sensing. Basilewitsch, D., Yuan, H. & Koch, C. P. Optimally controlled quantum discrimination and estimation. Zhou, S. & Jiang, L. 
Asymptotic theory of quantum channel estimation. Shanto, S. et al. SQuADDS: a validated design database and simulation workflow for superconducting qubit design. The authors gratefully acknowledge useful discussions with Kater Murch, Archana Kamal, Sacha Greenfield, and Sadman Shanto. This material is based upon work supported in part by the U. S. Army Research Laboratory and the U. S. Army Research Office under contract/grant number W911NF2310255, the National Science Foundation, the Quantum Leap Big Idea under Grant No. OMA-1936388, the Office of Naval Research under Grant No. N00014-21-1-2688, and Research Corporation for Science Advancement under Cottrell Award 27550. Devices were fabricated and provided by the Superconducting Qubits at Lincoln Laboratory (SQUILL) Foundry at MIT Lincoln Laboratory, with funding from the Laboratory for Physical Sciences (LPS) Qubit Collaboratory. Center for Quantum Information Science and Technology, University of Southern California, Los Angeles, CA, 90089, USA M. O. Hecht, Kumar Saurav, Evangelos Vlachos, Daniel A. Lidar & Eli M. Levenson-Falk M. O. Hecht, Evangelos Vlachos, Daniel A. Lidar & Eli M. Levenson-Falk Ming Hsieh Department of Electrical & Computer Engineering, University of Southern California, Los Angeles, CA, 90089, USA Kumar Saurav, Daniel A. Lidar & Eli M. Levenson-Falk Quantum Elements, Inc., Thousand Oaks, California, USA designed the device used in experiments. The authors declare no competing interests. Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available. 
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. Hecht, M.O., Saurav, K., Vlachos, E. et al. Beating the Ramsey limit on sensing with deterministic qubit control.
Nature Communications volume 16, Article number: 3992 (2025). As conventional electronic materials approach their physical limits, the application of ultrafast optical fields to access transient states of matter captures imagination. The inversion symmetry governs the optical parity selection rule, differentiating between accessible and inaccessible states of matter. To circumvent parity-forbidden transitions, the common practice is to break the inversion symmetry by material design or external fields. Here we report how the application of femtosecond ultraviolet pulses can energize a parity-forbidden dark exciton state in black phosphorus while maintaining its intrinsic material symmetry. Unlike its conventional bandgap absorption in visible-to-infrared, femtosecond ultraviolet excitation turns on efficient Coulomb scattering, promoting carrier multiplication and electronic heating to ~3000 K, and consequently populating its parity-forbidden states. Interferometric time- and angle-resolved two-photon photoemission spectroscopy reveals dark exciton dynamics of black phosphorus on ~100 fs time scale and its anisotropic wavefunctions in energy-momentum space, illuminating its potential applications in optoelectronics and photochemistry under ultraviolet optical excitation. Excitons in insulators and semiconductors1 are the primary quasiparticles of light-matter interaction governed by quantum mechanical selection rules2. Optical excitation creates a valence band (VB) hole and a conduction band (CB) electron that form an exciton state bound by the screened Coulomb interaction. 
Excitons are quanta that can transduce the energy and information of light, and define the optical-electrical-chemical properties of matter. When the VB and CB carry opposite spin or possess different momentum, optical transitions between them are forbidden, and their excitons are said to be dark3,4,5. The suppressed radiative electron-hole recombination makes dark excitons longer-lived, attracting keen interest in switching them on at will6,7,8,9. The parity of electronic wavefunctions, defined by switching ('−') or retention ('+') of sign upon point reflection, also defines the strength of optical transitions. A striking example is the metastable excited state of helium, whose decay time to the same parity ground state is \( > {10}^{6}\) times longer than for the parity-allowed states10. The optical selection rules require that the product of symmetry characters Γ is equal to one. The parity symmetry selection rule is satisfied when the electronic wave functions and light satisfy \({\varGamma }_{{VB}}\otimes {\varGamma }_{{CB}}\otimes {\varGamma }_{{photon}}\otimes {\varGamma }_{{exciton\_envelope}}=1\), where the individual characters stand for VB and CB bands, photon, and exciton envelope11,12,13,14. The component \({\varGamma }_{{VB}}\otimes {\varGamma }_{{CB}}\otimes {\varGamma }_{{photon}}\) symmetry determines the optical CB ← VB transitions, where \({\varGamma }_{{photon}}=-1\) restricts single-photon transitions to occur between electronic bands of opposite parity11. The exciton envelope parity is defined by the orbital angular momentum quantum number \(l\) as \({\varGamma }_{{exciton\_envelope}}={\left(-1\right)}^{l}\) (ref. 12,14), rendering the lowest-lying 1 s exciton ground state of \({\varGamma }_{{exciton\_envelope}}=1\). Therefore, the 1 s exciton for one-photon CB ← VB transitions between bands of opposite parity is bright (Fig. 1a), but dark when they have the same parity (Fig. 
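The parity bookkeeping above can be written out as a short sketch. It assumes each character is simply +1 (even) or −1 (odd), so that the product of characters reduces to ordinary multiplication; the function names are hypothetical.

```python
def exciton_envelope_parity(l):
    """Parity of the exciton envelope with orbital angular momentum l: (-1)**l."""
    return (-1) ** l


def is_one_photon_allowed(gamma_vb, gamma_cb, l=0, gamma_photon=-1):
    """True when the product of the four parity characters equals +1.

    gamma_vb and gamma_cb are +1 for even and -1 for odd bands; l = 0 is the
    1s envelope; a single photon carries parity -1.
    """
    product = gamma_vb * gamma_cb * gamma_photon * exciton_envelope_parity(l)
    return product == 1


print(is_one_photon_allowed(+1, -1))  # opposite-parity bands, 1s exciton
print(is_one_photon_allowed(+1, +1))  # same-parity bands, 1s exciton
```

The first call evaluates to `True` (bright 1 s exciton) and the second to `False` (dark 1 s exciton), reproducing the two cases distinguished in the text.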
The parity selection rule can be circumvented by breaking the inversion symmetry, as has been done by introducing lattice-distorting polarons in strongly confined perovskite quantum dots15 or distorting the electronic wave functions by external magnetic or electric fields in carbon nanotubes13,16,17. a, b Schematic of the allowed (opposite-parity) and forbidden (same-parity) electric dipole transitions between VB and CB, which form a bright (red shading) or dark (gray shading) 1 s exciton state. c When the CB and VB have the same parity, the initial optical excitation generates a hot hole distribution ①, enabling ultrafast impact ionization within VBs to promote an electron to CB ②, turning on a dark exciton metastable state ③. Blue and gray shadings in a-c indicate the CB and VB, respectively. d The calculated band structure of bulk BP with measured band dispersions along Z-S (\({k}_{{AC}}\)) and Z-T (\({k}_{{ZZ}}\)) lines [gray contrast is the ARPES spectrum measured with a He lamp (\({{\hslash }}{{{\rm{\omega }}}}=21.2{{{\rm{eV}}}}\))]. The 3D Brillouin zone is shown at the bottom. The colors and symbol sizes express the dominant \({p}_{z}\) (red), \({p}_{x}\) (blue), and \({p}_{y}\) (green) atomic orbital compositions of bands. Electrons in even symmetry VBs cannot be promoted to the same symmetry CBs by optical transitions. e Geometry of the optical excitation, the BP sample azimuth, and the photoelectron emission relative to the optical plane. The detector measures photoelectron counts as a function of energy and momentum parallel to the optical plane following excitation by two identical collinear pulses separated by time delay \(\delta\). A λ/2 waveplate defines the p-pol (θ = 0°) and s-pol (θ = 90°) field polarizations relative to the optical plane. Source data are provided as a Source Data file. Here, we show how the parity-forbidden dark exciton state can be switched on in a bulk material even when its intrinsic crystal symmetry is maintained (Fig. 
BP is a particularly well-suited semiconductor18,19 for this study because (i) it has a direct band-gap20 with remarkable optical responses21; (ii) its low atomic number22 gives it a negligible spin-orbit coupling; (iii) its excitonic wavefunctions express the highly anisotropic band dispersions in the Z-S (armchair, AC) and Z-T (zigzag, ZZ, Fig. 1d) directions23; and (iv) its crystal structure exhibits multiple symmetries, providing an ideal platform to interrogate the optical selection rules (Supplementary Note 1). In particular, under three-dimensional space inversion, the CB2 as well as VB1-VB3 have even parity (Fig. 1d), making the CB2 ← VB1-VB3 transitions strictly parity-forbidden (Supplementary Note 1.1). Previous research on BP with infrared excitation has focused on the allowed CB1 ← VB1 transition24,25 and its bright exciton26,27. Here we use energy-momentum-time-resolved two-photon photoemission (2PP) spectroscopy to study the formation of the high-lying parity-forbidden CB2(e−)←VB1(h+) dark exciton under femtosecond UV excitation. Conventional angle-resolved photoemission spectra (ARPES) of BP excited with a He lamp reproduce the known18,20,21,28 anisotropic dispersions of the VB1-VB3 along the high-symmetry Z-T and Z-S lines (Fig. 1d and Supplementary Fig. 2PP spectra excited with \({{\hslash }}{{{\rm{\omega }}}}=4.59{{{\rm{eV}}}}\) femtosecond pulses record the electronic band structures of transiently populated CBs (Fig. Figure 2a shows the dispersions of CB2 and CB3 recorded in the 2PP \(E({k}_{||})\) spectrum, for s-pol excitation with ZZ direction of BP aligned in the optical plane. Tuning the light polarization from s-pol (θ = 90°) to p-pol (θ = 0°) selectively probes different CBs and the free-electron dispersing image potential state (IPS)29 (Fig. The observed band assignments are established by comparing the experimental and calculated band energies and masses (Supplementary Fig. 2), as summarized in Fig. 
Independent of the light polarization, the line traces taken at \({k}_{{ZZ}}=0\,{{{{\text{\AA }}}}}^{-1}\) show that the 2PP signal rises from the surface work function (WF) edge of BP at \({E}_{{final}}=4.68{eV}\), and thereafter, gradually decreases to higher energies (right panel in Fig. The decreasing signal with \({E}_{{final}}\) represents a hot thermal electron distribution within CB1. This suggests that the photoexcitation occurs by a Drude-like second-order inelastic scattering process30 that populates all accessible CBs. At the intermediate θ = 60° excitation polarization, an additional band appears ~ 85 meV below the CB2 with flatter dispersion that cannot be assigned within the single-particle band structure of BP (Fig. 2b and Supplementary Fig. Considering its binding relative to CB2, we attribute it to the CB2 exciton. a–c, Representative 2PP E(k||) spectra of BP excited with UV \(({{\hslash }}{{{\rm{\omega }}}}=4.59{{{\rm{eV}}}})\) light polarized with θ = 90° (s-pol), 60°, 0° (p-pol) with the ZZ azimuth in its optical plane. The left and right ordinates, respectively, numerate the intermediate state and final photoelectron energies (\({E}_{{{\mathrm{int}}}}\) and \({E}_{{final}}\)), in the 2PP process, relative to the Fermi level (EFermi). The right spectra are line profiles taken through \({k}_{{{{\rm{ZZ}}}}}=0\,{{{{\text{\AA }}}}}^{-1}\); the arrows point to the active bands in each spectrum. In (b and c), the higher energy range intensities are magnified to visualize low-intensity spectral peaks. d Sketch of the observed CBs along \({k}_{{{{\rm{ZZ}}}}}\) with their specific dispersions. e A colormap composed of a series of line profiles taken at \({k}_{{{{\rm{ZZ}}}}}=0\,{{{{\text{\AA }}}}}^{-1}\) in 10° intervals for light polarization range from θ = − 90° to 90° (s → p → s). f Polar plots of the relative peak intensity of each state after background subtraction vs. the light polarization θ = 0° to 360°. 
The red dashed lines show the calculated θ dependences of |TDM|² for each \({E}_{{final}}\leftarrow\)CBs one-photon and the coherent \({E}_{{final}}\leftarrow\)IPS ← VB1 two-photon processes. Source data are provided as a Source Data file. Assignments of the detected CB bands are affirmed by distinct polarization dependences of their 2PP spectra. As shown by the line profiles at \({k}_{{ZZ}}=0\,{{{{\text{\AA }}}}}^{-1}\) taken by stepwise tuning of the s → p → s excitation polarization (presented in the color map of Fig. 2e), the intensity maxima for CB1, CB4, and exciton state occur for p-pol, while for CB2 and CB3 they occur for s-pol. In polar coordinates, by comparing the relative 2PP intensity θ dependences with those of the calculated moduli square of transition dipole moments (|TDM|²) for each CB (Fig. 2f and Supplementary Note 1.2), we conclude that the \({\cos }^{2}{{{\rm{\theta }}}}\) dependence of CB1 and CB4, and the \({\sin }^{2}{{{\rm{\theta }}}}\) dependence of CB2 and CB3 are defined by one-photon transitions from these transiently populated bands to the \({E}_{{final}}\) photoemission continuum, in line with their incoherent population and subsequent photoemission. Further symmetry analysis of CBs explains their distinct polarization dependences. Specifically, when the ZZ azimuth of BP is aligned in the optical plane, the pseudospin symmetry is even for CB1 and CB4, and odd for CB2 and CB3, which defines their respective detection by p-pol and s-pol (Supplementary Note 1.3). By contrast, the IPS has intensity maxima at θ = ± 45°, which stands out because photoemission from IPS is always maximum for normal emission. This, however, can be elegantly understood as a consequence of a coherent two-photon \({E}_{{final}}\leftarrow\)IPS ← VB1 excitation that must have distinct \({\sin }^{2}{{{\rm{\theta }}}}\cdot {\cos }^{2}{{{\rm{\theta }}}}\) dependence (Supplementary Fig. 
The CB2 exciton state appears for p-pol because its joint electron-hole character, with CB2 and VB1 both of odd symmetry, gives it an overall even symmetry. We next establish the exciton state energy by tuning the excitation over \({{\hslash }}{{{\rm{\omega }}}}=3.94-4.77{{{\rm{eV}}}}\) (Supplementary Fig. Figure 3a records the \({E}_{{final}}\) shift of each state with \({{\hslash }}{{{\rm{\omega }}}}\), where the approximate fitted slopes of ~ 1 confirm their incoherent population in the 2PP process, and determine the exciton binding energy of Eb ~ 80 ± 20 meV relative to CB2 (Fig. Furthermore, 2PP spectra recorded for azimuths from ZZ to AC capture the anisotropies of the CBs and the CB2 exciton (the AC measurements are shown in Supplementary Figs. 6, 7 and analyzed in Supplementary Note 1.4). Surprisingly, the CB2 exciton dispersion reverses from positive along kZZ to negative along kAC (Fig. The dispersion extraction detailed in Supplementary Figs. 3 and 7 gives the exciton effective masses \({m}_{{ex}}^{*}({ZZ})=0.90\,{m}_{e}\) and \({m}_{{ex}}^{*}({AC})=-0.36\,{m}_{e}\) (\({m}_{e}\) is the free electron mass). a \({E}_{{final}}\) values of CB1-CB3 and CB2 exciton vs. \({{\hslash }}{{{\rm{\omega }}}}\), and their linear fitting (dashed lines), with numbers giving their slopes. The slopes of ~ 1 signify that the 2PP experiment measures their incoherently excited populations. The data with hollow and solid circles are extracted from s-pol and p-pol 2PP spectra in Supplementary Fig. b Data from a transformed to \({E}_{{{\mathrm{int}}}}\) to obtain the state energies (indicated by the numbers). c, d 2PP spectra of CB2 excitons measured with \({{\hslash }}{{{\rm{\omega }}}}=4.59{{{\rm{eV}}}}\) for the ZZ and AC azimuths, respectively. For ZZ, the polarization is set at θ = 60° to record both the CB2 and CB2 exciton, while for AC, it is at θ = 0°. The measured dispersions of CB2 and CB2 exciton are indicated by white and orange curves, respectively. 
The lower panels show the k|| momentum intensity distributions of CB2 excitons. The black curves indicate the fitting of the exciton wavefunctions densities via \({\left|\phi \left(k\right)\right|}^{2}=1/{\left[1+{\left(k{a}_{{ex}}\right)}^{2}/4\right]}^{4}\) (ref. 32), reporting the anisotropic exciton Bohr radii \({a}_{{ex}}=\,12.3\pm 0.6\,{{{\text{\AA }}}}\,\left({ZZ}\right)\) and \(15.6\pm 0.5\,{{{\text{\AA }}}}\,({AC})\). e, f The simulated spectra of CB2(e−)←VB1(h+) excitons along \({k}_{{ZZ}}\) and \({k}_{{AC}}\), whose spectral dispersions are shown by the orange dashed curves. g The nonlinear order N obtained by fitting the s-pol (upper panel) and p-pol (lower panel) power-dependent 2PP spectral intensities (Supplementary Fig. 8) to \(Y={I}^{N}\), plotted vs. Eint (Efinal). The gray line indicates the corresponding 2PP spectra excited with s-pol \({{\hslash }}{{{\rm{\omega }}}}=4.59{{{\rm{eV}}}}\). h 2PP spectra excited with s-pol \({{\hslash }}{{{\rm{\omega }}}}=4.43{{{\rm{eV}}}}\) at BP sample temperatures of 300, 260, 190, and 90 K. The spectrum at 90 K shows Gaussian function fits for the CB2 and CB3 peaks (gray shading), and Fermi-Dirac (F-D) distribution for hot electrons giving \({T}_{e}=2650\,\pm \, 16\, K\) (blue shading). i The obtained hot electron \({T}_{e}\) values plotted for different sample temperatures. The error bars in (g and i) are given by the 95% confidence interval for the nonlinear least-square parameter estimates. Source data are provided as a Source Data file. We note that the true dispersion of an exciton is defined by the sum of the electron and hole masses31, whereas its photoemission spectra record the “apparent dispersion” from its photoemitted electron32,33,34. Photoemission dissociates an exciton, with the energy and momentum conservation constraining its \(E\left({k}_{\parallel }\right)\) distribution by that remaining with the VB hole32,33,34. 
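To give a feel for the wavefunction density quoted in the caption above, the sketch below evaluates \({\left|\phi \left(k\right)\right|}^{2}=1/{\left[1+{\left(k{a}_{{ex}}\right)}^{2}/4\right]}^{4}\) and the momentum at which it falls to half its peak value, using the two reported Bohr radii. The half-width is not a quantity reported in the text; it is derived here only as an illustration, and the function names are hypothetical.

```python
import math


def exciton_density(k, a_ex):
    """|phi(k)|^2 = 1 / [1 + (k * a_ex)^2 / 4]^4, with k in 1/angstrom and a_ex in angstrom."""
    return 1.0 / (1.0 + (k * a_ex) ** 2 / 4.0) ** 4


def half_width_at_half_maximum(a_ex):
    """Momentum at which the density above falls to half of its k = 0 value.

    Solving [1 + (k * a_ex)^2 / 4]^4 = 2 gives k = (2 / a_ex) * sqrt(2**0.25 - 1).
    """
    return (2.0 / a_ex) * math.sqrt(2.0 ** 0.25 - 1.0)


for label, a_ex in (("ZZ", 12.3), ("AC", 15.6)):
    k_half = half_width_at_half_maximum(a_ex)
    print(f"{label}: a_ex = {a_ex} angstrom, HWHM = {k_half:.3f} 1/angstrom")
```

The larger real-space radius along AC corresponds to the narrower momentum distribution, roughly 0.06 Å⁻¹ versus 0.07 Å⁻¹ along ZZ for this functional form.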
Simulation of the “apparent dispersion” by the effective mass approximation model34 shows that the effective exciton temperature, \({T}_{{ex}}\), can tune the “apparent dispersion” to be close to VB dispersion at \({T}_{{ex}}\to 0\) and to CB dispersion at \({T}_{{ex}}\to \infty\) (Supplementary Note 2.1). The CB2 has a strong upward dispersion along kZZ, and the VB1 has a strong downward dispersion along kAC (blue and red curves in Fig. This can generate the reverse “apparent dispersions” for CB2(e−)←VB1(h+) exciton along kZZ and kAC at a modest \({T}_{{ex}}\), where the experimentally observed dispersions are best simulated for \({T}_{{ex}}=340{K}\) (color contrast in Fig. Furthermore, many-body perturbation theory calculation supports that the observed exciton is the 1 s ground state of CB2(e−)←VB1(h+) exciton with a calculated Eb ~ 60 meV and anisotropic wave function in real-space (Supplementary Note 2.1). As explained, parity forbids direct optical CB2 ← VB1 transitions, and therefore its 1 s exciton is strictly a parity-forbidden dark state. Elucidating how it is nevertheless excited therefore provides insight into the optical response of BP. The laser intensity (I) power-law29 for photoelectron yield (Y) by two-photon absorption, \(Y={I}^{N}\), is expected to have N = 2. The measured N > 2 values below the highest energy photoelectron states, however, imply that there must be a carrier multiplication process (Fig. 3g and Supplementary Fig. This can occur by Auger-type scattering, as happens in copper35,36 and layered materials29,37. As in the quasi-two-dimensional (2D) semimetal graphite29, weak screening of the Coulomb interaction in the Van der Waals layered BP enables ultrafast electron excitation to CBs by second-order Coulomb scattering channels that are only restricted by energy and momentum conservation but not symmetry. 
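The nonlinear order N in \(Y={I}^{N}\) can be estimated as the slope of log Y against log I. The sketch below uses an ordinary least-squares line on the logarithms; the text reports nonlinear least-squares estimates with 95% confidence intervals, so this is a simplified stand-in rather than the fitting procedure actually used, and `nonlinear_order` is a hypothetical name.

```python
import math


def nonlinear_order(intensities, yields):
    """Least-squares slope of log(Y) against log(I), i.e. N in Y proportional to I**N."""
    xs = [math.log(i) for i in intensities]
    ys = [math.log(y) for y in yields]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic example: a pure two-photon signal and one with an extra nonlinear channel.
powers = [1.0, 2.0, 4.0, 8.0]
print(nonlinear_order(powers, [p ** 2 for p in powers]))
print(nonlinear_order(powers, [3.0 * p ** 2.5 for p in powers]))
```

A slope of 2 indicates plain two-photon photoemission, while a slope above 2, as in the second synthetic series, is the signature the text associates with carrier multiplication.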
This is immediately evident in the unstructured hot electron signal from CB1 in 2PP spectra of BP, whose intensity decreases above the WF edge following UV excitation. We attribute this signal to hot thermalized electrons with an effective electron temperature (Te) given by the Fermi-Dirac (F-D) distribution, \(f\left({E}_{{{\mathrm{int}}}}\right)=1/[1+\exp \left(\frac{{E}_{{{\mathrm{int}}}}-{E}_{{Fermi}}}{{k}_{B}{T}_{e}}\right)]\) (\({k}_{B}\) is the Boltzmann constant). Fitting to the F-D distribution gives \({T}_{e}\) in the \(2650-3240 \, K\) range for sample temperatures of 90–300 K (Fig. The effect of the sample temperature on the effective hot electron temperature implies that electron-phonon (e-ph) scattering also contributes to the hot electron generation and thermalization. Evidence for the delayed excitation of dark states and hot electron heating is conspicuous in the interferometric time-resolved 2PP measurements for excitation with a pair of identical, collinear, phase-correlated pump-probe pulses at \({{\hslash }}{{{\rm{\omega }}}}=4.02{{{\rm{eV}}}}\) (Fig. The measurements generate 3D movies of \({E}_{{{\mathrm{int}}}}({k}_{{||}})\) vs. \(\delta\) scanned in 0.064 fs/frame steps to capture the coherent and incoherent electron dynamics with sub-optical-cycle accuracy (Supplementary Movies 1, 2). Extracting the \({E}_{{{\mathrm{int}}}}(\delta )\) at \({k}_{{||}}=0\,{{{{\text{\AA }}}}}^{-1}\) from the movies generates 2D interferograms of the polarization and population dynamics for each state (Fig. 4a for s-pol and Supplementary Fig. With s-pol excitation, the interferogram shows the main features of CB2 and CB3 superposed on the dominant hot electron signal with decreasing intensity above the WF edge in the vertical line trace at \(\delta=0{fs}\) (Fig. Plotting the vertical line traces at Δ\(\delta\)=20 fs intervals shows the hot electron signal component to be decreasing (Fig. Fitting this component by F-D distribution (Fig. 
4c) shows that \({T}_{e}\)(\(\delta\)) has the primary maximum at \(\delta=0{fs}\) where the laser fluence is maximum, and tellingly, a secondary maximum at \(\delta \sim 80{fs}\) (red arrow in Fig. 4c) where the hot electron population is multiplied and heated. We attribute the rising signal to retarded Auger-type scattering processes38,39,40. a Interferometric time-resolved 2PP interferogram at \({k}_{{{{\rm{ZZ}}}}}=0\,{{{{\text{\AA }}}}}^{-1}\), displaying the photoelectron counts (color scale) vs. \({E}_{{{\mathrm{int}}}}\) (ordinate) and \(\delta\) (abscissa), excited by s-pol \({{\hslash }}{{{\rm{\omega }}}}=4.02{{{\rm{eV}}}}\) light (ZZ azimuth). The orange curve is a line profile at \(\delta=0{{{\rm{fs}}}}\). b A series of vertical line profiles extracted from a, taken at intervals of Δ\(\delta\)= 20 fs, with intensities normalized at the WF edge. The spectrum at \(\delta=260{{{\rm{fs}}}}\) is fitted by the F-D distribution (blue shaded; \({T}_{e}=2130 \, K\)), and Gaussian lineshapes for the CB band peaks (gray). c Nonmonotonically evolving \({T}_{e}\) from F-D fitting of profiles with increasing \(\delta\) in (b). d Interferometric two-pulse correlation (I2PC) traces extracted as horizontal line profiles from a for \({E}_{{{\mathrm{int}}}}\) at the WF edge, CB2, and CB3, respectively, with intensities normalized at \(\delta=0{{{\rm{fs}}}}\). The black curves are obtained by Fourier transformation of the I2PC data and reverse transformation of its zero-frequency \(0{{{\rm{\omega }}}}\) component to emphasize its slowly evolving hot electron signal. The green curve is the in situ pulse autocorrelation reference (Supplementary Fig. e, f Maps of the inverse-Fourier-transformed \(0{{{\rm{\omega }}}}\) signal from the data in a for s-pol, and from Supplementary Fig. 9c for p-pol excitation; the intensities are normalized at \(\delta=0{{{\rm{fs}}}}\). 
The transiently populated longer-lived dark CB2 and its exciton populations are evident. g, h The delayed Auger/direct population ratio \({{{\rm{\alpha }}}}\) and the hot electron lifetime \({\tau }_{e}\) from the OBE simulation of \(0{{{\rm{\omega }}}}\) traces, plotted as a function of \({E}_{{{\mathrm{int}}}}\), for s-pol and p-pol measurements. The vertical line in (h) indicates the limiting time resolution of ~30 fs (for \({E}_{{{\mathrm{int}}}} \, > \, 1.8{eV}\), \({\tau }_{e}\) extraction is not reliable). In the \({E}_{{{\mathrm{int}}}}=0.7-1.8{eV}\) range, the data are fitted with \({\tau }_{e}\, \sim \,{\left({E}_{{{\mathrm{int}}}}\right)}^{-n}\), giving n = 1.93 ± 0.12 and 1.89 ± 0.11 for the s-pol and p-pol measurements, respectively. Source data are provided as a Source Data file. The contribution of the retarded carrier scattering is evident in the interferometric two-pulse correlation (I2PC, Fig. 4d) traces obtained as horizontal line profiles through the interferogram in Fig. 4a, as well as the enhanced (N > 2) nonlinear order of 2PP signals in Fig. The phase-independent \(0{{{\rm{\omega }}}}\) signal obtained by Fourier filtering41 of the I2PC scans records the incoherent electron dynamics (black traces in Fig. The \(0{{{\rm{\omega }}}}\) trace for CB3 at \({E}_{{{\mathrm{int}}}}=2.5{eV}\) is almost identical to the reference pulse autocorrelation (green trace), because the electron dynamics at that energy are too fast to resolve. By contrast, the \(0{{{\rm{\omega }}}}\) signals for CB2 and WF edge show a delayed rise and pedestal formation at \(\delta \, > \, 50{fs}\), giving evidence for carrier multiplication35,36. Plotting the \(0{{{\rm{\omega }}}}\) traces at different \({E}_{{{\mathrm{int}}}}\) shows that the delayed rise feature primarily occurs at low \({E}_{{{\mathrm{int}}}}\) and becomes invisible at \({E}_{{{\mathrm{int}}}} \, > \, 2.4{{{\rm{eV}}}}\) (Supplementary Fig. 
9b, d), indicating that the CB population by carrier multiplication decreases as \({E}_{{{\mathrm{int}}}}\) increases to reach the expected N = 2 for a two-photon process at \({E}_{{{\mathrm{int}}}}={{\hslash }}{{{\rm{\omega }}}}\). Particularly, pronounced delayed rise signals appear at Eint of CB2 and CB2 exciton, as are evident in the \(0{{{\rm{\omega }}}}\) maps under s-pol and p-pol light excitation (Fig. Moreover, the delayed rise features appear earlier under higher light fluence (Supplementary Fig. 10), as expected for the Auger-type scattering process42. Optical Bloch equation (OBE) simulations at specific \({E}_{{{\mathrm{int}}}}\) including the delayed Auger-type generation, can well reproduce the \(0{{{\rm{\omega }}}}\) traces (Supplementary Note 3), from which we can extract the hot electron lifetime \({\tau }_{e}\)35,36, and the ratio, α, of the delayed to the prompt hot electron population. As \({E}_{{{\mathrm{int}}}}\) increases from the WF edge, the value of α decreases non-monotonically (Fig. 4g), gaining secondary maxima at CB2 and CB2 exciton, signifying their dominant excitation is by carrier multiplication. The obtained \({\tau }_{e}\) is longest (~ 180 fs) at WF edge (\({E}_{{{\mathrm{int}}}}=0.7{eV}\)) and decreases at higher \({E}_{{{\mathrm{int}}}}\) (Fig. 4h) with an approximate \({\tau }_{e}\, \sim \,{\left({E}_{{{\mathrm{int}}}}\right)}^{-1.9}\) dependence, which is close to the inverse quadratic dependence of \({\tau }_{e}\) for e-e scattering in normal Fermi liquids43. This invites a conclusion that for \({E}_{{{\mathrm{int}}}}\ge 0.7{eV}\) above EFermi, e-e scattering dominates on the ~ 100 fs time scale, while closer to the CB minimum, the e-ph scattering may dominate on a longer time scale43. The semiconducting BP with a layer thickness tunable bandgap presents a broad optical response in the visible-to-infrared range21,24 and attracts interest in its variable-spectrum optoelectronic applications44. 
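The power-law dependence of \({\tau }_{e}\) on \({E}_{{{\mathrm{int}}}}\) quoted above can be extracted by a straight-line fit in log-log space. The following is only a minimal sketch of that idea: the function name `fit_power_law` and the synthetic, noise-free lifetimes are hypothetical and are not the measured data or the authors' fitting code.

```python
import numpy as np

def fit_power_law(energy_ev, lifetime_fs):
    """Fit tau = A * E**(-n) by linear regression in log-log space; return (A, n)."""
    slope, intercept = np.polyfit(np.log(energy_ev), np.log(lifetime_fs), 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic, noise-free lifetimes (assumed values for illustration only):
# tau = 90 fs * E^-1.9 sampled over the 0.7-1.8 eV fitting window.
energy_ev = np.linspace(0.7, 1.8, 12)
lifetime_fs = 90.0 * energy_ev ** -1.9

amplitude_fs, exponent_n = fit_power_law(energy_ev, lifetime_fs)
print(f"A = {amplitude_fs:.1f} fs, n = {exponent_n:.2f}")
```

With real data, the scatter of the points around the fitted line would set the uncertainty on n; an exponent close to 2 is what the Fermi-liquid comparison in the text refers to.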
The absorption of BP in the \({{\hslash }}{{{\rm{\omega }}}}\) = 3 − 4 eV UV range is reported to produce a “colossal”, five-orders-of-magnitude increase in photoconductivity relative to visible and near-infrared excitation45. Wu et al. attribute this changeover to the excitation of flat bands (Γ-Z direction) that may contribute a high density of states for excitation from ~ − 2 to 2 eV45. Our femtosecond deep-UV (\({{\hslash }}{{{\rm{\omega }}}}\) = 4.68 − 3.94 eV) 2PP spectroscopy of BP shows that photoexcitation instead occurs through a second-order Drude-like process, in which transient e−-h+ pairs turn on Coulomb interactions that activate the carrier-carrier scattering that populates CB1-CB4, in sharp contrast to the conventional band-to-band transitions under visible excitation (Supplementary Note 4). Consequently, in the primary response, energy and momentum are conserved, generating a hot electron gas with Te up to 3000 K in CB1 and populating CB2-CB4. The concomitantly generated hot holes participate in Auger-type interband carrier-carrier scattering. In particular, VB1 electrons participate in impact ionization, where the energy released by one electron recombining with a deep hole is transferred to another electron, promoting it to CB2 and generating the parity-forbidden CB2(e−)←VB1(h+) dark exciton state (Supplementary Fig. The energy and momentum conservation in impact ionization determines the exciton distribution with finite center-of-mass (COM) momentum, thereby producing the observed exciton temperature of \({T}_{{ex}}=340 \, K\) (Supplementary Fig. Such an unconventional optical response, with exceptional dark exciton formation and hot electron generation, extends the potential applications of BP in optoelectronics11,21,37,44 and photochemistry46,47 under femtosecond UV excitation. The involvement of carrier scattering in UV absorption can be attributed to weak and time-dependent screening of the Coulomb interaction in 2D materials.
The UV-excited transient excitonic virtual e-h pairs scatter within the 30-fs excitation pulse window to generate real thermalized hot electrons and holes48. The generated hot carrier plasma can undergo further carrier multiplication scattering on a < 100 fs time scale, which is seen as the delayed dark state rise. This analysis is entirely consistent with the understanding that, while the electronic band structure picture of 2D materials is valid, the quasiparticle picture of their excitations is equivocal because of charge carrier Coulomb correlations that favor second-order scattering processes48. Thus, our energy-momentum-time-resolved nonlinear 2PP spectroscopy shows that under femtosecond UV excitation, the optical response of 2D materials can deviate from band-to-band dipole transitions, opening new routes to access novel dark states with singular properties relative to the starting ground state. We expect that the departure from the quasiparticle concept under strong excitation is a general feature of 2D materials that show evidence of Coulomb correlations under weak excitation and low-temperature conditions49.

The single-crystalline BP samples are grown by the chemical vapor transport (CVT) method in a two-zone tube furnace using high-purity red phosphorus, tin iodide and tin powders as the starting materials50. Commercial BP samples (HQ Graphene) are also used to check the consistency of the 2PP experiments. The BP surface is cleaved in situ under ultrahigh vacuum (UHV). A low-temperature scanning tunneling microscope (STM, Omicron LT) operated in constant-current mode is used to verify the surface cleanliness (Supplementary Fig.

A 1030 nm femtosecond laser (Light Conversion, Pharos-20W) operating at a 500 kHz repetition rate pumps a self-built noncollinear optical parametric amplifier (NOPA) to produce excitation pulse trains tunable between 500 and 930 nm (\({{\hslash }}{{{{\rm{\omega }}}}}_{L}\) = 2.48 − 1.33 eV) with an average output power of ~ 60 − 80 mW.
The pulses are compressed by multiple reflections from a matched pair of negative dispersion mirrors to ~20–30 fs for further frequency doubling by a BBO crystal to 250 − 465 nm (\({{\hslash }}{{{\rm{\omega }}}}\) = 4.96 − 2.67 eV). Dispersion compensation for UV pulses uses a series of four dispersive prisms to reach a pulse duration of ~30 fs. The laser beam is focused onto the sample at an incident angle of 45° from the surface normal with a spot diameter ~ 50 μm. 2PP measurements are carried out mainly at room temperature in a UHV chamber with a base pressure < 1.5 × 10−10 mBar. The manipulator can also cool the samples down to 90 K with liquid nitrogen, such as for the measurements in Supplementary Fig. 7 and 10, for a better signal-to-noise ratio. The 2PP spectra are collected with a hemispherical electron energy analyzer (Specs, Phoibos 150, ± 15° acceptance angle). A 2D DLD delay-line detector (Surface Concept) records Efinal(k||) values in single photoelectrons counting acquisition mode. A bias of 3 V is applied between the sample and the analyzer to collect low-energy photoelectrons. When \({{\hslash }}{{{\rm{\omega }}}} \, < \, {{{\rm{WF}}}}\), two-photon absorption excites electrons from below EFermi to above the vacuum level (EV), to overcome the surface work function and undergo photoemission. The observed electron dynamics occur dominantly in bound intermediate states Eint(k||) = Efinal(k||) -\(\,{{\hslash }}{{{\rm{\omega }}}}\). During the electronic system evolution, absorbing the second photon projects electrons into the photoemission continuum, where their energy and momentum are recorded. As shown in Fig. 1e, the incident laser beam, the sample surface normal, and the analyzer slit are aligned in the optical plane. A λ/2 waveplate sets the excitation polarization between p- and s-pol. 
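The wavelength and photon-energy ranges quoted for the NOPA output and its second harmonic, as well as the relation Eint = Efinal − ħω, follow from simple arithmetic. The sketch below reproduces that arithmetic; the helper names and the example final-state energy of 4.72 eV are hypothetical and chosen only for illustration.

```python
HC_EV_NM = 1239.841984  # product h*c expressed in eV·nm

def photon_energy_ev(wavelength_nm: float) -> float:
    """Photon energy in eV for a vacuum wavelength in nm."""
    return HC_EV_NM / wavelength_nm

def second_harmonic_nm(wavelength_nm: float) -> float:
    """Frequency doubling halves the wavelength."""
    return wavelength_nm / 2.0

def intermediate_energy_ev(e_final_ev: float, photon_ev: float) -> float:
    """Eint = Efinal - photon energy, both measured from the same reference."""
    return e_final_ev - photon_ev

# NOPA tuning range and its second harmonic, as quoted in the text.
for fundamental_nm in (500.0, 930.0):
    uv_nm = second_harmonic_nm(fundamental_nm)
    print(f"{fundamental_nm:.0f} nm -> {photon_energy_ev(fundamental_nm):.2f} eV; "
          f"doubled: {uv_nm:.0f} nm -> {photon_energy_ev(uv_nm):.2f} eV")

# Hypothetical final-state energy of 4.72 eV probed with 4.02 eV photons.
e_int = intermediate_energy_ev(4.72, 4.02)
```

The printed values agree with the 2.48 − 1.33 eV and 4.96 − 2.67 eV ranges stated in the text.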
A 5-axis manipulator with in-plane azimuth rotation can align the ZZ or AC edges of BP crystals to the optical plane, where its crystalline orientations are further verified by recording the LEED pattern analysis (SPECS ErLEED 100, Supplementary Fig. The conventional ARPES (referred to as 1PP) is excited by \({{\hslash }}{{{\rm{\omega }}}}=21.2{{{\rm{eV}}}}\) from a He discharge lamp (VG Scienta, VUV5000). The optical excitation pathways for 1PP and 2PP are schematically shown in Supplementary Fig. A Mach-Zehnder interferometer generates identical, collinear, phase-correlated pump-probe pulse pairs for time-resolved 2PP measurements30. A piezoelectrically actuated translation stage scans the pump-probe time delay (\(\delta\)), enabling the phase coherent response to be recorded with each frame recording 2D Eint(k||) images. Scanning the delay over \(\delta\) = 280 to − 50 fs range in 64 attosecond steps records a 3D interferogram movie (Supplementary Movies 1, 2) of the variation of photoelectron counts in the energy-momentum-time domains41,51,52,53,54,55,56. To improve the counting statistics, more than 200 pump-probe scans are accumulated together with the delay time calibration interference fringes at the laser center frequency. The calibration fringes define the time axis with a constant optical cycle at specific excitation \({{\hslash }}{{{\rm{\omega }}}}\). From the accumulated 3D interferogram movie, we extract the 2D interferograms of Eint vs. \(\delta\) at specific k||, such as in Fig. 4a and Supplementary Fig. Such interferograms are Fourier transformed (FT) from time to frequency domains to analyze responses at the dominant coherent polarization frequencies corresponding to \(0{{{\rm{\omega }}}},\,1{{{\rm{\omega }}}},\,2{{{\rm{\omega }}}}\) harmonics (\({{{\rm{\omega }}}}\) is the laser frequency). 
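The Fourier filtering of an interferometric trace into its \(0{{{\rm{\omega }}}}\) component can be illustrated on synthetic data. This is a minimal sketch under stated assumptions: the Gaussian envelope, the relative harmonic amplitudes, the cutoff at half the laser frequency, and the function name `extract_zero_omega` are all hypothetical; only the 64 as step, the delay range, and the 4.02 eV photon energy are taken from the text.

```python
import numpy as np

PLANCK_EV_FS = 4.135667696  # Planck constant h in eV·fs

def extract_zero_omega(trace, step_fs, cutoff_per_fs):
    """Low-pass Fourier filter: zero all components above the cutoff frequency
    and transform back, leaving the slowly varying (phase-averaged) signal."""
    n = len(trace)
    spectrum = np.fft.rfft(trace)
    freqs = np.fft.rfftfreq(n, d=step_fs)  # in cycles per fs
    spectrum[freqs > cutoff_per_fs] = 0.0
    return np.fft.irfft(spectrum, n=n)

# Synthetic two-pulse correlation trace (assumed shape, not measured data).
step_fs = 0.064                                   # 64 attosecond delay step
delay_fs = -50.0 + step_fs * np.arange(5157)      # about -50 fs to +280 fs
laser_freq = 4.02 / PLANCK_EV_FS                  # optical frequency, cycles per fs
slow = np.exp(-delay_fs**2 / (2 * 12.0**2))       # incoherent envelope
trace = slow * (1.0
                + np.cos(2 * np.pi * laser_freq * delay_fs)          # 1-omega fringes
                + 0.3 * np.cos(4 * np.pi * laser_freq * delay_fs))   # 2-omega fringes

zero_omega = extract_zero_omega(trace, step_fs, cutoff_per_fs=0.5 * laser_freq)
max_error = float(np.max(np.abs(zero_omega - slow)))
print(f"largest deviation from the slow envelope: {max_error:.1e}")
```

Because the optical period at 4.02 eV is about 1 fs, the 64 as step samples each fringe roughly sixteen times, so the 1ω and 2ω harmonics are well separated from the slow component in the frequency domain.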
For further analysis, the 2D-FT spectra are inverse Fourier transformed back to the time domain; in particular, the \(0{{{\rm{\omega }}}}\) phase-averaged component (Fig. 4e, f and Supplementary Fig. 10) is used to analyze the optical-phase-independent hot electron population dynamics. The OBE simulation details of the population dynamics are in Supplementary Note 3. Density functional theory (DFT) calculations are performed with the projector-augmented wave method (PAW)57 as implemented in the Vienna ab-initio simulation package (VASP)58,59. The Perdew-Burke-Ernzerhof (PBE) functional60 is used for the exchange-correlation potential. For geometry optimization, the vdW interaction is considered at the vdW-DF level with the optB88 exchange functional (optB88-vdW)61,62. The plane-wave cut-off energy is set to 400 eV. An 18 × 18 × 6 k-mesh is used to sample the Brillouin zone. The atomic geometry is fully optimized until the residual force on each atom is less than 0.01 eV Å−1. The electronic band structure is calculated with the hybrid functional (HSE06)63 based on the atomic structures obtained from the full optimization by optB88-vdW, and the result is shown in Fig. Ab initio GW plus Bethe-Salpeter Equation (GW-BSE)64,65 calculations are also performed using VASP to evaluate the exciton binding energy and wave function, with the same structure, energy cutoff and k-grid as in the DFT calculation. The main data supporting the findings are provided in the Source data files with this paper. All the data that support the findings of this study are available from the corresponding author upon request. Source data are provided in this paper. Theory of the contribution of excitons to the complex dielectric constant of crystals. Wang, G. et al. Colloquium: Excitons in atomically thin transition metal dichalcogenides. Jiang, X. et al. Real-time GW-BSE investigations on spin-valley exciton dynamics in monolayer transition metal dichalcogenide. Poem, E. et al.
Accessing the dark exciton with light. Loh, K. P. 2D materials: Brightening the dark excitons. Zhou, Y. et al. Probing dark excitons in atomically thin semiconductors via near-field coupling to surface plasmon polaritons. Zhang, X. X. et al. Magnetic brightening and control of dark excitons in monolayer WSe2. Madeo, J. et al. Directly visualizing the momentum-forbidden dark excitons and their dynamics in atomically thin semiconductors. Schmitt, D. et al. Formation of moiré interlayer excitons in space and time. Hodgman, S. S. et al. Metastable helium: A new determination of the longest atomic excited-state lifetime. Ye, Z. et al. Probing excitonic dark states in single-layer tungsten disulphide. Kazimierczuk, T., Fröhlich, D., Scheel, S., Stolz, H. & Bayer, M. Giant Rydberg excitons in the copper oxide Cu2O. Matsunaga, R., Matsuda, K. & Kanemitsu, Y. Evidence for dark excitons in a single carbon nanotube due to the Aharonov-Bohm effect. Orfanakis, K. et al. Rydberg exciton-polaritons in a Cu2O microcavity. Rossi, D. et al. Light-induced activation of forbidden exciton transition in strongly confined perovskite quantum dots. Uda, T., Yoshida, M., Ishii, A. & Kato, Y. K. Electric-field induced activation of dark excitonic states in carbon nanotubes. Ishii, A., Machiya, H. & Kato, Y. K. High efficiency dark-to-bright exciton conversion in carbon nanotubes. Li, L. et al. Black phosphorus field-effect transistors. Qiao, J., Kong, X., Hu, Z.-X., Yang, F. & Ji, W. High-mobility transport anisotropy and linear dichroism in few-layer black phosphorus. Kim, J. et al. Observation of tunable band gap and anisotropic Dirac semimetal state in black phosphorus. Yuan, H. et al. Polarization-sensitive broadband photodetector using a black phosphorus vertical p-n junction. Kurpas, M., Gmitra, M. & Fabian, J. Spin properties of black phosphorus and phosphorene, and their prospects for spincalorics. Tran, V., Soklaski, R., Liang, Y. & Yang, L.
Layer-controlled band gap and anisotropic excitons in few-layer black phosphorus. Zhou, S. et al. Pseudospin-selective Floquet band engineering in black phosphorus. Chen, Z. et al. Band gap renormalization, carrier multiplication, and Stark broadening in photoexcited black phosphorus. Wang, X. et al. Highly anisotropic and robust excitons in monolayer black phosphorus. Zhang, G. et al. Determination of layer-dependent exciton binding energies in few-layer black phosphorus. Jung, S. W. et al. Black phosphorus as a bipolar pseudospin semiconductor. Tan, S., Argondizzo, A., Wang, C., Cui, X. & Petek, H. Ultrafast multiphoton thermionic photoemission from graphite. Petek, H. & Ogawa, S. Femtosecond time-resolved two-photon photoemission studies of electron dynamics in metals. Mattis, D. C. & Gallinar, J. P. What is the mass of an exciton? Fukutani, K. et al. Detecting photoelectrons from spontaneously formed excitons. Man, M. K. L. et al. Experimental measurement of the intrinsic excitonic wave function. & Kemper, A. F. Photoemission signature of excitons. Petek, H., Nagano, H., Weida, M. J. & Ogawa, S. The role of Auger decay in hot electron excitation in copper. Petek, H., Nagano, H. & Ogawa, S. Hot-electron dynamics in copper revisited: The d-band effect. et al. High-lying valley-polarized trions in 2D semiconductors. Klimov, V. I. Multicarrier interactions in semiconductor nanocrystals in relation to the phenomena of Auger recombination and carrier multiplication. Fu, J. et al. Hot carrier cooling mechanisms in halide perovskites. Paul, K. K., Kim, J.-H. & Lee, Y. H. Hot carrier photovoltaics in van der Waals heterostructures. Reutzel, M., Li, A. & Petek, H. Coherent two-dimensional multiphoton photoelectron spectroscopy of metal surfaces. Zhang, T. et al. Regulation of the luminescence mechanism of two-dimensional tin halide perovskites. Bauer, M., Marienfeld, A. & Aeschlimann, M. Hot electron lifetimes in metals probed by time-resolved two-photon photoemission.
Kim, H. et al. Actively variable-spectrum optoelectronics with black phosphorus. Wu, J. et al. Colossal ultraviolet photoresponsivity of few-layer black phosphorus. Zhang, K. et al. Black phosphorene as a hole extraction layer boosting solar water splitting of oxygen evolution catalysts. Kalay, E., Küçükkeçeci, H., Kilic, H. & Metin, Ö. Black phosphorus as a metal-free, visible-light-active heterogeneous photoredox catalyst for the direct C–H arylation of heteroarenes. Tomadin, A., Brida, D., Cerullo, G., Ferrari, A. C. & Polini, M. Nonequilibrium dynamics of photoexcited electrons in graphene: Collinear scattering, Auger processes, and the impact of screening. Rodin, A., Trushin, M., Carvalho, A. & Castro Neto, A. H. Collective excitations in 2D materials. Realizing nearly-free-electron like conduction band in a molecular film through mediating intermolecular van der Waals interactions. Transient excitons at metal surfaces. Tan, S. et al. Plasmonic coupling at a metal/semiconductor interface. Coherent electron transfer at the Ag/graphite heterojunction interface. Reutzel, M., Li, A., Wang, Z. & Petek, H. Coherent multidimensional photoelectron spectroscopy of ultrafast quasiparticle dressing by light. Li, A. et al. Towards full surface Brillouin zone mapping by coherent multi-photon photoemission. Petek, H., Li, A., Li, X., Tan, S. & Reutzel, M. Plasmonic decay into hot electrons in silver. Blöchl, P. E. Projector augmented-wave method. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set. Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Klimeš, J., Bowler, D. R. & Michaelides, A. Van der Waals density functionals applied to solids. Klimeš, J., Bowler, D. R. & Michaelides, A. Chemical accuracy for the van der Waals density functional. Heyd, J., Scuseria, G. E. 
& Ernzerhof, M. Hybrid functionals based on a screened Coulomb potential. Hybertsen, M. S. & Louie, S. G. Electron correlation in semiconductors and insulators: band gaps and quasiparticle energies. Rohlfing, M. & Louie, S. G. Electron-hole excitations and optical spectra from first principles. acknowledges the CAS Project for Young Scientists in Basic Research (YSBR-054). acknowledges the New Cornerstone Science Foundation. acknowledges the NSF grant CHE-2102601 and CHE−1414466, as well as the President's International Fellowship Initiative of CAS. We also appreciate the support from the Innovation Program for Quantum Science and Technology (2021ZD0303302), CAS Strategic Priority Research Program (XDB36020200), and the National Natural Science Foundation of China (22425206, 11904349, 12125408). These authors contributed equally: Guangzhen Shen, Xirui Tian. Hefei National Research Center for Physical Sciences at the Microscale, New Cornerstone Science Laboratory, and Department of Physics, University of Science and Technology of China, Hefei, Anhui, China Guangzhen Shen, Xirui Tian, Xintong Li, Yishu Tian, Xuefeng Cui, Jin Zhao, Bing Wang & Shijing Tan Hefei National Laboratory, University of Science and Technology of China, Hefei, Anhui, China Guangzhen Shen, Xirui Tian, Xintong Li, Yishu Tian, Xuefeng Cui, Jin Zhao, Bing Wang & Shijing Tan School of Physics and Technology, Wuhan University, Wuhan, Hubei, China Limin Cao & Min Feng School of Physics, Zhejiang University, Hangzhou, Zhejiang, China Department of Physics and Astronomy and the IQ Initiative, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
This is joint research conducted at the University of Science and Technology of China (USTC), the University of Pittsburgh (Pitt), and Wuhan University (WHU). initiated the research at Pitt; S.T. supervised the experiments at USTC; M.F. supervised the synthesis of BP crystals at WHU; J.Z. supervised the calculations at USTC. performed the calculations; X.C., X.L., and Y.T. participated in the construction of the 2PP experimental setup; H.P. provided the interpretation of ultrafast Auger scattering and finalized the manuscript; all authors contributed to the discussion. Correspondence to Jin Zhao, Bing Wang, Hrvoje Petek or Shijing Tan. The authors declare no competing interests. Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. A peer review file is available. Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. Shen, G., Tian, X., Cao, L. et al. Ultrafast energizing the parity-forbidden dark exciton in black phosphorus.
Reinforcement learning theory explains human behavior as driven by the goal of maximizing reward. Conventional approaches, however, offer limited insights into how people generalize from past experiences to new situations. Here, we propose refining the classical reinforcement learning framework by incorporating an efficient coding principle, which emphasizes maximizing reward using the simplest necessary representations. This refined framework predicts that intelligent agents, constrained by simpler representations, will inevitably: 1) distill environmental stimuli into fewer, abstract internal states, and 2) detect and utilize rewarding environmental features. Consequently, complex stimuli are mapped to compact representations, forming the foundation for generalization. We tested this idea in two experiments that examined human generalization. Our findings reveal that while conventional models fall short in generalization, models incorporating efficient coding achieve human-level performance. We argue that the classical RL objective, augmented with efficient coding, represents a more comprehensive computational framework for understanding human behavior in both learning and generalization.

Making sense of a dynamic reality requires the ability to generalize; that is, to extract knowledge from past experiences and apply it to new, unseen futures. Effective generalization remarkably improves the capacity of intelligent agents to adapt to rapid changes. Consider a child learning to ride a bike. She makes numerous attempts, falling and adjusting her balance through trial and error.
Once bike riding is mastered, the child can then generalize those balancing skills to ride a scooter, allowing her to quickly master the scooter without having to learn from scratch. Given its importance to adaptive learning, generalization has been the focus of study in both cognitive neuroscience1,2,3 and machine learning4,5,6. Recent research illustrates that representation learning is one of the cornerstones that support generalization7,8,9. Representation learning involves the transformation of raw environmental stimuli or events into robust abstract states (“state abstraction”), which summarize underlying patterns and regularities in the raw data. For example, riding a bike and scooter may be conceptually abstracted into one activity, enabling a child to realize they can transfer balancing skills previously learned from riding a bicycle to a scooter. In addition, effective representations can detect and extract a subset of the most informative and rewarding features within environments (“rewarding feature extraction”). For instance, although bicycles and scooters have distinct designs, their shared feature of having two wheels requires similar balancing skills. Historically, there has been a gap in the theoretical and comprehensive understanding of how to constitute effective representations. Bridging this gap and developing algorithms that learn generalizable representations has become a central pursuit in recent research on human cognitive neuroscience9,10,11,12 and artificial intelligence7,13,14,15,16. This paper focuses on understanding how humans learn effective representations that enhance their generalization abilities. One influential framework for understanding human behavioral learning is reinforcement learning (RL), which views intelligent behavior as seeking to maximize expected reward17,18. 
This framework provides a normative understanding of a spectrum of human learning processes19,20,21,22,23,24,25,26 and offers theories on the underlying neural mechanisms27,28,29,30. However, by itself, the traditional RL framework provides very limited insights into human representation learning and generalization10,20,31,32,33. The framework often assumes a predefined, fixed set of task representations on which learning can operate directly, without the need for additional representation learning17. However, in real-world decision-making, humans are not provided with predefined representations. Instead, they must infer these representations from complex and dynamic environmental observations. Here, we propose augmenting the classical RL theory to incorporate the principle of efficient coding34: while maximizing reward, intelligent agents should use the simplest necessary representations. The origin of this approach lies in the basic fact that the human brain, as a biological information processing system, possesses finite cognitive resources35. The idea of efficient use of cognitive resources has had profound impacts across many domains in psychology and neuroscience, including perception36,37,38, working memory39,40, perceptual-based generalization3, and motor control41. Furthermore, our approach aligns with Botvinick's42 proposal that the efficient coding principle can be instrumental in understanding the representation of problems in learning and decision-making. Our work extends their proposal by concretely operationalizing efficient coding using information theory, providing a calculable measure within the RL framework, and validating this idea on human data. Critically, our proposed approach suggests that, driven by the principle of efficient coding, an intelligent agent can autonomously learn appropriate simplified representations, which enables both state abstraction and the extraction of rewarding features, naturally resulting in generalization. 
To validate these predictions, we designed two experiments focusing on learning and generalization. Participants first learned a set of stimulus-action associations and were then tested on their ability to generalize to a new set of associations they had not encountered before. Human participants displayed strong generalization abilities in both experiments, correctly responding to new associations without additional training. We developed a principled model based on efficient coding and demonstrated its capacity to achieve human-level generalization performance in both experiments—performance that classical RL models have not accomplished. These findings lead us to conclude that generalization is an inherent outcome of efficient coding. Given humans' remarkable capacity for generalization, we assert that the classical RL objective augmented with efficient coding and reward maximization presents a more comprehensive computational objective for human learning. Perceptual-based generalization occurs when two stimuli share a similar appearance1,38,39. Functional-based generalization, in contrast, occurs between stimuli that have similar functions (e.g. linked to the same actions), even when they do not look alike2,43,44,45,46. The latter type of generalization is more complex because it necessitates the acquisition of unseen environmental statistics before it can occur. To investigate both types of generalization, we leveraged the acquired equivalence paradigm2,43,44. This experimental framework first links two visually distinct stimuli with identical actions, then assesses the increase in generalization between these stimuli based on their shared actions. This approach effectively establishes the functional similarity between the two stimuli, enabling a controlled experimental investigation into participants' ability of functional-based generalization. Specifically, participants performed a two-stage task. 
In each trial, participants were shown an alien (stimulus \(s\)) and were told that different aliens preferred to visit different locations. For a given stimulus, participants were required to choose one of two places (action \(a\)) that they believed the alien would prefer to visit (Fig. During the training stage, participants were trained on six stimulus-action associations, each repeated ten times to learn the equivalence between stimuli based on their associated actions (Figs. For example, if aliens s1 and s2 both preferred to visit a desert (a1) rather than a forest (a2), then they are equivalent and the psychological similarity of the two aliens may increase. During the training stage, participants received feedback (reward \(r\), taking a value of either 0 or 1) after every choice. The alien stimuli were used with modifications with permission from Isabel Gauthier and Michael Tarr. The scene stimuli were adopted without modifications from Zhou et al.87. Each response screen displays an alien stimulus, as well as two location pictures representing different actions. B One block contains two stages. The training stage trains three associations with feedback. The testing stage tests an untrained association (dashed line) in addition to the three trained associations without feedback. D Stimuli used in Experiment 1 are designed to be the same color but with different shapes and appendages to control for perceptual similarity. The four stimuli are referred to as \(x,\,{x}^{{\prime} },{y},{y}^{\prime}\), with \(x\) and \(x^{\prime}\) associating with the same actions, as do \(y\) and \(y^{\prime}\). Each block contains a different type of perceptual similarity. The classical reinforcement learning policy gradient (RLPG) model learns a policy that maps from stimuli \(s\) to a distribution of action \(a\). 
Due to the introduction of representation \(z\), the policies of the cascade policy gradient (CPG) and the efficient coding policy gradient (ECPG) model are broken into an encoder \(\psi\) and decoder \(\rho\). In the testing stage, participants were tested on eight associations: the six trained associations plus two untrained associations that were not presented in the training stage. The untrained associations were used to evaluate people's generalization performance. For example, if the participant learned during the training stage that \({s}_{1}\) and \({s}_{2}\) were similar to each other (had similar preferences), then participants might generalize other preferences from \({s}_{1}\) to \({s}_{2}\), even though no feedback was given about those preferences. No feedback was provided during the testing stage, and each association was repeated six times. To quantify human generalization ability, we calculate the “untrained accuracy”, which is the response accuracy for the untrained associations that were not presented during training. Similarly, “trained accuracy”, the response accuracy for trained associations that were presented in the training stage, serves as a measure of human learning performance. Both metrics are crucial and will be used extensively throughout this paper. All data were collected online via Amazon Mechanical Turk. David Marr47 famously argued that the human brain can be understood at three levels: the computational level, which defines the goals to be achieved; the algorithmic level, which details the specific algorithms the human brain uses to reach these goals; and the implementational level, which describes how these algorithms are physically realized. In psychology and cognitive science, researchers often build models at the algorithmic level.
They typically postulate specific cognitive mechanisms within the human brain, describe these mechanisms using computer programs, and demonstrate their explanatory power over human behavioral data45,46,48. However, the question of whether the human brain reconstructs efficient representations for task stimuli is situated at the computational level. Therefore, we need to construct models at this same level. Concretely, we formalized our hypotheses (with or without efficient coding) as distinct computational goals, each addressed using the simplest possible algorithm. Unlike algorithmic-level models, computational-level models do not presume specific mechanisms; instead, these mechanisms naturally emerge during the process of achieving the defined computational goal. Thus, computational-level models not only explain human behaviors but also shed light on the potential cognitive mechanisms underlying these behaviors, thereby demonstrating superior explanatory power over algorithmic-level models. First, we established a classical RL baseline, named Reinforcement Learning Policy Gradient (RLPG; see Fig. 1F and “Method-Models-RLPG”), which assumes that humans do not learn simplified representations. The computational goal is formulated as follows: On each trial, an agent had to choose between two possible actions, each with a 50% chance of being correct. Prior to making a decision, the agent was expected to have a baseline reward expectation of \(b\) = 0.5. This baseline was used to evaluate the “goodness” of the actual reward received. A reward was considered positive if it exceeded the agent's expectation, and negative otherwise. Second, we developed an Efficient Coding Policy Gradient model (ECPG; Fig. 1F and “Method-Models-ECPG”), which posits that humans learn simpler representations through efficient coding. The challenge in modeling this principle lies in defining the complexity (or simplicity) of representations. 
Recent studies on human perception have conceptualized perception as an information transmission process, where an encoder transmits environmental sensory signals (\(s\)) into internal representation (\(z\))3,39,40. These studies measure the complexity of representations by the amount of information transmitted by the encoder, quantified by the mutual information between stimuli and representations \({I}^{\psi }({S;Z})\). Based on these works, the computational goal of efficient coding is formalized as maximizing reward while minimizing the representation complexity, with the simplicity parameter \(\lambda\) weighting the complexity term. When \(\lambda\) = 0, the agent does not compress stimuli representations for simplicity, and the efficient coding goal reduces to the RL goal. Conversely, as \(\lambda \to \infty\), the agent learns the simplest set of representations, encoding all stimuli into a single, identical representation. Therefore, the optimal \(\lambda\) should be a moderate value that compresses representations without oversimplifying them. Due to the introduction of latent representation \(z\), the policy needs to be broken down into an encoder, \(\psi\), and a decoder, \(\rho\), which are simultaneously optimized according to Eq. To test whether humans learn compact representations, the establishment of the RLPG and ECPG models would typically be sufficient, because the contrasting hypotheses they represent (RLPG stands for “No”, ECPG stands for “Yes”) together cover the entire hypothesis space. One concern, however, is that the introduction of the representation in ECPG has changed the model architecture, potentially introducing confounding factors. To control for these confounders, we implemented a third model, Cascade Policy Gradient (CPG; “Method-Models-CPG”), which also supports the non-efficient coding hypothesis. This model serves as an intermediary between the RLPG and ECPG models, optimizing for the classical RL objective while concurrently updating the representations. 
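The representation complexity \({I}^{\psi }({S;Z})\) described above can be made concrete for a small discrete encoder. The following is a minimal sketch, not the authors' implementation: the function names, the uniform stimulus distribution, and the Lagrangian form "expected reward minus \(\lambda\) times complexity" are assumptions chosen to match the verbal description of the trade-off.

```python
import math

def mutual_information(encoder, p_s=None):
    """I(S;Z) in nats for a discrete encoder psi(z|s).

    encoder[s][z] = psi(z|s); each row is a distribution over z.
    p_s is the stimulus distribution (assumed uniform if omitted).
    """
    n_s, n_z = len(encoder), len(encoder[0])
    if p_s is None:
        p_s = [1.0 / n_s] * n_s
    # Marginal over representations: p(z) = sum_s p(s) * psi(z|s)
    p_z = [sum(p_s[s] * encoder[s][z] for s in range(n_s)) for z in range(n_z)]
    mi = 0.0
    for s in range(n_s):
        for z in range(n_z):
            joint = p_s[s] * encoder[s][z]
            if joint > 0.0:  # skip zero-probability pairs (0 * log 0 = 0)
                mi += joint * math.log(encoder[s][z] / p_z[z])
    return mi

def regularized_objective(expected_reward, encoder, lam):
    """Hypothetical trade-off: reward minus lam times representation complexity."""
    return expected_reward - lam * mutual_information(encoder)

# Three illustrative encoders for four stimuli (x, x', y, y').
one_to_one = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
two_states = [[1, 0], [1, 0], [0, 1], [0, 1]]   # x, x' share a state; y, y' share another
collapsed = [[1.0], [1.0], [1.0], [1.0]]        # every stimulus maps to one representation

for name, enc in [("one-to-one", one_to_one), ("two states", two_states), ("collapsed", collapsed)]:
    print(name, round(mutual_information(enc), 3))
```

Under these assumptions the one-to-one encoder carries log 4 nats, the two-state encoder log 2 nats, and the collapsed encoder 0 nats, which mirrors the two limits described in the text: no compression at \(\lambda\) = 0 and a single shared representation as \(\lambda \to \infty\).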
To ensure that observed behavioral differences result only from optimizing different computational goals, we carefully controlled for all other model components. First, all three models address their computational goals using the same policy gradient approach, where models explicitly learn and maintain a parameterized policy17,49. The method was selected over the more commonly used value function approach in psychology and neuroscience because it introduces a minimum number of parameters, therefore better distilling the computational essence of each computational goal. We chose a threshold of 99% instead of 100% for two reasons: first, to model the perceptual noise present in the human visual system, and second, to prevent gradient vanishing, which is an engineering concern. The RLPG model implicitly assumes perfect discrimination between stimuli and, therefore, does not require the same pretraining as others. Lastly, we used the same model fitting method for all three models, fitting the parameters to each participant separately using maximum-a-posteriori (MAP) estimation (“Method-Model fitting”), based on behavioral data from both the training and the testing stages. In the following sections, we demonstrate that, at the computational level, only the ECPG model—which incorporates representation simplification—can qualitatively account for human generalization behaviors. We also compare the ECPG model to several published algorithmic-level models and show that, even without presuming any specific algorithmic details about cognitive mechanisms, the ECPG model surpasses models with handcrafted cognitive mechanisms in describing human behavior. Overall, our findings show that integrating efficient coding into the classical RL objective provides a more comprehensive computational framework for understanding human learning and generalization. Experiment 1 studies human generalization using the standard acquired equivalence paradigm. 
In this setting, the four alien stimuli within each block share the same color but differ in shapes and appendages (Fig. This design allows us to specifically study functional-based generalization, because the perceptual features (color, shape, and appendage) provide no cues for generalization. The proposed efficient coding principle posits that, to achieve simplified representations, an agent must appropriately abstract environmental stimuli into robust latent states. Within each of these abstract states, the stimuli can then mutually generalize. To illustrate this, we simulated the ECPG model at different levels of simplicity, controlled by the parameter \(\lambda\) (0, 0.07, 0.1, 0.2, 0.5), while keeping other parameters constant (see simulation details in “Method-Simulation”). Note that when \(\lambda\) = 0, the ECPG model reduces to the CPG model, which does not employ efficient coding. The simulations first demonstrate that efficient coding drives state abstraction. In Fig. 2A (\(\lambda\) = 0.1), representation complexity decreases significantly from the beginning of training (t = 0) to the end (t = 60). This representation simplification significantly affects the model's internal representations (\(\lambda\) = 0.1). Before training, when representations are complex, each stimulus is encoded in an unstructured way, with a one-to-one correspondence in representation space (Fig. Driven by efficient coding, the ECPG model compresses representations, discarding redundant information and mapping stimuli associated with the same actions into similar representations, forming abstract states (Fig. We quantify the degree of state abstraction using the Silhouette score50, which measures an object's (\(x\)) similarity to its own latent state (\(x^{\prime}\)) relative to other states (\(y\) and \(y^{\prime}\)). 
A score close to 0 indicates poor abstraction, while a score close to 1 indicates strong abstraction: stimuli within each abstract state (\(x\) and \(x^{\prime}\)) are encoded similarly and associate with each other, while stimuli across abstract states (\(x\) and \(y\)) remain distinct. Figure 2B (\(\lambda\) = 0.1) shows the Silhouette score increasing from 0 toward 1, indicating the emergence of stable, meaningful abstract states from the initially unstructured set of representations. A–E are generated by averaging over 400 simulations. A Representation complexity \({I}^{\psi }({S;Z})\) throughout the training process. The cross markers at t = 0, 10, 20, 40, 60 indicate the trials that are sampled for detailed analysis. C The proportion of correct responses for associations that were presented (trained) and not presented (untrained) during the training stage reflects learning and generalization performance, respectively. This figure includes only data from the testing stage. Dashed lines represent the 50% chance level. D Encoders \(\psi ({z|s})\) with \(\lambda\) = 0.1 at the sampled training trials. Each row stands for a categorical distribution that sums to 1. Darker shades indicate higher probability values. E Policies \(\pi ({a|s})\) with \(\lambda\) = 0.1 at the sampled training trials. F Predefined correct associations in the training and testing stages. We further show that stimuli within the same abstract state can generalize to each other. After abstract states stabilize (t > 40), the model begins to decode policies from the structured representations, and changes in representation complexity become more nuanced (Fig. Policies decoded from stimuli within the same abstract state are similar (Fig. 2E, red arrows), illustrating the ECPG model's ability to generalize from training to testing associations. This is reflected by the model's significantly above-chance untrained accuracy (Fig. 
2C, \(\lambda\) = 0.1), despite being exposed to only a subset of the associations during training (Fig. Similar results can be observed when \(\lambda\) is set to 0.2 (Supplementary Note 1.2). Note that the degree of state abstraction is critical; both insufficient and excessive abstraction impair generalization. As \(\lambda\) increases, the model prioritizes representation simplification over reward maximization, resulting in more intense and rapid abstraction (Fig. For lower \(\lambda\) (0 or 0.07), the ECPG model becomes more reward-focused and exhibits little or no reduction in representation complexity (Fig. The insufficient compression prevents the model from associating stimuli that share the same actions, leading to a failure in state abstraction (Fig. Excessive compression, by contrast, results in significant reward loss and unstable state abstraction, as reflected by the oscillating Silhouette score (Fig. Such oversimplified abstraction can be detrimental to both generalization and learning performance (Fig. So far, our theoretical framework has outlined how efficient coding could result in functional-based generalization. To verify whether these principles hold in humans, we collected behavioral data from 165 participants performing two blocks of the standard acquired equivalence task. We fitted all three models to the data and evaluated them using the Bayesian Information Criterion (BIC). The ECPG model achieved the best fit (Fig. 3A), with a stronger advantage in the testing stage, where generalization occurs (Table 1). Some participants' behaviors were poorly captured by the ECPG model, primarily due to their low effort, which resulted in poor learning (Pearson's r(165) = −0.96, p < 0.001, 95% CI = [−0.97, −0.94]) and generalization performance (Pearson's r(165) = −0.52, p < 0.001, 95% CI = [−0.63, −0.40]). As expected, the ECPG model again was ranked first among the three models (PXP > 0.999; Fig. These findings underscore the unique capability of the ECPG model in capturing human learning and generalization performance. 
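For readers unfamiliar with the criterion used in this comparison, the sketch below shows the standard BIC formula, \(k\ln n - 2\ln \hat{L}\), where lower values indicate a better trade-off between fit and parameter count. It is an illustration only: the function names and the trial and parameter counts are hypothetical, and the fitted log-likelihoods in the paper come from the MAP procedure described in “Method-Model fitting”, not from this code.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def random_policy_log_likelihood(n_trials, n_actions=2):
    """Log-likelihood of the choices under a uniform random policy."""
    return n_trials * math.log(1.0 / n_actions)

# Hypothetical participant with 100 two-alternative choices.
n_trials = 100
baseline = bic(random_policy_log_likelihood(n_trials), n_params=0, n_obs=n_trials)

# Hypothetical fitted model: 3 free parameters, mean per-trial choice likelihood of 0.8.
fitted = bic(n_trials * math.log(0.8), n_params=3, n_obs=n_trials)

print(round(baseline, 2), round(fitted, 2))
```

A parameter-free random policy gives a useful reference point, since a fitted model is only informative about a participant if its BIC falls below that baseline.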
Models were fit to all behavioral data of each participant in both training and testing stages. A Models' Bayesian information criterion (BIC) for each participant. Also, see Table 1 for the exact value. B Protected exceedance probability (PXP) tallies for each model. C Change in representation complexity after training. Scatterplot data located above the horizontal dashed line indicate representation expansion, while those below the line indicate representation compression. The RLPG model assumes that environmental stimuli are always perfectly reconstructed and, therefore, yield no change in complexity. Error bars reflect the mean ± standard deviation (SD) across 165 valid participants. D Proportion of correct responses for trained and untrained associations in the testing stage, representing learning and generalization performance, respectively. Only data from the testing stage is included. Error bars reflect the mean ± SD across 165 valid participants. E Proportion of correct responses over the number of trials for each association. To show that only the ECPG model learns simplified representations, we computed the change in representation complexity during training, which was quantified as the difference in the mutual information \({I}^{\psi }({S;Z})\) before and after training (Fig. The ECPG model successfully reduced complexity, indicating that it learned simplified representations as expected. In contrast, the two control models did not compress their representations. Furthermore, we examined the simplicity parameter \(\lambda\) of the ECPG model and found it to be significantly greater than 0 (two-sided t(164) = 4.29, p < 0.001, Cohen's d = 0.33, 95% CI = [0.11, 0.29]) (see Supplementary Note 1.1 for model parameters). This finding suggests that the representation simplification plays an important role in capturing human behaviors. 
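The reported statistics for \(\lambda\) can be cross-checked with a short calculation. For a one-sample t-test, Cohen's d equals the t statistic divided by the square root of the sample size. The sketch below uses hypothetical function names and hypothetical data in the second example, and applies the identity to the reported values t(164) = 4.29 and n = 165.

```python
import math

def one_sample_t(values, mu0=0.0):
    """Return (t statistic, Cohen's d, degrees of freedom) for a one-sample test against mu0."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))  # sample SD
    t = (mean - mu0) / (sd / math.sqrt(n))
    d = (mean - mu0) / sd
    return t, d, n - 1

def cohens_d_from_t(t, n):
    """One-sample identity: d = t / sqrt(n)."""
    return t / math.sqrt(n)

# Reported test on lambda: t(164) = 4.29 with 165 participants.
print(round(cohens_d_from_t(4.29, 165), 2))  # 0.33

# Hypothetical fitted lambda values, only to exercise the function.
t_stat, d, df = one_sample_t([0.1, 0.2, 0.3, 0.4])
```

The result, about 0.33, agrees with the reported effect size, so the t statistic, degrees of freedom, and Cohen's d are mutually consistent.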
Despite receiving no training, the untrained accuracy for human participants is significantly greater than the 50% chance level, though slightly lower than trained accuracy (Fig. This observation, consistent with many prior studies2,43,44, indicates that human participants effectively generalized from prior learning. The ECPG model closely captures this generalization phenomenon, whereas the two control models cannot generalize at all, with the untrained accuracy remaining at the 50% chance level (Fig. More importantly, the ECPG model's strong performance in capturing human generalization did not compromise its explanatory power for human learning behavior. It offers a description as precise as that of the two control models concerning the human learning curve throughout the training stage (Fig. Based on both quantitative and qualitative evidence, we conclude that humans' ability to generalize originates from their computational goal of efficient coding. This process promotes the emergence of abstract latent states, which form the foundational basis for generalization. Experiment 2 extended the standard paradigm to examine both functional-based and perceptual-based generalizations in humans. The experiment featured two primary modifications. First, we manipulated the stimuli's perceptual cues (shape, color, and appendage) to ensure each feature provided a different amount of information about the environment's rewards. We designed three experimental conditions, each with a distinct rewarding configuration (Fig. The control condition, like Experiment 1, only tested the functional-based generalization and the state abstraction ability of an agent. In the conflict condition, stimuli with the same color were associated with different actions, making shapes and appendages the rewarding features, while the color cue yielded a negative reward. These three conditions also indicated three levels of difficulty in rewarding feature extraction. 
In the consistent and conflict conditions, the four stimuli shared two colors, making color cues more frequent and salient. For example, while the “cylinder” shape was associated with rewards twice during training, the color “red” might have been rewarded four times. The second primary modification in Experiment 2 was the incorporation of a probe stimulus during the testing stage; this stimulus was entirely new and had not been encountered during training. This probe was used to assess humans' ability to extract informative and rewarding features at a behavioral level. A more detailed introduction to the use of this probe design follows below, along with the presentation of our model's predictions. We reused the three models from Experiment 1, only adding a feature embedding function to encode perceptual information. Each of the three visual features was encoded into a five-dimensional one-hot code, where each dimension indicated a specific feature value. Each stimulus was represented by a combination of three such codes, concatenated into a 15-dimensional vector to form the model's input. We refer to the models used in Experiment 2 as feature RLPG (fRLPG; “Method-Models-fRLPG”), feature CPG (fCPG; “Method-Models-fCPG”), and feature ECPG (fECPG; “Method-Models-fECPG”) models to highlight their integration of the feature embedding construct. To evaluate the model's feature extraction ability, we analyzed the importance assigned to each feature by perturbing one feature dimension and measuring changes in representations52,53. A larger change in the representations indicated a higher feature importance (see “Method-Perturbation-based feature importance”). Therefore, in this experiment, if a model consistently assigns more importance to the predefined rewarding perceptual cue across all three conditions, we conclude that this model can effectively detect and extract rewarding features. Now that the stage is set, we can focus on answering three central research questions. 
First, does the principle of efficient coding drive a model to extract rewarding features? Second, if so, how can we validate that humans follow this principle in their learning processes? Third, how does the rewarding feature extraction interact with the state abstraction ability examined in Experiment 1? For the first question, we ran simulations and showed that efficient coding does promote rewarding feature extraction. Driven by the need for simpler representations (reflected in a focus on fewer features), the fECPG model (\(\lambda\) = 0.2) must selectively assign more importance to the color cue in the consistent case and less importance when the color becomes unrewarding (Fig. Conversely, in the conflict condition, the model had to first deemphasize the salient color cue, due to its negative rewards, and then reallocate importance to the other features contributing to positive rewards (Fig. The demand for simplicity drives the model to focus on a subset of features, and the goal of maximizing reward ensures that these focused features are rewarding. In contrast, a model without efficient coding (\(\lambda=0\)) cannot adaptively reallocate feature importance during its interaction with the environment. The model exhibits nearly the same feature importance assignment for both consistent and conflict cases (Fig. 4A), indicating its inability to detect rewarding information. It is worth noting that the fECPG model unintuitively predicts that shape and appendage are rewarding features before training. We believe this is caused by our simplistic approach to encoder initialization. We will further elaborate on this point in the discussion section. However, this observation does not undermine our conclusion. All panels are generated by averaging over 400 simulations. A Simulated feature importance along with training of the fECPG model with \(\lambda\) = 0, which collapses to the fCPG model. Note that the “shape” and “appendage” curves always overlap. 
Darker tiles denote higher values. The predictive policies applied to the probe stimuli are visualized in both a heatmap and a bar plot. E Learning and generalization performance for the fECPG model with different levels of simplicity, \(\lambda\) = 0, 0.07, 0.1, 0.2, 0.5. To address the second question, we adopted a “probe” design. In the consistent condition, where color was the most important feature, the probe stimulus should be perceived as similar to stimulus \(x\) (Fig. 4C, consistent, encoder), leading to a response that coincides with the one for stimulus \(x\) (Fig. In this scenario, it is expected that human participants will demonstrate a higher preference for actions \({a}_{1}\) and \({a}_{3}\) when responding to the probe stimuli. Conversely, in the conflict condition where color was neglected, the probe stimulus should be perceived as more similar to stimulus \(y^{\prime}\) (Fig. 4C, conflict, encoder), which should also be reflected in the response (Fig. In the control condition, given the lack of a dominant rewarding feature, the response to the probe stimulus should not show a strong preference, being distributed between those for stimuli \(x\) and \(y^{\prime}\) (Fig. For the third question, we observed that the efficiency of rewarding feature extraction extends or shortens the time it takes to form stable abstract states, influencing the agent's learning and generalization. In the consistent condition, stable abstract states formed quickly (Fig. 4D, consistent), enabling a high degree of generalization (Fig. However, in the control condition, where no salient cue dominates, the model experienced a slower state abstraction process (Fig. Consequently, the time available for policy decoding was reduced, resulting in poorer learning and generalization performance in the conflict case (Fig. To validate these model predictions, we collected behavioral data from 313 participants who each completed three task blocks corresponding to consistent, control, and conflict conditions. 
We fit all three feature-based models and found that both BIC and PXP preferred the fECPG model as the best model for capturing human behavioral data, consistent with the findings from Experiment 1 (Fig. Models were fit to all behavioral data of each participant in both training and testing stages. A Models' Bayesian information criterion (BIC) for each participant. The BICs for a fully random policy are represented by dashed lines. Participants with lower BIC scores generally exhibit better learning and generalization performance. See Table 2 for the exact value. B Protected exceedance probability (PXP) tallies for each model. C Learning and generalization performance of human participants and models for each experimental condition. Dashed lines indicate the 50% chance level. Error bars reflect the mean ± SD across 313 valid participants. More importantly, as the principle of efficient coding predicts, human participants exhibit different levels of generalization across experimental conditions. They achieved high untrained accuracy in the consistent condition, lower in the control condition, and lowest in the conflict condition (Fig. Beyond the overall trend, humans' generalization behaviors are also characterized by high variability. Some participants generalized effectively across all conditions, while others always negatively transferred their knowledge. This variability is also accurately captured by the fECPG model (Fig. 5C, red), but not by the two classical RL models without efficient coding (Fig. To our surprise, the ECPG model shows a significant advantage in predicting human learning performance—an area where classical RL models have traditionally been preferred. Participants demonstrated a more rapid improvement in the consistent condition than in the control and conflict conditions. Models were fit to all behavioral data of each participant in both training and testing stages. 
A Proportion of correct responses over the number of times that each association is shown. Dashed lines split the experiment into training and testing stages. C Correlation between model predictions and human responses to probe stimuli. The annotated value represents Spearman's correlation coefficient under different experimental conditions. The probe design further validated human participants' ability to extract rewarding features as predicted by the efficient coding principle. Human participants' responses to the probe stimuli were consistent with the fECPG model predictions (Fig. 6B, C; Spearman's r > 0.60, p < 0.001 for all conditions; see “Method-Correlation between humans' and models' probe response” for the correlation calculation). In contrast, models without efficient coding, fRLPG and fCPG, failed to replicate such behavioral patterns (see Supplementary Note 1.5 for their probe responses), exhibiting significantly weaker correlations with human behavioral data (Fig. It is important to note that there is a discrepancy between our prediction and human behavior in the control condition. Human participants were likely to use the policy of stimulus \(x\) rather than a random policy in response to the probe. This phenomenon could have arisen from two potential factors. First, participants learned two associations with stimulus \(x\) and one with stimulus \(y^{\prime}\) (with the other association tested in the testing stage), which implies that stimulus \(x\) was shown twice as frequently as stimulus \(y^{\prime}\). Consequently, participants might have adaptively adjusted their encoding and decision-making based on these statistics and paid more attention to stimulus \(x\). Second, the color feature might have been inherently more salient to humans. When the three features were equally informative, participants may have naturally prioritized the color feature. 
However, this gap does not undermine our conclusion that the fECPG model best captures human participants' responses to the probe stimulus. All evidence leads to one conclusion: during learning, humans strive to distill representations into their simplest and most essential forms. Driven by this goal, humans learn representations using a small subset of rewarding features within their environments. They further simplify these representations by abstracting them into compact, lower-dimensional internal states, which naturally leads to generalization. A potential argument is that the classical RL objective is still sufficient to explain human behavior once it is augmented with cognitive mechanisms at the algorithmic level. We oppose this view for two reasons. First, a range of current algorithmic-level models fail to capture human behaviors as effectively as the ECPG model (as detailed below). Second, the mechanisms embedded in these models inherently simplify representations, essentially pursuing efficient coding. We developed and compared three algorithmic-level models (Fig. The first model, the Latent Cause model45,46,54,55 (LC; “Method-Models-LC”) employs a hierarchical nonparametric Bayesian process to simulate human state abstraction. During the learning period, the LC model categorizes observed stimuli into latent clusters and learns the decision policy for these clusters. The second model, called the Memory-Association model (MA; “Method-Models-MA”), memorizes all stimuli and their preferred actions, establishing associations between stimuli that share the same actions. These associations facilitate the inference of correct actions in untrained tasks, thereby enabling generalization. The third model, Attention at Choice and Learning48,56,57 (ACL; “Method-Models-ACL”) learns the value of each feature and calculates the feature importance based on these values. The model uses a linearly weighted feature value for decision-making. 
Notably, the LC and MA models emphasize state abstraction ability, whereas the ACL model is designed to extract and prioritize rewarding features. Both abilities could emerge by optimizing for the efficient coding goal, but in a different computational formulation. A An overview of models across hierarchical levels. The central research question explores whether the human brain optimizes for efficient coding to enhance generalization. Below this, the ECPG model is contrasted with various algorithmic-level models (LC, MA, ACL), each designed with specific cognitive mechanisms. Additionally, the ECPG model is compared against several common machine learning regularizers (L2PG, L1PG, DCPG) that also aim to reduce model complexity, but through different methods. B Model comparisons for all models in terms of BIC and PXP in Experiment 2. Error bars reflect the mean ± SD across 313 participants. Refer to Supplementary Note 1.3 for the model comparison in Experiment 1. Additionally, see Supplementary Note 3 for an analysis of why other models perform worse in fitting. C Learning and generalization performances across all models in the control case of Experiment 2. Error bars reflect the mean ± SD across 313 participants. See Supplementary Note 1.4 for generalizations in other conditions. D Correlation between model predictions and human responses to probe stimuli under different experimental conditions. See Supplementary Note 1.5 for the bar plots. We tested these algorithmic-level models on Experiment 2, with a focus on two qualitative metrics: generalization in the control case to examine their latent cause abstraction ability (Fig. All three models underperformed the fECPG model in terms of BIC and PXP (Fig. The LC and MA models failed to account for human responses to probe stimuli (Fig. 7D) because they lack a feature extraction mechanism. The ACL model fell short in generalizing in the control case (Fig. 7C) as well as in extracting rewarding features in both the control and conflict cases (Fig. 
7D), because its feature importance calculations cannot deemphasize the negatively rewarded feature effectively (see Supplementary Note 3.2 for further discussion). These results underscore the superior performance of the fECPG model, a computational-level model, in modeling human behaviors and support our hypothesis that human participants learn simplified representations when maximizing rewards. From a machine learning perspective, the fECPG model proposed here defines a regularized optimization objective. This raises a final question: can the efficient-coding term be substituted by other commonly used machine learning regularizers? We implemented an L1-Norm Policy Gradient (L1PG; “Method-Models-L1PG, L2PG, and DCPG”) and an L2-Norm Policy Gradient (L2PG), incorporating L1 or L2 norms as heuristic approximations for representation complexity. While the L1PG model underperformed, the L2PG model showed comparable performance to fECPG (Fig. Although a substantial portion (~36%) of participants were better described by the L2PG model, these participants displayed distinct behavioral dynamics: compared to participants better captured by the fECPG model, they tended to learn more slowly and showed weaker generalization (see Supplementary Note 3.3 for further details). This suggests that the ECPG model has a unique capability in capturing humans' fast learning and strong generalization patterns. For completeness, we also tested a Random Regularizer Policy Gradient (RNDPG; “Method-Models-RNDPG”), which injects noise into the encoder weights58, as well as a Decoder Complexity Policy Gradient (DCPG), which constrains decoder complexity. However, both models failed to generalize in the control condition or to extract rewarding features (Fig. Finally, we validated our conclusions by performing a model recovery analysis to test our ability to differentiate between models (Supplementary Note 1.6). 
Importantly, we found that the ECPG model can be uniquely distinguished from the other models. The low false positive rate (with other models unlikely to be misidentified as ECPG) indicates that the ECPG model's superior performance over the control models is not due to its expressiveness but to its accurate description of human behavior. Thus, these findings support our conclusion that the ECPG model, with its efficient coding-augmented RL objective, best accounts for human learning and generalization. The classical RL framework has limitations in terms of its ability to explain human representation learning and generalization. In this paper, we proposed augmenting the classical RL objective with the efficient coding principle: an intelligent agent should distill the simplest necessary representations that enable it to achieve its behavioral objectives. A computational-level model derived from the revised framework (Efficient Coding Policy Gradient; ECPG) predicts that an intelligent agent automatically learns to construct representations from a small set of rewarding features within the environment. These representations are further simplified by abstracting them into compact, lower-dimensional internal states, which naturally results in generalization. These predictions were validated in two behavioral experiments, where the ECPG model consistently provided a more accurate description of human behavior than two classical RL models without efficient coding as well as several published human representation learning models. These findings indicate that efficient coding offers a more suitable computational objective for understanding human behavior. In this paper, we examine whether the classical RL objective alone, or in combination with efficient coding, better aligns with Marr's computational level in explaining human behavior. 
A potential critique of our approach in section “The human brain optimizes efficient coding to enhance learning and generalization” is the lack of comparison with an alternative model capable of generalizing without representation simplification. However, we found no such model in the existing literature. This absence reflects the historical context of the acquired equivalence paradigm on which our study builds. Although generalization within this paradigm has long been documented59, previous explanations (including categorization60, stimulus association2, and selective attention61) are all encompassed by our efficient coding framework. In other words, algorithmic models based on selective attention, for example, inherently implement mechanisms predicted by the computational-level goal of efficient coding. Previous research using the acquired equivalence paradigm has demonstrated that people with schizophrenia62,63, mild Alzheimer's disease64, hippocampal atrophy43, and Parkinson's disease43 exhibit dysfunction in performing acquired equivalence tasks. The ECPG model, which provides a detailed computational representation of human learning and generalization within this paradigm, may offer a framework for investigating the cognitive and neural processes underlying these cognitive anomalies. However, this potential application of the ECPG model remains untested and requires empirical validation through experimental studies. Beyond serving as a better empirical model for human learning, the proposed computational objective could potentially represent a rational strategy (specifically, a resource-rational strategy; see below) for humans. The classical RL objective was designed to maximize expected reward in narrowly defined settings, where agents focus on learning a single, well-defined task17.
However, humans live in more complex and dynamic real-world environments, where decision-making requires agents to generalize effectively from past experiences to earn rewards in unseen scenarios. Moreover, the human brain is innately capacity-constrained35; it has inherent limitations in processing and storing information, which requires the efficient use of cognitive resources. Therefore, learning simpler representations that facilitate generalization is a crucial component in the pursuit of maximizing reward in real-world decision-making. We believe that this insight can also improve learning and generalization in artificial intelligence operating under real-world conditions. The idea of linking RL to efficient coding has been applied to understand learning and generalization in various contexts22,42,65,66,67,68,69. For example, this approach has been shown to better explain monkeys' neural activity in frontal areas65, humans' risky choice behavior67, and meta-level generalization between tasks66. Here, we present a specific formalization of efficient coding using information-theoretic measures. We demonstrate that this approach provides a better empirical description of both human learning and generalization behaviors compared to several alternatives. Our study also helps bridge the gap between representation learning in the human brain and machine learning. In cognitive science, researchers have applied latent cause clustering (LC) and Association-Choice Learning (ACL) models to understand a variety of phenomena9. Selective attention, on the other hand, has been used to explain concept formation72, the evolution of beliefs73, and has received neural evidence from eye-tracking and functional Magnetic Resonance Imaging (fMRI) studies48,56. In machine learning, researchers have focused on how information-theoretic regularizers facilitate an artificial agent performing complex cognitive tasks. 
For example, information-theoretic regularizers may help an agent learn robust state abstractions that enhance learning speed74,75,76 and form a simple but informative world model77,78. Our study demonstrates that in a simple cognitive task, both mechanisms serve the unified objective of minimizing representation complexity, guided by an information-theoretic regularizer. This finding facilitates communication between the two fields and contributes to a unified research framework for understanding both machine and human intelligence. Building on this line of thought, we plan to extend the current framework in future research to more complex task settings, such as multi-step Markov Decision Processes (MDPs), and explore whether complex human behaviors like planning and multi-task learning align with the predictions of information-theoretic regularizers within machine learning. Recent research has suggested that human intelligence is more accurately described by the principle of resource-rationality79,80 than by the classical notion of rationality81. The combination of efficient coding and reward maximization principles applied in this study encapsulates the idea of resource-rationality, with reward maximization representing the notion of rationality and representation complexity representing computational costs. The basic idea is that information transmission in the brain incurs significant metabolic costs, thus minimizing representation complexity (a quantification of average information transmitted into the brain) serves as a reasonable proxy to minimize computational costs82. Notably, while numerous studies have employed resource-rationality to explain deviations from pure rationality in human behavior38,83,84, our research further emphasizes the advantages conferred by the principle, particularly in accounting for state abstraction, rewarding feature extraction, and generalization. 
The experimental protocol was approved by the University Committee on Activities involving Human Subjects at Rensselaer Polytechnic Institute (IRB-2055). Our experiment did not collect any demographic information from participants, including gender.
The experiment included two types of pictures: alien and scene. For the alien pictures, we utilized the “greebles” stimuli reported by Gauthier and Tarr85,86 (http://www.tarrlab.org/). The original greeble stimuli are purple. We created several new variants by modifying their color. Regarding the scene pictures, we sampled from the “Places205” picture database, as reported in Zhou et al.87 (http://places.csail.mit.edu/downloadData.html). In the main experiment blocks, aliens and scenes were organized into sets. Six of these associations were trained during the association stage, while all eight were tested in the testing stage. It is important to note that the stimuli in the AE task are defined to be “superficially dissimilar”. In our experiment, the greeble stimuli within a block were required to have the same color but exhibit mutually different shapes and appendages.
We recruited 302 participants from Amazon Mechanical Turk (MTurk)88. No statistical method was used to predetermine sample size. All participants gave informed consent before starting the experiment. Each participant completed two practice blocks. To ensure a comprehensive understanding of the experiment, participants were required to achieve at least 70% accuracy in the second practice block to progress to the main experimental stage. Those who did not meet this criterion were allowed to repeat the second practice block until they achieved the necessary performance level; otherwise, they could not proceed to the main experiment. Participants received a base payment of $2 plus a bonus of up to $3 based on their response accuracy in this 20-minute experiment.
This project aimed to study generalization within the learning process, meaning participants who did not learn were outside the scope of this study. Consequently, we excluded 137 participants who failed a screening criterion (average accuracy lower than 60% for the last 24 trials, equating to 4 repetitions, in the training stage). All analyses in Experiment 1 were conducted with the remaining 165 qualified participants. Each training trial comprised three screens. Following a 500 ms fixation screen, the trial presented an alien stimulus in the upper middle of the screen, along with photographs of scenes, offering one correct and one incorrect choice. Participants were asked, “Which scene is associated with this alien?” and responded by pressing the “F” or “J” key. The left-right order of the choices was counterbalanced across trials. The stimulus screen remained visible for ten seconds, followed by a one-second feedback screen displaying either “Correct! The test stage trials were identical, except that no feedback was provided after responses. Each block consisted of a training stage, during which participants learned the stimulus-action associations, and a testing stage, during which participants were tested on the learned associations as well as an untrained generalization probe (the dashed associations). The training stage involved each association being trained ten times with feedback, resulting in 6 (associations) × 10 (repetitions) = 60 training trials. The testing stage tested both the trained and untrained associations six times, resulting in 8 (associations) × 6 (repetitions) = 48 testing trials. Participants were explicitly informed of the transition between the two experimental stages, and they were also reminded to keep and reapply their training experiences to achieve better performance. Before the main blocks, each participant was required to complete two practice blocks.
The first practice block contained a simple trial-and-error learning task, where participants were trained to learn the correct answer through feedback. They were asked to correctly associate \(x-{a}_{1}\) and \(y-{a}_{2}\) without being asked to build any between-stimuli equivalence. This block provided a gentle introduction to the experiment, with ten trials and unlimited response time. The second practice block served as a quiz. This block included a simplified version of the main training stage, where participants were presented with four stimuli but only required to choose from two actions. It contained 4 (associations) × 10 (repetitions) = 40 training trials. The practice blocks were designed to help participants learn to establish between-stimuli equivalence in preparation for the main experimental blocks and used materials similar to those of the main blocks.
No statistical method was used to predetermine sample size. All participants gave informed consent prior to the experiment. Each participant completed two practice blocks. Those who did not meet this criterion were given the opportunity to repeat the second practice block until they reached the required accuracy. All participants received a $3 base payment plus up to a $4.5 bonus based on their response accuracy in this 30-minute experiment. We filtered the participants' data using the same screening criterion as in Experiment 1. A total of 184 participants were excluded because they did not achieve an average accuracy of 60% for the last 24 trials (equivalent to 4 repetitions) in the training stage. All analyses in Experiment 2 were conducted with the remaining 313 qualified participants. This is because each participant in Experiment 1 completed two identical experimental blocks, while in Experiment 2, participants completed three different blocks, each corresponding to a different experimental condition.
To ensure that each condition in Experiment 2 had an amount of data comparable to Experiment 1, we increased participant enrollment. After completing the same practice blocks as in Experiment 1, participants were required to complete three main experimental blocks: a consistent block, a control block, and a conflict block. The sequence of these blocks was counterbalanced among participants. Within each block, participants were required to complete a 60-trial training stage, which was the same as in Experiment 1. They then entered the testing stage, where they had to respond to eight regular testing associations plus an additional probe stimulus. Consequently, the testing stage comprised 9 (associations) × 6 (repetitions) = 54 trials. Note that, unlike the untrained associations, we did not predefine a correct answer for the probe stimulus. We simply recorded participants' responses and analyzed the response distribution to uncover which feature participants were attending to.
To set the stage, we first formalize a dynamic decision process in the AE paradigm. For consistency, we adopt a notation system similar to that used in the experimental paradigm. We refer to a participant or decision maker as an agent. In each trial \(t\), an agent is presented with an alien stimulus \({s}_{t}\) from the set \(\{x,\,{x}^{{\prime} },{y},{y}^{\prime} \}\). Both the stimulus \(S\) and action \(A\) are defined as categorical variables. The RLPG is a computational-level model. In the AE experiment, an agent was required to choose from two possible actions; before receiving any feedback, each action had a 50% chance of being correct. The agent should have had a baseline estimation of reward, denoted as \(b\), prior to making a decision. An action is considered positive when it yields a reward higher than the baseline and negative when the reward is lower.
We revised Eq. 4 to include this baseline reward estimation \(b\). In this AE task, we assumed \(b=0.5\), corresponding to an expected reward of 0.5 (a reward of 1 with 50% probability). The objective function can theoretically be tackled by any RL algorithm, but we chose a particular approach for its simplicity: the policy gradient method. We assume the policy follows a parameterized softmax distribution, transforming the optimization problem into a search over the parameter \(\phi\). Here, \(\phi\) is a 4-by-4 table (4 stimuli by 4 actions). See Supplementary Note 2.1 for a graphical illustration of the model architecture. Let \(J\left(\phi \right)={\max }_{\phi }E\left[r\left({s}_{t},{a}_{t}\right)-b\right]\); the policy parameters were then updated by gradient ascent on the objective function, \(\phi \leftarrow \phi +{\alpha }_{\pi }{\nabla }_{\phi }J(\phi )\), where \({\alpha }_{\pi }\ge 0\) is the learning rate of policy \(\pi\). This policy learning rate is the only parameter in the RL baseline model. Equation 7 updates the policy via its gradient, which gives the method its name, “policy gradient”. We derived the analytical gradient for both models and verified the derivation using the PyTorch package89. See the supplementary material for the detailed derivation. There are two remarks related to this simple model. First, though not explicitly shown, the RLPG assumes a perfect representation that fully reconstructs the stimulus. If we construct a model that explicitly includes the representation \(z\) and assume that each stimulus \(s\) deterministically maps to a unique representation \(z\), the model nevertheless collapses to the RLPG model described above. Second, the RLPG model introduced in this study behaves similarly to the classic Q-learning model, which is extensively used in psychology25. The most significant advantage of RLPG is its simplicity. This allows for a more effective distillation of the computational essence underlying representation compression.
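The update described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it uses a sampled (REINFORCE-style) gradient estimate instead of the analytical gradient mentioned in the text, and the function names, the learning rate of 1.0, the number of trials, and the toy reward rule are all assumptions made for the example.

```python
import math
import random

N_STIMULI, N_ACTIONS = 4, 4
BASELINE = 0.5  # b in the text: expected reward before any feedback


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rlpg_update(phi, s, a, r, alpha, b=BASELINE):
    """One sampled policy-gradient step on the table phi for trial (s, a, r).

    For a softmax policy, d log pi(a|s) / d phi[s][k] = 1[k == a] - pi(k|s),
    so only row s of the table changes.
    """
    pi = softmax(phi[s])
    for k in range(len(phi[s])):
        grad_log = (1.0 if k == a else 0.0) - pi[k]
        phi[s][k] += alpha * (r - b) * grad_log
    return phi


def simulate(correct_action, n_trials=400, alpha=1.0, seed=0):
    """Train a tabular agent on a toy task (hypothetical reward rule)."""
    rng = random.Random(seed)
    phi = [[0.0] * N_ACTIONS for _ in range(N_STIMULI)]
    for _ in range(n_trials):
        s = rng.randrange(N_STIMULI)
        a = rng.choices(range(N_ACTIONS), weights=softmax(phi[s]))[0]
        r = 1.0 if a == correct_action[s] else 0.0  # reward 1 if correct
        rlpg_update(phi, s, a, r, alpha)
    return phi


phi = simulate(correct_action=[0, 0, 1, 1])
policy = [softmax(row) for row in phi]  # learned pi(a | s), one row per stimulus
```

With the baseline at 0.5, a rewarded choice raises the logit of the chosen action and an unrewarded choice lowers it, so both outcomes move probability toward the correct action.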
The ECPG model is designed with a dual computational goal: to maximize reward while minimizing representation complexity. When \(\lambda=0\), the agent does not compress stimulus representations for simplicity, focusing solely on reward maximization. Conversely, as \(\lambda \to \infty\), the agent learns the simplest set of representations, encoding all stimuli into a single, identical representation. Therefore, an optimal \(\lambda\) balances compression against oversimplification. The introduction of the latent representation \(z\) divides the policy into an encoder, \(\psi\), and a decoder, \(\rho\), both of which are optimized according to Eq. Like the RLPG, we solve Eq. The encoder parameter \(\theta\) is a 4-by-4 table (4 stimuli by 4 representations) and the decoder parameter \(\phi\) is also a 4-by-4 table (see Supplementary Note 2.3.3 for a graphical illustration of the model architecture). We iteratively update the encoder and decoder to optimize Eq. The first two optimization problems were solved using gradient ascent with learning rate parameters, \({\alpha }_{\psi }\) and \({\alpha }_{\rho }\). The prior representation probability \(p(z)\) was updated according to the definition of marginal probability. In practice, we also experimented with updating the prior in the gradient formula but found it made no significant difference in modeling human behavior. Therefore, we adopted the current scheme to reduce the number of free parameters. In this article, \(z\) is a categorical variable that shares the same sample space as the stimulus. The encoder parameters \(\theta\) are initialized by passing the product of an identity matrix and an initial value through a softmax function. We pretrained the encoders to reach 99% discrimination accuracy by tuning the initial value \({\theta }_{0}\). See “Method-Pretrain an encoder” for more details.
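To make the complexity term concrete, the following sketch computes the mutual information \(I(S;Z)\) of a categorical encoder initialized as described above (softmax of an identity matrix scaled by an initial value). It is an illustration under stated assumptions, not the authors' code: the uniform stimulus distribution, the function names, and the exact form of the objective (reward above baseline minus \(\lambda\) times \(I(S;Z)\)) are assumptions inferred from the verbal description.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_encoder(theta0, n=4):
    """psi(z|s): row-wise softmax of theta0 times an n-by-n identity matrix."""
    return [softmax([theta0 if z == s else 0.0 for z in range(n)])
            for s in range(n)]


def marginal(psi, p_s):
    """Prior over representations, p(z) = sum_s p(s) * psi(z|s)."""
    return [sum(p_s[s] * psi[s][z] for s in range(len(psi)))
            for z in range(len(psi[0]))]


def mutual_information(psi, p_s):
    """I(S;Z) in nats for a categorical encoder psi(z|s)."""
    p_z = marginal(psi, p_s)
    total = 0.0
    for s, row in enumerate(psi):
        for z, p in enumerate(row):
            if p > 0.0:
                total += p_s[s] * p * math.log(p / p_z[z])
    return total


def ecpg_objective(expected_reward, psi, p_s, lam, b=0.5):
    """Assumed form: reward above baseline minus lambda-weighted complexity."""
    return (expected_reward - b) - lam * mutual_information(psi, p_s)


p_uniform = [0.25] * 4  # assumption: the four stimuli are equally likely
complexity_flat = mutual_information(init_encoder(0.0), p_uniform)    # fully compressed
complexity_sharp = mutual_information(init_encoder(30.0), p_uniform)  # near-perfect code
```

An encoder that maps every stimulus to the same distribution carries no information about the stimulus, whereas a nearly deterministic one-to-one encoder approaches the maximum of \(\log 4\) nats, which is the range over which \(\lambda\) trades reward against simplicity.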
In summary, the ECPG model has three parameters: an encoder learning rate \({\alpha }_{\psi }\), a decoder learning rate \({\alpha }_{\rho }\), and a simplicity parameter \(\lambda\). The ECPG model's encoder acts as a generative component, similar to the encoder in the beta variational autoencoder (\(\beta\)VAE) as described by Higgins et al.13 but with a categorical hidden layer instead of a continuous Gaussian distribution. This design facilitates the computation of mutual information and the quantification of representation complexity, building upon the work of Lu et al.90.
The feature-based models we developed are extensions of the models previously introduced. The primary difference lies in the incorporation of a feature embedding function \({{{\mathcal{F}}}}\) that maps a stimulus \(s\) onto a set of features \(f\). We crafted a feature embedding function to decompose a greeble stimulus into three distinct features: shape, color, and appendage, using “one-hot encoding” for clear differentiation. The feature function \({{{\mathcal{F}}}}\) encodes each feature of an input stimulus as a one-hot code and concatenates these codes into a 15-dimensional vector \(f\), which serves as the model's input (See Supplementary Fig. We modified the policy of the RLPG model to create fRLPG, a feature-based baseline RL model that does not compress stimuli. This model proposes that visual similarity alone could account for human generalization performance, without the need for a representation compression mechanism. With the feature embedding function \({{{\mathcal{F}}}}\) defined, the policy at trial \(t\) can be expressed in terms of the features \({{{\mathcal{F}}}}\left({s}_{t}\right)\). The parameter \(\phi\) is now a 15-by-4 table. The parameters are updated according to Eq. 7, and actions are selected by sampling from the softmax policy, Eq. We modified the ECPG encoder to include the feature embedding function and preserved the previous ECPG decoder formulation (See Supplementary Fig. The parameter \(\theta\) is now a 15-by-4 table.
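The feature embedding can be illustrated with a short sketch. The text specifies three features and a 15-dimensional vector; the assumption that each feature takes one of five values (3 × 5 = 15), the dictionary input format, and the function names are hypothetical choices made for this example only.

```python
N_VALUES = 5  # assumed number of possible values per feature (3 x 5 = 15)
FEATURES = ("shape", "color", "appendage")


def one_hot(index, size=N_VALUES):
    """Return a one-hot list with a 1 at position `index`."""
    if not 0 <= index < size:
        raise ValueError("feature value out of range")
    return [1 if i == index else 0 for i in range(size)]


def embed(stimulus):
    """Concatenate the one-hot codes of shape, color, and appendage.

    `stimulus` maps each feature name to an integer value index.
    """
    vec = []
    for name in FEATURES:
        vec.extend(one_hot(stimulus[name]))
    return vec


def dot(u, v):
    """Dot product of two embeddings: the number of shared feature values."""
    return sum(a * b for a, b in zip(u, v))


x = {"shape": 0, "color": 2, "appendage": 1}
y = {"shape": 3, "color": 2, "appendage": 4}  # same color as x, nothing else
f_x, f_y = embed(x), embed(y)
shared = dot(f_x, f_y)  # counts the features x and y have in common
```

Because each one-hot segment contributes 1 to the dot product only when two stimuli agree on that feature, the dot product of two embeddings is a simple count of overlapping features, which is why stimuli of the same color look more alike to a feature-based model.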
A new challenge we faced was initializing the encoder parameters for the feature-based model. The previous method, which relied on an identity matrix, was no longer suitable because stimuli with overlapping features naturally appear more similar. For instance, a purple greeble should be more similar to another purple greeble than to a yellow one. To address this, we introduced a new initialization technique. First, we measured the visual similarity between a stimulus \(s\) and all possible stimuli \(z\) (including the stimulus \(s\) itself) by calculating the dot product of their feature embeddings, \({{{\mathcal{F}}}}{\left(s\right)}^{\top }{{{\mathcal{F}}}}\left(z\right)\). Second, we multiplied these similarity scores by a scalar \({\theta }_{0}\) and passed them through a softmax function to form the representation of stimulus \(s\), \(\bar{\psi }\left(z|s\right)=\exp \left({\theta }_{0}{{{\mathcal{F}}}}{\left(s\right)}^{\top }{{{\mathcal{F}}}}\left(z\right)\right)/{\sum }_{{z}^{{\prime} }}\exp \left({\theta }_{0}{{{\mathcal{F}}}}{\left(s\right)}^{\top }{{{\mathcal{F}}}}\left({z}^{{\prime} }\right)\right)\). As before, the initial value \({\theta }_{0}\) is tuned through pretraining. This value controls the perceived similarity between stimuli. When \({\theta }_{0}\) is small, stimuli with overlapping features look similar. Finally, we used the representation \(\bar{\psi }\left(z|s\right)\) as supervised labels to train the model encoder \(\psi \left(z|{{{\mathcal{F}}}}\left(s\right),\,\theta \right)\) by minimizing their cross-entropy loss. The subsequent learning and decision-making processes are consistent with the original ECPG model.
The LC model is an algorithmic-level model, adopted and modified from Gershman et al.54. The central idea of the LC model is to use a non-parametric Bayesian process, the Chinese Restaurant Process, to model the cognitive process of latent-cause clustering. The original model cannot be directly applied to the acquired equivalence task, as it is a model of associative learning, not instrumental learning. We modified the model to learn a stimulus-action value function that allows instrumental learning. See more implementation details in Supplementary Note 2.4.
The MA model is an algorithmic-level model that combines memory and association mechanisms.
It memorizes stimuli-action pairs and forms associations between stimuli with shared actions or features, using these associations to infer actions for untrained tasks. Stimuli with salient shared features (like “color”) are more easily associated. See more implementation details in Supplementary Note 2.5.
The ACL model is an algorithmic-level model that models humans' rewarding feature extraction ability using a linear selective attention mechanism. The model was adapted from Leong et al.56 with two modifications. First, the original ACL model was developed on a different paradigm and could not be directly applied to the current generalization task. We modified the ACL model to include a feature-action value \(Q(f,a)\) design as in Ballard et al.57. Second, instead of using the attention weights calculated from the eye-tracking and functional MRI data, we estimated the attention weight using an attention model. The original authors constructed two types of models to examine the bidirectional relationship between learning and attention. The “choice models” utilize attention data, collected through eye tracking and fMRI, to predict human behaviors. Conversely, the “attention models” use human behavioral data as input and predict the recorded attention data. We chose this approach because our study lacks attention data, so we used the best attention model to provide a reasonable estimation of the attention weights. See more implementation details in Supplementary Note 2.6.
The L1PG, L2PG, and DCPG models have computational goals similar to that of the ECPG model, except that the representation complexity \({I}^{\psi }({S;Z;}\theta )\) terms in Eqs. 8 and 13 were replaced, respectively, by the L1 norm (\({\left|\left|\theta \right|\right|}_{1}\)), the L2 norm (\({\left|\left|\theta \right|\right|}_{2}\)), and the decoder complexity \({I}^{\rho }({Z;A;}\phi )\).
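As a rough illustration of how a norm penalty stands in for the mutual-information term, the sketch below computes the L1 and L2 norms of an encoder table and plugs either one into a regularized objective. The objective form (reward above baseline minus \(\lambda\) times the penalty) and all function names are assumptions for this example; the decoder-complexity variant, which requires a mutual-information computation over \(Z\) and \(A\), is not shown.

```python
import math


def l1_norm(theta):
    """Sum of absolute values of all entries in the table theta."""
    return sum(abs(x) for row in theta for x in row)


def l2_norm(theta):
    """Euclidean norm of all entries in the table theta."""
    return math.sqrt(sum(x * x for row in theta for x in row))


def regularized_objective(expected_reward, theta, lam, penalty, b=0.5):
    """Assumed form: reward above baseline minus lam * penalty(theta).

    `penalty` is any function of the encoder parameters, e.g. l1_norm
    or l2_norm, used here in place of the representation complexity.
    """
    return (expected_reward - b) - lam * penalty(theta)


theta_example = [[3.0, -4.0], [0.0, 0.0]]  # toy table, not fitted values
j_l1 = regularized_objective(1.0, theta_example, lam=0.1, penalty=l1_norm)
j_l2 = regularized_objective(1.0, theta_example, lam=0.1, penalty=l2_norm)
```

Unlike the mutual-information term, these penalties depend only on the magnitude of the weights and not on how distinguishable the resulting representations are.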
The RNDPG replaces the mutual information regularizer \({I}^{\psi }\left({S;Z}\right)\) with a random noise penalty \({R}^{\psi }(\varepsilon )\) on the encoder weights, where \({R}^{\psi }\left(\varepsilon \right)\) is Gaussian noise injected into the encoder weights58. Unlike the other regularized policy gradient models, such as ECPG and L1PG, there is no closed-form solution for the RNDPG model. Instead, we implemented this model using a sampling method. See more implementation details in Supplementary Note 2.7.
AE describes the phenomenon where generalization between two “superficially dissimilar” stimuli increases after they have been paired with the same actions. To ensure their dissimilarity, we selected stimuli that are easily distinguishable by human participants. We operationally defined this dissimilarity by setting a criterion: all four input stimuli must be classifiable with an accuracy of 99%. In order to accurately model human behaviors in the AE task, all models should undergo pretraining to achieve this level of discrimination accuracy. This is feasible because all models' encoders were designed so that they can be initialized once \({\theta }_{0}\) is decided. Given these constructs, we searched for an appropriate \({\theta }_{0}\) that meets the criterion. Solving this optimization problem, we initialized the ECPG and CPG models using \({\theta }_{0}^{*}=\,5.232\).
A standard RL problem considers decision-making as sampling an action from the policy \(\pi \left(a|{s}_{t}\right)\), a categorical distribution over the possible action space. In the AE problem, the possible action space varied from trial to trial. To run both RL-based models in the AE task, we applied a technique called invalid action masking91. The simplest masking is to add a large negative number \(\zeta\) (in this work, \(\zeta=-{10}^{12}\)) to the logits of the actions that are not presented in the current trial.
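A minimal sketch of this masking step is shown below. The function name, the example logits, and the representation of the available actions as a set of indices are illustrative assumptions; only the idea of adding \(\zeta\) to the logits of unavailable actions before the softmax comes from the text.

```python
import math

ZETA = -1e12  # large negative number added to the logits of invalid actions


def masked_policy(logits, available):
    """Softmax over `logits` restricted to the action indices in `available`."""
    masked = [x if k in available else x + ZETA
              for k, x in enumerate(logits)]
    m = max(masked)
    # exp of a hugely negative number underflows to exactly 0.0
    exps = [math.exp(x - m) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Example: four actions exist, but only actions 0 and 1 are shown this trial.
logits = [0.2, -0.1, 1.5, 0.3]
policy = masked_policy(logits, available={0, 1})
```

The masked actions receive zero probability, and the remaining probability mass is renormalized over the actions actually on screen, regardless of how large the logits of the hidden actions are.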
That is, when an RLPG agent needs to choose between \({a}_{1}\) and \({a}_{2}\), we can calculate its renormalized policy by applying the softmax to the masked logits. We then sampled from this renormalized policy to model human decision-making. For the ECPG and CPG models, we masked and re-normalized the decoder for all representations \(z\).
The perturbation-based method calculates the importance of each feature by theorizing that if an agent focuses heavily on a particular feature, then a minor perturbation in that feature might lead to significant changes in the output. This perturbation-based importance has been applied to extract measures of attention from large-scale deep reinforcement learning models in artificial intelligence, and these measures were shown to be similar to human eye-tracking attention data53.
For each model, we estimated its free parameters separately for each subject, using all behavioral data from both the training and testing trials without cross-validation. This approach is consistent with many previous human learning studies, which are often structured with 2 to 4 parallel blocks due to various practical constraints (e.g., refs. 84,93,94,95). Given the insufficient number of blocks, these studies, including ours, do not meet the prerequisites for effective cross-validation. The parameters were estimated via maximum a posteriori (MAP) estimation, where \(N\) is the number of trials for each participant, and \({s}_{i}\) and \({a}_{i}\) are the presented stimuli and human responses recorded on each trial. We selected a very flat prior \(p\left(\xi \right)={{{\rm{Halfnorm}}}}(0,\,50)\) for all parameters with a range of \((0,\infty )\) only to avoid extreme parameter values without biasing estimation. This prior is uninformative yet ensures that parameter estimates remain within a reasonable range. Parameter estimation was performed using the BFGS algorithm, implemented with the minimize function of the Python package SciPy (scipy.optimize.minimize).
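The MAP procedure can be sketched as follows for a single learning-rate parameter. This is an illustrative example rather than the authors' fitting code: the toy tabular policy-gradient likelihood, the log-transform used to keep the parameter positive under the unconstrained BFGS optimizer, the small constant inside the logarithm, the number of restarts, and all function names are assumptions. The calls to scipy.optimize.minimize and scipy.stats.halfnorm are standard SciPy functions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import halfnorm


def softmax(x):
    z = np.exp(x - np.max(x))
    return z / z.sum()


def neg_log_posterior(u, trials, n_stimuli=4, n_actions=4, b=0.5):
    """Negative log posterior as a function of u = log(alpha).

    The prior density is evaluated at alpha itself, so the minimizer in u
    corresponds to the MAP estimate of alpha.
    """
    alpha = float(np.exp(u[0]))
    phi = np.zeros((n_stimuli, n_actions))
    log_lik = 0.0
    for s, a, r in trials:
        pi = softmax(phi[s])
        log_lik += np.log(pi[a] + 1e-12)  # likelihood of the observed choice
        grad_log = -pi                     # d log pi(a|s) / d phi[s]
        grad_log[a] += 1.0
        phi[s] += alpha * (r - b) * grad_log
    log_prior = halfnorm.logpdf(alpha, loc=0, scale=50)
    return -(log_lik + log_prior)


def fit_map(trials, n_starts=5, seed=0):
    """Run BFGS from several random starts and keep the best solution."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        u0 = rng.normal(0.0, 1.0, size=1)  # random start in log space
        res = minimize(neg_log_posterior, u0, args=(trials,), method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
    return float(np.exp(best.x[0])), float(best.fun)


# Hypothetical (stimulus, action, reward) records for one participant.
trials_demo = [(0, 0, 1.0), (1, 1, 1.0), (0, 0, 1.0), (2, 3, 0.0),
               (2, 2, 1.0), (1, 1, 1.0), (3, 3, 1.0), (0, 1, 0.0),
               (2, 2, 1.0), (3, 3, 1.0), (0, 0, 1.0), (1, 0, 0.0)]
alpha_hat, neg_log_post = fit_map(trials_demo)
```

Optimizing over the logarithm of the parameter is one common way to respect the \((0,\infty )\) range with an unconstrained optimizer; other choices, such as bounded optimizers, would serve the same purpose.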
For each participant, we ran the algorithm with 50 different randomly chosen parameter initializations to avoid local minima in the non-convex landscape.
We simulated the ECPG model's learning and generalization behaviors by varying \(\lambda\), while keeping the other two learning rate parameters constant. In Experiment 1, the learning rate of the encoder was fixed at \({\alpha }_{\psi }=40\), and that of the decoder was fixed at \({\alpha }_{\rho }=4\). In Experiment 2, the learning rate of the encoder was fixed at \({\alpha }_{\psi }=8\), and that of the decoder was fixed at \({\alpha }_{\rho }=4\).
For each participant within a block, we calculated the frequency for each action as an estimation of human probe policy. We applied the same method to the simulated data to obtain models' probe policy. Subsequently, we computed the Spearman's correlation between human participants and models based on the probability of selecting actions \({a}_{1}\) and \({a}_{3}\). These two actions sufficiently characterize a policy.
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
The source code for this study is publicly available on GitHub: https://doi.org/10.5281/zenodo.15087038.
Shepard, R. N. Toward a universal law of generalization for psychological science. Shohamy, D. & Wagner, A. D. Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Sims, C. R. Efficient coding explains the universal law of generalization in human perception. Li, F-F. et al. A Bayesian approach to unsupervised one-shot learning of object categories. in proceedings ninth IEEE international conference on computer vision. Asadi, A., Abbe, E. & Verdú, S. Chaining mutual information and tightening generalization bounds. Pensia, A., Jog, V. & Loh, P.-L. Generalization error bounds for noisy, iterative algorithms. in 2018 IEEE International Symposium on Information Theory. & Vincent, P.
We thank Isabel Gauthier and Michael Tarr for granting permission to use their “greebles” stimuli, originally sourced from http://www.tarrlab.org/. We also thank Bolei Zhou et al. for making their data set publicly available at http://places.csail.mit.edu/downloadData.html. This research is supported by the National Natural Science Foundation of China (32441102 [Z.F.]) and the China Postdoctoral Science Foundation (2024M761999 [Z.F.]).

Brain Health Institute, National Center for Mental Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine and School of Psychology, Shanghai, 200030, China

Key Laboratory of Brain-Machine Intelligence for Information Behavior-Ministry of Education, Shanghai International Studies University, Shanghai, China

Department of Cognitive Science, Rensselaer Polytechnic Institute, Troy, NY, USA

Both authors designed research, analyzed data, discussed the results and wrote the paper. The authors declare no competing interests. Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work.
A peer review file is available. Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Open Access: This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Humans learn generalizable representations through efficient coding.