The foundational deep-inelastic scattering experiments at the SLAC linear collider in the late 1960s and early 1970s demonstrated the presence inside the proton of point-like constituents, soon identified with quarks, the elementary particles that interact and are bound inside the proton by gluons, the carriers of the strong nuclear force. It was rapidly clear, and confirmed in detail by subsequent studies, that these point-like constituents, collectively called partons by Feynman^{7}, include the up and down quarks that carry the proton quantum numbers, but also gluons, as well as an infinite number of pairs of quarks and their antimatter counterparts, antiquarks. The description of electron–proton and proton–proton collisions at high momentum transfers in terms of collisions between partons is now rooted in the theory of quantum chromodynamics (QCD), and it provides the basis of modern-day precision phenomenology at proton accelerators such as the Large Hadron Collider (LHC) of CERN^{8} as well as for future facilities including the Electron–Ion Collider^{9}, the Forward Physics Facility^{10} and neutrino telescopes^{11}.

Knowledge of the structure of the proton, which is necessary to obtain quantitative prediction for physics processes at the LHC and other experiments, is encoded in the distribution of momentum carried by partons of each type (gluons, up quarks, down quarks, up antiquarks and so on): parton distribution functions (PDFs). These PDFs could, in principle, be computed from first principles, but in practice even their determination from numerical simulations^{12} is extremely challenging. Consequently, the only strategy available at present for obtaining the reliable determination of the proton PDFs that is required to evaluate LHC predictions is empirical, through the global analysis of data for which precise theoretical predictions and experimental measurements are available, so that the PDFs are the only unknown^{8}.

Although this successful framework has by now been worked through in great detail, several key open questions remain open. One of the most controversial of these concerns the treatment of so-called heavy quarks (that is, those whose mass is greater than that of the proton; *m*_{p} = 0.94 GeV). Indeed, virtual quantum effects and energy–mass considerations suggest that the three light quarks and antiquarks (up, down and strange) should all be present in the proton wavefunction. Their PDFs are therefore surely determined by the low-energy dynamics that controls the nature of the proton as a bound state. However, it is well known^{8,13,14,15} that in high enough energy collisions all species of quarks can be excited and hence observed inside the proton, so their PDFs are nonzero. This excitation follows from standard QCD radiation, and it can be computed accurately in perturbation theory.

However, then the question arises of whether heavy quarks also contribute to the proton wavefunction. Such a contribution is called intrinsic, to distinguish it from that computable in perturbation theory, which originates from QCD radiation. Already since the dawn of QCD, it was argued that all kinds of intrinsic heavy quark must be present in the proton wavefunction^{16}. In particular, it was suggested^{1} that the intrinsic component could be non-negligible for the charm quark, whose mass (*m*_{c} ≃ 1.51 GeV) is of the same order of magnitude as the mass of the proton.

This question has remained highly controversial, and indeed recent dedicated studies have resulted in disparate claims, from excluding momentum fractions carried by intrinsic charm larger than 0.5% at the 4-standard-deviation (4*σ*) level^{17} to allowing up to a 2% charm momentum fraction^{18}. A particularly delicate issue in this context is that of separating the radiative component: finding that the charm PDF is nonzero at a low scale is not sufficient to argue that intrinsic charm has been identified.

Here we present a resolution of this four-decade-long conundrum by providing unambiguous evidence for intrinsic charm in the proton. This is achieved by means of a determination of the charm PDF (ref. ^{3}) from an extensive hard-scattering global dataset, using state-of-the-art perturbative QCD calculations^{19}, adapted to accommodate the possibility of massive quarks inside the proton^{4,20,21}, and sophisticated machine learning techniques^{3,22,23}. This determination is performed at next-to-next-to-leading order (NNLO) in an expansion in powers of the strong coupling, *α*_{s}, which represents the precision frontier for collider phenomenology.

The charm PDF determined in this manner includes a radiative component, and indeed it depends on the resolution scale: it is given in a four-flavour number scheme (4FNS), in which up, down, strange and charm quarks are subject to perturbative radiative corrections and mix with each other and the gluon as the resolution is increased. The intrinsic charm component can be disentangled from it as follows. First, we note that in the absence of an intrinsic component, the initial condition for the charm PDF is determined using perturbative matching conditions^{24}, computed up to NNLO in ref. ^{25}, and recently (partly) extended up to next-to-next-to-next-to-leading order (N^{3}LO; refs. ^{26,27,28,29,30,31,32,33,34}). These matching conditions determine the charm PDF in terms of the PDFs of the 3FNS, in which only the three lightest quark flavours are radiatively corrected. Hence, this perturbative charm PDF is entirely determined in terms of the three light quarks and antiquarks and the gluon. However, the 3FNS charm quark PDF needs not vanish: in fact, if the charm quark PDF in the 4FNS is freely parametrized and thus determined from the data^{4}, the matching conditions can be inverted. The 3FNS charm PDF thus obtained is then by definition the intrinsic charm PDF: indeed, in the absence of intrinsic charm it would vanish^{21}. Thus, unlike the 4FNS charm PDF, which includes both an intrinsic and a radiative component, the 3FNS charm PDF is purely intrinsic. In this work we have performed this inversion at NNLO (ref. ^{25}) as well as at N^{3}LO (refs. ^{26,27,28,29,30,31,32,33,34}), which—as we shall see—provides a handle on the perturbative uncertainty of the NNLO result.

Our starting point is the NNPDF4.0 global analysis^{3}, which provides a determination of the sum of the charm and anticharm PDFs, namely (c^+(x,Q)equiv c(x,Q)+barc(x,Q)), in the 4FNS. This can be viewed as a probability density in *x*, the fraction of the proton momentum carried by charm, in the sense that the integral over all values of 0 ≤ *x* ≤ 1 of *x**c*^{+}(*x*) is equal to the fraction of the proton momentum carried by charm quarks, although note that PDFs are generally not necessarily positive definite. Our result for the 4FNS *x**c*^{+}(*x*, *Q*) at the charm mass scale, *Q* = *m*_{c} with *m*_{c} = 1.51 GeV, is shown in Fig. 1 (left). The ensuing intrinsic charm is determined from it by transforming to the 3FNS using NNLO matching. This result is also shown in Fig. 1 (left). The bands indicate the 68% confidence level interval associated with the PDF uncertainties (PDFU) in each case. Henceforth, we will refer to the 3FNS *x**c*^{+}(*x*, *Q*) PDF as the intrinsic charm PDF.

The intrinsic (3FNS) charm PDF exhibits a characteristic valence-like structure at large *x* peaking at *x* ≃ 0.4. Although intrinsic charm is found to be small in absolute terms (it contributes less than 1% to the proton total momentum), it is significantly different from zero. Note that the transformation to the 3FNS has little effect on the peak region, because there is almost no charm radiatively generated at such large values of *x*: in fact, a very similar valence-like peak is already found in the 4FNS calculation.

As at the charm mass scale the strong coupling *α*_{s} is rather large, the perturbative expansion converges slowly. To estimate the effect of missing higher-order uncertainties (MHOU), we have also performed the transformation from the 4FNS NNLO charm PDF determined from the data to the 3FNS (intrinsic) charm PDF at one order higher, namely at N^{3}LO. The result is also shown Fig. 1 (left). Reassuringly, the intrinsic valence-like structure is unchanged. On the other hand, it is clear that for *x* ≲ 0.2 perturbative uncertainties become very large. We can estimate the total uncertainty on our determination of intrinsic charm by adding in quadrature the PDFU and a MHOU estimated from the shift between the result found using NNLO and N^{3}LO matching.

This procedure leads to our final result for intrinsic charm and its total uncertainty (shown in Fig. 1, right). The intrinsic charm PDF is found to be compatible with zero for *x* ≲ 0.2: the negative trend seen in Fig. 1 with PDFU becomes compatible with zero only on inclusion of theoretical uncertainties. However, at larger *x*, even with theoretical uncertainties the intrinsic charm PDF differs from zero by about 2.5*σ* in the peak region. This result is stable on variations of dataset, methodology (in particular the PDF parametrization basis) and values of parameters (specifically the charm mass) of the standard model, as demonstrated in Supplementary Sections C and D.

Our determination of intrinsic charm can be compared to theoretical expectations. Subsequent to the original intrinsic charm model of ref. ^{1} (BHPS model), a variety of other models were proposed^{5,35,36,37,38} (see ref. ^{2} for a review). Irrespective of their specific details, most models predict a valence-like structure at large *x* with a maximum located between *x* ≃ 0.2 and *x* ≃ 0.5, and a vanishing intrinsic component for *x* ≲ 0.1. In Fig. 1 (right), we compare our result to the original BHPS model and to the more recent meson/baryon cloud model of ref. ^{5}.

As these models predict only the shape of the intrinsic charm distribution, but not its overall normalization, we have normalized them by requiring that they reproduce the same charm momentum fraction as our determination. We find remarkable agreement between the shape of our determination and the model predictions. In particular, we reproduce the presence and location of the large-*x* valence-like peak structure (with better agreement, of marginal statistical significance, with the meson/baryon cloud calculation), and the vanishing of intrinsic charm at small *x*. The fraction of the proton momentum carried by charm quarks that we obtain from our analysis, used in this comparison to models, is (0.62 ± 0.28)% including PDFU alone (see Supplementary Section E for details). However, the uncertainty on inclusion of MHOU greatly increases, and we obtain (0.62 ± 0.61)%, due to the contribution from the small-*x* region, *x* ≲ 0.2, where the MHOU is very large (Fig. 1, right). Note that in most previous analyses^{18} (see Supplementary Section F) intrinsic charm models (such as the BHPS model) are fitted to the data, with only the momentum fraction left as a free parameter.

We emphasize that in our analysis the charm PDF is entirely determined by the experimental data included in the PDF determination. The data with the most impact on charm are from recently measured LHC processes, which are both accurate and precise. As these measurements are made at high scales, the corresponding hard cross-sections can be reliably computed in QCD perturbation theory.

Independent evidence for intrinsic charm is provided by the very recent LHC beauty (LHCb) measurements of *Z*-boson production in association with charm-tagged jets in the forward region^{6}, which were not included in our baseline dataset. This process, when measured in terms of the ratio (mathcalR_rmj^rmc) of charm-tagged jets normalized to flavour-inclusive jets, is directly sensitive to the charm PDF (ref. ^{39}), and with LHCb kinematics also in the kinematic region where the intrinsic component is relevant. Following refs. ^{6,39}, we have evaluated (mathcalR_rmj^rmc) at NLO (refs. ^{40,41}; see Supplementary Section G for details), both with our default PDFs that include intrinsic charm, and also with an independent PDF determination in which intrinsic charm is constrained to vanish identically, so charm is determined by perturbative matching (see Supplementary Section B).

In Fig. 2 (top left) we compare the LHCb measurements of (mathcalR_rmj^rmc), provided in three bins of the *Z*-boson rapidity *y*_{Z}, with the theoretical predictions based on both our default PDFs and the PDF set in which we impose the vanishing of intrinsic charm. In Fig. 2 (top right) we also show the correlation coefficient between the charm PDF at *Q* = 100 GeV and the observable (mathcalR_rmj^rmc), demonstrating how this observable is highly correlated to charm in a localized *x* region that depends on the rapidity bin. It is clear that our prediction is in excellent agreement with the LHCb measurements, and in particular for the highest-rapidity bin, which is highly correlated to the charm PDF in the region of the observed valence peak *x* ≃ 0.45, the prediction obtained by imposing the vanishing of intrinsic charm undershoots the data at the 3*σ* level. Hence, this measurement provides independent direct evidence in support of our result.

We have also determined the impact of these LHCb *Z* + charm measurements on the charm PDF. As the experimental covariance matrix is not available, we have considered two limiting scenarios in which the total systematic uncertainty is either completely uncorrelated (*ρ*_{sys} = 0) or fully correlated (*ρ*_{sys} = 1) between rapidity bins. The charm PDF in the 4FNS before and after inclusion of the LHCb data (with either correlation model), and the intrinsic charm PDF obtained from it, are shown in Fig. 2 (centre left and right, respectively). The bands account for both PDFU and MHOU. The results show full consistency: inclusion of the LHCb ({mathcalR}_{rmj}^{rmc}) data leaves the intrinsic charm PDF unchanged, while moderately reducing the uncertainty on it.

In the past, the main indication for intrinsic charm came from EMC data^{42} on deep-inelastic scattering with charm in the final state^{43}. These data are relatively imprecise, their accuracy has often been questioned, and they were taken at relatively low scales for which radiative corrections are large. For these reasons, we have not included them in our baseline analysis. However, it is interesting to assess the impact of their inclusion. Results are shown in Fig. 2 (bottom left), which shows the intrinsic charm PDF before and after inclusion of the EMC data. As in the case of the LHCb data, we find full consistency: unchanged shape and a moderate reduction of uncertainties.

We can summarize our results through their so-called local statistical significance (namely, the size of the intrinsic charm PDF in units of its total uncertainty). This is shown in Fig. 2 (bottom right) for our default determination of intrinsic charm, as well as after inclusion of either the LHCb *Z* + charm or the EMC data, or both. We find a local significance for intrinsic charm at the 2.5*σ* level in the region 0.3 ≲ *x* ≲ 0.6. This is increased to about 3*σ* by the inclusion of either the EMC or the LHCb data, and above if they are both included. The similarity of the impact of the EMC and LHCb measurements is especially remarkable in view of the fact that they involve very different physical processes and energies.

In summary, in this work we have presented evidence for intrinsic charm quarks in the proton. Our findings close a fundamental open question in the understanding of nucleon structure that has been hotly debated by particle and nuclear physicists for the past 40 years. By carefully disentangling the perturbative component, we obtain unambiguous evidence for intrinsic charm, which turns out to be in qualitative agreement with the expectations from model calculations. Our determination of the charm PDF, driven by indirect constraints from the latest high-precision LHC data, is perfectly consistent with direct constraints from both EMC charm production data taken 40 years ago, and very recent *Z* + charm production data in the forward region from LHCb. Combining all data, we find a local significance for intrinsic charm in the large-*x* region just above the 3*σ* level. Our results motivate further dedicated studies of intrinsic charm through a wide range of nuclear, particle and astroparticle physics experiments, such as those accessible at the High-Luminosity LHC (ref. ^{44}) and the fixed-target programmes of LHCb (ref. ^{45}) and A Large Ion Collider Experiment (ALICE)^{46}, to the Electron–Ion Collider, AFTER (ref. ^{47}), the Forward Physics Facility^{48} and neutrino telescopes^{49}.