April 13 2017

Modern Technologies of Solution Nuclear Magnetic Resonance Spectroscopy for Three-dimensional Structure Determination of Proteins Open Avenues for Life Scientists

Abstract

Nuclear magnetic resonance (NMR) spectroscopy is a powerful technique for structural studies of chemical compounds and biomolecules such as DNA and proteins. Since the NMR signal sensitively reflects the chemical environment and the dynamics of a nuclear spin, NMR experiments provide a wealth of structural and dynamic information about the molecule of interest at atomic resolution. In general, structural biology studies using NMR spectroscopy still require a reasonable understanding of the theory behind the technique and experience on how to recorded NMR data. Owing to the remarkable progress in the past decade, we can easily access suitable and popular analytical resources for NMR structure determination of proteins with high accuracy. Here, we describe the practical aspects, workflow and key points of modern NMR techniques used for solution structure determination of proteins. This review should aid NMR specialists aiming to develop new methods that accelerate the structure determination process, and open avenues for non-specialist and life scientists interested in using NMR spectroscopy to solve protein structures.

Article type: Review Article

Keywords: Nuclear magnetic resonance (NMR) spectroscopy, Solution NMR, Automation, Protein, Structure determination, Validation

Authors: Toshihiko Sugiki, Naohiro Kobayashi, Toshimichi Fujiwara

Article links: DOI: 10.1016/j.csbj.2017.04.001 | PubMed: 28487762 | PMC: PMC5408130

Relevance: Moderate: mentioned 3+ times in text

Full text: PDF (1.1 MB)

Introduction

Tertiary structure determination of biomolecules such as nucleic acids and proteins at atomic resolution provides essential insight into the function of bioactive molecules. X-ray crystallography ref. [1] and nuclear magnetic resonance (NMR) spectroscopy ref. [2] have been the primary methods over the past few decades to obtain high-resolution structures. More recently, the rapid technological growth of cryo-electron microscopy ref. [3] has seen this technique emerge as a third major approach to solve biomacromolecular structures at atomic resolution. Solution NMR offers a number of distinct features for structural biology studies: 1) Dynamics of protein folding, structural fluctuations, internal mobility and chemical exchange of target molecules can be investigated over a wide range of timescales (i.e., between picoseconds and sub-millisecond timescales) ref. [4], ref. [5]. The most successful application of NMR to study biomolecular dynamics has been carried out on intrinsic disordered proteins (IDPs) such as zinc finger proteins that usually do not yield crystals, which provide sufficient quality X-ray diffraction data for high-resolution structure determination ref. [6], ref. [7], ref. [8], ref. [9], ref. [10], ref. [11]. Moreover, these proteins are too small or too flexible to obtain strong contrast images by modern cryo-EM analysis. 2) Studies of protein-protein or protein–ligand interactions can be performed under physiological conditions. The affinity and the location of the interaction sites between the target protein and its binding partner molecules can be determined accurately and sensitively even if the interaction is very weak (K_d = ~ mM) by performing simple NMR experiments, e.g., chemical shift perturbation analysis by measuring two-dimensional (2D) heteronuclear single-quantum coherence (HSQC) spectra of target proteins with and without the binding partner, while such an interaction study by co-crystallization or co-immunoprecipitation can be challenging ref. [12], ref. [13], ref. [14]. Therefore, techniques based on using solution NMR can be readily used to screen chemical fragment libraries for preliminary hits, which have the potential to act as seed compounds in drug development. Such approaches usually identify weak binders for the target protein, but represent a powerful starting point for structure-based drug discovery and development studies.

Among the various applications of solution NMR to study protein function, as aforementioned, tertiary structure determination of proteins by NMR remains a steadfast and powerful application of this spectroscopy. Recent development of software and a number of web-based resources, which reduce the burden of complex NMR data analysis, have contributed to the systematic integration of sophisticated semi-automated NMR platforms for structure determination of biomolecules ref. [15], ref. [16], ref. [17], ref. [18], ref. [19]. Such advances, along with improvements in NMR hardware, have lowered the knowledge barrier to facilitate entry into the field of structural biology by NMR for non-specialists. However, only a handful of articles that describe the workflow and practical aspects of protein structure determination by solution NMR spectroscopy, which covers the latest method developments, have been published despite the practicality of this technique.

Workflow of Protein Structure Determination by Solution NMR

The standard approach to protein structure determination by solution NMR comprises several steps: (i) preparation of a protein sample labeled with stable NMR spin-1/2 isotopes, ¹³C and ¹⁵N; (ii) acquisition of NMR spectra; (iii) data processing and assignments of signals; (iv) generation of a chemical shift table that gives the signal assignments derived from the analysis of NMR spectra such that the analyst or an automated program can subsequently assign NOE signals to generate distance or dihedral angle restraints; and (v) simulated annealing (SA) by simplified molecular dynamics calculations with the NMR-based restraints are used for initial structure modeling to obtain an ensemble of structures that satisfy the experimentally determined constraints. These SA calculations are followed by molecular dynamics (MD) simulation with explicit or implicit water using a more sophisticated force field. (vi) The final stage of the workflow is validation of the precision and accuracy of the determined structure (Fig. 1). Details of these steps are explained in the subsections below. In this mini-review, basic and general information for researchers interested in NMR structure determination of proteins will be provided. We will mainly describe general procedures that are routinely used for NMR structure determination of soluble and small-to-medium size (c.a. < 25 kDa) proteins. More advanced techniques required for NMR analysis of more challenging proteins, e.g. membrane proteins or larger proteins (c.a. > 25 kDa), will also be described briefly.

PMC5408130 – f0005 — **Fig. 1:** Workflow of protein structure determination by solution NMR spectroscopy. (A) First, the target protein is uniformly labeled with NMR active isotopes (13C and 15N) and purified by chromatography processes. In many cases, it is essential to optimize the solution composition of the NMR sample (e.g., buffer, pH, type and concentration of salt, and other additives to prevent aggregation and/or to improve thermal stability of the target protein) and NMR parameters (e.g., sample temperature during NMR data collection, strength of static magnetic field, pulse sequences, inversion recovery delay) prior to collecting all the multidimensional heteronuclear NMR experiments. (B) Generally, various sample conditions and NMR parameters are verified by measuring 2D 1H–15N HSQC NMR spectra and/or 2D projection spectra of 3D HNCACB (e.g., 2D HN(CA)CB) until optimal conditions are found. If sample/NMR data collection optimization is performed by only 15N-based NMR experiments, only a 15N-labeled target protein (without 13C-labeling) is required. (C) The NMR signals of all 1H, 13C and 15N nuclei are assigned by analyzing multidimensional heteronuclear NMR spectra using NMR software that facilitates assignment of the signals. The right lower panel of this figure is a screenshot of a NMR signal assignment using the Kujira and MagRO software, developed by Prof. Naohiro Kobayashi (Osaka University, Japan), which is available at PDBj-BMRB http://bmrbdep.pdbj.org/en/nmr_tool_box/magro_nmrview.html. (D) After completing the assignment process, the solution structures of the target protein are determined by performing simulated annealing using NMR-based restraints such as 1H–1H distances and dihedral angles derived from NOEs and chemical shifts, respectively. (E) Finally, the precision and the accuracy of the determined structure are verified statistically and experimentally (e.g., by analyzing the Ramachandran plot and measuring RDCs of the target protein, respectively).

Protein Sample Preparation for NMR Measurements

In the initial NMR studies aimed at solving protein structures, complete assignment of the proton (¹H) NMR signals was achieved by acquiring ¹H NMR spectra, e.g., 2D ¹H–¹H double quantum filtered-correlation spectroscopy (DQF-COSY), 2D ¹H–¹H totally correlation spectroscopy (TOCSY), and 2D ¹H–¹H NOE correlated spectroscopy (NOESY) spectra. These spectra provided sufficient information to obtain ¹H–¹H distance restraints to calculate the tertiary structure for non-isotope labeled peptides and small proteins (c.a. < 10 kDa) ref. [20], ref. [21]. Extension of the ¹H only approach to larger proteins was not possible because of severe signal overlap precluding unambiguous assignment of signals ref. [21]. In the early 1990s, owing to the development of technologies for expressing recombinant proteins using bacteria cultures with medium containing stable isotopes, a large number of target proteins could be uniformly labeled with the stable “NMR active” isotopes, ¹³C and ¹⁵N, which can be detected by NMR measurements. Two- (2D), three- (3D) and four- (4D) dimensional NMR spectra observing signals that correlated ¹H, ¹³C and ¹⁵N nuclei made for dramatic improvements in signal separation and removed signal overlap present in ¹H NMR spectra ref. [21].

The Escherichia coli (E. coli) expression system is most commonly used for stable isotope labeling of proteins, as it has several advantages: easy handling, bacteria cells grow quickly, cost efficiency and the availability of many established methods for protein isotope labeling ref. [22], ref. [23]. For example, the most powerful application of the methods would be site-specific/desired amino acid selective isotope labeling of proteins, which has been firmly established using the E. coli expression system ref. [22], ref. [23], ref. [24], ref. [25]. Alternative ¹³C-labeling of protein is also possible by using particular ¹³C-enriched carbon sources, such as [1-¹³C]-glucose ref. [26], [1,3-¹³C₂]- or [2-¹³C]-glycerol ref. [27], ¹³C-acetate ref. [28], and [1,2-¹³C₂]- or [3-¹³C]-pyruvate ref. [29], ref. [30], ref. [31]. As this technique enables NMR observation of only the desired moiety of the target protein, it is especially useful for large molecular-weight (c.a. > 25 kDa) proteins or IDPs that show severe signal degeneracy.

In situations where an insufficient amount of the recombinant protein is produced by the E. coli expression system, the yeast expression systems, Pichia pastoris ref. [32] or Kluyveromyces lactis ref. [33] as host cells, can be used as an alternative. These systems not only have the advantage of being similar to the E. coli expression system (easy handling, rapid cell growth, and many available and firmly established isotope labeling techniques with relatively low cost), but also offer eukaryotic cell-specific features that facilitate the over-expression and correct folding of particular proteins, e.g., protein requires a complex array of disulfide bonds to form for native folding ref. [22], ref. [23], ref. [34], ref. [35].

Cell-free protein expression systems are another option to overexpress proteins. This approach synthesizes recombinant proteins in vitro in a reaction mixture containing the DNA coding the target protein and a cell lysate containing gene transcription/translation molecular machineries, and, if required, isotope labeled amino acids for protein labeling ref. [36], ref. [37], ref. [38], ref. [39], ref. [40]. Preparation of a fresh cell lysate, which is necessary for synthesizing a sufficient amount of the recombinant protein stably and reproducibly, is challenging. Furthermore, the running cost of such a cell-free system is generally high ref. [23], indicating that it is not suitable for large-scale protein production with stable isotopes. However, this approach is still adequate for amino acid selective isotope labeling or selective introduction of unnatural amino acid residues into the target protein, because amino acid scrambling is relatively mild when compared with other expression systems that use host cells ref. [41], ref. [42], ref. [43], ref. [44], ref. [45], ref. [46]. As an additional remarkable point, reconstructing an isotope-labeled membrane protein into artificial lipid bilayer discs, e.g., nanodiscs, is relatively easy to achieve by cell-free expression ref. [47], ref. [48], ref. [49], ref. [50].

Isotope labeling of recombinant protein using insect or mammalian cells is also possible, although this is generally an expensive method with low yields. Moreover, the available isotope labeling variation is limited when compared with that of other expression systems ref. [51], ref. [52].

In NMR protein structure determination, sample purity should be as high as possible because signal linewidths and protein stability are affected by non-specific contact with impurities. However, NMR spectra can be obtained even if the sample is crude, as demonstrated by studies examining cell lysates or metabolites ref. [53], ref. [54], ref. [55], ref. [56], ref. [57], ref. [58].

The conditions of the protein solution and the NMR experimental parameters should be sufficiently optimized prior to starting multidimensional heteronuclear NMR measurements ref. [59], ref. [60], ref. [61]. In many cases, the protein concentration, hydrogen ion (pH) value of the sample solution, the type of salt/buffer and its concentration and the NMR measurement temperature should be carefully examined by several pilot NMR experiments (typically 1D ¹H or 2D ¹H–¹⁵N HSQC spectra are recorded) with different sample solution compositions and NMR parameters. If an optimal condition is found, the NMR spectra will show sharp (narrow linewidths) and sufficiently well-dispersed signals ref. [59], ref. [61], ref. [62]. To assign backbone ¹H, ¹³C and ¹⁵N signals successfully by the inter-residue chemical shift linking approach, the observation of ¹³C^β nuclei with reasonable chemical shift dispersion and sensitivity is a key point to consider. Pilot measurements of 3D HNCACB and/or 2D HN(CA)CB spectra can be good indicators for judging the adequacy of the solution conditions and potential difficulties of obtaining signal assignments. Perdeuteration of target proteins combined with transverse relaxation optimized spectroscopy (TROSY)-type NMR pulse schemes may be beneficial when the molecular weight of the target protein exceeds 25 kDa (see Section 4).

Although the sensitivity of solution NMR is continuously improving thanks to the development of new NMR pulse schemes and hardware, e.g., an increase in the static magnetic field strength and the innovation of cryogenic probe ref. [63], a higher protein concentration is preferable to obtain NMR signals with higher signal-to-noise ratio and resolution within a reasonable period of NMR instrument use. In particular, for protein structure determination, sufficiently high solubility and stability are typically required. Ideally, the target protein should be a mono-dispersed species at a protein concentration > 0.5 mM and be stable in the NMR instrument for a minimum of 1–5 days.

The aggregation character of the target protein can be monitored by observing broadening of the NMR signals as a function of protein concentration or by investigating the rotational correlation time of the target protein via measurement of the [¹⁵N,¹H]-TRACT ref. [64] or the ¹⁵N T₁/T₂ value ref. [65]. Biochemical wet experiments (e.g., size-exclusion column chromatography) are also useful for investigating the character of the target protein. However, this method cannot address protein aggregation at the protein concentration used for NMR measurements, because the protein is significantly diluted in chromatography ref. [59].

In general, the pH value of the protein solution is selected by considering the theoretically/experimentally determined isoelectric point (pI) of the protein. A lower pH is preferable to reduce the chemical exchange between water ¹H and protein ¹H^N nuclei, because ¹H fast exchange reduces the sensitivity of ¹H^N–¹⁵N correlation signals ref. [66], ref. [67]. In some cases, however, the protein cannot be placed at a low pH because of the pI, solubility, or stability. As buffer components, phosphate buffer (e.g. sodium or potassium phosphate) is an ideal choice because it does not have ¹H nuclei ref. [61]. As a character of buffer conductivity and its effect on the sensitivity of NMR signal detection using cryogenic probes, however, other buffers such as HEPES- or MES-NaOH would be more preferable ref. [68].

The sample temperature during NMR data collection is often room temperature (typically 25 °C). However, a higher sample temperature should give better spectra unless such temperatures affect the structure and stability of the target protein (up to 40 °C when a cryogenic probe is used). Higher temperatures offer sharper signals by decreasing solvent viscosity and thus the rotational correlation time of the protein ref. [59], ref. [61].

Some types of chemical additives can increase the solubility and/or stability of proteins ref. [59], ref. [61]. For example, a reducing agent (e.g., dithiothreitol (DTT), tris(2-carboxyethyl)phosphine(TCEP)-HCl) may be essential for preventing protein oligomerization caused by inter-molecular nonspecific disulfide bond formation, when free cysteines exist on the surface of the protein. Multiple well plate-based high-throughput screening methods using fluorescence correlation spectroscopy or thermal shift assays can explore chemical additives that improve protein solubility and stability, which may help to find the best condition without exhaustive sample condition optimization work that consumes limited NMR instrument availability ref. [59], ref. [61].

Note that 5%–10% (v/v) ²H₂O must be added to the NMR sample to provide a lock signal for the static magnetic field passing through the sample solution. In addition, in many cases, a standard chemical shift reference compound is dissolved in the sample solution, as described in the next section.

Acquisition of NMR Spectra and Their Assignment

Initially, ¹H, ¹³C, and ¹⁵N NMR signals arising from the nuclei of the target protein have to be assigned before we can obtain angle and distance restraints for protein solution structure determination. In general, a number of NMR spectra are recorded to obtain a complete set of NMR assignments ref. [69], ref. [70], ref. [71].

For backbone (¹H, ¹³C and ¹⁵N) signal assignments: 2D ¹H–¹⁵N HSQC, 3D HNCA*, 3D HN(CO)CA*, 3D HNCACB, 3D CBCA(CO)NH, 3D HNCO, 3D HN(CA)CO* and 3D HBHA(CO)NH are recorded. The 3D HNCACB and 3D CBCA(CO)NH are the most basic spectra for backbone signal assignments ref. [71]. The asterisk next to the experiment acronym indicates optional spectra required when 3D HNCACB and 3D CBCA(CO)NH spectra are insufficient for completing backbone assignment. The 3D HNCO and 3D HN(CA)CO are useful complements to inter-residue chemical shift linking using 3D HNCACB and 3D CBCA(CO)NH experiments, and also help to eliminate any potential assignment unambiguity that exists. Furthermore, chemical shift assignments of ¹³C^α, ¹³C^β, ¹³C′ and ¹H^α nuclei are important for secondary structure prediction by TALOS or its related programs (e.g., TALOS +, TALOS-N) as described in Section 5.2. In those pulse schemes, except CBCA(CO)NH, ¹H–¹³C–¹⁵N correlations are measured using the out-and-back coherence transfer approach between ¹H and ¹³C/¹⁵N to improve the sensitivity of those low gyromagnetic ratio (γ) nuclei (typically 5–10 fold). First, polarization of proton magnetization is transferred to low-sensitive nuclei ¹³C or ¹⁵N via J-couplings, and then the chemical shifts of those nuclei are recorded. Finally, the magnetization of ¹³C or ¹⁵N is returned to the starting proton, and the chemical shifts of the ¹H nuclei are recorded as the free induction decay (FID). However, nuclei denoted within brackets in the acronym of the pulse sequence participate in the coherence transfer pathway but their chemical shifts are not encoded in those multi-dimensional NMR spectra. Incidentally, those out-and-back coherence transfer schemes starting from the amide proton are also applicable to perdeuterated proteins.

For side chain assignments: 2D ¹H–¹³C constant-time (CT) HSQC (¹³C offset on aliphatic and aromatic regions), 3D ¹⁵N-edited TOCSY-HSQC, 3D CC(CO)NH, 3D H(CCCO)NH, 3D (H)CCH-TOCSY (¹³C offset on aliphatic and aromatic regions), 3D H(C)CH-TOCSY (¹³C offset on aliphatic and aromatic regions), 2D (HB)CB(CGCD)HD, 2D (HB)CB(CGCDCE)HE, ¹³C-edited NOESY-HSQC and ¹⁵N-edited NOESY-HSQC spectra are required. There are many variations of those pulse schemes for time evolution such as CT or semi-CT methods, echo and anti-echo, single or multiple quantum coherence, or TROSY type acquisition. It is preferable that each user selects proper pulse schemes for individual protein samples by performing pilot NMR experiments to collect better quality NMR spectra that have the highest signal-to-noise ratio and resolution. Catalogues of standard pulse programs disclosed by the manufacturer of NMR instruments, such as Bruker (available from their websites), may help determine the best NMR measurements to use.

The minimum number of multi-dimensional NMR spectra required to obtain complete assignment of ¹H, ¹³C, and ¹⁵N signals of proteins using magnetization transfer via inter-nuclear J-couplings is six: 3D HNCACB, 3D CBCA(CO)NH, 3D (H)CCH-TOCSY (aliphatic and aromatic regions) and 3D H(C)CH-TOCSY (aliphatic and aromatic regions) ref. [71]. In practice, however, using NOESY spectra (e.g., ¹³C-edited NOESY-HSQC), which provide a wide range of inter-nuclear distance information between the backbone and side chains, may be helpful, in particular, for assignment of aromatic side chain signals.

The ¹³C- and ¹⁵N-edited NOESY-HSQC spectra are essential for both signal assignments and for generation of ¹H–¹H distance restraints ref. [21], ref. [72]. The mixing time used in NOESY spectra of proteins is generally 80–150 ms, as described in Section 5.1.

In general, when the molecular weight of the target protein exceeds 30 kDa, measurement and assignment of the protein NMR signals become difficult owing to the increasing degeneration and line-broadening of the signals, of which effect is associated with fast transverse magnetization decay ref. [21]. The latter issue is the biggest burden to solution NMR spectroscopy. For example, the transverse relaxation rate of ¹³C^α magnetization is dominated by the dipole–dipole (DD) interaction with the proximate ¹H^α, leading to dramatic sensitivity losses to ¹³C^α signals or signals in multidimensional NMR spectra that transfer magnetization via ¹³C^α. This is particularly problematic when the molecular weight of the target protein exceeds 20 kDa ref. [73], ref. [74]. In this case, the DD relaxation of ¹³C^α magnetization can be suppressed remarkably by substitution of ¹H^α to ²H (perdeuteration) by overexpressing the target protein in ²H₂O media ref. [73], ref. [74]. In addition, ¹H–¹⁵N correlation spectra with narrower signals can be obtained by TROSY-type NMR pulse schemes, which selectively observes the coherence where DD relaxation has been attenuated by the chemical shift anisotropy (CSA) effect ref. [75]. The TROSY effect will be more prominent in combination with a perdeuterated protein and higher static magnetic fields (i.e., ≥ 800 MHz) because cancellation of DD relaxation by the CSA effect is field dependent. However, the benefits from perdeuterated proteins are limited to improving backbone NMR signal assignments and for ¹³C-direct detection NMR experiments ref. [76], ref. [77], ref. [78], ref. [79]. ¹³C-direct detection NMR methods are also applicable to protein structure determination ref. [80]. In many cases, moreover, mild denaturation and regeneration treatments of the deuterated protein sample in a ¹H₂O buffer may be required to back-exchange deuterium to proton at amide positions to regenerate ¹H^N–¹⁵N correlations. However, multi-dimensional NMR experiments that use polarization transfer starting from non-labile protons, e.g. 3D CBCA(CO)NH, cannot be used with deuterated proteins ref. [71].

Prior to starting the assignment of ¹H, ¹³C and ¹⁵N chemical shifts of the target protein, the NMR spectra should be calibrated ref. [81], ref. [82]. ¹H chemical shift calibration (or denoted as “referencing”) uses the temperature-dependency of the ¹H chemical shift of ¹H₂O automatically when processing NMR spectra using the program NMRPipe ref. [83]. In addition, chemical shift calibration using standard compounds that yield ¹H chemical shifts, e.g., 4,4-dimethyl-4-silapentane-1-sulfonic acid-d₆ (d₆-DSS), offer further reliable calibration ref. [81], ref. [82]. Practically, calibration can be achieved by simply measuring the chemical shift value of the major ¹H peak of the trimethyl group of d₆-DSS (expected to appear near 0 ppm) in a 1D ¹H spectrum. It is important for appropriate calibration to acquire the 1D ¹H spectrum with sufficient number of data points, in order to read its peak maximum as precisely as possible. Then, the ¹H carrier offset of the NMR spectra of the target protein is adjusted as the chemical shift value of the ¹H peak of the trimethyl group of d₆-DSS becomes 0.000 ppm. As shown in the BioMagResBank (BMRB) website (http://www.bmrb.wisc.edu/ref_info/cshift.html), the calibrated offset of the ¹³C and ¹⁵N carrier frequencies (Hz) of the target protein are determined by multiplying factors based on the gyromagnetic ratio (γ_13C/γ_1H or γ_15N/γ_1H, corresponding to 0.251449530 or 0.101329118, respectively) to the calibrated ¹H center frequency (Hz) ref. [81], ref. [82]. After chemical shift referencing of the ¹H, ¹³C, and ¹⁵N center frequencies, all of the NMR spectra of the target protein are processed.

It is also possible to obtain the 1D ¹H NMR spectrum of a standard compound by dissolving it in a target protein sample solution as an internal standard. However, the user has to be careful that occasionally nonspecific interactions between the standard compound and the protein can occur. Furthermore, a strong ¹H signal and ¹H noise derived from the standard compound often obstructs analysis of the NMR spectra. Therefore, it is also acceptable to measure the ¹H reference spectrum of the standard compound using a NMR sample only containing standard compound, e.g., 0.5 mM d₆-DSS dissolved in 90% (v/v) H₂O/10% (v/v) ²H₂O, as an external standard.

Chemical shift referencing is mandatory for depositing the NMR data and structure into public databases such as the BMRB and Protein Data Bank (PDB). In addition, chemical shift calibration is important for accurate prediction of secondary structures and backbone dihedral angles by comparing data to chemical shift/structure databases using TALOS or related programs ref. [84], ref. [85], as described in Section 5.2. In any case, it is crucial to obtain reference chemical shift data on a standard compound at an identical temperature and static magnetic field strength to the NMR data collection on the target protein.

The processing of the collected NMR data involves a series of several mathematical conversions from time domain data to frequency domain data. Here, fast Fourier transformation of discretely sampled digital data of the direct and indirect dimensions with adequate shaping by apodization and window functions, linear prediction, phase adjustment and baseline correction are comprehensively achieved by programs such as NMRPipe (https://spin.niddk.nih.gov/bax/software/NMRPipe/) ref. [83].

The chemical shift of the signals arising from the target protein should be assigned as completely as possible. In particular, ¹H assignments are important, as they form the basis of inter-proton distance restraints. The completeness and accuracy of the ¹H assignments will strongly and directly affect the accuracy and convergence of the calculated NMR structure ref. [86], ref. [87], ref. [88], ref. [89], ref. [90]. There are many visualization software packages to examine and analyze the NMR spectra and perform chemical shift assignments, e.g., Sparky (Goddard TD and Kneller DG, SPARKY 3, University of California, San Francisco), CARA (developed by Dr. Keller of the Kurt Wüthrich’s group, http://cara.nmr-software.org/portal/), NMRView ref. [91], and Kujira and MagRO (developed by Prof. Naohiro Kobayashi of the PDBj-BMRB group (Osaka University, Japan), http://bmrbdep.pdbj.org/en/nmr_tool_box/magro_nmrview.html) ref. [15], ref. [17] with NMRView and CcpNmr (http://www.ccpn.ac.uk/). Programs for automated assignment of backbone and side chain signals are also available, such as PINE and PINE-SPARKY, which is included in the new version of Sparky (NMRFAM-Sparky) ref. [92], ref. [93], MARS ref. [94], UNIO (assembly of MATCH/ASCAN/CANDID/ATNOS algorithms) ref. [87], ref. [88], ref. [95], ref. [96], ref. [97] and FLYA ref. [98], which is partially powered by the GARANT algorithm ref. [99]. The basic concept of the automated signal assignment process is matching peaks between experimentally observed and expected ones, which are based on the amino acid sequence of the target protein and magnetization transfer schemes of the recorded NMR spectra ref. [99]. From a practical perspective, successful performance of automated assignment software requires the collection of high-quality NMR spectra that show sufficiently well-resolved signals and the expected number of signals. They typically provide reliable assignment of well-separated signals by linking chemical shifts, which are expected from coherence transfer pathways based on the pulse schemes used for the NMR experiments and topology of chemical structures of amino acid residues. Moreover, auto-assignment programs can offer possible assignment of unassigned signals, which are difficult to assign manually owing to ambiguity of inter-residue chemical shift linking. In many cases, therefore, the user has to manually confirm and carefully correct the assignment made by automated programs using NMR spectrum visualization software, as described above.

NMR Structure Calculation

Distance Restraints

Protein structure determination by simulated annealing uses the ¹H–¹H distance and dihedral angle restraints derived from NMR data ref. [86], ref. [90]. The ¹H–¹H distance information is obtained by estimating the intensity of the ¹H–¹H NOE signals ref. [21]: the intensity of the ¹H–¹H NOE signals is dependent on the mixing time (τ_m), during which ¹H–¹H cross-peaks are generated ref. [21]. However, although build-up rates obtained from linear dependency of the NOE cross-peak intensity on the τ_m correlate with the ¹H–¹H distance when a limited (short τ_m) NOE build-up regime is used, quantitative estimation of ¹H–¹H distances from the intensities of the ¹H–¹H NOE signals becomes impossible over the limited τ_m range because relaxation and spin diffusion also proceed during long mixing times ref. [21], ref. [100]. In general, the τ_m value should be within the initial NOE build-up range (typically 80–150 ms for proteins) ref. [21]. Nonetheless, recently developed innovative methods offering more exact conversion of NOE data to internuclear distance information, Exact NOE (eNOEs), are powerful protocols for accurate and precise protein solution structure determination ref. [101], ref. [102], ref. [103]. The 3D ¹³C- or ¹⁵N-edited NOESY-HSQC spectra are widely used. Following peak picking and integration of the ¹H–¹H NOE signals, the table containing the residue number/atom name/chemical shift/¹H–¹H NOE signal intensity is used as primary input data for simulated annealing ref. [90]. In the classical way to prepare NOE based distance restraints, the NOE peaks are roughly classified into three (or four) groups depending on their signal intensity: strong, medium and weak (and very weak) ref. [86], ref. [90]. In the structure calculation, this information is translated into ¹H–¹H distance ranges of 1.8–2.7, 1.8–3.3, and 1.8–5.0 (and 1.8–6.0) Å ref. [86], ref. [90]. The lower limit of the ¹H–¹H distance, 1.8 Å, corresponds to twice the van der Waals radius of ¹H atoms ref. [86], ref. [90].

Rough classification and wide range (not rigorous) distance restrictions based on NOE data are based on the difficulty of converting the NOE signal intensity to an inter-nuclei distance accurately and directly because the NOE signal intensity is affected by not only the inter-atomic distance, but also various factors such as spin-diffusion and conformational averaging as a result of local structural fluctuations. Because of the mildness of distance restraints based on NOE experiments, a sufficient number of distance restraints are required for accurate and precise protein structure determination (generally speaking more than 10–20 distance restraints per residue, not including intra-residue NOEs) in order to build a sufficient density of NOE-network anchoring ref. [90]. The concept of NOE-network anchoring is based on the consideration that the correctly assigned NOE distance restraints will form a self-consistent NOE set, which is compatible with tertiary structure models of target proteins ref. [87], ref. [88]. It is also important to empower automated NOE assignment by the program CYANA ref. [90] (see Section 6).

In particular, a sufficient number of long-range distance information (5.0–6.0 Å, which are upper limit distance restraints) are critical for precise and accurate structure determination ref. [104]. As the intensity of NOE peaks with important distance information, i.e., long-range, is often weak, optimization of experimental conditions of the NOESY spectra is important to gather a sufficient number of weak NOE signals. Higher sample concentration (> 0.5 mM), lower temperature, lower pH (in the case of ¹⁵N-edited NOESY-HSQC) and stronger static magnetic fields are preferable parameters to ensure observation of weak NOEs. When only a minimal number of NOE signals are observed owing to the poor character of the target protein, e.g. aggregation or signal line broadening owing to averaging of multiple conformation caused by fluctuation in an intermediate-slow NMR time scale regime, long-range distance information can be collected, which may complement the shortage of NOE-based distance restraints, by protein labeling with paramagnetic metal ions or radicals to measure the paramagnetic relaxation enhancement (PRE) effect ref. [105] or the pseudo-contact shift (PCS) ref. [106], ref. [107]. The PCS data provide important NMR-based restraints that describe the global fold of the protein and/or tertiary structure of protein-protein complexes with high accuracy owing to its unique features: PCS provides both long-range distance restraints and angle information of bond vectors in the protein against the tensor frame of the magnetically aligned paramagnetic center molecule.

In addition, the resolution of NOESY spectra significantly affects the quality of the structure determined by NMR ref. [108]. Using the highest possible resolution is appropriate for precise and accurate structure determination; this is especially important when performing automated NOE peak picking and assignments. Algorithms for automated NOE peak picking, e.g., NMRView ref. [91], AUTOPSY ref. [109], ATNOS ref. [88], AUDANA ref. [110], or CYPICK ref. [111], have been used for preparing peak lists to perform automated NMR analysis.

Information about the location of hydrogen bonds can be used as distance restraints ref. [90]. The rate of chemical exchange between the amide proton in the protein and protons in the solvent water can be estimated by performing H/D exchange ref. [112] or CLEANEX-PM ref. [113] experiments. It is possible that amide protons showing a significantly slow exchange rate may be involved in the formation of hydrogen bonds. At a stage of the structure calculation process that the global fold of the protein has been trustfully determined, pair residues forming hydrogen bonds may be deduced from the modeled structure. However, this step must be carefully executed to prevent incorrect assignment of hydrogen bonded residues. More ideally, therefore, pair residues forming hydrogen bonds should be experimentally determined by long-range HNCO measurements ref. [114], ref. [115] or other analogous NMR techniques to avoid incorrect interpretation of the modeled structure. The distances between backbone amide protons and carbonyl oxygen atoms, or backbone nitrogen and carbonyl oxygen, are normally set as 1.8–2.0 or 2.7–3.0 Å, respectively. Since the distance restraints of hydrogen bonding are strong, giving large entropic information that contrasts those of dihedral angles and NOE-based distance restraints, introduction of hydrogen bonds should be applied at a later stage in the structure calculation. In a similar manner, information of the location of intra-molecular disulfide bonds also available as restraints should be entered carefully into a structure calculation, perhaps after an initial model has been calculated using purely distance and dihedral restraints.

Dihedral Angle Restraints

Dihedral angles of the target protein, such as backbone φ(N-C^α), ψ(C^α-C’), ω(N^α-C’) and side-chain χ(n), are determined by a wide variety of NMR experiments ref. [86], ref. [90]. These angles are important for defining the secondary structure and the side-chain conformation of the target protein.

According to the Karplus equation, a dihedral angle can be estimated by measuring the ³J-coupling constant ref. [116]. For example, the ³J-coupling constants of each residue can be directly measured by NMR experiments such as 3D HNHA ref. [117], ARTSY-J ref. [118] and 3D HNHB ref. [119]. However, acquiring these experiments with high precision and accuracy cannot be achieved easily because the sensitivity of these NMR experiments that measure ³J-couplings is relatively low. In those pulse schemes, long evolution and refocusing periods (total ~ 50 ms in the case of 3D HNHA) are required to measure the rather small ³J-coupling constants (< 10 Hz), which would be especially difficult to measure accurately with a protein that gives rise to broad signals owing to, for example, molecular size or chemical exchange processes ref. [118]. As an alternative approach, dihedral angles and the secondary structure of proteins are predicted by the programs TALOS (and related versions) ref. [84], ref. [85], CamShift ref. [120] and SHIFTX2 ref. [121]. These programs use prior chemical shifts statistics, including ¹H^N, ¹⁵N, ¹³C^α, ¹³C^β, ¹H^α and carbonyl ¹³C nuclei, with corresponding structure coordinates.

Dihedral angle information is useful to refine secondary structures in spite of their application in simulated annealing as relatively mild restraints (lower and upper limits are normally around +/− 20–40°). For application of the predicted dihedral angles in structure modeling, it is important to use trusted chemical shifts derived from correctly calibrated NMR spectra, as described above.

Automated NOE Assignment and Structure Modeling of Proteins

Programs for automated NOE assignments and molecular modeling, e.g., CYANA ref. [86], ref. [90], ref. [122], Xplor-NIH ref. [123], AUREMOL (http://www.auremol.de/), critical assessment of automated structure determination of proteins by NMR (CASD-NMR) developed by the e-NMR project (http://haddock.chem.uu.nl/enmr/eNMR-portal.html) ref. [19], ref. [124], ref. [125], PONDEROSA-C/S (http://ponderosa.nmrfam.wisc.edu/) developed by the National Magnetic Resonance Facility at Madison (NMRFAM) (http://www.nmrfam.wisc.edu/software.htm) ref. [16], ref. [18], ref. [126] and UNIO (http://perso.ens-lyon.fr/torsten.herrmann/Herrmann/Software.html) ref. [19], ref. [87], ref. [88], ref. [95], ref. [96], ref. [97], are used for solution structure determination of biomolecules such as proteins and nucleic acids. In this mini-review, we will show an example of the program CYANA, which is used widely for NMR structure determination, to explain the general workflow of protein structure calculation. The amino acid sequence data of the target protein, the chemical shift table, and the NOE peak list are required as primary input files to perform automated NMR assignments and molecular modeling using CYANA ref. [86], ref. [90], ref. [122]. In NOE based structure modeling by CYANA, seven steps of iterative calculations including the structure-modeling step for obtaining an initial global fold of the protein are performed. This is followed by refinements of the automated NOE assignments by referencing the calculated intermediate structure model of the protein using the algorithm extended from the program CANDID ref. [90], ref. [122]. The automated NOE assignment process has the advantages of being fast, unambiguous and unbiased ref. [88], ref. [127]. However, in many cases, manual intervention and/or re-assignment of the results of automated NOE assignment are necessary because completeness of the automated process is not perfect, especially when the peak maxima of the NOESY signals do not match ¹H chemical shift assignment table ref. [88], ref. [127]. Several other algorithms, e.g., NOAH ref. [128], ref. [129], ASDP ref. [130], ARIA and its related programs ref. [131], ref. [132], KNOWNOE ref. [133], AutoNOE-ROSETTA ref. [134] and PASD in Xplor-NIH ref. [135], have also been developed for automated NOE assignments and structure calculation. Even if the stereo-specific assignments of methylene protons or methyl groups of leucine or valine residues are ambiguous, in the process of the structure-aided NOE auto-assignment, those signals can be automatically swapped with fair correctness by accounting for the modeled structure just before the final stage in the calculation. A high precision of modeled structures can be strongly supported by random combination of long-range distance restraints and sufficiently condensed NOE-network anchoring generated in the early stage of the CYANA calculation ref. [87], ref. [90].

In particular, when performing structure-aided NOE auto-assignment and structure modeling, the quality of the resultant structures strongly depends on the completeness and correctness of the assigned chemical shits ref. [86], ref. [87], ref. [88], ref. [89]. Notably, a number of spurious NOEs such as noise peaks included in the input can lead to wrong NOE assignments, and thereby result in incorrect modeled structures. These issues have been intensively discussed in a literature ref. [90]; nevertheless, a sufficient number of correct but weak NOE signals providing long-range distance information is a key feature for accurate and precise structure modeling ref. [90]. In contrast, using the conventional approach of manual NOE assignments and generation of distance restraints, violations in the structure calculation caused by restraint violations may act as indicators to determine whether there are problems in the restraints list. Thus, in the conventional way, when systematic violations arise from the presence of incorrect restraints, the structure calculation and revision of restraints have to be repeated exhaustively. Both conventional and automated approaches can lead to endless and repetitive rounds of calculations, and thus a set of criteria is needed to ensure such a cycle is avoided. Previously, the national project Protein 3000 mainly performed by RIKEN solved a large number of structures by solution NMR using the CYANA and Kujira systems (over 1080 structures) ref. [15], ref. [17]. This project revealed a number of benchmarks to achieve when solving structures by NMR: 1) the remaining number of unassigned NOE peaks should be <~5%, which should not be localized to a certain region of the calculated structure; 2) high completeness and accuracy of the assigned chemical shifts of ¹H, ¹³C and ¹⁵N signals (> 90% of completeness); and 3) a sufficient number of NOE signals in the NOESY spectra (> 10–20 NOEs per residue), with a wide intensity range and sufficiently high signal-to-noise ratio and resolution. Using the Kujira and MagRO system, the analysts can easily and visually inspect the accuracy of the assigned chemical shifts and the NOE signal that were automatically assigned by CYANA. In the process, NOE signals showing abnormal line-shapes and/or spurious NOE peaks can be eliminated. The aforementioned strategy should be suitable for structural studies of small proteins that give good-quality NMR spectra, which has been quantitatively evaluated by Kobayashi et al. ref. [15], ref. [17].

Furthermore, a systematic violation found in the structure calculations when there is sufficient NOE network anchoring may aid in identifying inappropriate angle restraints.

A general way to model structures using distance and angle restraints is by simulated annealing, as described above ref. [86], ref. [90]. The protocol of the calculation has been standardized. A completely unfolded state of the target protein, namely at an extremely high temperature (e.g. 10,000 K), is virtually generated as an initial model, and the temperature is gradually cooled to around 0 K to minimize the total potential energy (or target functions) and to satisfy the input NMR restraints. It is well known that because the modeled NMR structures calculated by simulated annealing tend to be trapped into local minima of the energy landscape, the analysts should run multiple simulated annealing calculations with different random seeds to get an ensemble of modeled structures as mentioned below.

The structure calculation by simulated annealing using CYANA is performed using a highly simplified force field with smaller van der Waals radii and without static electric potential energy. In modern NMR structure determination, therefore, the models determined by CYANA are further refined by molecular dynamics calculations with explicit or implicit water model systems using other computer programs powered by advanced force fields, e.g., Xplor-NIH ref. [123], ref. [136], CNS ref. [137], ARIA and related software ref. [131], ref. [132], AMBER ref. [138], OPLS-AA ref. [139] and CHARMM ref. [140].

Conventionally, the final NMR structure is represented by an ensemble of 10–30 lowest-potential energy structures or structures with the lowest number of violations. Superposition of the structural ensemble can be performed manually based on the secondary structures in the determined structures. The automated identification of the ordered region of the determined structures is also possible using particular programs, e.g., NMRCORE ref. [141], FindCore ref. [142], CYRANGE ref. [143] and FitRobot ref. [144]. The resulting structural ensemble can be graphically represented using molecular viewers such as UCSF Chimera ref. [145], CCP4mg ref. [146], VMD-XPLOR ref. [147], PyMol (https://www.pymol.org/) and MOLMOL ref. [148]. The convergence of the modeled structures is estimated by the root-mean-square deviation (RMSD, Å) of backbone and/or heavy atoms of the 10–30 models that represent the ensemble ref. [149], ref. [150]. This RMSD provides an indicator of the precision of the models.

Regions where the structure is determined by a sufficient number of restraints appear well-converged, and therefore the RMSD value in those regions is small. The local diversity of the RMSD values directly correlates to the number of restraints used in the structure calculation. Through superimposed representation of the structural ensemble it is easy to identify regions that poorly converge or are over-restrained.

Poorly converged regions appear partially disordered or flexible. However, other possibilities may explain why the structure in that region could not be determined correctly in the modeled ensemble, i.e., insufficient restraints to force convergence of this region of the protein structure. Indicating whether the non-converged region is really disordered or dynamic in the deposition of the NMR structure into public databases or publication in a journal sometimes cannot be unambiguously stated. Thus, there is no conclusive evidence about whether the poorly converged region undergoes conformational fluctuations, without further investigation of the dynamics, kinetics and stoichiometrical features, which can be performed by NMR relaxation experiments.

Validation of Precision and Accuracy of the Determined Protein Structure

Validation of the modeled structures to assess the precision and accuracy is an important final step in the structure determination process by NMR. There is no guarantee that the protein structure determined by solution NMR with high precision matches the same protein structure determined by a different method ref. [149], ref. [150]. To finalize the structural study by NMR, the analyst should carefully verify and assess both the precision and accuracy of the determined structure based on particular criteria.

As mentioned above, the convergence (precision) of the modeled structures can be generally assessed through the RMSD values of the 10–30 lowest minimal potential energy structures ref. [151].

The accuracy of the modeled structures can be assessed through geometric parameters derived from the coordinates of the lowest energy structures ref. [149], ref. [150], ref. [152]. The most widely used analysis examines bond angles, chirality and the side chain rotamer states, as well as the backbone conformation by creation of a Ramachandran plot. The results of the Ramachandran plot analysis are classified into four groups: 1) most favored, 2) additionally allowed, 3) generously allowed and 4) disallowed. Ideally the number of residues found in the generously allowed and disallowed regions of the Ramachandran plot is zero, or at least less than 10%, indicating that there are only a few geometric issues among the atoms and bonds in the structural ensemble ref. [150]. This kind of analysis can be performed by PROCHECK-NMR ref. [153], ref. [154] or MolProbity ref. [155], and is a requirement when depositing the molecular coordinates into the PDB.

The orientation of bond vectors in a protein, such as ¹H^N–¹⁵N and ¹H^α–¹³C^α, relative to the principal axis of the molecular alignment tensor of the structure can be investigated by measuring residual dipolar coupling (RDC) between the dipolar-coupled nuclei using uniformly ¹⁵N- and/or ¹³C-labeled protein in a weakly aligned medium ref. [152]. The information about the bond vector orientations in a protein can be used to validate the accuracy of the determined structure ref. [152]. The interaction of the dipolar coupling between two spin-1/2 nuclei depends on the angle and distance between the bond vector and their gyromagnetic ratio. In contrast to solid-state NMR, the Hamiltonian derived from the dipolar coupling interaction between two dipolar-coupled nuclei is not observed because it averages to zero because of isotropic tumbling of the protein in solution. When the protein is dissolved together with a medium that orients under the influence of the magnetic field, e.g., a liquid crystalline formed by lipid discs such as bicelles ref. [156], or filamentous phage such as Pf1 ref. [157], ref. [158], the aligned medium will provide some extent of anisotropy to the proteins owing to slight restriction of the isotropic molecular tumbling due to repetitive collisions between the protein and the aligned media ref. [159]. As a result, a small residual value of the dipolar coupling between nuclei can be observed, and the size of the RDC can be tuned by the extent of alignment of the protein.

From the Cartesian coordinates of the aligned protein, it is theoretically possible to predict the alignment tensor of the weakly aligned protein using programs such as PALES ref. [160] or REDCAT ref. [161] by simulation of the predicted dipolar coupling between the bond vector and the principal axes of the estimated alignment tensor of the protein.

RDC values can also be calculated from the determined structure. The back-calculated RDC values from the structure (the RDC data have not been used as restraints in the structure calculation) should show a high correlation (correlation coefficient = 0.8–1.0) with the measured RDCs, confirming that the structure has been correctly determined ref. [162]. Therefore, we would encourage this approach as a means to validate the protein structure, if the target protein is stable in alignment media and a sufficient number of RDCs are measured.

As aforementioned, if the analyst wants to directly apply the RDC values as restraints for structure calculation ref. [159], it is necessary that the alignment tensor of the protein be determined precisely and unambiguously in advance. In a conventional manner, precise and unambiguous determination of the alignment tensor of the protein based on RDCs is impossible without information about the tertiary structure of the target protein. Therefore, conventionally, restraints for bond vector orientations using RDCs are generally used during the final refinement stage in NMR structure calculations ref. [159]. Recently, however, methods for estimating the alignment tensor based solely on a histogram distribution of the experimental RDCs (without information about the tertiary structure) have been developed ref. [163].

There is always a certain possibility that the determined structure with a high convergence is (locally or wholly) incorrect because internuclear distance restraints based on NOE and dihedral angle restraints fundamentally provide local (short-range, < 6 Å) information, which means that those restraints cannot unambiguously restrict the relative orientation between each secondary structure or sub-domains of a protein ref. [159]. In particular, in the cases of a multidomain protein or a protein–peptide complex, accurate determination of the relative orientation between each molecule is difficult when using such short-range restraints, even if the tertiary structure of each component has been determined precisely and accurately ref. [164], ref. [165], ref. [166], ref. [167].When the NMR structures have been refined using RDC data as restraints, as a matter of course, the RDC data may no longer be applied to validation.

Concluding Remarks

Solution NMR spectroscopy is a popular method to determine the tertiary structure of proteins. Surprisingly, however, there are only a handful of articles that describe the workflow of protein NMR structure determination ref. [71], ref. [168], ref. [169], ref. [170]. We anticipate that this mini-review should help scientists who are interested in protein solution structure determination by NMR.

The rapid and steady progress in NMR hardware and software, e.g., CYANA upgrades and automation of NMR signal assignment programs, has continued unabated ref. [171]. Furthermore, recently, integrated NMR software platforms offering systematic and semi-automatic biomolecular structure determination, e.g. PONDEROSA-C/S ref. [16], ref. [18], ref. [126] and UNIO ref. [19], ref. [87], ref. [88], ref. [95], ref. [96], ref. [97], have been developed. Moreover, global fold determination of desired proteins has now become easier owing to the development of programs, e.g. CS-ROSETTA ref. [172], ref. [173], which can determine protein structures using only chemical shifts and RDC data as restraints.

Furthermore, advanced techniques for protein sample preparation (e.g., SAIL technology ref. [42], ref. [174], ref. [175], ref. [176], ref. [177], methyl group-selective ¹H,¹³C-labeling with deuteration of the other non-labile protons ref. [178], ref. [179], ref. [180], ref. [181], ref. [182], ref. [183]) combined with elaborate NMR pulse schemes (e.g., rapid NMR data collection such as the Band Selective Optimized-Flip-Angle Short-Transient (SOFAST), Band-selective Excitation Short-Transient (BEST) ref. [184], ref. [185] and their easy set up/use scripts (http://www.ibs.fr/research/scientific-output/software/pulse-sequence-tools/), ASAP or ALSOFAST ref. [186] methods, the 3D or 4D methyl-methyl NOESY based on high-resolution and diagonal-free HMQC-NOESY-HMQC pulse schemes ref. [181], ref. [187], ref. [188], ref. [189], and the dual- or parallel-FID acquisition approaches ref. [190], ref. [191], ref. [192]), and non-uniform data sampling (NUS) applying NMR data collection of the indirect dimension and quantitative reconstitution of NMR spectra from sparsely sampled data ref. [193], ref. [194], ref. [195], should facilitate the study of challenging proteins by NMR (e.g., membrane proteins, enzymes such as kinase/phosphatase, and supramolecular complexes). Therefore, solution NMR spectroscopy is expected to develop even further as a tool for determining the tertiary structure of macromolecules and large molecular weight protein complexes (ca. > 50 kDa) ref. [196], ref. [197], ref. [198], ref. [199], ref. [200].

In addition, solution NMR spectroscopy is a powerful tool for intermolecular interaction studies and screening of druggable seed compounds with atomic level resolution, because it can detect and determine a binding site, even if the interaction is extremely weak. By performing magnetization relaxation dispersion NMR experiments, tertiary structures of minor, excited or invisible protein conformations can be determined when this conformation is populated by < 1% ref. [201], ref. [202]. Therefore, it is expected that methodological advances of solution NMR spectroscopy, which can determine the tertiary structure of a protein in the bound state accurately and precisely in a weak interaction, will become a unique and beneficial structural biology tool by maximizing the features of solution NMR spectroscopy. Experimentally determined internuclear distances and secondary structure information predicted by TALOS and its related programs ref. [84], ref. [85], CamShift ref. [120] and SHIFTX2 ref. [121] can greatly assist accurate homology based modeling or docking simulations to obtain structural models. The most successful examples using these methods would be HADDOCK ref. [203], ref. [204] and CS-ROSETTA ref. [172], ref. [173], whose feasibility has been generally accepted to yield structural insights that give rational explanations describing the biological functions of biomolecular complexes and models. The integration of the above methods and/or additional structural data such as X-ray scattering and information about intermolecular contacts derived from pull-down and yeast-two-hybrid assays as well as point mutation data, the so-called hybrid/integrative methods, is a new approach to study large biomacromolecular systems ref. [205].

A number of software and web-based resources for NMR data analysis as described in this article are available and user friendly, and should help to open avenues for non-specialist and life scientists interested in studying the structure of proteins.

References

A. Ilari, C. Savino. Protein structure determination by X-ray crystallography. Methods Mol Biol, 2008. [PubMed]
K. Wüthrich. The way to NMR structures of proteins. Nat Struct Biol, 2001. [PubMed]
Z.H. Zhou. Atomic resolution cryo electron microscopy of macromolecular complexes. Adv Protein Chem Struct Biol, 2011. [PubMed]
J.G. Kempf, J.P. Loria. Protein dynamics from solution NMR: theory and applications. Cell Biochem Biophys, 2003. [PubMed]
M. Kovermann, P. Rogne, M. Wolf-Watz. Protein dynamics and function from solution state NMR spectroscopy. Q Rev Biophys, 2016
M.R. Jensen, P.R. Markwick, S. Meier, C. Griesinger, M. Zweckstetter, S. Grzesiek. Quantitative determination of the conformational properties of partially folded and intrinsically disordered proteins using NMR dipolar couplings. Structure, 2009. [PubMed]
L. Salmon, M.R. Jensen, P. Bernadó, M. Blackledge. Measurement and analysis of NMR residual dipolar couplings for the study of intrinsically disordered proteins. Methods Mol Biol, 2012. [PubMed]
S. Gil, T. Hošek, Z. Solyom, R. Kümmerle, B. Brutscher, R. Pierattelli. NMR spectroscopic studies of intrinsically disordered proteins at near-physiological conditions. Angew Chem Int Ed Engl, 2013. [PubMed]
M.R. Jensen, R.W. Ruigrok, M. Blackledge. Describing intrinsically disordered proteins at atomic resolution by NMR. Curr Opin Struct Biol, 2013. [PubMed]
M.R. Jensen, M. Zweckstetter, J.R. Huang, M. Blackledge. Exploring free-energy landscapes of intrinsically disordered proteins at atomic resolution using NMR spectroscopy. Chem Rev, 2014. [PubMed]
A. Abyzov, N. Salvi, R. Schneider, D. Maurin, R.W. Ruigrok, M.R. Jensen. Identification of dynamic modes in an intrinsically disordered protein using temperature-dependent NMR relaxation. J Am Chem Soc, 2016. [PubMed]
A.L. Skinner, J.S. Laurence. High-field solution NMR spectroscopy as a tool for assessing protein interactions with small molecule ligands. J Pharm Sci, 2008. [PubMed]
M.J. Harner, A.O. Frank, S.W. Fesik. Fragment-based drug discovery using NMR spectroscopy. J Biomol NMR, 2013. [PubMed]
Z. Liu, Z. Gong, X. Dong, C. Tang. Transient protein-protein interactions visualized by solution NMR. Biochim Biophys Acta, 2016. [PubMed]
N. Kobayashi, J. Iwahara, S. Koshiba, T. Tomizawa, N. Tochio, P. Güntert. KUJIRA, a package of integrated modules for systematic and interactive analysis of NMR data directed to high-throughput NMR structure studies. J Biomol NMR, 2007. [PubMed]
W. Lee, J.H. Kim, W.M. Westler, J.L. Markley. PONDEROSA, an automated 3D-NOESY peak picking program, enables automated protein structure determination. Bioinformatics, 2011. [PubMed]
N. Kobayashi, Y. Harano, N. Tochio, E. Nakatani, T. Kigawa, S. Yokoyama. An automated system designed for large scale NMR data deposition and annotation: application to over 600 assigned chemical shift data entries to the BioMagResBank from the Riken Structural Genomics/Proteomics Initiative internal database. J Biomol NMR, 2012. [PubMed]
W. Lee, J.L. Stark, J.L. Markley. PONDEROSA-C/S: client-server based software package for automated protein 3D structure determination. J Biomol NMR, 2014. [PubMed]
P. Guerry, V.D. Duong, T. Herrmann. CASD-NMR 2: robust and accurate unsupervised analysis of raw NOESY spectra and protein structure determination with UNIO. J Biomol NMR, 2015. [PubMed]
W. Braun. Distance geometry and related methods for protein structure determination from NMR data. Q Rev Biophys, 1987. [PubMed]
K. Wüthrich. Protein structure determination in solution by NMR spectroscopy. J Biol Chem, 1990. [PubMed]
H. Hiroaki. Recent applications of isotopic labeling for protein NMR in drug discovery. Expert Opin Drug Discov, 2013. [PubMed]
T. Sugiki, T. Fujiwara, C. Kojima. Latest approaches for efficient protein production in drug discovery. Expert Opin Drug Discov, 2014. [PubMed]
H. Hiroaki, Y. Umetsu, Y. Nabeshima, M. Hoshi, D. Kohda. A simplified recipe for assigning amide NMR signals using combinatorial 14N amino acid inverse-labeling. J Struct Funct Genomics, 2011. [PubMed]
R. Rasia, B. Brutscher, M. Plevin. Selective isotopic unlabeling of proteins using metabolic precursors: application to NMR assignment of intrinsically disordered proteins. Chembiochem, 2012. [PubMed]
P. Lundström, K. Teilum, T. Carstensen, I. Bezsonova, S. Wiesner, D.F. Hansen. Fractional 13C enrichment of isolated carbons using [1-13C]- or [2-13C]-glucose facilitates the accurate measurement of dynamics at backbone Calpha and side-chain methyl positions in proteins. J Biomol NMR, 2007. [PubMed]
K. Takeuchi, Z.Y. Sun, G. Wagner. Alternate 13C–12C labeling for complete mainchain resonance assignments using C alpha direct-detection with applicability toward fast relaxing protein systems. J Am Chem Soc, 2008. [PubMed]
M.T. Eddy, M. Belenky, A.C. Sivertsen, R.G. Griffin, J. Herzfeld. Selectively dispersed isotope labeling for protein structure determination by magic angle spinning NMR. J Biomol NMR, 2013. [PubMed]
A.L. Lee, J.L. Urbauer, A.J. Wand. Improved labeling strategy for 13C relaxation measurements of methyl groups in proteins. J Biomol NMR, 1997. [PubMed]
C. Guo, C. Geng, V. Tugarinov. Selective backbone labeling of proteins using 1,2-13C2-pyruvate as carbon source. J Biomol NMR, 2009. [PubMed]
K. Takeuchi, D.P. Frueh, Z.Y. Sun, S. Hiller, G. Wagner. CACA-TOCSY with alternate 13C–12C labeling: a 13Calpha direct detection experiment for mainchain resonance assignment, dihedral angle information, and amino acid type identification. J Biomol NMR, 2010. [PubMed]
R. Daly, M.T. Hearn. Expression of heterologous proteins in Pichia pastoris: a useful experimental tool in protein engineering and production. J Mol Recognit, 2005. [PubMed]
T. Sugiki, I. Shimada, H. Takahashi. Stable isotope labeling of protein by Kluyveromyces lactis for NMR study. J Biomol NMR, 2008. [PubMed]
H. Takahashi, I. Shimada. Production of isotopically labeled heterologous proteins in non-E. coli prokaryotic and eukaryotic cells. J Biomol NMR, 2010. [PubMed]
T. Sugiki, O. Ichikawa, M. Miyazawa-Onami, I. Shimada, H. Takahashi. Isotopic labeling of heterologous proteins in the yeast Pichia pastoris and Kluyveromyces lactis. Methods Mol Biol, 2012. [PubMed]
T. Kigawa, T. Yabuki, Y. Yoshida, M. Tsutsui, Y. Ito, T. Shibata. Cell-free production and stable-isotope labeling of milligram quantities of proteins. FEBS Lett, 1999. [PubMed]
K. Ozawa, S. Jergic, J. Crowther, P. Thompson, G. Wijffels, G. Otting. Cell-free protein synthesis in an autoinduction system for NMR studies of protein–protein interactions. J Biomol NMR, 2005. [PubMed]
T. Matsuda, S. Koshiba, N. Tochio, E. Seki, N. Iwasaki, T. Yabuki. Improving cell-free protein synthesis for stable-isotope labeling. J Biomol NMR, 2007. [PubMed]
K. Takai, T. Sawasaki, Y. Endo. Practical cell-free protein synthesis system using purified wheat embryos. Nat Protoc, 2010. [PubMed]
J. Yokoyama, T. Matsuda, S. Koshiba, T. Kigawa. An economical method for producing stable-isotope labeled proteins by the E. coli cell-free system. J Biomol NMR, 2010. [PubMed]
I. Hirao, T. Ohtsuki, T. Fujiwara, T. Mitsui, T. Yokogawa, T. Okuni. An unnatural base pair for incorporating amino acid analogs into proteins. Nat Biotechnol, 2002. [PubMed]
M. Kainosho, T. Torizawa, Y. Iwashita, T. Terauchi, A. Mei Ono, P. Güntert. Optimal isotope labelling for NMR protein structure determinations. Nature, 2006. [PubMed]
R. Abe, K. Shiraga, S. Ebisu, H. Takagi, T. Hohsaka. Incorporation of fluorescent non-natural amino acids into N-terminal tag of proteins in cell-free translation and its dependence on position and neighboring codons. J Biosci Bioeng, 2010. [PubMed]
K.V. Loscha, A.J. Herlt, R. Qi, T. Huber, K. Ozawa, G. Otting. Multiple-site labeling of proteins with unnatural amino acids. Angew Chem Int Ed Engl, 2012. [PubMed]
X.C. Su, C.T. Loh, R. Qi, G. Otting. Suppression of isotope scrambling in cell-free protein synthesis by broadband inhibition of PLP enymes for selective 15N-labelling and production of perdeuterated proteins in H2O. J Biomol NMR, 2011. [PubMed]
Y. Shimizu, Y. Kuruma, T. Kanamori, T. Ueda. The PURE system for protein production. Methods Mol Biol, 2014. [PubMed]
A. Deniaud, L. Liguori, I. Blesneac, J.L. Lenormand, E. Pebay-Peyroula. Crystallization of the membrane protein hVDAC1 produced in cell-free system. Biochim Biophys Acta, 2010. [PubMed]
S. Reckel, S. Sobhanifar, F. Durst, F. Löhr, V.A. Shirokov, V. Dötsch. Strategies for the cell-free expression of membrane proteins. Methods Mol Biol, 2010. [PubMed]
S. Haberstock, C. Roos, Y. Hoevels, V. Dötsch, G. Schnapp, A. Pautsch. A systematic approach to increase the efficiency of membrane protein production in cell-free expression systems. Protein Expr Purif, 2012. [PubMed]
X. Wang, J. Wang, B. Ge. Evaluation of cell-free expression system for the production of soluble and functional human GPCR N-formyl peptide receptors. Protein Pept Lett, 2013. [PubMed]
A. Dutta, K. Saxena, H. Schwalbe, J. Klein-Seetharaman. Isotope labeling in mammalian cells. Methods Mol Biol, 2012. [PubMed]
M. Sastry, C.A. Bewley, P.D. Kwong. Mammalian expression of isotopically labeled proteins for NMR spectroscopy. Adv Exp Med Biol, 2012. [PubMed]
J. Kikuchi, K. Shinozaki, T. Hirayama. Stable isotope labeling of Arabidopsis thaliana for an NMR-based metabolomics approach. Plant Cell Physiol, 2004. [PubMed]
T. Etezady-Esfarjani, T. Herrmann, R. Horst, K. Wüthrich. Automated protein NMR structure determination in crude cell-extract. J Biomol NMR, 2006. [PubMed]
E. Chikayama, M. Suto, T. Nishihara, K. Shinozaki, T. Hirayama, J. Kikuchi. Systematic NMR analysis of stable isotope labeled metabolite mixtures in plant and animal systems: coarse grained views of metabolic pathways. PLoS One, 2008
M.P. Latham, L.E. Kay. Is buffer a good proxy for a crowded cell-like environment? A comparative NMR study of calmodulin side-chain dynamics in buffer and E. coli lysate. PLoS One, 2012
S. Tomita, S. Ikeda, S. Tsuda, N. Someya, K. Asano, J. Kikuchi. A survey of metabolic changes in potato leaves by NMR-based metabolic profiling in relation to resistance to late blight disease under field conditions. Magn Reson Chem, 2017. [PubMed]
Z. Zheng, H. Sun, C. Hu, G. Li, X. Liu, P. Chen. Using “on/off” (19)F NMR/magnetic resonance imaging signals to sense tyrosine kinase/phosphatase activity in vitro and in cell lysates. Anal Chem, 2016. [PubMed]
T. Sugiki, C. Yoshiura, Y. Kofuku, T. Ueda, I. Shimada, H. Takahashi. High-throughput screening of optimal solution conditions for structural biological studies by fluorescence correlation spectroscopy. Protein Sci, 2009. [PubMed]
R. Horst, K. Wüthrich. Micro-scale NMR experiments for monitoring the optimization of membrane protein solutions for structural biology. Bio Protoc, 2015
S. Kozak, L. Lercher, M.N. Karanth, R. Meijers, T. Carlomagno, S. Boivin. Optimization of protein samples for NMR using thermal shift assays. J Biomol NMR, 2016. [PubMed]
B. Pedrini, P. Serrano, B. Mohanty, M. Geralt, K. Wüthrich. NMR-profiles of protein solutions. Biopolymers, 2013. [PubMed]
P. Flynn, D. Mattiello, H. Hill, A. Wand. Optimal use of cryogenic probe technology in NMR studies of proteins. J Am Chem Soc, 2000
D. Lee, C. Hilty, G. Wider, K. Wüthrich. Effective rotational correlation times of proteins from NMR relaxation interference. J Magn Reson, 2006. [PubMed]
L.E. Kay, D.A. Torchia, A. Bax. Backbone dynamics of proteins as studied by 15N inverse detected heteronuclear NMR spectroscopy: application to staphylococcal nuclease. Biochemistry, 1989. [PubMed]
K. Wüthrich. 1986
T. Yuwen, N.R. Skrynnikov. CP-HISQC: a better version of HSQC experiment for intrinsically disordered proteins under physiological conditions. J Biomol NMR, 2014. [PubMed]
A.E. Kelly, H.D. Ou, R. Withers, V. Dötsch. Low-conductivity buffers for high-sensitivity NMR measurements. J Am Chem Soc, 2002. [PubMed]
69Driscoll PC, Clore GM, Marion D, Wingfield PT and Gronenborn AM (1990) Complete resonance assignment for the polypeptide backbone of interleukin 1 beta using three-dimensional heteronuclear NMR spectroscopy. Biochemistry 29: 3542-2356.
G.M. Clore, A. Bax, P.C. Driscoll, P.T. Wingfield, A.M. Gronenborn. Assignment of the side-chain 1H and 13C resonances of interleukin-1 beta using double- and triple-resonance heteronuclear three-dimensional NMR spectroscopy. Biochemistry, 1990. [PubMed]
V. Kanelis, J.D. Forman-Kay, L.E. Kay. Multidimensional NMR methods for protein structure determination. IUBMB Life, 2001. [PubMed]
D. Marion, P.C. Driscoll, L.E. Kay, P.T. Wingfield, A. Bax, A.M. Gronenborn. Overcoming the overlap problem in the assignment of 1H NMR spectra of larger proteins by use of three-dimensional heteronuclear 1H–15N Hartmann–Hahn-multiple quantum coherence and nuclear Overhauser-multiple quantum coherence spectroscopy: application to interleukin 1 beta. Biochemistry, 1989. [PubMed]
C. Fernández, G. Wider. TROSY in NMR studies of the structure and function of large biological macromolecules. Curr Opin Struct Biol, 2003. [PubMed]
G. Wider. NMR techniques used with very large biological macromolecules in solution. Methods Enzymol, 2005. [PubMed]
K. Pervushin, R. Riek, G. Wider, K. Wüthrich. Attenuated T2 relaxation by mutual cancellation of dipole–dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution. Proc Natl Acad Sci U S A, 1997. [PubMed]
W. Bermel, I. Bertini, L. Duma, I.C. Felli, L. Emsley, R. Pierattelli. Complete assignment of heteronuclear protein resonances by protonless NMR spectroscopy. Angew Chem Int Ed Engl, 2005. [PubMed]
W. Bermel, I. Bertini, I.C. Felli, Y.M. Lee, C. Luchinat, R. Pierattelli. Protonless NMR experiments for sequence-specific assignment of backbone nuclei in unfolded proteins. J Am Chem Soc, 2006. [PubMed]
W. Bermel, I. Bertini, I.C. Felli, M. Matzapetakis, R. Pierattelli, E.C. Theil. A method for C(alpha) direct-detection in protonless NMR. J Magn Reson, 2007. [PubMed]
I. Bertini, B. Jiménez, R. Pierattelli, A.G. Wedd, Z. Xiao. Protonless 13C direct detection NMR: characterization of the 37 kDa trimeric protein CutA1. Proteins, 2008. [PubMed]
M. Matzapetakis, P. Turano, E.C. Theil, I. Bertini. 13C–13C NOESY spectra of a 480 kDa protein: solution NMR of ferritin. J Biomol NMR, 2007. [PubMed]
D.S. Wishart, C.G. Bigam, J. Yao, F. Abildgaard, H.J. Dyson, E. Oldfield. 1H, 13C and 15N chemical shift referencing in biomolecular NMR. J Biomol NMR, 1995. [PubMed]
J.L. Markley, A. Bax, Y. Arata, C.W. Hilbers, R. Kaptein, B.D. Sykes. Recommendations for the presentation of NMR structures of proteins and nucleic acids. IUPAC-IUBMB-IUPAB Inter-Union Task Group on the standardization of data bases of protein and nucleic acid structures determined by NMR spectroscopy. J Biomol NMR, 1998. [PubMed]
F. Delaglio, S. Grzesiek, G.W. Vuister, G. Zhu, J. Pfeifer, A. Bax. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J Biomol NMR, 1995. [PubMed]
Y. Shen, F. Delaglio, G. Cornilescu, A. Bax. TALOS +: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts. J Biomol NMR, 2009. [PubMed]
Y. Shen, A. Bax. Protein backbone and sidechain torsion angles predicted from NMR chemical shifts using artificial neural networks. J Biomol NMR, 2013. [PubMed]
P. Güntert, C. Mumenthaler, K. Wüthrich. Torsion angle dynamics for NMR structure calculation with the new program DYANA. J Mol Biol, 1997. [PubMed]
T. Herrmann, P. Güntert, K. Wüthrich. Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA. J Mol Biol, 2002. [PubMed]
T. Herrmann, P. Güntert, K. Wüthrich. Protein NMR structure determination with automated NOE-identification in the NOESY spectra using the new software ATNOS. J Biomol NMR, 2002. [PubMed]
J. Jee, P. Güntert. Influence of the completeness of chemical shift assignments on NMR structures obtained with automated NOE assignment. J Struct Funct Genomics, 2003. [PubMed]
P. Güntert. Automated NMR structure calculation with CYANA. Methods Mol Biol, 2004. [PubMed]
B.A. Johnson, R.A. Blevins. NMR View: a computer program for the visualization and analysis of NMR data. J Biomol NMR, 1994. [PubMed]
W. Lee, W. Westler, A. Bahrami, H. Eghbalnia, J. Markley. PINE-SPARKY: graphical interface for evaluating automated probabilistic peak assignments in protein NMR spectroscopy. Bioinformatics, 2009. [PubMed]
W. Lee, M. Tonelli, J. Markley. NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy. Bioinformatics, 2015. [PubMed]
Y.S. Jung, M. Zweckstetter. Mars — robust automatic backbone assignment of proteins. J Biomol NMR, 2004. [PubMed]
J. Volk, T. Herrmann, K. Wüthrich. Automated sequence-specific protein NMR assignment using the memetic algorithm MATCH. J Biomol NMR, 2008. [PubMed]
F. Fiorito, T. Herrmann, F.F. Damberger, K. Wüthrich. Automated amino acid side-chain NMR assignment of proteins using (13)C- and (15)N-resolved 3D [(1)H, (1)H]-NOESY. J Biomol NMR, 2008. [PubMed]
P. Serrano, B. Pedrini, B. Mohanty, M. Geralt, T. Herrmann, K. Wüthrich. The J-UNIO protocol for automated protein structure determination by NMR in solution. J Biomol NMR, 2012. [PubMed]
E. Schmidt, P. Güntert. A new algorithm for reliable and general NMR resonance assignment. J Am Chem Soc, 2012. [PubMed]
C. Bartels, M. Billeter, P. Güntert, K. Wüthrich. Automated sequence-specific NMR assignment of homologous proteins using the program GARANT. J Biomol NMR, 1996. [PubMed]
H. Hu, K. Krishnamurthy. Revisiting the initial rate approximation in kinetic NOE measurements. J Magn Reson, 2006. [PubMed]
B. Vögeli. The nuclear Overhauser effect from a quantitative perspective. Prog Nucl Magn Reson Spectrosc, 2014. [PubMed]
B. Vögeli, J. Orts, D. Strotz, C. Chi, M. Minges, M.A. Wälti. Towards a true protein movie: a perspective on the potential impact of the ensemble-based structure determination using exact NOEs. J Magn Reson, 2014. [PubMed]
B. Vögeli, S. Olsson, P. Güntert, R. Riek. The exact NOE as an alternative in ensemble structure determination. Biophys J, 2016. [PubMed]
G.M. Clore, M.A. Robien, A.M. Gronenborn. Exploring the limits of precision and accuracy of protein structures determined by nuclear magnetic resonance spectroscopy. J Mol Biol, 1993. [PubMed]
K. Furuita, S. Kataoka, T. Sugiki, Y. Hattori, N. Kobayashi, T. Ikegami. Utilization of paramagnetic relaxation enhancements for high-resolution NMR structure determination of a soluble loop-rich protein with sparse NOE distance restraints. J Biomol NMR, 2015. [PubMed]
T. Saio, K. Ogura, M. Yokochi, Y. Kobashigawa, F. Inagaki. Two-point anchoring of a lanthanide-binding peptide to a target protein enhances the paramagnetic anisotropic effect. J Biomol NMR, 2009. [PubMed]
T. Saio, M. Yokochi, H. Kumeta, F. Inagaki. PCS-based structure determination of protein-protein complexes. J Biomol NMR, 2010. [PubMed]
S. Tikole, V. Jaravine, V. Orekhov, P. Güntert. Effects of NMR spectral resolution on protein structure calculation. PLoS One, 2013
R. Koradi, M. Billeter, M. Engeli, P. Güntert, K. Wüthrich. Automated peak picking and peak integration in macromolecular NMR spectra using AUTOPSY. J Magn Reson, 1998. [PubMed]
W. Lee, C.M. Petit, G. Cornilescu, J.L. Stark, J.L. Markley. The AUDANA algorithm for automated protein 3D structure determination from NMR NOE data. J Biomol NMR, 2016. [PubMed]
J.M. Würz, P. Güntert. Peak picking multidimensional NMR spectra with the contour geometry based algorithm CYPICK. J Biomol NMR, 2017. [PubMed]
S.W. Englander, L. Mayne. Protein folding studied using hydrogen-exchange labeling and two-dimensional NMR. Annu Rev Biophys Biomol Struct, 1992. [PubMed]
T.L. Hwang, P.C. van Zijl, S. Mori. Accurate quantitation of water-amide proton exchange rates using the phase-modulated CLEAN chemical EXchange (CLEANEX-PM) approach with a Fast-HSQC (FHSQC) detection scheme. J Biomol NMR, 1998. [PubMed]
Y.X. Wang, J. Jacob, F. Cordier, P. Wingfield, S.J. Stahl, S. Lee-Huang. Measurement of 3hJNC’ connectivities across hydrogen bonds in a 30 kDa protein. J Biomol NMR, 1999. [PubMed]
F. Cordier, S. Grzesiek. Temperature-dependence of protein hydrogen bond properties as studied by high-resolution NMR. J Mol Biol, 2002. [PubMed]
M. Karplus. Vicinal proton coupling in nuclear magnetic resonance. J Am Chem Soc, 1963
117Vuister G and Bax A (1993) Quantitative J correlation — a new approach for measuring homonuclear 3-bond J(H(N)H(ALPHA) coupling-constants in N-15-enriched proteins. J Am Chem Soc 115: 7772-7777.
J. Roche, J. Ying, Y. Shen, D.A. Torchia, A. Bax. ARTSY-J: convenient and precise measurement of (3)JHNHα couplings in medium-size proteins from TROSY-HSQC spectra. J Magn Reson, 2016. [PubMed]
S. Archer, M. Ikura, D. Torchia, A. Bax. An alternative 3D-NMR technique for correlating backbone N-15 with side-chain H-beta-resonances in larger proteins. J Magn Reson, 1991
K. Kohlhoff, P. Robustelli, A. Cavalli, X. Salvatella, M. Vendruscolo. Fast and accurate predictions of protein NMR chemical shifts from interatomic distances. J Am Chem Soc, 2009. [PubMed]
B. Han, Y. Liu, S.W. Ginzinger, D.S. Wishart. SHIFTX2: significantly improved protein chemical shift prediction. J Biomol NMR, 2011. [PubMed]
P. Güntert, L. Buchner. Combined automated NOE assignment and structure calculation with CYANA. J Biomol NMR, 2015. [PubMed]
C. Schwieters, J. Kuszewski, G. Clore. Using Xplor-NIH for NMR molecular structure determination. Prog Nucl Magn Reson Spectrosc, 2006
A. Rosato, A. Bagaria, D. Baker, B. Bardiaux, A. Cavalli, J.F. Doreleijers. CASD-NMR: critical assessment of automated structure determination by NMR. Nat Methods, 2009. [PubMed]
A. Rosato, W. Vranken, R.H. Fogh, T.J. Ragan, R. Tejero, K. Pederson. The second round of critical assessment of automated structure determination of proteins by NMR: CASD-NMR-2013. J Biomol NMR, 2015. [PubMed]
W. Lee, G. Cornilescu, H. Dashti, H.R. Eghbalnia, M. Tonelli, W.M. Westler. Integrative NMR for biomolecular research. J Biomol NMR, 2016. [PubMed]
M. Nilges, S. O’Donoghue. Ambiguous NOEs and automated NOE assignment. Prog Nucl Magn Reson Spectrosc, 1998
C. Mumenthaler, W. Braun. Automated assignment of simulated and experimental NOESY spectra of proteins by feedback filtering and self-correcting distance geometry. J Mol Biol, 1995. [PubMed]
C. Mumenthaler, P. Güntert, W. Braun, K. Wüthrich. Automated combined assignment of NOESY spectra and three-dimensional protein structure determination. J Biomol NMR, 1997. [PubMed]
Y.J. Huang, R. Tejero, R. Powers, G.T. Montelione. A topology-constrained distance network algorithm for protein structure determination from NOESY data. Proteins, 2006. [PubMed]
M. Nilges, M.J. Macias, S.I. O’Donoghue, H. Oschkinat. Automated NOESY interpretation with ambiguous distance restraints: the refined NMR solution structure of the pleckstrin homology domain from beta-spectrin. J Mol Biol, 1997. [PubMed]
W. Rieping, M. Habeck, B. Bardiaux, A. Bernard, T.E. Malliavin, M. Nilges. ARIA2: automated NOE assignment and data integration in NMR structure calculation. Bioinformatics, 2007. [PubMed]
W. Gronwald, S. Moussa, R. Elsner, A. Jung, B. Ganslmeier, J. Trenner. Automated assignment of NOESY NMR spectra using a knowledge based method (KNOWNOE). J Biomol NMR, 2002. [PubMed]
Z. Zhang, J. Porter, K. Tripsianes, O.F. Lange. Robust and highly accurate automatic NOESY assignment and structure determination with Rosetta. J Biomol NMR, 2014. [PubMed]
J. Kuszewski, C.D. Schwieters, D.S. Garrett, R.A. Byrd, N. Tjandra, G.M. Clore. Completely automated, highly error-tolerant macromolecular structure determination from multidimensional nuclear Overhauser enhancement spectra and chemical shift assignments. J Am Chem Soc, 2004. [PubMed]
C.D. Schwieters, J.J. Kuszewski, N. Tjandra, G.M. Clore. The Xplor-NIH NMR molecular structure determination package. J Magn Reson, 2003. [PubMed]
A. Brunger, P. Adams, G. Clore, W. DeLano, P. Gros, R. Grosse-Kunstleve. Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr Sect F, 1998
R. Salomon-Ferrer, D. Case, R. Walker. An overview of the Amber biomolecular simulation package. Wiley Interdiscip Rev Comput Mol Sci, 2013
M.J. Robertson, J. Tirado-Rives, W.L. Jorgensen. Improved peptide and protein torsional energetics with the OPLSAA force field. J Chem Theory Comput, 2015. [PubMed]
S. Jo, T. Kim, V.G. Iyer, W. Im. CHARMM-GUI: a web-based graphical user interface for CHARMM. J Comput Chem, 2008. [PubMed]
L. Kelley, S. Gardner, M. Sutcliffe. An automated approach for defining core atoms and domains in an ensemble of NMR-derived protein structures. Prot Eng, 1997
D.A. Snyder, J. Grullon, Y.J. Huang, R. Tejero, G.T. Montelione. The expanded FindCore method for identification of a core atom set for assessment of protein structure prediction. Proteins, 2014. [PubMed]
D.K. Kirchner, P. Güntert. Objective identification of residue ranges for the superposition of protein structures. BMC Bioinformatics, 2011. [PubMed]
N. Kobayashi. A robust method for quantitative identification of ordered cores in an ensemble of biomolecular structures by non-linear multi-dimensional scaling using inter-atomic distance variance matrix. J Biomol NMR, 2014. [PubMed]
E.F. Pettersen, T.D. Goddard, C.C. Huang, G.S. Couch, D.M. Greenblatt, E.C. Meng. UCSF Chimera — a visualization system for explorator 5 y research and analysis. J Comput Chem, 2004. [PubMed]
S. McNicholas, E. Potterton, K.S. Wilson, M.E. Noble. Presenting your structures: the CCP4mg molecular-graphics software. Acta Crystallogr D, 2011. [PubMed]
C.D. Schwieters, G.M. Clore. The VMD-XPLOR visualization package for NMR structure refinement. J Magn Reson, 2001. [PubMed]
R. Koradi, M. Billeter, K. Wüthrich. MOLMOL: a program for display and analysis of macromolecular structures. J Mol Graph, 1996. [PubMed]
S. Nabuurs, C. Spronk, G. Vriend, G. Vuister. Concepts and tools for NMR restraint analysis and validation. Concept Magn Reson, 2004
C. Spronk, S. Nabuurs, E. Krieger, G. Vriend, G. Vuister. Validation of protein structures derived by NMR spectroscopy. Prog Nucl Magn Reson Spectrosc, 2004
X. Huang, F. Moy, R. Powers. Evaluation of the utility of NMR structures determined from minimal NOE-based restraints for structure-based drug design, using MMP-1 as an example. Biochemistry, 2000. [PubMed]
A. Rosato, R. Tejero, G. Montelione. Quality assessment of protein NMR structures. Curr Opin Struct Biol, 2013. [PubMed]
R. Laskowski, M. Macarthur, D. Moss, J. Thornton. PROCHECK — a program to check the stereochemical quality of protein structures. J Appl Cryst, 1993
R. Laskowski, J. Rullmann, M. MacArthur, R. Kaptein, J. Thornton. AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J Biomol NMR, 1996. [PubMed]
I.W. Davis, A. Leaver-Fay, V.B. Chen, J.N. Block, G.J. Kapral, X. Wang. MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res, 2007. [PubMed]
A. Bax, N. Tjandra. High-resolution heteronuclear NMR of human ubiquitin in an aqueous liquid crystalline medium. J Biomol NMR, 1997. [PubMed]
M.R. Hansen, L. Mueller, A. Pardi. Tunable alignment of macromolecules by filamentous phage yields dipolar coupling interactions. Nat Struct Biol, 1998. [PubMed]
M. Zweckstetter, A. Bax. Characterization of molecular alignment in aqueous suspensions of Pf1 bacteriophage. J Biomol NMR, 2001. [PubMed]
R. Lipsitz, N. Tjandra. Residual dipolar couplings in NMR structure analysis. Annu Rev Biophys Biomol Struct, 2004. [PubMed]
M. Zweckstetter. NMR: prediction of molecular alignment from structure using the PALES software. Nat Protoc, 2008. [PubMed]
C. Schmidt, S. Irausquin, H. Valafar. Advances in the REDCAT software package. BMC Bioinformatics, 2013. [PubMed]
K. Simon, J. Xu, C. Kim, N. Skrynnikov. Estimating the accuracy of protein structures using residual dipolar couplings. J Biomol NMR, 2005. [PubMed]
M. Habeck, M. Nilges, W. Rieping. A unifying probabilistic framework for analyzing residual dipolar couplings. J Biomol NMR, 2008. [PubMed]
M. Fischer, J. Losonczi, J. Weaver, J. Prestegard. Domain orientation and dynamics in multidomain proteins from residual dipolar couplings. Biochemistry, 1999. [PubMed]
P. Bolon, H. Al-Hashimi, J. Prestegard. Residual dipolar coupling derived orientational constraints on ligand geometry in a 53 kDa protein–ligand complex. J Mol Biol, 1999. [PubMed]
M. Jensen, J. Ortega-Roldan, L. Salmon, N. van Nuland, M. Blackledge. Characterizing weak protein-protein complexes by NMR residual dipolar couplings. Eur Biophys J Biophys Lett, 2011
C. Gobl, T. Madl, B. Simon, M. Sattler. NMR approaches for structural analysis of multidomain proteins and complexes in solution. Prog Nucl Magn Reson Spectrosc, 2014. [PubMed]
A. Yee, A. Gutmanas, C.H. Arrowsmith. Solution NMR in structural genomics. Curr Opin Struct Biol, 2006. [PubMed]
J. Shin, W. Lee. Structural proteomics by NMR spectroscopy. Expert Rev Proteomics, 2008. [PubMed]
P. Güntert. Automated structure determination from NMR spectra. Eur Biophys J, 2009. [PubMed]
P. Guerry, T. Herrmann. Advances in automated NMR protein structure determination. Q Rev Biophys, 2011. [PubMed]
Y. Shen, O. Lange, F. Delaglio, P. Rossi, J.M. Aramini, G. Liu. Consistent blind protein structure generation from NMR chemical shift data. Proc Natl Acad Sci U S A, 2008. [PubMed]
Y. Shen, R. Vernon, D. Baker, A. Bax. De novo protein structure generation from incomplete chemical shift assignments. J Biomol NMR, 2009. [PubMed]
M. Takeda, T. Ikeya, P. Güntert, M. Kainosho. Automated structure determination of proteins with the SAIL-FLYA NMR method. Nat Protoc, 2007. [PubMed]
T. Ikeya, M. Takeda, H. Yoshida, T. Terauchi, J.G. Jee, M. Kainosho. Automated NMR structure determination of stereo-array isotope labeled ubiquitin from minimal sets of spectra using the SAIL-FLYA system. J Biomol NMR, 2009. [PubMed]
Y. Miyanoiri, M. Takeda, K. Okuma, A.M. Ono, T. Terauchi, M. Kainosho. Differential isotope-labeling for Leu and Val residues in a protein by E. coli cellular expression using stereo-specifically methyl labeled amino acids. J Biomol NMR, 2013. [PubMed]
Y. Miyanoiri, Y. Ishida, M. Takeda, T. Terauchi, M. Inouye, M. Kainosho. Highly efficient residue-selective labeling with isotope-labeled Ile, Leu, and Val using a new auxotrophic E. coli strain. J Biomol NMR, 2016. [PubMed]
M.K. Rosen, K.H. Gardner, R.C. Willis, W.E. Parris, T. Pawson, L.E. Kay. Selective methyl group protonation of perdeuterated proteins. J Mol Biol, 1996. [PubMed]
K. Gardner, L. Kay. Production and incorporation of N-15, C-13, H-2 (H-1-delta 1 methyl) isoleucine into proteins for multidimensional NMR studies. J Am Chem Soc, 1997
N.K. Goto, K.H. Gardner, G.A. Mueller, R.C. Willis, L.E. Kay. A robust and cost-effective method for the production of Val, Leu, Ile (delta 1) methyl-protonated 15N-, 13C-, 2H-labeled proteins. J Biomol NMR, 1999. [PubMed]
V. Tugarinov, L.E. Kay. Ile, Leu, and Val methyl assignments of the 723-residue malate synthase G using a new labeling strategy and novel NMR methods. J Am Chem Soc, 2003. [PubMed]
V. Tugarinov, L.E. Kay. An isotope labeling strategy for methyl TROSY spectroscopy. J Biomol NMR, 2004. [PubMed]
P. Gans, O. Hamelin, R. Sounier, I. Ayala, M. Dura, C. Amero. Stereospecific isotopic labeling of methyl groups for NMR spectroscopic studies of high-molecular-weight proteins. Angew Chem Int Ed Engl, 2010. [PubMed]
P. Schanda, Ē. Kupče, B. Brutscher. SOFAST-HMQC experiments for recording two-dimensional heteronuclear correlation spectra of proteins within a few seconds. J Biomol NMR, 2005. [PubMed]
P. Schanda, B. Brutscher. Hadamard frequency-encoded SOFAST-HMQC for ultrafast two-dimensional protein NMR. J Magn Reson, 2006. [PubMed]
D. Schulze-Sünninghausen, J. Becker, B. Luy. Rapid heteronuclear single quantum correlation NMR spectra at natural abundance. J Am Chem Soc, 2014. [PubMed]
V. Tugarinov, L. Kay, I. Ibraghimov, V. Orekhov. High-resolution four-dimensional H-1-C-13 NOE spectroscopy using methyl-TROSY, sparse data acquisition, and multidimensional decomposition. J Am Chem Soc, 2005. [PubMed]
S. Hiller, I. Ibraghimov, G. Wagner, V. Orekhov. Coupled decomposition of four-dimensional NOESY spectra. J Am Chem Soc, 2009. [PubMed]
P. Rossi, Y. Xia, N. Khanra, G. Veglia, C.G. Kalodimos. (15)N and (13)C- SOFAST-HMQC editing enhances 3D-NOESY sensitivity in highly deuterated, selectively [(1)H,(13)C]-labeled proteins. J Biomol NMR, 2016. [PubMed]
Ē. Kupče, L.E. Kay, R. Freeman. Detecting the “afterglow” of 13C NMR in proteins using multiple receivers. J Am Chem Soc, 2010. [PubMed]
J.G. Reddy, R.V. Hosur. Parallel acquisition of 3D-HA(CA)NH and 3D-HACACO spectra. J Biomol NMR, 2013. [PubMed]
C. Wiedemann, P. Bellstedt, A. Kirschstein, S. Häfner, C. Herbst, M. Görlach. Sequential protein NMR assignments in the liquid state via sequential data acquisition. J Magn Reson, 2014. [PubMed]
M.J. Bostock, D.J. Holland, D. Nietlispach. Improving resolution in multidimensional NMR using random quadrature detection with compressed sensing reconstruction. J Biomol NMR, 2017
J. Ying, F. Delaglio, D.A. Torchia, A. Bax. Sparse multidimensional iterative lineshape-enhanced (SMILE) reconstruction of both non-uniformly sampled and conventional NMR data. J Biomol NMR, 2016
T. Miljenović, X. Jia, P. Lavrencic, B. Kobe, M. Mobli. A non-uniform sampling approach enables studies of dilute and unstable proteins. J Biomol NMR, 2017
J.L. Markley, E.L. Ulrich, W.M. Westler, B.F. Volkman. Macromolecular structure determination by NMR spectroscopy. Methods Biochem Anal, 2003. [PubMed]
S. Raman, Lange OF, P. Rossi, M. Tyka, X. Wang, J. Aramini. NMR structure determination for larger proteins using backbone-only data. Science, 2010. [PubMed]
E. Karaca, A.S. Melquiond, S.J. de Vries, P.L. Kastritis, A.M. Bonvin. Building macromolecular assemblies by information-driven docking: introducing the HADDOCK multibody docking server. Mol Cell Proteomics, 2010. [PubMed]
K. Kazimierczuk, V.Y. Orekhov. Accelerated NMR spectroscopy by using compressed sensing. Angew Chem Int Ed, 2011
D. Russel, K. Lasker, B. Webb, J. Velazquez-Muriel, E. Tjioe, D. Schneidman-Duhovny. Putting the pieces together: integrative modeling platform software for structure determination of macromolecular assemblies. PLoS Biol, 2012
J. Loria, R. Berlow, E. Watt. Characterization of enzyme motions by solution NMR relaxation dispersion. Acc Chem Res, 2008. [PubMed]
N. Anthis, G. Clore. Visualizing transient dark states by NMR spectroscopy. Q Rev Biophys, 2015. [PubMed]
C. Dominguez, R. Boelens, A.M. Bonvin. HADDOCK: a protein–protein docking approach based on biochemical or biophysical information. J Am Chem Soc, 2003. [PubMed]
G.C. van Zundert, J.P. Rodrigues, M. Trellet, C. Schmitz, P.L. Kastritis, E. Karaca. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J Mol Biol, 2016. [PubMed]
A. Sali, H.M. Berman, T. Schwede, J. Trewhella, G. Kleywegt, S.K. Burley. Outcome of the first wwPDB hybrid/integrative methods task force workshop. Structure, 2015. [PubMed]