Author
Listed:
- Hailiang Huang
- Bruno M Jedynak
- Joel S Bader
Abstract
Yeast two-hybrid screens are an important method for mapping pairwise physical interactions between proteins. The fraction of interactions detected in independent screens can be very small, and an outstanding challenge is to determine the reason for the low overlap. Low overlap can arise from either a high false-discovery rate (interaction sets have low overlap because each set is contaminated by a large number of stochastic false-positive interactions) or a high false-negative rate (interaction sets have low overlap because each misses many true interactions). We extend capture–recapture theory to provide the first unified model for false-positive and false-negative rates for two-hybrid screens. Analysis of yeast, worm, and fly data indicates that 25% to 45% of the reported interactions are likely false positives. Membrane proteins have higher false-discovery rates on average, and signal transduction proteins have lower rates. The overall false-negative rate ranges from 75% for worm to 90% for fly, which arises from a roughly 50% false-negative rate due to statistical undersampling and a 55% to 85% false-negative rate due to proteins that appear to be systematically lost from the assays. Finally, statistical model selection conclusively rejects the Erdős-Rényi network model in favor of the power law model for yeast and the truncated power law for worm and fly degree distributions. Much as genome sequencing coverage estimates were essential for planning the human genome sequencing project, the coverage estimates developed here will be valuable for guiding future proteomic screens. All software and datasets are available in Datasets S1 and S2, Figures S1–S5, and Tables S1–S6, and are also available from our Web site, http://www.baderzone.org.
The genome sequence of an organism provides a parts list of proteins, but not an instruction manual for assembling the parts into a cell. Assembly instructions now come from experiments such as two-hybrid screens that detect physical interactions between pairs of proteins. Defining the resources required for generating a full interaction map requires accurate estimates of the false-negative and false-positive rates of genome-scale screens. Two-hybrid screens often select a query protein and sample its interaction partners. True partners may be missed, and false partners may be spuriously identified. This sampling process resembles a capture–recapture experiment, except that classical capture–recapture theory assumes no false positives. Novel extensions to capture–recapture theory permit its application to proteomic screens. This new theory provides statistically grounded answers to long-standing questions: false-discovery rates of high-throughput screens (possibly over 50% per unique interaction, but probably no more than 15% per clone); the quality of different screening libraries; protein properties leading to “sticky” or “promiscuous” interactions; the global network topology; and, most importantly, the coverage of existing two-hybrid maps. Models estimate roughly 30,000 total pairwise interactions in yeast and 500,000 to 1,000,000 in metazoans. The majority of these interactions remain to be discovered.
Suggested Citation
Hailiang Huang & Bruno M Jedynak & Joel S Bader, 2007.
"Where Have All the Interactions Gone? Estimating the Coverage of Two-Hybrid Protein Interaction Maps,"
PLOS Computational Biology, Public Library of Science, vol. 3(11), pages 1-20, November.
Handle:
RePEc:plo:pcbi00:0030214
DOI: 10.1371/journal.pcbi.0030214
Citations
Cited by:
- Maddalena Dilucca & Giulio Cimini & Andrea Semmoloni & Antonio Deiana & Andrea Giansanti, 2015.
"Codon Bias Patterns of E. coli’s Interacting Proteins,"
PLOS ONE, Public Library of Science, vol. 10(11), pages 1-18, November.
- Matthew Burgess & Eytan Adar & Michael Cafarella, 2016.
"Link-Prediction Enhanced Consensus Clustering for Complex Networks,"
PLOS ONE, Public Library of Science, vol. 11(5), pages 1-23, May.