Sci. Aging Knowl. Environ., 12 May 2004
Vol. 2004, Issue 19, p. pe19
[DOI: 10.1126/sageke.2004.19.pe19]


Vital Connections

Scott D. Pletcher

The author is in the Department of Molecular and Human Genetics and at the Huffington Center on Aging, Baylor College of Medicine, Houston, TX 77030, USA. E-mail: pletcher{at}

Key Words: scale-free network • protein interaction • pleiotropy • Saccharomyces cerevisiae

Our scene opens on an August day in New York. Thousands of travelers scurry through JFK International Airport to catch flights that link the Big Apple to nearly every other major city in America. The United States airline system plays favorites: A small number of highly connected cities dominate its structure.

The camera pans back as our view weaves through the New York skyline and comes to rest around the water cooler in an unremarkable New York business. Robert and his colleagues are discussing their weekend plans when gossip reveals that the boss's new secretary from Atlanta is the sister of Robert's former roommate at Berkeley. Someone's remark, "it certainly is a small world," precedes the shock of an unexpected rain of darkness. Fifty million North Americans are suddenly experiencing the largest blackout in U.S. history. News agencies around the country will ponder how such a small, isolated incident could generate such a massive failure.

A flicker of light pierces the darkness and draws our attention. We follow it to the laboratories of a physicist and a biologist who discover that, as with U.S. airports, social interactions, and regional power grids, metabolic networks in many biological species exhibit emergent and surprisingly conserved properties. Could it be that the large-scale organization of interacting entities, regardless of source or composition, follows the same rules?

If the above historical fiction reminds you of a pitch for a Hollywood movie, it does so for good reason. The structure and behavior of complex systems capture the imagination. The fundamental properties that emerge from the growth and development of the most complicated formations of humans and nature appeal to scientists and laymen alike. A new chapter on this subject was opened by Barabasi and Albert in their articles in Science (1) and Nature (2) that described rules and characteristics of particular networks of interacting entities. The concept is simple, yet the ramifications are being felt throughout biology (3-6), information technology (7), epidemiology (8), social dynamics (9), and now aging (10).

Networks consist of nodes connected by links. Any two nodes may or may not be connected, but those that are linked can be linked by one and only one line (Fig. 1). Before the Barabasi and Albert papers, research on the structure and behavior of these sorts of networks was dominated by the ideas of the great mathematician Paul Erdös and his colleague Alfréd Rényi. Erdös and Rényi suggested that large networks could be effectively represented by randomly connecting different nodes through a random placement of links (11). Such random networks have interesting properties. For example, the number of links attached to any specific node follows a Poisson distribution with a nearly normal shape (Fig. 1). Most nodes have close to the average number of links, and it is extremely rare to find nodes with a very much larger or smaller number.
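To make the Erdös-Rényi picture concrete, the following Python sketch builds a small random network and tallies how many links each node holds. The network size, the linking probability, and the function name random_network are illustrative assumptions and are not taken from reference (11).

```python
import random
from collections import Counter


def random_network(n_nodes, link_prob, seed=0):
    """Link every pair of nodes independently with the same probability."""
    rng = random.Random(seed)
    neighbors = {node: set() for node in range(n_nodes)}
    for a in range(n_nodes):
        for b in range(a + 1, n_nodes):
            if rng.random() < link_prob:
                # At most one link per pair, as in the text.
                neighbors[a].add(b)
                neighbors[b].add(a)
    return neighbors


# Assumed toy values: 1000 nodes, about 4 links per node on average.
network = random_network(n_nodes=1000, link_prob=0.004)
degrees = [len(links) for links in network.values()]
mean_degree = sum(degrees) / len(degrees)
max_degree = max(degrees)
histogram = Counter(degrees)

for k in sorted(histogram):
    print(f"{k:2d} links: {histogram[k]:4d} nodes")
print(f"mean = {mean_degree:.2f}, max = {max_degree}")
```

In a run of this kind the tally should cluster tightly around the mean, with no node holding many times the average number of links. That absence of hubs is what distinguishes a random network from the scale-free networks discussed next.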

Fig. 1. Graphs (above) and diagrams (below) that illustrate a random (Erdös-Rényi) network and a scale-free network [adapted from (4)].

In the late 1990s, Barabasi and Albert set out to examine real-world networks and determine their characteristics in relation to those described and studied by Erdös and Rényi. They started with the World Wide Web. To their surprise, they discovered that the distribution of Web-page connectivity does not follow a Poisson distribution (for an example of a network that follows a Poisson distribution, see a U.S. road map). Instead, it follows a power law (Fig. 1), and there is a significant number of nodes that are very highly connected (2). These types of networks were termed "scale-free," and their characteristic hubs, such as Google or Yahoo, impart global properties that are strikingly different from those of Erdös-Rényi random networks.
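The contrast between the two distributions, and the origin of the term "scale-free," can be written compactly. The symbols used here (k for the number of links on a node, \lambda for the mean of the Poisson distribution, \gamma for the power-law exponent, and c for an arbitrary rescaling factor) are notation introduced for this illustration.

```latex
% Random (Erdös-Rényi) network: links per node follow a Poisson
% distribution with mean \lambda
P_{\text{random}}(k) = \frac{\lambda^{k} e^{-\lambda}}{k!}

% Scale-free network: links per node follow a power law with exponent \gamma > 0
P_{\text{scale-free}}(k) \propto k^{-\gamma}

% Rescaling k by any factor c changes the power law only by a constant,
% whatever the value of k
\frac{P_{\text{scale-free}}(ck)}{P_{\text{scale-free}}(k)} = c^{-\gamma}
```

Because the last ratio does not depend on k, the power law looks the same at every scale and has no typical number of links. The Poisson distribution, by contrast, falls off faster than exponentially away from \lambda, which is why it leaves essentially no room for hubs.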

As our introductory story reveals, these scale-free networks surround us. Examples include the U.S. airline system, with its arrangement of highly connected hubs; social relationships and sexual partners; author collaborations in various scientific disciplines; the Internet; the power grid of the eastern United States; the network of actors in Hollywood (made famous by the game called "Six Degrees of Kevin Bacon"); and, most relevant for this Perspective, protein interaction networks in Saccharomyces cerevisiae.

How is it that such diverse structures share similar properties? The answer appears to lie in the two principles of growth and preferential attachment (1). Instead of constructing a network from a set number of preexisting nodes attached at random (à la Erdös-Rényi), assume that we begin with a small number of nodes and then expand the network one node at a time. When a new node appears, the probability of it linking to an existing node is not uniform but is instead greater for nodes that are already more highly connected. Early nodes tend to acquire more links, making them more likely to connect to new nodes and setting up a sort of positive feedback that results in the network structure being dominated by a small number of hubs. Of course, this model closely represents what happens in the real world. Most large networks do not instantly appear. They grow from smaller structures, and preferential linking behavior is common. Highly cited papers in the scientific literature, for example, stimulate researchers to read and further cite them.
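The two principles can be sketched in a few lines of Python. This is a minimal illustration of growth with preferential attachment, not the authors' own implementation. The function name grow_network, the network size, and the choice of two links per new node are assumptions made for the example.

```python
import random
from collections import Counter


def grow_network(n_nodes, links_per_new_node=2, seed=0):
    """Grow a network one node at a time with preferential attachment."""
    rng = random.Random(seed)
    # Start from a small, fully connected core of nodes.
    core = links_per_new_node + 1
    edges = [(a, b) for a in range(core) for b in range(a + 1, core)]
    # Every node appears in this list once per link it holds, so a uniform
    # draw from the list picks nodes in proportion to their connectivity.
    endpoints = [node for edge in edges for node in edge]
    for new_node in range(core, n_nodes):
        targets = set()
        while len(targets) < links_per_new_node:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new_node, target))
            endpoints.extend((new_node, target))
    return edges


edges = grow_network(n_nodes=2000)
degree = Counter(node for edge in edges for node in edge)
mean_degree = sum(degree.values()) / len(degree)
max_degree = max(degree.values())
# Average connectivity of the ten earliest and the ten latest nodes.
oldest = sum(degree[n] for n in range(10)) / 10
newest = sum(degree[n] for n in range(1990, 2000)) / 10

print(f"mean = {mean_degree:.2f}, max = {max_degree}")
print(f"ten oldest nodes: {oldest:.1f} links, ten newest: {newest:.1f} links")
```

The earliest nodes should end up with many times the average number of links, whereas the latest arrivals hold little more than the two links they were born with. This is the positive feedback described above.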

Scale-free networks harbor other important characteristics. First, they are "small-world" networks, meaning that it requires (on average) few steps to get from any particular node to another. Despite the tremendous size of the Web, it is estimated that it would take only about 21 clicks to navigate between any two documents (2). Second, scale-free networks are resilient to random failures but highly susceptible to targeted attack. A random failure will most likely destroy a weakly connected node, because such nodes are much more common than hubs. As a result, up to 80% of randomly selected Internet documents can be removed without destroying network function; those remaining will form a coherent network with at least one path linking every pair of Web sites. A targeted attack on as few as 5 to 10% of the hubs, however, will cause tremendous disruption and possible failure of the network as a whole.
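The difference between random failure and targeted attack can be explored on a toy network. The Python sketch below grows a hub-dominated network by preferential attachment, removes 10% of its nodes either at random or in order of connectivity, and reports the share of surviving nodes that remain in the largest connected cluster. The network is simulated, not Internet data, and the function names, network size, and removal fraction are assumptions chosen for illustration.

```python
import random
from collections import Counter


def scale_free_edges(n_nodes, links_per_new_node=2, seed=0):
    """Grow a toy hub-dominated network by preferential attachment."""
    rng = random.Random(seed)
    core = links_per_new_node + 1
    edges = [(a, b) for a in range(core) for b in range(a + 1, core)]
    endpoints = [node for edge in edges for node in edge]
    for new_node in range(core, n_nodes):
        targets = set()
        while len(targets) < links_per_new_node:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new_node, target))
            endpoints.extend((new_node, target))
    return edges


def largest_cluster_fraction(n_nodes, edges, removed):
    """Share of surviving nodes that belong to the largest connected cluster."""
    neighbors = {n: [] for n in range(n_nodes) if n not in removed}
    for a, b in edges:
        if a in neighbors and b in neighbors:
            neighbors[a].append(b)
            neighbors[b].append(a)
    seen, best = set(), 0
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first walk over everything reachable from this node.
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best / len(neighbors)


N_NODES = 2000
N_REMOVED = N_NODES // 10  # knock out 10% of the nodes in both scenarios
edges = scale_free_edges(N_NODES)
degree = Counter(node for edge in edges for node in edge)

random_loss = set(random.Random(1).sample(range(N_NODES), N_REMOVED))
hub_loss = {node for node, _ in degree.most_common(N_REMOVED)}

after_failure = largest_cluster_fraction(N_NODES, edges, random_loss)
after_attack = largest_cluster_fraction(N_NODES, edges, hub_loss)
print(f"random failure: {after_failure:.2f}, targeted attack: {after_attack:.2f}")
```

With random losses, nearly all surviving nodes should remain mutually reachable, whereas removing the same number of the most connected nodes should leave a noticeably more fragmented network. The exact figures depend on the assumed parameters and should not be read as estimates for any real system.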

Several years ago, Jeong and colleagues described a network analysis of over 1800 yeast proteins (the nodes of the network), connected by over 2000 protein-protein interactions (the connections or links) identified primarily by two-hybrid data (4). They found that the distribution of the number of interactions among proteins followed a power law. There is a small number of highly connected proteins that play a principal role in interactions among many less well-connected proteins. Thus, this species' protein interaction network was shown to be scale-free.

An interesting consequence of the scale-free nature of the yeast protein network is its error tolerance. Proteins that exhibit fewer than five interactions constitute over 90% of the total, yet only 21% of them are essential. Highly connected proteins (over 15 interactions) make up less than 1% of the genome, but a single deletion is lethal in over 60% of the cases (4). This implies that hub proteins at the center of the network are significantly more likely to be essential than are proteins at the periphery. Fraser et al. (12) expanded this analysis to include many more proteins and a more quantitative measure of phenotypic effect. They showed that a protein's effect on organismal fitness (defined as the reduction in growth rate caused by deletion or disruption of the gene that encodes the protein) was significantly positively correlated with the number of protein-protein interactions with which it was associated. These are two instances where an analysis of network topology provided unique insight into basic biochemical and genetic data. More recent examples include the evolutionary conservation of groups of interacting proteins in yeast (13) and the organization of metabolic fluxes that contribute to metabolism in Escherichia coli (6).

In his recent article in the Proceedings of the Royal Society, Daniel Promislow (10) describes striking results concerning the special characteristics that aging-related proteins seem to have in the yeast protein-protein interaction network. Promislow isolated sets of proteins that have been shown to affect six well-defined phenotypes: replicative senescence, cell cycle, cell size, ultraviolet (UV) radiation sensitivity, salt tolerance, and DNA silencing. He then asked whether proteins that are linked to aging (that is, replicative senescence) or to any of the other five traits are nonrandomly distributed through the network.

To answer this question, Promislow obtained extensive information about the yeast protein network from a variety of sources. The proteins and their interactions that formed the nodes and links of his network were those used by Fraser et al. (12). Associations between individual proteins and particular phenotypes were obtained directly from the literature or, as in the case of senescence-related proteins, from existing databases such as SAGE KE's Genes/Interventions database (14). To identify pleiotropic proteins (those with associations to multiple phenotypes), he turned to the Munich Information Center for Protein Sequences, and the degree of pleiotropy was defined as the number of different functional classifications listed for each protein. When completed, the data set contained 3575 proteins, 2611 of which had data for both connectivity and pleiotropy, and over 13,000 pairwise interactions. Statistical artifacts and confounding influences can linger in the depths of such large and complicated data sets, and Promislow took careful steps to avoid pitfalls. For example, previous work has documented a significant correlation between cellular localization and connectivity: Proteins present in the nucleus tend to show higher connectivity, a phenomenon that was controlled for in the present analysis.

The Promislow paper reports two important results. First, it establishes that proteins associated with replicative senescence interact with an unexpectedly large number of other proteins. Aging-related genes are, therefore, highly connected. This holds true for the complete set of known replicative aging genes (a total of 38) and for a subset that was discovered in the Guarente lab alone (22 genes). The result is highly statistically significant, and it is derived from an essentially model-free approach. I expect that it will prove to be quite robust. Proteins associated with the cell cycle are also highly connected. This is not surprising, however, because replicative senescence is directly associated with cell division. One might be concerned that high connectivity would be characteristic of most phenotypes that have been subjected to intensive genetic study, simply because they are not a random sample of all possible traits. However, the other four characters that were examined--UV sensitivity, cell size, salt tolerance, and DNA silencing--show no such effect; with respect to network topology, proteins involved in these four phenotypes are very ordinary.

The second result is more subtle but equally interesting and important. Promislow shows that a protein's connectivity positively correlates with its degree of pleiotropy. Proteins that are associated with a single biological function link with significantly fewer proteins, on average, than do those with two or more functions. As with many discoveries, this is not surprising in hindsight. It seems natural that genes encoding proteins that impact many different traits would have significantly more interactions. As mentioned previously, proteins associated with senescence are more highly connected than expected. They are, therefore, more pleiotropic than the average protein.

Global characteristics of biological networks illuminate both ultimate and proximate questions about aging. Evolutionary studies focus on highly statistical descriptions of the relation between genotype and phenotype to test alternative hypotheses of why aging has evolved (see Day Perspective). These studies have traditionally suffered because of a lack of discriminatory hypotheses and because of untested assumptions concerning the age-specific properties of genetic effects (15). Are genes highly pleiotropic, with effects at many different ages? Are these effects consistently in one direction (either beneficial or deleterious), or are they antagonistic such that beneficial effects occur early in life and deleterious effects later on (see Williams Classic Paper)? Although I do not feel that the observations described by Promislow have solved these basic problems, they are an important first step. We now know that aging genes are indeed highly pleiotropic. A difficult but important followup question concerns how protein-protein interactions change with age. Can the temporal dynamics of network structure help predict variation in patterns of aging across different taxa?

On a proximate level, the connectedness of currently known aging-related genes provides a context for understanding their characteristics and offers guidance for further discovery. It is intriguing to consider why researchers in the field of aging have managed to identify highly connected genes when, in scale-free networks, such genes are so rare in comparison to those that are weakly connected. An answer to this question may be that genes associated with highly connected proteins have large effects, which are easier to detect. Although this hypothesis has yet to be tested, Promislow notes that aging, because of our inability to measure it precisely (in comparison to traits such as cell size), may be particularly susceptible to this sort of observation bias. Such considerations suggest that the search for new aging-related genes should be focused on those that encode highly connected proteins, whereas weakly connected proteins might be explored through enhancer/suppressor experiments that increase the probability of their detection.

It will be a priority to execute a similar analysis using data from Caenorhabditis elegans. A draft of the protein-protein interaction network has recently been published (16), and the number of genes known to affect aging in nematodes is significantly greater than in any other organism. Intriguingly, this includes genes with major effects, such as those in the daf-2 pathway, as well as those with relatively minor effects that have recently shown up in RNA interference screens (17) (see Melov Perspective) and microarray experiments (18) (see "Vital Collaboration").

Thanks predominantly to molecular genetics, research on aging has experienced remarkable advancement in the past few years (see "Aging Research Grows Up"). But we have only scratched the surface. Network analysis, although still in its infancy, may help us take the next leap forward by integrating information from various biological disciplines to provide insight that is not available from a purely reductionist viewpoint. Hard problems often rely on advancing technology and increasingly integrative methods for their solution, and aging is no exception. Fortunately, if scientists share anything with great politicians, it is conviction: "We shall not fail or falter; we shall not weaken or tire . . . Give us the tools and we will finish the job." (Winston Churchill, 1941)

May 12, 2004
  1. A. L. Barabasi, R. Albert, Emergence of scaling in random networks. Science 286, 509-512 (1999).
  2. R. Albert, H. Jeong, A.-L. Barabási, Internet: Diameter of the World-Wide Web. Nature 401, 130-131 (1999).
  3. H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, A. L. Barabasi, The large-scale organization of metabolic networks. Nature 407, 651-654 (2000).
  4. H. Jeong, S. P. Mason, A. L. Barabasi, Z. N. Oltvai, Lethality and centrality in protein networks. Nature 411, 41-42 (2001).
  5. S. H. Yook, Z. N. Oltvai, A. L. Barabasi, Functional and topological characterization of protein interaction networks. Proteomics 4, 928-942 (2004).
  6. E. Almaas, B. Kovacs, T. Vicsek, Z. N. Oltvai, A. L. Barabasi, Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature 427, 839-843 (2004).
  7. R. Cohen, K. Erez, D. ben-Avraham, S. Havlin, Breakdown of the internet under intentional attack. Phys. Rev. Lett. 86, 3682-3685 (2001).
  8. R. Cohen, S. Havlin, D. ben-Avraham, Efficient immunization strategies for computer networks and populations. Phys. Rev. Lett. 91, 247901 (2003).
  9. H. Ebel, L. I. Mielsch, S. Bornholdt, Scale-free topology of e-mail networks. Phys. Rev. E. Stat. Nonlin. Soft Matter Phys. 66, 035103 (2002).
  10. D. E. L. Promislow, Protein networks, pleiotropy and the evolution of senescence. Proc. R. Soc. Lond. B Biol. Sci., 5 May 2004 (10.1098/rspb.2004.2732) (FirstCite).
  11. P. Erdös, A. Rényi, On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5, 17-61 (1960).
  12. H. B. Fraser, A. E. Hirsh, L. M. Steinmetz, C. Scharfe, M. W. Feldman, Evolutionary rate in the protein interaction network. Science 296, 750-752 (2002).
  13. S. Wuchty, Z. N. Oltvai, A. L. Barabasi, Evolutionary conservation of motif constituents in the yeast protein interaction network. Nat. Genet. 35, 176-179 (2003).
  14. M. Kaeberlein, B. Jegalian, M. McVey, AGEID: a database of aging genes and interventions. Mech. Ageing Dev. 123, 1115-1119 (2002).
  15. D. E. Promislow, S. D. Pletcher, Advice to an aging scientist. Mech. Ageing Dev. 123, 841-850 (2002).
  16. S. Li, C. M. Armstrong, N. Bertin, H. Ge, S. Milstein, M. Boxem, P. O. Vidalain, J. D. Han, A. Chesneau, T. Ha et al., A map of the interactome network of the metazoan C. elegans. Science 303, 540-543 (2004).
  17. S. S. Lee, R. Y. Lee, A. G. Fraser, R. S. Kamath, J. Ahringer, G. Ruvkun, A systematic RNAi screen identifies a critical role for mitochondria in C. elegans longevity. Nat. Genet. 33, 40-48 (2003).
  18. C. T. Murphy, S. A. McCarroll, C. I. Bargmann, A. Fraser, R. S. Kamath, J. Ahringer, H. Li, C. Kenyon, Genes that act downstream of DAF-16 to influence the lifespan of Caenorhabditis elegans. Nature 424, 277-283 (2003).
Citation: S. D. Pletcher, Vital Connections. Sci. Aging Knowl. Environ. 2004 (19), pe19 (2004).

Science of Aging Knowledge Environment. ISSN 1539-6150