Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4360)

Abstract

Linkage analysis is a tool used by geneticists for mapping disease-susceptibility genes in the study of Mendelian and complex diseases. However, analyses of large inbred pedigrees with extensive missing data are often beyond the capabilities of a single computer. We present a distributed system called Superlink-online for computing multipoint LOD scores of large inbred pedigrees. It achieves high performance through efficient parallelization of the algorithms in Superlink, a state-of-the-art serial program for these tasks, and through utilization of thousands of resources residing in multiple opportunistic grid environments. Notably, the system is available online, which allows computationally intensive analyses to be performed without installing software or maintaining a complicated distributed environment. The main algorithmic challenges have been to split large tasks efficiently for distributed execution in a highly dynamic, non-dedicated running environment, and to utilize resources in all the available grid environments. Meeting these challenges has provided nearly interactive response times for shorter tasks while simultaneously serving massively parallel ones. The system, which is used extensively by medical centers worldwide, achieves speedups of up to three orders of magnitude and allows analyses that were previously infeasible.

This work is supported by the Israeli Ministry of Science.
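
For readers unfamiliar with the quantity named in the abstract, a LOD score is the base-10 logarithm of the ratio between the likelihood of the data at a given recombination fraction and the likelihood under no linkage (recombination fraction 0.5). The sketch below is a textbook two-point simplification that assumes phase-known, fully informative meioses. It is not the multipoint algorithm used by Superlink, and the function names `two_point_lod` and `max_lod` are hypothetical, introduced only for illustration.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    Illustrative simplification only: the likelihood is taken to be
    theta**r * (1 - theta)**(n - r), compared against theta = 0.5.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    # Work in log space to avoid underflow for large pedigrees.
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


def max_lod(recombinants: int, meioses: int, steps: int = 50):
    """Scan a grid of theta values in (0, 0.5] and return (best_theta, best_lod)."""
    grid = [0.5 * (i + 1) / steps for i in range(steps)]
    scored = [(theta, two_point_lod(recombinants, meioses, theta)) for theta in grid]
    return max(scored, key=lambda pair: pair[1])
```

Calling `max_lod(1, 10)` scans fifty candidate recombination fractions and returns the one with the highest score. Real pedigrees with missing data, inbreeding loops, and many markers require the exact multipoint computations that the system described above distributes across grid resources.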

Editor information

Werner Dubitzky, Assaf Schuster, Peter M. A. Sloot, Michael Schroeder, Mathilde Romberg

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Silberstein, M., Geiger, D., Schuster, A. (2007). A Distributed System for Genetic Linkage Analysis. In: Dubitzky, W., Schuster, A., Sloot, P.M.A., Schroeder, M., Romberg, M. (eds) Distributed, High-Performance and Grid Computing in Computational Biology. GCCB 2007. Lecture Notes in Computer Science, vol 4360. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69968-2_9

  • DOI: https://doi.org/10.1007/978-3-540-69968-2_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69841-8

  • Online ISBN: 978-3-540-69968-2

  • eBook Packages: Computer Science, Computer Science (R0)
