Publication
ParaKMeans: Implementation of a Parallelized K-means algorithm Suitable for
General Laboratory Use.
Authors
Piotr Kraj, Ashok Sharma, Nikhil Garge, Robert Podolsky, and Richard A McIndoe
Submitted By
Richard McIndoe on 3/21/2008
Status
Published
Journal
BMC Bioinformatics [electronic resource]
Year
2008
Date Published
5/1/2008
Volume : Pages
9 : 200
PubMed Reference
18416829
Abstract
Background
During the last decade, the use of microarrays to assess the transcriptome of
many biological systems has generated an enormous amount of data. A common
technique used to organize and analyze microarray data is to perform cluster
analysis. While many clustering algorithms have been developed, they all suffer
a significant decrease in computational performance as the size of the dataset
being analyzed becomes very large. For example, clustering 10,000 genes from an
experiment containing 200 microarrays can be quite time-consuming and
challenging on a desktop PC. One solution to the scalability problem of
clustering algorithms is to distribute or parallelize the algorithm across
multiple computers.
Results
The software described in this paper is a high performance multithreaded
application that implements a parallelized version of the K-means Clustering
algorithm. Most parallel processing applications are not accessible to the
general public and require specialized software libraries (e.g. MPI) and
specialized hardware configurations. The parallel nature of the application
comes from the use of a web service to perform the distance calculations and
cluster assignments. Here we show that our parallel implementation provides
significant performance gains over a wide range of datasets using as few as
seven nodes. The software was written in C# and was designed in a modular
fashion to provide flexibility in both deployment and the
user interface.
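The division of work described above, where nodes perform the distance calculations and cluster assignments while a client combines the results, can be sketched roughly as follows. This is a minimal illustration and not the ParaKMeans code, which is written in C# and calls a web service. A local thread pool stands in for the compute nodes, and the names `assign_chunk` and `parallel_kmeans` are hypothetical. Python threads will not speed up this pure-Python arithmetic; the sketch only shows how the computation could be partitioned.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def assign_chunk(chunk, centroids):
    """Worker step (hypothetical): label one chunk and return partial sums."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    labels = []
    for row in chunk:
        # Squared Euclidean distance is enough to pick the nearest centroid.
        best = min(
            range(k),
            key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centroids[c])),
        )
        labels.append(best)
        counts[best] += 1
        for j, x in enumerate(row):
            sums[best][j] += x
    return labels, sums, counts


def parallel_kmeans(data, centroids, n_workers=4, max_iter=100):
    """Coordinator step (hypothetical): split rows, merge results, update centroids."""
    size = max(1, math.ceil(len(data) / n_workers))
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    centroids = [list(c) for c in centroids]
    k, dim = len(centroids), len(centroids[0])
    labels = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(max_iter):
            # Each chunk is processed independently, as a node would do it.
            results = list(pool.map(assign_chunk, chunks, [centroids] * len(chunks)))
            new_labels = [label for r in results for label in r[0]]
            new_centroids = []
            for c in range(k):
                n = sum(r[2][c] for r in results)
                if n == 0:
                    # Assumption: an empty cluster keeps its previous centroid.
                    new_centroids.append(centroids[c])
                else:
                    new_centroids.append(
                        [sum(r[1][c][j] for r in results) / n for j in range(dim)]
                    )
            converged = new_labels == labels
            labels, centroids = new_labels, new_centroids
            if converged:
                break
    return labels, centroids
```

Because each worker returns only per-cluster sums and counts along with its labels, the coordinator can recompute centroids without seeing the raw rows again, which keeps the data exchanged per iteration small.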
Conclusions
ParaKMeans was designed to provide the general scientific community with an easy
and manageable client-server application that can be installed on a wide variety
of Windows operating systems.