Integration of Clustering and Multidimensional Scaling to Determine Phylogenetic Trees as Spherical Phylograms Visualized in 3 Dimensions

Introduction

Phylogenetic analysis is commonly used to analyze genetic sequence data from fungal communities, while ordination and clustering techniques are commonly used to analyze sequence data from bacterial communities. However, few studies have attempted to link these two independent approaches. We propose a method, which we call the spherical phylogram (SP), that displays the phylogenetic tree within the clustering and visualization result produced by a pipeline called DACIDR. Unlike traditional tree display methods, the SP allows the correlations between the tree and the clustering to be observed directly. In addition, we propose an algorithm called interpolative joining (IJ) to construct and visualize the SP in 3D space.

Mega Region Visualization of full data (446k Fungi Data)

Cluster Visualization from Mega Region 0

Spherical phylogram visualized using the phylogenetic tree generated by RAxML from the representative sequences and reference sequences. The color scheme is the same as in Figure 2.

Visualization of all the clusters found by Recursive Clustering

Figure 2. Maximum likelihood phylogenetic tree built from the reference sequences and the representative sequences found in each cluster, collapsed into clades at the genus level as denoted by colored triangles at the ends of the branches. Branch lengths denote levels of sequence divergence between genera, and nodes are labeled with bootstrap confidence values. Representative sequences from spores that are not part of another clade are denoted with the label ‘454 sequence from spore’. This figure was generated with FigTree.
Figure 1. Screenshots of visualization results after data clustering.

[Flowchart node labels]
DACIDR: Input Sequences, Pairwise Sequence Alignment, Dissimilarity Matrix, Pairwise Clustering, Multidimensional Scaling, Sample Clustering Result, Interpolation, Visualization
Recursive Clustering: Mega Region 0 (DACIDR), Mega Region 1 (DACIDR), ……, Mega Region N, Find Cluster Centers, Mega Region Result, Recursive Clustering, Final Clustering Result, Visualization
3D Phylogenetic Tree Visualization: Representative Sequences, Reference Sequences, RAxML, DACIDR, Interpolative Joining, Spherical Phylogram, Visualization

Figure 3. Flowchart of large-scale data clustering and visualization, based on the MPI and MapReduce parallel computing frameworks.

Contacts: Yang Ruan ([email protected]), Saliya Ekanayake ([email protected]), Geoffrey Fox ([email protected])
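The "Dissimilarity Matrix" and "Multidimensional Scaling" stages named in the flowchart can be illustrated with a small sketch. This is not DACIDR's implementation, and the poster does not specify which alignment or scaling algorithms the pipeline uses. The sketch assumes a toy mismatch-fraction dissimilarity on pre-aligned, equal-length sequences as a stand-in for pairwise sequence alignment, and textbook classical (Torgerson) MDS as the scaling method. The function names and the example sequences are hypothetical.

```python
import numpy as np


def mismatch_fraction(a: str, b: str) -> float:
    """Toy dissimilarity: fraction of positions at which two sequences differ.

    Assumption: the sequences are already aligned and of equal length.
    A real pipeline would derive this value from a pairwise alignment.
    """
    if len(a) != len(b):
        raise ValueError("toy example expects equal-length sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def dissimilarity_matrix(seqs):
    """Symmetric n x n matrix of pairwise dissimilarities, zero diagonal."""
    n = len(seqs)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = mismatch_fraction(seqs[i], seqs[j])
    return d


def classical_mds(d: np.ndarray, dim: int = 3) -> np.ndarray:
    """Classical (Torgerson) MDS: embed n points in `dim` dimensions."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared dissimilarities to get a Gram-like matrix.
    gram = -0.5 * centering @ (d ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)           # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:dim]          # indices of the largest ones
    scale = np.sqrt(np.clip(vals[top], 0.0, None))  # drop negative eigenvalues
    return vecs[:, top] * scale


# Hypothetical toy input, not data from the poster.
seqs = ["ACGTACGT", "ACGTACGA", "TTGTACCA", "TTGTACCT", "ACGAACGT"]
d = dissimilarity_matrix(seqs)
coords = classical_mds(d, dim=3)
print(coords.shape)
```

Each row of `coords` is one sequence placed in 3D, which is the kind of point set a clustering step or a 3D viewer could then consume. Classical MDS is chosen here only because it fits in a few lines of NumPy.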