content/contribution-data-visualization.md
Lines changed: 8 additions & 8 deletions
@@ -15,23 +15,23 @@ title: "Visualizing Complexity: A Universal Ecosystem for Multi-Scale Scientific
## Defining the Standards: A Universal Framework for Multi-Scale Scientific Visualization
-Modern scientific discovery relies on the ability to transform high-dimensional data into intuitive, accurate, and reproducible visual narratives. Our work has established a **comprehensive visualization ecosystem** that spans from foundational plotting algebra to specialized interpretation frameworks for phylogeny and functional omics.
+Modern scientific discovery relies on the ability to transform high-dimensional data into intuitive, accurate, and reproducible visual narratives. Our work has established a **comprehensive visualization ecosystem** that spans from foundational plotting frameworks to specialized interpretation frameworks for phylogeny and functional omics.
## 1. Methodological Foundations: Bridging Systems and Logics
-To address the fragmentation of the R plotting landscape, we developed a suite of tools that unify disparate plotting systems and introduce rigorous layout algebra.
+To address the fragmentation of the R plotting landscape, we developed a suite of tools that unify disparate plotting systems and introduce rigorous data-driven alignment.
-* **[plotbb](https://github.com/YuLab-SMU/plotbb) (Seamless Base R Bridging):** Brings the Grammar of Graphics to Base R, allowing for structured, layered plotting within the traditional R graphics system.
+* **[plotbb](https://github.com/YuLab-SMU/plotbb) (Grammar of Graphics for Base R):** Brings the Grammar of Graphics to Base R, allowing for structured, layered plotting within the traditional R graphics system.
* **[ggplotify](https://cran.r-project.org/package=ggplotify) (System Interoperability):** Allows researchers to convert virtually any plot object (**Base**, **Lattice**, **pheatmap**, etc.) into a **ggplot2** compatible object, enabling seamless integration and complex assembly within the ggplot2 ecosystem.
-* **[aplot](https://cran.r-project.org/package=aplot) & [aplotExtra](https://github.com/YuLab-SMU/aplotExtra) (Layout Algebra):** These packages introduce an algebraic approach to plot alignment, allowing heterogeneous subplots to reconcile their coordinate systems automatically.
+* **[aplot](https://cran.r-project.org/package=aplot) & [aplotExtra](https://github.com/YuLab-SMU/aplotExtra) (Data-Driven Alignment):** Moving beyond simple figure assembly, this suite introduces a systematic approach to automatically synchronize coordinate systems based on the underlying data structure (e.g., matching axes and reordering categories), ensuring heterogeneous subplots are both statistically and spatially congruent.
* **[ggbreak](https://cran.r-project.org/package=ggbreak) (Dynamic Range Logic):** Provides a seamless, non-destructive method for axis breaks, essential for visualizing datasets with extreme outliers or multi-scale distributions.
* **[ggtangle](https://github.com/YuLab-SMU/ggtangle) & [ggflow](https://github.com/YuLab-SMU/ggflow) (Relational & Process Flow):** **ggtangle** reimagines network visualization within the tidy framework, while **ggflow** provides a dedicated grammar for flowcharts and transition processes, bridging the gap between static relationships and dynamic workflows.
-* **[ggfun](https://cran.r-project.org/package=ggfun) (UX & Utilities):** Provides foundational utilities that enhance the developer and user experience across the entire ecosystem.
+* **[ggfun](https://cran.r-project.org/package=ggfun) (Utilities):** Provides foundational utilities that enhance the developer and user experience across the entire ecosystem.
-<strong>Logic Unification:</strong> By treating plots as first-class algebraic objects, these tools allow for the "compositional" creation of complex figures that were previously impossible or required significant manual effort.
+<strong>Logic Unification:</strong> Together, these tools establish a unified framework that bridges plotting systems, enables cross-system interoperability, and introduces intelligent data-driven alignment—transforming the creation of complex, multi-panel figures from a manual challenge into a systematic, reproducible workflow.
## 2. Specialized Academic Domains: Deep Integration
@@ -43,7 +43,7 @@ Beyond general-purpose utilities, we have pioneered visualization standards in s
* **Functional Discovery ([Knowledge Mining Contribution](/contribution-knowledge-mining)):** [**enrichplot**](https://bioconductor.org/packages/enrichplot) transforms abstract enrichment results into statistically rigorous and biologically intuitive visual insights, enabling the automated interpretation of massive omics datasets.
* **Sequence & Genomic Landscapes:** [**seqcombo**](https://github.com/YuLab-SMU/seqcombo) and [**ggmsa**](https://bioconductor.org/packages/ggmsa) provide a modular grammar for multiple sequence alignment and the visualization of **genomic reassortment** events. These tools facilitate the visual exploration of segmental exchanges and complex evolutionary associations, ensuring that structural and genomic conservation is accessible at multiple scales.
* **Single-Cell & Fine-Scale Omics:** [**ggsc**](https://github.com/YuLab-SMU/ggsc) and [**ivolcano**](https://github.com/YuLab-SMU/ivolcano) address the unique needs of high-resolution data, providing specialized geometries and interactive exploration for single-cell clusters and differential expression.
-* **Glycobiology & Complex Carbohydrates:** [**gglycan**](https://github.com/YuLab-SMU/gglycan) introduces a grammar for visualizing complex glycan structures. By supporting standard symbolic nomenclature (e.g., SNFG), it enables researchers to integrate glycomic data with other biological layers, bridging a critical gap in multi-omics synthesis.
+* **Glycobiology & Complex Carbohydrates:** [**gglycan**](https://github.com/YuLab-SMU/gglycan) introduces a grammar for visualizing complex glycan structures. By supporting standard symbolic nomenclature (e.g., SNFG), it enables researchers to integrate glycomic data with other biological layers.
<strong>Domain Leadership:</strong> These tools are not mere plotting scripts but are **interpretative frameworks** cited in thousands of studies across <em>Nature</em>, <em>Science</em>, and <em>Cell</em>.
@@ -58,7 +58,7 @@ To bridge the gap between abstract data and human intuition, we developed tools
* **[ggimage](https://cran.r-project.org/package=ggimage) & [scatterpie](https://cran.r-project.org/package=scatterpie):** Extending the visual vocabulary of **ggplot2** to include external imagery and composite geometries (like pie-charts within coordinates), allowing for "presentation-ready" visuals.
* **[ggstar](https://cran.r-project.org/package=ggstar) (Multi-Variable Aesthetics):** Provides a comprehensive suite of easily discernible polygonal shapes for **ggplot2**. Similar to `geom_point`, it enables researchers to create scatter plots with enhanced visual distinction across high-dimensional data variables.
* **[emojifont](https://cran.r-project.org/package=emojifont), [shadowtext](https://cran.r-project.org/package=shadowtext), & [meme](https://cran.r-project.org/package=meme):** Enhancing semantic storytelling through advanced typography and cultural icons. These tools improve optical clarity through text halos and allow for creative, engaging data interaction, bridging the gap between formal analysis and impactful communication.
-* **[hexSticker](https://cran.r-project.org/package=hexSticker):** Revolutionizing how R developers brand their work. **hexSticker** has established the "Hex Logo" as the universal symbol of professional R package development.
+* **[hexSticker](https://cran.r-project.org/package=hexSticker):** Facilitating professional branding for R developers. **hexSticker** has established the "Hex Logo" as the universal symbol of professional R package development.
<strong>Scientific Communication:</strong> These tools empower researchers to communicate complex data with clarity, impact, and professional polish, fostering broader adoption of open-source science.
content/contribution-knowledge-mining.md
Lines changed: 15 additions & 11 deletions
@@ -13,37 +13,41 @@ title: "Semantic Knowledge Mining: Deciphering the Functional Architecture of Li
</style>
-## From Big Data to Biological Intelligence: A Universal Framework for Semantic Knowledge Mining
+## From Big Data to Biological Insights: A Universal Framework for Semantic Knowledge Mining
-Translating massive omics datasets into clinical insights requires more than simple computation; it demands a deep understanding of the **functional architecture of life**. Our work has established a comprehensive **Semantic Knowledge Mining Framework** that serves as the bridge between raw biological big data and actionable intelligence. By quantifying biological intelligence, pioneering comparative multi-condition analysis, and integrating knowledge priors into high-resolution single-cell and spatial landscapes, we have provided the global research community with the tools to decipher molecular mechanisms at unprecedented scales.
+Translating massive omics datasets into clinical insights requires more than simple computation; it demands a deep understanding of the **functional architecture of life**. Our work has established a comprehensive **Semantic Knowledge Mining Framework** that serves as the bridge between raw biological big data and actionable insights. By quantifying biological knowledge, pioneering comparative multi-condition analysis, and integrating knowledge priors into high-resolution single-cell and spatial landscapes, we have provided the global research community with the tools to decipher molecular mechanisms at unprecedented scales.
----
-## Pillar 1: Quantifying Biological Intelligence — The Calculus of Knowledge
+## Pillar 1: Quantifying Biological Knowledge — The Calculus of Functional Similarity
-Before knowledge can be mined, it must be quantified. We established the mathematical prerequisites for biological discovery by developing semantic similarity measures that map fuzzy biological concepts onto rigorous mathematical spaces.
+Before knowledge can be mined, it must be quantified. We provided comprehensive software implementations of semantic similarity measures, enabling researchers to systematically quantify fuzzy biological concepts within rigorous mathematical frameworks.
-<strong>Semantic Quantification:</strong> Through [**GOSemSim**](https://bioconductor.org/packages/GOSemSim), [**DOSE**](https://bioconductor.org/packages/DOSE), and [**meshes**](https://bioconductor.org/packages/meshes), we provided the first comprehensive software suite for measuring functional similarity among gene products across Gene Ontology, Disease Ontology, and MeSH domains. This work, starting from <em>Bioinformatics</em> (2010), enables the prediction of gene functions, disease-gene associations, and drug-repurposing candidates by leveraging the underlying semantic structure of biomedical knowledge.
+<strong>Semantic Quantification:</strong> Through [**GOSemSim**](https://bioconductor.org/packages/GOSemSim), [**DOSE**](https://bioconductor.org/packages/DOSE), and [**meshes**](https://bioconductor.org/packages/meshes), we provided a comprehensive software suite for measuring functional similarity among gene products across Gene Ontology, Disease Ontology, and MeSH domains. This work, starting from <em>Bioinformatics</em> (2010), enables the prediction of gene functions, disease-gene associations, and drug-repurposing candidates by leveraging the underlying semantic structure of biomedical knowledge.
## Pillar 2: The Universal Enrichment Framework — A Paradigm Shift in Theme Discovery
A cornerstone of our contribution is the creation of a **universal mechanistic engine** for functional genomics. By establishing [**clusterProfiler**](https://bioconductor.org/packages/clusterProfiler), we shifted the field from static gene-list analysis to dynamic discovery.
-<strong>Comparative Theme Discovery:</strong> We pioneered the concept of **Comparative Biological Theme Analysis** (<em>OMICS</em> 2012), allowing researchers to compare functional profiles across complex experimental designs—multiple time points, genotypes, and drug treatments—simultaneously. This paradigm has been integrated into over 40 bioinformatics tools and cited over 40,000 times, becoming the <em>de facto</em> global standard as detailed in our **<em>Nature Protocols</em>** (2024) feature.
+<strong>Comparative Theme Discovery:</strong> We pioneered the concept of **Comparative Biological Theme Analysis** (<em>OMICS</em> 2012), allowing researchers to compare functional profiles across complex experimental designs—multiple time points, genotypes, and drug treatments—simultaneously. This paradigm has been integrated into over 40 bioinformatics tools and cited over 40,000 times, becoming a widely-adopted framework as detailed in our **<em>Nature Protocols</em>** (2024) feature.
<br><br>
<strong>Universal Extensibility:</strong> **clusterProfiler** provides a generic platform that breaks the model-organism barrier, supporting thousands of species and any custom functional annotation, from KEGG and GO to Reactome and disease ontologies.
-The microbiome represents one of the final frontiers in functional discovery. Our team has developed specialized infrastructures to address the unique challenges of inter-species interactions and metabolic flow in microbial communities.
+The microbiome presents unique challenges in functional discovery. Our team has developed a comprehensive ecosystem of tools spanning from data structure standardization and ecological analysis to functional enrichment and metabolic profile prediction.
-<strong>Tidy Ecological Mining:</strong> Through [**MicrobiotaProcess**](https://bioconductor.org/packages/MicrobiotaProcess) and [**MicrobiomeProfiler**](https://bioconductor.org/packages/MicrobiomeProfiler), we established a **tidy framework** for microbiome data mining. **MicrobiotaProcess** provides a comprehensive pipeline for ecological analysis (alpha/beta diversity, biomarker discovery), while **MicrobiomeProfiler** enables functional enrichment analysis specifically tailored to the metabolic and genomic context of microbial datasets.
+<strong>Tidy Microbiome Data Framework:</strong> [**MicrobiotaProcess**](https://bioconductor.org/packages/MicrobiotaProcess) introduces the MPSE class and establishes a tidy microbiome data structure paradigm, providing a unified framework for analysis, visualization, and biomarker discovery. By integrating seamlessly with the existing computing ecosystem, it enables a wide variety of microbiome data analysis procedures including alpha/beta diversity analysis and biomarker identification under a common tidy-like framework.
+<br><br>
+<strong>Functional Enrichment for Microbiome:</strong> [**MicrobiomeProfiler**](https://bioconductor.org/packages/MicrobiomeProfiler) extends the **clusterProfiler** framework to microbiome data, supporting KEGG enrichment, COG enrichment, Microbe-Disease association analysis, and Metabo-Pathway analysis. This R/Shiny package bridges microbial community composition with functional and clinical interpretations.
+<br><br>
+<strong>Microbiome-Metabolome Integration:</strong> [**MMINP**](https://github.com/YuLab-SMU/MMINP) implements a computational framework to predict microbial community-based metabolic profiles using O2PLS models. By training on paired microbiome and metabolome data, MMINP enables metabolite prediction in analogous environments using only microbial feature abundances, bridging the gap between community structure and metabolic function.
@@ -52,13 +56,13 @@ The microbiome represents one of the final frontiers in functional discovery. Ou
Biological function is often encoded in the non-coding genome. We developed the infrastructure to bridge the gap between genomic cis-regulatory positions and biological function.
-<strong>Cistromic Annotation:</strong> [**ChIPseeker**](https://bioconductor.org/packages/ChIPseeker) (<em>Bioinformatics</em> 2015) has become a global cornerstone for epigenomic annotation. It provides a comprehensive framework for annotating genomic peaks (ChIP-seq, ATAC-seq, DNase-seq) with biological context and facilitating cross-dataset comparison and mining within the GEO database.
+<strong>Cistromic Annotation:</strong> [**ChIPseeker**](https://bioconductor.org/packages/ChIPseeker) (<em>Bioinformatics</em> 2015) provides a widely-used framework for epigenomic annotation. It enables comprehensive annotation of genomic peaks (ChIP-seq, ATAC-seq, DNase-seq) with biological context, facilitates cross-dataset comparison, and integrates with the GEO database for comparative analysis. Building upon this foundation, [**epiSeeker**](https://github.com/YuLab-SMU/epiSeeker) extends the framework to support multi-omics epigenetic data analysis, handling both fragment-type and base-type data. **epiSeeker** incorporates motif analysis, statistical methods for overlap significance estimation, and advanced visualization capabilities including average profiles, heatmaps of TSS-binding regions, and single-base resolution epigenetic data visualization considering strand orientation and motif information.
## Pillar 5: Knowledge-Driven Discovery in Single-Cell & Spatial Omics
-As biological data moves toward higher resolution, the integration of prior knowledge becomes essential for interpretability. Our team has pioneered methods to incorporate biological intelligence into single-cell and spatial discovery.
+As biological data moves toward higher resolution, the integration of prior knowledge becomes essential for interpretability. Our team has developed methods to incorporate biological knowledge priors into single-cell and spatial discovery.
<strong>Biological Latent Spaces:</strong> Through [**MSGNN**](https://github.com/YuLab-SMU/MSGNN), we proposed integrating biological knowledge priors directly into the graph-based community detection process via graph neural networks. This ensures that single-cell clustering outcomes are not only mathematically robust but also align closely with biological interpretations.
@@ -69,7 +73,7 @@ As biological data moves toward higher resolution, the integration of prior know
## Pillar 6: From Abstract Results to Intuitive Insight — Visual Logic
-A functional discovery is only complete when it can be interpreted. Our team has pioneered the "Visual Logic" of enrichment, transforming abstract statistical tables into rigorous visual narratives.
+A functional discovery is only complete when it can be interpreted. Our team has developed the "Visual Logic" of enrichment, transforming abstract statistical tables into rigorous visual narratives.
<strong>Interpretive Synthesis:</strong> Through [**enrichplot**](https://bioconductor.org/packages/enrichplot), we established the visualization standards for functional discovery. By implementing sophisticated geometries (e.g., dot plots, gene-concept networks, upset plots), **enrichplot** enables researchers to visually synthesize complex enrichment results, uncovering the hidden relationships between biological themes and their underlying molecular drivers.