Commit 6f3df8f

Documentation update and removal of old documentation files
1 parent a31fc4b commit 6f3df8f

5 files changed

Lines changed: 75 additions & 417 deletions

docs/source/citation.md

Lines changed: 73 additions & 14 deletions
@@ -1,10 +1,10 @@
11
# Citing STREAMLINE
22

3-
If you use STREAMLINE in a scientific publication, please consider citing the following paper:
3+
If you use STREAMLINE in a scientific publication, please consider citing the following paper and noting the *release* used in the manuscript (e.g., the Beta 0.2.4 release was used in the publication below):
44

5-
Urbanowicz, Ryan, et al. "STREAMLINE: A Simple, Transparent, End-To-End Automated Machine Learning Pipeline Facilitating Data Analysis and Algorithm Comparison." Genetic Programming Theory and Practice XIX. Singapore: Springer Nature Singapore, 2023. 201-231.
5+
[Urbanowicz, Ryan, et al. "STREAMLINE: A Simple, Transparent, End-To-End Automated Machine Learning Pipeline Facilitating Data Analysis and Algorithm Comparison." Genetic Programming Theory and Practice XIX. Singapore: Springer Nature Singapore, 2023. 201-231.](https://link.springer.com/chapter/10.1007/978-981-19-8460-0_9)
66

7-
BibTeX entry:
7+
BibTeX Citation:
88
```
99
@incollection{urbanowicz2023streamline,
1010
title={STREAMLINE: A Simple, Transparent, End-To-End Automated Machine Learning Pipeline Facilitating Data Analysis and Algorithm Comparison},
@@ -16,7 +16,7 @@ BibTeX entry:
1616
}
1717
```
1818

19-
If you wish to cite the STREAMLINE codebase, please use the following (indicating the release used in the link: for example, v0.2.5-beta:
19+
If you wish to cite the STREAMLINE codebase instead, please use the following (indicating the release used in the link, for example, v0.2.5-beta):
2020
```
2121
@misc{streamline2022,
2222
author = {Urbanowicz, Ryan and Zhang, Robert},
@@ -27,11 +27,67 @@ If you wish to cite the STREAMLINE codebase, please use the following (indicatin
2727
howpublished = {\url{https://github.com/UrbsLab/STREAMLINE/releases/tag/v0.2.5-beta} }
2828
}
2929
```
30-
## Other STREAMLINE related research
30+
## STREAMLINE Applications
31+
This section provides citations to publications applying STREAMLINE in recent research.
32+
33+
* [Exploring Automated Machine Learning for Cognitive Outcome Prediction from Multimodal Brain Imaging using STREAMLINE](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283099/)
34+
```
35+
@article{wang2023exploring,
36+
title={Exploring Automated Machine Learning for Cognitive Outcome Prediction from Multimodal Brain Imaging using STREAMLINE},
37+
author={Wang, Xinkai and Feng, Yanbo and Tong, Boning and Bao, Jingxuan and Ritchie, Marylyn D and Saykin, Andrew J and Moore, Jason H and Urbanowicz, Ryan and Shen, Li},
38+
journal={AMIA Summits on Translational Science Proceedings},
39+
volume={2023},
40+
pages={544},
41+
year={2023},
42+
publisher={American Medical Informatics Association}
43+
}
44+
```
45+
46+
* [Comparing Amyloid Imaging Normalization Strategies for Alzheimer’s Disease Classification using an Automated Machine Learning Pipeline](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283108/)
47+
```
48+
@article{tong2023comparing,
49+
title={Comparing Amyloid Imaging Normalization Strategies for Alzheimer’s Disease Classification using an Automated Machine Learning Pipeline},
50+
author={Tong, Boning and Risacher, Shannon L and Bao, Jingxuan and Feng, Yanbo and Wang, Xinkai and Ritchie, Marylyn D and Moore, Jason H and Urbanowicz, Ryan and Saykin, Andrew J and Shen, Li},
51+
journal={AMIA Summits on Translational Science Proceedings},
52+
volume={2023},
53+
pages={525},
54+
year={2023},
55+
publisher={American Medical Informatics Association}
56+
}
57+
```
58+
59+
* [Toward Predicting 30-Day Readmission Among Oncology Patients: Identifying Timely and Actionable Risk Factors](https://ascopubs.org/doi/abs/10.1200/CCI.22.00097)
60+
```
61+
@article{hwang2023toward,
62+
title={Toward Predicting 30-Day Readmission Among Oncology Patients: Identifying Timely and Actionable Risk Factors},
63+
author={Hwang, Sy and Urbanowicz, Ryan and Lynch, Selah and Vernon, Tawnya and Bresz, Kellie and Giraldo, Carolina and Kennedy, Erin and Leabhart, Max and Bleacher, Troy and Ripchinski, Michael R and others},
64+
journal={JCO Clinical Cancer Informatics},
65+
volume={7},
66+
pages={e2200097},
67+
year={2023},
68+
publisher={Wolters Kluwer Health}
69+
}
70+
```
71+
72+
* [Identifying Barriers to Post-Acute Care Referral and Characterizing Negative Patient Preferences Among Hospitalized Older Adults Using Natural Language Processing](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148308/)
73+
```
74+
@inproceedings{kennedy2022identifying,
75+
title={Identifying Barriers to Post-Acute Care Referral and Characterizing Negative Patient Preferences Among Hospitalized Older Adults Using Natural Language Processing},
76+
author={Kennedy, Erin E and Davoudi, Anahita and Hwang, Sy and Freda, Philip J and Urbanowicz, Ryan and Bowles, Kathryn H and Mowery, Danielle L},
77+
booktitle={AMIA Annual Symposium Proceedings},
78+
volume={2022},
79+
pages={606},
80+
year={2022},
81+
organization={American Medical Informatics Association}
82+
}
83+
```
84+
85+
## Other STREAMLINE Related Research
3186
In developing STREAMLINE we integrated a number of methods and lessons learned from our lab's previous research. We briefly summarize and provide citations for each.
3287

3388
### A rigorous ML pipeline for binary classification
34-
A preprint describing an early version of what would become STREAMLINE applied to pancreatic cancer.
89+
A [preprint](https://arxiv.org/abs/2008.12829) describing an early version of what would become STREAMLINE applied to pancreatic cancer.
90+
3591
```
3692
@article{urbanowicz2020rigorous,
3793
title={A Rigorous Machine Learning Analysis Pipeline for Biomedical Binary Classification: Application in Pancreatic Cancer Nested Case-control Studies with Implications for Bias Assessments},
@@ -41,7 +97,7 @@ A preprint describing an early version of what would become STREAMLINE applied t
4197
}
4298
```
4399

44-
The STREAMLINE preprint.
100+
The STREAMLINE [preprint](https://arxiv.org/abs/2206.12002).
45101
```
46102
@article{urbanowicz2022streamline,
47103
title={STREAMLINE: A Simple, Transparent, End-To-End Automated Machine Learning Pipeline Facilitating Data Analysis and Algorithm Comparison},
@@ -52,7 +108,7 @@ The STREAMLINE preprint.
52108
```
53109

54110
### Relief-based feature importance estimation
55-
One of the two feature importance algorithms used by STREAMLINE is MultiSURF, a Relief-based filter feature importance algorithm that can prioritize features involved in either univariate or multivariate feature interactions associated with outcome. We believe that it is important to have at least one 'interaction-sensitive' feature importance algorithm involved in feature selection prior such that relevant features involved in complex associations are not filtered out prior to modeling. The paper below is an introduction and review of Relief-based algorithms.
111+
One of the two feature importance algorithms used by STREAMLINE is MultiSURF, a Relief-based filter feature importance algorithm that can prioritize features involved in either univariate or multivariate feature interactions associated with outcome. We believe that it is important to have at least one 'interaction-sensitive' feature importance algorithm involved in feature selection so that relevant features involved in complex associations are not filtered out prior to modeling. The [paper below](https://www.sciencedirect.com/science/article/pii/S1532046418301400) is an introduction and review of Relief-based algorithms.
56112
```
57113
@article{urbanowicz2018relief,
58114
title={Relief-based feature selection: Introduction and review},
@@ -64,7 +120,7 @@ One of the two feature importance algorithms used by STREAMLINE is MultiSURF, a
64120
publisher={Elsevier}
65121
}
66122
```
67-
This next published research paper compared a number of Relief-based algorithms and demonstrated best overall performance with MultiSURF out of all evaluated. This second paper also introduced 'ReBATE', a scikit-learn package of Releif-based feature importance/selection algorithms (used by STREAMLINE).
123+
This [next published research paper](https://www.sciencedirect.com/science/article/pii/S1532046418301412) compared a number of Relief-based algorithms and found that MultiSURF performed best overall among those evaluated. This second paper also introduced 'ReBATE', a scikit-learn package of Relief-based feature importance/selection algorithms (used by STREAMLINE).
68124
```
69125
@article{urbanowicz2018benchmarking,
70126
title={Benchmarking relief-based feature selection methods for bioinformatics data mining},
@@ -78,7 +134,7 @@ This next published research paper compared a number of Relief-based algorithms
78134
```
79135

80136
### Collective feature selection
81-
Following feature importance estimation, STREAMLINE adopts an ensemble approach to determining which features to select. The utility of this kind of 'collective' feature selection, was introduced in the next publication.
137+
Following feature importance estimation, STREAMLINE adopts an ensemble approach to determining which features to select. The utility of this kind of 'collective' feature selection was introduced in the [next publication](https://link.springer.com/article/10.1186/s13040-018-0168-6).
82138
```
83139
@article{verma2018collective,
84140
title={Collective feature selection to identify crucial epistatic variants},
@@ -93,7 +149,7 @@ Following feature importance estimation, STREAMLINE adopts an ensemble approach
93149
```
94150

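The ensemble idea described above can be sketched roughly as follows. This is only an illustration, not STREAMLINE's actual implementation: the function name `collective_select`, the `top_k` rule, and the toy scores are all assumptions made for the example. It keeps any feature that at least one importance method ranks highly, so a feature that matters only through interactions is not dropped just because a univariate method scores it low.

```python
def collective_select(score_tables, top_k):
    """Return the union of each method's top_k features (hypothetical helper).

    score_tables maps a method name to a {feature_name: importance_score} dict.
    """
    selected = set()
    for scores in score_tables.values():
        # Rank this method's features from highest to lowest score.
        ranked = sorted(scores, key=scores.get, reverse=True)
        selected.update(ranked[:top_k])
    return sorted(selected)


# Toy scores (made up for illustration): f2 looks weak to the univariate
# method but strong to the interaction-sensitive one.
scores_by_method = {
    "mutual_information": {"f1": 0.40, "f2": 0.05, "f3": 0.30, "f4": 0.01},
    "multisurf": {"f1": 0.02, "f2": 0.35, "f3": 0.25, "f4": 0.00},
}

selected = collective_select(scores_by_method, top_k=2)
print(selected)
```

Taking the union rather than the intersection is the design choice that matters here: a feature survives if any one method supports it.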
95151
### Learning classifier systems
96-
STREAMLINE currently incorporates 15 ML classification modeling algorithms that can be run. Our own research has closely followed a subfield of evolutionary algorithms that discover a set of rules that collectively constitute a trained model. The appeal of such 'rule-based machine learning algorithms' (e.g. learning classifier systems) is that they can model complex associations while also offering human interpretable models. In the first paper below we introduced 'ExSTraCS', a learning classifier system geared towards bioinformatics data analysis. ExSTraCS was the first ML algorithm demonstrated to be able to tackle the long-standing 135-bit multiplexer problem directly, largely due to it's ability to use prior feature importance estimates from a Relief algorithm to guide the evolutionary rule search.
152+
STREAMLINE currently incorporates 15 ML classification modeling algorithms that can be run. Our own research has closely followed a subfield of evolutionary algorithms that discover a set of rules that collectively constitute a trained model. The appeal of such 'rule-based machine learning algorithms' (e.g. learning classifier systems) is that they can model complex associations while also offering human-interpretable models. In the [first paper below](https://link.springer.com/article/10.1007/s12065-015-0128-8) we introduced 'ExSTraCS', a learning classifier system geared towards bioinformatics data analysis. ExSTraCS was the first ML algorithm demonstrated to be able to tackle the long-standing 135-bit multiplexer problem directly, largely due to its ability to use prior feature importance estimates from a Relief algorithm to guide the evolutionary rule search.
97153
```
98154
@article{urbanowicz2015exstracs,
99155
title={ExSTraCS 2.0: description and evaluation of a scalable learning classifier system},
@@ -106,7 +162,8 @@ STREAMLINE currently incorporates 15 ML classification modeling algorithms that
106162
publisher={Springer}
107163
}
108164
```
109-
In the next published pre-print we introduced a scikit-learn implementation of ExSTraCS (used by STREAMLINE) as well as a pipeline (LCS-DIVE) to take ExSTraCS output and characterize different patterns association between features and outcome. Future work will demonstrate how STREAMLINE can be linked with LCS-DIVE to better understand the relationship between features and outcome captured by rule-based modeling.
165+
166+
In the [next published pre-print](https://arxiv.org/abs/2104.12844) we introduced a scikit-learn implementation of ExSTraCS (used by STREAMLINE) as well as a pipeline (LCS-DIVE) to take ExSTraCS output and characterize different patterns of association between features and outcome. Future work will demonstrate how STREAMLINE can be linked with LCS-DIVE to better understand the relationship between features and outcome captured by rule-based modeling.
110167
```
111168
@article{zhang2021lcs,
112169
title={LCS-DIVE: An Automated Rule-based Machine Learning Visualization Pipeline for Characterizing Complex Associations in Classification},
@@ -115,7 +172,8 @@ In the next published pre-print we introduced a scikit-learn implementation of E
115172
year={2021}
116173
}
117174
```
118-
In the next publication we introduced the first scikit-learn compatible implementation of an LCS algorithm. Specifically this paper implemented eLCS, an educational learning classifier system. This eLCS algorithm is a direct descendant of the UCS algorithm.
175+
176+
In the [next publication](https://dl.acm.org/doi/abs/10.1145/3377929.3398097) we introduced the first scikit-learn compatible implementation of an LCS algorithm. Specifically, this paper implemented eLCS, an educational learning classifier system. This eLCS algorithm is a direct descendant of the UCS algorithm.
119177
```
120178
@inproceedings{zhang2020scikit,
121179
title={A scikit-learn compatible learning classifier system},
@@ -125,7 +183,8 @@ In the next publication we introduced the first scikit-learn compatible implemen
125183
year={2020}
126184
}
127185
```
128-
eLCS was originally developed as a very simple supervised learning LCS implementation primarily as an educational resource pairing with the following published textbook.
186+
187+
eLCS was originally developed as a very simple supervised learning LCS implementation, primarily as an educational resource to accompany the following [published textbook](https://books.google.com/books?hl=en&lr=&id=C6QxDwAAQBAJ&oi=fnd&pg=PR5&dq=Introduction+to+learning+classifier+systems&ots=pTcnuuYQPE&sig=wNgZmWkcne9m3LQgDzuBu30uQ1Y#v=onepage&q=Introduction%20to%20learning%20classifier%20systems&f=false).
129188
```
130189
@book{urbanowicz2017introduction,
131190
title={Introduction to learning classifier systems},
