_data/ph_authors.yml
Lines changed: 5 additions & 5 deletions
@@ -1151,13 +1151,13 @@
1151
1151
The Pennsylvania State University, Estados Unidos
1152
1152
bio:
1153
1153
en: |
1154
-
Jennifer Isasi is an Assistant Research Professor of Digital Scholarship and Director of the Digital Liberal Arts Research Initiative at Penn State, and a PhD on Hispanic Studies.
1154
+
Jennifer Isasi is an Associate Research Professor of Digital Scholarship and Director of the Digital Liberal Arts Research Initiative at Penn State, and a PhD on Hispanic Studies.
1155
1155
es: |
1156
-
Jennifer Isasi es Profesora asistente de investigación digital y Directora de la Iniciativa de Investigación Digital en Artes Liberales en Penn State, y doctora en Estudios Hispánicos.
1156
+
Jennifer Isasi es Profesora asociada de investigación digital y Directora de la Iniciativa de Investigación Digital en Artes Liberales en Penn State, y doctora en Estudios Hispánicos.
1157
1157
fr: |
1158
-
Jennifer Isasi est professeure adjointe de recherche en études numériques et directrice de la Digital Liberal Arts Research Initiative à Penn State, et titulaire d'un doctorat en Études Hispaniques.
1158
+
Jennifer Isasi est professeure associée de recherche en études numériques et directrice de la Digital Liberal Arts Research Initiative à Penn State, et titulaire d'un doctorat en Études Hispaniques.
1159
1159
pt: |
1160
-
Jennifer Isasi é professora assistente e pesquisadora digital, diretora da Digital Liberal Arts Research Initiative na Penn State e doutorada em Estudos Hispânicos.
1160
+
Jennifer Isasi é professora associada e pesquisadora digital, diretora da Digital Liberal Arts Research Initiative na Penn State e doutorada em Estudos Hispânicos.
en/lessons/analyzing-documents-with-tfidf.md
Lines changed: 18 additions & 18 deletions
@@ -384,7 +384,7 @@ Text summarization is yet another way to explore a corpus. Rada Mihalcea and Pau
384
384
385
385
# References and Further Reading
386
386
387
-
- Beckman, Milo. "These Are The Phrases Each GOP Candidate Repeats Most," _FiveThirtyEight_, March 10, 2016. https://fivethirtyeight.com/features/these-are-the-phrases-each-gop-candidate-repeats-most/
387
+
- Beckman, Milo. "These Are The Phrases Each GOP Candidate Repeats Most," _FiveThirtyEight_, March 10, 2016. [https://fivethirtyeight.com/features/these-are-the-phrases-each-gop-candidate-repeats-most/](https://perma.cc/37WS-MB8F).
388
388
389
389
- Bennett, Jessica, and Amisha Padnani. "Overlooked," March 8, 2018. https://www.nytimes.com/interactive/2018/obituaries/overlooked.html
390
390
@@ -394,9 +394,9 @@ Text summarization is yet another way to explore a corpus. Rada Mihalcea and Pau
394
394
395
395
- Bowles, Nellie. "Overlooked No More: Karen Sparck Jones, Who Established the Basis for Search Engines" _The New York Times_, January 2, 2019. https://www.nytimes.com/2019/01/02/obituaries/karen-sparck-jones-overlooked.html
396
396
397
-
- Documentation for TfidfVectorizer. https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html
397
+
- Documentation for TfidfVectorizer. [https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html](https://perma.cc/JUN8-39Z6).
398
398
399
-
- Grimmer, Justin and King, Gary, Quantitative Discovery from Qualitative Information: A General-Purpose Document Clustering Methodology (2009). APSA 2009 Toronto Meeting Paper. Available at SSRN: https://ssrn.com/abstract=1450070
399
+
- Grimmer, Justin and King, Gary, Quantitative Discovery from Qualitative Information: A General-Purpose Document Clustering Methodology [2009](https://perma.cc/4YAL-H6VN). APSA 2009 Toronto Meeting Paper [PDF](https://perma.cc/NUS2-J3YP).
400
400
401
401
- "Ida M. Tarbell, 86, Dies in Bridgeport" _The New York Times_, January 7, 1944, 17. https://www.nytimes.com
402
402
@@ -408,19 +408,19 @@ Text summarization is yet another way to explore a corpus. Rada Mihalcea and Pau
408
408
409
409
- Salton, G. and M.J. McGill, _Introduction to Modern Information Retrieval_. New York: McGraw-Hill, 1983.
410
410
411
-
- Schmidt, Ben. "Do Digital Humanists Need to Understand Algorithms?" _Debates in the Digital Humanities 2016_. Online edition. Minneapois: University of Minnesota Press. http://dhdebates.gc.cuny.edu/debates/text/99
411
+
- Schmidt, Ben. "Do Digital Humanists Need to Understand Algorithms?" _Debates in the Digital Humanities 2016_. Online edition. Minneapois: University of Minnesota Press. [http://dhdebates.gc.cuny.edu/debates/text/99](https://perma.cc/95WD-SDM5)
412
412
413
-
- --. "Words Alone: Dismantling Topic Models in the Humanities," _Journal of Digital Humanities_. Vol. 2, No. 1 (2012): n.p. http://journalofdigitalhumanities.org/2-1/words-alone-by-benjamin-m-schmidt/
413
+
- --. "Words Alone: Dismantling Topic Models in the Humanities," _Journal of Digital Humanities_. Vol. 2, No. 1 (2012): n.p. [http://journalofdigitalhumanities.org/2-1/words-alone-by-benjamin-m-schmidt/](https://perma.cc/LT4N-X4MZ).
414
414
415
415
- Spärck Jones, Karen. "A Statistical Interpretation of Term Specificity and Its Application in Retrieval." Journal of Documentation 28, no. 1 (1972): 11–21.
416
416
417
-
- Stray, Jonathan, and Julian Burgess. "A Full-text Visualization of the Iraq War Logs," December 10, 2010 (Update April 2012). http://jonathanstray.com/a-full-text-visualization-of-the-iraq-war-logs
417
+
- Stray, Jonathan, and Julian Burgess. "A Full-text Visualization of the Iraq War Logs," December 10, 2010 (Update April 2012). [http://jonathanstray.com/a-full-text-visualization-of-the-iraq-war-logs](https://perma.cc/QBZ4-DKTE).
418
418
419
-
- Underwood, Ted. "Identifying diction that characterizes an author or genre: why Dunning's may not be the best method," _The Stone and the Shell_, November 9, 2011. https://tedunderwood.com/2011/11/09/identifying-the-terms-that-characterize-an-author-or-genre-why-dunnings-may-not-be-the-best-method/
419
+
- Underwood, Ted. "Identifying diction that characterizes an author or genre: why Dunning's may not be the best method," _The Stone and the Shell_, November 9, 2011. [https://tedunderwood.com/2011/11/09/identifying-the-terms-that-characterize-an-author-or-genre-why-dunnings-may-not-be-the-best-method/](https://perma.cc/SY25-UXK3).
420
420
421
-
- --. "The Historical Significance of Textual Distances", Preprint of LaTeCH-CLfL Workshop, COLING, Santa Fe, 2018. https://arxiv.org/abs/1807.00181
421
+
- --. "The Historical Significance of Textual Distances", Preprint of LaTeCH-CLfL Workshop, COLING, Santa Fe, 2018. [https://doi.org/10.48550/arXiv.1807.00181](https://doi.org/10.48550/arXiv.1807.00181).
422
422
423
-
- van Rossum, Guido, Barry Warsaw, and Nick Coghlan. "PEP 8 -- Style Guide for Python Code." July 5, 2001. Updated July 2013. https://www.python.org/dev/peps/pep-0008/
423
+
- van Rossum, Guido, Barry Warsaw, and Nick Coghlan. "PEP 8 -- Style Guide for Python Code." July 5, 2001. Updated July 2013. [https://www.python.org/dev/peps/pep-0008/](https://perma.cc/P2ZM-VPQM).
424
424
425
425
- Whitman, Alden. "Upton Sinclair, Author, Dead; Crusader for Social Justice, 90" _The New York Times_, November 26, 1968, 1, 34. https://www.nytimes.com
426
426
@@ -440,7 +440,7 @@ If you are not using Anaconda, you will need to cover the following dependencies
440
440
441
441
# Endnotes
442
442
443
-
[^1]: Underwood, Ted. "Identifying diction that characterizes an author or genre: why Dunning's may not be the best method," _The Stone and the Shell_, November 9, 2011. <https://tedunderwood.com/2011/11/09/identifying-the-terms-that-characterize-an-author-or-genre-why-dunnings-may-not-be-the-best-method/>
443
+
[^1]: Underwood, Ted. "Identifying diction that characterizes an author or genre: why Dunning's may not be the best method," _The Stone and the Shell_, November 9, 2011. [https://tedunderwood.com/2011/11/09/identifying-the-terms-that-characterize-an-author-or-genre-why-dunnings-may-not-be-the-best-method/](https://perma.cc/SY25-UXK3).
444
444
445
445
[^2]: Bennett, Jessica, and Amisha Padnani. "Overlooked," March 8, 2018. <https://www.nytimes.com/interactive/2018/obituaries/overlooked.html>
446
446
@@ -452,24 +452,24 @@ If you are not using Anaconda, you will need to cover the following dependencies
452
452
453
453
[^6]: "Nellie Bly, Journalist, Dies of Pneumonia" _The New York Times_, January 28, 1922, 11. <https://www.nytimes.com>
454
454
455
-
[^7]: Documentation for TfidfVectorizer. <https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html>
455
+
[^7]: Documentation for TfidfVectorizer. [https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html](https://perma.cc/JUN8-39Z6).
456
456
457
-
[^8]: Schmidt, Ben. "Do Digital Humanists Need to Understand Algorithms?" _Debates in the Digital Humanities 2016_. Online edition. (Minneapois: University of Minnesota Press): n.p. <http://dhdebates.gc.cuny.edu/debates/text/99>
457
+
[^8]: Schmidt, Ben. "Do Digital Humanists Need to Understand Algorithms?" _Debates in the Digital Humanities 2016_. Online edition. (Minneapois: University of Minnesota Press): n.p. [http://dhdebates.gc.cuny.edu/debates/text/99](https://perma.cc/95WD-SDM5)
458
458
459
-
[^9]: van Rossum, Guido, Barry Warsaw, and Nick Coghlan. "PEP 8 -- Style Guide for Python Code." July 5, 2001. Updated July 2013. <https://www.python.org/dev/peps/pep-0008/>
459
+
[^9]: van Rossum, Guido, Barry Warsaw, and Nick Coghlan. "PEP 8 -- Style Guide for Python Code." July 5, 2001. Updated July 2013. [https://www.python.org/dev/peps/pep-0008/](https://perma.cc/P2ZM-VPQM).
460
460
461
461
[^10]: "Ida M. Tarbell, 86, Dies in Bridgeport" _The New York Times_, January 7, 1944, 17. <https://www.nytimes.com>; "Nellie Bly, Journalist, Dies of Pneumonia" _The New York Times_, January 28, 1922, 11. <https://www.nytimes.com>; "W. E. B. DuBois Dies in Ghana; Negro Leader and Author, 95" _The New York Times_, August 28, 1963, 27. <https://www.nytimes.com>; Whitman, Alden. "Upton Sinclair, Author, Dead; Crusader for Social Justice, 90" _The New York Times_, November 26, 1968, 1, 34. <https://www.nytimes.com>; "Willa Cather Dies; Noted Novelist, 70" _The New York Times_, April 25, 1947, 21. <https://www.nytimes.com>
462
462
463
-
[^11]: Stray, Jonathan, and Julian Burgess. "A Full-text Visualization of the Iraq War Logs," December 10, 2010 (Update April 2012). <http://jonathanstray.com/a-full-text-visualization-of-the-iraq-war-logs>
463
+
[^11]: Stray, Jonathan, and Julian Burgess. "A Full-text Visualization of the Iraq War Logs," December 10, 2010 (Update April 2012). [http://jonathanstray.com/a-full-text-visualization-of-the-iraq-war-logs](https://perma.cc/QBZ4-DKTE).
464
464
465
465
[^12]: Manning, C.D., P. Raghavan, and H. Schütze, _Introduction to Information Retrieval_. (Cambridge: Cambridge University Press, 2008): 118-120.
466
466
467
-
[^13]: Beckman, Milo. "These Are The Phrases Each GOP Candidate Repeats Most," _FiveThirtyEight_, March 10, 2016. <https://fivethirtyeight.com/features/these-are-the-phrases-each-gop-candidate-repeats-most/>
467
+
[^13]: Beckman, Milo. "These Are The Phrases Each GOP Candidate Repeats Most," _FiveThirtyEight_, March 10, 2016. [https://fivethirtyeight.com/features/these-are-the-phrases-each-gop-candidate-repeats-most/](https://perma.cc/37WS-MB8F).
468
468
469
469
[^14]: Bondi, Marina, and Mike Scott, eds. _Keyness in Texts_. (Philadelphia: John Benjamins, 2010).
470
470
471
-
[^15]: __Tf-idf__ is not typically a recommended pre-processing step when generating topic models. See <https://datascience.stackexchange.com/questions/21950/why-we-should-not-feed-lda-with-tfidf>
471
+
[^15]: __Tf-idf__ is not typically a recommended pre-processing step when generating topic models. See [https://datascience.stackexchange.com/questions/21950/why-we-should-not-feed-lda-with-tfidf](https://perma.cc/N5W9-TYX7).
472
472
473
-
[^16]: Schmidt, Ben. "Words Alone: Dismantling Topic Models in the Humanities," _Journal of Digital Humanities_. Vol. 2, No. 1 (2012): n.p. <http://journalofdigitalhumanities.org/2-1/words-alone-by-benjamin-m-schmidt/>
473
+
[^16]: Schmidt, Ben. "Words Alone: Dismantling Topic Models in the Humanities," _Journal of Digital Humanities_. Vol. 2, No. 1 (2012): n.p. [http://journalofdigitalhumanities.org/2-1/words-alone-by-benjamin-m-schmidt/](https://perma.cc/LT4N-X4MZ).
474
474
475
-
[^17]: Mihalcea, Rada, and Paul Tarau. "Textrank: Bringing order into text." In _Proceedings of the 2004 conference on empirical methods in natural language processing_. 2004.
475
+
[^17]: Mihalcea, Rada, and Paul Tarau. "Textrank: Bringing order into text." In _Proceedings of the 2004 conference on empirical methods in natural language processing_. 2004. [http://www.aclweb.org/anthology/W04-3252](https://perma.cc/SMV5-7MYY).