@@ -9,7 +9,7 @@ This project seeks to quantify the size and diversity of the commons--the
99collection of works that are openly licensed or in the public domain.
1010
1111
12- ## Code of Conduct
12+ ## Code of conduct
1313
1414[`CODE_OF_CONDUCT.md`](CODE_OF_CONDUCT.md):
1515> The Creative Commons team is committed to fostering a welcoming community.
@@ -27,174 +27,114 @@ collection of works that are openly licensed or in the public domain.
2727See [`CONTRIBUTING.md`](CONTRIBUTING.md).
2828
2929
30-
3130## Development
3231
3332
3433### Prerequisites
3534
3635This repository uses [pipenv][pipenvdocs] to manage the required Python
3736modules:
38- - Linux: [Installing Pipenv][pipenvinstall]
39- - macOS:
40- 1. Install [Homebrew][homebrew]
41- 2. Install pipenv:
42- ```
37+ 1. Install `pipenv`:
38+ - Linux: [Installing Pipenv][pipenvinstall]
39+ - macOS:
40+ 1. Install [Homebrew][homebrew]
41+ 2. Install pipenv:
42+ ```shell
4343 brew install pipenv
4444 ```
45+ - Windows: [Installing Pipenv][pipenvinstall]
46+ 2. Create the Python virtual environment and install prerequisites using
47+ `pipenv`:
48+ ```shell
49+ pipenv sync --dev
50+ ```
4551
4652[pipenvdocs]: https://pipenv.pypa.io/en/latest/
53+ [pipenvinstall]: https://pipenv.pypa.io/en/latest/installation/
4754[homebrew]: https://brew.sh/
48- [pipenvinstall]: https://pipenv.pypa.io/en/latest/install/#installing-pipenv
49-
50-
51- ### Tooling
52-
53- - **[Python Guidelines — Creative Commons Open Source][ccospyguide]**
54- - [Black][black]: the uncompromising Python code formatter
55- - [flake8][flake8]: a python tool that glues together pep8, pyflakes, mccabe,
56- and third-party plugins to check the style and quality of some python code.
57- - [isort][isort]: A Python utility / library to sort imports.
58-
59- [ccospyguide]: https://opensource.creativecommons.org/contributing-code/python-guidelines/
60- [black]: https://github.com/psf/black
61- [flake8]: https://gitlab.com/pycqa/flake8
62- [isort]: https://pycqa.github.io/isort/
63-
64-
65- ## Data Sources
66-
67-
68- ### CC Legal Tools
69-
70- - [`legal-tool-paths.txt`](google_custom_search/legal-tool-paths.txt)
71- - A `.txt` provided by Timid Robot containing all legal tool paths. The data
72- from Google Custom Search will only cover 50+ general, most significant
73- categories of CC License for data collection quota constraint. As an
74- additional note, the order of precedence of license the collected data's
75- first column is sorted due to intermediate data analysis progress.
76- - [add list of all current CC legal tool paths by TimidRobot · Pull Request
77- #7 · creativecommons/quantifying][pr7]
7855
79- [pr7]: https://github.com/creativecommons/quantifying/pull/7
8056
57+ ### Running scripts that require client credentials
8158
82- ### Flickr
59+ To successfully run scripts that require client credentials, you will need to
60+ follow these steps:
61+ 1. Copy the contents of the `env.example` file in the script's directory to
62+ `.env`:
63+ ```shell
64+ cp env.example .env
65+ ```
66+ 2. Uncomment the variables in the `.env` file and assign values as needed. See
67+ [`sources.md`](sources.md) on how to get credentials:
68+ ```
69+ GOOGLE_API_KEYS=your_api_key
70+ PSE_KEY=your_pse_key
71+ ```
72+ 3. Save the changes to the `.env` file.
73+ 4. You should now be able to run scripts that require client credentials
74+ without any issues.
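Before running a script, it can help to confirm that the `.env` file actually defines the credentials. The following is a minimal sketch, not code from this repository: the helper names (`parse_env`, `missing_credentials`, `check_env_file`) are hypothetical, and treating `GOOGLE_API_KEYS` and `PSE_KEY` as the required variables is an assumption based on the example above. How the project's scripts read `.env` may differ.

```python
from pathlib import Path

# Assumption: these are the variables a script needs (taken from the example
# above); adjust the tuple for the script you are running.
REQUIRED = ("GOOGLE_API_KEYS", "PSE_KEY")


def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text into a dict.

    Blank lines, lines without '=', and lines that start with '#'
    (variables that are still commented out) are skipped.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_credentials(env, required=REQUIRED):
    """Return the required variable names that are absent or empty."""
    return [name for name in required if not env.get(name)]


def check_env_file(path=".env"):
    """Read a .env file and report which required variables are missing."""
    return missing_credentials(parse_env(Path(path).read_text(encoding="utf-8")))
```

Calling `check_env_file()` from the script's directory returns an empty list when every required variable is uncommented and has a value.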
8375
84- - The Flickr API exposes identifiers for users, photos, photosets and other
85- uniquely identifiable objects.
86- - The Flickr API consists of a set of callable methods, and some API endpoints.
87- - For more detailed description, visit: [API documentation - Flickr
88- Services](https://www.flickr.com/services/api/).
89- - The `hs.csv` file is a sample CSV of pulled data. Ideally the script will
90- generate final data CSVs.
91- - Each license will have a CSV to save the data.
92- - Due to memory limit, the license CSVs are not pushed into github.
9376
77+ ### Static analysis
9478
95- ### Google Custom Search JSON API
79+ The [`dev/tools.sh`][tools-sh] helper script runs the static analysis tools
80+ (`black`, `flake8`, and `isort`):
81+ ```shell
82+ ./dev/tools.sh
83+ ```
9684
97- - The Custom Search JSON API allows user-defined detailed query and access
98- towards related query data using a programmable search engine.
99- - [Custom Search JSON API Reference | Programmable Search Engine | Google
100- Developers][googlejsonapi]
101- - [Method: cse.list | Custom Search JSON API | Google Developers][cselist]
102- - [`google_countries.tsv`](google_custom_search/google_countries.txt)
103- - Created by directly copy and pasting the `cr` parameter list from the
104- following link into a `.tsv` file as there were no reliable algorithmic way
105- for retrieving such data found in the process so far. The script itself
106- will take care of the formatting and country-selection process.
107- - [Country Collection Values | JSON API reference | Programmable Search
108- Engine | Google Developers][googlecountry]
109- - [`google_lang.txt`](google_custom_search/google_lang.txt)
110- - Created by directly copy and pasting the `lr` parameter list from the
111- following link into a `.txt` file as there were no reliable algorithmic way
112- for retrieving such data found in the process so far. The script itself
113- will take care of the data formatting and language-selection process.
114- - [Parameter: lr | Method: cse.list | Custom Search JSON API | Google
115- Developers][googlelang]
85+ It also accepts file or directory paths as command-line arguments to limit
86+ the check to those targets:
87+ ```shell
88+ ./dev/tools.sh PATH/TO/MY/FILE.PY
89+ ```
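To illustrate the idea of running the three tools over optional targets, here is a rough Python sketch. It is not the contents of `dev/tools.sh`, whose implementation is not shown here; the function names are hypothetical, and defaulting to the current directory when no path is given is an assumption. It also assumes `black`, `flake8`, and `isort` are on the `PATH` (for example, inside the pipenv environment).

```python
import subprocess

# The tools named above, in the order they are listed.
TOOLS = ("black", "flake8", "isort")


def build_commands(paths=()):
    """Build one command per tool, targeting the given paths.

    Assumption: with no paths, check the current directory.
    """
    targets = list(paths) or ["."]
    return [[tool, *targets] for tool in TOOLS]


def run_static_analysis(paths=()):
    """Run each tool in turn and return the highest exit status seen.

    Note: without extra options, black and isort rewrite files in place,
    while flake8 only reports problems.
    """
    worst = 0
    for command in build_commands(paths):
        worst = max(worst, subprocess.run(command).returncode)
    return worst
```

For example, `run_static_analysis(["PATH/TO/MY/FILE.PY"])` would run all three tools on that one file.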
11690
117- [googlejsonapi]: https://developers.google.com/custom-search/v1
118- [cselist]: https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list
119- [googlecountry]: https://developers.google.com/custom-search/docs/json_api_reference#countryCollections
120- [googlelang]: https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list#body.QUERY_PARAMETERS.lr
91+ [tools-sh]: /dev/tools.sh
12192
12293
123- ### Internet Archive Python Interface
94+ ### Resources
12495
125- A python interface to archive.org to achieve API requests towards internet
126- archive.
127- - [`internetarchive.Search` - Internetarchive: A Python Interface to
128- archive.org][iasearch]
129-
130- [iasearch]: https://internetarchive.readthedocs.io/en/stable/internetarchive.html#internetarchive.Search
131-
132-
133- ### The Metropolitan Museum of Art Collection API
134-
135- An API endpoint for receiving Metropolitan Muesum of Art Collection's
136- CC-Licensed works.
137-
138- [Latest Updates | The Metropolitan Museum of Art Collection API][metapi]:
139- > The Metropolitan Museum of Art provides select datasets of information on
140- > more than 470,000 artworks in its Collection for unrestricted commercial and
141- > noncommercial use. To the extent possible under law, The Metropolitan Museum
142- > of Art has waived all copyright and related or neighboring rights to this
143- > dataset using the [Creative Commons Zero][cc-zero] license.
144-
145- [metapi]: https://metmuseum.github.io/
146- [cc-zero]: https://creativecommons.org/publicdomain/zero/1.0/
147-
148-
149- ### Vimeo API
150-
151- The Vimeo API allows users to perform filtered, advanced search on Vimeo
152- videos.
153- - [Getting Started with the Vimeo API][vimeostart]
154- - [Search for videos - Vimeo API Reference: Videos][vimeoapisearch]
96+ - **[Python Guidelines — Creative Commons Open Source][ccospyguide]**
97+ - [Black][black]: _the uncompromising Python code formatter_
98+ - [flake8][flake8]: _a python tool that glues together pep8, pyflakes, mccabe,
99+ and third-party plugins to check the style and quality of some python code._
100+ - [isort][isort]: _A Python utility / library to sort imports_
101+ - (It doesn't import any libraries; it only sorts and formats import
statements.)
102+ - [pypa/pipenv][pipenv]: _Python Development Workflow for Humans._
155103
156- [vimeostart]: https://developer.vimeo.com/api/guides/start
157- [vimeoapisearch]: https://developer.vimeo.com/api/reference/videos#search_videos
104+ [ccospyguide]: https://opensource.creativecommons.org/contributing-code/python-guidelines/
105+ [black]: https://github.com/psf/black
106+ [flake8]: https://gitlab.com/pycqa/flake8
107+ [isort]: https://pycqa.github.io/isort/
108+ [pipenv]: https://github.com/pypa/pipenv
158109
159110
160- ### MediaWiki API
111+ ### GitHub Actions
161112
162- - The MediaWiki Action API is a web service that allows access to some wiki
163- features like authentication, page operations, and search. It can provide
164- meta information about the wiki and the logged-in user.
165- - Example query: https://commons.wikimedia.org/w/api.php?action=query&cmtitle=Category:CC-BY&list=categorymembers
166- - [`language-codes_csv.csv`](wikipedia/language-codes_csv.csv)
167- - A list of language codes in ISO 639-1 Format to access statistics of each
168- wikipedia main page across different languages. In the script, this file is
169- named as `language-codes_csv` to minimize the amount of manual work
170- required for running the script provided the same language encoding file.
171- The user would have to rename the header and file name of their `.csv` ISO
172- code list according to the concurrent file on Github if they would like to
173- use some list other than the concurrent one.
174- - This file that this script uses can be downloaded from:
175- https://datahub.io/core/language-codes
113+ The [`.github/workflows/python_static_analysis.yml`][workflow-static-analysis]
114+ GitHub Actions workflow performs static analysis (`black`, `flake8`, and
115+ `isort`) on committed changes. The workflow is triggered automatically when you
116+ push changes to the main branch or open a pull request.
176117
118+ [workflow-static-analysis]: .github/workflows/python_static_analysis.yml
177119
178- ### Youtube Data API
179120
180- An API from YouTube for platform users to upload videos, adjust video
181- parameters, and obtain search results.
182- - [Search: list | YouTube Data API | Google Developers][youtubeapi]
121+ ## Data sources
183122
184- [youtubeapi]: https://developers.google.com/youtube/v3/docs/search/list
123+ See [`sources.md`](sources.md) for details on the data sources.
185124
186125
188127## History
189128
190129For information on past efforts, see [`history.md`](history.md).
190129
191130
192- ## Copying & License
131+ ## Copying & license
193132
194133
195134### Code
196135
197- [`LICENSE`](LICENSE): the code within this repository is licensed under the Expat/[MIT][mit] license.
136+ [`LICENSE`](LICENSE): the code within this repository is licensed under the
137+ Expat/[MIT][mit] license.
198138
199139[mit]: http://www.opensource.org/licenses/MIT "The MIT License | Open Source Initiative"
200140
@@ -219,4 +159,4 @@ The documentation within the project is licensed under a [Creative Commons
219159Attribution 4.0 International License][cc-by].
220160
221161[cc-by-png]: https://licensebuttons.net/l/by/4.0/88x31.png#floatleft "CC BY 4.0 license button"
222- [cc-by]: https://creativecommons.org/licenses/by/4.0/ "Creative Commons Attribution 4.0 International License"
162+ [cc-by]: https://creativecommons.org/licenses/by/4.0/ "Creative Commons Attribution 4.0 International License"