Commit 024dbbc

Merge pull request academic#581 from Oshgig/chore/add-contributing-guide
2 parents e4fcdcc + fef0296 commit 024dbbc

2 files changed

Lines changed: 83 additions & 4 deletions

CONTRIBUTING.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
# Contributing

Thanks for taking the time to contribute.

## What you can contribute

- Add new resources (links, tools, courses, books, etc.)
- Fix broken links
- Remove outdated resources
- Improve organization (move items to a more appropriate section)
- Fix typos and formatting issues

## How to add a new resource

- Add the resource to the most relevant section in `README.md`.
- Use the existing style for that section.
- Prefer authoritative sources (official docs, original GitHub repo, publisher site).

### Entry format

Use a consistent bullet format:

- `[Name](URL)` - Short description.

Guidelines:

- Keep descriptions short (ideally one sentence).
- Use title case for the link text when it matches surrounding entries.
- Avoid marketing language and superlatives.
- If there are similar items already listed, add yours near them.
- If the section is alphabetized, keep it alphabetized.
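
The entry format and the alphabetization guideline can be checked mechanically before opening a pull request. The sketch below is one possible way to do that and is not part of this repository's tooling: the regular expression, the function names `parse_entry` and `is_alphabetized`, and the sample description for Pandas are all assumptions made for illustration. URLs that contain parentheses would need a more careful pattern.

```python
import re

# Hypothetical pattern for the "- [Name](URL) - Short description." format.
ENTRY_RE = re.compile(
    r"^- \[(?P<name>[^\]]+)\]\((?P<url>[^)\s]+)\) - (?P<desc>.+)$"
)


def parse_entry(line):
    """Return (name, url, description) if the line matches the format, else None."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("url"), match.group("desc")


def is_alphabetized(names):
    """Check that names are in case-insensitive alphabetical order."""
    lowered = [name.casefold() for name in names]
    return lowered == sorted(lowered)


# Illustrative entries: the first follows the format, the second lacks " - ".
good = "- [Pandas](https://pandas.pydata.org/) - Data analysis library for tabular data."
bad = "- [sim](https://sim.ai) Sim Studio's interface is a lightweight way to build LLM workflows."

print(parse_entry(good))
print(parse_entry(bad))
print(is_alphabetized(["NumPy", "pandas", "Seaborn"]))
```

A line that `parse_entry` rejects is usually missing the ` - ` separator between the link and the description.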

## Links

- Prefer `https://` when available.
- Avoid URL shorteners.
- If a site blocks automated link checking but is a valid resource, it may need to be ignored by the link checker configuration.
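
The first two link guidelines lend themselves to a quick local check. The following is a minimal sketch, assuming a small hand-picked set of shortener hosts; the guide does not name any, so `SHORTENER_HOSTS` and the function `link_warnings` are hypothetical and would need adjusting for real use.

```python
from urllib.parse import urlparse

# Assumed, non-exhaustive list of shortener hosts; the guide does not name any.
SHORTENER_HOSTS = {"bit.ly", "tinyurl.com", "t.co", "goo.gl"}


def link_warnings(url):
    """Return a list of human-readable warnings for a single URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    warnings = []
    if parsed.scheme == "http":
        warnings.append("uses http://; prefer https:// when available")
    if host in SHORTENER_HOSTS:
        warnings.append("looks like a URL shortener; link to the final destination")
    return warnings


for candidate in ["https://numpy.org/", "http://bit.ly/abc"]:
    print(candidate, link_warnings(candidate))
```

The check only inspects the URL text. It does not confirm that an `https://` version of a page exists, so a warning is a prompt to look rather than a verdict.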

## Running link checks locally

This repository contains CI configurations for link checking.

### Option A: awesome_bot (Ruby)

If you have Ruby installed:

1. Install the gem:

   `gem install awesome_bot`

2. Run:

   `awesome_bot README.md`

### Option B: markdown-link-check (Node)

If you have Node installed:

1. Install:

   `npm i -g markdown-link-check`

2. Run:

   `markdown-link-check README.md -c mlc_config.json`
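
The `-c mlc_config.json` flag points markdown-link-check at a configuration file, which is where an ignore pattern for a site that blocks automated checks would go. The contents of this repository's `mlc_config.json` are not shown in this commit, so the sketch below assumes it follows markdown-link-check's documented `ignorePatterns` layout (a list of objects with a `pattern` key). The helper `add_ignore_pattern` and both example hosts are hypothetical.

```python
import json


def add_ignore_pattern(config, pattern):
    """Return a copy of the config with `pattern` in ignorePatterns (no duplicates)."""
    updated = dict(config)
    patterns = list(updated.get("ignorePatterns", []))
    if not any(entry.get("pattern") == pattern for entry in patterns):
        patterns.append({"pattern": pattern})
    updated["ignorePatterns"] = patterns
    return updated


# Hypothetical starting configuration and a hypothetical blocked host.
config = {"ignorePatterns": [{"pattern": "^https://example.com/"}]}
config = add_ignore_pattern(config, "^https://blocked.example.org/")

# Print the JSON that would be proposed for mlc_config.json in the pull request.
print(json.dumps(config, indent=2))
```

Anchoring the pattern with `^` and including the full host keeps the ignore rule narrow, so other links continue to be checked.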

## Pull request checklist

- Your change is in the right section.
- The link works.
- The description is short and matches the repo style.
- Link checks pass (or you explained why they fail and proposed an ignore pattern).

## Code of Conduct

By participating in this project, you agree to abide by the `CODE_OF_CONDUCT.md`.

README.md

Lines changed: 7 additions & 4 deletions
@@ -20,6 +20,8 @@

 [![Awesome](https://cdn.jsdelivr.net/gh/sindresorhus/awesome@d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)

+Contributions are welcome - see [`CONTRIBUTING.md`](CONTRIBUTING.md).
+
 **An open-source Data Science repository to learn and apply concepts toward solving real-world problems.**

 This is a shortcut path to start studying **Data Science**. Just follow the steps to answer the questions, "What is Data Science, and what should I study to learn Data Science?"
@@ -115,7 +117,7 @@ While not strictly necessary, having a programming language is a crucial skill t

 Unlike R, Python was not built from the ground up with data science in mind, but there are plenty of third party libraries to make up for this. A much more exhaustive list of packages can be found later in this document, but these four packages are a good set of choices to start your data science journey with: [Scikit-Learn](https://scikit-learn.org/stable/index.html) is a general-purpose data science package which implements the most popular algorithms - it also includes rich documentation, tutorials, and examples of the models it implements. Even if you prefer to write your own implementations, Scikit-Learn is a valuable reference to the nuts-and-bolts behind many of the common algorithms you'll find. With [Pandas](https://pandas.pydata.org/), one can collect and analyze their data into a convenient table format. [Numpy](https://numpy.org/) provides very fast tooling for mathematical operations, with a focus on vectors and matrices. [Seaborn](https://seaborn.pydata.org/), itself based on the [Matplotlib](https://matplotlib.org/) package, is a quick way to generate beautiful visualizations of your data, with many good defaults available out of the box, as well as a gallery showing how to produce many common visualizations of your data.

-When embarking on your journey to becoming a data scientist, the choice of language isn't particularly important, and both Python and R have their pros and cons. Pick a language you like, and check out one of the [Free courses](#free-courses) we've listed below!
+When embarking on your journey to becoming a data scientist, the choice of language isn't particularly important, and both Python and R have their pros and cons. Pick a language you like, and check out one of the [Free courses](#free-courses) we've listed below!

 ### Beginner Roadmap
 If you're just starting out, here's a simple recommended path:
@@ -128,20 +130,21 @@ If you're just starting out, here's a simple recommended path:

 ## Agents

-Please, contribute about "agents"
+This section contains agent frameworks and tools that are useful for data science workflows.

 ### Frameworks
 - [ADK-Rust](https://github.com/zavora-ai/adk-rust) - Production-ready AI agent development kit for Rust with model-agnostic design (Gemini, OpenAI, Anthropic), multiple agent types (LLM, Graph, Workflow), MCP support, and built-in telemetry.

 ### Tools
 - [Frostbyte MCP](https://github.com/OzorOwn/frostbyte-mcp) - MCP server providing 13 data tools for AI agents: real-time crypto prices, IP geolocation, DNS lookups, web scraping to markdown, code execution, and screenshots. One API key for 40+ services.
 - [Arch Tools](https://archtools.dev) - 61 production-ready AI API tools for data science workflows: code analysis, web scraping, NLP, image generation, crypto data, and search. REST API and MCP protocol support. [GitHub](https://github.com/Deesmo/Arch-AI-Tools)
+
 ### Research & Knowledge Retrieval
 - [BGPT MCP](https://bgpt.pro/mcp) - MCP server that gives AI agents access to a database of scientific papers built from raw experimental data extracted from full-text studies. Returns 25+ structured fields per paper including methods, results, sample sizes, and quality scores. [GitHub](https://github.com/connerlambden/bgpt-mcp)

-### Workflow
+### Workflow
 **[`^ back to top ^`](#awesome-data-science)**
-- [sim](https://sim.ai) Sim Studio's interface is a lightweight, intuitive way to quickly build and deploy LLMs that connect with your favorite tools.
+- [sim](https://sim.ai) - Sim Studio's interface is a lightweight, intuitive way to quickly build and deploy LLMs that connect with your favorite tools.


 ## Training Resources
