We are about to deploy the initial version of the API, which currently provides data for the awesome-manim feed on the website (corresponding website PR: ManimCommunity/manim-website#73).
The scraper currently:
- fetches the list of all YouTube channel links from the README file
- scrapes the publicly available RSS feeds at https://www.youtube.com/feeds/videos.xml?channel_id=xxxxxxx
- searches the video title and description for the substrings Manim (case insensitive), #some (case insensitive), and SoME (case sensitive). A video that contains any of them in either field is marked as "being a manim video".
- stores the marked videos in a MySQL database and serves them chronologically on a paginated endpoint, /videos/n, 30 videos at a time
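
To make the matching and pagination rules above concrete, here is a minimal Python sketch. It is not the deployed implementation: the function names, the `published` field, the 1-based page number, and the newest-first ordering are all assumptions made for illustration.

```python
import re
from datetime import datetime

# "manim" and "#some" may appear in any letter case; "SoME" must match exactly.
_CASE_INSENSITIVE = re.compile(r"manim|#some", re.IGNORECASE)
_CASE_SENSITIVE = "SoME"

PAGE_SIZE = 30  # videos per page, as described above


def is_manim_video(title: str, description: str) -> bool:
    """Return True if the title or description contains one of the substrings."""
    # The newline separator keeps a match from spanning both fields.
    text = f"{title}\n{description}"
    return bool(_CASE_INSENSITIVE.search(text)) or _CASE_SENSITIVE in text


def get_page(videos: list[dict], n: int, page_size: int = PAGE_SIZE) -> list[dict]:
    """Return page n of the feed (assumed 1-based, newest video first)."""
    if n < 1:
        raise ValueError("page number must be >= 1")
    # Each video is assumed to be a dict with a datetime under "published".
    ordered = sorted(videos, key=lambda v: v["published"], reverse=True)
    start = (n - 1) * page_size
    return ordered[start:start + page_size]
```

Note that a title such as "Something else" does not match: it contains neither "#some" nor the exact-case "SoME".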
This issue records some ideas we could implement in the future based on feedback.
- A deeper scrape of all the channels (RSS feeds just return the latest 15 videos)
- An algorithmic feed that prioritizes videos with higher engagement, but still retains the chronological ordering to some degree
- ...
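
As a starting point for discussing the algorithmic feed idea, here is one possible scoring scheme, sketched in Python. Everything in it is hypothetical: the use of view counts as the engagement signal, the `views` and `published` field names, the 14-day half-life, and the function names. It damps views with a logarithm and multiplies by an exponential recency decay, so popular videos rise but old ones still fade out.

```python
import math
from datetime import datetime


def feed_score(views: int, published: datetime, now: datetime,
               half_life_days: float = 14.0) -> float:
    """Blend engagement and recency into a single score (higher ranks first)."""
    age_days = max((now - published).total_seconds() / 86400.0, 0.0)
    # Recency halves every `half_life_days` days.
    recency = 0.5 ** (age_days / half_life_days)
    # log1p keeps very popular videos from dominating the feed outright.
    return math.log1p(views) * recency


def rank_feed(videos: list[dict], now: datetime) -> list[dict]:
    """Sort videos by descending score; each dict needs 'views' and 'published'."""
    return sorted(
        videos,
        key=lambda v: feed_score(v["views"], v["published"], now),
        reverse=True,
    )
```

A shorter half-life keeps the feed closer to purely chronological order, while a longer one lets engagement dominate.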
Feel free to discuss these and propose any other ideas.