84 changes: 79 additions & 5 deletions README.md
@@ -404,9 +404,61 @@ ytt_api_2.fetch(video_id)

## Cookie Authentication

Some videos are age restricted, so this module won't be able to access those videos without some sort of
authentication. Unfortunately, some recent changes to the YouTube API have broken the current implementation of cookie
based authentication, so this feature is currently not available.
Some videos are age-restricted, so this module won't be able to access those videos without authentication. You can authenticate by extracting cookies from your browser, which allows access to age-restricted content.

### Automatic Browser Cookie Extraction

The easiest way to authenticate is by extracting cookies directly from your browser:

```python
from youtube_transcript_api import YouTubeTranscriptApi

# Extract cookies from Chrome
ytt_api = YouTubeTranscriptApi(cookies_from_browser='chrome')
transcript = ytt_api.fetch(video_id)
```

**Supported browsers:**
- `chrome` - Google Chrome
- `firefox` - Mozilla Firefox
- `edge` - Microsoft Edge
- `brave` - Brave Browser
- `chromium` - Chromium
- `opera` - Opera
- `vivaldi` - Vivaldi

**Installation:**

For Chrome-based browsers (Chrome, Edge, Brave, etc.), you need the `cryptography` package:

```bash
pip install 'youtube-transcript-api[cookies]'
```

For Firefox, no additional dependencies are required (cookies are stored unencrypted).

**How it works:**

1. The library reads your browser's cookie database (SQLite file)
2. For Chrome-based browsers, cookies are decrypted using platform-specific methods:
   - **Linux**: PBKDF2 with hardcoded password or GNOME Keyring/KWallet
   - **macOS**: PBKDF2 with password from macOS Keychain
   - **Windows**: DPAPI (Data Protection API) + AES-GCM
3. YouTube cookies are extracted and used for authentication
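
The PBKDF2 step for Chrome-based browsers can be sketched with the standard library alone. The salt (`saltysalt`), the Linux fallback password (`peanuts`), the iteration counts (1 on Linux, 1003 on macOS) and the 16-byte key length are assumptions taken from Chromium's commonly documented `os_crypt` behaviour, not from this README, and `derive_chromium_key` is a hypothetical helper rather than part of this library:

```python
import hashlib


def derive_chromium_key(password: bytes, iterations: int) -> bytes:
    """Derive a 16-byte AES key the way Chromium is commonly documented to.

    Assumed parameters: PBKDF2-HMAC-SHA1 with the fixed salt b"saltysalt".
    """
    return hashlib.pbkdf2_hmac("sha1", password, b"saltysalt", iterations, dklen=16)


# Linux fallback when no keyring is available (assumed hardcoded password)
linux_key = derive_chromium_key(b"peanuts", iterations=1)

# macOS: the real password comes from the Keychain; this value is a placeholder
macos_key = derive_chromium_key(b"placeholder-from-keychain", iterations=1003)
```

The derived key would then be passed to an AES decryption routine, which is where the optional `cryptography` dependency comes in.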

**Important notes:**
- Make sure you're logged into YouTube in the specified browser
- The browser can be open or closed (the library copies the database to avoid lock issues)
- Your cookies are read locally and are sent only to YouTube as part of the transcript requests
- Firefox support works without additional dependencies as cookies aren't encrypted
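
The copy-then-read approach mentioned in the notes above can be illustrated as follows. This is a minimal sketch, not the library's implementation: `read_cookies_from_copy` and `make_demo_db` are hypothetical helpers, and the `moz_cookies` table with `name`, `value` and `host` columns is assumed to match Firefox's `cookies.sqlite` layout.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def read_cookies_from_copy(db_path: Path, host_suffix: str = "youtube.com") -> dict:
    """Copy a Firefox-style cookie database and read the cookies for one host."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        copy_path = Path(tmp_dir) / "cookies_copy.sqlite"
        # Work on a copy so a running browser's lock on the live file is not an issue
        shutil.copy2(db_path, copy_path)
        conn = sqlite3.connect(copy_path)
        try:
            rows = conn.execute(
                "SELECT name, value FROM moz_cookies WHERE host LIKE ?",
                (f"%{host_suffix}",),
            ).fetchall()
        finally:
            conn.close()
    return dict(rows)


def make_demo_db(path: Path) -> None:
    """Create a tiny database with the assumed table layout (demo data only)."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success
            conn.execute("CREATE TABLE moz_cookies (name TEXT, value TEXT, host TEXT)")
            conn.executemany(
                "INSERT INTO moz_cookies VALUES (?, ?, ?)",
                [("SID", "abc", ".youtube.com"), ("other", "x", "example.org")],
            )
    finally:
        conn.close()
```

Reading from a temporary copy also means the original database is never modified.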

**Browser profiles:**

```python
# Custom profile selection is not yet implemented; the 'Default' profile is always used
ytt_api = YouTubeTranscriptApi(cookies_from_browser='chrome')
```

## Using Formatters
Formatters are meant to be an additional layer of processing of the transcript you pass it. The goal is to convert a
@@ -555,10 +607,32 @@ youtube_transcript_api <first_video_id> <second_video_id> --http-proxy http://us

### Cookie Authentication using the CLI

To authenticate using cookies through the CLI as explained in [Cookie Authentication](#cookie-authentication) run:
To authenticate using browser cookies through the CLI as explained in [Cookie Authentication](#cookie-authentication), run:

```bash
youtube_transcript_api <first_video_id> <second_video_id> --cookies-from-browser chrome
```

This works with any supported browser:

```bash
# Chrome
youtube_transcript_api VIDEO_ID --cookies-from-browser chrome

# Firefox
youtube_transcript_api VIDEO_ID --cookies-from-browser firefox

# Edge
youtube_transcript_api VIDEO_ID --cookies-from-browser edge

# Brave
youtube_transcript_api VIDEO_ID --cookies-from-browser brave
```
youtube_transcript_api <first_video_id> <second_video_id> --cookies /path/to/your/cookies.txt

Remember to install the optional dependencies for Chrome-based browsers:

```bash
pip install 'youtube-transcript-api[cookies]'
```
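
If you want to check up front whether the optional dependency is present, something along these lines should work. This is a sketch under assumptions: `missing_cookie_dependency` is a hypothetical helper, not part of this package, and the set of Chrome-based browsers is inferred from the supported-browser list.

```python
import importlib.util

# Browsers assumed to store encrypted cookies (everything except Firefox)
CHROMIUM_BROWSERS = {"chrome", "edge", "brave", "chromium", "opera", "vivaldi"}


def missing_cookie_dependency(browser: str, module: str = "cryptography") -> bool:
    """Return True if `browser` needs `module` and it is not installed."""
    if browser not in CHROMIUM_BROWSERS:
        return False  # Firefox cookies are stored unencrypted
    return importlib.util.find_spec(module) is None


if missing_cookie_dependency("chrome"):
    print("Run: pip install 'youtube-transcript-api[cookies]'")
```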

## Warning
13 changes: 8 additions & 5 deletions pyproject.toml
@@ -8,9 +8,7 @@ version = "1.2.3"
description = "This is a python API which allows you to get the transcripts/subtitles for a given YouTube video. It also works for automatically generated subtitles, supports translating subtitles and it does not require a headless browser, like other selenium based solutions do!"
readme = "README.md"
license = "MIT"
authors = [
"Jonas Depoix <jonas.depoix@web.de>",
]
authors = ["Jonas Depoix <jonas.depoix@web.de>"]
homepage = "https://github.com/jdepoix/youtube-transcript-api"
repository = "https://github.com/jdepoix/youtube-transcript-api"
keywords = [
@@ -52,6 +50,11 @@ precommit.shell = "poe format && poe lint && poe coverage"
python = ">=3.8,<3.15"
requests = "*"
defusedxml = "^0.7.1"
cryptography = { version = "*", optional = true }
secretstorage = { version = "*", optional = true, markers = "sys_platform == 'linux'" }

[tool.poetry.extras]
cookies = ["cryptography", "secretstorage"]

[tool.poetry.group.test]
optional = true
@@ -91,6 +94,6 @@ exclude_lines = [
# Don't complain about empty stubs of abstract methods
"@abstractmethod",
"@abstractclassmethod",
"@abstractstaticmethod"
"@abstractstaticmethod",
]
show_missing = true
show_missing = true
25 changes: 21 additions & 4 deletions youtube_transcript_api/_api.py
@@ -7,13 +7,16 @@
from .proxies import ProxyConfig

from ._transcripts import TranscriptListFetcher, FetchedTranscript, TranscriptList
from ._cookies import extract_cookies_from_browser
from ._errors import CookieError


class YouTubeTranscriptApi:
    def __init__(
        self,
        proxy_config: Optional[ProxyConfig] = None,
        http_client: Optional[Session] = None,
        cookies_from_browser: Optional[str] = None,
    ):
        """
        Note on thread-safety: As this class will initialize a `requests.Session`
@@ -28,13 +31,27 @@ def __init__(
        :param http_client: You can optionally pass in a requests.Session object, if you
            manually want to share cookies between different instances of
            `YouTubeTranscriptApi`, overwrite defaults, specify SSL certificates, etc.
        :param cookies_from_browser: Extract cookies from a browser to enable
            authentication for age-restricted videos. Supported browsers: 'chrome',
            'firefox', 'edge', 'brave', 'chromium', 'opera', 'vivaldi'.
            Note: Requires the 'cryptography' package for Chrome-based browsers.
            Install with: pip install 'youtube-transcript-api[cookies]'
        """
        http_client = Session() if http_client is None else http_client
        http_client.headers.update({"Accept-Language": "en-US"})
        # Cookie auth has been temporarily disabled, as it is not working properly with
        # YouTube's most recent changes.
        # if cookie_path is not None:
        #     http_client.cookies = _load_cookie_jar(cookie_path)

        # Extract cookies from browser if specified
        if cookies_from_browser is not None:
            try:
                cookies = extract_cookies_from_browser(cookies_from_browser)
                for name, value in cookies.items():
                    http_client.cookies.set(name, value, domain=".youtube.com")
            except CookieError as e:
                # Re-raise with context, chaining the original error explicitly
                raise CookieError(
                    f"Failed to extract cookies from {cookies_from_browser}: {e}"
                ) from e

        if proxy_config is not None:
            http_client.proxies = proxy_config.to_requests_dict()
            if proxy_config.prevent_keeping_connections_alive:
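
The wrap-and-re-raise pattern used for cookie errors in this hunk can be tried in isolation. The sketch below uses stand-ins only: this `CookieError` is a local substitute for the class imported from `._errors`, and `extract_cookies_stub` and `load_cookies` are hypothetical helpers that just mimic the control flow.

```python
class CookieError(Exception):
    """Local stand-in for the CookieError imported from ._errors in the diff."""


def extract_cookies_stub(browser: str) -> dict:
    # Hypothetical extractor: only "firefox" succeeds in this demo
    if browser != "firefox":
        raise CookieError(f"unsupported browser: {browser}")
    return {"SID": "dummy"}


def load_cookies(browser: str) -> dict:
    try:
        return extract_cookies_stub(browser)
    except CookieError as e:
        # `from e` keeps the original error reachable via __cause__
        raise CookieError(f"Failed to extract cookies from {browser}: {e}") from e
```

Chaining with `from e` means a traceback shows both the low-level failure and the added context about which browser was requested.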
29 changes: 22 additions & 7 deletions youtube_transcript_api/_cli.py
@@ -39,6 +39,7 @@ def run(self) -> str:

        ytt_api = YouTubeTranscriptApi(
            proxy_config=proxy_config,
            cookies_from_browser=parsed_args.cookies_from_browser,
        )

        for video_id in parsed_args.video_ids:
@@ -188,13 +189,27 @@ def _parse_args(self):
            metavar="URL",
            help="Use the specified HTTPS proxy.",
        )
        # Cookie auth has been temporarily disabled, as it is not working properly with
        # YouTube's most recent changes.
        # parser.add_argument(
        #     "--cookies",
        #     default=None,
        #     help="The cookie file that will be used for authorization with youtube.",
        # )
        parser.add_argument(
            "--cookies-from-browser",
            dest="cookies_from_browser",
            default=None,
            choices=[
                "chrome",
                "firefox",
                "edge",
                "brave",
                "chromium",
                "opera",
                "vivaldi",
            ],
            help=(
                "Extract cookies from the specified browser for authentication. "
                "This enables access to age-restricted videos. "
                "Supported browsers: chrome, firefox, edge, brave, chromium, opera, vivaldi. "
                "Note: Chrome-based browsers require the 'cryptography' package. "
                "Install with: pip install 'youtube-transcript-api[cookies]'"
            ),
        )

        return self._sanitize_video_ids(parser.parse_args(self._args))
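
How the new flag behaves can be checked with a stripped-down parser. This is a sketch only: `build_parser` and `SUPPORTED_BROWSERS` are hypothetical names, and the real CLI defines many more arguments than the two shown here.

```python
import argparse

SUPPORTED_BROWSERS = [
    "chrome", "firefox", "edge", "brave", "chromium", "opera", "vivaldi",
]


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser with only the video IDs and the new cookie flag."""
    parser = argparse.ArgumentParser(prog="youtube_transcript_api")
    parser.add_argument("video_ids", nargs="+")
    parser.add_argument(
        "--cookies-from-browser",
        dest="cookies_from_browser",
        default=None,
        choices=SUPPORTED_BROWSERS,  # argparse rejects any other value
    )
    return parser


args = build_parser().parse_args(["VIDEO_ID", "--cookies-from-browser", "firefox"])
print(args.cookies_from_browser)
```

Because of `choices`, an unsupported browser name fails at argument parsing, before any cookie extraction is attempted.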
