51 changes: 51 additions & 0 deletions .github/workflows/test-and-release.yaml
@@ -80,6 +80,57 @@ jobs:
run: pnpm build
working-directory: ./website

- name: Run Lychee Link Checker
id: lychee
if: github.event_name == 'pull_request'
uses: lycheeverse/lychee-action@v2.8.0
with:
fail: false
args: >
--base-url https://docs.apify.com
--max-retries 6
--exclude="github.com"
--exclude="npmjs.com"
--exclude="docs\.apify\.com/sdk/js/assets/"
--no-progress
--timeout '60'
--accept '100..=103,200..=299,429'
--max-redirects 5
--format markdown
'./website/build/**/*.html'

- name: Find Comment
if: github.event_name == 'pull_request'
uses: peter-evans/find-comment@v4
id: find-comment
with:
issue-number: ${{ github.event.pull_request.number }}
body-includes: There are broken links in the documentation.

- name: Links are passing
if: github.event_name == 'pull_request' && steps.lychee.outputs.exit_code == 0 && steps.find-comment.outputs.comment-id
uses: peter-evans/create-or-update-comment@v5
with:
comment-id: ${{ steps.find-comment.outputs.comment-id }}
issue-number: ${{ github.event.pull_request.number }}
body: |
✅ The link checker did not find any broken links.

See more at ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}#summary-${{ job.check_run_id }}
edit-mode: replace

- name: Links are failing
if: github.event_name == 'pull_request' && steps.lychee.outputs.exit_code != 0
uses: peter-evans/create-or-update-comment@v5
with:
comment-id: ${{ steps.find-comment.outputs.comment-id }}
issue-number: ${{ github.event.pull_request.number }}
body: |
⚠️ There are broken links in the documentation.

See more at ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}#summary-${{ job.check_run_id }}
edit-mode: replace

lint:
name: Lint
runs-on: ubuntu-22.04
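
The gating in the new steps can be read as a small decision function. The sketch below is an illustration only, not code from this PR: `decideCommentAction`, `parseAccept`, and `isAccepted` are hypothetical names, the exit code is modelled as a number, and the `--accept` parsing covers only the single-code and inclusive `a..=b` forms used above. It assumes that `create-or-update-comment` creates a new comment when `comment-id` is empty and updates the existing one otherwise.

```typescript
// Sketch only: every name here is hypothetical and exists neither in the workflow nor in lychee.

type CommentAction = 'replace-with-passing' | 'post-failing' | 'none';

// Mirrors the `if:` conditions on the "Links are passing" and "Links are failing" steps.
function decideCommentAction(
    eventName: string,
    lycheeExitCode: number,
    existingCommentId?: string,
): CommentAction {
    // Every link-check step is gated on pull_request events.
    if (eventName !== 'pull_request') return 'none';
    // A non-zero exit code always posts the warning (assumed: update if found, else create).
    if (lycheeExitCode !== 0) return 'post-failing';
    // On success, only a previously posted warning is rewritten; nothing new is posted.
    return existingCommentId ? 'replace-with-passing' : 'none';
}

// Simplified reading of `--accept`: comma-separated codes or inclusive `a..=b` ranges.
function parseAccept(spec: string): Array<[number, number]> {
    return spec.split(',').map((part) => {
        const [lo, hi] = part.trim().split('..=');
        const start = Number(lo);
        const end = hi === undefined ? start : Number(hi);
        return [start, end] as [number, number];
    });
}

// True when the status code falls inside any accepted range (bounds included).
function isAccepted(status: number, ranges: Array<[number, number]>): boolean {
    return ranges.some(([start, end]) => status >= start && status <= end);
}

const accepted = parseAccept('100..=103,200..=299,429');
console.log(decideCommentAction('pull_request', 0, undefined)); // no comment on a clean first run
console.log(isAccepted(429, accepted), isAccepted(404, accepted));
```

Because `fail: false` keeps the lychee step itself from failing the job, the exit code output is the only signal the two comment steps act on.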
2 changes: 1 addition & 1 deletion README.md
@@ -33,7 +33,7 @@ await Actor.setValue('OUTPUT', {
await Actor.exit();
```

> You can also install the [`crawlee`](https://npmjs.org/crawlee) module, as it now provides the crawlers that were previously exported by Apify SDK. If you don't plan to use crawlers in your Actors, then you don't need to install it. Keep in mind that neither `playwright` nor `puppeteer` are bundled with `crawlee` in order to reduce install size and allow greater flexibility. That's why we manually install it with NPM. You can choose one, both, or neither. For more information and example please check [`documentation.`](https://docs.apify.com/sdk/js/docs/concepts/actor-lifecycle#running-crawlee-code-as-an-actor)
> You can also install the [`crawlee`](https://www.npmjs.com/package/crawlee) module, as it now provides the crawlers that were previously exported by Apify SDK. If you don't plan to use crawlers in your Actors, then you don't need to install it. Keep in mind that neither `playwright` nor `puppeteer` are bundled with `crawlee` in order to reduce install size and allow greater flexibility. That's why we manually install it with NPM. You can choose one, both, or neither. For more information and example please check [`documentation.`](https://docs.apify.com/sdk/js/docs/concepts/actor-lifecycle#running-crawlee-code-as-an-actor)

## Support

4 changes: 2 additions & 2 deletions docs/02_concepts/07_docker_images.mdx
@@ -18,7 +18,7 @@ import tsBrowserDocker from '!!raw-loader!./docker_browser_ts.txt';

Running headless browsers in Docker requires a lot of setup to do it right. But there's no need to worry about that, because we already created base images that you can freely use. We use them every day on the [Apify Platform](./actor-lifecycle).

All images can be found in their [GitHub repo](https://github.com/apify/apify-actor-docker) and in our [DockerHub](https://hub.docker.com/orgs/apify).
All images can be found in their [GitHub repo](https://github.com/apify/apify-actor-docker) and in our [DockerHub](https://hub.docker.com/u/apify).

## Overview

@@ -180,7 +180,7 @@ FROM apify/actor-node-playwright:24

Similar to [`actor-node-puppeteer-chrome`](#actor-node-puppeteer-chrome), but for Playwright. You can run <CrawleeApiLink to="cheerio-crawler/class/CheerioCrawler">`CheerioCrawler`</CrawleeApiLink> and <CrawleeApiLink to="playwright-crawler/class/PlaywrightCrawler">`PlaywrightCrawler`</CrawleeApiLink>, but **NOT** <CrawleeApiLink to="puppeteer-crawler/class/PuppeteerCrawler">`PuppeteerCrawler`</CrawleeApiLink>.

It uses the [`PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD`](https://playwright.dev/docs/api/environment-variables/) environment variable to block installation of more browsers into the image to keep it small. If you want more browsers, either use the [`actor-node-playwright`](#actor-node-playwright) image override this env var.
It uses the [`PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD`](https://playwright.dev/docs/library#browser-downloads) environment variable to block installation of more browsers into the image to keep it small. If you want more browsers, either use the [`actor-node-playwright`](#actor-node-playwright) image override this env var.

The image supports XVFB by default, so we can run both `headless` and `headful` browsers with it.

2 changes: 1 addition & 1 deletion docs/03_guides/crawl_some_links.mdx
@@ -8,7 +8,7 @@ import ApiLink from '@site/src/components/ApiLink';
import { CrawleeApiLink } from '@site/src/components/CrawleeLinks';
import CrawlSource from '!!raw-loader!roa-loader!./crawl_some_links.ts';

This <CrawleeApiLink to="cheerio-crawler/class/CheerioCrawler">`CheerioCrawler`</CrawleeApiLink> example uses the <CrawleeApiLink to="core/class/PseudoUrl">`pseudoUrls`</CrawleeApiLink> property in the <CrawleeApiLink to="cheerio-crawler/interface/CheerioRequestHandlerInputs#enqueueLinks">`enqueueLinks()`</CrawleeApiLink> method to only add links to the <ApiLink to="apify/class/RequestQueue">`RequestQueue`</ApiLink> queue if they match the specified regular expression.
This <CrawleeApiLink to="cheerio-crawler/class/CheerioCrawler">`CheerioCrawler`</CrawleeApiLink> example uses the <CrawleeApiLink to="core/class/PseudoUrl">`pseudoUrls`</CrawleeApiLink> property in the <CrawleeApiLink to="cheerio-crawler/interface/CheerioCrawlingContext#enqueueLinks">`enqueueLinks()`</CrawleeApiLink> method to only add links to the <ApiLink to="apify/class/RequestQueue">`RequestQueue`</ApiLink> queue if they match the specified regular expression.

<RunnableCodeBlock className="language-js" type="cheerio">
{CrawlSource}
2 changes: 1 addition & 1 deletion docs/03_guides/typescript_setup.mdx
@@ -1,5 +1,5 @@
---
id: type-script-actor
id: typescript-setup
sidebar_label: TypeScript Actors
title: Setting up a TypeScript project
---
2 changes: 1 addition & 1 deletion src/actor.ts
@@ -217,7 +217,7 @@ export interface ApifyEnv {
logFormat: string | null;

/**
* Origin for the Actor run, i.e. how it was started. See [here](https://docs.apify.com/sdk/python/reference/enum/MetaOrigin)
* Origin for the Actor run, i.e. how it was started. See [here](https://docs.apify.com/platform/actors/running/runs-and-builds#origin)
* for more details. (APIFY_META_ORIGIN)
*/
metaOrigin: string | null;
3 changes: 1 addition & 2 deletions website/docusaurus.config.js
@@ -112,6 +112,7 @@ module.exports = {
editUrl:
'https://github.com/apify/apify-sdk-js/edit/master/website/',
},
blog: false,
}),
],
]),
@@ -143,8 +144,6 @@ module.exports = {
'https://crawlee.dev/js/api/core/interface/StorageManagerOptions',
ConfigurationOptions:
'https://crawlee.dev/js/api/core/interface/ConfigurationOptions',
EventManager:
'https://crawlee.dev/js/api/core/interface/EventManager',
RecordOptions:
'https://crawlee.dev/js/api/core/interface/RecordOptions',
UseStateOptions:
2 changes: 1 addition & 1 deletion website/versioned_docs/version-1.3/api/Dataset.md
@@ -116,7 +116,7 @@ call [`datasetClient.downloadItems()`](https://github.com/apify/apify-client-js#
Returns an object containing general information about the dataset.

The function returns the same object as the Apify API Client's
[getDataset](https://docs.apify.com/api/apify-client-js/latest#ApifyClient-datasets-getDataset) function, which in turn calls the
[getDataset](https://docs.apify.com/api/client/js/reference/class/DatasetClient) function, which in turn calls the
[Get dataset](https://apify.com/docs/api/v2#/reference/datasets/dataset/get-dataset) API endpoint.

**Example:**
2 changes: 1 addition & 1 deletion website/versioned_docs/version-1.3/api/RequestQueue.md
@@ -212,7 +212,7 @@ const { handledRequestCount } = await queue.getInfo();

Returns an object containing general information about the request queue.

The function returns the same object as the Apify API Client's [getQueue](https://docs.apify.com/api/apify-client-js/latest#ApifyClient-requestQueues)
The function returns the same object as the Apify API Client's [getQueue](https://docs.apify.com/api/client/js/reference/class/RequestQueueClient)
function, which in turn calls the [Get request queue](https://apify.com/docs/api/v2#/reference/request-queues/queue/get-request-queue) API endpoint.

**Example:**
4 changes: 2 additions & 2 deletions website/versioned_docs/version-1.3/guides/docker_images.md
@@ -6,7 +6,7 @@ id: docker-images

Running headless browsers in Docker requires a lot of setup to do it right. But you don't need to worry about that, because we already did it for you and created base images that you can freely use. We use them every day on the [Apify Platform](../guides/apify_platform.md).

All images can be found in their [GitHub repo](https://github.com/apify/apify-actor-docker) and in our [DockerHub](https://hub.docker.com/orgs/apify).
All images can be found in their [GitHub repo](https://github.com/apify/apify-actor-docker) and in our [DockerHub](https://hub.docker.com/u/apify).

## Overview

@@ -165,7 +165,7 @@ Similar to [`actor-node-puppeteer-chrome`](#actor-node-puppeteer-chrome), but fo
[`CheerioCrawler`](../api/cheerio-crawler) and [`PlaywrightCrawler`](../api/playwright-crawler),
but **NOT** [`PuppeteerCrawler`](../api/puppeteer-crawler).

It uses the [`PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD`](https://playwright.dev/docs/api/environment-variables/)
It uses the [`PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD`](https://playwright.dev/docs/library#browser-downloads)
environment variable to block installation of more browsers into your images (to keep them small).
If you want more browsers, either choose the [`actor-node-playwright`](#actor-node-playwright) image
or override this env var.
2 changes: 1 addition & 1 deletion website/versioned_docs/version-2.3/api/RequestQueue.md
@@ -219,7 +219,7 @@ const { handledRequestCount } = await queue.getInfo();

Returns an object containing general information about the request queue.

The function returns the same object as the Apify API Client's [getQueue](https://docs.apify.com/api/apify-client-js/latest#ApifyClient-requestQueues)
The function returns the same object as the Apify API Client's [getQueue](https://docs.apify.com/api/client/js/reference/class/RequestQueueClient)
function, which in turn calls the [Get request queue](https://apify.com/docs/api/v2#/reference/request-queues/queue/get-request-queue) API endpoint.

**Example:**
4 changes: 2 additions & 2 deletions website/versioned_docs/version-2.3/guides/docker_images.md
@@ -6,7 +6,7 @@ id: docker-images

Running headless browsers in Docker requires a lot of setup to do it right. But you don't need to worry about that, because we already did it for you and created base images that you can freely use. We use them every day on the [Apify Platform](../guides/apify_platform.md).

All images can be found in their [GitHub repo](https://github.com/apify/apify-actor-docker) and in our [DockerHub](https://hub.docker.com/orgs/apify).
All images can be found in their [GitHub repo](https://github.com/apify/apify-actor-docker) and in our [DockerHub](https://hub.docker.com/u/apify).

## Overview

@@ -165,7 +165,7 @@ Similar to [`actor-node-puppeteer-chrome`](#actor-node-puppeteer-chrome), but fo
[`CheerioCrawler`](../api/cheerio-crawler) and [`PlaywrightCrawler`](../api/playwright-crawler),
but **NOT** [`PuppeteerCrawler`](../api/puppeteer-crawler).

It uses the [`PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD`](https://playwright.dev/docs/api/environment-variables/)
It uses the [`PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD`](https://playwright.dev/docs/library#browser-downloads)
environment variable to block installation of more browsers into your images (to keep them small).
If you want more browsers, either choose the [`actor-node-playwright`](#actor-node-playwright) image
or override this env var.
4 changes: 2 additions & 2 deletions website/versioned_docs/version-3.0/api-typedoc.json
@@ -7086,7 +7086,7 @@
"summary": [
{
"kind": "text",
"text": "Returns an object containing general information about the dataset.\n\nThe function returns the same object as the Apify API Client's\n[getDataset](https://docs.apify.com/api/apify-client-js/latest#ApifyClient-datasets-getDataset)\nfunction, which in turn calls the\n[Get dataset](https://apify.com/docs/api/v2#/reference/datasets/dataset/get-dataset)\nAPI endpoint.\n\n**Example:**\n"
"text": "Returns an object containing general information about the dataset.\n\nThe function returns the same object as the Apify API Client's\n[getDataset](https://docs.apify.com/api/client/js/reference/class/DatasetClient)\nfunction, which in turn calls the\n[Get dataset](https://apify.com/docs/api/v2#/reference/datasets/dataset/get-dataset)\nAPI endpoint.\n\n**Example:**\n"
},
{
"kind": "code",
@@ -11948,7 +11948,7 @@
"summary": [
{
"kind": "text",
"text": "Returns an object containing general information about the request queue.\n\nThe function returns the same object as the Apify API Client's\n[getQueue](https://docs.apify.com/api/apify-client-js/latest#ApifyClient-requestQueues)\nfunction, which in turn calls the\n[Get request queue](https://apify.com/docs/api/v2#/reference/request-queues/queue/get-request-queue)\nAPI endpoint.\n\n**Example:**\n"
"text": "Returns an object containing general information about the request queue.\n\nThe function returns the same object as the Apify API Client's\n[getQueue](https://docs.apify.com/api/client/js/reference/class/RequestQueueClient)\nfunction, which in turn calls the\n[Get request queue](https://apify.com/docs/api/v2#/reference/request-queues/queue/get-request-queue)\nAPI endpoint.\n\n**Example:**\n"
},
{
"kind": "code",
@@ -8,6 +8,6 @@ import ApiLink from '@site/src/components/ApiLink';
import { CrawleeApiLink } from '@site/src/components/CrawleeLinks';
import CrawlSource from '!!raw-loader!./crawl_some_links.ts';

This <CrawleeApiLink to="cheerio-crawler/class/CheerioCrawler">`CheerioCrawler`</CrawleeApiLink> example uses the <CrawleeApiLink to="core/class/PseudoUrl">`pseudoUrls`</CrawleeApiLink> property in the <CrawleeApiLink to="cheerio-crawler/interface/CheerioRequestHandlerInputs#enqueueLinks">`enqueueLinks()`</CrawleeApiLink> method to only add links to the <ApiLink to="apify/class/RequestQueue">`RequestQueue`</ApiLink> queue if they match the specified regular expression.
This <CrawleeApiLink to="cheerio-crawler/class/CheerioCrawler">`CheerioCrawler`</CrawleeApiLink> example uses the <CrawleeApiLink to="core/class/PseudoUrl">`pseudoUrls`</CrawleeApiLink> property in the <CrawleeApiLink to="cheerio-crawler/interface/CheerioCrawlingContext#enqueueLinks">`enqueueLinks()`</CrawleeApiLink> method to only add links to the <ApiLink to="apify/class/RequestQueue">`RequestQueue`</ApiLink> queue if they match the specified regular expression.

<CodeBlock className="language-js">{CrawlSource}</CodeBlock>
4 changes: 2 additions & 2 deletions website/versioned_docs/version-3.0/guides/docker_images.mdx
@@ -18,7 +18,7 @@ import tsBrowserDocker from '!!raw-loader!./docker_browser_ts.txt';

Running headless browsers in Docker requires a lot of setup to do it right. But there's no need to worry about that, because we already created base images that you can freely use. We use them every day on the [Apify Platform](./apify-platform).

All images can be found in their [GitHub repo](https://github.com/apify/apify-actor-docker) and in our [DockerHub](https://hub.docker.com/orgs/apify).
All images can be found in their [GitHub repo](https://github.com/apify/apify-actor-docker) and in our [DockerHub](https://hub.docker.com/u/apify).

## Overview

@@ -138,7 +138,7 @@ FROM apify/actor-node-playwright:16

Similar to [`actor-node-puppeteer-chrome`](#actor-node-puppeteer-chrome), but for Playwright. You can run <CrawleeApiLink to="cheerio-crawler/class/CheerioCrawler">`CheerioCrawler`</CrawleeApiLink> and <CrawleeApiLink to="playwright-crawler/class/PlaywrightCrawler">`PlaywrightCrawler`</CrawleeApiLink>, but **NOT** <CrawleeApiLink to="puppeteer-crawler/class/PuppeteerCrawler">`PuppeteerCrawler`</CrawleeApiLink>.

It uses the [`PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD`](https://playwright.dev/docs/api/environment-variables/) environment variable to block installation of more browsers into the image to keep it small. If you want more browsers, either use the [`actor-node-playwright`](#actor-node-playwright) image override this env var.
It uses the [`PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD`](https://playwright.dev/docs/library#browser-downloads) environment variable to block installation of more browsers into the image to keep it small. If you want more browsers, either use the [`actor-node-playwright`](#actor-node-playwright) image override this env var.

The image supports XVFB by default, so we can run both `headless` and `headful` browsers with it.
