Not returning any additional links from menu crawl #493

@Firelord710

Description

Which package is this bug report for? If unsure which one to select, leave blank

@crawlee/core

Issue description

Hello, I am running the following code, adapted from the introduction example, to scrape only the links from this site.

It does not add any additional links to the queue, and I am unable to figure out why. Thank you in advance.

Code sample

import asyncio

# You don't need to import RequestQueue anymore.
from crawlee.beautifulsoup_crawler import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
from crawlee import EnqueueStrategy


async def main() -> None:
    crawler = BeautifulSoupCrawler(max_requests_per_crawl=10000)

    @crawler.router.default_handler
    async def request_handler(context: BeautifulSoupCrawlingContext) -> None:

        # Bypass age confirmation popup
        try:
            await context.soup.wait_for_selector('button:has-text("I am 21 or older")', timeout=5000)
            await context.soup.click('button:has-text("I am 21 or older")')
            await context.soup.wait_for_load_state('networkidle')
        except:
            print("Age confirmation popup not found or already handled.")

        url = context.request.url
        title = context.soup.title.string if context.soup.title else ''
        context.log.info(f'The title of {url} is: {title}.')

        # The enqueue_links function is available as one of the fields of the context.
        # It is also context aware, so it does not require any parameters.
        await context.enqueue_links(selector='button, a', strategy=EnqueueStrategy.ALL)

    # Start the crawler with the provided URLs.
    await crawler.run(['https://dutchie.com/embedded-menu/truormed-dispensary'])


if __name__ == '__main__':
    asyncio.run(main())
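
For anyone reproducing this, here is a minimal, stdlib-only sketch for checking how many of the elements matched by `button, a` in a piece of static HTML actually carry an `href`. It assumes that link extraction relies on `href` attributes, which is not confirmed in this report. The `HrefCollector` class, the `collect_hrefs` helper, and the sample HTML are hypothetical illustrations, not part of crawlee and not taken from the site above.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href values from <a> and <button> start tags (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []   # href values that were found
        self.without_href = 0        # matched elements that have no href

    def handle_starttag(self, tag, attrs):
        # Mirror the 'button, a' selector used in the code sample above.
        if tag not in ('a', 'button'):
            return
        href = dict(attrs).get('href')
        if href:
            self.hrefs.append(href)
        else:
            self.without_href += 1


def collect_hrefs(html: str) -> HrefCollector:
    """Parse a static HTML string and return the populated collector."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser


# Made-up markup for illustration only; the real page may look very different.
SAMPLE_HTML = """
<html><body>
  <button>I am 21 or older</button>
  <a href="/menu/flower">Flower</a>
  <a>No target</a>
</body></html>
"""

result = collect_hrefs(SAMPLE_HTML)
print(result.hrefs)
print(result.without_href)
```

Feeding the HTML that the crawler actually receives into `collect_hrefs` (instead of `SAMPLE_HTML`) should show whether the static response contains any followable links at all, or whether the menu is rendered client-side.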

Package version

0.3.2b4

Node.js version

Python

Operating system

Windows

I have tested this on the next release

No response

Other context

No response

Metadata

Labels

  • bug: Something isn't working.
  • t-tooling: Issues with this label are in the ownership of the tooling team.
