
chore: node slim #11

Merged
hmbanan666 merged 2 commits into main from slim
Jul 18, 2025

Conversation

@hmbanan666 hmbanan666 commented Jul 18, 2025

Collaborator

Summary by CodeRabbit

  • Bug Fixes

    • Improved the reliability of selecting and clicking the correct button in the popup for ranking by novelty.
  • Chores

    • Updated the Docker image to use a more stable base and streamlined browser installation process.
    • Simplified user and group creation commands for broader compatibility.

@hmbanan666 hmbanan666 self-assigned this Jul 18, 2025

coderabbitai Bot commented Jul 18, 2025

Walkthrough

The changes update the logic for selecting and clicking a button in a Playwright script, switching from text-based to index-based selection. The Dockerfile switches the Node.js base image from Alpine to Slim, updates the Playwright installation steps, and replaces the Alpine-specific user creation commands with standard Linux ones.

Changes

| File(s) | Change Summary |
|---|---|
| apps/web-parser/server/tasks/yandex/update-data.ts | Modified Playwright selector logic to click the second element by index instead of by text content. |
| docker/web-parser/Dockerfile | Changed Node.js base image from Alpine to Slim, updated Playwright install commands, removed Alpine-specific commands, and standardized user/group creation. |

Possibly related PRs

  • chore: some updates #10: Both PRs update Dockerfile Playwright-related configurations, including browser installation and user setup.
  • chore: browser change #7: Both PRs modify browser automation logic in update-data.ts, with one replacing Puppeteer with Playwright and the other updating element selection.

Poem

In the warren where scripts are spun,
A button now clicked, not by word but by one.
Docker’s coat grows sleek and slim,
Playwright’s shell installed on a whim.
Rabbits hop with code anew—
Fresh carrots for the parsing crew! 🥕


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e797e9 and 4a4b33e.

📒 Files selected for processing (1)
  • docker/web-parser/Dockerfile (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docker/web-parser/Dockerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
apps/web-parser/server/tasks/yandex/update-data.ts (1)

88-89: Consider the trade-offs of index-based selection.

Switching from text-based to index-based button selection makes the code more resilient to text changes but more fragile to DOM structure changes. If Yandex modifies their popup structure, buttons[1] might select the wrong element or be undefined.

The optional chaining (?.click()) is good defensive programming, but consider adding additional validation to ensure the correct button is selected.

Consider adding validation to ensure robustness:

 const buttons = await page.$$('.rating-ranking-view__popup-line')
-await buttons[1]?.click()
+if (buttons.length >= 2) {
+  await buttons[1].click()
+} else {
+  logger.warn('Expected at least 2 buttons in rating popup, found:', buttons.length)
+}
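
One way to combine both approaches, sketched here under assumptions rather than taken from the PR: select by index, but confirm that the element exists and, optionally, that its text matches before clicking. The names `ClickableElement`, `ClickResult`, and `clickByIndexChecked` are hypothetical and do not appear in the repository. The interface is deliberately minimal so that a Playwright element handle should satisfy it structurally.

```typescript
// Hypothetical minimal shape of an element handle; only the two
// methods used below are required.
interface ClickableElement {
  click(): Promise<unknown>;
  textContent(): Promise<string | null>;
}

// Outcome of the attempt, so the caller can decide how to log or retry.
interface ClickResult {
  clicked: boolean;
  reason?: string;
}

// Click the element at `index` only if it exists and, when
// `expectedText` is provided, its text contains that string.
async function clickByIndexChecked(
  elements: ClickableElement[],
  index: number,
  expectedText?: string,
): Promise<ClickResult> {
  const target = elements[index];
  if (!target) {
    return {
      clicked: false,
      reason: `expected at least ${index + 1} elements, found ${elements.length}`,
    };
  }
  if (expectedText !== undefined) {
    const text = (await target.textContent()) ?? '';
    if (!text.includes(expectedText)) {
      return {
        clicked: false,
        reason: `element ${index} has unexpected text: "${text.trim()}"`,
      };
    }
  }
  await target.click();
  return { clicked: true };
}
```

In the task this might be called as `await clickByIndexChecked(await page.$$('.rating-ranking-view__popup-line'), 1)`, with the returned `reason` passed to the logger when `clicked` is false. Whether a text check is worth adding depends on how stable the popup labels are, which the PR does not state.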
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 43dd52f and 8e797e9.

📒 Files selected for processing (2)
  • apps/web-parser/server/tasks/yandex/update-data.ts (1 hunks)
  • docker/web-parser/Dockerfile (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (1)
docker/web-parser/Dockerfile (1)

27-28: LGTM: Correct adaptation of user creation commands.

The change from Alpine-specific commands (addgroup -S, adduser -S) to standard Linux commands (groupadd -r, useradd -r -g) is correct for the Debian-based slim image. This maintains the security best practice of running as non-root.

RUN apk add --no-cache curl
WORKDIR /app

RUN npx playwright install chromium --with-deps --only-shell

@coderabbitai coderabbitai Bot Jul 18, 2025

🛠️ Refactor suggestion

Consider moving Playwright installation back to builder stage.

Installing Playwright in the production stage introduces a network dependency during container startup and increases the attack surface. Consider moving this back to the builder stage for better security and reliability.

# Builder stage
FROM base AS builder
WORKDIR /app
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml .npmrc ./
COPY packages/database ./packages/database
COPY apps/web-parser ./apps/web-parser
RUN npm install --ignore-scripts -g pnpm && \
  pnpm i --frozen-lockfile && \
  pnpm build --filter @roll-stack/web-parser
+RUN npx playwright install chromium --with-deps --only-shell

# Production stage
FROM base AS production
WORKDIR /app

-RUN npx playwright install chromium --with-deps --only-shell
+COPY --from=builder /opt/ms-playwright /opt/ms-playwright

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In docker/web-parser/Dockerfile at line 19, the Playwright installation is done
in the production stage, causing network dependency and security concerns. Move
the RUN npx playwright install chromium --with-deps --only-shell command from
the production stage back into the builder stage to ensure all dependencies are
installed during build time, reducing startup time and attack surface.

Collaborator Author

Nope. This RUN installs all the necessary libs, so it must be in the final stage.

@hmbanan666, understood! You're absolutely right - the --with-deps flag installs system-level dependencies that need to be present in the runtime environment, not just the browser binaries. These system libraries can't be simply copied between stages and must be installed in the final production stage. Thank you for the clarification!


✏️ Learnings added
Learnt from: hmbanan666
PR: sushi-love/roll-stack#11
File: docker/web-parser/Dockerfile:19-19
Timestamp: 2025-07-18T15:09:04.345Z
Learning: In Docker multi-stage builds with Playwright, the `npx playwright install chromium --with-deps --only-shell` command must be run in the production stage, not the builder stage, because `--with-deps` installs system-level dependencies (libraries, fonts, etc.) that need to be present in the runtime environment and cannot be copied between stages.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment thread docker/web-parser/Dockerfile Outdated

@hmbanan666 hmbanan666 merged commit a23acc9 into main Jul 18, 2025
8 checks passed
@hmbanan666 hmbanan666 deleted the slim branch July 18, 2025 15:10
@coderabbitai coderabbitai Bot mentioned this pull request Jul 22, 2025