This repository was archived by the owner on Jul 31, 2025. It is now read-only.

Fix: Better anonymization #49

Closed
Squidly271 wants to merge 1 commit into
mainfrom
diagnostics

Conversation

@Squidly271
Collaborator

@Squidly271 Squidly271 commented Jun 8, 2025

Anonymize email addresses, IPv6 addresses, and MAC addresses
Do not include non-anonymized rotated log files

Summary by CodeRabbit

  • Bug Fixes
    • Improved redaction of sensitive information in diagnostics by masking email addresses, MAC addresses, and IPv6 patterns.
    • Diagnostics now only display the first line of the "lock status" output for clarity.
    • Refined diagnostics file collection by excluding SSH and nginx configurations and specifying exact log files for collection.

Signed-off-by: Squidly271 <unraidsquid@gmail.com>
@coderabbitai

coderabbitai Bot commented Jun 8, 2025

Walkthrough

The diagnostics configuration for the Tailscale plugin was updated to enhance data redaction by adding new regex filters for sensitive information. The log collection process was refined to target specific log files, and the output of a diagnostic command was limited to its first line. Certain configuration files were excluded from collection.

Changes

File(s): src/usr/local/emhttp/plugins/tailscale/diagnostics.json
Change Summary: Added regex filters for email, MAC, and IPv6 redaction; limited "tailscale lock status" to one line; refined log file list; removed SSH/nginx configs.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant DiagnosticsCollector
    participant System

    User->>DiagnosticsCollector: Initiate diagnostics collection
    DiagnosticsCollector->>System: Run "tailscale lock status" (capture first line)
    DiagnosticsCollector->>System: Collect specified log files
    DiagnosticsCollector->>System: Exclude SSH/nginx configs
    DiagnosticsCollector->>DiagnosticsCollector: Apply regex redaction filters
    DiagnosticsCollector-->>User: Provide redacted diagnostics data


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/usr/local/emhttp/plugins/tailscale/diagnostics.json (1)

5-11: Regex filters expanded for enhanced anonymization
The new filters cover magicDNS domains, email addresses, MAC addresses, and multiple IPv6 formats. These additions significantly improve redaction coverage—but please verify each pattern against edge cases (e.g., uncommon IPv6 shorthand, bracketed addresses, mixed‐case hex, varied delimiters).
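The edge cases listed above (mixed-case hex, varied delimiters) can be checked with a small harness. The following is a minimal sketch; the `EMAIL` and `MAC` patterns and the `<email>`/`<mac>` placeholders are assumptions made for illustration and are not the filters in diagnostics.json.

```python
import re

# Hypothetical patterns for illustration only; the PR's actual filters
# live in diagnostics.json and are not reproduced in this thread.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Six two-digit hex groups separated by ":" or "-", any letter case.
MAC = re.compile(r"\b[0-9A-Fa-f]{2}(?:[:-][0-9A-Fa-f]{2}){5}\b")


def redact(text: str) -> str:
    """Replace email and MAC addresses with fixed placeholders."""
    text = EMAIL.sub("<email>", text)
    text = MAC.sub("<mac>", text)
    return text


if __name__ == "__main__":
    samples = [
        "login by user@example.com",
        "hw 00:1A:2b:3C:4d:5E up",   # mixed-case hex
        "hw 00-1a-2b-3c-4d-5e",      # dash delimiter
        "at 12:34:56 ok",            # timestamp, should stay untouched
    ]
    for line in samples:
        print(f"{line!r} -> {redact(line)!r}")
```

Because the MAC pattern requires exactly six groups, a three-group timestamp such as 12:34:56 is left alone; a looser pattern would not give that guarantee.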

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d889491 and cdb2095.

📒 Files selected for processing (1)
  • src/usr/local/emhttp/plugins/tailscale/diagnostics.json (3 hunks)
🔇 Additional comments (2)
src/usr/local/emhttp/plugins/tailscale/diagnostics.json (2)

28-28: Limit tailscale lock status output to first line
Piping through head -1 correctly restricts the output. Confirm that head is present in all supported environments or consider using a built‐in shell fallback (sed '1q') if portability is a concern.


76-78: Restrict log file collection to explicit Tailscale logs
Explicitly listing tailscale.log, tailscale-utils.log, and tailscale-error.log avoids inadvertently including un‐anonymized rotated files. Please verify these filenames exist on target systems and consider whether any rotated variants (e.g., .1, .gz) should also be captured and masked.

@Squidly271
Collaborator Author

5-11: Regex filters expanded for enhanced anonymization

@coderabbitai - Already did except for some very sharp edge cases

@coderabbitai

coderabbitai Bot commented Jun 8, 2025

@Squidly271 Great work on implementing the enhanced anonymization! I appreciate your transparency about the edge cases that might still remain.

The regex filters you've added for email addresses, MAC addresses, and IPv6 addresses in the diagnostics configuration look comprehensive. If you'd like me to help identify any potential edge cases that might still need coverage, or if you want to discuss specific patterns that are proving challenging to anonymize, feel free to let me know.

The current implementation appears to handle the common cases well, and it's perfectly reasonable to address edge cases incrementally as they're discovered in practice.

"/etc/ssh/sshd_config",
"/etc/nginx/conf.d/servers.conf",
"/etc/samba/smb-names.conf",
"/var/log/tailscale*"
Collaborator

Did you check to see if the .gz files were anonymized? plugin-diagnostics is smarter than I think you gave it credit for :)

https://github.com/dkaser/unraid-plugin-diagnostics/blob/89db27fee3c70ce9be3cef3af8423e681c09fb63/src/usr/local/php/unraid-plugin-diagnostics/unraid-plugin-diagnostics/Utils.php#L104
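To illustrate the general idea of anonymizing rotated logs, here is a minimal sketch of a collector that decompresses .gz files before applying the same filters as for plain logs. It is not a port of the linked Utils.php, whose behavior is not shown in this thread; `read_log_text`, `redact_log`, and `FILTERS` are hypothetical names.

```python
import gzip
import re
from pathlib import Path

# Hypothetical filter list: (compiled pattern, replacement) pairs.
FILTERS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "<email>"),
]


def read_log_text(path: Path) -> str:
    """Return the log contents, decompressing .gz files transparently."""
    if path.suffix == ".gz":
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            return fh.read()
    return path.read_text(encoding="utf-8", errors="replace")


def redact_log(path: Path, filters) -> str:
    """Apply every filter to the (possibly decompressed) log text."""
    text = read_log_text(path)
    for pattern, replacement in filters:
        text = pattern.sub(replacement, text)
    return text
```

With this shape, a glob such as /var/log/tailscale* could still be collected, since a rotated file like tailscale.log.1.gz would pass through the same redaction as the live log.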

"file": "tailscale-debug-prefs.txt"
},
{
"command": "tailscale lock status",
Collaborator

tailscale lock status doesn't contain any sensitive information (public keys, but nothing more than that), and this removes valuable information (specifically, nodes that are currently locked out).

If the public keys are really a concern, then it would be better to apply a command-specific filter to substitute just the ID part of the keys.
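A command-specific filter of that kind could look like the sketch below. It assumes that keys appear as a prefix, a colon, and a hex ID, with prefixes such as `tlpub` and `nodekey`; the sample output is invented for illustration and is not captured from a real node, and `mask_key_ids` is a hypothetical helper.

```python
import re

# Assumption: keys look like "<prefix>:<hex id>". The prefix list is
# illustrative and may need adjusting to real "tailscale lock status" output.
KEY = re.compile(r"\b(tlpub|nodekey):[0-9A-Fa-f]+")


def mask_key_ids(output: str) -> str:
    """Keep the key prefix but replace the ID part with a placeholder."""
    return KEY.sub(lambda m: f"{m.group(1)}:<redacted>", output)


# Invented sample text, not real command output.
sample = (
    "Tailnet lock is ENABLED.\n"
    "Trusted signing keys:\n"
    "\ttlpub:0123abcd\n"
    "The following nodes are locked out:\n"
    "\tnode1\tnodekey:89ef4567\n"
)

if __name__ == "__main__":
    print(mask_key_ids(sample))
```

Unlike truncating the output to its first line, this keeps the list of locked-out nodes while hiding the key material.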

@Squidly271
Collaborator Author

Apologies regarding the .gz files; you are correct. When I created the regexes I was told that they were not being touched (my installs didn't have any rotated logs). I will revert that.

However, I just noticed that these regexes over-anonymize: a change I had already tested somehow didn't make it into this PR, so timestamps are getting anonymized in this version.

Re lock status: while we both realize that a public key is no problem, general users would likely not understand the difference between public and private keys and would question seeing something like that in the diagnostics, hence the removal. However, the installs we used for testing didn't show any of the additional information, only the status and the key, so we never realized it was there. I'll revamp.

Thanks!
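The timestamp over-anonymization mentioned above is a common failure mode of loose IPv6 patterns. The sketch below shows one way it can happen and one possible guard; both patterns are assumptions for illustration, not the regexes from this PR.

```python
import re

HEX = r"[0-9A-Fa-f]{1,4}"

# Deliberately loose: three or more hex groups joined by colons.
# A timestamp such as 12:34:56 fits this shape.
LOOSE_IPV6 = re.compile(rf"\b{HEX}(?::{HEX}){{2,7}}\b")

# Stricter sketch: require all eight groups, or a "::" compression.
FULL = rf"(?:{HEX}:){{7}}{HEX}"
COMPRESSED = rf"(?:{HEX}(?::{HEX})*)?::(?:{HEX}(?::{HEX})*)?"
STRICT_IPV6 = re.compile(
    rf"(?<![0-9A-Fa-f:])(?:{FULL}|{COMPRESSED})(?![0-9A-Fa-f:])"
)

line = "2025-06-08 12:34:56 peer fd7a:115c:a1e0::1 up"

if __name__ == "__main__":
    print(LOOSE_IPV6.sub("<ipv6>", line))   # the timestamp is clobbered
    print(STRICT_IPV6.sub("<ipv6>", line))  # the timestamp survives
```

The stricter pattern is still not exhaustive: it does not validate group counts around "::", and unrelated text containing "::" (for example `std::`) can still match, so it would need testing against real logs.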

@Squidly271 Squidly271 closed this Jul 19, 2025