Prefer HTML pasteboard flavor in rich-text-to-markdown for hyperlinked text fidelity #1033

Open
alibrohde wants to merge 1 commit into raycast:master from alibrohde:ali/hyperlinked-html-rich-text-to-markdown

@alibrohde

Description

Extends commands/conversions/rich-text-clipboard-to-markdown.sh to prefer the HTML pasteboard flavor over the RTF flavor, with the original RTF pipeline kept as a fallback.

Why

When you copy hyperlinked text from modern web sources (browsers, Gmail, Google Docs, Notion, Linear, Slack web, etc.), macOS puts both an RTF and an HTML flavor on the pasteboard. The HTML flavor carries the hyperlinks as real anchor tags, which pandoc renders cleanly as [label](url) in markdown. The RTF flavor from the same sources often drops or mangles those links, leaving the visible text intact but stripping the URL.
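
As a minimal sketch of the HTML path described above, the following pipes an HTML fragment through pandoc. The `html_to_markdown` helper name, the `gfm` output format, the `--wrap=none` option, and the example.com URL are illustrative assumptions, not taken from the script in this PR.

```shell
#!/bin/bash
# Hypothetical helper (not from the PR): convert an HTML fragment on stdin
# to Markdown on stdout. The gfm writer and --wrap=none are assumptions;
# the actual script may pass different pandoc options.
html_to_markdown() {
  pandoc -f html -t gfm --wrap=none
}

# Example input: an anchor tag like the ones the HTML flavor carries.
sample='<p>Check out <a href="https://example.com/raycast">Raycast</a>.</p>'

if command -v pandoc >/dev/null 2>&1; then
  printf '%s' "$sample" | html_to_markdown
else
  echo "pandoc not found; install it with: brew install pandoc" >&2
fi
```

With pandoc installed, the anchor tag should come out as an inline Markdown link of the `[label](url)` form.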

For anyone who frequently pastes rich text into CLIs or plain-text editors (Claude Code, terminals, vim, etc.), the HTML path meaningfully reduces link loss.

Before / after

Copy from a webpage:

Check out Raycast and Anthropic. (On the source page, "Raycast" and "Anthropic" are hyperlinks.)

Before (RTF path): hyperlinks often stripped, only visible text survives.

After (HTML path):

```
Check out [Raycast](url) and [Anthropic](url).
```

Backward compatibility

No breaking changes. When HTML is absent from the pasteboard (some native apps, older editors), the script falls through to the original RTF pipeline unchanged.
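
The fall-through order can be sketched roughly as follows. This is not the code from the PR: `pick_flavor` is a hypothetical helper that only reports which flavor would be converted, given the contents of each flavor as strings.

```shell
#!/bin/bash
# Hypothetical helper (not from the PR): decide which flavor to convert.
# $1 = HTML flavor contents, $2 = RTF flavor contents; either may be empty.
pick_flavor() {
  local html="$1" rtf="$2"
  if [ -n "$html" ]; then
    echo "html"   # preferred: keeps links as anchor tags
  elif [ -n "$rtf" ]; then
    echo "rtf"    # fallback: the original pipeline
  else
    echo "none"   # nothing convertible on the pasteboard
    return 1
  fi
}

# Both flavors present: HTML is chosen.
pick_flavor '<a href="https://example.com">x</a>' '{\rtf1 x}'
# Only RTF present: the fallback is chosen.
pick_flavor '' '{\rtf1 x}'
```

Keeping the decision separate from the conversion makes the fallback easy to check without a real pasteboard.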

Type of change

  • Improvement of an existing script

Dependencies / Requirements

Unchanged: still requires only pandoc (brew install pandoc).

Checklist

…d text fidelity

Modern web sources (browsers, Gmail, Google Docs, Notion, Linear, etc.) place
higher-fidelity HTML on the macOS pasteboard than RTF. Hyperlinked text in
those sources arrives as real anchor tags in the HTML flavor, which pandoc
cleanly renders as [label](url). The RTF flavor from the same sources often
drops or mangles those links.

The script now tries the HTML flavor first and falls back to the original RTF
pipeline when HTML is not present, preserving existing behavior for all
sources that only expose RTF.
Contributor

@unnamedd left a comment

Hi Ali,
Thank you for your contribution to our repository.

I've made only two comments. Please check them and re-request a review once you have addressed them, okay?

Comment on lines 7 to 8
# @raycast.author Adam Zethraeus
# @raycast.authorURL https://github.com/adam-zethraeus

If you are changing a Script Command originally created by someone else, it is nice to credit the new contributor as well, in this case: yourself.

Suggested change
# @raycast.author Adam Zethraeus
# @raycast.authorURL https://github.com/adam-zethraeus
# @raycast.author Ali Rohde
# @raycast.authorURL https://github.com/alibrohde

Comment on lines +24 to +29
# Prefer HTML: most modern web sources (browsers, Gmail, Google Docs, Notion,
# Linear, etc.) place higher-fidelity HTML on the pasteboard than RTF, and it
# preserves hyperlinked text as real anchor tags that pandoc renders as
# [label](url) in markdown.
html=$(osascript -e 'try' -e 'the clipboard as «class HTML»' -e 'on error' -e 'return ""' -e 'end try' 2>/dev/null \
| perl -ne 'chomp; next unless s/^«data HTML//; s/»$//; print pack("H*", $_)')

I am concerned about that because if the editor works with both, and the person prefers to have Markdown instead of RTF, the editor will kind of force the user to make use of RTF. Right?

The case I am thinking of here is Google Docs, where we can have blocks of code and want to paste Markdown inside a code block. I am not sure if it will respect that.

Perhaps it is better to have a second Script Command to convert to RTF instead of doing two things with a single Script Command.

What do you think?
