Word-level embeddings #110
-
|
Hi @Mystic-Slice, thank you for starting this thread. We need to gather evidence that word-level embeddings would also let us identify input prompts more accurately. How do you see this applied to the datasets and API endpoints we have today? |
-
|
Thank you @Mystic-Slice. Hello @santanavagner, here's some context on how I came to know about this project. As I remember from the presentation, when a user enters a prompt, the application currently recommends whole sentences in place of the faulty sentences (negative sentiment, foul words, etc.) given by the user. This is because we currently use sentence-level embeddings. Two problems could arise here (correct me if I'm wrong):
My proposal was: why not use word-level embeddings instead of sentence-level ones, so the application can be more dynamic and recommend swapping faulty words for useful ones instead of removing entire sentences? Happy to discuss this further, and now that I have a good understanding of the repository, I'll be happy to help implement it as well, if it seems feasible. Thank you. |
-
|
Hi @ArionDas, got it. I agree with you, and the results from our latest paper are aligned with this, i.e., users don't like it when the system removes a whole sentence because one or more words are identified as harmful. In fact, this idea of swapping terms is being implemented by @luanssouza in a feature that recommends a rephrasing of the sentence identified as harmful (#17) instead of removing it. @luanssouza, please consider sharing the working code you have so @ArionDas can collaborate with you on issue #17. Thank you all! Cheers |
-
The system currently treats each sentence as representing a single value. This is a good tradeoff between granularity/accuracy and the speed of recommendation.
Word-level embeddings could provide more granular and accurate classification. This discussion is meant to explore that idea.
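To make the idea concrete, here is a minimal sketch of word-level flagging. It is not the project's implementation: the vectors in `TOY_VECTORS`, the `HARMFUL_CENTROID`, the `flag_words` helper, and the `0.7` threshold are all hypothetical, and a real system would load trained word embeddings instead of hand-written ones. Each word is scored by cosine similarity to a "harmful" reference vector, so only the offending words are flagged rather than the whole sentence.

```python
import math

# Hypothetical toy embeddings (assumption: 3 dimensions, hand-written values).
# A real system would load trained word vectors instead.
TOY_VECTORS = {
    "write": [0.0, 1.0, 0.1],
    "a": [0.0, 1.0, 0.0],
    "stupid": [0.9, 0.1, 0.0],
    "summary": [0.0, 0.9, 0.2],
    "helpful": [0.0, 0.2, 0.9],
}

# Hypothetical reference vector standing in for "harmful" content.
HARMFUL_CENTROID = [1.0, 0.0, 0.0]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def flag_words(prompt, threshold=0.7):
    """Return (word, score) pairs for words whose similarity to the
    harmful reference vector meets the threshold. Words without a
    vector are skipped."""
    flagged = []
    for word in prompt.lower().split():
        vector = TOY_VECTORS.get(word)
        if vector is None:
            continue
        score = cosine(vector, HARMFUL_CENTROID)
        if score >= threshold:
            flagged.append((word, score))
    return flagged


if __name__ == "__main__":
    for word, score in flag_words("Write a stupid summary"):
        print(f"{word}: {score:.2f}")
```

Because the score is computed per word, a recommendation step could then propose a replacement for just the flagged word, at the cost of one similarity computation per word instead of one per sentence.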