This project is an LLM-based question-answering system that reads content from multiple URLs and answers user questions using only the text extracted from those pages.
It is implemented as a Google Colab notebook.
- Loads webpage content from given URLs
- Cleans and splits the text into chunks
- Uses LangChain with HuggingFace models
- Answers questions strictly from the retrieved context
- Helps reduce hallucinations by keeping answers grounded in that context
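
The cleaning, chunking, and context-only answering steps listed above can be illustrated with a minimal plain-Python sketch. It is not the notebook's code: the notebook uses LangChain for these steps, whereas the function names (`clean_text`, `split_into_chunks`, `build_prompt`), the chunk size, the overlap, and the prompt wording below are all assumptions chosen for illustration.

```python
import re


def clean_text(raw: str) -> str:
    """Collapse every run of whitespace into a single space."""
    return re.sub(r"\s+", " ", raw).strip()


def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly.

    The sizes are illustrative defaults, not values taken from the notebook.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Build a prompt that tells the model to answer only from the given chunks."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

Overlapping windows keep a sentence that straddles a chunk boundary visible in both neighbouring chunks, and the explicit "say you do not know" instruction is one common way to discourage answers that go beyond the retrieved text.
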
To run the notebook:
- Open the notebook in Google Colab
- Install required libraries
- Run cells from top to bottom
- Enter URLs and a question
Technologies used:
- Python
- Google Colab
- LangChain
- HuggingFace Transformers
Only the primary dependencies required to run the notebook are listed here. Some libraries (e.g., tokenizers, sentencepiece, and the sub-dependencies of unstructured) are installed automatically as transitive dependencies.
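
To see which of these packages actually ended up in the Colab environment, a small check along the following lines may help. It is a sketch only: the helper name `installed_versions` is hypothetical, and the package names are taken from the note above on the assumption that they match the PyPI distribution names.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_versions(packages: list[str]) -> dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is missing."""
    found: dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names assumed from the dependency note; adjust to the notebook's install cell.
    names = ["langchain", "transformers", "tokenizers", "sentencepiece", "unstructured"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this after the install cell shows both the primary and the transitive dependencies with their resolved versions.
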