📄 PDF Text Processor

This Python script extracts the text content of a PDF file, removes any strings you specify, adds two newlines before each line that starts with "Frage", and saves the result to a text file.

🛠️ Usage

1. Set the path_to_pdf variable to the path of your input PDF file.
2. Set the original_txt variable to the desired output text file name.
3. (Optional) Add strings to the strings_to_delete list if you want to remove specific content from the text.
4. Run the script (pdf_reader.py).
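
As a rough illustration of the settings described above, the top of the script might look like the sketch below. Only the three variable names come from this README; the values are made-up placeholders that you would replace with your own.

```python
# Hypothetical example values; only the variable names are taken from the README.
path_to_pdf = "input.pdf"      # path of the PDF file to read
original_txt = "output.txt"    # name of the text file the script writes

# Optional: every occurrence of these strings is removed from the extracted text.
# Leave the list empty to keep the text unchanged.
strings_to_delete = ["Confidential", "Page header"]
```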

📂 The script opens the specified PDF file using pdfplumber.
📝 It extracts text from each page of the PDF and writes it to the original_txt file.
🔄 The script then reads the original_txt file, processes each line by:
- 🗑️ Removing specified strings (if any)
- ➕ Adding two newline characters before lines starting with "Frage" (German for "question")
💾 The processed text is written to a temporary file (temp.txt).
🔁 Finally, the temporary file replaces the original text file.
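
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of the script: the function names (write_pages_text, extract_pdf, clean_text_file) are hypothetical, as is the choice to end each page with a newline and to remove strings before checking for "Frage". It assumes pdfplumber is installed and uses only pdfplumber.open, pdf.pages, and page.extract_text().

```python
import os


def write_pages_text(pages, original_txt):
    """Write the text of each page to original_txt, one page after another."""
    with open(original_txt, "w", encoding="utf-8") as out:
        for page in pages:
            # extract_text() may return None or "" for pages without a text layer.
            text = page.extract_text() or ""
            # Assumption: each page's text is terminated with a newline.
            out.write(text + "\n")


def extract_pdf(path_to_pdf, original_txt):
    """Open the PDF with pdfplumber and dump its text to original_txt."""
    import pdfplumber  # third-party; imported here so the other helpers work without it

    with pdfplumber.open(path_to_pdf) as pdf:
        write_pages_text(pdf.pages, original_txt)


def clean_text_file(original_txt, strings_to_delete=(), temp_txt="temp.txt"):
    """Process original_txt line by line via a temporary file, then replace it."""
    with open(original_txt, "r", encoding="utf-8") as src, \
         open(temp_txt, "w", encoding="utf-8") as dst:
        for line in src:
            # Remove every unwanted string from the line.
            for unwanted in strings_to_delete:
                line = line.replace(unwanted, "")
            # Put two newline characters in front of lines starting with "Frage".
            if line.startswith("Frage"):
                line = "\n\n" + line
            dst.write(line)
    # The temporary file takes the place of the original text file.
    os.replace(temp_txt, original_txt)
```

A call sequence such as extract_pdf(path_to_pdf, original_txt) followed by clean_text_file(original_txt, strings_to_delete) would mirror the order described above. Writing to a temporary file first means the original text file is only overwritten once processing has finished.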

Happy PDF processing! 📚✨