# sample_documents.py
"""
Sample Documents for Search Engine
===================================
Pre-populated documents used to demonstrate search functionality.

Each entry in SAMPLE_DOCUMENTS is a dict with a "title" string and a "content" string.
"""
SAMPLE_DOCUMENTS = [
    {
        "title": "Introduction to Python Programming",
        "content": """
        Python is a high-level, interpreted programming language known for its simplicity and readability.
        It was created by Guido van Rossum and first released in 1991. Python supports multiple programming
        paradigms including procedural, object-oriented, and functional programming. It has a large standard
        library and an active community that contributes thousands of third-party packages.
        """
    },
    {
        "title": "Building Web Applications with FastAPI",
        "content": """
        FastAPI is a modern, high-performance web framework for building APIs with Python. It is built on
        Starlette and Pydantic and provides automatic API documentation, data validation based on type
        hints, and async support. Its documentation describes its performance as on par with Node.js and
        Go, which places it among the faster Python frameworks. It is well suited to building REST APIs,
        microservices, and modern web applications.
        """
    },
    {
        "title": "Understanding Search Engines and Information Retrieval",
        "content": """
        Search engines use indexing and ranking algorithms to find and order relevant documents. The core
        concepts include inverted indexes, which map words to the documents that contain them, and ranking
        algorithms like TF-IDF and BM25. TF-IDF (Term Frequency-Inverse Document Frequency) measures how
        important a word is to a document in a collection: the weight rises with the word's frequency in
        the document and falls with the number of documents that contain it. Modern search engines also
        use machine learning and vector embeddings for semantic search capabilities.
        """
    },
    {
        "title": "Database Design and SQL Fundamentals",
        "content": """
        Databases are essential for storing and retrieving data in applications. SQL (Structured Query Language)
        is the standard language for interacting with relational databases. Key concepts include tables, rows,
        columns, primary keys, foreign keys, and relationships. Understanding normalization, indexing, and
        query optimization is crucial for building efficient database systems.
        """
    },
    {
        "title": "Machine Learning Basics",
        "content": """
        Machine learning is a subset of artificial intelligence that enables computers to learn from data
        without being explicitly programmed. Common types include supervised learning (classification,
        regression), unsupervised learning (clustering), and reinforcement learning. Popular algorithms
        include linear regression, decision trees, neural networks, and support vector machines. Python
        libraries like scikit-learn, TensorFlow, and PyTorch make it easy to implement ML models.
        """
    },
    {
        "title": "RESTful API Design Principles",
        "content": """
        REST (Representational State Transfer) is an architectural style for designing web services. RESTful
        APIs use HTTP methods (GET, POST, PUT, DELETE) to perform operations on resources identified by URLs.
        Key principles include statelessness, a uniform interface, and resource-based URLs. Good API design
        includes proper status codes, error handling, versioning, and documentation. APIs should be intuitive
        and consistent, and should follow established conventions.
        """
    },
    {
        "title": "Version Control with Git",
        "content": """
        Git is a distributed version control system used by millions of developers worldwide. It allows
        tracking changes in code, collaborating with teams, and managing different versions of projects.
        Key concepts include repositories, commits, branches, merges, and remotes. Platforms like GitHub
        and GitLab provide hosting and collaboration features. Understanding Git is essential for modern
        software development workflows.
        """
    },
    {
        "title": "Docker and Containerization",
        "content": """
        Docker is a platform for developing, shipping, and running applications in containers. Containers
        package applications with their dependencies, ensuring consistency across different environments.
        A Dockerfile defines how an image is built, and a container is a running instance of an image.
        Containerization simplifies deployment, improves scalability, and supports microservice
        architectures. Docker Compose allows managing multi-container applications easily.
        """
    },
    {
        "title": "JavaScript and Modern Web Development",
        "content": """
        JavaScript is the programming language of the web, running in browsers and on servers via Node.js.
        Modern JavaScript includes ES6+ features like arrow functions, promises, async/await, and modules.
        Popular frameworks include React, Vue, and Angular for building interactive user interfaces.
        Understanding JavaScript fundamentals, DOM manipulation, and asynchronous programming is crucial
        for web development.
        """
    },
    {
        "title": "Cloud Computing and AWS Services",
        "content": """
        Cloud computing provides on-demand computing resources over the internet. Amazon Web Services (AWS)
        offers a wide range of services including EC2 for virtual servers, S3 for storage, RDS for databases,
        and Lambda for serverless computing. Understanding cloud architecture, scalability, and cost
        optimization is important for modern application deployment. Other major cloud providers include
        Google Cloud Platform and Microsoft Azure.
        """
    }
]
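

# --- Illustrative sketch (not part of the original module) -------------------
# The module docstring says these documents demonstrate search functionality,
# and one of them describes inverted indexes and TF-IDF. The code below is a
# minimal sketch of how a list shaped like SAMPLE_DOCUMENTS could be indexed
# and ranked that way. The names tokenize, build_inverted_index and
# tf_idf_search are hypothetical, and the smoothed IDF formula is one common
# variant chosen for illustration; the actual search engine may work differently.
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each term to {document position: term count in that document}."""
    index = defaultdict(dict)
    for doc_id, doc in enumerate(documents):
        counts = Counter(tokenize(doc["title"] + " " + doc["content"]))
        for term, count in counts.items():
            index[term][doc_id] = count
    return index


def tf_idf_search(query, documents, top_k=3):
    """Return up to top_k (title, score) pairs ranked by summed TF-IDF."""
    # The index is rebuilt on every call to keep the sketch short; a real
    # engine would build it once and reuse it.
    index = build_inverted_index(documents)
    total_docs = len(documents)
    scores = defaultdict(float)
    for term in set(tokenize(query)):
        postings = index.get(term, {})
        if not postings:
            continue
        # Smoothed inverse document frequency: rarer terms weigh more.
        idf = math.log((1 + total_docs) / (1 + len(postings))) + 1.0
        for doc_id, count in postings.items():
            scores[doc_id] += count * idf
    # Highest score first; ties fall back to the original document order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(documents[doc_id]["title"], score) for doc_id, score in ranked[:top_k]]


# Example usage against the list defined above:
#     for title, score in tf_idf_search("docker containers", SAMPLE_DOCUMENTS):
#         print(f"{score:.2f}  {title}")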