Commit ce28683

Add GitHub Pages publishing workflow as alternative to Cloudflare R2
Adds a new workflow (github-pages-workflow.yml) that publishes the HTML export to GitHub Pages instead of Cloudflare R2. This provides a simpler alternative for users who don't have Cloudflare R2 credentials.

The workflow:
- Generates HTML in ELI structure (same as R2 workflow)
- Generates index pages (index.html and latest.html)
- Deploys to GitHub Pages using official actions
- Adds .nojekyll to prevent interference with ELI URLs

Updated documentation in README.md, README_EN.md, and DEVELOPMENT.md.
1 parent deb6eb5 commit ce28683

4 files changed

Lines changed: 151 additions & 0 deletions

github-pages-workflow.yml

Lines changed: 136 additions & 0 deletions
@@ -0,0 +1,136 @@
+name: Export to HTML format and publish to GitHub Pages
+
+on:
+  workflow_dispatch: # Allows manual runs
+    inputs:
+      source_ref:
+        description: 'Git ref to build from'
+        required: false
+        default: 'main'
+      filter:
+        description: 'Filter files by year (YYYY) or specific designation (YYYY:NNN). Comma-separated list.'
+        required: false
+        type: string
+  workflow_call: # Allows calls from other workflows
+    inputs:
+      source_ref:
+        required: false
+        type: string
+        default: 'main'
+      filter:
+        required: false
+        type: string
+
+# Permissions for GitHub Pages deployment
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+# Allow only one deployment at a time
+concurrency:
+  group: "pages"
+  cancel-in-progress: false
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    environment: Test
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          ref: ${{ inputs.source_ref || 'main' }}
+
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.11'
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+
+      - name: Get JSON source files (from git or R2)
+        run: |
+          # Try to use JSON files from git first
+          if [ -d "data/sfs_json" ] && [ -n "$(ls -A data/sfs_json 2>/dev/null)" ]; then
+            echo "✅ Found $(find data/sfs_json -name '*.json' | wc -l) JSON files in git"
+            echo "Using JSON files from git checkout"
+          else
+            echo "⚠️ No JSON files in git, downloading from Cloudflare R2..."
+
+            # Configure AWS CLI for R2
+            aws configure set aws_access_key_id ${{ secrets.CLOUDFLARE_R2_ACCESS_KEY_ID }}
+            aws configure set aws_secret_access_key ${{ secrets.CLOUDFLARE_R2_SECRET_ACCESS_KEY }}
+            aws configure set region us-east-1
+            aws configure set output json
+
+            # Download all JSON files from R2
+            mkdir -p data/sfs_json
+            aws s3 sync s3://${{ secrets.CLOUDFLARE_R2_BUCKET_NAME }}/sfs_json/ data/sfs_json/ \
+              --endpoint-url https://${{ secrets.CLOUDFLARE_R2_ACCOUNT_ID }}.r2.cloudflarestorage.com \
+              --exclude "*" \
+              --include "*.json"
+
+            # Verify download
+            if [ ! -d "data/sfs_json" ] || [ -z "$(ls -A data/sfs_json)" ]; then
+              echo "::error::Failed to download JSON files from R2"
+              exit 1
+            fi
+            echo "✅ Downloaded $(find data/sfs_json -name '*.json' | wc -l) JSON files from R2"
+          fi
+        env:
+          AWS_DEFAULT_REGION: us-east-1
+
+      - name: Generate HTML export
+        run: |
+          # Create the output directory for GitHub Pages
+          mkdir -p _site
+
+          if [ -n "${{ inputs.filter }}" ]; then
+            python sfs_processor.py --input data/sfs_json --output _site --formats html --filter "${{ inputs.filter }}"
+          else
+            python sfs_processor.py --input data/sfs_json --output _site --formats html
+          fi
+        env:
+          PYTHONPATH: ${{ github.workspace }}
+
+      - name: Generate index pages for HTML export
+        run: |
+          python exporters/html/populate_index_pages.py --input data/sfs_json --output _site/index.html --limit 30
+          python exporters/html/populate_index_pages.py --input data/sfs_json --output _site/latest.html --limit 10
+        env:
+          PYTHONPATH: ${{ github.workspace }}
+
+      - name: Add .nojekyll file
+        run: |
+          # Prevent Jekyll processing that could interfere with ELI URLs
+          touch _site/.nojekyll
+
+      - name: Create deployment summary
+        run: |
+          echo "HTML export completed at $(date)" > _site/last-update.txt
+          echo "Published to GitHub Pages" >> _site/last-update.txt
+          echo "Index pages: index.html (30 latest), latest.html (10 latest)" >> _site/last-update.txt
+
+      - name: Setup Pages
+        uses: actions/configure-pages@v5
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: '_site'
+
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
+    needs: build
+    steps:
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
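
The `filter` input above is described as a comma-separated list of years (YYYY) or specific designations (YYYY:NNN). This commit does not show how `sfs_processor.py` interprets that string, so the Python below is only a minimal sketch of one plausible reading. `FilterTerm`, `parse_filter`, and `matches` are hypothetical names used for illustration, not functions from the repository, and the assumption that the number part is purely numeric may not hold for every designation.

```python
import re
from typing import NamedTuple, Optional


class FilterTerm(NamedTuple):
    """One entry of the filter list (hypothetical structure)."""
    year: int
    number: Optional[str]  # None means "every document from that year"


# Assumed shape: four-digit year, optionally followed by ":" and a number.
_TERM = re.compile(r"(\d{4})(?::(\d+))?")


def parse_filter(raw: str) -> list[FilterTerm]:
    """Split a comma-separated filter string into FilterTerm entries."""
    terms = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty input and trailing commas
        match = _TERM.fullmatch(part)
        if match is None:
            raise ValueError(f"invalid filter term: {part!r}")
        terms.append(FilterTerm(int(match.group(1)), match.group(2)))
    return terms


def matches(designation: str, terms: list[FilterTerm]) -> bool:
    """Return True if a designation such as '2024:123' passes the filter."""
    if not terms:
        return True  # an empty filter selects everything
    year_text, _, number = designation.partition(":")
    year = int(year_text)
    return any(
        term.year == year and (term.number is None or term.number == number)
        for term in terms
    )
```

With this reading, `parse_filter("2024, 2023:456")` yields one whole-year term and one single-document term, and an empty filter behaves like the workflow's unfiltered branch by selecting every document.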

DEVELOPMENT.md

Lines changed: 7 additions & 0 deletions
@@ -344,6 +344,13 @@ python -m pytest test/ --cov=. --cov-report=html
 - Uploads to Cloudflare R2
 - Requires R2 credentials in GitHub Secrets
 
+**`github-pages-workflow.yml`**:
+- Generates HTML from Markdown
+- Publishes to GitHub Pages instead of Cloudflare R2
+- Requires GitHub Pages to be enabled in the repository settings
+- Uses `actions/deploy-pages` for deployment
+- Alternative to the R2 upload for those who do not have a Cloudflare account
+
 **`upcoming-changes-workflow.yml`**:
 - Processes upcoming changes
 - Temporal preview
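
When testing `github-pages-workflow.yml` locally, it can help to confirm that a build produced the files the workflow writes into `_site` before upload: `index.html`, `latest.html`, `.nojekyll`, and `last-update.txt`. The Python below is a minimal sketch of such a check. `missing_site_files` and `EXPECTED_FILES` are hypothetical names and are not part of the repository.

```python
from pathlib import Path

# Files the build job writes into _site, taken from the workflow steps.
EXPECTED_FILES = ("index.html", "latest.html", ".nojekyll", "last-update.txt")


def missing_site_files(site_dir: str, expected=EXPECTED_FILES) -> list[str]:
    """Return the expected file names that are absent from site_dir."""
    root = Path(site_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list from `missing_site_files("_site")` means all four files are present. A non-empty list names the outputs that a step failed to produce.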

README.md

Lines changed: 4 additions & 0 deletions
@@ -46,6 +46,10 @@ The tool can generate legislation in several different formats, depending on use
 - **`html`**: Generates HTML files in ELI structure (`/eli/sfs/{år}/{nummer}/index.html`) for web publishing
 - **`htmldiff`**: Like HTML, but also includes separate versions for each amending statute
 
+HTML files can be published via:
+- **Cloudflare R2**: Using `html-export-workflow.yml` (requires R2 credentials)
+- **GitHub Pages**: Using `github-pages-workflow.yml` (simpler setup, requires GitHub Pages to be enabled)
+
 ### Vector format (for semantic search)
 
 - **`vector`**: Converts legislation to vector embeddings for semantic search and RAG applications. Uses OpenAI's text-embedding-3-large model (3072 dimensions) and supports storage in PostgreSQL (pgvector), Elasticsearch, or a JSON file.

README_EN.md

Lines changed: 4 additions & 0 deletions
@@ -46,6 +46,10 @@ The tool can generate legislation in several different formats, depending on use
 - **`html`**: Generates HTML files in ELI structure (`/eli/sfs/{year}/{number}/index.html`) for web publishing
 - **`htmldiff`**: Like HTML but also includes separate versions for each amending law
 
+HTML files can be published via:
+- **Cloudflare R2**: Using `html-export-workflow.yml` (requires R2 credentials)
+- **GitHub Pages**: Using `github-pages-workflow.yml` (simpler setup, requires GitHub Pages enabled)
+
 ### Vector Format (for semantic search)
 
 - **`vector`**: Converts legislation to vector embeddings for semantic search and RAG applications. Uses OpenAI's text-embedding-3-large model (3072 dimensions) and supports storage in PostgreSQL (pgvector), Elasticsearch, or JSON file.
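
The `html` format in the hunk above places each document at `/eli/sfs/{year}/{number}/index.html`. As a rough illustration of that layout, the Python sketch below maps a designation of the form `YYYY:NNN` to such a path. `eli_path` is a hypothetical helper, and the assumption that the path is derived directly from the designation is not confirmed by this commit.

```python
def eli_path(designation: str) -> str:
    """Map a designation like '2024:123' to its ELI-style index path."""
    year, sep, number = designation.partition(":")
    # Require a four-digit year, a colon, and a non-empty number part.
    if not sep or not (year.isdigit() and len(year) == 4) or not number:
        raise ValueError(f"expected 'YYYY:NNN', got {designation!r}")
    return f"/eli/sfs/{year}/{number}/index.html"
```

For example, `eli_path("2024:123")` returns `/eli/sfs/2024/123/index.html`. Because every document is served as an `index.html` inside its own directory, the shorter URL `/eli/sfs/2024/123/` resolves to the same page on a static host such as GitHub Pages.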
