Closed
26 changes: 17 additions & 9 deletions .github/workflows/ci.yml
@@ -11,7 +11,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.9"]
python-version: ["3.10", "3.12", "3.14"]

steps:
- uses: actions/checkout@v5
@@ -24,15 +24,23 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
python -m pip install -e .[all]

- name: Run tests
run: pytest -ra --cov=tika
run: pytest -ra --cov

- name: Upload coverage to Coveralls
if: success()
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
coveralls
- name: Coveralls parallel
uses: coverallsapp/github-action@v2
with:
flag-name: coverage-python-${{ matrix.python-version }}
parallel: true

finish-coverage:
needs: test
if: ${{ always() }}
runs-on: ubuntu-slim
steps:
- name: Coveralls finished
uses: coverallsapp/github-action@v2
with:
parallel-finished: true
14 changes: 9 additions & 5 deletions .github/workflows/documentation.yml
@@ -4,7 +4,7 @@ name: Generate and deploy documentation
on:
# Runs on pushes targeting the default branch
push:
branches: ["main", "master", "add-automated-documentation"]
branches: ["master"]

# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
@@ -29,11 +29,15 @@ jobs:
url: ${{ steps.deployment.outputs.page_url }}
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
with:
python-version: "3.14"
- run: python -m pip install --upgrade pip
- run: pip --version
- name: Install dependencies
run: |
pip install sphinx furo myst-parser
pip install . --group=docs
- name: Sphinx APIDoc
run: |
sphinx-apidoc -f -o docs/source/ .
@@ -43,7 +47,7 @@ jobs:
- name: Setup Pages
uses: actions/configure-pages@v5
- name: Upload artifact
uses: actions/upload-pages-artifact@v3
uses: actions/upload-pages-artifact@v4
with:
# Upload entire repository
path: './docs/build/html'
1 change: 1 addition & 0 deletions .gitignore
@@ -12,3 +12,4 @@
.idea
__pycache__/
.coverage
docs/build
1 change: 0 additions & 1 deletion MANIFEST.IN

This file was deleted.

7 changes: 1 addition & 6 deletions README.md
@@ -8,7 +8,7 @@ library that makes Tika available using the
[Tika REST Server](https://cwiki.apache.org/confluence/display/TIKA/TikaServer).

This makes Apache Tika available as a Python library,
installable via Setuptools, Pip and Easy Install.
installable via pip.

To use this library, you need to have Java 11+ installed on your
system as tika-python starts up the Tika REST server in the
@@ -20,11 +20,6 @@ Installation (with pip)
-----------------------
1. `pip install tika`

Installation (without pip)
--------------------------
1. `python setup.py build`
2. `python setup.py install`

Airgap Environment Setup
------------------------
To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found [here](https://repo1.maven.org/maven2/org/apache/tika/tika-server-standard/)) and set the TIKA_SERVER_JAR environment variable to TIKA_SERVER_JAR="file:///<yourpath>/tika-server-standard.jar" which successfully tells `python-tika` to "download" this file and move it to `/tmp/tika-server-standard.jar` and run as background process.
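
A minimal sketch of the airgap setup described in this hunk, assuming the jar and its `.md5` file have already been copied to a local directory. The directory `/opt/tika` and the helper `jar_file_url` are hypothetical; only the `TIKA_SERVER_JAR` variable and the `file:///` form come from the README text. Setting the variable before importing the library is an assumption about when it is read.

```python
import os
from pathlib import Path


def jar_file_url(jar_path: str) -> str:
    """Return a file:// URL for a local jar (hypothetical helper)."""
    # as_uri() requires an absolute path, so resolve it first.
    return Path(jar_path).resolve().as_uri()


# Hypothetical location of the downloaded server jar.
os.environ["TIKA_SERVER_JAR"] = jar_file_url("/opt/tika/tika-server-standard.jar")

# Assumption: the variable is set before the library is imported, e.g.
#   from tika import parser
#   parsed = parser.from_file("document.pdf")
```

Building the URL with `Path.as_uri()` avoids hand-assembling the `file:///` prefix, which is easy to get wrong on Windows paths.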
16 changes: 4 additions & 12 deletions docs/source/conf.py
@@ -5,14 +5,9 @@

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
import os
import sys

# Add the parent directory of the documentation root to sys.path
sys.path.insert(0, os.path.abspath("../.."))

project = 'tika-python'
copyright = '2024, Chris A. Mattmann'
copyright = '2026, Chris A. Mattmann'
author = 'Chris A. Mattmann'

# -- General configuration ---------------------------------------------------
@@ -26,16 +21,13 @@
"sphinx.ext.autosectionlabel",
"sphinx.ext.todo",
"sphinx.ext.duration",
"myst_parser"
"myst_parser",
]

templates_path = ['_templates']
exclude_patterns = ['tika.tests*']


exclude_patterns = ['_build']

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'furo'
html_static_path = ['_static']

15 changes: 15 additions & 0 deletions docs/source/index.md
@@ -0,0 +1,15 @@
# Welcome to tika-python's documentation!

```{toctree}
:maxdepth: 7
:caption: Contents
readme.md
tika.md
```

## Indices and tables

- {ref}`genindex`
- {ref}`modindex`
- {ref}`search`

21 changes: 0 additions & 21 deletions docs/source/index.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/source/setup.rst

This file was deleted.

77 changes: 0 additions & 77 deletions docs/source/tika.tests.rst

This file was deleted.

64 changes: 64 additions & 0 deletions pyproject.toml
@@ -0,0 +1,64 @@
[build-system]
build-backend = "setuptools.build_meta"
requires = [ "setuptools" ]

[project]
name = "tika"
description = "Apache Tika Python library"
readme = "README.md"
keywords = [ "tika", "digital", "babel fish", "apache" ]
license = "Apache-2.0"
authors = [ { name = "Chris Mattmann", email = "chris.a.mattmann@jpl.nasa.gov" } ]
requires-python = ">=3.10"
classifiers = [
"Development Status :: 3 - Alpha",
"Environment :: Console",
"Intended Audience :: Developers",
"Intended Audience :: Information Technology",
"Intended Audience :: Science/Research",
"Operating System :: OS Independent",
"Programming Language :: Python",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3.14",
"Topic :: Database :: Front-Ends",
"Topic :: Scientific/Engineering",
"Topic :: Software Development :: Libraries :: Python Modules",
]
dynamic = [ "version" ]
dependencies = [
"beautifulsoup4==4.13.3",
"requests",
]

[dependency-groups]
test = [
"memory-profiler",
"pytest-benchmark",
"pytest-cov",
]
docs = [
"furo",
"myst-parser",
"sphinx",
]

[project.urls]
homepage = "http://github.com/chrismattmann/tika-python"
repository = "http://github.com/chrismattmann/tika-python.git"

[project.scripts]
tika-python = "tika.tika:main"

[tool.setuptools]
packages.find.include = [ "tika*" ]

[tool.setuptools.dynamic]
version = {attr = "tika.__version__"}

[tool.coverage.run]
source = [ "tika" ]
branch = true
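
The `[tool.setuptools.dynamic]` entry in this file asks setuptools to take the package version from the `__version__` attribute of `tika`. A rough sketch of that kind of lookup, using a stand-in module so it runs without the package installed. The helper `resolve_attr` and the version string `0.0.0` are hypothetical, and setuptools' own implementation differs (it may read the value without importing the module).

```python
import importlib
import sys
import types


def resolve_attr(spec: str):
    """Resolve a 'package.attribute' spec to the attribute's value."""
    module_name, _, attr_name = spec.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in for the real package so the sketch is self-contained.
# Note: this shadows any installed `tika` for the rest of the process.
fake_tika = types.ModuleType("tika")
fake_tika.__version__ = "0.0.0"  # placeholder, not the real version
sys.modules["tika"] = fake_tika

version = resolve_attr("tika.__version__")
print(version)
```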
7 changes: 0 additions & 7 deletions requirements.txt

This file was deleted.
