AMFOrA: A digital toolkit for archaeological ceramic analysis


Submitting Author: Name (@aleciaco)
Package Name: AMFOrA_public
One-Line Description of Package: I have been developing an open-source toolkit since 2018 that lets field surveyors extract macroscopic data from archaeological ceramic sherds. I hope to publish in JOSS, but its updated submission terms require the GitHub repository to have been active for more than six months, and I hope that a pyOpenSci review will allow the submission to be fast-tracked.
Repository Link (if existing): https://github.com/aleciaco/AMFOrA_public
EiC: TBD

---

## Code of Conduct & Commitment to Maintain Package

- [X] I agree to abide by [pyOpenSci's Code of Conduct][PyOpenSciCodeOfConduct] during the review process and in maintaining my package afterward, should it be accepted.
- [X] I have read and will commit to package maintenance after the review as per the [pyOpenSci Policies Guidelines][Commitment].

## Description

- AMFOrA is an open-source Python toolkit that turns flatbed scans of ceramic cross-sections into quantitative archaeological data through a fully automated batch-processing pipeline. A single call to its full_analysis() function extracts up to 92 variables per sherd, covering: sherd geometry; dual-method (blob and contour) inclusion and void detection; size distributions; per-inclusion angularity and morphological classification (deliberate temper, natural inclusion, or weathered); sherd-corrected fabric orientation with circular statistics for inferring forming technique; paste and inclusion color in CIE L\*a\*b\* space; and three-zone core-periphery firing atmosphere profiles. Beyond the high-level pipeline, individual functions can be called independently with full parameter customization. A built-in CeramicStatisticalAnalyzer class provides PCA, hierarchical/k-means/DBSCAN clustering, and correlation analysis, and the companion CeramicVisualization class generates Plotly-based dendrograms, biplots, scree plots, heatmaps, and dashboards. All of this runs in under three seconds per sherd on a sub-$400 hardware setup.
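
To illustrate what measuring color in CIE L\*a\*b\* space involves, here is a minimal standard-library sketch that converts 8-bit sRGB pixels to L\*a\*b\* under a D65 reference white and averages them. The function names `srgb_to_lab` and `mean_lab` are hypothetical and are not part of AMFOrA's API; the toolkit's own conversion may differ (for example, it may rely on an image-processing library).

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB values to CIE L*a*b* (D65 reference white).

    Hypothetical helper for illustration; not AMFOrA's implementation.
    """

    def linearize(channel):
        # Undo the sRGB transfer curve to get linear-light values in [0, 1].
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    def f(t):
        # Piecewise cube-root function from the CIELAB definition.
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # Normalize by the D65 white point, then map to L*, a*, b*.
    xn, yn, zn = 0.95047, 1.0, 1.08883
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def mean_lab(pixels):
    """Average L*a*b* over an iterable of (r, g, b) pixels, e.g. a paste region."""
    labs = [srgb_to_lab(*p) for p in pixels]
    if not labs:
        raise ValueError("need at least one pixel")
    return tuple(sum(lab[i] for lab in labs) / len(labs) for i in range(3))
```

Averaging in L\*a\*b\* rather than RGB keeps lightness (L\*) separate from the two chromatic axes, which is why the space is convenient for comparing a darker core against a lighter periphery.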

## Community Partnerships
We partner with communities to support peer review with an additional layer of
checks that satisfy community requirements. If your package fits into an
existing community please check below:

- [ ] Astropy: [My package adheres to Astropy community standards](https://www.pyopensci.org/software-peer-review/partners/astropy.html)
- [ ] Pangeo: My package adheres to the [Pangeo standards listed in the pyOpenSci peer review guidebook][PangeoCollaboration]

## Scope

- Please indicate which [category or categories][PackageCategories] this package falls under:

	- [ ] Data retrieval
	- [X] Data extraction
	- [X] Data processing/munging
	- [ ] Data deposition
	- [X] Data validation and testing
	- [X] Data visualization
	- [ ] Workflow automation
	- [ ] Citation management and bibliometrics
	- [ ] Scientific software wrappers
	- [ ] Database interoperability

## Domain Specific

- [ ] Geospatial
- [ ] Education

---

- Explain how and why the package falls under these categories (briefly, 1-2 sentences). For community partnerships, check also their specific guidelines as documented in the links above. Please note any areas you are unsure of:
- Data extraction: AMFOrA pulls quantitative measurements out of raw flatbed scans by applying computer vision algorithms (foreground segmentation, blob detection, contour detection, distance transforms, and k-means color clustering) to isolate sherds from their backgrounds and identify inclusions, voids, color zones, and orientations as structured numerical features.
- Data processing (munging): The toolkit converts raw detections into analysis-ready variables by correcting inclusion angles for arbitrary scan rotation, transforming circular orientation data into PCA-compatible linear metrics via sine/cosine projections and von Mises statistics, and aggregating per-feature measurements into 92 sherd-level columns exported as a tidy pandas DataFrame/CSV.
- Data validation/testing: AMFOrA validates its own defaults through systematic parameter sweeps that produce cumulative detection curves across 16 diverse fabrics, and validates its output through dual-method convergence checks (blob vs. contour correlations) and hierarchical cluster analysis benchmarked against expert archaeological judgment using cophenetic correlation and silhouette scores.
- Data visualization: Through its CeramicVisualization class, AMFOrA generates interactive Plotly-based outputs including PCA biplots and 3D scatter plots, scree plots, dendrograms, correlation heatmaps, cluster comparison box plots, and an archaeological summary dashboard, alongside annotated images of detection results for individual sherds.
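
The orientation handling described under data processing can be sketched as follows. This is a minimal standard-library illustration, assuming inclusion orientations are axes measured in degrees (so 0 and 180 denote the same direction) and that the sherd's own rotation is known. The names `axial_orientation_stats` and `estimate_kappa` are hypothetical and are not AMFOrA's API. The concentration estimate uses the common piecewise approximation for the von Mises parameter and applies to the doubled angles.

```python
import math


def estimate_kappa(r):
    """Approximate von Mises concentration from mean resultant length r.

    Standard piecewise approximation; hypothetical helper for illustration.
    """
    if r >= 1.0:
        return math.inf  # all angles identical: infinitely concentrated
    if r < 0.53:
        return 2.0 * r + r ** 3 + 5.0 * r ** 5 / 6.0
    if r < 0.85:
        return -0.4 + 1.39 * r + 0.43 / (1.0 - r)
    return 1.0 / (r ** 3 - 4.0 * r ** 2 + 3.0 * r)


def axial_orientation_stats(angles_deg, sherd_angle_deg=0.0):
    """Summarize inclusion orientations relative to the sherd's long axis."""
    if not angles_deg:
        raise ValueError("need at least one angle")

    # Remove the arbitrary scan rotation, then double each angle so that
    # axes 180 degrees apart land on the same point of the circle.
    doubled = [
        math.radians(2.0 * ((a - sherd_angle_deg) % 180.0)) for a in angles_deg
    ]

    # Mean cosine and sine are linear features suitable for PCA.
    c = sum(math.cos(t) for t in doubled) / len(doubled)
    s = sum(math.sin(t) for t in doubled) / len(doubled)

    # Resultant length: near 0 for random fabric, near 1 for strong alignment.
    r = math.hypot(c, s)

    # Halve the mean direction to return to the 0-180 degree axial range.
    mean_axis = (math.degrees(math.atan2(s, c)) / 2.0) % 180.0

    return {
        "cos2": c,
        "sin2": s,
        "resultant_length": r,
        "mean_axis_deg": mean_axis,
        "kappa": estimate_kappa(r),
    }


# Example: inclusions measured on a scan where the sherd is rotated 30 degrees.
stats = axial_orientation_stats([40.0, 50.0, 45.0, 38.0], sherd_angle_deg=30.0)
```

Feeding the mean cosine and sine into PCA, rather than raw angles, avoids the artificial jump between 179 and 0 degrees that would otherwise make nearly parallel inclusions look far apart.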

- Who is the target audience and what are the scientific applications of this package?
- The target audience is archaeologists working on surveys, where large numbers of non-diagnostic fragments are underused relative to the data they could provide: they are often only counted and weighed when they could support much more analysis.

- Are there other Python packages that accomplish similar things? If so, how does yours differ?
- Not to my knowledge.

- Any other questions or issues we should be aware of:


**P.S.** Have feedback/comments about our review process? Leave a comment [on our GitHub Discussions][Comments]


[PackageCategories]: https://www.pyopensci.org/software-peer-review/about/package-scope.html

[Conduct]: https://www.pyopensci.org/handbook/CODE_OF_CONDUCT.html

[Commitment]: https://www.pyopensci.org/software-peer-review/our-process/policies.html#after-acceptance-package-ownership-and-maintenance

[Comments]: https://github.com/orgs/pyOpenSci/discussions

[PangeoCollaboration]: https://www.pyopensci.org/software-peer-review/partners/pangeo

[pangeoWebsite]: https://www.pangeo.io

[PyOpenSciCodeOfConduct]: https://www.pyopensci.org/handbook/CODE_OF_CONDUCT.html

