Instructions for AI agents

PyThaiNLP specific

Project contribution guidelines

General language use

Naming conventions

Follow standard naming conventions for the programming language and framework you are using.
Use only ASCII letters, digits, hyphen (-), and underscore (_) in names.
For URLs/IRIs, use lowercase letters and hyphens to separate words (e.g., my-api-endpoint) and follow W3C Cool URIs for the Semantic Web: https://www.w3.org/TR/cooluris/
Consult Schema.org vocabularies when deciding on names.
Consult "Style Guidelines for Naming and Labeling Ontologies in the Multilingual Web" https://www.researchgate.net/publication/277224472
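
As a rough illustration of the ASCII-only rule and the lowercase-with-hyphens rule for URLs above, the following sketch checks names with regular expressions. The helper names `is_valid_name` and `is_valid_url_slug` are hypothetical, and allowing digits in URL segments is an assumption that goes beyond the text.

```python
import re

# Hypothetical helpers for illustration only.
# Names: ASCII letters, digits, hyphen, and underscore.
_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")
# URL segments: lowercase words joined by single hyphens.
# Assumption: digits are also accepted inside words.
_SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_name(name: str) -> bool:
    """Return True if the name uses only the allowed ASCII characters."""
    return _NAME_RE.fullmatch(name) is not None


def is_valid_url_slug(slug: str) -> bool:
    """Return True if the slug is lowercase words separated by hyphens."""
    return _SLUG_RE.fullmatch(slug) is not None
```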

Tidy code and documentation

Ensure that the code is well-formatted and adheres to the style guidelines of the programming language you are using.
Use linters and formatters where applicable.
Use "sentence case" for headings and titles in documentation.
Write clear and concise comments and documentation for your code. For something obvious, avoid comments that just restate the code.
After making changes, review the code and documentation to ensure they are up to date, correct, consistent, and clear.
Make sure that all code comments, APIs, and documentation are consistent with the current state of the codebase.
Make sure that the examples in the documentation are runnable, up-to-date and reflect the current behavior of the code.

File header

When possible, put relevant SPDX file tags in the file header. See https://spdx.github.io/spdx-spec/v2.3/file-information/
- SPDX-FileContributor
- SPDX-FileCopyrightText
- Default SPDX-FileType for code is "SOURCE"
- Default SPDX-FileType for documentation is "DOCUMENTATION"
- Default SPDX-License-Identifier for code is "Apache-2.0"
- Default SPDX-License-Identifier for documentation is "CC0-1.0"
- Sort SPDX metadata.
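
The defaults above could be rendered into a sorted comment header along these lines. This is a minimal sketch: the function name `spdx_header` and the copyright text are placeholders, and sorting by tag name is one possible reading of "sort SPDX metadata".

```python
def spdx_header(tags: dict[str, str], prefix: str = "# ") -> str:
    """Render SPDX tags as comment lines, sorted by tag name."""
    return "\n".join(f"{prefix}{key}: {tags[key]}" for key in sorted(tags))


header = spdx_header(
    {
        "SPDX-License-Identifier": "Apache-2.0",  # default for code
        "SPDX-FileType": "SOURCE",  # default for code
        "SPDX-FileCopyrightText": "2025 Example Authors",  # placeholder value
    }
)
print(header)
```

For a documentation file, the same sketch would be called with "DOCUMENTATION" and "CC0-1.0" and a comment prefix suited to the file format.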

Shell scripts and command line

Mind the differences between GNU, BSD, macOS, and other implementations of common Unix tools.
Be defensive on variable expansion.
Use quotes or other constructs to encapsulate paths, and keep scripts compatible with different kinds of shells.
Be mindful about the semantics of different types of quotation marks.
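
When a script is driven from Python, the same defensive habits might look like the sketch below. It assumes a POSIX-style shell for the quoted command string (`shlex.quote` targets POSIX shells), and the example path is made up to include a space and a shell metacharacter.

```python
import shlex
import subprocess
import sys

# Made-up path containing a space and a shell metacharacter.
path = "my file; echo oops.txt"

# Preferred: pass arguments as a list so that no shell parses the path.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", path],
    capture_output=True,
    text=True,
    check=True,
)
printed = result.stdout.strip()

# When a shell command string is unavoidable, quote every argument.
# The "--" marks the end of options, which guards against paths starting with "-".
command = f"ls -l -- {shlex.quote(path)}"
```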

Library imports and dependencies

Check the correctness of library/module/package names. Be very careful of slopsquatting and typosquatting attacks.
Use the most recent version of the library that is supported by the OS/compiler/framework currently being used.
In source code, group and sort imports by the programming language convention (e.g., in Python, typically standard library first, then third-party libraries) and then alphabetically whenever possible. Be careful with dependencies that require a specific import order, as reordering may break the code or create cyclic import issues.
Remove unused imports.
In build metadata (like pyproject.toml in Python) or dependency list (like requirements.txt in Python), sort dependencies.
Warn users about abandoned dependencies with no maintenance for a long time and suggest equivalent drop-in replacements.
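
Sorting a dependency list can be automated. The sketch below is one possible approach for plain requirement lines; `sort_requirements` is a hypothetical helper, and it does not handle comments, options, or continuation lines that a real requirements.txt may contain.

```python
def sort_requirements(lines: list[str]) -> list[str]:
    """Sort requirement lines case-insensitively, dropping blanks and duplicates."""
    unique = {line.strip() for line in lines if line.strip()}
    # The second key element keeps the order deterministic for names
    # that differ only by letter case.
    return sorted(unique, key=lambda item: (item.lower(), item))
```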

Security

API

The overall architecture, code, and API endpoints should follow the latest version of OpenAPI specification at https://spec.openapis.org/oas/
API endpoints must use proper HTTP return codes.
Follow web best practices as recommended by OpenAPI, IETF, W3C, etc.
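
To illustrate "proper HTTP return codes", a handler might map outcomes to status codes from the standard library's `http.HTTPStatus` enum rather than hard-coding numbers. The outcome labels and the `status_for` helper are hypothetical; the right mapping depends on the API being built.

```python
from http import HTTPStatus

# Hypothetical outcome labels mapped to standard status codes.
_STATUS_BY_OUTCOME = {
    "ok": HTTPStatus.OK,
    "created": HTTPStatus.CREATED,
    "invalid_input": HTTPStatus.BAD_REQUEST,
    "not_found": HTTPStatus.NOT_FOUND,
}


def status_for(outcome: str) -> HTTPStatus:
    """Return the status code for an outcome, defaulting to a server error."""
    return _STATUS_BY_OUTCOME.get(outcome, HTTPStatus.INTERNAL_SERVER_ERROR)
```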

Git

Follow these guidelines for writing a good commit message:
- How to Write a Git Commit Message https://chris.beams.io/posts/git-commit/
- Commit Verbs 101: why I like to use this and why you should also like it. https://chris.beams.io/posts/git-commit/

Python

Python type completeness

The following are best practice recommendations for how to define “type complete”:

Type annotations can be omitted in a few specific cases where the type is obvious from the context:

- Constants that are assigned simple literal values (e.g. RED = '#F00' or MAX_TIMEOUT = 50 or room_temperature: Final = 20). A constant is a symbol that is assigned only once and is either annotated with Final or is named in all-caps. A constant that is not assigned a simple literal value requires explicit annotations, preferably with a Final annotation (e.g. WOODWINDS: Final[list[str]] = ['Oboe', 'Bassoon']).
- Enum values within an Enum class do not require annotations because they take on the type of the Enum class.
- Type aliases do not require annotations. A type alias is a symbol that is defined at a module level with a single assignment where the assigned value is an instantiable type, as opposed to a class instance (e.g. Foo = Callable[[Literal["a", "b"]], int | str] or Bar = MyGenericClass[int] | None).
- The “self” parameter in an instance method and the “cls” parameter in a class method do not require an explicit annotation.
- The return type for an __init__ method does not need to be specified, since it is always None.
- The following module-level symbols do not require type annotations: __all__, __author__, __copyright__, __email__, __license__, __title__, __uri__, __version__.
- The following class-level symbols do not require type annotations: __class__, __dict__, __doc__, __module__, __slots__.

JSON

When serializing to JSON, always enclose decimal values (for example, xs:decimal) in quotes to guarantee correct type interpretation and preserve precision.
Make sure JSON is valid and well-formatted.
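
A minimal sketch of the decimal rule, using only the standard library: `Decimal` values are written as JSON strings and parsed back without passing through a binary float. The `encode_decimal` helper and the payload are made up for illustration.

```python
import json
from decimal import Decimal


def encode_decimal(value: object) -> str:
    """Serialize Decimal values as strings; reject other unknown types."""
    if isinstance(value, Decimal):
        return str(value)
    raise TypeError(f"Cannot serialize {type(value).__name__} to JSON")


payload = {"item": "tea", "price": Decimal("19.90")}  # made-up example data
text = json.dumps(payload, default=encode_decimal, sort_keys=True)

# Reading back: the quoted value converts to Decimal with its digits intact.
restored = Decimal(json.loads(text)["price"])
```

Here the trailing zero in "19.90" survives the round trip, which would not be guaranteed if the value were written as a bare JSON number.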

Markdown

When including metadata in a Markdown file, put it as YAML between triple-dashed lines, as used by Hugo and Jekyll front matter.
Be strict on the Markdown formatting. Be mindful that what works on GitHub may not work on MkDocs, for example. Try to stick to standard Markdown.
Use Markdownlint to detect and fix malformed Markdown.

Diagram

When drawing a diagram in ASCII/text, recheck that all the lines are well aligned. Count the characters and adjust the spaces so the lines align well.
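
The character counting can be done mechanically. The sketch below, with a hypothetical helper `misaligned_lines`, reports lines whose length differs from the first line. It counts code points with `len`, so it is only reliable for ASCII diagrams; combining marks or wide characters would need a display-width calculation instead.

```python
def misaligned_lines(diagram: str) -> list[int]:
    """Return 1-based numbers of lines whose length differs from the first line."""
    lines = diagram.splitlines()
    if not lines:
        return []
    width = len(lines[0])
    return [
        number
        for number, line in enumerate(lines, start=1)
        if len(line) != width
    ]


good = "+-----+\n| box |\n+-----+"
bad = "+-----+\n| box  |\n+-----+"  # one extra space on line 2
```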

HTML

Make sure HTML is valid and well-formatted.
Make sure there is no trailing whitespace in the HTML file.
Be conscious about accessibility. Consider following W3C web accessibility recommendations when possible.
Use sensible and concise element IDs and names that aid code readability. Grouping related names also helps.

CSS

Make sure there are no unused styles.
Use sensible and concise element IDs and names that aid code readability. Grouping related names also helps.

Version

When suggesting dependencies, recheck the version: confirm that the version exists and that it is compatible with the system and other dependencies.
Prefer a Semantic Version when applicable.
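
A quick format check for Semantic Version strings might look like the sketch below. The pattern is a simplified approximation of the MAJOR.MINOR.PATCH form with optional pre-release and build parts, not the full grammar from the Semantic Versioning specification, and `looks_like_semver` is a hypothetical helper. It cannot tell whether a version actually exists; that still requires checking the package registry.

```python
import re

# Simplified pattern: numeric MAJOR.MINOR.PATCH without leading zeros,
# optionally followed by "-prerelease" and "+build" parts.
_SEMVER_RE = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-[0-9A-Za-z.-]+)?"
    r"(?:\+[0-9A-Za-z.-]+)?"
)


def looks_like_semver(version: str) -> bool:
    """Return True if the string has the general shape of a Semantic Version."""
    return _SEMVER_RE.fullmatch(version) is not None
```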
