`pxm gen-input` converts structural inputs between formats, for example mmCIF → AF3 / Protenix / Boltz / OpenFold3 or AF3 ↔ Protenix.

    pxm gen-input \
        -i INPUT_PATH \
        -o OUTPUT_PATH \
        -it cif|af3|protenix|boltz|openfold3 \
        -ot af3|protenix|boltz|openfold3 \
        [--seeds "0,1,2" | --num-seeds 5] \
        [--assembly-id 1] \
        [--num-cpu 8]

Supported input types:

- `cif`: mmCIF structure
- `af3`: AlphaFold3 JSON
- `protenix`: Protenix JSON
- `boltz`: Boltz YAML
- `openfold3`: OpenFold3 JSON

Supported output types: `af3`, `protenix`, `boltz`, `openfold3`

The tool works on single files or directories (flat directory only).
If you don't have a source file and want to build a model input from scratch, you can use the Interactive Mode.

    pxm gen-input -I
    # or
    pxm gen-input --interactive

- Step-by-step Guidance: The tool will walk you through selecting the output format, naming the job, and adding components.
- Load from Existing File: You can optionally initialize your complex by loading components and bonds from an existing file (`.cif`, `.json` for AF3/Protenix, or `.yaml` for Boltz).
- Component Management:
  - Add Polymer: Enter sequence strings (validated against standard alphabets) and add modifications at specific positions.
  - Add Ligand: Support for CCD codes, SMILES, and file paths (validated against model-specific limits).
  - Remove Component: Easily remove any added chain. All affected covalent bonds will be automatically cleaned up or re-indexed.
  - Covalent Bonds: Add bonds between any two atoms across chains with real-time range validation for Residue IDs.
- User-friendly Interface:
  - Numbered Menus: Quick selection using numbers (1, 2, 3...) instead of typing commands.
  - Smart Defaults: Press `Enter` to accept recommended values (marked with `*`).
  - Live Preview: See your complex grow as you add or modify components.

| Flag | Description |
|---|---|
| `-i`, `--input` | Input file or directory |
| `-o`, `--output` | Output file or directory |
| `-it`, `--input-type` | Input format |
| `-ot`, `--output-type` | Output format |

Input and output formats can be the same (e.g. for filtering/cleaning). Conversions are file-to-file or directory-to-directory only; mixing a file with a directory is not supported.
| Flag | Description |
|---|---|
| `-p`, `--pdb-ids` | Filter inputs by PDB IDs (comma-separated or file path) |
| `-rm`, `--remove-entity-types` | Remove specific entities (comma-separated: `ligand`, `ion`, `glycan`, `protein`, `dna`, `rna`, `covalent_ligand`) |
| `--keep_polymer_crosslinks` | Keep polymer-polymer crosslinks (e.g. disulfide bonds, cyclic peptides) in the bonds list |
| `--reassign-chain-id` | Reassign chain IDs, ignoring original ones from the input file. Default: use original IDs. |

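To make the effect of `-rm` and `--keep_polymer_crosslinks` concrete, here is a minimal sketch of the filtering they describe. It is not PXMeter's implementation: the `Entity` class, `parse_remove_types`, `filter_complex`, and the representation of bonds as chain-ID pairs are all hypothetical, and the sketch assumes that polymer-polymer bonds are dropped unless the keep flag is set.

```python
from dataclasses import dataclass

# Entity type names accepted by -rm / --remove-entity-types (from the table above).
SUPPORTED_TYPES = {"ligand", "ion", "glycan", "protein", "dna", "rna", "covalent_ligand"}
POLYMER_TYPES = {"protein", "dna", "rna"}


@dataclass(frozen=True)
class Entity:  # hypothetical stand-in, not a PXMeter class
    chain_id: str
    entity_type: str


def parse_remove_types(value: str) -> set[str]:
    """Parse a comma-separated -rm value and reject unknown type names."""
    names = {name.strip() for name in value.split(",") if name.strip()}
    unknown = names - SUPPORTED_TYPES
    if unknown:
        raise ValueError(f"unsupported entity types: {sorted(unknown)}")
    return names


def filter_complex(entities, bonds, remove_types=frozenset(), keep_polymer_crosslinks=False):
    """Drop entities of the removed types, then drop bonds that no longer apply.

    `bonds` is a list of (chain_id_a, chain_id_b) pairs (an assumed representation).
    """
    kept = [e for e in entities if e.entity_type not in remove_types]
    type_of = {e.chain_id: e.entity_type for e in kept}
    kept_bonds = []
    for a, b in bonds:
        if a not in type_of or b not in type_of:
            continue  # bond touches a removed chain
        both_polymers = type_of[a] in POLYMER_TYPES and type_of[b] in POLYMER_TYPES
        if both_polymers and not keep_polymer_crosslinks:
            continue  # assumed default: polymer-polymer crosslinks are dropped
        kept_bonds.append((a, b))
    return kept, kept_bonds
```

For example, `filter_complex(entities, bonds, parse_remove_types("ion,dna"))` removes every ion and DNA chain together with any bond that referenced them.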
For AlphaFold3 output, you must provide exactly one of:

- `--seeds "0,1,2"`: explicit list
- `--num-seeds N`: generates seeds `[0 … N-1]`

For Protenix output, seeds are optional. If not provided, an empty seed list will be used.
Boltz and OpenFold3 outputs do not use seeds.
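The seed rules above can be summarized in a few lines of Python. This is an illustrative sketch only: `resolve_seeds` is a hypothetical helper, not part of PXMeter, and rejecting `seeds` together with `num_seeds` for Protenix is an assumption carried over from the AlphaFold3 rule.

```python
def resolve_seeds(output_type, seeds=None, num_seeds=None):
    """Return the seed list implied by the documented rules, or None if seeds are unused."""
    if output_type not in {"af3", "protenix", "boltz", "openfold3"}:
        raise ValueError(f"unknown output type: {output_type!r}")
    if output_type in {"boltz", "openfold3"}:
        return None  # these formats do not use seeds
    if seeds is not None and num_seeds is not None:
        # Documented for af3; assumed here for protenix as well.
        raise ValueError("provide either seeds or num_seeds, not both")
    if seeds is not None:
        # Accept the CLI form "0,1,2" or an iterable of integers.
        if isinstance(seeds, str):
            return [int(s) for s in seeds.split(",")]
        return list(seeds)
    if num_seeds is not None:
        return list(range(num_seeds))  # seeds 0 .. N-1
    if output_type == "af3":
        raise ValueError("af3 output requires exactly one of seeds or num_seeds")
    return []  # protenix: empty seed list when none are given
```

With these rules, `--num-seeds 5` and `--seeds "0,1,2,3,4"` describe the same seed list.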
| Flag | Description |
|---|---|
| `--assembly-id` | Biological assembly ID to expand |
| `--num-cpu N` | Number of workers (Joblib). `-1` uses all available CPUs. |

Currently, OpenFold3 does not support explicit covalent bonds via JSON inputs. As a result, when generating an openfold3 target format:
- Any specified covalent bonds will be ignored.
- Any covalent ligands (ligands or glycans that have explicit bonds to a polymer chain) will be automatically filtered out to prevent misleading the model. Non-covalent, fully detached ligands will still be retained.
Additionally, OpenFold3 does not support multiple CCD codes in a single ligand chain. Entities containing more than one CCD code will be skipped and not included in the output JSON.
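As a rough illustration of these OpenFold3 restrictions, the sketch below keeps only ligands that are neither bonded to a polymer chain nor built from more than one CCD code. The `Ligand` class, `filter_for_openfold3`, and the bond representation are hypothetical; how PXMeter actually detects covalent ligands is not described here, so treat this as one possible reading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ligand:  # hypothetical stand-in, not a PXMeter class
    chain_id: str
    ccd_codes: tuple[str, ...]


def filter_for_openfold3(polymer_ids, ligands, bonds):
    """Return the ligands that survive the OpenFold3 restrictions described above.

    `bonds` is a list of (chain_id_a, chain_id_b) pairs (an assumed representation).
    The bonds themselves are never written to the output; they are only used
    to find ligands attached to a polymer chain.
    """
    polymer_ids = set(polymer_ids)
    bonded_to_polymer = set()
    for a, b in bonds:
        if a in polymer_ids and b not in polymer_ids:
            bonded_to_polymer.add(b)
        elif b in polymer_ids and a not in polymer_ids:
            bonded_to_polymer.add(a)

    kept = []
    for lig in ligands:
        if lig.chain_id in bonded_to_polymer:
            continue  # covalent ligand: filtered out
        if len(lig.ccd_codes) > 1:
            continue  # multiple CCD codes in one ligand chain: skipped
        kept.append(lig)
    return kept
```

A free ATP chain would pass through unchanged, while a glycan bonded to a protein chain or a two-residue ligand chain would be dropped.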
You can call the same logic from Python instead of the CLI.
The CLI `pxm gen-input` is a thin wrapper around `run_gen_input`:

    from pathlib import Path

    from pxmeter.input_builder.gen_input import run_gen_input

    run_gen_input(
        input_path=Path("./cifs"),
        output_path=Path("./af3_inputs"),
        input_type="cif",
        output_type="af3",
        seeds=None,  # for af3, use num_seeds OR seeds, not both
        num_seeds=5,
        assembly_id="1",
        num_cpu=8,
    )

Rules are the same as the CLI:

- `input_type` / `output_type` can be the same (e.g. for filtering/cleaning).
- For `output_type == "af3"`, you must provide either `seeds` or `num_seeds`.
- For `output_type` in `{"protenix", "boltz", "openfold3"}`, both `seeds` and `num_seeds` can be left as `None`.

Example: Protenix → Boltz (no seeds needed):

    from pathlib import Path

    from pxmeter.input_builder.gen_input import run_gen_input

    run_gen_input(
        input_path=Path("protenix.json"),
        output_path=Path("boltz.yaml"),
        input_type="protenix",
        output_type="boltz",
        # seeds / num_seeds not required for Boltz
    )

If you already have explicit file mappings, you can use the lower-level helpers:

    from pathlib import Path

    from pxmeter.input_builder.gen_input import gen_one, gen_batch

    # Single file
    gen_one(
        input_f=Path("structure.cif"),
        output_f=Path("af3.json"),
        input_type="cif",
        output_type="af3",
        seeds=[0, 1, 2],
        assembly_id="1",
    )

    # Batch (list of (input, output) pairs)
    pairs = [
        (Path("cifs/1abc.cif"), Path("af3/1abc.json")),
        (Path("cifs/2xyz.cif"), Path("af3/2xyz.json")),
    ]
    gen_batch(
        input_and_output_files=pairs,
        input_type="cif",
        output_type="af3",
        seeds=[0, 1, 2],
        assembly_id="1",
        num_cpu=8,
    )

These functions do not infer file lists or suffixes; they only perform the conversion.

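Because the lower-level helpers expect explicit pairs, you may want a small function that builds them from a flat directory. The following is a sketch using only the standard library; `build_pairs` and its suffix defaults are hypothetical and not part of PXMeter.

```python
from pathlib import Path


def build_pairs(input_dir, output_dir, in_suffix=".cif", out_suffix=".json"):
    """Pair each matching file directly inside input_dir with an output path."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pairs = []
    for src in sorted(input_dir.iterdir()):  # flat directory: no recursion
        if src.is_file() and src.suffix == in_suffix:
            # Keep the file stem and swap the suffix for the target format.
            pairs.append((src, output_dir / (src.stem + out_suffix)))
    return pairs
```

The returned list has the same shape as `pairs` in the batch example, so it could be passed as `input_and_output_files`. Whether the output directory must already exist is not covered here, so creating it beforehand is the cautious choice.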
Example: mmCIF directory → AF3 (5 seeds, assembly 1):

    pxm gen-input \
        -i ./cifs \
        -o ./af3_inputs \
        -it cif -ot af3 \
        --num-seeds 5 \
        --assembly-id 1 \
        --num-cpu 8

Example: AF3 → Protenix (single seed):

    pxm gen-input \
        -i af3.json \
        -o protenix.json \
        -it af3 -ot protenix \
        --seeds "0"

Example: Protenix → Boltz:

    pxm gen-input \
        -i protenix.json \
        -o boltz.yaml \
        -it protenix -ot boltz

Example: mmCIF → Boltz:

    pxm gen-input \
        -i structure.cif \
        -o boltz.yaml \
        -it cif -ot boltz

You can remove specific entity types from the input during generation using `-rm` or `--remove-entity-types`.
Supported types: `ligand`, `ion`, `glycan`, `protein`, `dna`, `rna`, `covalent_ligand`.

    pxm gen-input \
        -i structure.cif \
        -o structure_no_ion_dna.json \
        -it cif -ot protenix \
        -rm ion,dna

By default, polymer-polymer crosslinks (like disulfide bonds) are filtered out. Use `--keep_polymer_crosslinks` to keep them.

    pxm gen-input \
        -i structure.cif \
        -o structure_with_crosslinks.json \
        -it cif -ot protenix \
        --keep_polymer_crosslinks
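
If you drive the CLI from scripts, it can help to assemble the argument list in one place. The sketch below only builds the command and does not execute it; `build_gen_input_cmd` and its parameter names are hypothetical, while every flag it emits is taken from the tables and examples above.

```python
import shlex


def build_gen_input_cmd(input_path, output_path, input_type, output_type,
                        seeds=None, num_seeds=None, assembly_id=None,
                        num_cpu=None, remove_entity_types=(),
                        keep_polymer_crosslinks=False):
    """Return a `pxm gen-input` invocation as a list of arguments."""
    cmd = ["pxm", "gen-input",
           "-i", str(input_path), "-o", str(output_path),
           "-it", input_type, "-ot", output_type]
    if seeds is not None:
        cmd += ["--seeds", ",".join(str(s) for s in seeds)]
    if num_seeds is not None:
        cmd += ["--num-seeds", str(num_seeds)]
    if assembly_id is not None:
        cmd += ["--assembly-id", str(assembly_id)]
    if num_cpu is not None:
        cmd += ["--num-cpu", str(num_cpu)]
    if remove_entity_types:
        cmd += ["-rm", ",".join(remove_entity_types)]
    if keep_polymer_crosslinks:
        cmd.append("--keep_polymer_crosslinks")
    return cmd


cmd = build_gen_input_cmd("structure.cif", "out.json", "cif", "protenix",
                          remove_entity_types=["ion", "dna"],
                          keep_polymer_crosslinks=True)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell-quoting problems with paths that contain spaces.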