/root/miniconda3/envs/minima/lib/python3.8/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
/mnt/f/ProgrammeProject/PythonProject/MINIMA/data_engine/tools/infrared/scepter/modules/inference/control_inference.py:19: UserWarning: Import swift failed, please check it.
warnings.warn('Import swift failed, please check it.')
/mnt/f/ProgrammeProject/PythonProject/MINIMA/data_engine/tools/infrared/scepter/modules/inference/tuner_inference.py:19: UserWarning: Import swift error, please deal with this problem: Failed to import swift.tuners.base because of the following error (look up to see its traceback):
Failed to import swift.tuners.mapping because of the following error (look up to see its traceback):
'type' object is not subscriptable
warnings.warn(f'Import swift error, please deal with this problem: {e}')
[Info]: Loading config from tools/infrared/stylebooth_tb_pro.yaml
[Info]: System take tools/infrared/stylebooth_tb_pro.yaml as yaml, because we find yaml in this file
[Info]: ENV is not set and will use default ENV as {'SEED': 2023, 'USE_PL': False, 'BACKEND': 'nccl', 'SYNC_BN': False, 'CUDNN_DETERMINISTIC': True, 'CUDNN_BENCHMARK': False}; If want to change this value, please set them in your config.
[Info]: Parse cfg file as
{
"NAME": "EDIT",
"IS_DEFAULT": false,
"DEFAULT_PARAS": {
"PARAS": {
"RESOLUTIONS": [
[
512,
512
],
[
1024,
1024
]
]
},
"INPUT": {
"IMAGE": null,
"PROMPT": "",
"NEGATIVE_PROMPT": "",
"TARGET_SIZE_AS_TUPLE": [
1024,
1024
],
"PROMPT_PREFIX": "",
"SAMPLE": "ddim",
"SAMPLE_STEPS": 50,
"GUIDE_SCALE": {
"text": 7.5,
"image": 1.5
},
"GUIDE_RESCALE": 0.5,
"DISCRETIZATION": "trailing"
},
"OUTPUT": {
"LATENT": null,
"IMAGES": null,
"SEED": null
},
"MODULES_PARAS": {
"FIRST_STAGE_MODEL": {
"FUNCTION": [
{
"NAME": "encode",
"DTYPE": "float16",
"INPUT": [
"IMAGE"
]
},
{
"NAME": "decode",
"DTYPE": "float16",
"INPUT": [
"LATENT"
]
}
],
"PARAS": {
"SCALE_FACTOR": 0.18215,
"SIZE_FACTOR": 8
}
},
"DIFFUSION_MODEL": {
"FUNCTION": [
{
"NAME": "forward",
"DTYPE": "float16",
"INPUT": [
"SAMPLE_STEPS",
"SAMPLE",
"GUIDE_SCALE",
"GUIDE_RESCALE",
"DISCRETIZATION"
]
}
]
},
"COND_STAGE_MODEL": {
"FUNCTION": [
{
"NAME": "encode_text",
"DTYPE": "float16",
"INPUT": [
"PROMPT",
"NEGATIVE_PROMPT"
]
}
]
}
}
},
"MODEL": {
"PRETRAINED_MODEL": "weights/stylebooth/stylebooth-tb-5000-0.bin",
"SCHEDULE": {
"PARAMETERIZATION": "eps",
"TIMESTEPS": 1000,
"ZERO_TERMINAL_SNR": false,
"SCHEDULE_ARGS": {
"NAME": "scaled_linear",
"BETA_MIN": 0.00085,
"BETA_MAX": 0.012
}
},
"DIFFUSION_MODEL": {
"NAME": "DiffusionUNet",
"PRETRAINED_PATH": null,
"IN_CHANNELS": 8,
"OUT_CHANNELS": 4,
"MODEL_CHANNELS": 320,
"NUM_HEADS": 8,
"NUM_RES_BLOCKS": 2,
"ATTENTION_RESOLUTIONS": [
4,
2,
1
],
"CHANNEL_MULT": [
1,
2,
4,
4
],
"CONV_RESAMPLE": true,
"DIMS": 2,
"USE_CHECKPOINT": false,
"USE_SCALE_SHIFT_NORM": false,
"RESBLOCK_UPDOWN": false,
"USE_SPATIAL_TRANSFORMER": true,
"TRANSFORMER_DEPTH": 1,
"CONTEXT_DIM": 768,
"DISABLE_MIDDLE_SELF_ATTN": false,
"USE_LINEAR_IN_TRANSFORMER": false,
"IGNORE_KEYS": []
},
"FIRST_STAGE_MODEL": {
"NAME": "AutoencoderKL",
"EMBED_DIM": 4,
"IGNORE_KEYS": [],
"BATCH_SIZE": 4,
"ENCODER": {
"NAME": "Encoder",
"CH": 128,
"OUT_CH": 3,
"NUM_RES_BLOCKS": 2,
"IN_CHANNELS": 3,
"ATTN_RESOLUTIONS": [],
"CH_MULT": [
1,
2,
4,
4
],
"Z_CHANNELS": 4,
"DOUBLE_Z": true,
"DROPOUT": 0.0,
"RESAMP_WITH_CONV": true
},
"DECODER": {
"NAME": "Decoder",
"CH": 128,
"OUT_CH": 3,
"NUM_RES_BLOCKS": 2,
"IN_CHANNELS": 3,
"ATTN_RESOLUTIONS": [],
"CH_MULT": [
1,
2,
4,
4
],
"Z_CHANNELS": 4,
"DROPOUT": 0.0,
"RESAMP_WITH_CONV": true,
"GIVE_PRE_END": false,
"TANH_OUT": false
}
},
"TOKENIZER": {
"NAME": "ClipTokenizer",
"PRETRAINED_PATH": "weights/clip-vit-large-patch14/",
"LENGTH": 77,
"CLEAN": true
},
"COND_STAGE_MODEL": {
"NAME": "FrozenCLIPEmbedder",
"FREEZE": true,
"USE_GRAD": false,
"LAYER": "last",
"PRETRAINED_MODEL": "weights/clip-vit-large-patch14/"
}
},
"ENV": {
"SEED": 2023,
"USE_PL": false,
"BACKEND": "nccl",
"SYNC_BN": false,
"CUDNN_DETERMINISTIC": true,
"CUDNN_BENCHMARK": false
}
}
Processing Infrared Images: 0%| | 0/1 [00:00<?, ?it/s]{'image': None, 'prompt': 'Convert the image to an infrared image', 'negative_prompt': '', 'target_size_as_tuple': [1184, 1600], 'prompt_prefix': '', 'sample': 'ddim', 'sample_steps': 50, 'guide_scale': {'text': 7.5, 'image': 1.5}, 'guide_rescale': 0.5, 'discretization': 'trailing'}
cuda
scepter [INFO] 2025-10-09 17:09:00,957 [File: diffusion_inference.py Function: dynamic_load at line 268] Loading diffusion_model model
scepter [INFO] 2025-10-09 17:09:31,365 [File: unet_module.py Function: init_from_ckpt at line 479] Restored from /mnt/f/ProgrammeProject/PythonProject/MINIMA/data_engine/weights/stylebooth/diffusion_model.pth with 0 missing and 0 unexpected keys
scepter [INFO] 2025-10-09 17:09:33,030 [File: diffusion_inference.py Function: dynamic_load at line 268] Loading cond_stage_model model
scepter [INFO] 2025-10-09 17:09:39,371 [File: diffusion_inference.py Function: init_from_ckpt at line 215] Restored from /mnt/f/ProgrammeProject/PythonProject/MINIMA/data_engine/weights/stylebooth/cond_stage_model.pth with 0 missing and 1 unexpected keys
scepter [INFO] 2025-10-09 17:09:39,371 [File: diffusion_inference.py Function: init_from_ckpt at line 221]
Unexpected Keys:
['transformer.text_model.embeddings.position_ids']
scepter [INFO] 2025-10-09 17:09:39,958 [File: tuner_inference.py Function: register_tuner at line 44] Loading tuner model
tunner_model_folder weights/stylebooth/step-210000/
Processing Infrared Images: 0%| | 0/1 [00:41<?, ?it/s]
Traceback (most recent call last):
File "modality_engine.py", line 156, in <module>
transfer_infrared(args, input_paths, device, output_dir, ext)
File "modality_engine.py", line 21, in transfer_infrared
images = infrared_transfer_single(args, image_path, diff_infer, tuner_model_list,
File "/mnt/f/ProgrammeProject/PythonProject/MINIMA/data_engine/tools/infrared/infrared_transfer.py", line 95, in infrared_transfer_single
output = diff_infer(input_dict, **other_args)
File "/root/miniconda3/envs/minima/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/mnt/f/ProgrammeProject/PythonProject/MINIMA/data_engine/tools/infrared/scepter/modules/inference/stylebooth_inference.py", line 154, in __call__
self.tuner_infer.register_tuner(tuner_model, self.diffusion_model,
File "/mnt/f/ProgrammeProject/PythonProject/MINIMA/data_engine/tools/infrared/scepter/modules/inference/tuner_inference.py", line 207, in register_tuner
diffusion_model['model'] = Swift.from_pretrained(
NameError: name 'Swift' is not defined
Firstly, thanks for your work!
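I have run into an error that stops the infrared transfer pipeline. As the log above shows, importing swift fails in tuner_inference.py (line 19) with "'type' object is not subscriptable", and register_tuner then crashes at Swift.from_pretrained with "NameError: name 'Swift' is not defined".
My environment: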
Python: 3.8.20
ms-swift: 2.0.1 (locally modified to fix a typo)
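The warning only says "look up to see its traceback", so the real failure inside swift is hidden. A small check like the one below, run in the same conda environment, should surface it. This is only a sketch: import_report is a hypothetical helper name, and the module names are taken from the warning text above. For context, on Python 3.8 subscripting a built-in type such as list[str] raises this same TypeError, so a mismatch between this ms-swift version and Python 3.8 is one possible cause, though the log alone does not confirm it.

```python
import importlib
import sys
import traceback


def import_report(module_name):
    """Try to import module_name.

    Returns None on success, otherwise the formatted traceback as a string.
    (Hypothetical helper, not part of scepter or ms-swift.)
    """
    try:
        importlib.import_module(module_name)
    except Exception:
        # Keep the full traceback instead of reducing it to a warning.
        return traceback.format_exc()
    return None


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    # Module names copied from the warning message in the log above.
    for name in ("swift.tuners.mapping", "swift.tuners.base", "swift"):
        report = import_report(name)
        status = "OK" if report is None else "FAILED"
        print(f"--- {name}: {status}")
        if report is not None:
            print(report)
```

Importing the submodules directly rather than through the top-level package is deliberate: the file and line reported for swift.tuners.mapping should point at the exact expression that raises the TypeError.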
Thanks for any help!