| 1 | +# isolation-forest-onnx |
| 2 | + |
| 3 | +Converts models from the LinkedIn Spark/Scala [isolation forest](https://github.com/linkedin/isolation-forest) format to [ONNX](https://onnx.ai/) format for portability across platforms and languages. |
| 4 | + |
| 5 | +**Note:** ONNX conversion is supported for the standard `IsolationForestModel` only. The `ExtendedIsolationForestModel` uses hyperplane-based splits that are not compatible with the axis-aligned tree ensemble representation used by the ONNX converter. |
| 6 | + |
| 7 | +## Installation |
| 8 | + |
| 9 | +```bash |
| 10 | +pip install isolation-forest-onnx |
| 11 | +``` |
| 12 | + |
| 13 | +We recommend using the same converter version as the `isolation-forest` library version used to train the model. |
| 14 | + |
| 15 | +## Converting a trained model to ONNX |
| 16 | + |
| 17 | +```python |
| 18 | +import os |
| 19 | +from isolationforestonnx.isolation_forest_converter import IsolationForestConverter |
| 20 | + |
| 21 | +# Path where the trained IsolationForestModel was saved in Scala |
| 22 | +path = '/user/testuser/isolationForestWriteTest' |
| 23 | + |
| 24 | +# Get the model data file path (assumes the first directory entry is the Avro model file) |
| 25 | +data_dir_path = path + '/data' |
| 26 | +avro_model_file = os.listdir(data_dir_path) |
| 27 | +model_file_path = data_dir_path + '/' + avro_model_file[0] |
| 28 | + |
| 29 | +# Get the model metadata file path (assumes the first directory entry is the metadata file) |
| 30 | +metadata_dir_path = path + '/metadata' |
| 31 | +metadata_file = os.listdir(metadata_dir_path) |
| 32 | +metadata_file_path = metadata_dir_path + '/' + metadata_file[0] |
| 33 | + |
| 34 | +# Convert the model to ONNX format (returns the ONNX model in memory) |
| 35 | +converter = IsolationForestConverter(model_file_path, metadata_file_path) |
| 36 | +onnx_model = converter.convert() |
| 37 | + |
| 38 | +# Convert and save the model in ONNX format |
| 39 | +onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx' |
| 40 | +converter.convert_and_save(onnx_model_path) |
| 41 | +``` |
| 42 | + |
| 43 | +## Using the ONNX model for inference |
| 44 | + |
| 45 | +```python |
| 46 | +import numpy as np |
| 47 | +import onnx |
| 48 | +from onnxruntime import InferenceSession |
| 49 | + |
| 50 | +onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx' |
| 51 | +dataset_path = 'shuttle.csv' |
| 52 | + |
| 53 | +# Load data |
| 54 | +input_data = np.loadtxt(dataset_path, delimiter=',') |
| 55 | +num_features = input_data.shape[1] - 1 |
| 56 | +last_col_index = num_features |
| 57 | + |
| 58 | +# The last column is the label column, so drop it from the features |
| 59 | +input_dict = {'features': np.delete(input_data, last_col_index, 1).astype(dtype=np.float32)} |
| 60 | + |
| 61 | +# Load the ONNX model and run inference |
| 62 | +onx = onnx.load(onnx_model_path) |
| 63 | +sess = InferenceSession(onx.SerializeToString()) |
| 64 | +res = sess.run(None, input_dict) |
| 65 | + |
| 66 | +# Print the outlier scores for the first 10 rows |
| 67 | +outlier_scores = res[0] |
| 68 | +print(np.transpose(outlier_scores[:10])[0]) |
| 69 | +``` |
| 70 | + |
| 71 | +## License |
| 72 | + |
| 73 | +BSD 2-Clause License. See [LICENSE](https://github.com/linkedin/isolation-forest/blob/master/LICENSE) for details. |
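
A note on the conversion example in this README: it takes the first entry returned by `os.listdir` in the `data` and `metadata` directories. Spark output directories commonly also contain a `_SUCCESS` marker and hidden `.crc` checksum files, and `os.listdir` returns entries in arbitrary order, so the first entry may not be the model file. Below is a minimal sketch of a more defensive lookup. `find_part_file` is a hypothetical helper, not part of `isolation-forest-onnx`, and it assumes the saved files follow Spark's usual `part-*` naming.

```python
import os


def find_part_file(dir_path):
    """Return the path of the first Spark part file in dir_path.

    Hypothetical helper: assumes Spark's usual `part-*` file naming.
    Entries such as `_SUCCESS` and hidden `.crc` checksum files do not
    start with `part-`, so they are skipped.
    """
    # Sort so the result is deterministic regardless of listing order.
    candidates = sorted(
        name for name in os.listdir(dir_path) if name.startswith('part-')
    )
    if not candidates:
        raise FileNotFoundError(f'No part file found in {dir_path}')
    return os.path.join(dir_path, candidates[0])
```

With this helper, the two lookups in the example could become `model_file_path = find_part_file(path + '/data')` and `metadata_file_path = find_part_file(path + '/metadata')`. If the model was saved with a different file layout, the prefix check would need adjusting.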