2525- [Adding dependencies](#adding-dependencies)
2626- [Using enums](#using-enums)
2727 - [Inline enums](#inline-enums)
28+ - [Creating a module in an external crate](#creating-a-module-in-an-external-crate)
29+ - [Setting up the crate](#setting-up-the-crate)
30+ - [Compiling the proto at build time](#compiling-the-proto-at-build-time)
31+ - [Registering the module](#registering-the-module)
32+ - [Adding callable functions to an external module](#adding-callable-functions-to-an-external-module)
33+ - [Overriding the module output at scan time](#overriding-the-module-output-at-scan-time)
34+ - [Ensuring the module is linked](#ensuring-the-module-is-linked)
2835- [Tests](#tests)
2936 - [Structuring Testdata Input](#structuring-testdata-input)
3037 - [Linux](#linux)
@@ -121,7 +128,6 @@ Let's start with the interesting part:
121128option (yara.module_options) = {
122129 name : "text"
123130 root_message: "text.Text"
124- rust_module: "text"
125131 cargo_feature: "text-module"
126132};
127133```
@@ -133,10 +139,10 @@ file, but one describing a module. In fact, you can put any `.proto` file in the
133139files is describing a YARA module. Only files containing a ` yara.module_options `
134140section will define a module.
135141
136- Options ` name ` and ` root_message ` are required, while ` rust_module ` and
137- ` cargo_feature ` are optional. The ` name ` option defines the module's name. This
138- is the name that will be used for importing the module in a YARA rule, in this
139- case our module will be imported with ` import "text" ` .
142+ Options `name` and `root_message` are required, while `cargo_feature` is optional.
143+ The `name` option defines the module's name, which is the name used for
144+ importing the module in a YARA rule. In this case, our module will be imported
145+ with `import "text"`.
140146
141147The ` cargo_feature ` option indicates the name of the feature that controls
142148whether
@@ -292,11 +298,6 @@ need write the logic that parses every scanned file and fills the module's
292298structure with the data obtained from the file. This is done by implementing
293299a function that will act as the entry point for your module.
294300
295- This is where the `rust_module` option described in the previous section enters
296- into play. This option is the name of the Rust module that contains the code
297- for your module. In our `text.proto` file we have `rust_module : " text" ` , which
298- means that our Rust module must be named ` text`.
299-
300301There are two options for creating our `text` module :
301302
302303* Creating a `text.rs` file in `lib/src/modules`.
@@ -309,24 +310,25 @@ second approach is the recommended one.
309310So, let's create our `lib/src/modules/text.rs` file :
310311
311312` ` ` rust
312- use crate::modules::prelude::*;
313+ use crate::mods::prelude::*;
313314use crate::modules::protos::text::*;
314315
315- #[module_main]
316- fn main(data: &[u8]) -> Text {
316+ fn main(data: &[u8], _meta: Option<&[u8]>) -> Result<Text, ModuleError> {
317317 let mut text_proto = Text::new();
318318
319319 // TODO: parse the data and populate text_proto.
320320
321- text_proto
321+ Ok(text_proto)
322322}
323+
324+ register_module!("text", Text, main);
323325` ` `
324326
325327This is the simplest possible code for a YARA module, and it doesn't do anything
326328special yet. Let's describe what it does in detail :
327329
328330` ` ` rust
329- use crate::modules::prelude::*;
331+ use crate::mods::prelude::*;
330332` ` `
331333
332334This first line is very important as it imports all the dependencies required
@@ -348,38 +350,36 @@ will be `crate::modules::protos::foobar`
348350
349351---
350352
351- Next comes the module's main function :
353+ Next comes the module's main function and the module registration :
352354
353355` ` ` rust
354- #[module_main]
355- fn main(data: &[u8]) -> Text {
356+ fn main(data: &[u8], _meta: Option<&[u8]>) -> Result<Text, ModuleError> {
356357 ...
357358}
359+
360+ register_module!("text", Text, main);
358361` ` `
359362
360363The module's main function is called for every file scanned by YARA. This
361- function receives a byte slice with the content of the file being scanned. It
362- must return the `Text` structure that was generated from the `text.proto` file.
363- The main function must have the `#[module_main]` attribute. Notice that the
364- module's main function doesn't need to be called `main`, it can have any
365- arbitrary name, as long as it has the `#[module_main]` attribute. Of course,
366- this attribute can't be used with more than one function per module.
367-
368- The main function usually consists in creating an instance of the protobuf
369- you previously defined, and populating the protobuf with information extracted
370- from
371- the scanned file. Let's finish the implementation of the main function for our
372- ` text` module.
364+ function receives a byte slice with the content of the file being scanned and an
365+ optional byte slice with per-scan metadata, and it returns a `Result` containing the
366+ ` Text` structure that was generated from the `text.proto` file (or a `ModuleError`).
367+
368+ Registering the module is as simple as calling the `register_module!` macro.
369+ It takes the name of the module (as used in YARA rules' `import` statements), the
370+ protobuf message type returned by the module, and the main function name. If the
371+ module is a data-only module with no main function, the third argument can be omitted.
372+
373+ Let's finish the implementation of the main function for our `text` module.
373374
374375` ` ` rust
375- use crate::modules::prelude::*;
376+ use crate::mods::prelude::*;
376377use crate::modules::protos::text::*;
377378
378379use std::io;
379380use std::io::BufRead;
380381
381- #[module_main]
382- fn main(data: &[u8]) -> Text {
382+ fn main(data: &[u8], _meta: Option<&[u8]>) -> Result<Text, ModuleError> {
383383 // Create an empty instance of the Text protobuf.
384384 let mut text_proto = Text::new();
385385
@@ -396,7 +396,7 @@ fn main(data: &[u8]) -> Text {
396396 num_words += line.split_whitespace().count();
397397 num_lines += 1;
398398 }
399- Err(_) => return text_proto,
399+ Err(_) => return Ok(text_proto),
400400 }
401401 }
402402
@@ -405,8 +405,10 @@ fn main(data: &[u8]) -> Text {
405405 text_proto.set_num_words(num_words as i64);
406406
407407 // Return the Text proto after filling the relevant fields.
408- text_proto
408+ Ok(text_proto)
409409}
410+
411+ register_module!("text", Text, main);
410412```
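
The counting logic in the main function can be exercised on its own, without any YARA-X types. The following is a minimal sketch under stated assumptions: `count_lines_and_words` is a hypothetical helper (it is not part of the module above), and it reports a decoding failure as `None` instead of returning a partially filled protobuf.

```rust
use std::io::BufRead;

/// Counts lines and whitespace-separated words in `data`.
///
/// Hypothetical helper for illustration. Returns `None` when a line is
/// not valid UTF-8, which is the point where the module's main function
/// returns early.
fn count_lines_and_words(data: &[u8]) -> Option<(usize, usize)> {
    let mut num_lines = 0;
    let mut num_words = 0;

    // `&[u8]` implements `BufRead`; `lines()` yields an `Err` for any
    // line that cannot be decoded as UTF-8.
    for line in data.lines() {
        let line = line.ok()?;
        num_words += line.split_whitespace().count();
        num_lines += 1;
    }

    Some((num_lines, num_words))
}
```

Keeping the parsing in a plain function like this makes it easy to unit test separately from the protobuf plumbing.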
411413
412414That's all you need for having a fully functional YARA module. Now, let's build
@@ -1009,6 +1011,202 @@ enum CPU_SUBTYPE_ARM {
10091011With the enums above you can refer to `macho.CPU_TYPE_X86` and `macho.CPU_SUBTYPE_I386` instead of
10101012`macho.CPU_TYPE.CPU_TYPE_X86` and `macho.CPU_SUBTYPE_INTEL.CPU_SUBTYPE_I386`.
10111013
1014+ ## Creating a module in an external crate
1015+
1016+ Everything described so far assumes that your module lives inside the
1017+ ` yara-x ` repository itself. This is the right approach when you intend to
1018+ contribute the module upstream, but it requires modifying the ` yara-x ` source
1019+ tree and rebuilding the library. If you want to ship a module as part of your
1020+ own crate—without forking or patching ` yara-x ` —you can use the ** custom
1021+ modules** API instead.
1022+
1023+ Custom modules are registered at link time through the
1024+ [inventory](https://docs.rs/inventory) crate. When your crate is linked into
1025+ a binary together with ` yara-x ` , the module is discovered automatically and
1026+ behaves exactly like a built-in module: it can be ` import ` -ed in rules, its
1027+ fields are accessible in conditions, and its functions are callable.
1028+
1029+ ### Setting up the crate
1030+
1031+ Add ` yara-x ` and ` protobuf ` as dependencies:
1032+
1033+ ``` toml
1034+ [dependencies]
1035+ yara-x = { version = "..." }
1036+ protobuf = { version = "3" }
1037+
1038+ [build-dependencies]
1039+ protobuf-codegen = { version = "3" }
1040+ ```
1041+
1042+ ### Compiling the proto at build time
1043+
1044+ Just like built-in modules, an external module's structure is described by a
1045+ ` .proto ` file. The difference is that instead of placing it inside the
1046+ ` yara-x ` source tree and relying on ` yara-x ` 's own build script, you compile
1047+ it yourself in ` build.rs ` using ` protobuf-codegen ` :
1048+
1049+ ``` protobuf
1050+ // proto/foobar.proto
1051+ syntax = "proto2";
1052+
1053+ package foobar;
1054+
1055+ message Foobar {
1056+ optional uint64 count = 1;
1057+ optional string label = 2;
1058+ repeated string tags = 3;
1059+ }
1060+ ```
1061+
1062+ ``` rust
1063+ // build.rs
1064+ fn main() {
1065+     println!("cargo:rerun-if-changed=build.rs");
1066+     println!("cargo:rerun-if-changed=proto");
1067+
1068+     protobuf_codegen::Codegen::new()
1069+         .pure()
1070+         .cargo_out_dir("protos")
1071+         .include("proto")
1072+         .input("proto/foobar.proto")
1073+         .run_from_script();
1074+ }
1075+ ```
1076+
1077+ The ` .pure() ` call tells ` protobuf-codegen ` to use its built-in Rust parser,
1078+ so you do not need to have ` protoc ` installed. The generated code is placed
1079+ under ` $OUT_DIR/protos/ ` , and you include it from your library like this:
1080+
1081+ ``` rust
1082+ pub mod proto {
1083+     include!(concat!(env!("OUT_DIR"), "/protos/mod.rs"));
1084+ }
1085+ pub use proto::foobar::Foobar;
1086+ ```
1087+
1088+ ### Registering the module
1089+
1090+ With the proto in place, registering the module requires two things: a main
1091+ function and a call to ` yara_x::register_module! ` . The main function follows the same
1092+ contract as in built-in modules—it receives the scanned data, populates the
1093+ protobuf, and returns it:
1094+
1095+ ``` rust
1096+ use yara_x::errors::ModuleError;
1097+
1098+ fn foobar_main(
1099+     data: &[u8],
1100+     _meta: Option<&[u8]>,
1101+ ) -> Result<Foobar, ModuleError> {
1102+     let mut out = Foobar::new();
1103+     out.count = Some(data.len() as u64);
1104+     out.label = Some("foobar".to_owned());
1105+     Ok(out)
1106+ }
1107+
1108+ yara_x::register_module!("foobar", Foobar, foobar_main);
1109+ ```
1110+
1111+ A few things to note here:
1112+
1113+ * The first argument is the string used in ` import "foobar" ` in YARA rules.
1114+ * The second argument is the root protobuf message type. The root descriptor is automatically obtained from this type.
1115+ * The third argument is the main function. If your module is data-only (the caller always
1116+ injects the output via ` set_module_output ` ), it can be omitted.
1117+ * The macro automatically sets the Rust module name path via ` module_path!() ` so that YARA can find any callable functions.
1118+
1119+ If a custom module shares its name with a built-in module, the built-in one
1120+ takes precedence and your registration is silently ignored.
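
Because a clashing registration is dropped silently, it can help to picture the precedence rule explicitly. The sketch below is a toy model only: `Registry`, `Origin` and `register` are hypothetical names, and it assumes built-in modules are registered before custom ones. It does not reflect how YARA-X implements its registry.

```rust
use std::collections::HashMap;

/// Where a module came from. Hypothetical, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Origin {
    BuiltIn,
    Custom,
}

/// Toy registry keyed by the name used in `import` statements.
#[derive(Default)]
struct Registry {
    modules: HashMap<&'static str, Origin>,
}

impl Registry {
    /// Registers a module and reports whether the registration took
    /// effect. A custom module never replaces a built-in one.
    fn register(&mut self, name: &'static str, origin: Origin) -> bool {
        let existing = self.modules.get(name).copied();
        match existing {
            // The name is taken by a built-in module: ignore the request.
            Some(Origin::BuiltIn) if origin == Origin::Custom => false,
            _ => {
                self.modules.insert(name, origin);
                true
            }
        }
    }
}
```

In practice, picking a name that is unlikely to collide with a current or future built-in module avoids the problem altogether.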
1121+
1122+ ### Adding callable functions to an external module
1123+
1124+ Custom modules can also export functions that are callable from YARA rules, the
1125+ same way built-in modules do with ` #[module_export] ` . From an external crate
1126+ you use the same attribute, but you must pass an extra ` yara_x_crate `
1127+ argument so that the macro can generate fully-qualified type references:
1128+
1129+ ``` rust
1130+ pub mod fns {
1131+     use yara_x::ScanContext;
1132+
1133+     #[yara_x::module_export(yara_x_crate = "yara_x")]
1134+     pub fn add(_ctx: &ScanContext, a: i64, b: i64) -> i64 {
1135+         a + b
1136+     }
1137+ }
1138+ ```
1139+
1140+ The ` yara_x_crate = "yara_x" ` argument is required whenever the macro is used
1141+ outside of the ` yara-x ` source tree. Without it the macro generates bare type
1142+ names (` Caller ` , ` ScanContext ` , etc.) that are only in scope inside ` yara-x ` .
1143+
1144+ The function signature rules are exactly the same as for built-in modules;
1145+ see [Valid function arguments](#valid-function-arguments) and
1146+ [Valid return types](#valid-return-types).
1147+
1148+ After this, the ` add ` function is callable from YARA rules as ` foobar.add(a, b) ` :
1149+
1150+ ``` yara
1151+ import "foobar"
1152+
1153+ rule add_works {
1154+ condition:
1155+ foobar.add(3, 4) == 7
1156+ }
1157+ ```
1158+
1159+ ### Overriding the module output at scan time
1160+
1161+ Sometimes you already have the data your module would expose, and you don't
1162+ want to re-derive it inside the module's main function. The `Scanner` API lets you
1163+ inject a pre-built protobuf directly, bypassing the main function entirely for that scan:
1164+
1165+ ``` rust
1166+ let mut compiler = yara_x::Compiler::new();
1167+ compiler.add_source(r#"import "foobar" rule r { condition: foobar.count == 99 }"#)?;
1168+ let rules = compiler.build();
1169+
1170+ let mut out = Foobar::new();
1171+ out.count = Some(99);
1172+
1173+ let mut scanner = yara_x::Scanner::new(&rules);
1174+ scanner.set_module_output(Box::new(out))?;
1175+ scanner.scan(data)?;
1176+ ```
1177+
1178+ ` set_module_output ` identifies the target module by the type of the message
1179+ you pass in. YARA matches it against the ` root_descriptor ` you registered, so
1180+ the type must be exactly the message type declared as your module's root.
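
To make the "identified by type" idea concrete, here is a small standalone sketch that routes an injected value to a module based on its concrete Rust type. It uses `std::any::TypeId` as a stand-in for a protobuf message descriptor. `ToyScanner`, `register_root` and the `Foobar`/`Other` structs are hypothetical and only mirror the shape of the real API.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// Hypothetical stand-ins for the root message types of two modules.
struct Foobar {
    count: u64,
}
struct Other;

/// Toy scanner that routes injected outputs by their concrete type.
struct ToyScanner {
    /// Maps each registered root type to its module name.
    roots: HashMap<TypeId, &'static str>,
    /// Outputs injected so far, keyed by module name.
    outputs: HashMap<&'static str, Box<dyn Any>>,
}

impl ToyScanner {
    fn new() -> Self {
        Self { roots: HashMap::new(), outputs: HashMap::new() }
    }

    /// Declares `T` as the root type of `module`.
    fn register_root<T: Any>(&mut self, module: &'static str) {
        self.roots.insert(TypeId::of::<T>(), module);
    }

    /// Stores `output` for the module whose root type matches it.
    fn set_module_output(
        &mut self,
        output: Box<dyn Any>,
    ) -> Result<&'static str, String> {
        // Dereference first: calling `type_id` on the box itself would
        // report the type of `Box<dyn Any>`, not of the value inside it.
        let type_id = (*output).type_id();
        let module = *self
            .roots
            .get(&type_id)
            .ok_or_else(|| "no module uses this type as its root".to_string())?;
        self.outputs.insert(module, output);
        Ok(module)
    }

    /// Returns the injected output for `module`, if it has type `T`.
    fn output<T: Any>(&self, module: &str) -> Option<&T> {
        self.outputs.get(module)?.downcast_ref::<T>()
    }
}
```

The sketch shows why passing any type other than the registered root fails: there is simply no module to route it to.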
1181+
1182+ ### Ensuring the module is linked
1183+
1184+ Rust's linker may discard your crate's ` inventory::submit! ` initializer if
1185+ nothing in the final binary directly references a symbol from your crate.
1186+ The safest workaround is to expose a no-op function and call it from the
1187+ binary's entry point:
1188+
1189+ ``` rust
1190+ /// Call this from your binary's `main` (or from test setup) to ensure
1191+ /// the linker keeps this crate's module registration.
1192+ pub fn ensure_registered() {}
1193+ ```
1194+
1195+ ``` rust
1196+ fn main() {
1197+     my_module_crate::ensure_registered();
1198+     // ...
1199+ }
1200+ ```
1201+
1202+ You can verify that the registration worked by checking the module registry:
1203+
1204+ ``` rust
1205+ my_module_crate::ensure_registered();
1206+ let names: Vec<&str> = yara_x::mods::module_names().collect();
1207+ assert!(names.contains(&"foobar"));
1208+ ```
1209+
10121210## Tests
10131211
10141212You'll notice that each module in ` /lib/src/modules/ ` has a ` tests/ `