feat: Add "deps" command to generate a graph of rule dependencies. #498
wxsBSD wants to merge 19 commits into VirusTotal:main
Given a set of rules, parse them and walk the AST to find identifiers, then
generate a dot file that can be fed into graphviz for visualization.
By default it generates a graph of all the rules, but you can select any number
of rules with the `-r` argument.
For example, given these rules:
```
rule a { condition: pe.is_dll() }
rule b { condition: a }
rule c { condition: b }
rule d { condition: false }
```
Selecting a rule with `-r b` produces output that looks like this:
```
digraph {
b [fillcolor=paleturquoise, style="filled"];
a [fillcolor=paleturquoise, style="filled"];
pe [fillcolor=palegreen, style="filled"];
a -> pe;
b -> a;
}
```
This mode is best thought of as "what is the minimum set of rules and imports I
need to execute the selected rule."
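The "minimum set" described above is the set of nodes reachable from the selected rule. A minimal sketch of that traversal follows, assuming the dependency edges have already been extracted into a map. The `Graph` alias, `example_graph` and `dependencies` are hypothetical names for illustration, not the code in this PR.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical graph shape: each name maps to the names it references directly.
type Graph = BTreeMap<&'static str, Vec<&'static str>>;

/// The four example rules, written out by hand rather than parsed.
fn example_graph() -> Graph {
    BTreeMap::from([
        ("a", vec!["pe"]), // rule a uses the pe module
        ("b", vec!["a"]),
        ("c", vec!["b"]),
        ("d", vec![]),
    ])
}

/// Returns the selected rule plus everything reachable from it.
fn dependencies(graph: &Graph, selected: &'static str) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![selected];
    while let Some(name) = stack.pop() {
        // `insert` returns false when the node was already visited.
        if seen.insert(name) {
            if let Some(children) = graph.get(name) {
                stack.extend(children.iter().copied());
            }
        }
    }
    seen
}
```

With this sketch, `dependencies(&example_graph(), "b")` yields `a`, `b` and `pe`, the same three nodes as the dot output above.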
The `-R` argument displays the reverse dependencies of a rule. For the same
rules above, selecting `b` with `-R` gives:
```
digraph {
b [fillcolor=paleturquoise, style="filled"];
c [fillcolor=paleturquoise, style="filled"];
c -> b;
}
```
This mode is best thought of as "if I change this rule, what other rules do I
also need to test."
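One way to compute reverse dependencies is to invert every edge and then walk the inverted graph from the selected rule. The sketch below assumes the edges are already available as a map from each rule to the names it references. `invert` and `reverse_dependencies` are hypothetical names, not functions from this PR.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Turns "rule -> names it references" into "name -> rules that reference it".
fn invert<'a>(graph: &BTreeMap<&'a str, Vec<&'a str>>) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut reversed: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for (rule, deps) in graph {
        for dep in deps {
            reversed.entry(*dep).or_default().push(*rule);
        }
    }
    reversed
}

/// Returns the selected rule plus every rule that depends on it,
/// directly or indirectly.
fn reverse_dependencies<'a>(
    graph: &BTreeMap<&'a str, Vec<&'a str>>,
    selected: &'a str,
) -> BTreeSet<&'a str> {
    let reversed = invert(graph);
    let mut seen = BTreeSet::new();
    let mut stack = vec![selected];
    while let Some(name) = stack.pop() {
        // Skip nodes that were already visited.
        if seen.insert(name) {
            if let Some(users) = reversed.get(name) {
                stack.extend(users.iter().copied());
            }
        }
    }
    seen
}
```

For the example rules, walking from `b` reaches only `b` and `c`, which matches the reverse graph shown above. Walking from `a` would also include `b` and `c`, since both depend on it through the chain.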
Move the dependency walking code to its own command and make it hidden by default until it gets more testing.
I don't have any tests for this yet, but I'm willing to write them if you think this is a good idea to include in yara-x. I'm just putting this out there now to get some early feedback. I have tested this with a very complex set of rules from work and it does parse them and output graphs. However, the graphs quickly become very hard to understand if you have exceptionally large dependency chains in your output. For smaller graphs (dozens of dependencies) it looks much better.
I've updated the code to use the new features you've added and it works great, thanks! I've decided to stop efforts to track unknown identifiers because it can get weird. For example, these two rules produce drastically different ASTs:
The ASTs:
In the case of
It is for this reason that I went ahead and removed the "unknown identifier" part of this PR and now we only track dependencies to existing rules or things that look like modules (all other identifiers are ignored).
Variables can be tracked by using a vector that behaves as a stack of defined variable identifiers, and another vector containing the indexes within this stack where each variable scope starts.
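The two-vector scheme described here could look roughly like the sketch below. It assumes scopes are opened by constructs that declare variables (such as `for` loops) and closed when the walker leaves them. The `Scopes` type and its method names are hypothetical, not part of yara-x.

```rust
/// Hypothetical variable tracker built from two vectors.
struct Scopes {
    /// Stack of all variable identifiers currently defined, innermost last.
    vars: Vec<String>,
    /// For each open scope, the index in `vars` where that scope starts.
    starts: Vec<usize>,
}

impl Scopes {
    fn new() -> Self {
        Scopes { vars: Vec::new(), starts: Vec::new() }
    }

    /// Opens a new scope, remembering where its variables begin.
    fn enter_scope(&mut self) {
        self.starts.push(self.vars.len());
    }

    /// Declares a variable in the innermost scope.
    fn declare(&mut self, name: &str) {
        self.vars.push(name.to_string());
    }

    /// Closes the innermost scope, dropping every variable it declared.
    fn exit_scope(&mut self) {
        if let Some(start) = self.starts.pop() {
            self.vars.truncate(start);
        }
    }

    /// True if `name` is a variable in any open scope, so the walker
    /// should not treat it as a rule or module identifier.
    fn is_variable(&self, name: &str) -> bool {
        self.vars.iter().any(|v| v.as_str() == name)
    }
}
```

Leaving a scope is a single `truncate`, so nested scopes unwind without per-variable bookkeeping.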
Field names can't be handled as identifiers because they could cause dependencies that don't actually exist. For instance:
With the current implementation the
I'm just thinking out loud, but I believe any identifier that is under a field access expression should be ignored, except for the first operand.
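The "first operand only" idea can be sketched on a toy expression type. The `Expr` enum below is a deliberately simplified stand-in, not the yara-x AST, and `collect_idents` is a hypothetical helper. The example assumes the problematic case is a field name that happens to match a rule name.

```rust
/// Toy expression type for illustration only (NOT the yara-x AST).
enum Expr {
    Ident(String),
    /// `operands[0].operands[1]. ...`
    FieldAccess(Vec<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Collects identifiers that may name a rule or module, in source order.
fn collect_idents(expr: &Expr, out: &mut Vec<String>) {
    match expr {
        Expr::Ident(name) => out.push(name.clone()),
        Expr::FieldAccess(operands) => {
            // Only the first operand can refer to a rule or module;
            // everything after it is a field name and is skipped.
            if let Some(first) = operands.first() {
                collect_idents(first, out);
            }
        }
        Expr::And(lhs, rhs) => {
            collect_idents(lhs, out);
            collect_idents(rhs, out);
        }
    }
}

/// Convenience wrapper that returns the collected identifiers.
fn idents(expr: &Expr) -> Vec<String> {
    let mut out = Vec::new();
    collect_idents(expr, &mut out);
    out
}
```

For a condition shaped like `pe.a and b`, this collects `pe` and `b` but not the field `a`, so no dependency on a rule named `a` is recorded.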
I think you're right here. I spent a bit of time trying to come up with a rule that would cause a problem here but I haven't been able to. I did find a different problem but I'll open a different issue for that.
This branch adds a "deps" command that generates dependency information for a set of rules. It walks the AST looking for identifiers of rules, modules and unknown identifiers (hopefully external variables) and collects information about them. For any given rule it will output either the dependencies of that rule or the reverse dependencies of that rule. The output is in the form of a graphviz file that can be piped into `dot` to generate a visual graph.
For example, given these rules:
You can print out the dependencies of `a` with `yr deps -r a rules/test.yara`:
This can be useful if you're looking to find the minimum set of rules and modules needed to share `rule a` in this case. It obviously becomes harder to determine this without a dependency walker when you have more complex graphs. For example, knowing the set of rules and imports to share `rule c` is more complex just due to the length of the chain.
You can also get reverse dependencies, which is a nice thing to know when you want to make a change to a rule. For example, if I were to change `rule a` it would be nice to know that I haven't broken any of the rules that depend upon it (directly or indirectly). Assuming those rules have "expected matches" values in the metadata you can use the dependency walking code to determine what rules to test and what they should match.
`yr deps -R -r a rules/test.yara`: