RFAnalyzer

class pyruleanalyzer.RFAnalyzer(classifier: RuleClassifier)

Random-Forest-specific rule analysis and comparison.

This class extracts the RF analysis pipeline out of RuleClassifier and adds redundancy breakdown tracking so the output shows:

  • intra_tree – boundary-redundant sibling pairs merged (within each tree)

  • inter_tree – semantically duplicate rules across trees

  • low_usage – rules removed because of low classification count

classifier

The underlying RuleClassifier instance.

redundancy_counts

Dict mapping redundancy type to count.
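
As a rough illustration of the shape this attribute might take, the sketch below builds a plain dict keyed by the three redundancy types listed above. The zero-initialised layout and the helper name empty_redundancy_counts are assumptions made for this example, not part of the documented API.

```python
from __future__ import annotations

# The three redundancy types named in the class description.
REDUNDANCY_TYPES = ("intra_tree", "inter_tree", "low_usage")


def empty_redundancy_counts() -> dict[str, int]:
    """Return a counter dict with one zeroed entry per redundancy type.

    Hypothetical helper: the real attribute may be initialised differently.
    """
    return {kind: 0 for kind in REDUNDANCY_TYPES}


counts = empty_redundancy_counts()
counts["intra_tree"] += 2  # e.g. two sibling pairs merged
counts["low_usage"] += 1   # e.g. one rarely used rule removed
total_removed = sum(counts.values())
```

Keeping one counter per type means the total reduction can be recovered by summing the values.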

compare_initial_final_results(file_path: str) → None

Compares performance of initial vs final rules for a Random Forest.

Evaluates both rule sets on the test data, displays metrics, logs divergent cases, and writes a detailed report.

Parameters:

file_path – Path to the CSV test file.

execute_rule_analysis(file_path: str, remove_below_n_classifications: int = -1) → None

Evaluates RF rules on a dataset, detects redundancies, and refines the rule set.

This method:

  1. Classifies every sample to gather per-rule usage stats.

  2. Optionally removes low-usage rules (with sibling promotion).

  3. Tracks redundancy counts by type (intra_tree, inter_tree, low_usage).

  4. Writes a report and saves the final model.

Parameters:
  • file_path – Path to the CSV test file.

  • remove_below_n_classifications – Threshold for low-usage pruning (-1 disables).
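
The low-usage threshold can be sketched as a simple filter over per-rule classification counts. This is a minimal sketch under stated assumptions: the function name prune_low_usage and the dict of counts are hypothetical, the comparison is taken to be strict (count below the threshold) based only on the parameter name, and sibling promotion is left out entirely.

```python
from __future__ import annotations


def prune_low_usage(
    usage: dict[str, int], remove_below_n: int = -1
) -> tuple[list[str], list[str]]:
    """Split rule names into (kept, removed) by classification count.

    Assumption: a rule is removed when its count is strictly below the
    threshold. The source does not state how the boundary case is handled.
    """
    if remove_below_n < 0:
        # -1 disables pruning; this sketch treats any negative value the same way.
        return list(usage), []
    kept = [name for name, n in usage.items() if n >= remove_below_n]
    removed = [name for name, n in usage.items() if n < remove_below_n]
    return kept, removed


# Hypothetical per-rule classification counts gathered in the first step.
usage = {"rule_a": 10, "rule_b": 0, "rule_c": 2}
kept, removed = prune_low_usage(usage, remove_below_n=3)
```

In this sketch len(removed) is the number that would feed the low_usage counter.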

print_redundancy_summary() → None

Prints the redundancy breakdown and rule reduction summary.

track_from_adjust_and_remove(method: str, intra_tree_pairs: list, inter_tree_groups: list | None = None) → None

Updates redundancy counters after an adjust_and_remove cycle.

Called by RuleClassifier.execute_rule_analysis() after the duplicate-removal loop finishes for RF.

Parameters:
  • method – The removal method used ('soft', 'medium', 'hard').

  • intra_tree_pairs – Pairs found by find_duplicated_rules.

  • inter_tree_groups – Groups found by find_duplicated_rules_between_trees (only for 'hard').
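
The counter update described above might look roughly like the following stand-alone function. It is a sketch, not the library's implementation: the name track_counts is hypothetical, and the counting rules (one redundancy per intra-tree pair, k - 1 redundancies per inter-tree group of k rules) are assumptions. The only behaviour taken from the parameter descriptions is that inter-tree groups apply to the 'hard' method alone.

```python
from __future__ import annotations


def track_counts(
    counts: dict[str, int],
    method: str,
    intra_tree_pairs: list,
    inter_tree_groups: list | None = None,
) -> dict[str, int]:
    """Update redundancy counters in place and return them."""
    # Assumption: each merged sibling pair counts as one redundancy.
    counts["intra_tree"] += len(intra_tree_pairs)
    if method == "hard" and inter_tree_groups:
        # Assumption: a group of k duplicate rules keeps one representative,
        # so it contributes k - 1 redundant rules.
        counts["inter_tree"] += sum(len(group) - 1 for group in inter_tree_groups)
    return counts


counts = {"intra_tree": 0, "inter_tree": 0, "low_usage": 0}
track_counts(counts, "hard", [("r1", "r2")], [["r3", "r7", "r9"]])
```

Passing the same groups with 'soft' or 'medium' leaves the inter_tree counter unchanged in this sketch.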