GBDTAnalyzer
- class pyruleanalyzer.GBDTAnalyzer(classifier: RuleClassifier)
Gradient-Boosting-specific rule analysis and comparison.
This class extracts the GBDT analysis pipeline out of RuleClassifier and adds redundancy breakdown tracking so the output shows:
  intra_tree – boundary-redundant sibling pairs merged (within each tree)
  low_impact – rules with negligible contribution removed
  low_usage – rules removed because of low classification count
- classifier
The underlying RuleClassifier instance.
- redundancy_counts
Dict mapping redundancy type to count.
- compare_initial_final_results(file_path: str) -> None
Compares performance of initial vs final rules for GBDT.
Evaluates both rule sets on the test data, displays metrics, logs divergent cases, and writes a detailed report.
- Parameters:
file_path – Path to the CSV test file.
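The comparison described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the internals of pyruleanalyzer: the function name `compare_predictions`, its arguments, and the choice of accuracy as the only metric are assumptions made for illustration. It scores two prediction lists against the true labels and collects the indices where the initial and final rule sets disagree.

```python
def compare_predictions(y_true: list, initial_pred: list, final_pred: list) -> dict:
    """Hypothetical helper: compare two prediction lists against true labels."""
    if not (len(y_true) == len(initial_pred) == len(final_pred)):
        raise ValueError("all sequences must have the same length")
    n = len(y_true)
    if n == 0:
        raise ValueError("at least one sample is required")

    # Accuracy of each rule set (assumed metric; the real report may use others).
    initial_acc = sum(t == p for t, p in zip(y_true, initial_pred)) / n
    final_acc = sum(t == p for t, p in zip(y_true, final_pred)) / n

    # Divergent cases: samples where the two rule sets predict different classes.
    divergent = [i for i, (a, b) in enumerate(zip(initial_pred, final_pred)) if a != b]

    return {
        "initial_accuracy": initial_acc,
        "final_accuracy": final_acc,
        "divergent_indices": divergent,
    }


result = compare_predictions(
    y_true=[0, 1, 1, 0],
    initial_pred=[0, 1, 0, 0],
    final_pred=[0, 1, 1, 1],
)
print(result)
```

In this toy input both rule sets reach the same accuracy while disagreeing on two samples, which is why divergent cases are worth logging separately from the aggregate metrics.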
- execute_rule_analysis(file_path: str, remove_below_n_classifications: int = -1) -> None
Evaluates GBDT rules on a dataset, detects redundancies, and refines the rule set.
This method:
1. Classifies every sample to gather per-rule usage stats.
2. Optionally removes low-usage rules (with sibling promotion).
3. Tracks redundancy counts by type.
4. Writes a report and saves the final model.
- Parameters:
file_path – Path to the CSV test file.
remove_below_n_classifications – Threshold for low-usage pruning (-1 disables).
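The low-usage pruning step can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `prune_low_usage` and its arguments are hypothetical names, "below" is assumed to mean strictly fewer classifications than the threshold, and sibling promotion is omitted because the source does not describe how it works.

```python
from collections import Counter


def prune_low_usage(
    rules: list[str], fired_rules: list[str], threshold: int = -1
) -> tuple[list[str], list[str]]:
    """Hypothetical helper: split rules into (kept, removed) by usage count.

    `fired_rules` holds one rule name per classified sample.
    A negative threshold disables pruning, mirroring the -1 default.
    """
    if threshold < 0:
        return list(rules), []

    usage = Counter(fired_rules)  # rules that never fired count as 0
    kept = [r for r in rules if usage[r] >= threshold]
    removed = [r for r in rules if usage[r] < threshold]
    return kept, removed


rules = ["r1", "r2", "r3"]
fired = ["r1", "r1", "r2", "r1", "r3", "r3"]

kept, removed = prune_low_usage(rules, fired, threshold=2)
kept_all, removed_none = prune_low_usage(rules, fired)  # pruning disabled
print(kept, removed)
```

Here `r2` fires once, so a threshold of 2 removes it, while the default of -1 leaves the rule set untouched.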
- print_redundancy_summary() -> None
Prints the redundancy breakdown and rule reduction summary.
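A summary of this kind might look like the sketch below. The output layout, the function name `format_redundancy_summary`, and the rule counts passed in are all assumptions for illustration; the actual text printed by the library may differ. The dictionary keys follow the three redundancy types listed in the class description.

```python
def format_redundancy_summary(
    redundancy_counts: dict[str, int], initial_rules: int, final_rules: int
) -> str:
    """Hypothetical helper: build a redundancy breakdown and reduction line."""
    lines = ["Redundancy breakdown:"]
    for kind, count in redundancy_counts.items():
        lines.append(f"  {kind}: {count}")

    removed = initial_rules - final_rules
    # Guard against division by zero when there were no initial rules.
    pct = 100.0 * removed / initial_rules if initial_rules else 0.0
    lines.append(f"Rules: {initial_rules} -> {final_rules} ({pct:.1f}% reduction)")
    return "\n".join(lines)


summary = format_redundancy_summary(
    {"intra_tree": 4, "low_impact": 2, "low_usage": 1},
    initial_rules=40,
    final_rules=33,
)
print(summary)
```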
- track_from_adjust_and_remove(method: str, intra_tree_pairs: list) -> None
Updates redundancy counters after an adjust_and_remove cycle.
Called by RuleClassifier.execute_rule_analysis() after the duplicate-removal loop finishes for GBDT.
- Parameters:
method – The removal method used (‘soft’, ‘hard’).
intra_tree_pairs – Pairs found by find_duplicated_rules.
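The counter update can be pictured with a small stand-in class. `RedundancyTracker` is hypothetical and not part of pyruleanalyzer; the sketch assumes each pair adds one to the `intra_tree` counter and that the method name is only recorded, since the source does not say how `method` affects the counts.

```python
class RedundancyTracker:
    """Hypothetical stand-in for the counter bookkeeping in GBDTAnalyzer."""

    def __init__(self) -> None:
        # Keys follow the redundancy types named in the class description.
        self.redundancy_counts = {"intra_tree": 0, "low_impact": 0, "low_usage": 0}
        self.methods_seen: list[str] = []

    def track(self, method: str, intra_tree_pairs: list) -> None:
        # Assumption: one merged sibling pair counts as one intra_tree redundancy.
        self.redundancy_counts["intra_tree"] += len(intra_tree_pairs)
        self.methods_seen.append(method)


tracker = RedundancyTracker()
tracker.track("soft", [("rule_a", "rule_b"), ("rule_c", "rule_d")])
tracker.track("hard", [("rule_e", "rule_f")])
print(tracker.redundancy_counts)
```

Keeping the counts in a plain dict keyed by redundancy type matches the documented shape of `redundancy_counts` and makes the later summary step a simple iteration.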