### Feature Description

We need to add a confidence score to the ninja-lm outputs to reduce false positives. Proposed thresholds:

- Jailbreak: >= 98%
- Benign: >= 98%
- Flagged for review: between 92% and 98%
- False: less than 92%
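
To make the requested thresholds concrete, here is a minimal sketch of how they could be applied. It is not the ninja-lm API: the `triage` function, the `Decision` enum, and the constant names are all hypothetical. It assumes the model returns a predicted label (`"jailbreak"` or `"benign"`) together with a confidence in the range 0 to 1, that exactly 98% counts as accepted and exactly 92% counts as flagged for review, and that "false" means the prediction is not trusted. Each of these points would need to be confirmed before implementation.

```python
from enum import Enum

# Thresholds taken from the feature request; the names are hypothetical.
ACCEPT_THRESHOLD = 0.98  # at or above: trust the predicted label
REVIEW_THRESHOLD = 0.92  # at or above (but below accept): flag for review


class Decision(Enum):
    JAILBREAK = "jailbreak"
    BENIGN = "benign"
    REVIEW = "flagged_for_review"
    FALSE = "false"  # assumption: the prediction is not trusted


def triage(label: str, confidence: float) -> Decision:
    """Map a (label, confidence) pair to one of the requested outcomes.

    Assumes `label` is "jailbreak" or "benign" and `confidence` is in [0, 1].
    """
    if label not in ("jailbreak", "benign"):
        raise ValueError(f"unexpected label: {label!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence!r}")

    if confidence >= ACCEPT_THRESHOLD:
        return Decision.JAILBREAK if label == "jailbreak" else Decision.BENIGN
    if confidence >= REVIEW_THRESHOLD:
        return Decision.REVIEW
    return Decision.FALSE


if __name__ == "__main__":
    for label, score in [("jailbreak", 0.99), ("benign", 0.95), ("jailbreak", 0.50)]:
        print(label, score, triage(label, score).value)
```

Keeping the two cutoffs as named constants would let them be tuned later without touching the decision logic.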