Ethics of Artificial Intelligence in Crime

UC San Diego Data Science Capstone 2025

Introduction

The use of artificial intelligence (AI) in criminal justice, particularly in pretrial risk assessment, has raised concerns about bias. These algorithms predict the likelihood that a defendant will miss court or reoffend, affecting decisions such as bail eligibility. While intended to reduce human bias, they often reflect and amplify systemic inequalities in policing and arrest records. Ethical AI requires auditing models for biased error rates and outcomes. Without proper oversight, these systems can worsen inequalities instead of addressing them, ultimately eroding public trust in the legal system. Responsible AI in pretrial decisions must ensure fairness and accountability.

🚨 Three Main Risks of ML in Pretrial Risk Assessment

🔍 Biased Training Data

Over-policing in minority neighborhoods skews arrest records, leading to misleading crime rate predictions.

⚖️ Proxy Discrimination

Even when race is excluded, variables like ZIP code or prior arrests can act as hidden proxies, reinforcing bias.

🔄 Feedback Loop

Predictive models overestimate crime in minority areas, justifying more policing and creating a cycle of biased data.

This work examines racial bias in pretrial risk assessments, where "reoffend" labels, often based on rearrest rates, misrepresent actual criminal behavior due to systemic policing biases. Black individuals, for example, face higher arrest rates for minor offenses than White individuals. Using Scikit-learn and AIF360 (AI Fairness 360), we audit and mitigate bias across the machine learning pipeline. Pre-processing techniques rebalance training data, while post-processing adjustments calibrate decision thresholds to equalize error rates. By integrating fairness-aware methods, our approach enhances transparency and reduces discriminatory outcomes in pretrial decision-making.

Exploratory Data Analysis (EDA)

Our analysis utilizes the New York State Pretrial Release Dataset from 2023, comprising over 285,000 cases. This comprehensive dataset includes demographic information, criminal history, court proceedings, and pretrial outcomes, offering valuable insights into the pretrial release system in New York State. The dataset underwent thorough cleaning and preprocessing, including handling of missing values, standardization of categorical variables, and validation of temporal data.
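
The cleaning steps described above can be sketched in pandas roughly as follows; the file name and column names here are illustrative assumptions, not the dataset's actual schema.

```python
import pandas as pd

# Placeholder file name for the NYS pretrial release extract (assumption)
df = pd.read_csv("nys_pretrial_release_2023.csv")

# Drop rows missing key demographic or outcome fields (assumed column names)
df = df.dropna(subset=["race", "gender", "rearrest_type"])

# Standardize categorical labels: strip whitespace, use consistent casing
df["race"] = df["race"].str.strip().str.title()

# Temporal validation: keep only rows whose arrest date parses cleanly
df = df[pd.to_datetime(df["arrest_date"], errors="coerce").notna()]
```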

Rearrest Outcomes Distribution

This chart shows the distribution of rearrest outcomes during the pretrial period. 83.7% had no rearrest, 8.1% were rearrested for misdemeanors, 6.0% for non-violent felonies, and 2.2% for violent felonies.

Race Distribution

This chart shows the racial breakdown of cases by reoffending status. Among White individuals, 30% had no reoffense while 15% reoffended. For Black individuals, 25% had no reoffense and 20% reoffended. Asian individuals show 10% with no reoffense and 5% who reoffended, and individuals in the Other category show 5% with no reoffense and 3% who reoffended.

Ethnicity Distribution

This chart displays ethnicity distribution by reoffending status. Among Hispanic individuals, 35% had no reoffense while 12% reoffended. For Non-Hispanic individuals, 40% had no reoffense and 13% reoffended. The data shows similar reoffending proportions between the two ethnic categories.

Gender Distribution

This chart presents the gender breakdown by reoffending status. Males show 45% with no reoffense and 15% who reoffended; females show 30% with no reoffense and 10% who reoffended. Both genders have similar proportional reoffending rates, but males represent a larger portion of the overall cases.

Case Distribution by County

This map visualizes how cases are distributed across New York counties. Darker colors represent counties with higher case counts, while lighter colors indicate fewer cases. Hovering over a county displays its name and total case count.

Impact of Prior Offenses

This chart compares the reoffending rates between two groups: those with no prior offenses (6.01%) and those with prior offenses (10.18%). The bars represent the percentage of individuals in each category who reoffended during the pretrial period.

Model Development

Feature Engineering

The feature engineering process transformed raw pretrial data into predictive variables while minimizing bias. We created criminal history features capturing prior offense patterns, case characteristic variables describing current charges, and behavioral indicators reflecting past court compliance. We implemented log-transformation for skewed variables, one-hot encoding for categorical features, and standardization to ensure equal feature importance during model training.
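
A minimal scikit-learn sketch of this preprocessing, assuming hypothetical column names for the skewed count features and categorical case characteristics:

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, OneHotEncoder, StandardScaler

skewed_numeric = ["prior_misd_count", "prior_felony_count"]   # assumed column names
categorical = ["top_charge_severity", "gender"]               # assumed column names

preprocess = ColumnTransformer([
    # compress heavy right tails with log1p, then standardize to a comparable scale
    ("skewed", Pipeline([
        ("log", FunctionTransformer(np.log1p)),
        ("scale", StandardScaler()),
    ]), skewed_numeric),
    # one-hot encode categorical case characteristics
    ("categorical", OneHotEncoder(handle_unknown="ignore"), categorical),
])
```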

Random Forest Model Development

Our Random Forest Classifier used 100 trees and was trained on seven key features selected for predicting pretrial reoffending. The model handled both numerical and categorical data effectively while capturing complex feature interactions.

We binarized prior criminal history variables, converting count-based features into binary indicators (e.g., prior_misd_binary, prior_vfo_binary). The model was trained on a 50-30-20 split (training, validation, test), ensuring a balanced evaluation.

Hyperparameters included max_depth=6, n_estimators=100, and random_state=42 for reproducibility. The entropy criterion was used to maximize information gain, enhancing prediction reliability.
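
A sketch of this training setup, assuming X is the engineered feature DataFrame (with hypothetical prior-count columns) and y is the binary rearrest label:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Binarize count-based priors into indicator features (assumed source columns)
X["prior_misd_binary"] = (X["prior_misd_count"] > 0).astype(int)
X["prior_vfo_binary"] = (X["prior_vfo_count"] > 0).astype(int)

# 50/30/20 split: 50% training, then 30% validation and 20% test of the total
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.4, stratify=y_rest, random_state=42)

rf = RandomForestClassifier(
    n_estimators=100,      # 100 trees
    max_depth=6,
    criterion="entropy",   # split on information gain
    random_state=42,       # reproducibility
)
rf.fit(X_train, y_train)
```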

Model Performance

The Random Forest model achieved 83.27% overall accuracy, but balanced accuracy was only 50.67%, indicating poor performance on the minority class.

For non-reoffenders (Class 0), the model performed well (precision: 0.84, recall: 0.99, F1-score: 0.91). However, for reoffenders (Class 1), recall was extremely low (0.02), meaning most reoffenders were misclassified. The ROC-AUC score was 0.6914, showing moderate discriminatory power.

The model's imbalance favored non-reoffenders, highlighting the need for bias mitigation techniques to improve fairness and recall for reoffenders.
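
These metrics can be reproduced with scikit-learn, using the rf model and held-out test split from the sketch above:

```python
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             classification_report, roc_auc_score)

y_pred = rf.predict(X_test)
y_score = rf.predict_proba(X_test)[:, 1]   # predicted probability of reoffending

print("Accuracy:", accuracy_score(y_test, y_pred))
print("Balanced accuracy:", balanced_accuracy_score(y_test, y_pred))
print("ROC-AUC:", roc_auc_score(y_test, y_score))
print(classification_report(y_test, y_pred,
                            target_names=["No rearrest (0)", "Rearrest (1)"]))
```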

Feature Importance Analysis

Feature importance analysis revealed that age at arrest was the strongest predictor of pretrial reoffending.
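
A short sketch for inspecting the fitted model's impurity-based importances, assuming X_train is a DataFrame whose columns are the engineered features:

```python
import pandas as pd

# Rank features by the fitted Random Forest's impurity-based importance scores
importances = pd.Series(rf.feature_importances_, index=X_train.columns)
print(importances.sort_values(ascending=False))
```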

Addressing Imbalanced Labels

We applied Random Oversampling and SMOTE (Synthetic Minority Over-sampling Technique). SMOTE generates synthetic samples instead of duplicating existing ones, improving model generalization and preventing overfitting.

We used imbalanced_make_pipeline to integrate SMOTE with a Random Forest Classifier, setting class_weight='balanced' to adjust the model's learning based on class distribution.

We applied Stratified K-Fold Cross-Validation (k=5) and performed Grid Search to optimize n_estimators, max_depth, and random_state. Our final model prioritized recall for the minority class (reoffenders).

Best parameters found: max_depth=6, n_estimators=200, random_state=13.
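
A sketch of the oversampling pipeline and grid search described above; the candidate grid values are illustrative assumptions, and only the reported best parameters come from the original experiment:

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import make_pipeline as imbalanced_make_pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

pipeline = imbalanced_make_pipeline(
    SMOTE(random_state=42),                        # synthesize minority-class samples
    RandomForestClassifier(class_weight="balanced"),
)

# The report tuned n_estimators, max_depth, and random_state; candidate values assumed
param_grid = {
    "randomforestclassifier__n_estimators": [100, 200],
    "randomforestclassifier__max_depth": [4, 6, 8],
    "randomforestclassifier__random_state": [13, 42],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="recall",                              # prioritize recall for reoffenders
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
)
search.fit(X_train, y_train)
print(search.best_params_)
```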

After applying oversampling and fine-tuning, the model's ability to detect reoffenders (Class 1) improved. However, there were trade-offs in performance for non-reoffenders (Class 0).

                      Class 1 (Reoffenders)             Class 0 (Non-reoffenders)
                      Recall   Precision   F1-Score     Recall   Precision   F1-Score
Before Fine-Tuning    0.50     0.26        0.35         0.73     0.88        0.80
After Fine-Tuning     0.66     0.26        0.38         0.63     0.90        0.75

Summary Insights:
  • Class 1 Improvement: Recall increased from 0.50 to 0.66, meaning the model now captures more actual reoffenders.
  • Class 0 Trade-off: Precision slightly improved, but recall decreased from 0.73 to 0.63.
  • Overall Balance: Balanced Accuracy improved to 0.6466, and the ROC-AUC score was 0.6914.
  • Fine-tuning resulted in a better trade-off between precision and recall, improving the detection of reoffenders while maintaining reasonable accuracy for non-reoffenders.

Bias Mitigation

Reweighing (Pre-processing)

Reweighing assigns different weights to dataset instances based on group membership, ensuring that underrepresented groups contribute more significantly to the model's learning process.

Key Steps:

  • Assigns weights based on group membership (Race = 0 as unprivileged, Race = 1 as privileged).
  • Transforms the dataset by adjusting sample weights.
  • Trains the classifier on the reweighted dataset for fairer decision-making.
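
A minimal sketch of these steps with AIF360's Reweighing; the BinaryLabelDataset construction, the column names, and the choice of "no rearrest" as the favorable label are assumptions for illustration:

```python
from aif360.datasets import BinaryLabelDataset
from aif360.algorithms.preprocessing import Reweighing
from sklearn.ensemble import RandomForestClassifier

privileged_groups = [{"race": 1}]
unprivileged_groups = [{"race": 0}]

# train_df is assumed to be a fully numeric DataFrame holding the engineered
# features, a binary 'race' column, and the binary 'rearrest' label
train_bld = BinaryLabelDataset(
    df=train_df,
    label_names=["rearrest"],
    protected_attribute_names=["race"],
    favorable_label=0,       # no rearrest treated as the favorable outcome (assumption)
    unfavorable_label=1,
)

rw = Reweighing(unprivileged_groups=unprivileged_groups,
                privileged_groups=privileged_groups)
train_rw = rw.fit_transform(train_bld)    # adjusts instance weights, not feature values

clf = RandomForestClassifier(n_estimators=200, max_depth=6, random_state=13)
clf.fit(train_rw.features, train_rw.labels.ravel(),
        sample_weight=train_rw.instance_weights)
```
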
Calibrated Equalized Odds (Post-processing)

This method modifies model predictions after training to balance false positive and false negative rates across groups.

Key Steps:

  • Train the model on the original dataset (with oversampling applied).
  • Apply AIF360's calibration model to adjust predicted probabilities.
  • Ensure balanced false positive and false negative rates across racial groups.
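
A minimal sketch of these steps with AIF360's CalibratedEqOddsPostprocessing; the dataset objects and the "weighted" cost constraint are illustrative assumptions:

```python
from aif360.algorithms.postprocessing import CalibratedEqOddsPostprocessing

privileged_groups = [{"race": 1}]
unprivileged_groups = [{"race": 0}]

# val_true / test_true hold ground-truth BinaryLabelDatasets; val_pred / test_pred
# are copies whose scores and labels were filled in by the trained classifier
cpp = CalibratedEqOddsPostprocessing(
    privileged_groups=privileged_groups,
    unprivileged_groups=unprivileged_groups,
    cost_constraint="weighted",    # trade off FPR and FNR differences (assumption)
    seed=42,
)
cpp = cpp.fit(val_true, val_pred)          # learn mixing rates on the validation split
test_debiased = cpp.predict(test_pred)     # calibrated predictions for the test split
```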

Fairness Metrics: Impact of Reweighing

Before reweighing: The model was trained on the original dataset, which contained imbalances between the unprivileged and privileged groups. This led to disparities in fairness metrics, particularly in 1-min(DI, 1/DI), Statistical Parity Difference, and Average Odds Difference. The best Balanced Accuracy was achieved at a higher threshold of 0.49, but fairness trade-offs were observed.

  • Threshold: 0.49 [At a classification threshold of 0.49, the model reached its highest balanced accuracy.]
  • Best Balanced Accuracy: 0.6469 [The model correctly classified both groups 64.69% of the time.]
  • 1-min(DI, 1/DI): 0.0928 [A higher value indicates more disparity in disparate impact (DI).]
  • Average Odds Difference: 0.0248 [Measures the difference in true positive and false positive rates across groups.]
  • Statistical Parity Difference: 0.0408 [Shows how much the probability of a positive prediction differs between groups.]
  • Equal Opportunity Difference: 0.0059 [Measures the difference in true positive rates across groups.]
  • Theil Index: 0.1159 [Measures overall inequality in predictions; a lower value is preferred.]

Key Takeaway: These metrics suggest that before applying reweighing, the model treated unprivileged and privileged groups unevenly, particularly in disparate impact and statistical parity.

After reweighing: The training dataset was adjusted by assigning different weights to instances based on group membership. This adjustment aimed to balance the learning process and reduce disparities in fairness metrics. As a result, the best Balanced Accuracy was achieved at a lower threshold (0.14), meaning the model needed less confidence to classify positive cases, and fairness improved across key metrics.

  • Threshold: 0.14 [The model now reaches its highest balanced accuracy at a lower threshold, suggesting better calibration across groups.]
  • Best Balanced Accuracy: 0.6351 [A slight drop from 0.6469, but fairness improved.]
  • 1-min(DI, 1/DI): 0.0542 [Decreased from 0.0928, indicating reduced disparity in disparate impact.]
  • Average Odds Difference: 0.0079 [Improved, meaning the model now has a more balanced error rate across groups.]
  • Statistical Parity Difference: 0.0235 [Lower than before, showing a smaller gap in the probability of a positive prediction across groups.]
  • Equal Opportunity Difference: -0.0107 [True positive rates are now more closely aligned across groups.]
  • Theil Index: 0.1193 [Slightly increased from 0.1159, a small trade-off in entropy-based fairness.]

Key Takeaway:

  • Fairness improved across key metrics, particularly in disparate impact and average odds difference.
  • Balanced Accuracy decreased slightly (from 64.69% to 63.51%), but this trade-off was acceptable for improving fairness.
  • The lower classification threshold (from 0.49 to 0.14) suggests that the model now requires less confidence to classify positive cases, ensuring more equal treatment of both privileged and unprivileged groups.
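
The fairness metrics reported in this section can be computed with AIF360's ClassificationMetric; a sketch, assuming test_true holds the ground-truth labels and test_pred the corresponding model predictions as BinaryLabelDatasets:

```python
from aif360.metrics import ClassificationMetric

privileged_groups = [{"race": 1}]
unprivileged_groups = [{"race": 0}]

metric = ClassificationMetric(
    test_true, test_pred,
    unprivileged_groups=unprivileged_groups,
    privileged_groups=privileged_groups,
)

di = metric.disparate_impact()
print("1 - min(DI, 1/DI):", 1 - min(di, 1 / di))
print("Statistical parity difference:", metric.statistical_parity_difference())
print("Average odds difference:", metric.average_odds_difference())
print("Equal opportunity difference:", metric.equal_opportunity_difference())
print("Theil index:", metric.theil_index())
```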

Fairness Metrics: Impact of Calibrated Equalized Odds Postprocessing

Before Postprocessing
Dataset          GFPR Difference   GFNR Difference
Train Set         0.0171            0.0001
Validation Set    0.0167           -0.0006
Test Set          0.0136            0.0024

Before postprocessing, the differences in generalized false positive rates (GFPR) and generalized false negative rates (GFNR) indicate disparities in model predictions across groups. The validation set shows a 0.0167 GFPR difference, meaning privileged groups had a lower false positive rate than unprivileged groups. The GFNR differences are small, but even slight imbalances can affect fairness.

After Postprocessing
Dataset          GFPR Difference   GFNR Difference
Validation Set    0.0162           -0.0001
Test Set          0.0132            0.0031

After postprocessing, the GFPR differences decreased, with the test set showing an improved 0.0132 GFPR difference. The GFNR difference increased slightly in the test set, but overall disparities between groups decreased, meaning privileged and unprivileged groups were treated more equally in terms of misclassification rates.
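
The GFPR and GFNR differences above can be computed with AIF360's generalized rate metrics; a sketch, assuming test_true and the post-processed test_debiased from the earlier sketches:

```python
from aif360.metrics import ClassificationMetric

privileged_groups = [{"race": 1}]
unprivileged_groups = [{"race": 0}]

metric_test = ClassificationMetric(
    test_true, test_debiased,
    unprivileged_groups=unprivileged_groups,
    privileged_groups=privileged_groups,
)

# Generalized rates use predicted scores rather than hard labels;
# differences are reported as unprivileged minus privileged
gfpr_diff = (metric_test.generalized_false_positive_rate(privileged=False)
             - metric_test.generalized_false_positive_rate(privileged=True))
gfnr_diff = (metric_test.generalized_false_negative_rate(privileged=False)
             - metric_test.generalized_false_negative_rate(privileged=True))
print("GFPR difference:", gfpr_diff)
print("GFNR difference:", gfnr_diff)
```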

💡 Final Summary

Conclusion

This research examines the application of machine learning fairness techniques to mitigate racial bias in pretrial risk assessment algorithms. Our initial Random Forest model achieved 83.27% accuracy but demonstrated poor balanced accuracy (50.67%) and severely underperformed on the minority class of reoffenders, with age at arrest emerging as the strongest predictor. We implemented a bias mitigation strategy that combined dataset rebalancing through SMOTE, reweighing to equalize outcomes between privileged and unprivileged groups, and calibrated equalized odds to balance error rate differences. These interventions improved the model's balanced accuracy to 64.66% and significantly increased recall for reoffenders from 0.02 to 0.66, demonstrating more equitable performance across racial groups while highlighting the inherent trade-offs between accuracy and fairness.

Our findings highlight the challenges of predicting reoffending fairly and accurately. The improved model can help policymakers and criminal justice stakeholders develop more equitable risk assessment tools that reduce bias while maintaining model performance. This research contributes to the field of fairness in machine learning by demonstrating the trade-off between accuracy and fairness in decisions that affect many lives, and it emphasizes the importance of balancing model performance with ethical considerations. By applying fairness metrics and debiasing techniques, this study provides insight into addressing bias in predictive modeling within the criminal justice system.