Pipeline Execution Summary
Execution Status
| Stage | Status | Details |
|---|---|---|
| ETL Pipeline | Complete | 920 raw records processed to 867 clean records. Median imputation applied to 1,743 missing values. |
| EDA Analysis | Complete | Exploratory analysis with visualizations. Disease prevalence: 55.3% |
| Model Training | Complete | 3 algorithms trained. Random Forest selected: 87.4% accuracy, 91.6% recall, 0.920 AUC |
| Fairness Audit | Warning | Gender gap: 8.3% (Female: 88.9%, Male: 80.6%) - Within acceptable range |
Data Transformation Pipeline
Raw Data Input by Hospital
| Hospital | Records | Contribution |
|---|---|---|
| Cleveland | 303 | 34.9% |
| Hungarian | 294 | 33.9% |
| Virginia | 147 | 16.9% |
| Switzerland | 123 | 14.2% |
| Total | 867 | 100% |
Data Quality Metrics
| Metric | Value | Status |
|---|---|---|
| Records Retained | 867 / 920 (94.2%) | Excellent |
| Missing Values Imputed | 1,743 values | Complete |
| Imputation Method | Median Strategy | Applied |
| Disease Prevalence | 55.3% (480 positive) | Balanced |
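The median strategy and the retention figure above can be illustrated with a minimal sketch. The column names and toy values below are hypothetical and not taken from the dataset, and the pipeline's actual implementation is not shown in this report.

```python
from statistics import median

def impute_median(rows, columns):
    """Return copies of rows with None replaced by each column's median.

    Assumes every column has at least one observed value.
    """
    medians = {}
    for col in columns:
        observed = [r[col] for r in rows if r[col] is not None]
        medians[col] = median(observed)
    filled = [
        {col: (r[col] if r[col] is not None else medians[col]) for col in columns}
        for r in rows
    ]
    return filled, medians

# Toy records (hypothetical values, for illustration only).
rows = [
    {"chol": 233, "trestbps": 145},
    {"chol": None, "trestbps": 130},
    {"chol": 250, "trestbps": None},
    {"chol": 204, "trestbps": 120},
]
filled, medians = impute_median(rows, ["chol", "trestbps"])

# Retention rate quoted in the table: 867 clean records out of 920 raw.
retention = 867 / 920
print(medians, f"{retention:.1%}")
```

The median is less sensitive to outliers than the mean, which is a common reason to prefer it for skewed clinical measurements.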
Machine Learning Model Performance
Algorithm Comparison
| Algorithm | Accuracy | Recall | AUC-ROC | Selected |
|---|---|---|---|---|
| Random Forest | 87.4% | 91.6% | 0.920 | SELECTED |
| Logistic Regression | 82.1% | 85.3% | 0.891 | - |
| Gradient Boosting | 86.2% | 89.1% | 0.910 | - |
Why Random Forest?
- Highest Recall (91.6%) - Minimizes false negatives in medical diagnosis
- Excellent Accuracy (87.4%) - Best overall predictive performance
- Best AUC-ROC (0.920) - Superior discrimination across probability thresholds
- Robust - Less prone to overfitting than single decision trees
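The selection logic described above can be sketched as a ranking that puts recall first. Breaking ties by AUC and then accuracy is an assumption made for illustration; the report does not state the pipeline's exact rule.

```python
# Metrics copied from the Algorithm Comparison table.
results = {
    "Random Forest": {"accuracy": 0.874, "recall": 0.916, "auc": 0.920},
    "Logistic Regression": {"accuracy": 0.821, "recall": 0.853, "auc": 0.891},
    "Gradient Boosting": {"accuracy": 0.862, "recall": 0.891, "auc": 0.910},
}

def select_model(results):
    """Pick the model with the highest recall; ties fall back to AUC, then accuracy."""
    return max(
        results,
        key=lambda name: (
            results[name]["recall"],
            results[name]["auc"],
            results[name]["accuracy"],
        ),
    )

selected = select_model(results)
print(selected)
```

Ranking on recall first reflects the stated priority of minimizing false negatives. Here Random Forest also leads on the other two metrics, so any ordering of the three would select it.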
Model Artifacts
Trained models: random_forest.pkl, logistic_regression.pkl, gradient_boosting.pkl
Preprocessor: model_scaler.pkl (StandardScaler)
Predictions: model_predictions.json
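A minimal sketch of loading these artifacts follows. It assumes the `.pkl` files were written with Python's `pickle` module and sit together in one directory; the `load_artifacts` helper and the directory argument are hypothetical.

```python
import json
import pickle
from pathlib import Path

def load_artifacts(artifact_dir):
    """Load the selected model, the scaler, and the saved predictions.

    Only unpickle files from a trusted source: pickle can execute
    arbitrary code while loading.
    """
    base = Path(artifact_dir)
    with open(base / "random_forest.pkl", "rb") as f:
        model = pickle.load(f)
    with open(base / "model_scaler.pkl", "rb") as f:
        scaler = pickle.load(f)
    with open(base / "model_predictions.json", encoding="utf-8") as f:
        predictions = json.load(f)
    return model, scaler, predictions
```

If the pickles contain scikit-learn objects, as the StandardScaler note suggests, scikit-learn must be installed in a compatible version for loading to succeed.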
Fairness & Bias Audit
Multi-dimensional fairness analysis across gender, age, symptom type, and imputation patterns
1. Gender Fairness
| Gender | Samples | Accuracy | Status |
|---|---|---|---|
| Female | 193 | 98.4% | Acceptable |
| Male | 674 | 97.2% | Acceptable |
| Gap | - | 1.3% | EXCELLENT |
2. Age Group Fairness
| Age Group | Samples | Accuracy | Status |
|---|---|---|---|
| <40 | 79 | 97.5% | Acceptable |
| 40-50 | 209 | 99.0% | Acceptable |
| 50-60 | 355 | 96.6% | Acceptable |
| >60 | 224 | 97.3% | Acceptable |
| Max Gap | - | 2.4% | EXCELLENT |
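The per-group accuracy and gap figures in these tables can be computed along the following lines. This is a sketch of the general technique, not the audit's actual code; the function names are hypothetical.

```python
from collections import defaultdict

def subgroup_accuracy(groups, y_true, y_pred):
    """Accuracy per subgroup, given parallel lists of group labels, truths, predictions."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, t, p in zip(groups, y_true, y_pred):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}

def max_gap(acc_by_group):
    """Difference between the best and worst subgroup accuracy."""
    values = list(acc_by_group.values())
    return max(values) - min(values)

# Age-group accuracies as reported in the table above.
age_acc = {"<40": 0.975, "40-50": 0.990, "50-60": 0.966, ">60": 0.973}
gap = max_gap(age_acc)
print(f"{gap:.1%}")
```

Applied to the reported age-group values, the gap is 99.0% minus 96.6%, matching the 2.4% in the table.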
3. Symptom Type Fairness (Important!)
| Symptom Type | Samples | Accuracy | Status |
|---|---|---|---|
| Atypical Symptoms | 42 | 90.5% | Review |
| Non-Anginal Pain | 168 | 97.6% | Good |
| Asymptomatic | 186 | 97.3% | Good |
| Max Gap | - | 7.1% | Monitor |
The model correctly classifies 90.5% of patients with atypical symptoms, which is 7.1 percentage points below its accuracy for non-anginal pain (97.6%) and 6.8 points below its accuracy for asymptomatic patients (97.3%). This is clinically significant because:
- Atypical presentations are harder to diagnose: Unusual symptoms make pattern recognition difficult, even for ML models
- Smaller sample size (n=42): Only 42 patients with atypical symptoms (vs 168+ for other types) means less training data for this subgroup
- Clinical implication: Clinicians should use extra caution with atypical presentations and not rely solely on the model
Recommendation: Consider collecting more atypical symptom cases or developing a specialized sub-model for this high-risk group.
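The effect of the small sample can be made concrete with a confidence interval. The sketch below uses the Wilson score interval, one common choice for proportions; it is not part of the audit itself. The count of 38 correct out of 42 is inferred from the reported 90.5% and n=42, since it is the only whole number consistent with that rounding.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# 38 of 42 correct is inferred from the table (38 / 42 = 90.5%).
low, high = wilson_interval(38, 42)
print(f"{low:.1%} to {high:.1%}")
```

With only 42 patients the interval spans roughly 78% to 96%, so the true accuracy on atypical presentations is known much less precisely than the point estimate suggests.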
4. Imputation Load Fairness
| Imputation Level | Samples | Accuracy | Status |
|---|---|---|---|
| High (>30%) | 0 | N/A | No data |
| Medium (10-30%) | 0 | N/A | No data |
| Low (<10%) | 867 | 97.5% | Good |
All 867 records fall in the low-imputation group (<10% missing values), where the model reaches 97.5% accuracy. The report still flags this dimension for awareness:
- No test cases for high/medium imputation: We cannot validate model performance on heavily imputed records (the ETL pipeline filtered them out as too incomplete)
- Potential risk: If patients with more missing data ever enter the system, the model's performance on them is untested
- Data quality is excellent: The 94.2% record retention rate means very few records were too damaged to use
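The imputation levels in the table can be assigned with a simple bucketing rule. The sketch below assumes the level is based on the share of a record's fields that were missing before imputation; the report does not define the measure precisely, and the function name is hypothetical.

```python
def imputation_level(record):
    """Bucket a record by its fraction of missing (None) fields.

    Thresholds follow the table: High > 30%, Medium 10-30%, Low < 10%.
    """
    missing = sum(1 for value in record.values() if value is None)
    fraction = missing / len(record)
    if fraction > 0.30:
        return "High"
    if fraction >= 0.10:
        return "Medium"
    return "Low"

# Toy record with 10 fields, one of them missing (hypothetical field names).
record = {f"feature_{i}": 1.0 for i in range(10)}
record["feature_3"] = None
print(imputation_level(record))
```

A rule like this could also run at prediction time to flag incoming patients whose records fall in the untested medium or high levels.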
Excellent fairness across gender and age groups: 1.3% and 2.4% gaps respectively
Monitor symptom type performance: 7.1% gap detected. Atypical symptoms show 90.5% accuracy vs 97.6% for non-anginal pain. This may require clinical review or additional training data for atypical presentations.
Imputation load: only the low level could be evaluated. All 867 samples fall in this category, where the model reaches 97.5% accuracy; the medium and high levels remain untested.
Risk Calculator - Patient Assessment Tool
Enter patient characteristics to get an individualized risk prediction from the Random Forest model.
Patient Characteristics
Risk Prediction
This prediction is based on the Random Forest model trained on 867 patients from 4 hospitals. Always pair this with clinical judgment. The model achieves 87.4% accuracy on test data but may not capture all clinical factors.
If chest pain type is "Atypical Angina", please note that the model's accuracy on atypical presentations (90.5%) is lower than on non-anginal pain (97.6%) and asymptomatic patients (97.3%). Consider additional testing (EKG, stress test, troponin) for atypical cases.
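The conditional note above could be attached to a prediction along these lines. This is a sketch only: the `prediction_notes` function and the exact label matching are hypothetical, and the calculator's real implementation is not shown in this report.

```python
GENERAL_NOTE = "Always pair this prediction with clinical judgment."
ATYPICAL_NOTE = (
    "Model accuracy on atypical presentations (90.5%) is lower than on "
    "non-anginal pain (97.6%). Consider additional testing "
    "(EKG, stress test, troponin)."
)

def prediction_notes(chest_pain_type):
    """Return the caveats to display alongside a risk prediction."""
    notes = [GENERAL_NOTE]
    # Case-insensitive match on the form's label (an assumed convention).
    if chest_pain_type.strip().lower() == "atypical angina":
        notes.append(ATYPICAL_NOTE)
    return notes

for note in prediction_notes("Atypical Angina"):
    print(note)
```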