CDMP Fundamentals • 100 Questions • 90 Minutes

UrbanTransit's Predictive Maintenance Analytics Program

Big Data and Data Science • Hard

💼 Scenario

UrbanTransit operates a public transportation network of 3,000 buses and 500 rail cars serving 2 million daily passengers. Equipment failures cause an average of 15 service disruptions per week, costing $2 million in emergency repairs, replacement services, and lost revenue. The fleet generates 200 GB of sensor data daily from engine diagnostics, brake systems, HVAC units, and door mechanisms. The data science team has developed a predictive maintenance model using supervised learning on two years of historical failure data. The model predicts component failures 72 hours in advance with 78% accuracy and a 12% false positive rate. However, the maintenance department reports that the 12% false positive rate is causing unnecessary maintenance interventions that cost $500,000 per month, and the 22% miss rate (false negatives) still results in unexpected failures. The CTO wants to improve model accuracy to 92% with a false positive rate below 5%, integrate the predictions with the maintenance scheduling system for automated work order generation, and extend the prediction window from 72 hours to 7 days to improve maintenance planning. Additionally, the model must be fair across bus and rail fleet segments and not systematically underperform for older equipment that serves lower-income neighborhoods.
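The scenario quotes a 22% miss rate and a 12% false positive rate next to "78% accuracy". The 22% miss rate is the complement of a 78% detection rate (recall), so the scenario may be using "accuracy" loosely. The sketch below shows how these rates relate through a confusion matrix. The counts (100 components that failed, 1,000 that did not) are hypothetical, chosen only to reproduce the quoted rates; they are not UrbanTransit data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts for a binary failure predictor."""
    tp: int  # failures correctly predicted
    fp: int  # healthy components flagged as failing
    tn: int  # healthy components correctly left alone
    fn: int  # failures the model missed

    @property
    def recall(self) -> float:
        # Share of actual failures that were predicted.
        return self.tp / (self.tp + self.fn)

    @property
    def miss_rate(self) -> float:
        # Share of actual failures that were missed (1 - recall).
        return self.fn / (self.tp + self.fn)

    @property
    def false_positive_rate(self) -> float:
        # Share of healthy components wrongly flagged.
        return self.fp / (self.fp + self.tn)

    @property
    def precision(self) -> float:
        # Share of raised alerts that were real failures.
        return self.tp / (self.tp + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all predictions that were correct.
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total


# Hypothetical counts (assumption): 100 failures, 1,000 healthy components.
counts = ConfusionCounts(tp=78, fp=120, tn=880, fn=22)

print(f"recall:              {counts.recall:.2%}")
print(f"miss rate:           {counts.miss_rate:.2%}")
print(f"false positive rate: {counts.false_positive_rate:.2%}")
print(f"precision:           {counts.precision:.2%}")
print(f"accuracy:            {counts.accuracy:.2%}")
```

With these assumed counts, recall is 78% and the false positive rate is 12%, but overall accuracy is about 87% and precision is only about 39%, because healthy components far outnumber failing ones. When failures are rare, it helps to state a target such as "92% accuracy" in terms of a specific metric before comparing models.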

Question 1: What approach would MOST likely improve the model accuracy from 78% to the target 92%?

Question 2: The model must not systematically underperform for older equipment serving lower-income neighborhoods. What type of bias concern is this, and how should it be addressed?

Question 3: What CRISP-DM phase activities are MOST important when extending the prediction window from 72 hours to 7 days?