Modern distributed systems, like microservices and cloud-native architectures, are built to be scalable and reliable. However, their complexity can lead to unexpected failures. Chaos engineering is a useful way to test and improve system resilience by intentionally creating controlled failures. However, it can be costly due to resource usage, monitoring needs, and testing in production-like