“If something breaks, the application should be coded to fail gracefully and provide the relevant debugging information as part of its output. This behavior should be an immutable part of your application framework.”
Effective troubleshooting depends on being able to see what is happening inside your application. To do that, you must build in proper error handling. If something fails in your application, it should fail gracefully and provide clear error messaging that displays the data element involved in the failure.
Such a capability requires building functionality into your application framework to monitor the entire transaction life cycle, especially if microservices with complex dependencies are involved. You can embed such functionality through an agent that monitors everything that happens throughout the stack. You want to understand the journey of a transaction. Then, if something breaks, the application should be coded to fail gracefully and provide the relevant debugging information as part of its output. This behavior should be an immutable part of your application framework.
Machine learning can also be a useful tool for analyzing large volumes of operational data to detect problems or security issues. Machine learning is best at detecting trends that may point to a problem or an opportunity to optimize an application. Invariably, though, humans have to get involved because machine learning cannot provide adaptive solutions. When a machine learning algorithm encounters a scenario it does not know how to handle, it alerts a human, who can perform an evaluation and decide how best to handle that scenario. The human’s decision goes into the machine learning feedback loop so that over time, the machine is able to do more and pass less off to the human.
Machine learning is just an application that you provide with operational parameters. It is only as good as what you tell it to do. It will learn, but the learning needs validation.