Help. My application is down. What now?
In short:
- Error identification: Programmers search the logs to locate the cause of the application failure.
- Updating the application: Regular updates are essential to prevent compatibility issues and keep the application up and running.
- Local debugging: Developers step through the code to identify the problematic lines and prevent chain reactions.
Applications: we can't live without them. Many business processes stand or fall with how well these applications work. So when a business-critical application fails, it is crucial to get it up and running again as soon as possible. For the organization this is a critical issue, but for Thesio's programmers it is a fantastic challenge. Developer Joey is happy to share the five most important steps he takes to get an application back up and running.
1. Looking for the error
Code, software, syntax: it is all familiar territory for every developer and a piece of cake for the employees at Thesio. But even for experienced professionals, searching for the proverbial needle in a haystack is time-consuming and sometimes almost impossible. To avoid spending days on this, every decent application keeps at least a simple log that records errors, however minimally. In many cases, the log shows where, and in which part of the application, the critical error lies. But in some cases, there is only an “error, exception...” message, and then you have to keep searching.
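To give an idea of what searching a log can look like in practice, here is a minimal Python sketch. The log format, the keywords in `ERROR_PATTERN`, and the sample lines are assumptions made up for illustration; a real application will have its own format and its own log files.

```python
import re

# Assumed keywords that mark a problem line; adjust to your own log format.
ERROR_PATTERN = re.compile(r"\b(ERROR|CRITICAL|Exception|Traceback)\b")


def find_error_lines(lines):
    """Return (line_number, text) pairs for every line that looks like an error."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Made-up sample log, standing in for the contents of a real log file.
sample_log = [
    "2024-01-01 10:00:00 INFO Application started",
    "2024-01-01 10:00:05 ERROR Could not connect to database",
    "2024-01-01 10:00:06 INFO Retrying",
]

for number, text in find_error_lines(sample_log):
    print(f"line {number}: {text}")
```

For a real log you would pass an open file instead of the list, for example `find_error_lines(open("app.log"))`, where `app.log` is a hypothetical file name.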
2. Update the application
Technology changes rapidly. To keep up, an application must be updated regularly. If it is not, there is a risk that it is no longer compatible with (able to work together with) other applications, the browser, or the operating system, and malfunctions may occur. When we see that an application that is down is outdated, updating it to the latest version is therefore often one of the first things we do.
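Spotting what is outdated can be partly automated. The sketch below compares installed versions against the latest known versions. The component names and version numbers are invented for illustration, and the comparison assumes plain dotted numeric versions such as `2.10.1`; real projects usually rely on their package manager for this.

```python
def parse_version(text):
    """Turn '2.10.1' into (2, 10, 1) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_outdated(installed, latest):
    """Return {name: (installed_version, latest_version)} for outdated components."""
    outdated = {}
    for name, current in installed.items():
        newest = latest.get(name)
        if newest is not None and parse_version(current) < parse_version(newest):
            outdated[name] = (current, newest)
    return outdated


# Hypothetical components and versions, for illustration only.
installed_versions = {"webframework": "2.9.0", "dbdriver": "1.4.2"}
latest_versions = {"webframework": "2.10.1", "dbdriver": "1.4.2"}

print(find_outdated(installed_versions, latest_versions))
```

Parsing into tuples matters here: compared as plain strings, "2.9.0" would sort after "2.10.1", and the outdated component would be missed.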
3. Digging through the haystack
If steps 1 and 2 do not solve the problem, we take a wider view and start by splitting the application into its components (also known as services). This way, you don't have to dig through the whole haystack.
If you found an error in step 1, the log should also show which service registered it. Is the error in an API connection, the front end, or the back end? And is the problem in the server, the database, or the code? By checking the logs per component, you can rule out a large part of the code and search in a more targeted way.
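Checking the logs per component can be sketched as a simple count of errors per service. The service names, log levels, and entries below are assumptions for illustration; the idea is only that the component with the most errors is a sensible place to start.

```python
from collections import Counter


def count_errors_per_service(entries):
    """Count ERROR and CRITICAL entries per service.

    Each entry is assumed to be a (service, level, message) tuple.
    """
    counts = Counter()
    for service, level, _message in entries:
        if level in ("ERROR", "CRITICAL"):
            counts[service] += 1
    return counts


# Made-up log entries from three hypothetical services.
sample_entries = [
    ("frontend", "INFO", "Page rendered"),
    ("api", "ERROR", "Upstream request failed"),
    ("database", "ERROR", "Connection refused"),
    ("api", "ERROR", "Upstream request failed"),
    ("frontend", "INFO", "Page rendered"),
]

for service, total in count_errors_per_service(sample_entries).most_common():
    print(f"{service}: {total} error(s)")
```

In this made-up example the front end logs no errors at all, so that part of the code can be set aside while you look at the API and the database.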
4. Local debugging
Once you have narrowed the problem down to a clear location in the components and/or the code, you can start debugging. Debugging lets us check step by step whether the code actually does what it was intended to do, and shows clearly which line(s) are not working. This can be time-consuming if the code turns out not to be completely 'broken', but simply gives the wrong response or takes a very long time. A slow response like that can set off a chain reaction: other components are left waiting and, eventually, the server goes down.
For this kind of specialized work, knowledge of and experience in developing and debugging many different applications is very important. After all, you must be able to thoroughly understand the code, test whether it works, and understand why it behaves the way it does.
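The case where code is not broken but simply slow is hard to see by reading alone. One possible approach, sketched here in Python, is to time suspicious functions and record the ones that exceed a threshold. The decorator `warn_if_slow`, the thresholds, and the two example functions are hypothetical names and values chosen for illustration.

```python
import functools
import time

# Collected (function_name, elapsed_seconds) pairs for calls that were too slow.
slow_calls = []


def warn_if_slow(threshold_seconds):
    """Decorator that records calls taking longer than threshold_seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned normally or raised.
                elapsed = time.perf_counter() - start
                if elapsed > threshold_seconds:
                    slow_calls.append((func.__name__, elapsed))
        return wrapper
    return decorator


@warn_if_slow(threshold_seconds=0.01)
def slow_lookup():
    time.sleep(0.05)  # stands in for a slow query or remote call
    return "result"


@warn_if_slow(threshold_seconds=1.0)
def fast_lookup():
    return "result"


slow_lookup()
fast_lookup()
for name, elapsed in slow_calls:
    print(f"{name} took {elapsed:.3f} s")
```

Once a slow or misbehaving function has been found this way, Python's built-in `breakpoint()` can be placed inside it to pause execution and step through the code line by line.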
5. Prevention is better than cure
The moment the application no longer functions, you are actually already too late. In almost all cases, a failure can be prevented by carrying out proper maintenance and running recurring tests that generate a daily or weekly report of the errors found. Even if you manage to get the application up and running again, you have still lost a lot of time and money, because employees were unable to use the application and a developer was busy solving the problem. Management and maintenance should therefore be an important part of ICT policy, to prevent these types of situations and minimize the risk of a crash.
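A recurring test that produces a report can be quite small. The sketch below runs a set of checks and turns the results into a plain-text report. The check functions are placeholders: `check_homepage` always passes and `check_database` always fails, purely to show both outcomes. Scheduling the script daily or weekly (for example with cron or a CI job) is assumed to happen outside this code.

```python
from datetime import date


def run_checks(checks):
    """Run every check and return (name, status, detail) tuples."""
    results = []
    for name, check in checks.items():
        try:
            check()
            results.append((name, "OK", ""))
        except Exception as error:
            results.append((name, "FAILED", str(error)))
    return results


def format_report(results, report_date):
    """Build a plain-text report with one line per check."""
    lines = [f"Health report for {report_date.isoformat()}"]
    for name, status, detail in results:
        suffix = f" ({detail})" if detail else ""
        lines.append(f"- {name}: {status}{suffix}")
    return "\n".join(lines)


def check_homepage():
    """Placeholder check that always passes."""


def check_database():
    """Placeholder check that always fails, to show a failure in the report."""
    raise ConnectionError("database not reachable")


results = run_checks({"homepage": check_homepage, "database": check_database})
print(format_report(results, date.today()))
```

In a real setup, each check would exercise a part of the application, such as loading a page or running a test query, and the report would be mailed or stored so that errors are seen before users notice them.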