Error handling: Report YAML parse errors and continue if error setting #353
Open
harriscr wants to merge 1 commit into
This PR addresses two weaknesses in CBT error handling:

1. When a CBT YAML file is malformed, CBT exits without any helpful error message indicating where in the file the error is.
2. There is no configuration setting to change the value of continue-if-error when calling the underlying PDSH methods.

For 1, the YAML parsing exception is now caught and reported to the user, which allows them to see exactly where the error in their file is and fix it before trying again.

The problem in 2 was found when the FIO process running during a test was crashing or never starting. The only way we could work out what had gone wrong was by noticing oddities in the response curves from the benchmark run and then monitoring the individual processes on the system while the benchmark was running via CBT. A setting to stop on error would allow quicker and more efficient debugging of issues like this. The pdsh helper methods already support this, but there was no way to change the behaviour. A setting has been added to the common section of the YAML, and two of the example files have been updated to show this new setting as well.

There is potentially more that can be done in this area in the future, but this addresses our current pain point.

Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
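For readers unfamiliar with how a YAML parse failure can be turned into a located message, here is a minimal sketch. It is not the code in this PR. It assumes the configuration is parsed with PyYAML, and `load_config` and its `source` parameter are hypothetical names used only for illustration.

```python
import yaml  # PyYAML; assumed here to be the YAML library in use


def load_config(text, source="<config>"):
    """Parse YAML text; on failure raise ValueError naming the location.

    `load_config` and `source` are illustrative names, not CBT's API.
    """
    try:
        return yaml.safe_load(text) or {}
    except yaml.YAMLError as exc:
        message = f"Error parsing {source}"
        # Scanner and parser errors carry a problem_mark with a 0-based
        # line and column; other YAMLError subclasses may not.
        mark = getattr(exc, "problem_mark", None)
        if mark is not None:
            message += f" at line {mark.line + 1}, column {mark.column + 1}"
        problem = getattr(exc, "problem", None)
        if problem:
            message += f": {problem}"
        raise ValueError(message) from exc
```

Raising a single exception type with the position already formatted lets the caller print one clear message and exit, rather than showing a raw traceback.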
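The stop-on-error behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not CBT's pdsh helper: it runs a local command with `subprocess` instead of pdsh, and `run_checked`, `CommandFailed` and the `continue_if_error` parameter name are hypothetical.

```python
import subprocess
import sys


class CommandFailed(RuntimeError):
    """Raised when a command fails and continue_if_error is False."""


def run_checked(argv, continue_if_error=True):
    """Run a command; on a non-zero exit either warn and carry on, or raise.

    Hypothetical helper: a local stand-in for a remote pdsh call.
    """
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        detail = f"{argv!r} exited with {result.returncode}: {result.stderr.strip()}"
        if not continue_if_error:
            raise CommandFailed(detail)
        print(f"WARNING: {detail}", file=sys.stderr)
    return result


if __name__ == "__main__":
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    run_checked(failing, continue_if_error=True)  # warns, the run continues
    try:
        run_checked(failing, continue_if_error=False)
    except CommandFailed as exc:
        print(f"stopping: {exc}")
```

With the flag set to false, a crashed or never-started process stops the run at the failing command instead of showing up later as an oddity in the results.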
An example output for a malformed yaml: