add command line flag for system test result dumps#2967
add command line flag for system test result dumps#2967efd6 wants to merge 2 commits intoelastic:mainfrom
Conversation
| TestDumpPrefixFlagName = "dump" | ||
| TestDumpPrefixFlagDescription = "prefix for system test dump file" | ||
|
|
There was a problem hiding this comment.
Adding just one flag it forces to always set a prefix to the user, but probably they often need just a default value for that.
What about adding two flags to manage this feature?
- One flag to enable or not dumping the documents (
--dump-docs) without defining any path or prefix.- The default filename could be the same one used when the environment variable is set.
- Should this file be saved in the same location where
elastic-packageruns ? Take into account that this could change with the parameter-C.elastic-package -C test/packages/parallel/nginx test system -v
- Another flag to set a custom prefix (
--dump-docs-prefix) or even it could be set the whole path (--dump-docs-output) leaving to the user the choice to where save the output file. I think I'd prefer the latter.
# enable dumping documents and using the default filename for the path
elastic-package test system -v --dump-docs-
# enable dumping documents and using a custom path
elastic-package test system -v --dump-docs --dump-docs-output /path/to/dump.jsonDoing it like this, it would be similar on how coverage is set in elastic-package:
--test-coverageto enable the feature,--coverage-formatto choose which format is required.
There was a problem hiding this comment.
I think this comes down to a philosophical position; I think having a default location for things like this is a significant misfeature. IMO actions should require explicit requests. Including things like this in the config add complexity to the user experience and is also IMO a misfeature.
I'm not entirely sure what the gain is for users in adding this additional complexity.
There was a problem hiding this comment.
IMO I think most users (from my POV) would like just to download/dump the documents without really requiring setting any specific path or prefix. That's why I would say that the most common usage would be (not requiring to set any prefix there):
elastic-package test system -v --dump-docsMoreover, doing it like this, it would enable this feature and it could be coded to be equivalent to set the ELASTIC_PACKAGE_TEST_DUMP_SCENARIO_DOCS environment variable. So users setting the environment variable and setting that flag can expect the same exact behavior of elastic-package.
For the other flag I proposed, I thought there could be users that require some specific known file, for instance they are running tests with different packages and they want those files to be easily identified. That's my reasoning to add that new --dump-docs-output or --dump-docs-prefix flag, so they can decide how to save those documents.
# allow a full path
elastic-package test system -v --dump-docs --dump-docs-output /path/store/documents.json
# or as you proposed adding a prefix and save them in the same location where elastic-package runs
elastic-package test system -v --dump-docs --dump-docs-prefix "package-test"The concern I have of writing those new documents in the same location as elastic-package runs is that this could lead to users committing and pushing those files in the repository inadvertently.
There was a problem hiding this comment.
IMO I think most users (from my POV) would like just to download/dump the documents without really requiring setting any specific path or prefix.
This is certainly not what I'd want; doing this means that I cannot script obtaining the data.
In terms of work, specifying a location is easier than digging through logs to find the name that the program chose. In my ideal world, the output would not be to a file whose name was under the control of the program at all. IMO this should be entirely under the control of the user, but I made a concession to the current behaviour on the basis that it was still relatively easy to obtain the files if the prefix was known.
There was a problem hiding this comment.
Ok, got it.
Could it be changed what the value would indicate for this flag? Instead of setting a prefix for files, this flag could set the path to a folder.
I'm trying to be consistent with other commands/parameters used in elastic-package. For instance, in these commands --output is a path to a folder:
- dump stack logs:
elastic-package stack dump --output <path> - dump assets:
elastic-package dump agent-policies --output <path>
That folder should be checked (and created if needed) by elastic-package. And it would be the folder where the JSON files are stored.
Files stored in that folder could follow the same naming patterns as it was set when setting the environment variable:
dumpPath = filepath.Join(dumpFolderBase, fmt.Sprintf("elastic-package-test-docs-dump-%s.json", time.Now().Format("20060102150405")))So, elastic-package would be like this:
elastic-package test system -v --dump-logs "docs_folder_package/"Probably, doing it like this, the parameter could be renamed as --dump-logs-to.
There was a problem hiding this comment.
I think I'd also prefer this flag to specify a folder instead of a prefix.
Which behaviour would you prefer.
mkdir -p ${outdir} echo ${result} > ${outdir}/${prefix_slug}-$(date).ndjsonmkdir ${outdir} echo ${result} > ${outdir}/${prefix_slug}-$(date).ndjson(where
${prefix_slug}is determined without user input and$(data)is the current timestamp suffix logic).
I prefer the first behavior, as this is more convenient for users.
This is also what elastic-package does at the moment in most of the cases, so it would be consistent with current behaviors:
.../elastic-package$ git grep Mkdir | wc -l
65
.../elastic-package$ git grep MkdirAll | wc -l
60
.../elastic-package$ git grep MkdirTemp | wc -l
4Though it is also true that maybe scripts don't always try to create parent directories, so for this case both options would work for me if you prefer the second behavior.
There was a problem hiding this comment.
I would prefer the second; explicit is better than implicit.
There was a problem hiding this comment.
Ok!
So then, IIUC with that option, the command would be like:
elastic-package test system --dump-docs <folder>And elastic-package would create the (last) folder and use the same prefix (prefix_slug as mentioned above) for the filename as set with the environment variable (elastic-package-test-docs-dump-%s.json). Is that right ?
There was a problem hiding this comment.
That's correct, but only if there is a path to the location where the directory would be created.
There was a problem hiding this comment.
I prefer 1. If it's making any directories, why not make them all?
If I were doing this work in a script, I'd write mkdir -p, because in addition to wanting make sure I can write to the requested destination, I don't want to fail if the destination directory already exists.
Also fix case of failing tests without validation in the case that the dump fails.
💚 Build Succeeded
History
cc @efd6 |
| err := dumpScenarioDocs(scenario.docs, dumpPath) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to dump scenario docs: %w", err) | ||
| dumpErr = fmt.Errorf("failed to dump scenario docs: %w", err) |
There was a problem hiding this comment.
just for understanding, why we change the behavior, so now the error is not thrown but joined to the result of validateTestScenario. Does it make sense to run r.validateTestScenario if this fails?
There was a problem hiding this comment.
Why would it be relevant to the success of the test if some aspect of the dump fails? As an example, if the dump fails because of an fs error, do we now not care that the validation is not valid?
There was a problem hiding this comment.
that is the question i asked in order to understand the change. before, the test was failing if dump failed, isn't it? i am just asking to understand, lacking context here.
There was a problem hiding this comment.
Yes. I think that behaviour was unhelpful. (see this for context)
chrisberkhout
left a comment
There was a problem hiding this comment.
I think the best story here would be:
I can run elastic-package test system -h and see an option for dumping documents ingested during the run of a system test scenario. I run with that option, possibly with a value, and know exactly where the documents will be written.
Several things were/are blocking that:
- (Old env var implementation) A temp dir has an unpredictable name.
- If you give a name prefix you still need to know the suffix.
- If the suffix includes a timestamp, you need to know the time it was run (may be okay if you only want the latest).
- If your system test run covers multiple scenarios, you need to find the file (timestamp) that corresponds to the run you're interested in (or if you want all, you still need to know which is which).
I think having a default is good for the manual dev workflow, and having a destination parameter is good for scripting.
If the option took a pattern and had a default like ./system-test-docs-%DATASTREAM-%SCENARIO-%TIMESTAMP.json, I think that would be getting pretty close. The default value also serves as compact documentation of the manual option.
System test scenarios don't have names, but the scenario files need to match ^test-[a-z0-9_.-]+-config\.yml$, so I guess we'd say the scenario defined in test-cel-bad-creds-config.yml is called cel-bad-creds (unique for a data stream).
If the user asks for an output location that doesn't exist, then it's probably good to create it for them. But I think no directories or all directories are both okay options.
It might be worth doing something to avoid the confusion of an earlier scenario's dump being overwritten by a later scenario in the same run. Possibly appending rather than overwriting and possibly having a wrapper with some metadata, or logging when a file exists and possibly adding a -COUNTER suffix.
Philosophical position:
Simplicity is what's important, especially simplicity for the user. UNIX does a version of it well.
An option to say "write docs here" isn't simple if you still have to look at the code or the logging or all recently created files to know where the docs are.
Please take a look.