
[DSIP-99][TaskPlugin] Save task output to a separate file #18098

Open
Mrhs121 wants to merge 4 commits into apache:dev from Mrhs121:DSIP-99

Conversation

Mrhs121 (Contributor) commented Mar 26, 2026

Was this PR generated or assisted by AI?

Purpose of the pull request

close #17791

Brief change log

Verify this pull request

This pull request is code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(or)

Pull Request Notice

If your pull request contains incompatible change, you should also add it to docs/docs/en/guide/upgrade/incompatible.md

github-actions bot added the UI, backend, and test labels Mar 26, 2026
Mrhs121 changed the title from "[DSIP-99] initial commit" to "[DSIP-99][TaskPlugin] Save task output to a separate file" Mar 26, 2026
Mrhs121 (Contributor, Author) commented Mar 26, 2026

initial commit

SbloodyS added the DSIP label Mar 26, 2026
SbloodyS added this to the 3.4.2 milestone Mar 26, 2026
Mrhs121 (Contributor, Author) commented Mar 26, 2026

This work was done a long time ago. When I rebased the code just now, there were many conflicts. After using AI to resolve them automatically, I found many errors, so I need to recheck them.


@Test
public void testQueryLogInSpecifiedProject() {
    long projectCode = 1L;
Contributor Author

Commit dc38d34 removed the related service, so I also removed the now-unused code from the test.

SbloodyS (Member) left a comment

Is this PR ready to review? @Mrhs121

Mrhs121 (Contributor, Author) commented Apr 7, 2026

Is this PR ready to review? @Mrhs121

Not yet. I haven't tested it myself.

SbloodyS (Member) commented Apr 7, 2026

Feel free to ping me when you are ready to review. @Mrhs121

Mrhs121 force-pushed the DSIP-99 branch 2 times, most recently from 0100189 to 111aeca April 9, 2026 10:45
Mrhs121 (Contributor, Author) commented Apr 9, 2026

Self-test results of common tasks

  • stored procedure task
    (image: Screenshot 2026-04-09 16 02 11)
  • http task
    (image: Screenshot 2026-04-09 15 53 21)
  • sql task
  • shell task
  • spark task

sonarqubecloud commented

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 60%)

See analysis details on SonarQube Cloud

Comment on lines +22 to +40
public enum TaskLogType {

    LOG {

        @Override
        public String getLogPath(TaskInstance taskInstance) {
            return taskInstance.getLogPath();
        }
    },
    OUTPUT {

        @Override
        public String getLogPath(TaskInstance taskInstance) {
            return taskInstance.getTaskOutputLogPath();
        }
    };

    public abstract String getLogPath(TaskInstance taskInstance);
}
Member

It's better to split this kind of logic out of the enum into a separate class.
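
A minimal sketch of what such a split could look like, assuming the enum keeps only its constants and a separate resolver maps a task instance to a path. `TaskInstanceStub` and `TaskLogPathResolver` are hypothetical names used for illustration only; `getLogPath()` and `getTaskOutputLogPath()` are the two accessors that appear in the diff above.

```java
import java.util.Objects;

// Hypothetical stand-in for TaskInstance, reduced to the two path fields discussed here.
final class TaskInstanceStub {
    private final String logPath;
    private final String taskOutputLogPath;

    TaskInstanceStub(String logPath, String taskOutputLogPath) {
        this.logPath = logPath;
        this.taskOutputLogPath = taskOutputLogPath;
    }

    String getLogPath() {
        return logPath;
    }

    String getTaskOutputLogPath() {
        return taskOutputLogPath;
    }
}

// The enum stays a plain list of constants with no behavior attached.
enum TaskLogType {
    LOG, OUTPUT
}

// Hypothetical resolver: the task-instance-to-path logic lives here instead of in the enum.
final class TaskLogPathResolver {

    private TaskLogPathResolver() {
    }

    static String resolve(TaskInstanceStub taskInstance, TaskLogType logType) {
        Objects.requireNonNull(taskInstance, "taskInstance");
        Objects.requireNonNull(logType, "logType");
        switch (logType) {
            case LOG:
                return taskInstance.getLogPath();
            case OUTPUT:
                return taskInstance.getTaskOutputLogPath();
            default:
                throw new IllegalArgumentException("Unsupported log type: " + logType);
        }
    }
}
```

With this shape, adding a new log type means adding a constant and one `case`, and the enum itself no longer depends on the task instance class.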

host varchar(135) DEFAULT NULL,
execute_path varchar(200) DEFAULT NULL,
log_path longtext DEFAULT NULL,
task_output_log_path longtext DEFAULT NULL,
Member

Suggested change
- task_output_log_path longtext DEFAULT NULL,
+ task_output_log_path varchar(255) DEFAULT NULL,

Don't use longtext

Comment on lines +102 to +104
<logger name="TaskOutput" level="INFO" additivity="false">
    <appender-ref ref="TASKOUTPUTLOGFILE"/>
</logger>
Member

Move under root?

Comment on lines 205 to +229
@@ -205,13 +214,19 @@ private Optional<CompletableFuture<?>> collectPodLogIfNeeded() {
                String line;
                try (BufferedReader reader = new BufferedReader(new InputStreamReader(watcher.getOutput()))) {
                    while ((line = reader.readLine()) != null) {
                        log.info("[K8S-pod-log-{}]: {}", taskRequest.getTaskName(), line);
                        if (StringUtils.isBlank(taskRequest.getTaskOutputLogPath())) {
                            log.info("[K8S-pod-log-{}]: {}", taskRequest.getTaskName(), line);
                        } else {
                            TASK_OUTPUT_LOGGER.info(line);
                        }
                    }
                }
            }
        } catch (Exception e) {
            log.error("Collect pod log error", e);
            throw new RuntimeException(e);
        } finally {
            LogUtils.removeTaskInstanceLogFullPathMDC();
Member

We shouldn't need this; one logger is enough.
There is no need for the check below.

if (StringUtils.isBlank(taskRequest.getTaskOutputLogPath())) {
    log.info("[K8S-pod-log-{}]: {}", taskRequest.getTaskName(), line);
} else {
    TASK_OUTPUT_LOGGER.info(line);
}

Comment on lines +245 to +257
try (
        LogUtils.MDCAutoClosableContext ignored =
                LogUtils.withTaskOutputLogPathMDC(taskRequest.getTaskOutputLogPath())) {
    for (String line : (Iterable<String>) inReader.lines()::iterator) {
        if (StringUtils.isBlank(taskRequest.getTaskOutputLogPath())) {
            log.info(" -> {}", line);
        } else {
            TASK_OUTPUT_LOGGER.info(line);
        }
        taskOutputParameterParser.appendParseLog(line);
    }
} finally {
    LogUtils.removeTaskInstanceLogFullPathMDC();
Member

In which case can taskRequest.getTaskOutputLogPath() be empty?

Comment on lines +77 to +82
public Result<ResponseTaskLog> queryTaskLog(@Parameter(hidden = true) @RequestAttribute(value = Constants.SESSION_USER) User loginUser,
                                            @RequestParam(value = "taskInstanceId") int taskInstanceId,
                                            @RequestParam(value = "skipLineNum") int skipNum,
                                            @RequestParam(value = "limit") int limit) {
    return loggerService.queryTaskLog(loginUser, taskInstanceId, skipNum, limit);
}
Member

Adding a logType request parameter is enough; don't add a new API.
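
One possible way to handle such a parameter is sketched below. It assumes the parameter arrives as a plain string and that a missing value should fall back to the regular task log so that existing callers keep working; that fallback is an assumption of this sketch, not something decided in the review. `RequestedLogType` and `fromParam` are hypothetical names.

```java
import java.util.Locale;

// Hypothetical enum used only in this sketch; the PR's own enum is TaskLogType.
enum RequestedLogType {
    LOG, OUTPUT;

    // Parses the raw request parameter. A missing or blank value falls back to LOG
    // (an assumption made here for backward compatibility with existing clients).
    static RequestedLogType fromParam(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return LOG;
        }
        try {
            return valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unknown logType: " + raw, e);
        }
    }
}
```

The existing endpoint can then accept the extra parameter and pass the parsed value down, instead of exposing a second endpoint with the same paging arguments.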

Comment on lines +44 to 50
public TaskInstanceLogFileDownloadResponse getTaskLog(TaskInstance taskInstance) {
    return getLocalWholeLog(taskInstance, TaskLogType.LOG);
}

public TaskInstanceLogFileDownloadResponse getTaskOutput(TaskInstance taskInstance) {
    return getLocalWholeLog(taskInstance, TaskLogType.OUTPUT);
}
Member

Suggested change
- public TaskInstanceLogFileDownloadResponse getTaskLog(TaskInstance taskInstance) {
-     return getLocalWholeLog(taskInstance, TaskLogType.LOG);
- }
- public TaskInstanceLogFileDownloadResponse getTaskOutput(TaskInstance taskInstance) {
-     return getLocalWholeLog(taskInstance, TaskLogType.OUTPUT);
- }
+ public TaskInstanceLogFileDownloadResponse getTaskLog(TaskInstance taskInstance, TaskLogType logType) {
+     return getLocalWholeLog(taskInstance, logType);
+ }

Comment on lines +62 to 68
public TaskInstanceLogPageQueryResponse getTaskLog(TaskInstance taskInstance, int skipLineNum, int limit) {
    return getLocalPartLog(taskInstance, skipLineNum, limit, TaskLogType.LOG);
}

public TaskInstanceLogPageQueryResponse getTaskOutput(TaskInstance taskInstance, int skipLineNum, int limit) {
    return getLocalPartLog(taskInstance, skipLineNum, limit, TaskLogType.OUTPUT);
}
Member

Suggested change
- public TaskInstanceLogPageQueryResponse getTaskLog(TaskInstance taskInstance, int skipLineNum, int limit) {
-     return getLocalPartLog(taskInstance, skipLineNum, limit, TaskLogType.LOG);
- }
- public TaskInstanceLogPageQueryResponse getTaskOutput(TaskInstance taskInstance, int skipLineNum, int limit) {
-     return getLocalPartLog(taskInstance, skipLineNum, limit, TaskLogType.OUTPUT);
- }
+ public TaskInstanceLogPageQueryResponse getTaskLog(TaskInstance taskInstance, int skipLineNum, int limit, TaskLogType logType) {
+     return getLocalPartLog(taskInstance, skipLineNum, limit, logType);
+ }
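
To illustrate the single-method design for both the whole-file and the paged query, here is a rough in-memory sketch. `LocalLogClientSketch` and `SketchLogType` are hypothetical names, and the lines are held in a map rather than read from disk, which is a simplification made only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical enum for this sketch; it mirrors the LOG/OUTPUT split in the PR.
enum SketchLogType {
    LOG, OUTPUT
}

// Hypothetical client: one getTaskLog per query shape, with the log type passed as an argument.
final class LocalLogClientSketch {

    private final Map<SketchLogType, List<String>> linesByType = new EnumMap<>(SketchLogType.class);

    // Test helper that stands in for the files a real client would read.
    void put(SketchLogType logType, List<String> lines) {
        linesByType.put(logType, new ArrayList<>(lines));
    }

    // Whole-log query: returns every line stored for the requested type.
    List<String> getTaskLog(SketchLogType logType) {
        return Collections.unmodifiableList(
                linesByType.getOrDefault(logType, Collections.emptyList()));
    }

    // Paged query: skips skipLineNum lines, then returns at most limit lines.
    List<String> getTaskLog(SketchLogType logType, int skipLineNum, int limit) {
        List<String> all = getTaskLog(logType);
        int from = Math.min(Math.max(skipLineNum, 0), all.size());
        // Widen to long so a very large limit cannot overflow the end index.
        int to = (int) Math.min((long) from + Math.max(limit, 0), all.size());
        return new ArrayList<>(all.subList(from, to));
    }
}
```

Both overloads go through the same lookup, so the only thing that varies between task log and task output is the `logType` argument.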

Mrhs121 (Contributor, Author) commented Apr 13, 2026

@ruanwenjun Thanks for the review. I've summarized the feedback below. These points make sense, and I'll update the PR accordingly.

  1. Unify the API by adding a logType parameter instead of introducing separate task output endpoints.
  2. Merge LocalLogClient into a single getTaskLog(..., logType) design for both full and paged queries.
  3. Refactor task output logging to remove the empty-path check and dual-logger branching in AbstractCommandExecutor.
  4. Keep TaskLogType lightweight and move task-instance-to-path resolution into a separate resolver/util class.
  5. Change task_output_log_path from longtext to a bounded string type such as varchar(255) across all related SQL definitions.

Regarding this idea: #17791 (comment).

My understanding is that the concern is not only about adding another file path field for task output, but also about the long-term storage model.

Instead of storing separate file paths like log_path and task_output_log_path, a more extensible approach would be to store a single directory path such as task_out_path, and place all task-instance-generated files under it, for example log, output, and possibly other generated files in the future.

For example:

  • ${task_out_path}/task.log
  • ${task_out_path}/task.out
  • ${task_out_path}/stderr.log
  • ${task_out_path}/result.json
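
The layout listed above could be expressed roughly as follows. This is only a sketch of the idea: `TaskFileKind` and `resolveUnder` are hypothetical names, and the file names are taken from the list above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical enum: each kind of task-instance file maps to a fixed name under task_out_path.
enum TaskFileKind {
    LOG("task.log"),
    OUTPUT("task.out"),
    STDERR("stderr.log"),
    RESULT("result.json");

    private final String fileName;

    TaskFileKind(String fileName) {
        this.fileName = fileName;
    }

    String fileName() {
        return fileName;
    }

    // Resolves this file inside the given task output directory.
    Path resolveUnder(Path taskOutPath) {
        return taskOutPath.resolve(fileName);
    }
}
```

With a single stored directory, only `task_out_path` would need to be persisted, and a new kind of generated file would not require another database column.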

Comment on lines +138 to +159
private void logHttpResponse(String message, int statusCode, String checkCondition, String body) {
    if (StringUtils.isBlank(taskExecutionContext.getTaskOutputLogPath())) {
        if (checkCondition == null) {
            log.info(message, httpParameters.getUrl(), statusCode, body);
        } else {
            log.error(message, httpParameters.getUrl(), statusCode, checkCondition, body);
        }
        return;
    }
    LogUtils.setTaskInstanceLogFullPathMDC(taskExecutionContext.getLogPath());
    try (
            LogUtils.MDCAutoClosableContext ignored =
                    LogUtils.withTaskOutputLogPathMDC(taskExecutionContext.getTaskOutputLogPath())) {
        if (checkCondition == null) {
            TASK_OUTPUT_LOGGER.info(message, httpParameters.getUrl(), statusCode, body);
        } else {
            TASK_OUTPUT_LOGGER.info(message, httpParameters.getUrl(), statusCode, checkCondition, body);
        }
    } finally {
        LogUtils.removeTaskInstanceLogFullPathMDC();
    }
}
Member

Logic like this should not be customized by each task plugin; it should be handled at the SPI level.
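
A rough sketch of what handling this at the SPI level might mean: the base class owns the routing of output lines, and a plugin only calls one method. Everything here is hypothetical (`AbstractTaskSketch`, `HttpTaskSketch`, `emitOutput`), and a `Consumer<String>` stands in for the dedicated task-output logger so the example stays self-contained.

```java
import java.util.Objects;
import java.util.function.Consumer;

// Hypothetical base class standing in for the task SPI.
abstract class AbstractTaskSketch {

    // Stand-in for the dedicated task-output logger configured by the framework.
    private final Consumer<String> outputSink;

    protected AbstractTaskSketch(Consumer<String> outputSink) {
        this.outputSink = Objects.requireNonNull(outputSink, "outputSink");
    }

    // Plugins call this single method; they never choose a logger or check a path.
    protected final void emitOutput(String line) {
        outputSink.accept(line);
    }

    abstract void run();
}

// Hypothetical plugin: no branching on output paths, no MDC handling.
final class HttpTaskSketch extends AbstractTaskSketch {

    HttpTaskSketch(Consumer<String> outputSink) {
        super(outputSink);
    }

    @Override
    void run() {
        emitOutput("status=200 body=ok");
    }
}
```

In this shape the path check and the logger selection exist in one place, so a plugin such as the HTTP task would not need its own `logHttpResponse` branching.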

</appender>
</sift>
</appender>
<appender name="TASKOUTPUTLOGFILE" class="ch.qos.logback.classic.sift.SiftingAppender">
Member

We should add an automatic cleaning policy like other logs.

Labels

backend, DSIP, test, UI

Development

Successfully merging this pull request may close these issues.

[WIP][DSIP-99][TaskPlugin] Save task output to a separate file

3 participants