Separate checker for communication tasks

Consider a communication task where you make queries, and your score is dependent on the number of queries that you make (e.g. let $S=Q^\*/Q$ where $Q$ is the number of queries you make and $Q^*$ is the number of queries of some admin-provided solution). 

Currently, AFAICT the possible ways to do this are:

1. Have $Q^\*$ be part of the testcase input file. This can be done in a transparent way to the contestant because the communication will not involve $Q^\*$. However, this breaks down for user tests, where the contestants need to know to include some arbitrary $Q^\*$ in their input, and then reverse the scoring function to find how many queries they used (since they don't know what the true value of $Q^\*$ is).
2. Have the communication manager run the optimal solution inline to compute $Q^\*$ after it finishes communicating with the contestant's solution. This is quite wasteful since we're essentially running both the contestant's solution and the admin solution for every submission, even though the admin solution is static.
3. Have a batch task, where:
    a. $Q^\*$ is stored in the testcase "correct" outputs.
    b. A stub does the "communication" part of the problem, and writes $Q$ to stdout at the end
    c. A checker reads $Q$ from the contestant's output and $Q^\*$ from the correct output, and finds the score as $Q^\*/Q$.
  This is bad because the contestant's solution and the grader are running in the same process so the contestant could access restricted data.

Instead, I propose:
1. Communication task types get an extra parameter, `checker`, which is either `manager` or `separate` (or something along those lines)
2. In the `manager` case, the behavior is identical to the current one, i.e. the manager prints a standard manager output and the score is based on that.
3. In the `separate` case, admins provide both a `manager` executable as well as a `checker` executable. The `manager` writes to stdout any text, (this would be the "contestant output", in my example this would be $Q$). Then, the `checker` executable would run in the same way as for `batch` tasks, i.e. compare the contestant output ($Q$) with the correct output ($Q^\*$) and find a score

This is basically the same as (3) above, but keeping the nice properties of the separation provided by communication tasks. Note that we can make this a completely new task type instead (since the behavior is significantly different depending on the value of the `checker` task type parameter).

(If we do this, then batch tasks become a strict subset of communication tasks and in theory can be replaced with a "default" manager which just redirects `stdin -> fifo to contestant` and `fifo from contestant -> stdout`, but this would add some overhead and lots of SIGPIPE headaches so unifying batch and communication is probably both out of scope and not worth it)

I've seen a few problems where this would be beneficial, but all of them where of the same shape (i.e. the scoring formula depends on some constant which is not given to the contestants)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Separate checker for communication tasks #1689

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Separate checker for communication tasks #1689

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions