Skip to content

Separate checker for communication tasks #1689

@cfalas

Description

@cfalas

Consider a communication task where you make queries, and your score is dependent on the number of queries that you make (e.g. let $S=Q^*/Q$ where $Q$ is the number of queries you make and $Q^*$ is the number of queries of some admin-provided solution).

Currently, AFAICT the possible ways to do this are:

  1. Have $Q^*$ be part of the testcase input file. This can be done in a transparent way to the contestant because the communication will not involve $Q^*$. However, this breaks down for user tests, where the contestants need to know to include some arbitrary $Q^*$ in their input, and then reverse the scoring function to find how many queries they used (since they don't know what the true value of $Q^*$ is).
  2. Have the communication manager run the optimal solution inline to compute $Q^*$ after it finishes communicating with the contestant's solution. This is quite wasteful since we're essentially running both the contestant's solution and the admin solution for every submission, even though the admin solution is static.
  3. Have a batch task, where:
    a. $Q^*$ is stored in the testcase "correct" outputs.
    b. A stub does the "communication" part of the problem, and writes $Q$ to stdout at the end
    c. A checker reads $Q$ from the contestant's output and $Q^*$ from the correct output, and finds the score as $Q^*/Q$.
    This is bad because the contestant's solution and the grader are running in the same process so the contestant could access restricted data.

Instead, I propose:

  1. Communication task types get an extra parameter, checker, which is either manager or separate (or something along those lines)
  2. In the manager case, the behavior is identical to the current one, i.e. the manager prints a standard manager output and the score is based on that.
  3. In the separate case, admins provide both a manager executable as well as a checker executable. The manager writes to stdout any text, (this would be the "contestant output", in my example this would be $Q$). Then, the checker executable would run in the same way as for batch tasks, i.e. compare the contestant output ($Q$) with the correct output ($Q^*$) and find a score

This is basically the same as (3) above, but keeping the nice properties of the separation provided by communication tasks. Note that we can make this a completely new task type instead (since the behavior is significantly different depending on the value of the checker task type parameter).

(If we do this, then batch tasks become a strict subset of communication tasks and in theory can be replaced with a "default" manager which just redirects stdin -> fifo to contestant and fifo from contestant -> stdout, but this would add some overhead and lots of SIGPIPE headaches so unifying batch and communication is probably both out of scope and not worth it)

I've seen a few problems where this would be beneficial, but all of them where of the same shape (i.e. the scoring formula depends on some constant which is not given to the contestants)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions