Use validator output if it exits first, for interactive problems #77

simonlindholm · 2018-04-10T23:01:23Z

As per the commit message:

Before this change, if a validator detected, say, an invalid input
format and exited, it would close its output pipe, and the submission
would proceed to read EOF and possibly crash/TLE. The output of the
submission would then be displayed to the user.

This isn't great -- it either results in confusing judgements, or
necessitates adding convoluted clean exit protocols to the problem to
deal with these cases. In the latter case, the contestant needs to add
probably-unnecessary hard-to-test code, and the description of the
protocol can bog down the problem statement (see e.g. "Go, Gopher" from
Google Code Jam 2018: https://goo.gl/ccT6t3).

This commit takes the stand that if the validator exits first with a WA
status, that should take precedence. Note that this may result in WA
statuses where the time limit is exceeded.

With a few other cleanups thrown in.

Changes the output format of interactive.cc, I don't know if that matters for other consumers than kattis.

It mixed spaces and tabs.

Before this change, if a validator detected, say, an invalid input format and exited, it would close its output pipe, and the submission would proceed to read EOF and possibly crash/TLE. The output of the submission would then be displayed to the user. This isn't great -- it either results in confusing judgements, or necessitates adding convoluted clean exit protocols to the problem to deal with these cases. In the latter case, the contestant needs to add probably-unnecessary hard-to-test code, and the description of the protocol can bog down the problem statement (see e.g. "Go, Gopher" from Google Code Jam 2018: https://goo.gl/ccT6t3). This commit takes the stand that if the validator exits first with a WA status, that should take precedence. Note that this may result in WA statuses where the time limit is exceeded.

simonlindholm · 2018-04-13T23:17:54Z

Now also with another (not yet tested) fix for validator SIGPIPEs. Unrelated but pretty similar to (and dependent upon) the rest of the PR.

simonlindholm · 2018-05-03T18:09:02Z

@austrin Think you could review this? Does the approach make sense?

simonlindholm · 2018-05-17T12:35:37Z

@austrin Ping! Or are you trying to safeguard yourself against interactive problems in the coding cup finals? ;)

simonlindholm · 2018-05-31T21:49:51Z

@austrin Re-ping.

austrin

I agree with the goal of giving the validator a bit more say. One general concern I have however is that this breaks the invariant "judgement = TLE <=> runtime > timelim", and I really don't want to do that. I can see two ways of addressing that:

Change it so that if runtime > timelim then result is TLE regardless of what validator says (but if submission crashed or exceeded walltime limit, which are probably more common ways of failing after validator has terminated, then validator gets its say).
Set runtime=timelim in the case when validator terminated first with WA but submission ran over time

I think option 2 is pretty ugly so would prefer option 1.

austrin · 2018-06-28T08:38:41Z

support/interactive/interactive.cc

+volatile int val_pid = -1, user_pid = -1;
+volatile int user_status = -1, val_status = -1;
+volatile static rusage user_ru;
+volatile static rusage val_ru;


Why do you think these (and others below) need to be volatile?

They are used from a signal handler, so it seemed "more correct" (even if still actually incorrect -- only volatile sig_atomic_ts and atomics are allowed, and even then we have race conditions that make the whole approach wrong). But until the signal handler is fixed properly, this may at least prevent the compiler from reordering some assignments etc. which I wouldn't be surprised if it does.

Makes sense!

austrin · 2018-06-28T08:44:40Z

support/interactive/interactive.cc

+	while (remaining > 0) {
+		int status;
+		struct rusage ru;
+		int r = wait4(-1, &status, 0, &ru);


Change this to wait3?

austrin · 2018-06-28T08:51:16Z

support/interactive/interactive.cc

+		if (r == user_pid) {
+			// In case of broken pipes, let validator decide
+			user_status = (WIFSIGNALED(status) && WTERMSIG(status) == SIGPIPE ? 0 : status);
+			memcpy((void*)&val_ru, &ru, sizeof(rusage));


This should be user_ru not val_ru.

simonlindholm · 2018-06-28T09:39:48Z

Change it so that if runtime > timelim then result is TLE regardless of what validator says (but if submission crashed or exceeded walltime limit, which are probably more common ways of failing after validator has terminated, then validator gets its say).

With the PR as is walltime limit exceeded will practically never happen when the validator exits early, because then we close all pipes and you cannot block on them forever. We could of course decide not to close the pipes and then wait for the wall time limit to hit, but that seems somewhat silly in that it only prolongs time until error...

How about setting runtime = min(validator exit time, submission exit time)?

austrin · 2018-06-28T09:44:09Z

With the PR as is walltime limit exceeded will practically never happen when the validator exits early, because then we close all pipes and you cannot block on them forever. We could of course decide not to close the pipes and then wait for the wall time limit to hit, but that seems somewhat silly in that it only prolongs time until error...

Right, I agree. (But the crashing part is still probably the most common way for a solution to fail after the validator has terminated.)

How about setting runtime = min(validator exit time, submission exit time)?

I don't think that makes sense, the two programs may need very differing amounts of CPU time and there is no reason why the CPU time of the validator would be any reasonable measure of the CPU time of the submission. However, one possibility that I think would make a lot of sense would be to fetch the current CPU time of the submission process at the time the validator exits.

simonlindholm · 2018-06-28T09:49:30Z

Right, I agree. (But the crashing part is still probably the most common way for a solution to fail after the validator has terminated.)

If it's just "the most common way" then it doesn't help because you still need to know about the edge case... Also, I'm not sure my experiences agree with crashing being more likely than TLE.

However, one possibility that I think would make a lot of sense would be to fetch the current CPU time of the submission process at the time the validator exits.

Sounds good! I'll see if I can make this work here and in Kattis (tomorrow or so).

simonlindholm · 2018-06-28T11:31:29Z

Also worth mentioning: the first approach I had to this was to kill the submission when the validator exited; however, it turns out that that doesn't work well with isolate, which runs as sudo and is thus unkillable without complex hackery. Fetching the current CPU time should have the same effect as that (UI-wise), nicely enough.

niemela · 2018-06-28T18:04:47Z

However, one possibility that I think would make a lot of sense would be to fetch the current CPU time of the submission process at the time the validator exits.

Sounds good! I'll see if I can make this work here and in Kattis (tomorrow or so).

Agreed, that would give the expected results. Because what you would actually really want is...

Also worth mentioning: the first approach I had to this was to kill the submission when the validator exited;

... to kill the submissions as soon as the validator has decided to give up.

however, it turns out that that doesn't work well with isolate, which runs as sudo and is thus unkillable without complex hackery. Fetching the current CPU time should have the same effect as that (UI-wise), nicely enough.

👍

simonlindholm · 2018-07-01T17:20:24Z

It appears to be pretty hard to get to that information. wait4 doesn't return usage information about running processes; you have to parse /proc/<id>/stat. However, that only gives you the CPU usage of the process and the sum of CPU usage of terminated-and-waited-for children, and if the process is, say, isolate, both of these values are zero. Which means we only have bad options remaining:

do some sort of process tree traversal, with all the racefulness that implies. Could be improved by SIGSTOP'ing the process first, but ugh. In a sense it's OK if we mess up and return an invalid value because the result will be WA anyway, but...
kill the submission as soon as the validator stops, and on Kattis's side try to run the program with extended caps that allow killing privileged processes (setcap cap_kill+ep). Hard to get right (security issues, and is it even OK to kill isolate half-way through?).
pull a runtime out of thin air, e.g. 0 or exactly the TL.

austrin · 2018-08-23T08:42:59Z

There is also the (also bad) option of keeping TLE if process uses too much CPU. I would slightly prefer that, but I am also fine with capping runtime at timelim in the case when validator decides on WA before the user process exits.

simonlindholm · 2018-08-23T13:55:54Z

There is also the (also bad) option of keeping TLE if process uses too much CPU. I would slightly prefer that, but I am also fine with capping runtime at timelim in the case when validator decides on WA before the user process exits.

In my view that's more or less equivalent to (maybe slightly worse than) wontfixing this PR, since it would still require explaining in the problem statement that you must (somehow, by some problem-specific scheme) exit nicely when the validator exits to avoid confusing verdicts. But sure, it's an option.

Anyway, I pushed a commit for the runtime = min(runtime, timelim) behavior (not very well tested but I expect it to work), and another to address the review comments you had earlier.

austrin · 2018-08-23T20:14:17Z

Sounds good. How about also adding a test submission to examples/guess which gets WA this way?

simonlindholm · 2018-09-19T09:08:54Z

Thanks for merging! I agree that a test is a great idea, I'll try to get to that at some point if you don't do it before me. Also, does this need some sort of spec update or coordination with other judges?

…, override TLE. See Kattis/problemtools#77 for an in-depth discussion. With this, all test submissions of the example guess problem give the expected answer, but we need to do some cleanup and UI/import changes before we can close DOMjudge#465.

…, override TLE. See Kattis/problemtools#77 for an in-depth discussion. With this, all test submissions of the example guess problem give the expected answer, but we need to do some cleanup and UI/import changes before we can close #465.

simonlindholm added 4 commits April 10, 2018 12:07

Extend .gitignore

0a058fb

Reindent interactive.cc

18091f2

It mixed spaces and tabs.

State the condition for the output limit warning

7e831d4

simonlindholm requested a review from austrin April 10, 2018 23:01

niemela added the in progress label Apr 10, 2018

Avoid validator->submission SIGPIPEs

228c623

simonlindholm mentioned this pull request May 3, 2018

Create new API with support for interactive programs. jsannemo/omogenjudge#5

Closed

austrin requested changes Jun 28, 2018

View reviewed changes

simonlindholm added 2 commits August 23, 2018 15:24

Set runtime = min(runtime, timelim) when validator exits first

6a078bb

Address review comments

aa9822b

austrin merged commit fec8d1e into Kattis:develop Sep 19, 2018

simonlindholm mentioned this pull request Sep 19, 2018

A test and a small fix for the TLE -> WA behavior #93

Merged

meisterT mentioned this pull request Jan 4, 2019

If the validator of a combined run/compare script exits first with WA, override TLE. DOMjudge/domjudge#484

Merged

simonlindholm mentioned this pull request Feb 7, 2019

Interactive problem: pipes can fill up #113

Open

simonlindholm deleted the interactive-wa branch April 17, 2019 14:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use validator output if it exits first, for interactive problems #77

Use validator output if it exits first, for interactive problems #77

simonlindholm commented Apr 10, 2018

simonlindholm commented Apr 13, 2018

simonlindholm commented May 3, 2018

simonlindholm commented May 17, 2018

simonlindholm commented May 31, 2018

austrin left a comment

austrin Jun 28, 2018

simonlindholm Jun 28, 2018

austrin Jun 28, 2018

austrin Jun 28, 2018

austrin Jun 28, 2018

simonlindholm commented Jun 28, 2018

austrin commented Jun 28, 2018

simonlindholm commented Jun 28, 2018

simonlindholm commented Jun 28, 2018

niemela commented Jun 28, 2018

simonlindholm commented Jul 1, 2018 •

edited

Loading

austrin commented Aug 23, 2018

simonlindholm commented Aug 23, 2018

austrin commented Aug 23, 2018

simonlindholm commented Sep 19, 2018

Use validator output if it exits first, for interactive problems #77

Use validator output if it exits first, for interactive problems #77

Conversation

simonlindholm commented Apr 10, 2018

simonlindholm commented Apr 13, 2018

simonlindholm commented May 3, 2018

simonlindholm commented May 17, 2018

simonlindholm commented May 31, 2018

austrin left a comment

Choose a reason for hiding this comment

austrin Jun 28, 2018

Choose a reason for hiding this comment

simonlindholm Jun 28, 2018

Choose a reason for hiding this comment

austrin Jun 28, 2018

Choose a reason for hiding this comment

austrin Jun 28, 2018

Choose a reason for hiding this comment

austrin Jun 28, 2018

Choose a reason for hiding this comment

simonlindholm commented Jun 28, 2018

austrin commented Jun 28, 2018

simonlindholm commented Jun 28, 2018

simonlindholm commented Jun 28, 2018

niemela commented Jun 28, 2018

simonlindholm commented Jul 1, 2018 • edited Loading

austrin commented Aug 23, 2018

simonlindholm commented Aug 23, 2018

austrin commented Aug 23, 2018

simonlindholm commented Sep 19, 2018

simonlindholm commented Jul 1, 2018 •

edited

Loading