Skip to content

Crash when attempting to open a TCP socket. #1438

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
0branch opened this issue Nov 14, 2019 · 10 comments
Open

Crash when attempting to open a TCP socket. #1438

0branch opened this issue Nov 14, 2019 · 10 comments

Comments

@0branch
Copy link

0branch commented Nov 14, 2019

Description of problem:
ksh crashes when attempting to establish a TCP connection with exec under certain situations.

Ksh version:
Version A 2020.0.0-beta1-166-g1025e369.

How reproducible:
Consistently, for certain inputs.

Steps to reproduce:

  1. Start an interactive ksh session
  2. exec 3<>/dev/tcp/ntp-b.nist.gov/13
  3. Observe crash

Actual results:

$ : print $KSH_VERSION; exec 3<>/dev/tcp/ntp-b.nist.gov/13
Version A 2020.0.0-beta1-166-g1025e369
Killed: 9

Expected results:

$ print $KSH_VERSION; exec 3<>/dev/tcp/ntp-b.nist.gov/13
Version AJM 93u+ 2012-08-01
$

Additional info:
Compare the above with the following, which seems to work as intended (at least, there's no immediate crash):

$ exec 3<>/dev/tcp/irc.freenode.net/6667

So, not all host/port pairs are triggering the problem. UDP not tested.

@krader1961
Copy link
Contributor

Please remember to include the platform on which you are observing the issue 😄 I can reproduce this on macOS but not Linux.

In this case the issue is that fd 3 is special on macOS due to something known as a "NECP descriptor": http://newosxbook.com/bonus/vol1ch16.html. You can see this by inserting a run_lsof(); statement between these two statements (the close(fd) is causing the OS to terminate the process with a SIGKILL):

ast/src/cmd/ksh93/sh/io.c

Lines 467 to 468 in ffbfaf6

if (fdnotify) (*fdnotify)(fd, SH_FDCLOSE);
close(fd);

The same failure occurs with macOS /bin/ksh (ksh93u+) and ksh93v-. Thus I'm marking this as an enhancement for improved compatibility with macOS rather than a bug since, at this time, there isn't an obvious bug in the usual sense.

@0branch
Copy link
Author

0branch commented Nov 15, 2019

Apologies for omitting platform information; yes, macOS (10.14.6).

The failure isn't the same for me under /bin/ksh (referenced in the original issue, perhaps it wasn't clear that this was output from an actual session); indeed, everything works as expected. With 93u+ I can successfully open fd 3 for r/w and read data from NIST:

$ exec 3<>/dev/tcp/ntp-b.nist.gov/13
$ read -ru3; read -ru3; print "$REPLY"
58802 19-11-15 04:39:55 00 0 0 543.6 UTC(NIST) *

whereas 2020.0.0-beta1-166-g1025e369 crashes on exec.

@0branch
Copy link
Author

0branch commented Nov 15, 2019

Here are two more datapoints:

  1. In a fresh ksh (2020.0.0-beta1-166-g) session, I can establish a connection to NIST on fd 3 if it supplants an existing connection to Freenode (which as outlined above, works):

     $ exec 3<>/dev/tcp/irc.freenode.net/6667  # no crash
     $ read -ru3; print "$REPLY"                         
     :niven.freenode.net NOTICE * :*** Looking up your hostname...
     $ exec 3<>/dev/tcp/ntp-b.nist.gov/13      # now this works   
     $ read -ru3; read -ru3; print "$REPLY"
     58802 19-11-15 04:48:48 00 0 0   7.9 UTC(NIST) *
    
  2. The NIST connection crashes if it occurs first (again, a fresh 2020.0.0-beta1-166-g session):

     $ exec 3<>/dev/tcp/ntp-b.nist.gov/13      # crash
     Killed: 9
    

where (2) is essentially the originally reported issue.

@krader1961
Copy link
Contributor

I tested this on macOS 10.15.1 using it's ksh93u+ binary and saw your problem:

$ /bin/ksh -c 'exec 3<>/dev/tcp/ntp-b.nist.gov/13'                                                             8 ms
fish: '/bin/ksh -c 'exec 3<>/dev/tcp/n…' terminated by signal SIGKILL (Forced quit)

However, it only happens when running that specific command. Starting an interactive /bin/ksh session and doing the exec based redirect does not result in a SIGKILL. Whereas I see the same failure mode for both scenarios (i.e., -c and interactive) using ksh built from the current git master branch and the ksh93v- (ast.beta) branch. So this is a bug even with the ksh93u+ release but is also an unwanted change in behavior introduced by the ksh93v- release.

@0branch
Copy link
Author

0branch commented Nov 15, 2019

Thanks for testing.

Just to clarify: do you agree that this is a bug? (I see that the label was removed just before your first reply.)

@0branch
Copy link
Author

0branch commented Nov 15, 2019

Additional datapoint: crash in 93u+ with

$ echo $KSH_VERSION
Version AJM 93u+ 2012-08-01
$ exec 4<>/dev/tcp/ntp-b.nist.gov/13
Killed: 9

while this doesn't crash g1025e369. (Note fd 4 rather than 3.)

@krader1961
Copy link
Contributor

Just to clarify: do you agree that this is a bug?

Whether this issue documents a bug or long standing behavior that should be changed is unclear. Is the behavior a ksh bug or a platform behavior that needs to be accommodated by ksh? The unwanted, unexpected, behavior occurs even with a 7+ year old ksh version. Not just the current source.

After sleeping on the issue I've decided this should be classified as a bug. Primarily because, even if this is due to a macOS quirk, a straightforward C program that performs the same action would "just work". This may be an implementation bug but I won't be surprised if it turns out to be a design bug. Specifically, with how fd numbers are dynamically remapped using dup2() when exec 3<>/some-path is handled.

@0branch
Copy link
Author

0branch commented Nov 16, 2019

Just to clarify: do you agree that this is a bug?

Whether this issue documents a bug or long standing behavior that should be changed is unclear.

ksh crashes with a legal (syntactic, documented) statement—so that strikes me as a bug. (Printing an error would be another matter.)

Is the behavior a ksh bug or a platform behavior that needs to be accommodated by ksh? The unwanted, unexpected, behavior occurs even with a 7+ year old ksh version. Not just the current source.

Should that make a difference? I thought the intent of this repository was to continue KornShell development, which would include fixing historic bugs as well as newly introduced ones?

After sleeping on the issue I've decided this should be classified as a bug. Primarily because, even if this is due to a macOS quirk, a straightforward C program that performs the same action would "just work".

Alternative reading: From a user perspective, crashing unexpectedly is buggy behaviour.

@krader1961
Copy link
Contributor

krader1961 commented Nov 16, 2019

Should that make a difference? I thought the intent of this repository was to continue KornShell development, which would include fixing historic bugs as well as newly introduced ones?

No to #1 and yes to #2, with a huge caveat. There are currently only two people who have contributed more than a single change in the past two years. Therefore, we have to characterize bugs that have existed for at least 7 years (back to the ksh93u+ release) as lower priority than bugs introduced since that release. If and when we have at least a handful of people regularly contributing or reviewing changes I'll be happy to prioritize old and new bugs based solely on the impact of the bug rather than its age. For now we have to care more about any bug people will notice that was not present in the ksh93u+ release.

@0branch
Copy link
Author

0branch commented Nov 16, 2019

Thanks for the explanation; this approach to prioritisation wasn't clear to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants