Re: New shell server transition

Posted: Thu Jun 07, 2018 3:41 pm
by scott
casner wrote:
yronwode wrote:
It seems that you're still working on it. Thanks, if so. I keep getting disconnected. It's not urgent, but I would prefer not having to keep re-connecting.

Scott mentioned turning off some keepalives. That may be counterproductive for this problem. So long as the underlying network connectivity is stable, sending keepalives avoids having NAT devices time out the connection.

[As an aside, I'll mention that back in the early days of the Internet (late 1970s and 1980s) there were folks working on packet radio who hated keepalives because their network connectivity was intermittent. Their TCP connections would break during a connectivity loss even when the connections were idle, because a keepalive was sent automatically and went unanswered. But that was before NATs created the converse problem.]


I would have thought that SSH (not TCP) keepalives would keep NAT devices from terminating the connection.

This is how it is set up in sshd_config:

ClientAliveInterval 60
ClientAliveCountMax 10
TCPKeepAlive no


If I'm correct, this should withstand an outage of up to 10 minutes. Am I out to lunch?

-Scott

Re: New shell server transition

Posted: Thu Jun 07, 2018 4:04 pm
by casner
scott wrote:
If I'm correct, this should withstand an outage of up to 10 minutes. Am I out to lunch?

-Scott

That should work because the ssh-level keepalive takes the place of the TCP-level keepalive.
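
As a rough check of the 10-minute figure, here is a minimal Python sketch that multiplies the two values from the sshd_config snippet quoted earlier. It assumes sshd drops an unresponsive client after roughly ClientAliveInterval x ClientAliveCountMax seconds; the helper names `parse_options` and `outage_tolerance_seconds` are hypothetical and only handle simple "Key value" lines.

```python
# Sketch only: estimates how long sshd tolerates an unresponsive client.
# Option names are taken from the sshd_config snippet in this thread;
# the helper functions are illustrative, not part of OpenSSH.

SSHD_CONFIG = """\
ClientAliveInterval 60
ClientAliveCountMax 10
TCPKeepAlive no
"""


def parse_options(text):
    """Parse simple 'Key value' lines into a dict, skipping blanks and comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        options[key] = value.strip()
    return options


def outage_tolerance_seconds(options):
    """Approximate tolerance: probe interval times the number of unanswered probes allowed."""
    interval = int(options["ClientAliveInterval"])
    count_max = int(options["ClientAliveCountMax"])
    return interval * count_max


opts = parse_options(SSHD_CONFIG)
tolerance = outage_tolerance_seconds(opts)
print(f"{tolerance} seconds (about {tolerance // 60} minutes)")
```

With the values above this works out to 600 seconds, which matches the "up to 10 minutes" estimate.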

Re: New shell server transition

Posted: Thu Jun 07, 2018 6:12 pm
by scott
yronwode wrote:
It seems that you're still working on it. Thanks, if so. I keep getting disconnected. It's not urgent, but I would prefer not having to keep re-connecting.


If the disconnect happened around 3:50am, that's when I rebooted. The server wasn't allowing logins.

I've stopped using auditctl and am now using systemtap to monitor the unmounting of volumes. I'm hoping to catch it in the act next time the server falls over.

-Scott

Re: New shell server transition

Posted: Fri Jun 08, 2018 8:19 pm
by scott
scott wrote:
yronwode wrote:
It seems that you're still working on it. Thanks, if so. I keep getting disconnected. It's not urgent, but I would prefer not having to keep re-connecting.


If the disconnect happened around 3:50am, that's when I rebooted. The server wasn't allowing logins.

I've stopped using auditctl and am now using systemtap to monitor the unmounting of volumes. I'm hoping to catch it in the act next time the server falls over.

-Scott


I was up with the server this morning at 2am and 3am. Made a few changes beforehand that might be mitigating the problem.

It turns out sshfs was allocating a pty for every mount when it didn't need to. I disabled that, and made some changes regarding when sshfs volumes get unmounted. (Not often.) I'm going to have to rethink the mount management; we're already using pam_mount in a way that its designers probably didn't expect.

How has the server been? Feedback appreciated. :)

-Scott

Re: New shell server transition

Posted: Fri Jun 08, 2018 8:51 pm
by netllama
No issues since the outage a few days ago.

Re: New shell server transition

Posted: Tue Jun 12, 2018 3:20 pm
by scott
Knock on wood, I think we may have done a lot to stabilize the new shell server:

15:19:37 up 5 days, 11:40, 22 users, load average: 0.39, 0.65, 0.73

-Scott

Re: New shell server transition

Posted: Fri Jun 15, 2018 2:04 am
by scott
scott wrote:
Knock on wood, I think we may have done a lot to stabilize the new shell server:

15:19:37 up 5 days, 11:40, 22 users, load average: 0.39, 0.65, 0.73


After it had been up for a little over a week, I rebooted it to clear out some stale cruft. I have more work to do on the mount manager so that the server can sustain multi-month uptimes.

-Scott

Re: New shell server transition

Posted: Wed Jun 27, 2018 12:31 pm
by yronwode
I'll check in here occasionally. I am seeing your intentional reboots (maintenance, upgrades, etc.) disconnecting me. Thanks.

Re: New shell server transition

Posted: Wed Jun 27, 2018 2:53 pm
by scott
yronwode wrote:
I'll check in here occasionally. I am seeing your intentional reboots (maintenance, upgrades, etc.) disconnecting me. Thanks.


Sorry about that, it's a necessary evil. Still working on mount management.

-Scott

Re: New shell server transition

Posted: Sat Jul 28, 2018 10:03 am
by yronwode
Suddenly this morning my whole shell was cleared. How can I get all of my data back?