static IPs not being routed to if idle

by doctorfb » Fri Mar 01, 2013 10:29 am
Ok, I've switched over to a dedicated host with one ethernet port with default gateway pointing to 173.228.5.1:
IP: 173.228.5.243
MAC: 00:03:47:06:f7:ba (shows up as Intel_06:f7:ba in wireshark).
The host is live now and I've done a ping to 173.228.5.1 to prime things.
I'll run a few tests now.
by tdo » Fri Mar 01, 2013 10:48 am
It's looking a little more normal to me from my end now. I see ARP from you every 60 seconds when I'm running a ping towards you. When I stop and start the ping at random intervals it appears to always work.
by doctorfb » Fri Mar 01, 2013 2:11 pm
Yes, I am seeing the same (running wireshark on the host). Doing a traceroute from an external host after a very long quiet period appears to always get through now. But the packets being received and sent look identical to what I was seeing with the previous host.
So, what's the difference? From the network's perspective, all hosts look alike (and should be treated as such). My other hosts are VMs (running on VMware), but they have unique IPs and MACs. The same was true of my previous test machine, which had two ethernet ports, one of which was isolated for testing. Why is this latest host being treated differently?
Is there a way you can check where/how the MACs are being routed on your end? Is it possible that there is some overlap going on whereby my MACs aren't truly unique (strange as that might seem) and packets are being misrouted?
by doctorfb » Fri Mar 01, 2013 11:53 pm
Ok, for fun I switched out the test machine and plugged the network back into my previous test machine with two ports. Same setup and config as it had before. But now I seem to be able to get packets from a remote host to it (still doing the traceroute test from bluemoon.net), regardless of how long I let it sit idle.
What is going on?
Again, I have strict gateway routes for just 173.228.5.0 and 64.200.84.0, so those two subnets are really the only thing that the host will respond to on that interface. Nothing different from my previous attempts, really, except that this time I did an initial 'arping -b 173.228.5.1' to prime things. I'm not sure I can believe this would be of any real importance, since I was already doing a regular ping, which would have triggered the equivalent of the arping packets. I'll leave this up tonight and check it in the morning.
by doctorfb » Sat Mar 02, 2013 8:45 pm
Some further testing shows that my dual-port machine is always addressable now, regardless of how long a quiet time there is. This is a change in behaviour from my previous testing a few days ago. I haven't done anything different in my setup from before, so it must be something operational that I'm not aware of.
My VMs, however, are still exhibiting the problem. I briefly disabled the 60-second ping on my webserver and, sure enough, after a few minutes it becomes inaccessible remotely. For fun, I had the webserver do a few arpings and pings, both broadcast and unicast, to 173.228.5.1, in the hope that things would sync properly. But it appears not to have helped.
I took some wireshark captures of both my test host and the VM and compared them for requests and responses. I see nothing different between them, other than that after a quiet period the VM simply doesn't get packets anymore. I do notice that after the quiet period, the first ARP request goes unanswered, but the second ARP request gets a reply (from the DSLAM: Cisco_8b:52:c6). This is pretty consistent.
You mentioned that the DSLAM updates its cache when it sees packets come over the DSL line for a specific IP (like the initial, gratuitous, ARP?). Is there a specific packet it's looking for that I could, perhaps, manufacture, or is it any packet?
Is there any possibility that the known MAC prefix for VMware controllers (Vmware_be in this case) is treated differently by the DSLAM? There's got to be something!
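For anyone wanting to manufacture such a packet by hand, here is a minimal sketch in Python of what a gratuitous ARP frame looks like on the wire. It only builds the bytes; whether the DSLAM actually refreshes its cache on this particular packet is exactly the open question above, so treat this as an illustration and not a fix. The MAC and IP are the ones from the dedicated test host earlier in the thread; the interface name passed to `send_frame` is an assumption, and sending needs Linux and root.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' (or dash-separated) to 6 raw bytes."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return bytes(int(p, 16) for p in parts)


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build a broadcast ARP request announcing that `ip` lives at `mac`.

    In a gratuitous ARP the sender IP and the target IP are the same.
    """
    src_mac = mac_to_bytes(mac)
    src_ip = socket.inet_aton(ip)
    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth_header = BROADCAST_MAC + src_mac + struct.pack("!H", 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        src_mac,      # sender hardware address
        src_ip,       # sender protocol address
        b"\x00" * 6,  # target hardware address (unknown)
        src_ip,       # target protocol address == sender's
    )
    return eth_header + arp_payload


def send_frame(frame: bytes, ifname: str) -> None:
    """Send a raw Ethernet frame. Linux only (AF_PACKET), requires root."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((ifname, 0))
        s.send(frame)


if __name__ == "__main__":
    frame = build_gratuitous_arp("00:03:47:06:f7:ba", "173.228.5.243")
    print(len(frame), frame.hex())
    # send_frame(frame, "eth0")  # "eth0" is a placeholder interface name
```

The frame is 42 bytes (14 bytes of Ethernet header plus 28 bytes of ARP), which is what you should see in wireshark before any padding the NIC adds.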
by waynesung » Mon Mar 04, 2013 8:57 pm
From your Mar 02 post:
> I briefly disabled the 60-second ping on my webserver and, sure enough, after a few minutes it becomes inaccessible remotely.

When this host is inaccessible from the outside, can one of your inside machines reach it? It would be ideal if you can check using one machine that already has an arp in the webserver and another that does not have an arp.

"A few minutes" still sounds like a layer 2 timeout to me. Don't some VM implementations have pseudo ethernet switches in them?
by doctorfb » Tue Mar 05, 2013 2:43 pm
waynesung wrote:From your Mar 02 post:
> I briefly disabled the 60-second ping on my webserver and, sure enough, after a few minutes it becomes inaccessible remotely.

When this host is inaccessible from the outside, can one of your inside machines reach it? It would be ideal if you can check using one machine that already has an arp in the webserver and another that does not have an arp.
I've tried this experiment and, yes, any machine on my side of the DSL line (either plugged into the LAN switch on the ZyXEL, or another VM on the same subnet) can access the webserver. My machines claim to have a 60 second ARP cache timeout (according to the value of gc_stale_time in /proc) but I notice the cache entries persist much longer than that (about 5 minutes). Anyway, after the magic timeout, my other machines can access the webserver just fine.
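To check the neighbour-cache settings mentioned above without poking around /proc by hand, something like the following sketch works on Linux. The path layout under `/proc/sys/net/ipv4/neigh` is the standard one; the helper name is made up for illustration. As far as I understand it, `gc_stale_time` governs how often stale entries are checked and is not a hard lifetime, which may be why entries appear to linger longer than 60 seconds, but that is my reading and not something confirmed in this thread.

```python
from pathlib import Path


def read_neigh_param(name, iface="default", root="/proc/sys/net/ipv4/neigh"):
    """Return an integer neighbour-table setting, or None if unavailable.

    `name` is a file such as 'gc_stale_time'; `iface` is 'default' or an
    interface name. Returns None on non-Linux systems or unknown names.
    """
    path = Path(root) / iface / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    for param in ("gc_stale_time", "base_reachable_time_ms"):
        print(param, "=", read_neigh_param(param))
```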
waynesung wrote:"A few minutes" still sounds like a layer 2 timeout to me. Don't some VM implementations have pseudo ethernet switches in them?
Yes, VMware can emulate a virtual switch for a private subnet shared between VMs, but I'm not using that here. The VMs are bridged over the host system's ethernet interface, which is physically plugged into the ZyXEL. I'm doing my wireshark tracing on the host's physical net device, and thus I can see all packets to/from all VMs bridged on that interface. What I observe is that I'm not seeing any packets coming in from the DSL side of things after "a few minutes". :-(
As for this being a layer-2 timeout, well, "whose" layer 2? I still maintain this is happening across the line, over in Sonic's neck of the woods. I just can't prove it.
I have one more test scenario to try which is to change the MAC address of the webserver VM to one from an old physical ethernet card I have on hand (but not plugged in). That should tell me if this is, perhaps, related to VMWare's MAC prefix, or, perhaps, someone else on the same DSLAM is using VMWare and happens to have the same MAC.
I'm still wondering about a comment one of the Sonic support people made to me about needing to do a ping every 2 minutes, as if they knew about this problem but could not explain to me why. I don't get that.
Anyway, more testing..
by waynesung » Mon Mar 11, 2013 2:20 pm
> I have one more test scenario to try which is to change the MAC address of the webserver VM to one from an old physical ethernet card I have on hand (but not plugged in). That should tell me if this is, perhaps, related to VMWare's MAC prefix, or, perhaps, someone else on the same DSLAM is using VMWare and happens to have the same MAC.

Ah yes. If there is a duplicate mac address in a switch then packets get misdirected, and your periodic probes force the entry back to your port.
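To make the duplicate-MAC theory concrete, here is a toy model in Python of how a learning switch keeps its forwarding table. It is a sketch of the general mechanism only and says nothing about how the actual DSLAM is implemented. The MAC address and the port names are hypothetical.

```python
class LearningSwitch:
    """Toy layer-2 switch: remembers the last port each source MAC was seen on."""

    def __init__(self):
        self.table = {}  # MAC address -> port name

    def receive(self, port, src_mac):
        """A frame arrives on `port`; (re)learn where src_mac lives."""
        self.table[src_mac.lower()] = port

    def forward(self, dst_mac):
        """Return the output port for dst_mac, or 'flood' if it is unknown."""
        return self.table.get(dst_mac.lower(), "flood")


if __name__ == "__main__":
    mac = "00:50:56:be:00:01"  # hypothetical duplicated MAC
    sw = LearningSwitch()

    sw.receive("line-A", mac)  # our VM sends something (e.g. the 60-second ping)
    print(sw.forward(mac))     # line-A: inbound traffic reaches us

    sw.receive("line-B", mac)  # another customer with the same MAC transmits
    print(sw.forward(mac))     # line-B: our inbound traffic is now misdirected

    sw.receive("line-A", mac)  # our next periodic probe
    print(sw.forward(mac))     # line-A: the entry is forced back to our port
```

In this model, whichever side transmitted most recently owns the entry, which would fit the observation that a host going quiet for a few minutes stops receiving inbound packets until it sends again.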
by doctorfb » Sat Mar 16, 2013 8:14 pm
As a followup on this, I changed the MAC address of the webserver to that of an old ethernet card I have (not installed in any computer), disabled the 60-second ping, and now the VM can be accessed externally all the time!
I'm still not sure if the problem was a duplicate MAC (i.e. someone else has the same address) or that something has it in for VMware-prefixed MACs, but at least for now my problem is solved. I'll just put my donor card in a safe place with a note that says 'Do Not Use!'
I'll have to scrounge up another card to mooch a MAC from for my other VM machine, but I expect to have similar results.
by Guest » Sun Mar 17, 2013 2:59 am
What was the VM's MAC address? It's possible others are using VMware and the product does not generate random MAC addresses. Here are my 2 VMnets' MACs: 00-50-56-C0-00-0[18]. They're not accessible outside my firewall, but it would be interesting to see how your old one compares to mine.
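For comparing addresses like these, a small Python sketch that pulls the vendor prefix (OUI) out of a MAC and checks it against VMware prefixes may help. The 00:50:56 prefix is the one quoted above; 00:0c:29 is another prefix commonly seen on VMware guests and is included here as an assumption, so the set is not a complete list. The sample VM address in the usage lines is made up.

```python
# Prefixes assumed to belong to VMware; not an exhaustive list.
VMWARE_OUIS = {"00:50:56", "00:0c:29"}


def oui(mac: str) -> str:
    """Return the first three octets of a MAC as 'aa:bb:cc' (lower case).

    Accepts colon, dash, or no separators.
    """
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


def is_vmware_mac(mac: str) -> bool:
    """True if the MAC's vendor prefix is in the assumed VMware set."""
    return oui(mac) in VMWARE_OUIS


if __name__ == "__main__":
    print(is_vmware_mac("00-50-56-C0-00-01"))  # VMnet address from this post
    print(is_vmware_mac("00:03:47:06:f7:ba"))  # the Intel test host
```

Two hosts sharing a prefix is normal; a forwarding conflict would need all six octets to match.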