ID | Project | Category | View Status | Date Submitted | Last Update
0000025 | ClearOS | app-multiwan - Multi-WAN | public | 2010-02-08 11:53 | 2013-02-02 12:22
Reporter | Vejlefjordskolen
Assigned To | dsokoloski
Priority | normal | Severity | tweak | Reproducibility | unable to reproduce
Status | closed | Resolution | fixed
Platform | | OS | | OS Version |
Product Version | 5.1
Target Version | | Fixed in Version | 5.2
Summary | 0000025: Syswatch periodically reports interfaces as down
Description | The syswatch service seems to have problems sending pings: it randomly reports interfaces as down, removes them, and restarts the firewall, at which point the interfaces are brought down even though they were functioning fine. This happens many times a day; a quick look at the syswatch log shows about one hundred mentions of "down" for each external interface every day. This is a problem because it breaks active connections through the interface that is shut down. We have four external interfaces, and the problem began when three of the connections were changed from DHCP to static setups. The fourth external interface was always static and never had any problems before the other interfaces were changed to static, but the problem now affects this interface as well.
Additional Information | We've been having this issue for a while and first thought it was an ISP problem. Our ISP checked everything and found a few minor issues, but even after those were corrected, we're still experiencing the problem. After our ISP assured us that there was no problem on their end, I logged on with SSH and monitored the syswatch log in real time (tail -f /var/log/syswatch). When syswatch began reporting errors on an interface (e.g. eth3), I manually pinged the same server that syswatch wasn't able to ping from that interface (e.g. ping 69.90.141.72 -I eth3), and the manual ping worked. This test showed that the server was indeed able to ping from the interface while syswatch was somehow having problems, which leads me to believe that the problem is in the syswatch software itself. I've tried changing different parameters in /etc/syswatch, but nothing has helped so far. I hope you can find a fix for this problem soon :)
Tags | No tags attached.
Attached Files |
Notes
(0000026) user2 2010-02-08 17:57 |
Hi there. Every time we have seen this reported, the root cause was:
- A network loop
- A problem with the ISP
- Some other network issue

Using the "ping" command (even with the -I flag) does not guarantee that the network packet goes out the correct interface. The -I flag merely sets the source address. In fact, I don't know a good way to force a ping packet down a specific network interface using the command line (maybe netcat?).

Next time it happens, use the tcpdump command to see what's really happening with network traffic:

tcpdump -i eth3 icmp

Don't be surprised if you see a ping test go out on eth3 and then come back on another interface like eth2. I have personally seen this a handful of times. Strange behavior that shouldn't work in my mind... but it does! |
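To see a reply arrive on a different interface than the request left on, it helps to capture on several interfaces at once. The following is a minimal sketch, not part of the original note: the helper name capture_cmds is hypothetical, and the interface names (eth0, eth3) and target address (69.90.141.72) are simply the ones mentioned in this report. It only prints one tcpdump command per interface; each printed command is meant to be run as root in its own terminal.

```shell
#!/bin/sh
# Hypothetical helper: print one tcpdump command per interface so the
# ICMP request and its reply can be watched on each interface side by side.
#   -n  do not resolve addresses to names
#   -e  also print the link-level header (source/destination MAC)
capture_cmds() {
    target=$1
    shift
    for iface in "$@"; do
        printf 'tcpdump -n -e -i %s icmp and host %s\n' "$iface" "$target"
    done
}

# Assumed values taken from the report; adjust to the real external interfaces.
capture_cmds 69.90.141.72 eth0 eth3
```

If the echo request shows up in the eth3 capture and the echo reply only in the eth0 capture, the return path is asymmetric.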
(0000027) Vejlefjordskolen (reporter) 2010-02-08 23:51 |
You're absolutely right! In my test they went out eth3 and came back on eth0. How can this be fixed? Is it our ISP that needs to fix their network configuration? Thanks for the quick response. I didn't want to believe that it was ClearOS, as we have had quite a few problems with bad quality connections at our location, but our ISP insisted that there was no problem, and my (inadequate) test showed that the pings were being replied to, so I thought there was only our system left to blame. |
(0000028) user2 2010-02-09 11:56 |
Fundamentally, it is an ISP issue. Pragmatically, it is a ClearOS issue since we shouldn't count on the ability of ISPs to fix the problem. Let me see if there's a workaround for ClearOS. More to come! |
(0000032) Vejlefjordskolen (reporter) 2010-02-16 05:40 |
Having looked at the ARP traffic (tcpdump -i eth0 arp), I've noticed that the server sometimes responds with the wrong MAC address for an IP. The server responded with the correct MAC when a who-has came in on the correct interface for the IP, but when a who-has came in for the IP address of one of the other NICs, the server responded with the MAC address of the current NIC and not with the MAC of the NIC that actually has that address. I believe that this may have some relevance, as we seem to be poisoning the ARP cache of our ISP. Is this still an ISP issue? |
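To judge whether an ARP reply carries the wrong MAC, the reply has to be compared against the MAC each NIC really owns. A minimal sketch, assuming the kernel exposes /sys/class/net/<iface>/address; the helper name iface_macs and the SYSFS_NET override are hypothetical and exist only so the function can be pointed at a test directory. The interface names are the ones mentioned in this report.

```shell
#!/bin/sh
# Hypothetical helper: print "<iface> <mac>" for each named interface,
# reading the address from sysfs. Interfaces without an address file are skipped.
iface_macs() {
    root=${SYSFS_NET:-/sys/class/net}
    for iface in "$@"; do
        if [ -r "$root/$iface/address" ]; then
            printf '%s %s\n' "$iface" "$(cat "$root/$iface/address")"
        fi
    done
}

# Assumed interface names; adjust to the real external interfaces.
iface_macs eth0 eth3

# Then compare against the MACs seen in ARP replies (run as root);
# -e makes tcpdump print the link-level source address of each reply:
#   tcpdump -n -e -i eth0 arp
```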
(0000033) Vejlefjordskolen (reporter) 2010-02-16 06:53 |
Ok, it seems I have fixed the issue... The problem was that the arp_filter option of the external NICs was set to 0. This meant that all external interfaces answered with their own MAC whenever our ISP broadcast an ARP request for a specific IP address. I found a description at the following URL: http://www.linuxinsight.com/proc_sys_net_ipv4_conf_eth0_arp_filter.html It specifically mentions that having this option set to 0 can cause problems with load-balancing setups. This should be fixable by you :) |
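For anyone wanting to check the same setting, here is a minimal sketch of inspecting arp_filter per interface. It is an illustration only and not necessarily what the committed fix does: the helper name show_arp_filter and the CONF_ROOT override are hypothetical, and the interface names are the ones mentioned in this report. The /proc path and the sysctl key names are the standard Linux ones.

```shell
#!/bin/sh
# Hypothetical helper: report the current arp_filter value per interface.
# CONF_ROOT can be overridden so the function can be tried on a test tree.
show_arp_filter() {
    root=${CONF_ROOT:-/proc/sys/net/ipv4/conf}
    for iface in "$@"; do
        if [ -r "$root/$iface/arp_filter" ]; then
            printf '%s arp_filter=%s\n' "$iface" "$(cat "$root/$iface/arp_filter")"
        else
            printf '%s arp_filter=unknown\n' "$iface"
        fi
    done
}

# Assumed interface names; adjust to the real external interfaces.
show_arp_filter eth0 eth3

# To enable the filter at runtime (as root), per external interface:
#   sysctl -w net.ipv4.conf.eth3.arp_filter=1
# To keep it across reboots, add a line like this to /etc/sysctl.conf:
#   net.ipv4.conf.eth3.arp_filter = 1
```

With arp_filter set to 1, the kernel answers an ARP request on an interface only if it would route traffic to the requester out of that same interface, so the setting relies on source-based routing being in place for each external link.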
(0000034) user2 2010-02-17 10:37 |
Nice detective work Vejlefjordskolen! |
(0000035) dsokoloski (developer) 2010-02-17 11:00 |
Committed revision 2560. |
Issue History
Date Modified | Username | Field | Change |
2010-02-08 11:53 | Vejlefjordskolen | New Issue | |
2010-02-08 17:57 | user2 | Note Added: 0000026 | |
2010-02-08 17:57 | user2 | Reproducibility | have not tried => unable to reproduce |
2010-02-08 17:57 | user2 | Status | new => feedback |
2010-02-08 23:51 | Vejlefjordskolen | Note Added: 0000027 | |
2010-02-09 11:56 | user2 | Note Added: 0000028 | |
2010-02-09 14:06 | user2 | Severity | major => tweak |
2010-02-16 05:40 | Vejlefjordskolen | Note Added: 0000032 | |
2010-02-16 06:53 | Vejlefjordskolen | Note Added: 0000033 | |
2010-02-17 10:37 | user2 | Note Added: 0000034 | |
2010-02-17 10:40 | user2 | Status | feedback => assigned |
2010-02-17 10:40 | user2 | Assigned To | => dsokoloski |
2010-02-17 11:00 | dsokoloski | Resolution | open => fixed |
2010-02-17 11:00 | dsokoloski | Fixed in Version | => 5.2 |
2010-02-17 11:00 | dsokoloski | Note Added: 0000035 | |
2010-08-26 08:38 | user2 | Status | assigned => confirmed |
2013-02-02 09:00 | user2 | Status | confirmed => resolved |
2013-02-02 12:22 | user2 | Status | resolved => closed |