The Case of the Mysterious Missing MAC Address

Coriander is dead. Turmeric is alive. What does this mean? All the content on sftp.knarrnia.com went away except for the PDFs and various other ebook formats, which I was smart enough to grab off the RAID before it died entirely. By the end, coriander’s uptime was 20 minutes tops before it keeled over. Not bad for a computer I built in 2003 and then rebuilt when Drexel had the Heat Wave of Death, which caused me to request an extension on my finals. The box had two RAIDed 80GB IDE drives and ran OpenSuSE, which I had installed as a desktop and later simply retired to serving up content to my Nook and XBOX.

Turmeric, however, is a first gen nVidia motherboard. And if this is people’s experience with nVidia, I am entirely, absolutely done with them as a motherboard maker. My network is pretty standard for home use. I have an honest to god Cisco router, a Cisco WAP, and a Comcast bullshit cable modem which is probably going to have a terrible accident so I can get one that works, and all the devices meander through those. The XBOX gets its permissions through UPnP; nothing else does. Turmeric/Coriander had a MAC address reservation so they would come up, get the right IP, and then the Cisco firewalls would pass traffic to them as needed. It worked swimmingly well until Turmeric refused to take the IP I had reserved for it. It would always get a different IP than the reserved one, but it would get that same wrong IP consistently. I racked my brain on this problem for a few hours and finally broke out ettercap to see WTF it was doing.

Turns out the first gen nVidia motherboards do something really stupid with DHCP. Actually, let’s rewind for a minute: they generally do really stupid stuff. This motherboard has hardware RAID too, but it only works for the SATA drives. IDE? Shit out of luck. To further add insult to injury, the bootp stuff for jumpstarting a box? Doesn’t work. Never figured that one out. Finally, there are only two default boot devices you can set in the BIOS. For the moment it’s CDROM and then the first drive in the RAID, but to actually do the install I had to change CDROM to USB after writing an image out to a USB stick. What the heck, guys?

Now, I’ll save you the boring DHCP spec. When the nVidia board comes up it actually sends a DHCP packet on its own, which is nice. The problem is that the HLEN (hardware address length) field of the packet is… 0. Yup. Someone didn’t know what to put in the field, so they sent 0. This causes the router (thank god) to respond to FF:FF:FF:FF:FF:FF, which isn’t correct but works because it’s a broadcast. The adapter seems to configure itself, and then Linux does something goofy: it sees the adapter is already configured, so it sends out a release/renew. The router apparently knows the real MAC address but already has an entry for the bogus one, so it hands the correct MAC a different IP (the next in the pool).
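For anyone who wants to check their own capture for the same thing, here is a rough sketch of pulling the HLEN and client hardware address out of a raw DHCP payload. It assumes you already have the UDP payload bytes (from ettercap, tcpdump or whatever), and it uses the fixed BOOTP header layout from RFC 2131. The function names and the `build_discover` helper are made up for illustration; this is not what the board or the router actually runs.

```python
import struct

# Fixed-size start of a BOOTP/DHCP message per RFC 2131:
# op, htype, hlen, hops, xid, secs, flags, ciaddr, yiaddr, siaddr, giaddr, chaddr
BOOTP_HEAD = struct.Struct("!BBBBIHH4s4s4s4s16s")


def client_hw_address(payload: bytes):
    """Return (hlen, mac_string) from a raw DHCP payload (UDP data only)."""
    if len(payload) < BOOTP_HEAD.size:
        raise ValueError("payload too short to be a BOOTP/DHCP message")
    fields = BOOTP_HEAD.unpack_from(payload)
    hlen, chaddr = fields[2], fields[11]
    # Only the first hlen bytes of chaddr are meaningful; with hlen == 0
    # the client has effectively told the server it has no MAC at all.
    mac = ":".join(f"{b:02x}" for b in chaddr[:hlen])
    return hlen, mac


def looks_broken(payload: bytes) -> bool:
    """True when the packet claims a zero-length hardware address."""
    hlen, _ = client_hw_address(payload)
    return hlen == 0


def build_discover(mac: bytes, hlen: int) -> bytes:
    """Hypothetical helper: fake just the header of a client request for testing."""
    zero = b"\x00" * 4
    return BOOTP_HEAD.pack(
        1,           # op: BOOTREQUEST
        1,           # htype: Ethernet
        hlen,        # hlen: should be 6 for Ethernet
        0,           # hops
        0x12345678,  # xid (arbitrary)
        0,           # secs
        0x8000,      # flags: broadcast bit set
        zero, zero, zero, zero,
        mac.ljust(16, b"\x00"),
    )
```

A sane Ethernet client should report an HLEN of 6. A packet like the one described above would come back with 0 and an empty MAC string.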

The BIOS, of course, doesn’t have a way to disable this “convenience feature” and to add insult to injury, dmesg doesn’t work in Linux because the BIOS is doing something funky by itself. For right now I’m just ignoring it. But seriously nVidia, fix your shit.

Linux Arcanum and SMART Warnings

“If Languages Were Religions” is riotously funny.

I finally figured out what’s wrong with my desktop. For the longest time the instrumentation was just weird. It would crash randomly, have strange bus problems (which I thought were related to aging video cards), and the voltage from the power supply would have a noticeable bit of noise from it. Other than the generic logs of “your computer has recovered from a serious error” there was nothing to point to. MEMTEST would show all the DIMMs had a bad line, so I just assumed the mobo was slowly dying and figured one day I would come home to it not working.

Finally one day I happened to be reading the syslog on my Linux box trying to track down this one idiot on a modem who was trying to hack it when I got the message:

Dec 17 08:29:39 HopsAndBarley smartd[2532]: Device: /dev/sdb, Failed SMART usage Attribute: 9 Power_On_Hours.

OH MY GOD SMART ACTUALLY WORKED. Basically it’s saying my old Linux drive, the one I use all the time, is crapping out. I checked to see where the spare was and realized that the spare became the Windows drive (120GB) and my Windows drive became my Linux drive. The spare-spare drive I had is a 10GB drive I used to use as a raw device for caching DVD data while authoring. Which means I have no usable spare at all. So I have a choice. I can go through my Windows drive and reload it, thus creating enough space for a Linux partition, or I can run the computer without the Linux drive entirely and give up my primary OS for the sake of having anything to use at all.

Since the botnets have been a pain recently, I came up with a new /etc/hosts.deny:

ALL : .ru
ALL : .cn
ALL : UNKNOWN

Basically, if you’re from a .ru, or from .cn, or your IP doesn’t resolve to a hostname, you’re not connecting.
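As a rough illustration of what those three rules boil down to, here is a sketch in Python. It is not how TCP Wrappers is actually implemented (the real thing does more checking than this); it just mirrors the logic: reverse-resolve the address, deny if that fails, deny if the name ends in .ru or .cn. The function names and the injectable `lookup` parameter are my own, for illustration.

```python
import socket

# Suffixes mirroring the "ALL : .ru" and "ALL : .cn" rules
DENIED_SUFFIXES = (".ru", ".cn")


def reverse_lookup(ip: str):
    """Return the hostname for an IP, or None if it does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def is_denied(ip: str, lookup=reverse_lookup) -> bool:
    """Approximate the three hosts.deny rules for a single client IP."""
    host = lookup(ip)
    if host is None:
        return True  # the "ALL : UNKNOWN" case: no hostname, no connection
    return host.lower().rstrip(".").endswith(DENIED_SUFFIXES)
```

Passing the lookup in as a parameter makes it easy to try the logic against a canned table of addresses without touching DNS.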

And of course all the other security stuff is in place like denying root login, which seems to be what most of the idiots out there are after.

Here are the kinds of log entries I’m seeing:

Dec 16 13:12:03 HopsAndBarley sshd[2917]: Invalid user t1na from 195.162.62.230

These actually go on for quite a few usernames, and the guy’s working off a default list. They will now be denied outright by TCP Wrappers since they’re caught by the UNKNOWN rule in hosts.deny.

Oct 24 20:19:11 HopsAndBarley sshd[9074]: Invalid user newsletter from 59.145.145.146
Oct 24 20:19:14 HopsAndBarley sshd[9079]: reverse mapping checking getaddrinfo for dsl-kk-static-146.145.145.59.airtelbroadband.in [59.145.145.146] failed - POSSIBLE BREAK-IN ATTEMPT!

That asshole is from India. I’m tempted to blacklist India from connecting to me, except that I have Indian friends. Instead I simply set my SSH max auth retries down to 1 and set the “connect” timeout to 5 seconds, making it prohibitively expensive time-wise to try this crap.

And finally this poor asshole wins the award:

Dec 8 14:52:29 HopsAndBarley sshd[15188]: Invalid user felix from 62.141.122.246
Dec 8 14:52:31 HopsAndBarley sshd[15193]: reverse mapping checking getaddrinfo for dial-up-1-118.spb.co.ru [62.141.122.246] failed - POSSIBLE BREAK-IN ATTEMPT!

Because it took him so long to connect, he was at it for over 12 hours.
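If you want to see who is hammering your own box, a quick sketch like this will tally the “Invalid user” lines per source address. It assumes the log lines look like the ones quoted above; the regex and function name are mine, and other sshd versions may word the message differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... sshd[2917]: Invalid user t1na from 195.162.62.230
INVALID_USER = re.compile(
    r"sshd\[\d+\]: Invalid user (\S+) from (\d{1,3}(?:\.\d{1,3}){3})"
)


def count_invalid_users(lines):
    """Count 'Invalid user' attempts per source IP in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = INVALID_USER.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

Feed it an open log file (for example `count_invalid_users(open(path))`, where `path` is wherever your syslog lives) and call `.most_common()` on the result to see the worst offenders first.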

Now, the stuff that makes me less happy is that this is OpenSuSE. I love SuSE; it feels like RedHat Done Right. But some of their default security settings aren’t appropriate for a desktop system. I realize there are probably times when having UNKNOWN hosts denied access would ruin someone’s websurfing experience, but having SSH respawn indefinitely with no delay or max auth retries is sloppy. On the other hand, OpenSuSE and SuSE in general are really good at not spawning services they don’t need, and the default firewall for a desktop host is really restrictive (actually, it denies all inbound traffic without matching outbound traffic). It’s OK to have this as a point defense, but in today’s age of browser-based exploits, it wouldn’t surprise me in the least to find out someone starts killing Linux desktops by connecting to localhost once they have your browser. A firewall is nice, but defense in depth is a requirement.