The Case of the Mysterious Missing MAC Address

Coriander is dead. Turmeric is alive. What does this mean? All the content on sftp.knarrnia.com went away except for the PDFs and various other ebook formats, which I was smart enough to grab off the RAID before it died entirely. Toward the end, coriander's uptime was 20 minutes tops before it keeled over. Not bad for a computer I built in 2003 and then rebuilt when Drexel had the Heat Wave of Death, which caused me to request an extension on the finals. The box was two RAIDed 80GB IDE drives running an OpenSuSE install that started life as a desktop and was later retired to serving up content to my Nook and XBOX.

Turmeric, however, is built on a first gen nVidia motherboard. And if this is people’s experience with nVidia, I am entirely, absolutely done with them as a motherboard maker. My network is pretty standard for home use. I have an honest to god Cisco router, a Cisco WAP, a Comcast bullshit cable modem which is probably going to have a terrible accident so I can get one that works, and all the devices meander through those. The XBOX gets its permissions through UPnP; nothing else does. Turmeric/Coriander had a MAC address reservation so they would come up, get the right IP, and then the Cisco firewalls would pass traffic to them as needed. It worked swimmingly until Turmeric refused to take the IP I had reserved for it. It would always get a different IP than the reserved one, but it would get that same wrong IP consistently. I racked my brain on this problem for a few hours and finally broke out ettercap to see WTF it was doing.

Turns out the first gen nVidia motherboards do something really stupid with DHCP. Actually let’s rewind for a minute: they generally do really stupid stuff. This motherboard has hardware RAID also, but it only works for the SATA drives. IDE? Shit out of luck. To further add insult to injury, the bootp stuff for jumpstarting a box? Doesn’t work. Never figured that one out. Finally, there are only two default boot devices you get to set in the BIOS. For the moment it’s CDROM and then the first drive in the RAID, but to actually do the install I had to change CDROM to USB after burning an image to a USB stick. What the heck guys?

Now, I’ll save you the boring DHCP spec: when the nVidia board comes up it actually sends a DHCP packet on its own, which is nice. The problem here is the HLEN (hardware address length) field of the packet is… 0. Yup. Someone didn’t know what to put in the field, so they send 0. This causes the router (thank god) to respond to FF:FF:FF:FF:FF:FF, which, while it’s not correct, works because it’s a broadcast packet. The adapter seems to configure itself, then Linux does something goofy: it sees the adapter is already configured, so it sends out a release/renew. The router, apparently knowing the real MAC address by now but already holding an entry for the bogus MAC, hands the correct MAC address a different IP (the next one in the pool).
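To make the HLEN problem concrete, here is a minimal Python sketch that pulls the hardware address out of the fixed BOOTP/DHCP header the way a server would have to. The field offsets come from the standard header layout; the MAC address and transaction ID in the sample packets are made up for illustration, and how my particular router keys its lease table is an assumption on my part.

```python
import struct

def parse_bootp_header(packet: bytes) -> dict:
    """Extract op, htype, hlen and the client MAC from a BOOTP/DHCP header."""
    if len(packet) < 44:
        raise ValueError("too short to be a BOOTP/DHCP packet")
    # First four bytes: op, htype, hlen, hops (one byte each).
    op, htype, hlen, _hops = struct.unpack("!BBBB", packet[:4])
    # chaddr is a 16-byte field at offset 28; only the first hlen bytes count.
    chaddr = packet[28:44]
    mac = ":".join(f"{b:02x}" for b in chaddr[:hlen])
    return {"op": op, "htype": htype, "hlen": hlen, "mac": mac}

def build_header(hlen: int, mac: bytes) -> bytes:
    """Build a fake 44-byte header for testing (hypothetical values throughout)."""
    # op=1 (request), htype=1 (Ethernet), hops=0, made-up xid, secs=0, broadcast flag set.
    fixed = struct.pack("!BBBBIHH", 1, 1, hlen, 0, 0x12345678, 0, 0x8000)
    addrs = bytes(16)  # ciaddr, yiaddr, siaddr, giaddr all zero
    return fixed + addrs + mac.ljust(16, b"\x00")

sane = parse_bootp_header(build_header(6, bytes.fromhex("001122334455")))
broken = parse_bootp_header(build_header(0, bytes.fromhex("001122334455")))
print(sane["mac"])          # the six-byte MAC the reservation would match on
print(repr(broken["mac"]))  # empty: hlen 0 means zero usable address bytes
```

With HLEN at 0 the MAC is sitting right there in the packet, but a server that honors the length field gets nothing it can match against a reservation.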

The BIOS, of course, doesn’t have a way to disable this “convenience feature” and to add insult to injury, dmesg doesn’t work in Linux because the BIOS is doing something funky by itself. For right now I’m just ignoring it. But seriously nVidia, fix your shit.

I Hate nVidia

I have a dirty confession: I’ve always liked ATI stuff, except for when the GeForces first came out and they were cheap as heck. Buying two of them would buy you a high end 3DFX card or ATI card and it would still outperform them. Also these were the college days when AGP was still new and having two video cards meant one less thing to kill your PC. Then again this was Drexel, and we had the Kelly Hall heat wave that year, and it killed my motherboard.

ATI fought and fought hard to get back to the top, and it was only after AMD bought out the last of the DEC stuff for really awesome 64bit support and then gobbled up ATI that things got good again. Frankly it was a great move since graphics are almost all math, so having a 64bit (or even 128bit) pipe with multipath and short-lines is just great.

Then came the licensing wars.

Linus (correctly) said that kernel shims were OK so long as they’re open source. He’s no dummy, kernel shims let the kernel load blobs, but being open source they replace the linking and once you’ve got the linking objects you’re most of the way to having a driver since you can see what the card is being sent and you can see what the kernel is sending. Open source drivers followed, but some of the really exotic stuff only recently caught up.

nVidia has always, always been a pain in the ass in Linux. The shim wouldn’t build when it first came out and required users to edit the Makefile. Certain gcc versions produced drivers which were slow or had unintended consequences depending on how they did memcpy and other low level functions. Installing nVidia was mostly a one way ticket to either kernel lock-in or building it by hand. To further add insult to injury, nVidia never offered a unified driver and always had three versions. Now this was OK up until recently: they kept a list of cards so you generally knew if you needed nv, nvidia-G01 or nvidia-G02. Now the bad news: nVidia has decided to drop updates for older cards. I realize they can’t update them forever, but what’s missing? Open sourcing the drivers.

ATI hasn’t really offered up any open source drivers, but they did offer unified drivers. Download one, build it, you win! The build process is pretty seamless. ATI hasn’t moved to quash open source drivers either, to the point where the open source drivers are so stable that they are now officially merged into Mesa. If you’re wondering what Mesa is, it provides OpenGL functionality to the system in a common package. To have drivers in it for a major manufacturer like ATI means you simply install Mesa and 3D just works. No more diddling around with drivers, third party crap, and the ATI clock tray icon (unless you want to).

Now if you’re like me, you’re running OpenSuSE. You’re probably not like me but you might be running Linux. Windows users should have stopped reading six paragraphs ago. I upgraded to 11.3 from 11.1 (which I needed to run to hack the Novell client from SuSE 10 into working, because Novell doesn’t even update their own stuff) and what broke? Oh, the nvidia drivers. Given that this is a work PC, I have no say in my video card. I went to fire up sax2 and I was told it was deprecated because of XOrg updating their autodetection routines. The new XOrg is nice, the new SuSE is nice, but with no new nVidia release my KDM login manager doesn’t work. Weirdly enough I can log in on the console and do a startx, which does work, but it would be nicer to have a GUI running. (Then again, having an ominous text console keeps the n00bs off my PC.) After hacking on this for most of the last few days, it’s definitely a problem in how nVidia does the initialization and it’s directly related to the fact I am running nvidia-G01. Way to go nVidia.

My laptop (ATI)? Runs great, and it’s a Radeon mobility 600. Hardly new. Guess we know whose video card I’ll be buying in the future.

What the Hell. Seriously.

Today’s happy fun UNIX error message:

/etc/X11/xim: Checking whether an input method should be started.
/etc/X11/xim: line 72: You can write a small letter to Grandma in the filename.: command not found
sourcing /etc/sysconfig/language to get the value of INPUT_METHOD
INPUT_METHOD is not set or empty (no user selected input method).
Trying to start a default input method for the locale en_US.UTF-8 ...
There is no default input method for the current locale.
Dummy input method "none" (do not use any fancy input method by default)

Tertullian was born in Carthage somewhere about 160 A.D.  He was a
pagan, and he abandoned himself to the lascivious life of his city
until about his 35th year, when he became a Christian .... To him is
ascribed the sublime confession: Credo quia absurdum est (I believe
because it is absurd).  This does not altogether accord with historical
fact, for he merely said:

 "And the Son of God died, which is immediately credible because
 it is absurd.  And buried he rose again, which is certain
 because it is impossible."

Thanks to the acuteness of his mind, he saw through the poverty of
philosophical and Gnostic knowledge, and contemptuously rejected it.
 -- C. G. Jung, in Psychological Types

(Teruillian was one of the founders of the Catholic Church).

startkde: Starting up...
kdeinit4: preparing to launch /usr/lib64/libkdeinit4_klauncher.so
kdeinit4: preparing to launch /usr/lib64/libkdeinit4_kded4.so
kdeinit4: preparing to launch /usr/lib64/libkdeinit4_kbuildsycoca4.so
kbuildsycoca4 running...

And so on. I was trying to connect from my XOrg host running KDE to a Sun (Solaris 7) box running X11R3 with CDE. I have no idea how Jesus got involved.


hh, too many forks, PIDs got reused, we’re confused

I’m not entirely sure I’m in love with EXT4. I’ve installed OpenSuSE 11.2 on my laptop (64bit, SMP) and I said “WOW EXT4 IS OUT LET’S USE BLEEDING EDGE FILESYSTEMS WHAT COULD POSSIBLY GO WRONG?”

Everything.

This laptop has a slow, slow hard drive, so I figured any filesystem doing bleeding edge fast stuff might help. My previous favorite was XFS, which I’ve had nothing but fantastic luck with. EXT3 was also nice on the servers and was bulletproof, although noticeably slower than XFS. EXT4 is fine until you put the filesystem under load. It’s blazing fast until it runs out of cache, and then you’re screwed hard waiting for it to catch up. In my case I managed to blow it by copying my documents directory, which I had backed up to my other PC, back to my laptop (about 6GB worth of crap) while running X, firefox and setting up YAST update sources. I thought things were going way too well when we hit the commit wall. The drive light was on solid, and I could move the mouse. Clicking on things resulted in the drive chugging for a bit and then slowly starting the animation. Nothing ever launched. Eventually I couldn’t even kill X. I ended up pulling the plug.

The good news is that it recovers gracefully from a crash. The dreaded “empty file” error didn’t happen to me, but I’m marginally pissed the filesystem will let you outrun it catastrophically. That being said, when I restarted the SCP, I got the cryptic error: hh, too many forks, PIDs got reused, we’re confused…

OpenSuSE 11.1 is full of win, but new ATI drivers, maybe not

Phortunately the Phoronix pholks knew what to do. ATI has had a horrible problem with their drivers in 64bit mode since day one. I’m working on a Dual Core Pentium 4 64bit with an ATI x600 Radeon card here. OpenSuSE 11.1 is a dream but has its share of bugs. Ah, the joys of operating system maintenance. Anyway, turns out that whatever you do with the ATI drivers, you end up with /usr/lib/dri. This is full of fail. You want to remove this directory and put a symlink to /usr/lib64/dri in its place.
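If you would rather script the swap than do it by hand, something like this Python sketch does the job. It assumes the fix really is "make /usr/lib/dri a symlink to /usr/lib64/dri"; the function name is mine, and you would need to run it as root against the real paths.

```python
import os
import shutil

def replace_dir_with_symlink(stale_dir: str, real_dir: str) -> None:
    """Remove stale_dir and leave a symlink at that path pointing to real_dir."""
    if not os.path.isdir(real_dir):
        # Refuse to delete anything if the 64bit directory is not actually there.
        raise FileNotFoundError(real_dir)
    if os.path.islink(stale_dir):
        os.unlink(stale_dir)      # an old link: just drop it
    elif os.path.isdir(stale_dir):
        shutil.rmtree(stale_dir)  # the bogus directory the driver install left behind
    os.symlink(real_dir, stale_dir)

# Intended use (as root, at your own risk):
# replace_dir_with_symlink("/usr/lib/dri", "/usr/lib64/dri")
```

The check on the target directory comes first on purpose, so a typo in the path can't leave you with no dri directory at all.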

Special thanks to the Phoronix Crew.

Linux Arcanum and SMART Warnings

If Languages Were Religions is riotously funny.

I finally figured out what’s wrong with my desktop. For the longest time the instrumentation was just weird. It would crash randomly, have strange bus problems (which I thought were related to aging video cards), and the voltage from the power supply would have a noticeable bit of noise from it. Other than the generic logs of “your computer has recovered from a serious error” there was nothing to point to. MEMTEST would show all the DIMMs had a bad line, so I just assumed the mobo was slowly dying and figured one day I would come home to it not working.

Finally one day I happened to be reading the syslog on my Linux box trying to track down this one idiot on a modem who was trying to hack it when I got the message:

Dec 17 08:29:39 HopsAndBarley smartd[2532]: Device: /dev/sdb, Failed SMART usage Attribute: 9 Power_On_Hours.

OH MY GOD SMART ACTUALLY WORKED. Basically it’s saying my old Linux drive, the one I use all the time, is crapping out. I checked to see where the spare was and realized that the spare became the Windows drive (120GB) and my Windows drive became my Linux drive. The spare-spare drive I had is a 10GB drive I used to use as a raw device for caching DVD data while authoring. Which means I have no usable spare at all. So I have a choice. I can go through my Windows drive and reload it, thus creating enough space for a Linux partition, or I can run the computer without the Linux drive entirely and give up my primary OS for the sake of having anything to use at all.

Since the botnets have been a pain recently I came up with a new /etc/hosts.deny

ALL : .ru
ALL : .cn
ALL : UNKNOWN

Basically, if you’re from a .ru, or from .cn, or your IP doesn’t resolve to a hostname, you’re not connecting.
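For the curious, the decision those three lines make boils down to roughly the following Python sketch. It is only a model of the matching rules as I described them, not TCP Wrappers itself: it ignores hosts.allow and per-service rules, and the function name and the None-for-unresolved convention are my own.

```python
from typing import Optional

# Domain suffixes from the hosts.deny above.
DENIED_SUFFIXES = (".ru", ".cn")

def is_denied(hostname: Optional[str]) -> bool:
    """Return True if a client with this reverse-DNS name would be refused.

    hostname is None when the client's IP did not resolve to a name.
    """
    if hostname is None:
        return True  # ALL : UNKNOWN
    # A pattern starting with a dot matches the tail end of the hostname.
    return hostname.lower().endswith(DENIED_SUFFIXES)

print(is_denied("dial-up-1-118.spb.co.ru"))  # a .ru host
print(is_denied(None))                       # no reverse DNS at all
print(is_denied("example.org"))              # hypothetical well-behaved host
```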

And of course all the other security stuff is in place like denying root login, which seems to be what most of the idiots out there are after.

Here are the types of log entries:

Dec 16 13:12:03 HopsAndBarley sshd[2917]: Invalid user t1na from 195.162.62.230

These actually go on for quite a few usernames and the guy’s working off a default list. These will now be denied outright by TCPWrappers since they’re caught by hosts.deny’s UNKNOWN directive.

Oct 24 20:19:11 HopsAndBarley sshd[9074]: Invalid user newsletter from 59.145.145.146
Oct 24 20:19:14 HopsAndBarley sshd[9079]: reverse mapping checking getaddrinfo for dsl-kk-static-146.145.145.59.airtelbroadband.in [59.145.145.146] failed - POSSIBLE BREAK-IN ATTEMPT!

That asshole is from India. I’m trying to decide if I want to blacklist India from connecting to me, except that I have Indian friends. Instead I simply set my SSH max auth retries down to 1 and set the “connect” timeout to 5 seconds, making it prohibitively expensive time-wise to try this crap.

And finally this poor asshole wins the award:

Dec 8 14:52:29 HopsAndBarley sshd[15188]: Invalid user felix from 62.141.122.246
Dec 8 14:52:31 HopsAndBarley sshd[15193]: reverse mapping checking getaddrinfo for dial-up-1-118.spb.co.ru [62.141.122.246] failed - POSSIBLE BREAK-IN ATTEMPT!

Because it took him so long to connect, he was at it for over 12 hours.
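If you want to see who is hammering on your own box, a few lines of Python will tally the "Invalid user" entries per source IP. This is a rough sketch: the regex is written against the log format shown above and may need adjusting for other sshd versions, and the sample lines are just the ones from this post.

```python
import re
from collections import Counter

# Matches e.g. "sshd[2917]: Invalid user t1na from 195.162.62.230"
INVALID_USER = re.compile(
    r"sshd\[\d+\]: Invalid user (\S+) from (\d+\.\d+\.\d+\.\d+)"
)

def count_invalid_logins(lines):
    """Count 'Invalid user' sshd log entries per source IP address."""
    hits = Counter()
    for line in lines:
        match = INVALID_USER.search(line)
        if match:
            hits[match.group(2)] += 1
    return hits

SAMPLE = [
    "Dec 16 13:12:03 HopsAndBarley sshd[2917]: Invalid user t1na from 195.162.62.230",
    "Oct 24 20:19:11 HopsAndBarley sshd[9074]: Invalid user newsletter from 59.145.145.146",
    "Oct 24 20:19:14 HopsAndBarley sshd[9079]: reverse mapping checking getaddrinfo for "
    "dsl-kk-static-146.145.145.59.airtelbroadband.in [59.145.145.146] failed - "
    "POSSIBLE BREAK-IN ATTEMPT!",
    "Dec 8 14:52:29 HopsAndBarley sshd[15188]: Invalid user felix from 62.141.122.246",
]

for ip, attempts in count_invalid_logins(SAMPLE).most_common():
    print(ip, attempts)
```

To run it against a real log, pass an open file handle for your syslog instead of SAMPLE.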

Now, stuff that makes me less happy is that this is OpenSuSE. I love SuSE, it feels like RedHat Done Right. But some of their default security settings aren’t appropriate to a desktop system. I realize there are probably times where having UNKNOWN hosts denied access would ruin someone’s websurfing experience, but having SSH respawn indefinitely with no delay or max auth retries is sloppy. On the other hand, OpenSuSE and SuSE in general are really good at not spawning services they don’t need, and the default firewall for a desktop host is really restrictive (actually, it denies all inbound traffic without matching outbound traffic). It’s OK to have this as a point defense, but in today’s age of browser based exploits, it wouldn’t surprise me in the least to find out someone starts killing Linux desktops by connecting to localhost once they have your browser. A firewall is nice, but defense in depth is a requirement.