Before I worked in tech, I had never heard anyone actually say, "I hate computers." But on my first day at my second tech job, as a server admin at a Rochester, NY Gannett newspaper that shall remain nameless, I heard one of my co-workers say it. During the 5 years that I worked there, I heard that phrase from someone or other many times a day. Usually it was because the computer was doing something it wasn't supposed to do. Or because it was doing something it was supposed to do that no one in their right mind should ever have programmed it to do. Or because the problems it was having just didn't make sense.
In those ways, IT (Information Technology) is very much like medicine, except without that gratifying and rewarding human contact. That's actually a joke. I have a cousin who is a nurse who has much more caustic things to say about that "human contact." She never says bad things about her patients, except maybe to mention that dealing with them can be gross. The rest of the human race is not always, or even often, in her good graces. I sympathize, believe me.
Ok, maybe IT isn't like medicine. Software development (a small portion of IT), in which I spent quite a lot of my career, is like medicine, only your patient is unable to communicate without special equipment (you know, a keyboard and monitor), test results almost always come back polluted with a lot of noise[1], and "illnesses" are often not actually the patient's problem at all. Instead, they are caused by some other computer not giving the answer that was expected, or not giving any answer at all. Or giving an answer, but later than it was supposed to.
You think I'm kidding?
A few years ago I was working on a project related to a website for a large customer. You don't need to know who it was. They were a painful customer, but this particular situation wasn't their fault. I'm not sure it was anybody's, really, and that's scary. I really prefer it if I can blame someone. Especially if that someone isn't me.
So the customer complained that some of their end users were complaining that the site was crashing. That's a big problem! It tends to annoy people when they can't get stuff done. I don't know why. Some kind of human quirk, I suppose.
The thing was that not everyone was having the problem. It was only some of them. I know what you're thinking. They probably clicked something they weren't supposed to. This is another human quirk. People can't resist clicking things, or typing things they aren't supposed to, or doing anything else just because they can. But we don't hate users. We hate computers, for giving people too many toys to play with.
After a while, we had a bunch of our people logging on to the site (using test credentials) and trying to do the things that users would do. The crash happened - once in a while. Once in a while? It was at the exact moment when the backend of the system was communicating with another server that ran software from another company. Sometimes the communication worked, sometimes it failed.
Ohhhhh! So blame the other company and make THEM fix it, right?
Nothing is ever that easy. It wasn't actually the software that was failing. It was the communication. But only sometimes. Usually it was fine.
Picture, if you will, myself and about 6 other people, tearing our hair out trying to figure out a problem that just didn't make sense. How can a computer be inconsistent? That's the opposite of what computers are, isn't it?
Somewhere along the line, we found the error message that was being returned when the communication failed. That was what put us on to the other software server in the first place.
So I started searching for that error message in the log files. This is where it got to be fun! You see, because this was a production system and because the customer didn't want it to have problems because of too much load, we had 8 identical servers going at once. And because we had never gotten around to consolidating all the log files together[2], I had to go to each of the 8 servers and search for the error.
You know what I found? On most of the servers, nothing. Or maybe just a few instances. But on 2 of the 8 servers, there were hundreds. On those two servers, the communication step never worked.
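For the curious, that kind of per-server tally could be sketched roughly like this. It is not the search that was actually run at the time; the server names, the log lines, and the error text are all made up for illustration, and the threshold of 100 is just an assumption about what counts as "hundreds."

```python
from collections import Counter

# Hypothetical error text; the real message is not given in this story.
ERROR_TEXT = "upstream communication failed"


def count_errors(logs_by_server, error_text=ERROR_TEXT):
    """Map each server name to the number of its log lines containing error_text."""
    counts = Counter()
    for server, lines in logs_by_server.items():
        counts[server] = sum(1 for line in lines if error_text in line)
    return counts


def suspicious_servers(counts, threshold):
    """Return servers with at least `threshold` errors, worst offender first."""
    return [server for server, n in counts.most_common() if n >= threshold]


# Made-up sample data standing in for the log files on each server.
sample_logs = {
    "web1": ["ok", "ok"],
    "web2": ["ok", ERROR_TEXT, "ok"],
    "web3": [ERROR_TEXT] * 300,
}

counts = count_errors(sample_logs)
print(suspicious_servers(counts, threshold=100))
```

With the log files consolidated in one place, a tally like this is one pass over one pile of data instead of 8 separate logins.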
Did I mention that the 8 were IDENTICAL? But 2 of them apparently weren't as identical as the others. Heh. Heh. You've heard of Animal Farm? "All animals are equal, but some animals are more equal than others." Now you've heard of ... Server Farm[3]!
They were actually virtual servers[4], but "Virtual Server Farm" would have ruined the rhythm of the punchline. See, this stuff isn't as easy as you might think.
We figured out a fix for the problem. Since they were virtual servers, it was easy to tear them down and recreate new ones just exactly like them. The new ones never had the communication problem. So we just arranged for them to be recycled regularly, so they were always new and with good communication.
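The "recycle them regularly" idea boils down to picking out the servers that have been alive too long. Here is a minimal sketch of that selection step, assuming a 3-day limit and invented server names and creation times; the story doesn't say what schedule or tooling was really used, and the actual tearing down and recreating is left out entirely.

```python
from datetime import datetime, timedelta


def servers_due_for_recycling(created_at, now, max_age):
    """Return the names of servers running for max_age or longer, sorted by name."""
    return sorted(
        name for name, created in created_at.items()
        if now - created >= max_age
    )


# Made-up creation times for three hypothetical virtual servers.
now = datetime(2020, 1, 8, 12, 0)
created_at = {
    "web1": datetime(2020, 1, 1, 12, 0),  # 7 days old
    "web2": datetime(2020, 1, 7, 12, 0),  # 1 day old
    "web3": datetime(2020, 1, 5, 0, 0),   # 3.5 days old
}

# The 3-day limit is an assumption for illustration only.
due = servers_due_for_recycling(created_at, now, max_age=timedelta(days=3))
print(due)
```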
This is one of the most annoying things in working with computers. We fixed the problem without ever knowing what caused it.
THAT's why, sometimes, IT people say they hate computers.
[1] This means that log files take a lot of experience to interpret, partly because they are not intended to be read by humans. Also, sometimes they have garbage in them because of things hackers did. Or because irrelevant processes crashed. Or because it's Tuesday.
[2] This is a basic step that should be done the moment you decide you need more than one server. But it often ends up on the "When we get around to it" list.
[3] I know. "Server Farm" is not a new term. It's the context that makes it different here. Am I explaining too much?
[4] A virtual server doesn't run on dedicated computer hardware like this laptop I'm typing on does. It's more like a piece of software that acts like a computer. You can run several of them on one piece of hardware, which is pretty cool.