Sunday 2 November 2014

Google Sheets - inserting a static variable

Ever had a spreadsheet with set variables and wanted to use them in a formula without having to type them over and over - just use the drag-down feature and have it work? For example, say I have a formula that references A3: when I apply that formula to multiple cells, the A3 becomes A4, A5, A6 etc. That's very annoying, and up until now I've gone through and manually corrected it.

Not any more! By putting a $ sign in front of the column ID and the row ID e.g. $A$3 - it won't change! It stays the same! OMGWTFBBQ!
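
A quick made-up example: say A3 holds a tax rate and column B holds prices. Put this in C1 and fill it down:

=B1*$A$3

The B reference moves as you drag (C2 becomes =B2*$A$3, C3 becomes =B3*$A$3 and so on), but the $A$3 part stays locked on your tax rate cell.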

How long have I looked for this - I found the answer here: http://www.gcflearnfree.org/googlespreadsheets/14.3 - read up for more info! Yay!

Wednesday 24 September 2014

How to restore a file with StorageCraft ShadowProtect

I've installed ShadowProtect on most of my clients' servers - it's a great product and if you're not using it for backups, then seriously consider it. One of our sub-contractors emailed me with some issues on restoring files so I thought I'd add my reply to him here as a quick cheat sheet:

  • Log onto the server you need to restore the file from
  • Open up the share where your backups are going
  • Browse through the list of files and look for a -cd.spi, -cw.spi or -cm.spi file around the correct date
    • -cd.spi is a consolidated daily
    • -cw.spi is a consolidated weekly
    • -cm.spi is a consolidated monthly
    • for a full listing see here: http://bit.ly/1mNLBps
  • once you've found the correct file, right click on it and choose ShadowProtect Mount
  • pick the defaults, except when it comes to the right date - the consolidated files have a list of possible days / weeks that you can choose - find the right one and click on it, then go Next
  • mount the file as read-only unless you need read-write access
  • the computer will mount the drive as a new drive letter
  • browse through that drive until you find the file you want to restore, then copy and paste it to the right location and that's it
  • unmount the backup image and all finished! 

It's important to note this is only one of two ways of doing it. You can also use the wizard that is part of ShadowProtect, which is even easier. At the time I was in a hurry and had to find multiple awful files, so I used this method; I also find I use it for Granular Recovery for Exchange restores.

Friday 8 August 2014

Restoring Windows Sharepoint Services 3.0 - from disaster to victory beer!

Recently during a server upgrade I applied SP3 to Windows Sharepoint Services 3.0. This particular server had seen no love in a long, long time and it needed an absolute slew of updates. Naturally, Sharepoint broke and the site loved by my client was unavailable, as were many other services.

The errors in the Eventlog were varied and painful, with lots of vague references to the apocalypse and the like. Naturally the logs get incredibly dense, and I had another issue to contend with along the way - disk corruption. The NTFS filesystem was reporting corruption and it had taken out a chunk of the SharePoint wizard's configuration. That obviously had to be fixed first and was very worrisome, especially given I was working on a RAID1 disk.

Normally, because the database needs an upgrade when you apply SP3, if it doesn't start straight up you can run the SharePoint Products and Technologies Configuration Wizard to repair it. Failing that, you can disconnect the farm, fix the database issues and then re-run the wizard and connect back to the farm. With the disk issues, and with the previous systems admin never having fully applied all the updates, none of this was working - in fact the wizard was failing spectacularly.

This is where things got to from my notes:
  • Ran the config wizard and told it to disconnect from the server farm per MS documentation
  • re-ran the config wizard - it now reported that IIS was not working properly
  • looked into this - the suggestion was that a compatibility mode setting had not been applied. Unable to apply this in Windows Server 2003
  • ran a repair on WSS 3.0 - this requires a reboot
  • many, many ASP.NET 2.x errors, all non-descriptive and very dense to understand without being a .NET programmer
So we were right up that fabled creek without a paddle. I finished patching the system, which sorted out the issues with ASP.NET. I still had no connectivity to SharePoint so I ran through some more updates and managed to partially get the SharePoint Admin site up. I was still getting all sorts of errors and came across a post that suggested I change the ASP.NET version of the Admin site to 2.0.whatever. You can get to this via the IIS management tool, right click on the website, go to ASP.NET and edit the configuration, altering it to the version you want. I did this and it made no difference, but after restarting IIS the admin site came up. Awesome sauce. There were also a few permission changes I needed to make - the Network Service account had somehow lost access to the content database.

I had a backup of all the WSS databases, and the databases themselves were actually still running on the server. What I didn't realise, and what I hope you, gentle reader, can take from this, is that the restore was far easier than I thought. I removed SharePoint from IIS and created a new web application. I also created a new site collection and a new database. From here I went to Content Databases and added in the old content database, but I still couldn't get the right site to come up. In fact, the old content DB and the new one conflicted and I had no access to anything. What I should have done was this (all through the WSS Central Administration site):

  • create a web application
  • in Content Databases add the old content database - you may have to use the stsadm command to do it which is:
    • stsadm -o addcontentdb -url http://server -databasename WSS_Content (which is the default name)
  • Check under Site Collection List - you should see your old website application there
  • restart IIS and check the site.
Where I had a lot of pain was that I didn't realise the old site was held within the WSS_Content database, so I didn't need to add a new site or create a new site collection. How remarkably painful is all I can say. I hope it'll be a bit easier during future upgrades.
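
For reference, here is the stsadm command from the list above in its fuller form. The server name, database names and SQL instance below are placeholders, so substitute your own, and note that stsadm lives in the WSS "12 hive" bin directory (usually C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN):

stsadm -o addcontentdb -url http://server -databasename WSS_Content -databaseserver SQLSERVER\INSTANCE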

Tuesday 15 July 2014

Upgrading DragonFlyBSD

I always forget how to do this, so I'm documenting it here. The DragonFlyBSD website is quite good and this all comes from www.dragonflybsd.org/docs/newhandbook/Upgrading

Firstly, make sure the Makefile is present:

# cd /usr
# make src-create

and wait while it does its thing.

Then we need to get the new source to build it:

# cd /usr/src
# git checkout DragonFly_RELEASE_3_8 (which is the current one)

To find out what the current one is:

# cd /usr/src
# git pull
# git branch -r

Then the build and upgrade process:

# cd /usr/src
# make buildworld
# make buildkernel
# make installkernel
# make installworld
# make upgrade
# reboot

And it should all be done.
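
One last sanity check I like to do after the reboot, just to confirm I'm actually running the new release - nothing DragonFly-specific here, plain old uname:

# uname -r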

Monday 9 June 2014

Adventures with Crashplan for backups

Recently, through the excellent SAGE-AU (www.sage-au.org.au), I read about Crashplan. Produced by Code42 and found here: http://www.code42.com/crashplan/ - there were lots of positive comments about it. I've since deployed it in two separate locations - in the office and at home. I'm using the free implementation at the moment, which allows you a backup each day to a variety of places. They include a 30 day trial of their cloud backup solution, which is quite cheap for a home implementation - $165 / year for 2 to 10 computers. Check out the full pricing, but see what you can do with the free version first.

At the office we have a straight Microsoft Windows based environment - Windows 7, Server 2008 R2 and a wee Windows 8 here and there. I've set up a Crashplan account using a single email address and installed it on almost all our machines. I have a Windows 2012 Server running in our virtual environment and I'm using it as the base for all the backups to go to. I added a 2TB virtual disk to it, configured Crashplan and started pointing machines back to it. It's working brilliantly! As they say though, backups are optional - restores are mandatory. Since implementation I've had to run three separate restores, from all sorts of weird files to basic word documents and it's run flawlessly!

At home I've been messing with it too. I've installed it on my Linux Mint desktop which runs all the time, and has an NFS share back to my FreeNAS. I've set up Crashplan to use that location for backups and I have the wife's Windows 7 laptop, my MacBook Air and my Windows 8 PC all backing up to that location now. Totally cool! Crashplan has installed and worked on all the machines without any issues, complications or anything. It's excellent!

Emails are sent from Crashplan to notify you if machines are backing up properly or haven't backed up for a given amount of time and this is very handy. Our offsite techs are frequently away for days and as soon as they get back, their laptops start automatically backing up. It's the easiest implementation I've found so far.

Check it out http://www.code42.com/crashplan/ it's awesome!

Sunday 25 May 2014

Securely wiping a hard disk in Linux

We're getting ready for some changes at home, and I thought I'd go through the old hard disk drives I have laying around. Once I'd managed to get them all together there are a staggering 25 to be wiped :(

Usually I use the excellent Darik's Boot and Nuke (DBAN), which is awesome and very simple to use. In this instance, however, I'm also doing a fairly large data sort and archive, and I need a functional machine to browse the disks prior to their destruction and reissue. Given my well-known love for Linux Mint, I executed an extensive (20 second) Google search and came up with the following interesting information:

ATA and SATA drives, including SSDs, now have an internal way of securely wiping themselves! Do this from a command prompt (elevate to root for ease of use, and make a careful note of your disk drives - if you wipe your system disk or data disk then it's game over! Maybe use a LiveCD?)

Go and check out https://ata.wiki.kernel.org/index.php/ATA_Secure_Erase

The quick version is:

# hdparm -I /dev/sdx (where sdx is your disk) and check that "not frozen" is there. If that's OK proceed:

Set a password on the disk (otherwise the secure wipe won't work):

# hdparm --user-master u --security-set-pass ryv1 /dev/sdx (where ryv1 is the password, and the username is u)

Check it worked:

# hdparm -I /dev/sdx
Security:
       Master password revision code = 65534
               supported
               enabled
       not     locked
       not     frozen
       not     expired: security count
               supported: enhanced erase
       Security level high
       440min for SECURITY ERASE UNIT. 440min for ENHANCED SECURITY ERASE UNIT.


Note the 440min is for a 2TB Western Digital Green drive. 440 minutes is over 7 hours!

Now it's time to unleash the full power of this fully operational command!

# time hdparm --user-master u --security-erase ryv1 /dev/sdg
security_password="ryv1"

/dev/sdg:
 Issuing SECURITY_ERASE command, password="ryv1", user=user

It's worth noting that when I ran the command above on my Linux box I stupidly pressed CTRL-C to copy the above text - which is also the shortcut for cancelling a running program. NOTHING HAPPENED! Once started, it's a runaway freight train, so be *very* careful to select the right disk or it could be a sad day for you.

The good thing about this command, though, is that the load on your computer is negligible - the disk itself is doing all the work. I can see its I/O is through the roof, but otherwise normal system activity is not compromised.
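
Once the erase finally finishes, it's worth re-running the identify command to confirm the drive has dropped the security password - after a successful erase the Security section should show "not enabled" again:

# hdparm -I /dev/sdx | grep -A8 Security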

The upshot of all of this is as follows - although it's a cool way to do it, I'm going to simply find the data I need off all these disks, then take them and hook them up to another machine with multiple SATA ports and DBAN the lot - much faster in the long run!

Saturday 24 May 2014

Effects of travel on IT or What the hell do I take when I go overseas?

Recently I was on a trip to Jakarta - for pipe band, of all things - but while there I still needed to keep up with my normal information load. My gear loadout for work, or for holidays in Australia, typically consists of two mobile phones (one work / one private), a Google Nexus 7 (WiFi) and my 11" MacBook Air or 15" MacBook Pro. Taking all of this junk to Indonesia was unfeasible, even though altogether the weight was under 3 kg. I knew I would have my normal number of emails, and would still want to check my Feedly and Facebook, take photos and so on. Keeping everything charged and good to go is a challenge at the best of times, and I imagined it would be worse in Jakarta.

Heading over, I took my HTC One X and my Nexus, and that was it. It was a gamble because I didn't want to be too unplugged, but I still needed access to a wide variety of data. I wondered what other people travelling took, and it seemed that tablet plus mobile phone was fairly typical - very few people seemed to have included a laptop of any type. I generally find that typing on a tablet, even one with a Bluetooth keyboard, is difficult over a long period of time, especially with any degree of accuracy, so I thought this was pretty interesting. Given the storage limitations of tablets and phones, it was also interesting considering the number of photos and videos everyone was taking - more than one person remarked to me that they had filled their storage and needed to delete some stuff.

Neither of the devices I took have upgradeable storage, so I had to manage it fairly carefully and took less shots than I might normally have.

Something I found very nice was lots and lots of free WiFi everywhere. Hotels, airports, cafes, coffee shops etc all had free internet and it was beautiful. As a country lad, where we're lucky to get 3G coverage - let alone 4G - it was very exciting. It was nice to see such strong cell coverage everywhere too; mobile towers were dotted across the landscape. It was even better for me with the photo backups to Dropbox my HTC performs whenever it's on a WiFi connection. This is a cool feature, and HTC give you a space upgrade to your Dropbox when you connect. Very nice indeed.

On reflection, I should have taken my MacBook Air at least. There were a number of times I needed to SSH to a server to make changes, and using the tablet/phone for that was awful - slow and cumbersome. I also wanted to write up a travel journal, but I found that typing on the tablet/phone interrupted my flow - I tend to write, refine and spellcheck as I type, so hunting across a tiny keyboard for the right key was very hard to get around. Constantly refining my expression was very hard. I asked around, and the chaps I travelled with found no difficulty - they rarely sent big messages, and those that did were adept at using tablets to do so. It should be noted they have much smaller hands than I do! USB power adaptors were very useful, although the power in Indonesia can be a bit sketchy at times.

Good luck if you're travelling and be safe.

Wednesday 12 March 2014

Amazon EC2 experiences

Recently I was reading about Arscoins and the use they made of the free Amazon EC2 micro instances. Intrigued, I decided to take a look.

Amazon have a free tier of services - minimal instances with enough hours to run all month. I chose an Ubuntu Linux instance and, after running through a simple sign-up, had an instance ready to go. Using shared keys I could SSH to it (the only way to go), and I set firewall rules so that only a couple of static addresses could reach it. Amazing! It was all up and going in about 15 minutes. Only a bare-bones server of course, but enough for testing and the obligatory oooh from my co-workers.
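
For anyone who hasn't done it before, connecting looks something like this - the key file name and public hostname below are made up, so use the key pair you downloaded when launching the instance and the address shown in the EC2 console (the Ubuntu AMIs log you in as the ubuntu user):

chmod 400 my-ec2-key.pem
ssh -i my-ec2-key.pem ubuntu@ec2-54-123-45-67.compute-1.amazonaws.com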

The instance is free for 12 months and I've set alarms so that if I exceed the free usage I will be notified of any billing. They also offer Windows servers and a variety of different operating systems. For the minimal amount of time involved it was a great experience. I strongly recommend treating the instance like any other server and keeping it updated and secured.

Saturday 8 March 2014

Hubsan X4 H107C 2.4GHz Quad Copter Adventures

Recently I acquired a Hubsan X4 H107C from eBay.
It's a little tiny quadcopter! I first became interested, not so much in RC, but in quads after watching the TED Talk by Raffaello D'Andrea - The astounding athletic power of quadcopters. I found it amazing, and even more so after watching another TED Talk, by Andreas Raptopoulos: No roads? There's a drone for that. These devices are pretty awesome and could revolutionise the way we do so many things - from deliveries to rescue, precision work in the air, even video recording at the Winter Olympics!

Given the X4's small size I've been constantly surprised by its speed and stability - even in fairly strong wind. Throughout our house we have ducted air, and when the A/C is up high the breeze is really quite strong. The X4 handles it quite well - I never use more than 60% of power to push against the air currents and can keep it fairly steady. Comparatively, my much larger RC helicopters struggle mightily and frequently can do no more than skim along the floor.

Bravely (I thought) I took my little X4 out yesterday afternoon. Our small yard is quite sheltered, but the breeze we do get swirls through the area, generating all kinds of random currents and eddies. It's amazing how much the quad gets pushed around. Getting used to the control surface is taking a while - it's easy to overcompensate for a swirl in the air, getting myself into more trouble and then desperately trying to recover from that before crashing. The X4 is incredibly agile, far more so than the other RC helicopters I have, and also far more stable.

Long story short, being the brave adventurer I am, I rocketed the throttle to 100 and watched my little X4 zoom up into the sky. Immediately it became apparent the wind was much stronger up there. Uh oh - tree, oh no - roof, arrgh! - tree again - each time these hazards came up I was able to "gracefully" recover and not hit anything. I always forget though to increase power when manoeuvring. The quad tends to drop a bit when you execute movements - the more extreme the movement, the more it impacts on your altitude. As the X4 soared over the roof of the house I momentarily forgot this: tried to reverse, lost altitude, tapped the roof with the front rotors, flipped and that was that.

Getting it off the roof was a real chore, and the rotor blades were quite badly damaged. Luckily the X4 comes with spare blades - interestingly, they are labelled A and B, and if you put them on the wrong pylon the thing won't take off. As you can see in the first TED Talk above, you can cut the blades off and it will still fly; get the wrong blades on the wrong rotors, though, and it's not going to get off the ground.

The X4 was about $80 AUD ($20 + postage) from Hong Kong. I've already ordered more blades, new plastic protective ring (for flying indoors) and larger capacity batteries. I get about 6-7 minutes flying time and then it's a 40 minute charge on USB. This particular model has a 0.3 megapixel camera built in to it too. Camera on/off doesn't seem to knock the battery around too much, and the footage is OK. When I eventually take some footage that doesn't cause motion sickness I'll post it.

The packaging says this isn't a toy and it really isn't. You've got to really pay attention to it and fly with some care to avoid crashing and damaging it or other things. I accidentally ran it into myself and have two 3cm superficial scratches that bled!

There are quite a few videos on YouTube of this quadcopter available. Some of them are very nice. I'm looking forward to flying mine in a large open area and seeing how it goes. I've included the specs from the manufacturer Hubsan below:

Hubsan X4 H107C Quadcopter


Item No.: H107C
Item Name: The Hubsan X4
Barcode: 6922572400030 (EAN-13)
Motor (x4): Coreless motor
Frequency: 2.4GHz
Channels: 4
Battery: 3.7V 380mAh
Flight time: around 7 minutes
Charging time: 40 minutes
Latest 6-axis flight control system with adjustable gyro sensitivity
Permits super stable flight
Lightweight airframe with nice durability
4-way flips (left, right, forward, backward)
USB charging cable allows charging from a computer
Suitable for outdoor flying
Transmitter: 2.4GHz, 4 channels
Camera: 0.3 MP
Video recording module included
Memory card: Micro SDHC (not included)

Friday 7 March 2014

Using Nagios and SNMP to monitor network devices

Usage:
check_snmp 
-H <ip_address>
-o <OID>
[-w warn_range]
[-c crit_range]
[-C community]
[-s string]
[-r regex]
[-R regexi]
[-t timeout]
[-e retries]
[-l label]
[-u units]
[-p port-number]
[-d delimiter]
[-D output-delimiter]
[-m miblist]
[-P snmp version]
[-L seclevel]
[-U secname]
[-a authproto]
[-A authpasswd]
[-x privproto]
[-X privpasswd]

Note:

The -c and -w options (critical and warning ranges respectively) express ranges differently depending on whether you want critical to be low (under 10, for example) or high (over 90). In the former case, say the signal level of a microwave device you are monitoring is critical when under 10% and warning when under 25%; then the format of the -w and -c would be:

            -w 25: -c 10:

If on the other hand you are looking at a signal-to-noise reading where warning is anything over 50 dB and critical is anything over 75 dB, then the command would be:

            -w :50 -c :75
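
Putting that together, a full check ends up looking something like this - note the host, community string and OID here are placeholders, so substitute the real ones from your device's MIB:

check_snmp -H 192.168.1.50 -C public -o .1.3.6.1.4.1.99999.1.2.3.0 -w 25: -c 10: -l "Signal Level" -u %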

If you have, say, a table where numbers translate to other things - again using a microwave example:

wvSubDataRate  OBJECT-TYPE
    SYNTAX      INTEGER {
                    rf-bw-1p5-Mbps(1),
                    rf-bw-2p25-Mbps(2),
                    rf-bw-3-Mbps(3),
                    rf-bw-4p5-Mbps(4),
                    rf-bw-6-Mbps(5),
                    rf-bw-9-Mbps(6),
                    rf-bw-12-Mbps(7),
                    rf-bw-13p5-Mbps(8),
                    rf-bw-18-Mbps(9),
                    rf-bw-24-Mbps(10),
                    rf-bw-27-Mbps(11),
                    rf-bw-36-Mbps(12),
                    rf-bw-48-Mbps(13),
                    rf-bw-54-Mbps(14),
                    rf-bw-72-Mbps(15),
                    rf-bw-96-Mbps(16),
                    rf-bw-108-Mbps(17)
                }
    MAX-ACCESS  read-only
    STATUS      current
    DESCRIPTION
        "The data rate of the station."
    ::= { wvSubStatusEntry 4 }
and you'd like to see the actual data rate instead of the raw number, then you need to tell check_snmp which MIB to use by putting the -m switch at the end, e.g.

-m MWAVE-MIB

and it will translate the output (so instead of just a number like 15 you get rf-bw-72-Mbps), giving you meaningful output.
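
So a check against that value might look something like the following - again the host, community string and instance index are placeholders, and MWAVE-MIB needs to be somewhere the plugin's net-snmp libraries can find it (if the symbolic name doesn't resolve, fall back to the numeric OID):

check_snmp -H 192.168.1.50 -C public -m MWAVE-MIB -o wvSubDataRate.1 -l "Data rate"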

That's it for now but more to follow as I keep working with this type of hardware.


How to fix Nagios3 external commands error

After installing nagios3 and trying to send it a command to reschedule a check or do some other external activity you may get the following error:

Error: Could not stat() command file '/var/lib/nagios3/rw/nagios.cmd'!
The external command file may be missing, Nagios may not be running, and/or Nagios may not be checking external commands.
An error occurred while attempting to commit your command for processing.
In order to fix it, do the following actions:

Check that /etc/nagios3/nagios.cfg has:

check_external_commands=1

Also check that

command_check_interval=15s is uncommented and
command_check_interval=-1 is commented like this:
command_check_interval=15s
#command_check_interval=-1
Check the path for command_file is OK. It usually looks like this:
command_file=/var/lib/nagios3/rw/nagios.cmd
Make sure that the user www-data is part of the nagios group - this is located in /etc/group
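
If www-data isn't in there, adding it is a one-liner - then restart Apache so it picks up its new group membership (this assumes a standard Debian/Ubuntu style nagios3 install):

# usermod -a -G nagios www-data
# service apache2 restart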

Check permissions on the command file that we looked at above:
# ls -l /var/lib/nagios3/rw/nagios.cmd
prw-rw---- 1 nagios nagios 0 Mar  7 11:56 /var/lib/nagios3/rw/nagios.cmd
If it looks like this, we're good.

The next thing to check is that the directory that nagios.cmd resides in has executable rights for the nagios group:
# ls -l /var/lib/nagios3/
total 180
-rw------- 1 nagios www-data 176049 Mar  7 11:58 retention.dat
drwx------ 2 nagios www-data   4096 Mar  7 11:56 rw
drwxr-x--- 3 nagios nagios     4096 Jun 14  2013 spool
Uh oh - rw has no group rights! Fix it with this command:

# chmod g+x /var/lib/nagios3/rw
and then
# service nagios3 restart
And the crowd goes wild!

Wednesday 5 February 2014

How to update a single XenServer 6.2 host before adding it to a pool


You may find yourself in the position where you need to add a new host into a pool and its patching is well behind. Thankfully there is a relatively straightforward way to do this.

  • You will need to have already downloaded and extracted the required files.
  • On a computer with XenCentre installed run the following commands (assuming default install and Windows 7 / 8)
    • Open a command prompt
      • Win+r, then type cmd (or go to Start -> Run -> cmd and return)
    • cd "C:\Program Files (x86)\Citrix\XenCenter"
    • From here run: xe patch-upload -s server-ip -u root -pw secret-password file-name=c:\username\Downloads\XS62ESP1\XS62ESP1.xupdate and hit return. Make sure the placeholder items (server-ip, secret-password and the file path) are replaced with your actual details.
  • Once this finishes, you'll get a long string of characters back - that's the patch UUID. Normally when updating a pool we make a note of that, but I've found another way that works just as well.
  • Download PuTTY - it's an awesome terminal program and you should have it.
  • Use PuTTY to connect via SSH to your XenServer and get to the console. Alternatively, do this from the server console. Either way, type the following commands and note the two UUIDs they produce:
    • # xe patch-list
      • note the UUID of the patch
    • # xe host-list
      • note the UUID of the host
    • # xe patch-apply uuid=patchUUID host-uuid=hostUUID and hit return. Once all the information stops flowing past and you're back at a # prompt, type:
    • # reboot
  • if you're using PuTTY, you can select the UUIDs to copy them, then simply right-click where you want them to go and PuTTY will paste them from the buffer. Much quicker than laboriously typing them out.
Once the machine reboots, continue to patch as applicable, and then add it to the pool.
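
Once it's back up, a quick way to confirm the patch took is to list the patches again - the params filter just trims the output down to the interesting bits:

# xe patch-list params=uuid,name-label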

Happy XenServering!

XenServer 6.2 - installed on a USB drive and installed on an SD Card

In this post I want to talk about recent experiences I've had with XenServer 6.2 on USB drives and SD cards. In one particular situation I was initially forced down this path by specific hardware and a lack of appropriate drives. The upshot is that it was entirely unsuccessful, and here is why:

Firstly, it's important to be aware that XenServer isn't all that happy being installed on these devices. After you install, you must boot back up off the XenServer CD and then (using ALT-F2) drop to a command line. To get XenServer to boot from USB, do the following:

  • From the command line do this: 
    • # cat /proc/partitions
      • this tells you what partitions XenServer can see. Typically you will see /dev/sda and its children - /dev/sda1, /dev/sda2 etc. We want /dev/sda1
    • # mkdir /target
      • create a temporary location so we can change our root directory
    • # mount -t ext3 /dev/sda1 /target
      • mount our existing /dev/sda1 partition to /target, giving us access to the files
    • # mount proc /target/proc -t proc
      • we need a proc filesystem!
    • # mount sysfs /target/sys -t sysfs
      • and a sysfs
    • # mount --bind /dev /target/dev
      • and of course somewhere for devices to be found
    • # chroot /target
      • change to the /target install of the system (chrooting means we are running now under the actual XenServer install, not the boot cd live filesystem)
    • # cd /boot
    • # ls -al
Now we need to actually tell the kernel of our installed XenServer to add USB support to it so it'll boot. Up until this point, the system will just hang, unable to find a filesystem.

  • # mv initrd-lots.of-numbersxen.img initrd-lots.of-numbersxen.img.old
    • this backs up the initrd image
  • # mkinitrd --with-usb initrd-lots.of-numbersxen.img lots.of-numbersxen
    • make sure you replace lots.of-numbers with the actual kernel version. It could look like this: 2.6.18-53.1.13.el5.xs4.1.0.254.273
  • # exit
  • # sync
  • # reboot
At this stage we are finished and the server will reboot and actually get into XenServer. After patching the pool, or indeed, the server by itself, you may need to apply the initrd update again, but with the new kernel number.

While this was cumbersome to do, particularly because I had to do it six times - 3 for the initial installs, 3 again after SP1 was applied - it wasn't hard, or particularly time consuming. I had to do this for both the USB drives and the SD card, and the SD card was significantly slower than USB. The key difference between the two was that the USB install, after some magical, indeterminate time, would lose its filesystem! XenCenter would lose connectivity to it, and the console interface was non-functional - unable to find "root"! A reboot fixed this, but these servers are supposed to be on all the time with minimal service loss. That was when I turned to the SD card idea. These servers had been running an awful install of VMware 4.1 on SD cards, so I thought they would just work. Sadly, it was not to be. Once again I'd lose connectivity to the server, but this time I was able to get xsconsole to respond and reboot the machine. Totally unacceptable though - I don't have time to test with this system, it's supposed to be production! At any rate, I've purchased disks and am installing them. If anyone has experienced this and knows the answer, I'd love to hear it!

Saturday 25 January 2014

StorageCraft ShadowProtect - Employee of the Month

The last two weeks have been absolutely awful. The IOmega NAS I mentioned in a previous post had a large number of things go wrong with it all at the same time, resulting in complete failure and total data loss. This also meant a significant amount of down time for a client and ridiculous amounts of stress for me.

The good thing to come out of it was the performance of ShadowProtect. This backup solution far exceeded my expectations, to the point where once the recoveries were completed we had only lost 10 minutes of work. Very little data was realistically lost - only time, as we worked to recover it. The Recovery Environment was easy to use, and the Hardware Independent Restore allowed me to recover a virtual machine from VMware ESX 4 to VMware Player, to a physical machine and also to XenServer 6.2 - quite a remarkable feat really. I've had this client running on borrowed hardware for a week while we get new gear in.

The snapshot capability, ease of restore and simple replication of backup files make ShadowProtect a real winner. Check it out!

Saturday 4 January 2014

Experiences with the IOmega ix12 NAS

One of my clients has an IOmega ix12 - a rack-mounted, 12 x 3.5" bay NAS. The one in question is running 2.6.5.something firmware and has (or should I say had) 8 x 1TB disks in it. Today it dropped another one. The NAS has been set up with RAID10 and total storage of 2TB (1.5 or so in reality). Although it has this redundancy, I can assure you it's not all it's cracked up to be. In the main storage pool are a number of iSCSI drives shared out to various ESX servers. Sadly, the largest and most important of these iSCSI drives has an unfortunate tendency to vanish whenever there is a disk issue.

Today one of the drives failed. No big deal, right? RAID10 is usually tolerant of such faults. This one kept going, except for the aforementioned large drive that disappeared like magic. After replacing the disk, impatiently waiting for it to rebuild and then attempting to reconnect the iSCSI drives, I realised I still couldn't see it. The fix? Turn the NAS completely off, count to 30 and turn it back on again. Once it booted, just like magic, everything reappeared. Be wary of this. My previous post was the precursor to this one. Finding the actual root cause is still to be done.

I'm concerned about upgrading the firmware because the backups, while reliable, are never straightforward to recover, and I'd hate to have to recover so many disks. It's a job for next week I reckon.

Friday 3 January 2014

VMware nightmares

After a great couple of days away I was called urgently to work - one of my clients' networks was down and my colleague was stuck. The vSphere client couldn't see the hosts, which couldn't see their datastores, and the network had zero stability. Yay. Just what I wanted to come home to.

The servers are running ESXi 4.1, which needs a few updates, but the network's stability has always been a real issue for us. We took it over from some other chaps, solving a long series of issues with the servers and particularly the IOmega NAS. Throw in a database issue on one of the guest servers that kept the disk I/O at absolute peak all the time, and things were pretty tricky. All those things had been resolved, more or less, and the network had been fully functional for some months. So why did it change? Several reasons, and I hope you take some ideas away from this for yourself.

The NAS appeared to have dropped a disk, and while it has RAID10 set up, it paralysed the system (not at all like the FreeNAS I've talked about before). The whole thing fell over and we shut it down before reseating the disk and powering back up. The NAS detected the disk and began a rebuild. The VMware hosts couldn't see the datastore though, and the server managing the virtual environment periodically disconnected from everything. Initially we thought it was a network issue and restarted everything. The hosts were slow to come up, with errors in the logs indicating an inability to see the iSCSI disks. We could connect to the NAS via its web interface, ping it, and it looked quite happy on the network, so we couldn't understand what was happening. An added complication was that we have a new HP NAS on the network and, while we were able to migrate several of the guests to it, we've had problems getting them started. Don't know why, but the host's CPU goes through the roof every time we try to start them. I thought we might have an All Paths Down bug, and most of the documentation suggests a call to VMware to let them sort it out. At 8pm at night this isn't so great a plan, and with the client losing production time and money we had to solve it.

So with all these errors and problems left and right, I was at a loss. Eventually we took inspiration from The IT Crowd and turned it all off, counted to 10 and started it back up. Would you believe it took 3 reboots of the management server before the vSphere client would connect to anything - especially considering it was running locally! The iSCSI shares from the NAS finally became available - there was a service on the IOmega NAS that had been failing silently, alleging all was good when it really wasn't. Once those shares were available we were able to reconnect to the datastores and boot the guest machines. The management server was still constantly disconnecting from the hosts, but we were able to at least get things going. There is still quite a bit to do there, but the servers are finally running.

Going forward I think we'll use XenServer and NFS shares. Simpler, fully functional and quick. Easy to back up and to expand disks. Adios VMware, I reckon.

The final thought is "Have you turned it off and back on again?" Something in that for all of us :-)

Wednesday 1 January 2014

The pure awesomeness of Google Forms

As a Google Apps reseller it never fails to surprise me how many people love Forms. I swear, I could nearly sell Google Apps to people just based on Forms. If you've never used it, here is a quick run down of what it can do for you:

  • create a web form very quickly and easily, choosing from a range of different responses including:
    • time (duration and date)
    • multiple choice
    • checkboxes
  • create separate pages based on responses
  • autofill a Google Spreadsheet
  • automatically create a summary of responses with a range of different graphs and information instantly
  • embed it into websites
  • email notifications of changes and updates
  • make you more popular with friends (OK maybe not this one!)

Some examples of how I use Google Forms:
  • PC Audit
    • capture make / model, RAM, CPU etc
  • Job Audit
    • bookings for jobs
    • completion of job information
  • Incident responses
Seriously they are awesome and if you're trying to do any type of statistics about stuff, they are a terrific place to start. Make sure though that your questions are good and that you choose the right type of responses to get the right Summary of Responses. It's cool though - you can change it even with live data in the spreadsheet. I have taught a friend how to work Google Forms in under 15 minutes - it is that straightforward. Enjoy!
