CeRch servers
Oct 19th, 2012 by Tim Watts

Latest:

  • lists.cerch is back.
  • Elta has a rather weird “wiring” problem. Now with temporary fix pending proper fix.
  • That leaves lists.cerch and we should move elta-stg and kindura-stg as that part of Xen is a little unstable.
  • elta aka http://www.elta-project.org/ is back, on the new VMWare system. Just waiting for final testing by the project team.
  • Now re-arranging the disks on elta to suit VMWare
  • elta.cerch (live site) has been copied over – now we have to build a VMWare VM around it.
  • wiki.cerch is now stable and back!
  • wiki.cerch nearly done – needs a little work on the networking…
  • Cleaning up wiki.cerch now – a lot of files there.
  • It seems only kindura on 137.72.172.254 had open firewall ports on 8080 to 4 specific IPs. I would be very surprised if elta did not as well.
  • Made an urgent firewall request to ITS as the broken servers were on 137.73.172.x and they will have to be rehomed to 137.73.123.x
  • So sorry folks – flat out busy again today…
  • Wednesday – the not-booting servers have serious boot problems (cannot find LVM disks) – will migrate them to VMWare.
  • I think everything is back except elta, wiki and lists – these will not start on Xen. Will probably migrate to VMWare.
  • Yes – mysql-server-1.cerch is now restored onto a brand new VM and ahnet.cerch is back as a result.
  • Data restored – now checking that the permissions restored correctly.
  • New server for CeRch MySql built – just trying to restore the data…
  • Sadly mysql-server-1 is miserably unstable. Migrating to the ESX Cluster now.
  • Right – I think that is everything. Let me know if anything is missing…
  • elta, kindura, hansards, cmes-stg, drupal and projects are back.
  • triples and demos are back.
  • bril-dev is back.
  • Oh – it’s back???
  • Spoke too soon – mysql-server-1 has just died 🙁
  • mysql-server-1 is now booted on Xen – Hooray!
  • I strongly suspect this will run into Tuesday too.
  • Now we will try to get Xen/Opennebula going in order to start the VMs.
  • Yay! NFS mounts on all 3 hosts.
  • Slight NFS issues on drury-vmhost-1. OK – needed the NFS option subtree_check enabled on tereus.
  • OK – winning slowly… Applied urgent security patches to tereus, ulcc-vmhost-{3,4} and drury-vmhost-1 – just rebooting now…
  • The reason for the iSCSI disk not mounting is twofold: 1 – /etc/fstab had the wrong entry; 2 – multipathing has been set up and is causing a timing problem (device not ready when /etc/fstab is processed)
  • Well – we can now mount the volume containing the virtual servers disk files. It would be sort of nice if this would happen when the system reboots!
  • Some success in locating and remounting the SAN disk. The rest will have to wait until Monday.
  • iSCSI target IP discovered to be 192.168.130.101/24 (also default) – can now ping this. Now we have a chance!
  • Management IP 192.168.128.101/24 (which is the manufacturer default) via eth5 on tereus.
  • Hooked up SAN management port to tereus and will attempt to access SAN management to determine settings.
  • Identified SAN as Dell MD3000i.
  • Tereus is not mounting its SAN disk where the virtual server disks live – because it was not configured correctly. So I have no idea of the iSCSI IP addresses or other parameters!!!
  • Filestore.cerch is back and LDAP was migrated to the VMWare ESX cluster.
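
On the iSCSI timing problem noted in the list above (the multipath device not being ready when /etc/fstab is processed), one possible workaround is to wait for the device node to appear before attempting the mount. The following is only a minimal sketch of that idea in Python, not what was actually deployed on tereus; the function name and the device and mount paths in the usage note are hypothetical.

```python
import os
import time


def wait_for_device(path, timeout=60.0, interval=1.0,
                    exists=os.path.exists, sleep=time.sleep,
                    clock=time.monotonic):
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Returns True if the path appeared, False if the wait timed out.
    `exists`, `sleep` and `clock` are injectable so the loop can be
    exercised without real hardware.
    """
    deadline = clock() + timeout
    while True:
        if exists(path):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A boot-time script could call something like `wait_for_device("/dev/mapper/vmstore")` (a made-up device name) and only run the mount for the virtual server volume once it returns True, logging an error otherwise.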


FIXED: blogs.cerch crashed (2 blogs disabled)
Oct 4th, 2012 by Tim Watts

This server is proving difficult to restart in the Xen environment so tomorrow, Friday 5th October, it will be migrated to the DDH VMware cluster as an emergency.

Update 10:14am Friday: Disk files now copied to ESX cluster SAN. Starting the conversion process.

[Xen VMs boot in a different way to pure virtual machines, e.g. there is no bootloader, so we have to add one and tidy up the disk layout.]

Update 12:54 Friday: Migration is complete. Host is security patched at the OS level and integrated with DDH core systems. Blogs are all running except:

I do not know if these ever worked, so please submit a ticket via Mantis if these need to be looked at.

System backups are running now
Oct 4th, 2012 by Tim Watts

As you know, we run most of our systems on virtual machines on top of a VMware ESXi 4.1 cluster, which is located at the University of London Computer Centre off Russell Square. The cluster runs with a SAN (disk array) providing 20TB of storage (in RAID10 format for speed) for virtual machine operating systems and data.

