Virtual Managers/Failover not working, why?

Paul Hutchings
Oct 18, 2009 13:02:32 GMT   

OK so I've downloaded the ESX VSA demo and set up two nodes.

I have a single site, and a single cluster containing both nodes.

The cluster has a virtual IP.

I have a couple of volumes, each set to 2-way replication.

I can continuously ping the virtual IP so long as the node that is the virtual manager is up and running. If I power that node off (either via the CMC, or to simulate the failure of a link or node), I lose the ability to ping the virtual IP, and of course my test server loses access to iSCSI volumes.

I believe I need to move the virtual manager, but the CMC doesn't seem to give any option to do this if it thinks the virtual manager is running on the failed/down node. It just says the manager is offline, and I seem to go around in a circle: I can't make the other node the virtual manager until I stop the existing virtual manager, which of course I can't do as that node could be down/on fire, which is the whole point :-)

I guess I'm doing something wrong, but as much as I read the manual I can't work out what.
René Loser
Oct 19, 2009 08:02:43 GMT    Unassigned

Hi Paul,

The Virtual Manager is a manual failover process.
You start the Virtual Manager on a surviving node only when a failure occurs.

I recommend using the FOM (Failover Manager) running on another host (this could be a virtual server, VMware Workstation or VMware Player). The FOM runs all the time and is used instead of the Virtual Manager. It is the decision maker in case of a failure and works automatically.

You should find the FOM image on the CD as well.

Best regards,
reNe

HP Presales Storage
Mike Povall
Oct 19, 2009 08:22:20 GMT    Unassigned

Hi Paul,

Under normal operating circumstances you should not have a virtual manager running. It should be started manually on the surviving node when a failure occurs, thus restoring quorum and access to the volumes.

Using a Failover Manager is definitely the best solution for your environment as it will run all of the time and will maintain quorum and access to the volumes during the periods when one of your nodes is offline for whatever reason.

Regards, Mike.
Paul Hutchings
Oct 19, 2009 08:26:21 GMT    N/A: Question Author

Thanks guys, a little after posting I worked out that a FOM is what I needed, so I installed one and it works seamlessly (well, a few seconds delay whilst it figures out what's happening but near as dammit seamless).

Paul
teledata Expert in this area
Oct 19, 2009 16:15:33 GMT    Unassigned

The Virtual Manager must be configured in advance.

You may then manually start the virtual manager (on the remaining node) to re-establish quorum.

Using the Virtual Manager does NOT provide high availability, as you WILL lose quorum and have to manually start the virtual manager.

The better design is to create a Failover Manager to provide that automated 3rd manager that will maintain quorum for you...
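
To make the quorum arithmetic in the replies above concrete, here is a minimal sketch in Python. It is not HP code and does not use any SAN/iQ API. The function name `has_quorum` is made up for illustration, and the strict-majority rule is an assumption based on how quorum is described in this thread.

```python
def has_quorum(configured_managers: int, reachable_managers: int) -> bool:
    """Assumed rule: a strict majority of the configured managers must be reachable."""
    return 2 * reachable_managers > configured_managers


# Two storage nodes, each running a manager, no FOM: one node fails.
print(has_quorum(configured_managers=2, reachable_managers=1))  # False

# Two storage nodes plus a Failover Manager: one node fails,
# the surviving node and the FOM still form a majority.
print(has_quorum(configured_managers=3, reachable_managers=2))  # True
```

With only two managers, losing either one leaves 1 of 2, which is not a majority, so the volumes go offline until someone intervenes. A third manager that is always running (the FOM) means a single failure still leaves 2 of 3.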
Paul Hutchings
Oct 22, 2009 19:18:25 GMT    N/A: Question Author

OK next question :-)

I guess this question applies to any sort of redundant storage that is seen as a single addressable "cluster" by ESX:

Suppose you have two sites, A and B.
Each contains some nodes making up a storage cluster.
Each contains some servers, ESX most likely.
A and B are linked by a fast LAN link.

Let's say you lose the link between A and B, but the kit in each location is up and running.

You now have "highly available" storage that is still available in both locations.

You have ESX servers that can each still see the storage local to them, but can't see the other servers in the ESX cluster as the link has gone.

So don't you end up with the same VMs now running in both locations, as each ESX box's HA would kick in, and each ESX box can still access its shared storage as your SAN is resilient?

I'm sure I'm overlooking something obvious here as I can only think about it right now vs. actually do it.
teledata Expert in this area
Oct 22, 2009 23:44:11 GMT    Unassigned

Split-brain scenario is prevented by requiring a majority of storage managers to maintain quorum.

In a perfect world you would have a 3rd site (that is connected to the first 2 sites) that hosts your failover manager.
If that configuration is not possible, you run your failover manager at your primary site, i.e. the site that you want to stay up if your site link goes down.

If you have 4 nodes (2 at each site), only the site that has access to 3 managers (the 2 local nodes, plus failover manager) will have quorum (and thus access to storage) in the event of a link failure.
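
The 4-node example above can be sketched the same way. This is only an illustration under the same assumed strict-majority rule. The function `partitions_with_quorum` and the site labels are hypothetical, and each dictionary entry stands for a group of managers that can still reach each other after the link fails.

```python
from typing import Dict, List


def partitions_with_quorum(managers_by_partition: Dict[str, int]) -> List[str]:
    """Return the partitions holding a strict majority of all configured managers."""
    total = sum(managers_by_partition.values())
    return [name for name, count in managers_by_partition.items()
            if 2 * count > total]


# 2 nodes per site, FOM hosted at primary site A, inter-site link down.
print(partitions_with_quorum({"A": 3, "B": 2}))  # ['A']

# Same 4 nodes but no FOM: neither side has a majority.
print(partitions_with_quorum({"A": 2, "B": 2}))  # []
```

Because two disjoint groups cannot both hold more than half of the managers, at most one side of a split can keep quorum, which is why split brain is avoided.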
HP moderator Gauche
Oct 23, 2009 17:34:54 GMT    Unassigned

This might be handy.
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c01727773/c01727773.pdf
Paul Hutchings
Oct 23, 2009 17:37:57 GMT    N/A: Question Author

Thanks, I'd actually tested failover using the FOM in this exact scenario and I clearly wasn't thinking when I posted the question. Only the site that has quorum will have the cluster IP, so you can't actually have two sites in "split brain".
 
© 2010 Hewlett-Packard Development Company, L.P.