Remove Failed Server From Tree#

Things happen. It is not always possible to remove a server gracefully.

Disks fail, power supplies burn out, and so on.

Be VERY careful. You should know what you are doing before you do ANYTHING on this page. These actions may destroy your eDirectory tree, and you may NOT be able to recover it!

Fortunately, there are ways to get your eDirectory tree back in shape.

Verify Time is Synchronized#

Be sure to Verify Time is Synchronized.
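
One way to check this from the command line is the time synchronization report in ndsrepair. The sketch below assumes that your ndsrepair build supports the `-T` option and that the binary lives at the path used elsewhere on this page. The `NDSREPAIR` variable and the `check_time_sync` function name are illustrative, not part of eDirectory.

```shell
#!/bin/sh
# Path used elsewhere on this page; override NDSREPAIR if yours differs.
NDSREPAIR="${NDSREPAIR:-/opt/novell/eDirectory/bin/ndsrepair}"

# Print the time synchronization report for the servers known to this server.
# Assumption: this ndsrepair build supports the -T (time sync) option.
check_time_sync() {
    "$NDSREPAIR" -T
}

# Usage (run on a healthy server in the tree):
#   check_time_sync
```

Review the report for every server before touching any replica; the failed server itself is expected to be unreachable.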

Server Not Failed?#

WARNING: Be VERY careful. This will remove every replica from the server!

If the ndsd service is still available, but some condition requires that you remove the server from the tree, you can remove the replicas from the server with the following command:
/opt/novell/eDirectory/bin/ndsrepair -R -Ad -Xk2

This may take a while, but we have found it acceptable on trees of fewer than 200,000 entries.

We also typically rename (or remove) the /var/opt/novell/eDirectory/data/dib directory if we need to prevent this server from EVER coming back.
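
The rename can be sketched as a small helper. This is a minimal example, assuming ndsd has already been stopped on the server and that the DIB lives at the path above. The `retire_dib` function and the `.removed.<timestamp>` suffix are hypothetical choices for illustration.

```shell
#!/bin/sh
# Rename a DIB directory so ndsd cannot open it again.
# Assumption: ndsd has already been stopped on this server.
retire_dib() {
    # Drop a trailing slash so the suffix is appended to the directory name.
    dib_dir="${1%/}"
    # Refuse to act on anything that is not an existing directory.
    if [ ! -d "$dib_dir" ]; then
        echo "not a directory: $dib_dir" >&2
        return 1
    fi
    # A timestamped suffix keeps the old data around without leaving it in place.
    new_name="${dib_dir}.removed.$(date +%Y%m%d%H%M%S)"
    mv "$dib_dir" "$new_name" && echo "$new_name"
}

# Usage:
#   retire_dib /var/opt/novell/eDirectory/data/dib
```

Renaming rather than deleting leaves a copy you can inspect later; delete it once you are sure it is no longer needed.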

Once this completes, you may continue with items on this page to remove the server.

Migrate Master Replicas#

Each NDS partition must have an active master replica for proper eDirectory tree operation. If the failed server holds any master replicas, you need to designate another server to become the master of each of them.

If you are sure the server does not hold any replicas, you can skip to Cleaning up the Tree:

http://ldapwiki.willeke.com/wiki/Removing%20Failed%20Server#section-Removing+Failed+Server-CleaningUpTheTreeRemovingFailedServerEntries

Go to another server in the tree that holds the same replicas and type the command:

/opt/novell/eDirectory/bin/ndsrepair -P -Ad
  • Select - the Partition
  • Select - the View Replica Ring

This shows a list of all the servers in the replica ring of that partition, with the "REPLICA TYPE" of each.

If the crashed server is the master of that partition, you MUST make sure another server holding a read/write replica of the same partition is made the master before you go any further.

Repeat this check for Each Partition.

If the failed server is the master of any partition, go to the server that is to be designated as the new master of that partition and type the command:

/opt/novell/eDirectory/bin/ndsrepair -P -Ad
  • Select - the Partition
  • Select - Designate this server as the new master replica

Repeat for Each Partition where the failed server is a Master.

If the crashed server is the master of a partition, no other server holds a read/write replica of that partition, and the only other replica type is subordinate reference, then you have lost all the objects in that partition and a restore is required.

WARNING: DO NOT designate a Subordinate Reference replica as the Master Replica unless no read/write replica or Read Only replica exists of that partition. Doing so will cause all of your partition objects to go unknown and you will have to recreate or restore the entries.

Remove Crashed Server From Replica Ring#

Remove the NCP server object of the crashed server.

Verify that each replica ring is consistent and valid. On each server in the tree, type the command:

/opt/novell/eDirectory/bin/ndsrepair -P -Ad
Then:
  • Select - A partition
  • Select - View replica ring
  • Select - Failed Server (if it exists)
  • Select - Remove this server From Replica Ring

Repeat for each partition.

Cleaning up the Tree - Removing Failed Server Entries#

There will still be some orphaned entries in eDirectory from the Crashed Server.

Sometimes a subordinate reference still shows in the replica ring after you remove it. In that case you need to manually remove the crashed server from the replica ring on that particular server.

  • Go to the container that contains the NCP server object
  • Remove the NCP server object. (Note: Make sure to remove the correct server object.)

Delete all the other objects relating to the server:

  • Http Server
  • LDAP Server
  • LDAP Group
  • SNMP Group
  • SAS Service
  • PS object
  • Four certificates
    • IP AG
    • SSL Certificate IP
    • DNS AG
    • SSL Certificate DNS

Force Immediate Synchronization#

You should Force Immediate Synchronization using ndstrace.
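
As a rough sketch, the synchronization can be triggered non-interactively by passing a command to ndstrace. This assumes that your ndstrace accepts `-c` with a quoted command and that `set ndstrace=*H` schedules an immediate synchronization on your version; check both against your eDirectory documentation. The `NDSTRACE` variable and the `force_sync` function name are illustrative.

```shell
#!/bin/sh
# Assumed location, alongside ndsrepair; override NDSTRACE if yours differs.
NDSTRACE="${NDSTRACE:-/opt/novell/eDirectory/bin/ndstrace}"

# Ask the local agent to start an outbound synchronization now.
# Assumption: "set ndstrace=*H" is honoured by this ndstrace version.
force_sync() {
    "$NDSTRACE" -c "set ndstrace=*H"
}

# Usage (run on each remaining replica server):
#   force_sync
```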

Final check #

Verify time is in sync and there are no errors or references pointing to the crashed server in report sync status.

Verify Time is Synchronized#

Be sure to Verify Time is Synchronized.

Sync Check#

/opt/novell/eDirectory/bin/ndsrepair -E
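
To make leftover references easier to spot, the report can be piped through a small filter. The sketch below only counts lines that mention a given server name; it assumes nothing about the report layout. `count_server_refs` and `OLDSERVER` are hypothetical names used for illustration.

```shell
#!/bin/sh
# Count lines on stdin that mention the crashed server (case-insensitive,
# fixed-string match). Prints 0 when nothing refers to it.
count_server_refs() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -i -c -F -- "$1" || true
}

# Usage:
#   /opt/novell/eDirectory/bin/ndsrepair -E | count_server_refs OLDSERVER
```

A count of 0 is only a quick indicator; still read the full report for synchronization errors.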

« This page (revision-28) was last changed on 05-Jun-2013 12:11 by jim