Replace a defective compute node

Use this information to replace a defective compute node.

About this task

To avoid possible danger, read and follow the safety statement below.
  • S002
    disconnect all power
    CAUTION:
    The power-control button on the device and the power switch on the power supply do not turn off the electrical current supplied to the device. The device also might have more than one power cord. To remove all electrical current from the device, ensure that all power cords are disconnected from the power source.
Attention:
  • Only trained service technicians are allowed to perform this procedure. Unauthorized personnel should not attempt to replace this component.

  • If possible, back up all compute node settings, including the settings of the optional components installed in the compute node.

  • Read the Installation guidelines to ensure that you work safely.
Important: After you replace a compute node, you must either update it with the latest firmware or restore the pre-existing firmware. Make sure that you have the latest firmware or a copy of the pre-existing firmware before you proceed (see Firmware updates for more information).
Procedure

  1. Make preparations for this task.
    1. Turn off the defective compute node that you are going to replace.
    2. Remove the compute node from the enclosure (see Remove a compute node from the enclosure).
  2. Remove the node front cover (see Remove the node front cover).
  3. Remove the node air baffles (see Remove the front air baffle and Remove the middle air baffle).
  4. Remove all of the drives and fillers (if any) and place them on the static protective surface (see Remove a hot-swap solid-state drive).
    Note: Record the drive bay number when removing the drives so that you can install them back into the same drive bays in the replacement compute node.
  5. Remove the drive cage assembly and place it on the static protective surface (see Remove the drive cage assembly).
  6. Remove the PCIe riser assembly and place it on the static protective surface (see Remove the PCIe riser assembly).
  7. Make sure the correct spacer (SATA or NVMe) has been installed into the replacement compute node (see Replace SATA and NVMe spacers).
  8. Install the drive cage assembly in the replacement compute node (see Install the drive cage assembly).
  9. Install the drives that were removed earlier into the replacement compute node (see Install a hot-swap solid-state drive).
  10. Install the PCIe riser assembly into the replacement compute node (see Install the PCIe riser assembly).
  11. Transfer the two processor-heat-sink modules (PHM) from the defective compute node to the replacement unit.
    1. Remove the socket cover from processor socket 1 in the replacement compute node, where you will install processor 1.
    2. Remove the PHM (processor 1) from the defective compute node (see Remove a processor and heat sink).
    3. Install the PHM (processor 1) into the socket in the replacement compute node (see Install a processor and heat sink).
    4. Orient the socket cover that was removed earlier above the empty processor socket 1 in the defective compute node; then, gently press on the four corners of the socket cover to secure it to the socket.
    5. Repeat the previous steps for processor 2 and its T-shaped heat sink.
    Attention: When transferring a PHM to a replacement compute node:
    • Remove and install only one PHM at a time. As the system board supports two processors, install the PHMs starting with the first processor socket.

    • Each processor socket must always contain a cover or a PHM. Always protect an empty processor socket in a compute node with a socket cover.

    • Install the removed PHM in the replacement compute node immediately after removing it.

  12. Remove one memory module at a time from the defective compute node (see Remove a memory module), and immediately install it into the same memory module slot in the replacement compute node (see Install a memory module) until all of the memory modules are transferred.
  13. If an M.2 backplane is installed in the defective compute node, remove it (see Remove the M.2 backplane) and install it in the replacement compute node (see Install the M.2 backplane).
  14. If a TPM card is installed in the defective compute node, remove it and install it in the replacement compute node.
  15. Route and connect all cables transferred in previous steps (see Internal cable routing).
  16. Install the node air baffles in the replacement compute node (see Install the front air baffle and Install the middle air baffle).
    Note: For proper cooling and airflow, make sure to install the node air baffles. Operating the compute node without the air baffles might lead to component damage.
  17. Install the node front cover in the replacement compute node (see Install the node front cover).
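
The drive and memory module steps above depend on each component returning to the same bay or slot in the replacement compute node. As a minimal sketch of one way to keep that record, the Python snippet below logs each removal and install and then lists any position that does not match. The TransferLog class, the position names (such as "drive-bay-0"), and the component labels are all hypothetical and chosen for illustration; they are not part of any Lenovo tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TransferLog:
    """Hypothetical helper: tracks which component came out of which bay or slot."""

    removed: Dict[str, str] = field(default_factory=dict)    # position -> label
    installed: Dict[str, str] = field(default_factory=dict)  # position -> label

    def record_removal(self, position: str, label: str) -> None:
        # Refuse to overwrite an existing record so a bay is never logged twice.
        if position in self.removed:
            raise ValueError(f"{position} is already recorded")
        self.removed[position] = label

    def record_install(self, position: str, label: str) -> None:
        self.installed[position] = label

    def mismatches(self) -> List[str]:
        """Return positions that are still empty or hold a different component."""
        return sorted(
            position
            for position, label in self.removed.items()
            if self.installed.get(position) != label
        )


# Example labels are made up; use whatever marking identifies your parts.
log = TransferLog()
log.record_removal("drive-bay-0", "SSD-A")
log.record_removal("drive-bay-1", "SSD-B")
log.record_removal("dimm-slot-3", "DIMM-X")

log.record_install("drive-bay-0", "SSD-A")
log.record_install("drive-bay-1", "SSD-A2")  # wrong drive placed in bay 1

print(log.mismatches())  # ['dimm-slot-3', 'drive-bay-1']
```

An empty list from mismatches() means that every recorded component is back in its original position. A handwritten table serves the same purpose; the sketch only shows the bookkeeping.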

After you finish

  1. Install the replacement compute node into the enclosure (see Install a compute node in the enclosure).

  2. Update the machine type and serial number with new vital product data (VPD) by using Lenovo XClarity Provisioning Manager V3. See Update the machine type and serial number.

  3. Enable Trusted Platform Module (TPM). See Enable TPM.

  4. Optionally, enable UEFI Secure Boot. See Enable UEFI Secure Boot.

  5. Update the compute node configuration.
  6. Check the power LED on each node to make sure that it changes from a fast blink to a slow blink, which indicates that the node is ready to be powered on.

  7. If you plan to recycle the compute node, follow the instructions in Disassemble the compute node for recycle to comply with local regulations.

Important: Before you return the defective compute node, make sure that a socket cover is securely attached to each empty processor socket and that the node front cover is reinstalled on the defective node.