Category: Servers

  • HP DL380 G9 Not Booting Into Smart Storage Administrator

    You may run into a case where a G9 server fails to boot into the HP Smart Storage Administrator. In my case the server would appear to freeze here after hitting Enter.

    It’s not actually frozen, but hitting Enter does nothing and your only recourse is to reboot the server.

    Beyond that, you may also notice that Intelligent Provisioning appears to be completely borked, and most of the time it is.

    The resolution is to simply reinstall Intelligent Provisioning. Start by downloading a copy here.

    The download is an ISO file. Turn it into a bootable USB drive, or alternatively boot from it using the remote console in iLO. If you have difficulty making the USB drive, I find booting the ISO from the virtual KVM almost always works.

    This is the screen you want to see. You’ll get a progress bar and the whole process takes about 10 – 15 minutes. The UID light will flash blue indicating a firmware update is taking place.

    Once the system restarts you can attempt to boot into the Smart Storage Administrator once again.

    In my case I was now able to successfully boot into the software and configure my drives.

  • A fatal error was detected on a component at bus 25 device 0 function 0.

    One of my R640s presented a PCIe error with the following iDRAC log.

    A fatal error was detected on a component at bus 25 device 0 function 0.

    At boot time I received a generic PCIe error message, which prompted me to dig deeper into the logs. When I went into the BIOS and looked at the devices, the problem was immediately apparent.

    The server was not seeing my NDC (network daughter card).

    Replacing the NDC resolved the issue. However, if you’re unable to determine the problematic device you can also do the following.

    In iDRAC, navigate to System > Inventory > Hardware Inventory.

    Here you will see hardware info related to devices and their associated bus numbers. Finding the bus number referenced in the error will tell you which device is causing the problem.

    In my case, the inventory showed bus number 25 associated with the NDC.

    Sometimes it’s not this easy though. If you can’t locate the bus number, the best troubleshooting step is to remove PCIe devices one by one until the error goes away. Reseat the component you identified, and replace it if the error comes back.
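    The lookup described above can be sketched in a few lines of Python. This is only an illustration: the `inventory` dictionary is hypothetical (you would fill it in by hand from the iDRAC hardware inventory page), and it assumes the bus numbers in the inventory are written in the same notation as the one in the log message.

```python
import re

# Matches the "bus N device N function N" part of the iDRAC message quoted above.
BDF_PATTERN = re.compile(r"bus (\d+) device (\d+) function (\d+)")


def parse_bdf(message):
    """Return (bus, device, function) as integers, or None if not present."""
    match = BDF_PATTERN.search(message)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def find_device(message, inventory):
    """Look up the bus number from the message in a bus -> device mapping."""
    bdf = parse_bdf(message)
    if bdf is None:
        return None
    return inventory.get(bdf[0])


# Hypothetical inventory copied by hand from the hardware inventory page.
# Only the bus 25 -> NDC pairing comes from the post; the other entry is made up.
inventory = {25: "NDC", 59: "example PCIe card"}

log_line = "A fatal error was detected on a component at bus 25 device 0 function 0."
print(find_device(log_line, inventory))
```

    If `find_device` returns `None`, the bus number is not in the inventory you collected, and you are back to pulling PCIe devices one at a time.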

  • R640 10 Bay NVME Tutorial

    The DELL PowerEdge R640, specifically the 10 bay chassis, does support NVME with a bit of extra cabling. In this tutorial I’ll show you exactly what cables you need to set up 2, 4, 8, or a complete 10 bay NVME system.

    The first thing to know is there are 3 DELL cables used to make this happen. The first cable we’ll discuss supplies the first 2 ports with NVME:

    DELL P/N 0M026C

    It’s essentially a slim SAS cable that connects to an NVME controller/expander card. The other cables we’ll discuss plug straight into the motherboard, but this one requires a separate card.

    If you’re facing the server from the front, the connector is on the bottom left of the backplane. One end of the cable is labeled BP which is short for backplane. You’ll plug the BP side into the backplane.

    There will be some other cables in your way. You don’t have to unplug them but it does help. The other end of the cable is labeled CTRL. The cable will route alongside the backplane and then up the entire length of the right side of the chassis.

    Route this cable alongside the fans and then up the right side of the chassis

    Now that the cable is routed appropriately it’s time to install the NVME controller card.

    DELL P/N 0CDC7W

    Install the card into riser 1 and then attach the cable to the first port.

    Now you’ve officially supplied the first 2 drive bays with NVME. The remaining 8 bays also require their own cabling setup. If all you needed was support for 2 NVME drives you can stop here. No further work is required, just plug in your drives and fire up the server.

    Let’s install the other 2 sets of cables. Each set supplies 4 bays with NVME support. The set labeled A0 and B0 supplies bays 6 – 9. The set labeled A1 and B1 supplies bays 2 – 5. The cable we just installed supplies bays 0 – 1.

    Don’t feel bad if you struggle installing these. They’re an absolute pain unless you’ve already cabled hundreds of them.

    Here’s what the next cable looks like:

    DELL P/N 0684MR

    We will now move all the way to the right side of the backplane to the other Slim SAS connectors.

    The connectors labeled A0 and B0 plug into their corresponding ports on the backplane (also labeled A0 and B0).

    These cables route all the way to the left side of the chassis, up the cable channel, and then plug into the Slim SAS connectors in the left rear of the motherboard.

    Now on to the other set of cables (A1 and B1):

    DELL P/N 0TXC4H

    Tuck them under as best you can. Alongside the fans you’ll see hooks that keep the cables restrained and stop them from popping out.

    When it comes to routing them along the left side channel, I recommend pulling out the cables already installed. You don’t have to unplug them but it’s easier to route the NVME cables without them in the way. It’s much easier to tuck them in next to the NVME cables later.

    Pull out the cables from the side channel to make installing the NVME cables easier.

    Once you have them tucked in nicely you’ll see where they plug in at the rear.

    Find the ports labeled M1/M2/M3/M4 and just match them up.

    And that’s all there is to it. If you’ve installed all 3 sets of cables you now have a server that supports 10 NVME drives. Keep in mind you don’t have to install all 3 sets. Once again, maybe you just want 4 bays with NVME connectivity. In that case, just install one set of the cables.
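    To summarize which parts cover which bays, here is a small Python sketch. The bay ranges and part numbers are the ones given in this tutorial, paired in the order the cables were presented; the `CABLE_SETS` table and the `parts_for_bays` helper are names made up for this example, not anything from DELL.

```python
# Each entry: the bays a cable set feeds, its part number, its connector
# labels, and the controller card it needs (None if it plugs into the
# motherboard directly). Values are taken from the tutorial above.
CABLE_SETS = [
    {"bays": range(0, 2), "cable": "0M026C", "labels": ("BP", "CTRL"), "card": "0CDC7W"},
    {"bays": range(2, 6), "cable": "0TXC4H", "labels": ("A1", "B1"), "card": None},
    {"bays": range(6, 10), "cable": "0684MR", "labels": ("A0", "B0"), "card": None},
]


def parts_for_bays(bays):
    """Return the part numbers needed to give the listed bays NVME support."""
    bays = list(bays)
    for bay in bays:
        if not 0 <= bay <= 9:
            raise ValueError(f"bay {bay} does not exist on a 10 bay chassis")
    parts = []
    for cable_set in CABLE_SETS:
        if any(bay in cable_set["bays"] for bay in bays):
            parts.append(cable_set["cable"])
            if cable_set["card"] is not None:
                parts.append(cable_set["card"])
    return parts


print(parts_for_bays([0, 1]))     # first two bays only
print(parts_for_bays(range(10)))  # the full 10 bay build
```

    Asking for only bays 6 through 9, for example, returns just the A0/B0 cable, which matches the point above that you only need to install the sets you actually want.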

    Thank you for reading.

  • The storage BP2 SAS A0 cable is not connected, or is improperly connected.

    The R740xd uses specific numbers to refer to the different backplanes in your server, and it can technically support 3 of them. The primary backplane is the one used to install drives through the front of the server. The other 2 belong to the mid and rear flex bays, if you have those installed.

    So quite simply:

    • BP0 – Rear backplane for flex bay
    • BP1 – Primary backplane for the front drive slots
    • BP2 – Backplane for the mid bay

    So if you’re getting backplane errors on BP2 this means the server is detecting a problem with your mid bay. (See the end of this post if you don’t actually have a mid bay installed!)
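    If you are sorting through a pile of these messages, the numbering above is easy to apply with a short Python sketch. It assumes the message always follows the wording in this post's title ("BP<number> SAS <label> cable"); the `BACKPLANES` table and `locate_backplane` function are names invented for this example.

```python
import re

# Backplane numbering as listed above.
BACKPLANES = {
    0: "rear flex bay",
    1: "primary backplane (front drive slots)",
    2: "mid bay",
}

# Assumed message shape, based on the error quoted in the title of this post.
CABLE_ERROR = re.compile(r"BP(\d+) SAS (\w+) cable")


def locate_backplane(message):
    """Return (backplane description, cable label), or None if no match."""
    match = CABLE_ERROR.search(message)
    if match is None:
        return None
    index = int(match.group(1))
    return BACKPLANES.get(index, "unknown backplane"), match.group(2)


error = "The storage BP2 SAS A0 cable is not connected, or is improperly connected."
print(locate_backplane(error))
```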

    This could happen for a number of reasons. If you just installed the mid bay, you likely have the wrong cable or have it plugged into the wrong port on the backplane. The easiest first troubleshooting step is to make sure the cable is plugged in properly. The correct cable has 2 SAS connectors on one end that plug into the mid bay and a single SAS connector on the other end that routes along the side of the chassis and plugs into the A1 port on the primary backplane. Perhaps you have it plugged into the A2 port (12 bay model) or the B1 port (24 bay model) instead.

    You might also have the incorrect cable. The R740xd comes in 2 primary models: the 12 bay LFF version and the 24 bay SFF version. The mid bay is physically the same hardware in both, but each model uses a different cable because the primary backplanes differ. You might have a cable designed for the 24 bay installed in a 12 bay server, or vice versa.

    Take a look at the cable and then look at the port you’re plugging it into on the backplane. The port numbers should match. For example, on an R740xd 12 bay the ports for the mid bay and rear flex bay are labeled A1 and A2.

    A cable labeled A1 should only go into a port also labeled A1

    The cable will be labeled to match. A cable labeled anything else is not going to work. Cables for the 12 and 24 bay systems might fit in each other’s ports, but you’ll notice they go in at awkward angles. In the 24 bay chassis, a cable designed for the 12 bay will block the B1 port due to the angle of the connector. It may also throw errors. You must source the correct cable.

    However, if you’ve determined that you’re using the correct cable, it’s time to look deeper. I recently built a server and had this error at boot time. I was positive I had the correct SAS cable connected to the mid bay. Most of the time I simply replace the entire mid bay, but this time I looked a little closer and noticed the cable was physically damaged. Sometimes a cable shows obvious damage from being stuffed against and forced down into the cable channel.

    Replacing the cable resolved the issue. Other times the backplane itself has been faulty. In that case the backplane, and arguably the entire mid bay, should be replaced. My logic is that if you’re going to order parts to fix the problem, you should order every part that might be needed, so you don’t waste time reordering when one part alone fails to fix it.

    I’ve also had some success updating the CPLD firmware on the system. This was a classic fix we discovered at work. Sometimes we’d have a problem with one of the backplanes and swapping out the entire hardware for new stuff didn’t fix the issue. In these cases we found updating the firmware for the complex programmable logic device was the solution. This chip is involved with detecting whether or not cables are plugged in and what SAS lanes are active, so to speak. It has worked enough times that I’d say it’s worth a shot to try.

    So just to reiterate:

    • Ensure you have the right cable
    • Ensure the cable is plugged into the correct port on the backplane (A1)
    • Ensure the cable is not damaged

    Assuming all of the above conditions are satisfied yet the issue is still not resolved, try updating the CPLD firmware. Failing that, it’s probably time to order a new mid bay cable, another backplane for the mid bay, or the entire backplane configuration with the correct cable from a trusted source.
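    For anyone who prefers the order of operations spelled out, here is the same checklist as a tiny Python function. It is just a restatement of the steps in this post; the function name and its parameters are made up for the example.

```python
def next_step(correct_cable, correct_port, cable_undamaged, cpld_updated):
    """Return the next troubleshooting action for a BP2 cable error."""
    if not correct_cable:
        return "Source the correct cable for your chassis (12 bay or 24 bay)."
    if not correct_port:
        return "Move the cable to the A1 port on the primary backplane."
    if not cable_undamaged:
        return "Replace the damaged cable."
    if not cpld_updated:
        return "Update the CPLD firmware."
    return "Order a new mid bay cable, mid bay backplane, or the entire assembly."


# Example: right cable, right port, no visible damage, firmware not yet updated.
print(next_step(True, True, True, False))
```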

    As a final note, I’ve also seen this error pop up in servers that have no flex bay installed at all. In some cases updating the CPLD firmware resolved the issue. In other cases we considered the server failed, at least for a serious production environment.