  • Linux 6.1 Will Make It A Bit Easier To Help Spot Faulty CPUs

    aum

    • 284 views
    • 2 minutes

    While mostly of benefit to server administrators with large fleets of hardware, Linux 6.1 aims to make it easier to spot problematic CPUs/cores by reporting the likely socket and core when a segmentation fault occurs. That can help reveal trends when the same CPU or core keeps turning up in crash reports.


    Queued up now in TIP's x86/cpu branch for the Linux 6.1 merge window in October is a patch to print the likely CPU at segmentation fault time. Printing the likely CPU core and socket when a seg fault occurs can be beneficial if seg faults routinely happen on the same CPU package or a particular core.


    Rik van Riel, who authored the change, summed it up as:


     In a large enough fleet of computers, it is common to have a few bad CPUs. Those can often be identified by seeing that some commonly run kernel code, which runs fine everywhere else, keeps crashing on the same CPU core on one particular bad system.


     However, the failure modes in CPUs that have gone bad over the years are often oddly specific, and the only bad behavior seen might be segfaults in programs like bash, python, or various system daemons that run fine everywhere else.


     Add a printk() to show_signal_msg() to print the CPU, core, and socket at segfault time.


     This is not perfect, since the task might get rescheduled on another CPU between when the fault hit, and when the message is printed, but in practice this has been good enough to help people identify several bad CPU cores.


    This little helper for spotting potentially faulty processors will be available starting with Linux 6.1 later this year.
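
    To illustrate how the extra log information might be used across a fleet, here is a minimal Python sketch that tallies segfault messages by socket and core. It assumes the kernel appends a suffix of the form `likely on CPU N (core N, socket N)` to the segfault line; that exact wording, the sample threshold, and the helper names `tally_segfaults` and `suspects` are assumptions for illustration and should be checked against real log output.

```python
import re
from collections import Counter

# ASSUMPTION: the segfault log line ends with a suffix like
# "likely on CPU 5 (core 5, socket 0)". Verify against your kernel's output.
SEGFAULT_RE = re.compile(
    r"segfault at .*likely on CPU (?P<cpu>\d+) "
    r"\(core (?P<core>\d+), socket (?P<socket>\d+)\)"
)


def tally_segfaults(lines):
    """Count segfault messages per (socket, core) pair.

    Lines without the CPU suffix (older kernels, unrelated messages)
    are skipped.
    """
    counts = Counter()
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            key = (int(match["socket"]), int(match["core"]))
            counts[key] += 1
    return counts


def suspects(counts, threshold=3):
    """Return the (socket, core) pairs seen at least `threshold` times.

    The threshold is a hypothetical cut-off; a single segfault says little,
    and the reported CPU is only "likely" because the task may have been
    rescheduled before the message was printed.
    """
    return sorted(key for key, n in counts.items() if n >= threshold)
```

    The lines could come from any kernel log source, for example the saved output of `dmesg`. Because the reported CPU is only a best guess, repeated hits on the same core are a hint to investigate rather than proof of a hardware fault.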

     

    Not directly related: I Bent A Kabylake CPU & It Still Works

     

    It's a small but useful complement to the likes of the new Intel In-Field Scan, MCEs, EDAC reporting, etc.

     

    Source

